Help
What is HTML?
HTML (short for HyperText Markup Language) is a markup language used to create web pages. Versions up to HTML 4.01 were defined as an application of SGML (Standard Generalized Markup Language); HTML5 is specified independently of SGML.
Why would I need to clean HTML?
There are various reasons, such as:
Migration to a new website or system
When you, as a developer, rewrite an old app, you often need to reuse its existing HTML but with new formatting. Before reformatting, it is better to get rid of the old formatting. This is what HTML Washer is good for.
Utilizing generated HTML
When you export HTML from a system, it is sometimes formatted in a very sophisticated way, but you may not need such complicated formatting.
Utilizing HTML from someone else
When someone gives you HTML text, you may want its structure but not its formatting, so you need to clean it up to the basics.
How do I clean the HTML?
You can do that by copying and pasting the text or by uploading an HTML file.
Copy-Pasting
Go to the Homepage, paste your text there, and hit the Wash button. Then copy your clean HTML.
Uploading
Go to the Upload page and upload an HTML file. You will then be able to download the cleaned-up HTML.
What exactly does it do?
- Fixes or removes malformed tags and attributes (e.g. adds alt attributes to images if missing)
- Converts the markup to HTML5 (if it is XHTML for example)
- Reduces the markup to: <a href>, <body>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <head>, <hr>, <html>, <i>, <img src width height alt>, <li>, <ol>, <p>, <ruby>, <strong>, <table>, <tbody>, <td colspan rowspan>, <th colspan rowspan>, <title>, <tr>, <ul>
- Replaces <b> with <strong> and <div> with <p>
- Reformats the HTML (line breaks, indents)
Example
Input: <p class="funny" onclick="alert('LOL')">bla bla</p>
Will be simplified to: <p>bla bla</p>
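The reduction and replacement rules listed above can be illustrated with a short Python sketch. This is not HTML Washer's actual implementation, only a minimal approximation built on the standard library's html.parser. The names ALLOWED, RENAME, VOID, Washer and wash are hypothetical and chosen for illustration. The sketch covers the tag and attribute whitelist, the <b> and <div> replacements, and the missing alt attribute; it does not repair broken nesting or reformat line breaks and indents.

```python
from html import escape
from html.parser import HTMLParser

# Tags that survive washing, with the attributes kept on each
# (copied from the "Reduces the markup to" list above).
ALLOWED = {
    "a": {"href"}, "body": set(), "h1": set(), "h2": set(), "h3": set(),
    "h4": set(), "h5": set(), "h6": set(), "head": set(), "hr": set(),
    "html": set(), "i": set(), "img": {"src", "width", "height", "alt"},
    "li": set(), "ol": set(), "p": set(), "ruby": set(), "strong": set(),
    "table": set(), "tbody": set(), "td": {"colspan", "rowspan"},
    "th": {"colspan", "rowspan"}, "title": set(), "tr": set(), "ul": set(),
}
RENAME = {"b": "strong", "div": "p"}  # tag replacements
VOID = {"hr", "img"}                  # tags written without a closing tag


class Washer(HTMLParser):
    """Hypothetical cleaner: keeps whitelisted tags and attributes only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        tag = RENAME.get(tag, tag)
        if tag not in ALLOWED:
            return  # drop the tag itself; its text content is still kept
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag]]
        if tag == "img" and not any(k == "alt" for k, _ in kept):
            kept.append(("alt", ""))  # add the missing alt attribute
        parts = []
        for name, value in kept:
            parts.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(parts)))

    def handle_endtag(self, tag):
        tag = RENAME.get(tag, tag)
        if tag in ALLOWED and tag not in VOID:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        # Text was unescaped by the parser, so escape it again on output.
        self.out.append(escape(data, quote=False))


def wash(html_text):
    """Return html_text reduced to the whitelisted markup."""
    parser = Washer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(wash('<p class="funny" onclick="alert(\'LOL\')">bla bla</p>'))
print(wash('<div><b>Hi</b><img src="a.png"></div>'))
```

A whitelist is used rather than a blacklist so that any tag or attribute not explicitly listed is dropped by default, which matches the "reduces the markup to" wording above.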