HTML Entity Encoder / Decoder
Transform raw characters into safe HTML entities - or decode them back. Real-time, 100% in your browser, zero data sent anywhere.
The Complete Guide to HTML Entity Encoding
If you are building web applications or generating HTML dynamically, understanding entity encoding is not optional - it is a core security and correctness requirement. This guide explains what entities are, why they matter, and how to apply them correctly.
How to use this tool
Paste any raw text, HTML snippet, or code fragment into the left pane. The right pane instantly shows the converted output. Use the Mode toggle to switch between encoding raw text into safe entities and decoding entities back to raw characters. The Encoding Strictness dropdown controls how aggressively characters are converted:
Standard / Safe - escapes only the five reserved HTML characters: < > & " '. This is the correct setting for most XSS prevention tasks when inserting untrusted strings into HTML body or attribute contexts.
Extended (Named Entities) - additionally encodes a wide dictionary of characters that have named entity equivalents, such as © to &copy;, € to &euro;, and curly quotes. Useful when targeting older systems or email HTML renderers that do not handle Unicode well.
Hexadecimal / Decimal (Numeric) - converts every character to its numeric Unicode reference. Decimal mode writes the code point in base 10 (e.g. &#60; for <); hex mode prefixes the hexadecimal code point with x (e.g. &#x3C; for <). This is the most universal form and works for any character regardless of whether it has a named entity.
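As a rough sketch of how the three strictness levels differ, the Python snippet below mimics them with the standard library. The encode function and its mode names are hypothetical and are not this tool's actual implementation; the extended mode here only knows the HTML 4 named entities listed in html.entities.codepoint2name.

```python
import html
from html.entities import codepoint2name


def _extended(ch: str) -> str:
    """Encode one character using a named entity when one is known."""
    if ch == "'":
        # codepoint2name has no entry for the apostrophe, so fall back
        # to a numeric reference for it.
        return "&#x27;"
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else ch


def encode(text: str, mode: str = "standard") -> str:
    """Hypothetical encoder illustrating the strictness levels above."""
    if mode == "standard":
        # Escapes & < > and, with quote=True, both quote characters.
        return html.escape(text, quote=True)
    if mode == "extended":
        return "".join(_extended(ch) for ch in text)
    if mode == "hex":
        return "".join(f"&#x{ord(ch):X};" for ch in text)
    if mode == "decimal":
        return "".join(f"&#{ord(ch)};" for ch in text)
    raise ValueError(f"unknown mode: {mode}")


print(encode('<a href="x">'))              # &lt;a href=&quot;x&quot;&gt;
print(encode("© 2024 €5", "extended"))     # &copy; 2024 &euro;5
print(encode("<", "hex"))                  # &#x3C;
print(encode("<", "decimal"))              # &#60;
```

Note that the numeric modes in this sketch encode every character, including plain letters and spaces, which matches the description above but produces much longer output than Standard mode.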
Why entity encoding is a security requirement
Every modern web application receives input from untrusted sources: form fields, URL parameters, API responses, database records created by users. When that data is rendered inside an HTML document without transformation, the browser parses it as markup. An attacker who can control even a small piece of displayed text can inject <script> tags, onerror attributes on images, or javascript: protocol URIs in anchor tags, all of which execute arbitrary JavaScript.
Entity encoding removes the markup-injection attack surface: <script> becomes &lt;script&gt;, which the browser renders as the visible string <script> without ever parsing it as a tag. This is why mainstream web frameworks and template engines (React's JSX, Django's template engine, Rails' ERB) apply HTML escaping by default to any value interpolated into the template output.
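To make the transformation concrete, here is a minimal sketch using Python's standard html module. It only illustrates the idea; the payload string is an example value, and Python happens to write the apostrophe as the hex reference &#x27;, which is equivalent to the decimal form &#39;.

```python
import html

# Example untrusted input containing an injection attempt.
payload = "<script>alert('xss')</script>"

# html.escape replaces & < > and (by default) both quote characters.
escaped = html.escape(payload)
print(escaped)
# &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;

# Decoding restores the original text exactly, so no information is lost.
restored = html.unescape(escaped)
print(restored == payload)  # True
```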
Named vs. numeric entities: when to use which
Named entities like &amp; and &lt; are readable and concise. They are the preferred form for the five reserved characters. For everything else, numeric entities are more portable because they depend only on the Unicode standard, not on an HTML entity dictionary. When writing HTML emails or targeting very old browsers, numeric entities are the safer choice. For modern web output, Standard encoding covering the five reserved characters is almost always sufficient.
The output context rule
The most common encoding mistake is applying the right encoding in the wrong place. HTML entity encoding is correct only when inserting data into an HTML text node or HTML attribute value. Inside a <script> block, you need JavaScript string escaping (backslash sequences), not HTML entities. Inside a URL parameter, you need percent-encoding. Inside a CSS content property, you need a different escape syntax. Mixing these up - for instance, percent-encoding a value placed in an HTML attribute - both breaks functionality and can introduce new vulnerabilities.
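The context rule can be sketched by escaping one untrusted value three different ways with Python's standard library. The helper names below are hypothetical, and the JavaScript variant is a simplified illustration (JSON string escaping plus Unicode escapes for <, > and &), not a complete defense for every script context.

```python
import html
import json
from urllib.parse import quote


def for_html(value: str) -> str:
    """HTML text node or quoted attribute value: entity encoding."""
    return html.escape(value, quote=True)


def for_url_param(value: str) -> str:
    """URL query parameter value: percent-encoding."""
    return quote(value, safe="")


def for_js_string(value: str) -> str:
    """JavaScript string literal: backslash escapes, not HTML entities."""
    literal = json.dumps(value)
    # Also escape characters that could end a <script> block early.
    return (
        literal.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


untrusted = 'Tom & "Jerry" <b>'
print(for_html(untrusted))       # Tom &amp; &quot;Jerry&quot; &lt;b&gt;
print(for_url_param(untrusted))  # Tom%20%26%20%22Jerry%22%20%3Cb%3E
print(for_js_string(untrusted))  # "Tom \u0026 \"Jerry\" \u003cb\u003e"
```

The same input yields three different outputs, and none of them is safe in the other two contexts, which is the point of the rule.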
Frequently Asked Questions
< becomes &lt;, > becomes &gt;, turning executable tags into inert visible text. A payload like <script>alert('xss')</script> becomes &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt; and is displayed harmlessly on screen without ever running.
At best, unencoded output renders incorrectly whenever the text contains characters such as < or &. At worst, it enables stored or reflected XSS: an attacker plants a script in your database (stored) or crafts a URL that delivers the payload (reflected), and every visitor who loads the page executes the injected code. Consequences range from session hijacking and credential theft to full account takeover. XSS has long featured in the OWASP Top 10 list of the most critical web application security risks (since the 2021 edition, as part of the Injection category) for this reason.
A named entity refers to a character by a mnemonic name: &amp; for &, &lt; for <, or &copy; for the copyright symbol. Named entities exist only for the characters the spec explicitly lists. A numeric entity references any character by its Unicode code point: decimal format uses &#65; for the letter A, hexadecimal format uses &#x41; for the same character. Numeric entities cover the entire Unicode range of over 140,000 characters, making them far more universal than named entities.
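The equivalence between the named, decimal, and hexadecimal forms can be checked with a short Python sketch. The numeric_refs helper is hypothetical and exists only for illustration.

```python
import html
from typing import Tuple


def numeric_refs(ch: str) -> Tuple[str, str]:
    """Return the (decimal, hexadecimal) references for one character."""
    cp = ord(ch)
    return f"&#{cp};", f"&#x{cp:X};"


print(numeric_refs("A"))   # ('&#65;', '&#x41;')
print(numeric_refs("😀"))  # ('&#128512;', '&#x1F600;')

# All three spellings of the copyright sign decode to the same character.
forms = ["&copy;", "&#169;", "&#xA9;"]
print([html.unescape(f) for f in forms])  # ['©', '©', '©']
```

The emoji in the second call has no named entity, so a numeric reference (or the raw UTF-8 character) is the only way to write it.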