HTML Entity Encoder
Convert special characters to HTML entities and vice versa.
About this tool
Encode special characters like <, >, &, and quotes into their HTML entity equivalents for safe display in web pages. Decode HTML entities back to their original characters.
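As a rough illustration of the encode and decode operations described above, here is a minimal sketch using Python's standard `html` module. It is not this tool's implementation; the sample string `raw` is made up for the example.

```python
import html

# Hypothetical input containing all five characters that matter in HTML.
raw = '<a href="x">Tom & Jerry\'s</a>'

# Encode: & < > " ' become entity references.
encoded = html.escape(raw, quote=True)

# Decode: entity references become literal characters again.
decoded = html.unescape(encoded)

print(encoded)
print(decoded == raw)
```

The round trip is lossless for this input: decoding the encoded text gives back the original string.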
Features
- Convert special characters to HTML entities
- Decode HTML entities back to original characters
- Handle all standard HTML entity references
- Copy converted output to clipboard
How to Use
- Paste text containing special characters into the input
- Select "Encode" to convert to HTML entities or "Decode" to reverse
- View the converted output instantly
- Copy the result for use in your HTML pages
Frequently Asked Questions
Which characters actually need encoding?
In body text, only < > and &. In attribute values, add " and '. Everything else is legal as its literal character in UTF-8-encoded HTML. Encoding more is harmless but clutters the source — the minimum safe set is five characters.
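The body/attribute split can be sketched with Python's `html.escape`, whose `quote` flag controls whether the two quote characters are included. The helper names `encode_body` and `encode_attribute` are assumptions made for this example.

```python
import html

def encode_body(text: str) -> str:
    # Body text: only & < > are replaced; quotes pass through.
    return html.escape(text, quote=False)

def encode_attribute(text: str) -> str:
    # Attribute values: additionally replace " and '.
    return html.escape(text, quote=True)

print(encode_body('a < b & "c"'))
print(encode_attribute('say "hi" & \'bye\''))
```

Note that `html.escape` happens to emit a named entity for the double quote and a numeric one for the apostrophe; both forms decode to the same characters.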
What's the difference between named and numeric entities?
Named entities (&amp;, &copy;) are easier to read; numeric (&#38;, &#169; or &#x26;, &#xA9;) work regardless of the parser's entity table. HTML5 defines about 2200 named entities; XML parsers only know five. For cross-context safety, prefer numeric.
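To make the HTML-versus-XML difference concrete, the sketch below builds numeric entities by hand and checks which forms Python's XML parser accepts. The helpers `to_numeric_entity` and `xml_accepts` are hypothetical names for this example, and `xml.etree.ElementTree` stands in for "an XML parser" in general.

```python
import html
import xml.etree.ElementTree as ET

def to_numeric_entity(ch: str, hexadecimal: bool = False) -> str:
    # Build a decimal (&#169;) or hexadecimal (&#xA9;) reference for one character.
    code = ord(ch)
    return f"&#x{code:X};" if hexadecimal else f"&#{code};"

def xml_accepts(fragment: str) -> bool:
    # True if the XML parser can parse the fragment without a DTD.
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(to_numeric_entity("©"))                    # decimal form
print(to_numeric_entity("©", hexadecimal=True))  # hexadecimal form
print(html.unescape("&copy; &#169; &#xA9;"))     # an HTML decoder handles all three
print(xml_accepts("<p>&copy;</p>"))              # named entity unknown to XML
print(xml_accepts("<p>&#169;</p>"))              # numeric reference is fine
```

The named form fails in XML because `copy` is not one of the five predefined entities, while the numeric form needs no lookup table at all.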
Does encoding defend against XSS?
Partly. Output encoding is a core XSS defence, but each context (HTML body, attribute, URL, CSS, JS) needs its own encoder. HTML-entity-encoding a value destined for a JS string doesn't help: in an inline event handler, for example, the browser decodes the entities before the script runs, so the payload still executes. Use a context-aware encoder for each output, and an HTML sanitiser such as DOMPurify when you need to render untrusted markup.
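The following sketch illustrates why the encoder has to match the context, using a value placed in a JS string inside an inline event handler. It assumes a hypothetical `greet(...)` handler and uses `html.unescape` as a stand-in for the entity decoding a browser performs on attribute values; it is an illustration, not a complete XSS defence.

```python
import html
import json

# Hypothetical attacker-controlled value.
payload = "'); alert(1); //"

# Naive: HTML-escape the value and place it in a JS string in an onclick attribute.
naive_attr = "greet('" + html.escape(payload) + "')"
# The browser decodes attribute entities before running the handler.
naive_js = html.unescape(naive_attr)

# Layered: JS-encode first (json.dumps yields a quoted string literal),
# then HTML-escape the whole handler for the attribute context.
safe_attr = html.escape("greet(" + json.dumps(payload) + ")")
safe_js = html.unescape(safe_attr)

print(naive_js)  # the quote closes the string early and alert(1) runs
print(safe_js)   # the payload stays inside one string literal
```

In the naive version the apostrophe comes back after decoding and terminates the JS string. In the layered version the inner encoder matches the JS context and the outer one matches the attribute context.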
Why do non-ASCII characters like é or 中 not get encoded?
They don't need to be, as long as your page is served as UTF-8 (which it should be). Legacy systems that can't declare a charset still use entities like &eacute;, but modern HTML5 documents pass literal Unicode through safely.
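If you do need ASCII-only output for a legacy system, one possible approach in Python is the `xmlcharrefreplace` error handler, which swaps every non-ASCII character for a decimal numeric entity. The sample text is made up, and this step does not touch `<`, `>` or `&`, so it would normally follow regular HTML escaping.

```python
import html

text = "café 中"

# Replace each character outside ASCII with a decimal numeric reference.
ascii_safe = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(ascii_safe)
print(html.unescape(ascii_safe) == text)
```

Decoding the ASCII-only form restores the original Unicode text.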
Will this break apostrophes inside JSON-in-HTML?
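The entity &#39; (or &apos;) is safe in HTML text and attributes, but the HTML parser does not decode entities inside a script element, so JSON.parse receives them as literal text: an encoded apostrophe ends up in the string value as-is, and an encoded double quote (&quot;) breaks parsing outright. If you're emitting JSON inside an HTML script tag, leave quotes alone and escape only the three characters that can break the script context (< > &), using JSON escapes such as \u003c rather than entities.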
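A minimal sketch of that approach, assuming the JSON is produced server-side in Python: serialise normally, then rewrite only `<`, `>` and `&` as JSON `\uXXXX` escapes so that quotes are untouched and the result is still valid JSON. The function name `json_for_script` is hypothetical.

```python
import json

def json_for_script(data) -> str:
    # Serialise, then neutralise the three characters that matter in a
    # script context. \uXXXX escapes are valid JSON, so parsing is unaffected.
    text = json.dumps(data)
    return (text.replace("<", "\\u003c")
                .replace(">", "\\u003e")
                .replace("&", "\\u0026"))

data = {"msg": "it's </script> & done"}
out = json_for_script(data)

print(out)
print(json.loads(out) == data)
```

The apostrophe stays literal, the closing-tag sequence can no longer end the script element early, and parsing the output returns the original data.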