Unescape HTML entities in JavaScript?
To unescape HTML entities in JavaScript, we first need to separate two things that are often confused: URL percent-decoding (what unescape() does) and HTML entity decoding. HTML entities like &lt;, &gt;, and &amp; require a different approach than URL-encoded sequences like %20.
The legacy unescape() function only decodes percent-encoded sequences, not HTML entities. For true HTML entity decoding, we use DOM-based methods such as the innerHTML property or DOMParser.
Syntax
The legacy unescape() function for URL decoding has the following syntax:
unescape(encodedString)
HTML entity decoding with DOM methods follows one of these two patterns:
// Method 1: Using innerHTML
var element = document.createElement('div');
element.innerHTML = htmlString;
var decoded = element.textContent;
// Method 2: Using DOMParser
var parser = new DOMParser();
var decoded = parser.parseFromString(htmlString, 'text/html').documentElement.textContent;
Understanding URL Decoding vs HTML Entity Decoding
The unescape() function decodes percent-encoded sequences (like %20 for a space) but does not handle HTML entities (like &lt; for <). Let's clarify this distinction with examples.
Example URL Percent Decoding
The following page uses the legacy unescape() function to decode a URL-encoded string:
<!DOCTYPE html>
<html>
<head>
<title>URL Percent Decoding</title>
</head>
<body style="font-family: Arial, sans-serif; padding: 10px;">
<h2>URL Percent Decoding Example</h2>
<div id="output"></div>
<script>
// URL encoded string with %20 (space), %21 (!)
var urlEncoded = "Demo%20Text%21%21%21";
var decoded = unescape(urlEncoded);
document.getElementById('output').innerHTML =
"Original: " + urlEncoded + "<br>" +
"Decoded: " + decoded;
</script>
</body>
</html>
The output shows the percent-encoded characters being decoded:
Original: Demo%20Text%21%21%21
Decoded: Demo Text!!!
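To see the limitation directly, the short sketch below passes unescape() a string that mixes percent-encoding with HTML entities. The sample string is made up for illustration; the snippet should run in any JavaScript engine that still ships the legacy function.

```javascript
// Illustrative input: HTML entities wrapped around percent-encoded text
var mixed = "&lt;b&gt;Demo%20Text%21&lt;/b&gt;";

// unescape() rewrites only the %XX sequences and leaves the entities alone
var result = unescape(mixed);

console.log(result); // &lt;b&gt;Demo Text!&lt;/b&gt;
```

The entities survive unchanged, which is why a separate decoding step is needed for them.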
True HTML Entity Decoding
To decode actual HTML entities like &lt;, &gt;, &amp;, and &quot;, we need DOM-based methods, since unescape() doesn't handle these.
Method 1 Using innerHTML Property
The most common approach assigns the string to the innerHTML property of a temporary DOM element and reads back its textContent:
<!DOCTYPE html>
<html>
<head>
<title>HTML Entity Decoding</title>
</head>
<body style="font-family: Arial, sans-serif; padding: 10px;">
<h2>HTML Entity Decoding Example</h2>
<div id="output"></div>
<script>
function unescapeHTML(htmlString) {
var element = document.createElement('div');
element.innerHTML = htmlString;
return element.textContent || element.innerText || '';
}
// String containing HTML entities
var htmlEntities = "&lt;div&gt;Hello &amp; Welcome&lt;/div&gt;";
var decoded = unescapeHTML(htmlEntities);
// innerText shows both strings literally; innerHTML would parse them as markup again
document.getElementById('output').innerText =
"HTML Entities: " + htmlEntities + "\n" +
"Decoded: " + decoded;
</script>
</body>
</html>
The output shows the HTML entities being decoded:
HTML Entities: &lt;div&gt;Hello &amp; Welcome&lt;/div&gt;
Decoded: <div>Hello & Welcome</div>
Method 2 Using DOMParser
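Another approach uses the DOMParser API. It parses the string into a separate, inert document, so scripts in the input do not run and resources such as images are not loaded, which makes it the safer choice for untrusted strings: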
<!DOCTYPE html>
<html>
<head>
<title>DOMParser HTML Decoding</title>
</head>
<body style="font-family: Arial, sans-serif; padding: 10px;">
<h2>DOMParser Decoding Example</h2>
<div id="output"></div>
<script>
function unescapeHTMLWithParser(htmlString) {
var parser = new DOMParser();
var doc = parser.parseFromString(htmlString, 'text/html');
return doc.documentElement.textContent;
}
var htmlEntities = "Price: &pound;100 &amp; tax: &euro;20";
var decoded = unescapeHTMLWithParser(htmlEntities);
// innerText shows both strings literally; innerHTML would decode the entities again
document.getElementById('output').innerText =
"HTML Entities: " + htmlEntities + "\n" +
"Decoded: " + decoded;
</script>
</body>
</html>
The output demonstrates decoding of currency symbols and special characters:
HTML Entities: Price: &pound;100 &amp; tax: &euro;20
Decoded: Price: £100 & tax: €20
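Both methods above depend on a browser DOM. Where no DOM is available (for example in a plain Node.js script), a small hand-written decoder can cover the simplest cases. The following is only a minimal sketch: the function name decodeBasicEntities and the lookup table are hypothetical, and the table deliberately lists just a handful of named entities, while numeric references such as &#8364; are handled generically.

```javascript
// Hypothetical lookup table: only a few named entities are covered here.
var NAMED_ENTITIES = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  pound: "\u00A3",
  euro: "\u20AC"
};

function decodeBasicEntities(text) {
  // Matches &name;, &#123; and &#x1F; in a single pass, so "&amp;lt;" becomes "&lt;", not "<"
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing a RangeError
      return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are returned unchanged
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeBasicEntities("Price: &pound;100 &amp; tax: &#8364;20"));
// Price: £100 & tax: €20
```

HTML defines far more named entities than this table holds, so in a browser the DOM-based methods remain the more complete option.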
Complete Example Encoding and Decoding
The following example shows both encoding and decoding using DOM methods:
<!DOCTYPE html>
<html>
<head>
<title>Complete Encoding/Decoding Example</title>
</head>
<body style="font-family: Arial, sans-serif; padding: 10px;">
<h2>HTML Encoding and Decoding</h2>
<div id="output"></div>
<script>
function escapeHTML(text) {
var div = document.createElement('div');
div.textContent = text;
return div.innerHTML;
}
function unescapeHTML(htmlString) {
var div = document.createElement('div');
div.innerHTML = htmlString;
return div.textContent || div.innerText || '';
}
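// Original string with special characters.
// The closing tag is written as <\/script> so it does not end this script block early.
var originalText = "<script>alert('Hello');<\/script>";
var encoded = escapeHTML(originalText);
var decoded = unescapeHTML(encoded);
// innerText displays each string literally instead of parsing it as markup
document.getElementById('output').innerText =
"Original: " + originalText + "\n" +
"Encoded: " + encoded + "\n" +
"Decoded: " + decoded;
</script>
</body>
</html>
The output shows the complete cycle of encoding and decoding HTML entities:
Original: <script>alert('Hello');</script>
Encoded: &lt;script&gt;alert('Hello');&lt;/script&gt;
Decoded: <script>alert('Hello');</script>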
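The DOM-based escapeHTML() above escapes &, < and > but leaves quote characters as they are, which matters if the result is placed inside an attribute value. A string-based variant can be sketched as follows; the name escapeHTMLString and the sample input are assumptions made for illustration, not part of the original example.

```javascript
// Minimal sketch: escape the five characters that are significant in HTML text and attributes.
// The ampersand must be replaced first so the entities added afterwards are not escaped again.
function escapeHTMLString(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

var escaped = escapeHTMLString("<a href=\"x\">Tom & Jerry's</a>");
console.log(escaped);
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Because it does not touch the DOM, this version also works outside the browser.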
Modern Alternatives to unescape()
The unescape() function is deprecated in modern JavaScript. For URL decoding, use decodeURIComponent() instead.
Example Modern URL Decoding
<!DOCTYPE html>
<html>
<head>
<title>Modern URL Decoding</title>
</head>
<body style="font-family: Arial, sans-serif; padding: 10px;">
<h2>Modern vs Legacy URL Decoding</h2>
<div id="output"></div>
<script>
var urlEncoded = "Hello%20World%21%20%40TutorialsPoint";
// Modern approach (recommended)
var modernDecoded = decodeURIComponent(urlEncoded);
// Legacy approach (deprecated)
var legacyDecoded = unescape(urlEncoded);
document.getElementById('output').innerHTML =
"URL Encoded: " + urlEncoded + "<br>" +
"Modern decode: " + modernDecoded + "<br>" +
"Legacy decode: " + legacyDecoded;
</script>
</body>
</html>
Both methods produce the same result for this ASCII-only input, but decodeURIComponent() is the modern standard:
URL Encoded: Hello%20World%21%20%40TutorialsPoint
Modern decode: Hello World! @TutorialsPoint
Legacy decode: Hello World! @TutorialsPoint
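The example above contains only ASCII characters, which hides the practical difference between the two functions. The sketch below uses a made-up string with non-ASCII characters: decodeURIComponent() reads the escaped bytes as UTF-8, while unescape() turns every %XX into its own character. It also shows that decodeURIComponent() throws a URIError on malformed input; the safeDecode() wrapper is a hypothetical helper that falls back to the original string in that case.

```javascript
var encoded = encodeURIComponent("café €5");
console.log(encoded); // caf%C3%A9%20%E2%82%AC5

// UTF-8 aware: the multi-byte sequences become single characters again
var modern = decodeURIComponent(encoded);
console.log(modern); // café €5

// Legacy: each %XX is treated as one Latin-1 character, so the text is garbled
var legacy = unescape(encoded);
console.log(legacy === modern); // false

// Hypothetical helper: return the input unchanged when it is not valid percent-encoding
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) {
      return value;
    }
    throw err;
  }
}

console.log(safeDecode("100%")); // 100%
```

Whether falling back to the raw string is appropriate depends on the application; rejecting the input may be the better choice when it comes from an untrusted source.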
Comparison of Methods
| Method | Use Case | Status | Example Input | Example Output |
|---|---|---|---|---|
| unescape() | URL percent encoding | Deprecated | Hello%20World | Hello World |
| decodeURIComponent() | URL percent encoding | Modern standard | Hello%20World | Hello World |
| innerHTML | HTML entities | Standard DOM method | &lt;div&gt; | <div> |
| DOMParser | HTML entities | Modern, safer | | |
Conclusion
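While unescape() only handles URL percent-encoded strings, true HTML entity decoding requires DOM-based methods such as the innerHTML property or DOMParser. For URL decoding, prefer decodeURIComponent() over the deprecated unescape().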
