Unicode Property Escapes JavaScript Regular Expressions
Unicode property escapes let a JavaScript regular expression match characters by their Unicode properties, such as general category (letter, number), script (Latin, Han), or binary properties (emoji presentation). They are only recognized when the pattern is compiled with the u flag.
Syntax
/\p{PropertyName}/u
/\P{PropertyName}/u // Negated form
The \p{} matches characters with the specified property, while \P{} matches characters WITHOUT that property.
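A minimal sketch of the two forms, using Uppercase_Letter as the property. The sample string is made up for illustration. The last check shows what happens when the u flag is left off: \p is then read as an ordinary escaped "p", not as a property escape.

```javascript
// Sample string chosen for illustration only.
const sample = "aB1";

// \p{...} matches characters that HAVE the property.
const upper = sample.match(/\p{Uppercase_Letter}/gu);
console.log(upper); // [ 'B' ]

// \P{...} matches characters that do NOT have it.
const notUpper = sample.match(/\P{Uppercase_Letter}/gu);
console.log(notUpper); // [ 'a', '1' ]

// Without the u flag, the pattern looks for the literal text
// "p{Uppercase_Letter}", so it does not match "B".
const withoutFlag = /\p{Uppercase_Letter}/.test("B");
console.log(withoutFlag); // false
```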
Common Unicode Properties
| Property | Description | Example Characters |
|---|---|---|
| Letter | Any letter | A, B, ß, 中, я |
| Number | Any numeric character | 1, 2, ٣, ½ |
| Emoji_Presentation | Characters displayed as emoji by default | 😀, 🎉, 🚀 |
| Script=Latin | Latin script | A-Z, a-z |
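As a rough check of the properties in the table, the sketch below tests one sample character against each of them. The helper name hasProperty and the sample characters are illustrative choices, not part of any built-in API.

```javascript
// Hypothetical helper: builds a property-escape regex from a property
// name and tests whether a single character has that property.
function hasProperty(char, property) {
  return new RegExp(`^\\p{${property}}$`, "u").test(char);
}

console.log(hasProperty("中", "Letter"));              // true
console.log(hasProperty("٣", "Number"));              // true (Arabic-Indic digit three)
console.log(hasProperty("🚀", "Emoji_Presentation")); // true
console.log(hasProperty("a", "Script=Latin"));        // true
console.log(hasProperty("я", "Script=Latin"));        // false (Cyrillic letter)

// General categories also have short aliases, e.g. L for Letter.
console.log(hasProperty("中", "L"));                   // true
```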
Example: Extracting Emojis
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Unicode Property Escapes</title>
</head>
<body>
<div id="text">Hello 😀 World 🎉 123!</div>
<button onclick="extractEmojis()">Extract Emojis</button>
<div id="result"></div>
<script>
function extractEmojis() {
const text = document.getElementById('text').textContent;
const emojiRegex = /\p{Emoji_Presentation}/gu;
const emojis = text.match(emojiRegex);
document.getElementById('result').textContent =
'Found emojis: ' + (emojis ? emojis.join(' ') : 'None');
}
</script>
</body>
</html>
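One caveat worth sketching: Emoji_Presentation only covers characters that render as emoji by default, so it can miss symbols such as the heart U+2764, while the broader Emoji property also matches digits, "#", and "*". The comparison below assumes an engine recent enough to know the Extended_Pictographic property; the sample string is made up for illustration.

```javascript
// Sample text: a heart that is text-style by default (U+2764),
// a digit, and a rocket (U+1F680).
const sample = "\u2764 7 \u{1F680}";

// Emoji_Presentation: only characters shown as emoji by default.
const byPresentation = sample.match(/\p{Emoji_Presentation}/gu);
console.log(byPresentation); // [ '🚀' ]

// Emoji: broader, and it also includes ASCII digits.
const byEmoji = sample.match(/\p{Emoji}/gu);
console.log(byEmoji); // [ '❤', '7', '🚀' ]

// Extended_Pictographic: pictographs without the digits.
const byPictographic = sample.match(/\p{Extended_Pictographic}/gu);
console.log(byPictographic); // [ '❤', '🚀' ]
```

Which property fits depends on the text being processed; for user-facing emoji extraction, Extended_Pictographic is often closer to what people expect than Emoji_Presentation alone.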
Example: Matching Letters and Numbers
const text = "Hello 世界 123 мир !@#";
// Match all letters
const letters = text.match(/\p{Letter}/gu);
console.log("Letters:", letters);
// Match all numbers
const numbers = text.match(/\p{Number}/gu);
console.log("Numbers:", numbers);
// Match non-letters (negated)
const nonLetters = text.match(/\P{Letter}/gu);
console.log("Non-letters:", nonLetters);
Letters: [ 'H', 'e', 'l', 'l', 'o', '世', '界', 'м', 'и', 'р' ]
Numbers: [ '1', '2', '3' ]
Non-letters: [ ' ', ' ', '1', '2', '3', ' ', ' ', '!', '@', '#' ]
Example: Script-Specific Matching
const mixedText = "Hello नमस्ते 你好世界";
// Match Latin script
const latin = mixedText.match(/\p{Script=Latin}/gu);
console.log("Latin:", latin.join(''));
// Match Devanagari (Hindi)
const devanagari = mixedText.match(/\p{Script=Devanagari}/gu);
console.log("Devanagari:", devanagari.join(''));
// Match Han (Chinese)
const han = mixedText.match(/\p{Script=Han}/gu);
console.log("Han:", han.join(''));
Latin: Hello
Devanagari: नमस्ते
Han: 你好世界
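When extracting whole words rather than single characters, note that in scripts such as Devanagari the vowel signs and the virama are combining marks (category Mark), not letters. The sketch below, with a made-up sample string, shows how \p{Letter} alone splits such a word and how adding \p{Mark} keeps it together.

```javascript
const text = "Hello नमस्ते 你好";

// \p{Letter} alone breaks the Devanagari word at its combining marks.
const lettersOnly = text.match(/\p{Letter}+/gu);
console.log(lettersOnly); // [ 'Hello', 'नमस', 'त', '你好' ]

// Allowing \p{Mark} as well keeps each word in one piece.
const words = text.match(/[\p{Letter}\p{Mark}]+/gu);
console.log(words); // [ 'Hello', 'नमस्ते', '你好' ]
```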
Key Points
- Always use the u flag for Unicode property escapes to work
- Use \p{} for positive matching and \P{} for negative matching
- Property names are case-sensitive
- Supports both general categories (Letter, Number) and specific scripts (Latin, Han)
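The case-sensitivity point can be checked directly: with the u flag, an unknown or mis-cased property name makes the pattern a SyntaxError. The helper name isValidProperty below is a hypothetical one chosen for this sketch.

```javascript
// Hypothetical helper: reports whether the engine accepts a property name.
function isValidProperty(name) {
  try {
    new RegExp(`\\p{${name}}`, "u");
    return true;
  } catch (err) {
    // Unknown or mis-cased names are rejected when the pattern is compiled.
    return false;
  }
}

console.log(isValidProperty("Letter"));       // true
console.log(isValidProperty("letter"));       // false (wrong case)
console.log(isValidProperty("Script=Latin")); // true
console.log(isValidProperty("script=latin")); // false (wrong case)
```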
Browser Compatibility
Unicode property escapes are supported in modern browsers (Chrome 64+, Firefox 78+, Safari 11.1+). Not supported in Internet Explorer.
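If older engines still matter, one possible approach is runtime feature detection. The pattern has to be built with the RegExp constructor, because a regex literal containing \p{...} with the u flag would fail when the script is parsed in an engine without support. The function name and the ASCII-only fallback below are assumptions for illustration, and the fallback is deliberately lossy.

```javascript
// Returns true when the engine can compile a Unicode property escape.
function supportsUnicodePropertyEscapes() {
  try {
    new RegExp("\\p{Letter}", "u");
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical fallback: ASCII letters only, used when support is missing.
const letterPattern = supportsUnicodePropertyEscapes()
  ? new RegExp("\\p{Letter}", "gu")
  : /[A-Za-z]/g;

// In an engine with support, this finds all six letters, including "ß".
const found = "Straße".match(letterPattern);
console.log(found);
```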
Conclusion
Unicode property escapes provide powerful character matching capabilities based on Unicode properties. They're essential for internationalized applications and precise text processing across different languages and scripts.
