Unicode Encoder/Decoder
Convert text to Unicode escape sequences or decode them back.
How It Works
A Unicode escape represents a character by its code point, written as \uXXXX, where XXXX is four hexadecimal digits.
Common use cases:
• Representing special characters in source code
• Encoding non-ASCII characters in JSON data
• Internationalization (i18n) tasks
Encoding formats:
• Basic Multilingual Plane (BMP): \uXXXX (4-digit hex)
• Supplementary characters (emoji, etc.): \uXXXX\uXXXX (surrogate pair, JSON/Java compatible)
• All characters (including ASCII) are converted to \uXXXX format.
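The encoding rules above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function name unicode_escape is hypothetical, and uppercase hex digits are an assumption. It relies on the fact that \uXXXX escapes carry UTF-16 code units, so encoding the text as UTF-16 yields the surrogate pairs automatically.

```python
def unicode_escape(text: str) -> str:
    # Hypothetical helper: escape every character, ASCII included.
    # UTF-16 (big-endian, no BOM) gives one 2-byte unit per BMP character
    # and two units (a surrogate pair) per supplementary character.
    data = text.encode("utf-16-be", "surrogatepass")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    # Assumption: 4-digit uppercase hex, one \uXXXX per code unit.
    return "".join(f"\\u{unit:04X}" for unit in units)


print(unicode_escape("A가😀"))  # \u0041\uAC00\uD83D\uDE00
```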
Decoding supports: \uXXXX · \u{XXXXX} (the braced code point form used in ECMAScript 2015 and later)
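A decoder for both forms might look like the following sketch. The name unicode_unescape and the choice to accept 1 to 6 hex digits inside the braces are assumptions, not documented behavior of this tool. Escapes that name a value above U+10FFFF are left unchanged.

```python
import re

# Matches \u{X...} (1 to 6 hex digits, an assumption) or \uXXXX (exactly 4).
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")


def unicode_unescape(text: str) -> str:
    def replace(match: re.Match) -> str:
        code_point = int(match.group(1) or match.group(2), 16)
        if code_point > 0x10FFFF:
            return match.group(0)  # not a valid code point; keep as written
        return chr(code_point)

    decoded = _ESCAPE.sub(replace, text)
    # \uD83D\uDE00 decodes to two separate surrogates at this point.
    # A round trip through UTF-16 joins each pair into one character;
    # "surrogatepass" lets unpaired surrogates through instead of failing.
    raw = decoded.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "surrogatepass")


print(unicode_unescape(r"\u0041\uAC00\uD83D\uDE00"))  # A가😀
print(unicode_unescape(r"\u{1F600}"))                 # 😀
```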
Example
FAQ
What is the difference between Unicode and UTF-8?
Unicode is a standard that assigns a unique code point to each character it covers (e.g., U+AC00 = 가). UTF-8 is one of several encoding formats that store those code points as bytes; UTF-16 and UTF-32 are others.
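To make the distinction concrete, the sketch below prints the single code point for 가 next to its byte sequences in three encodings. The helper name show_encodings is made up for this example, and bytes.hex with a separator needs Python 3.8 or later.

```python
def show_encodings(ch: str) -> dict:
    # One code point, three different byte representations.
    return {
        "code point": f"U+{ord(ch):04X}",
        "UTF-8": ch.encode("utf-8").hex(" "),
        "UTF-16": ch.encode("utf-16-be").hex(" "),
        "UTF-32": ch.encode("utf-32-be").hex(" "),
    }


for name, value in show_encodings("가").items():
    print(f"{name}: {value}")
# code point: U+AC00
# UTF-8: ea b0 80
# UTF-16: ac 00
# UTF-32: 00 00 ac 00
```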
What are surrogate pairs?
Characters above U+FFFF (emoji and other supplementary characters) cannot be represented with a single 4-digit Unicode escape. They are encoded as two code units: a high surrogate (U+D800–U+DBFF) followed by a low surrogate (U+DC00–U+DFFF). This is how UTF-16 represents supplementary characters, and it is the form that \u escapes take in JSON and Java.
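The arithmetic behind a surrogate pair is short enough to show directly. In this sketch the function names are illustrative only: subtract 0x10000 from the code point, put the upper 10 bits into the high surrogate and the lower 10 bits into the low surrogate.

```python
def to_surrogates(code_point: int) -> tuple:
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    # Inverse of to_surrogates.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)         # U+1F600 is 😀
print(f"\\u{high:04X}\\u{low:04X}")        # \uD83D\uDE00
print(f"U+{from_surrogates(high, low):X}") # U+1F600
```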
Is Unicode escaping required in JSON?
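No. The JSON spec allows but does not require escaping non-ASCII characters as \uXXXX; only the quotation mark, the backslash, and control characters (U+0000–U+001F) must be escaped. If the JSON is stored as UTF-8, non-ASCII characters can appear as-is. However, escaping is useful in ASCII-only environments or for debugging.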
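Python's standard json module illustrates both options. By default it escapes non-ASCII characters (using lowercase hex digits), and with ensure_ascii=False it writes them as-is. Both strings parse back to the same data.

```python
import json

data = {"greeting": "가"}

escaped = json.dumps(data)                   # ASCII-only output
raw = json.dumps(data, ensure_ascii=False)   # non-ASCII kept as-is

print(escaped)  # {"greeting": "\uac00"}
print(raw)      # {"greeting": "가"}

# Either form decodes to the same value.
assert json.loads(escaped) == json.loads(raw) == data
```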
Are ASCII characters also Unicode?
Yes. The first 128 Unicode code points (U+0000–U+007F) are identical to ASCII. For example, A is U+0041 and a is U+0061. This tool encodes all characters including ASCII as \uXXXX.
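The overlap can be checked mechanically. This small sketch (the function name is hypothetical) confirms that each of the 128 ASCII byte values decodes to the Unicode code point with the same number, and that UTF-8 stores those code points as the same single byte.

```python
def ascii_matches_unicode() -> bool:
    for value in range(128):
        as_ascii = bytes([value]).decode("ascii")
        if ord(as_ascii) != value:
            return False
        if chr(value).encode("utf-8") != bytes([value]):
            return False
    return True


print(ascii_matches_unicode())                        # True
print(f"A = U+{ord('A'):04X}, a = U+{ord('a'):04X}")  # A = U+0041, a = U+0061
```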