
Unicode Encoder/Decoder

Convert text to Unicode escape sequences or decode them back.

How It Works

A Unicode escape represents a character by its code point, written in hexadecimal in the \uXXXX format.

Common use cases:
• Representing special characters in source code
• Encoding non-ASCII characters in JSON data
• Internationalization (i18n) tasks

Encoding formats:
• Basic Multilingual Plane (BMP) characters: \uXXXX (4-digit hex)
• Supplementary characters (emoji, etc.): \uXXXX\uXXXX (surrogate pair, JSON/Java compatible)
• All characters, including ASCII, are converted to the \uXXXX format.

Decoding supports: \uXXXX and \u{XXXXX}
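The encoding and decoding rules above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not this tool's actual implementation: the function names `encode_unicode_escapes` and `decode_unicode_escapes` are hypothetical, and the choice to leave out-of-range `\u{...}` escapes untouched is an assumption.

```python
import re

def encode_unicode_escapes(text: str) -> str:
    # Hypothetical helper: escape every character, ASCII included.
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFFFF:
            # BMP character: one 4-digit escape.
            parts.append(f"\\u{cp:04X}")
        else:
            # Supplementary character: split into a surrogate pair.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            parts.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(parts)

# Matches \u{X...} (1 to 6 hex digits) or \uXXXX (exactly 4 hex digits).
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})")

def decode_unicode_escapes(text: str) -> str:
    # Hypothetical helper: accept both \uXXXX and \u{XXXXX} forms.
    def _replace(match):
        hex_digits = match.group(1) or match.group(2)
        code_point = int(hex_digits, 16)
        if code_point > 0x10FFFF:
            # Assumption: leave out-of-range escapes as written.
            return match.group(0)
        return chr(code_point)

    decoded = _ESCAPE.sub(_replace, text)
    # Round-trip through UTF-16 so that an adjacent high/low surrogate
    # pair is merged into a single supplementary character; lone
    # surrogates are passed through unchanged.
    return decoded.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass"
    )

print(encode_unicode_escapes("Hi!"))
print(decode_unicode_escapes("\\uD83D\\uDE00"))
```

The decoder handles surrogate pairs in a second pass so that the regular expression stays simple; pairing the surrogates directly in the pattern would work equally well.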

Example

Input: 안녕하세요
Encoded: \uC548\uB155\uD558\uC138\uC694

Input: Hi!
Encoded: \u0048\u0069\u0021

FAQ

What is the difference between Unicode and UTF-8?
Unicode is a standard that assigns a unique code point to every character worldwide (e.g., U+AC00 = 가). UTF-8 is one of several encoding formats that store those code points as bytes. UTF-16 and UTF-32 are other formats.
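To make the distinction concrete, the short Python sketch below takes the single code point U+AC00 and shows the different byte sequences that UTF-8, UTF-16, and UTF-32 produce for it. The variable names are illustrative only.

```python
ch = "가"

# The code point is a property of the character, independent of encoding.
code_point = ord(ch)
print(f"U+{code_point:04X}")  # U+AC00

# The same code point stored as bytes in three encoding formats
# (big-endian variants are used so that no byte order mark is added).
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")
utf32_bytes = ch.encode("utf-32-be")

print(utf8_bytes.hex(" "))   # ea b0 80
print(utf16_bytes.hex(" "))  # ac 00
print(utf32_bytes.hex(" "))  # 00 00 ac 00
```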
What are surrogate pairs?
Characters above U+FFFF (emoji and other supplementary characters) cannot be represented with a single 4-digit Unicode escape. They are encoded as two code units: a high surrogate (U+D800–U+DBFF) followed by a low surrogate (U+DC00–U+DFFF). This is the standard encoding method for supplementary characters in JSON and Java.
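The arithmetic behind a surrogate pair is short enough to show directly. The Python sketch below follows the standard UTF-16 formula: subtract 0x10000 from the code point, then put the top 10 bits in the high surrogate and the bottom 10 bits in the low surrogate. The function names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical.

```python
from typing import Tuple

def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    # Only supplementary code points (U+10000 to U+10FFFF) need a pair.
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

def from_surrogate_pair(high: int, low: int) -> int:
    # Inverse operation: rebuild the code point from a valid pair.
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogate_pair(0x1F600)     # 😀 is U+1F600
print(f"\\u{high:04X}\\u{low:04X}")        # \uD83D\uDE00
```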
Is Unicode escaping required in JSON?
No. The JSON spec allows but does not require escaping non-ASCII characters as \uXXXX. If the JSON text is stored as UTF-8, non-ASCII characters can appear as-is. However, escaping is useful in ASCII-only environments or for debugging.
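As one illustration, Python's standard `json` module offers both behaviors through the `ensure_ascii` parameter of `json.dumps`. The sample data below is made up for the example; note that Python emits lowercase hex digits, which JSON parsers treat the same as uppercase.

```python
import json

data = {"greeting": "안녕"}

# Default (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(data)
print(escaped)  # {"greeting": "\uc548\ub155"}

# ensure_ascii=False: characters are written as-is.
raw = json.dumps(data, ensure_ascii=False)
print(raw)      # {"greeting": "안녕"}

# Both forms parse back to the same value.
same = json.loads(escaped) == json.loads(raw) == data
print(same)     # True
```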
Are ASCII characters also Unicode?
Yes. The first 128 Unicode code points (U+0000–U+007F) are identical to ASCII. For example, A is U+0041 and a is U+0061. This tool encodes all characters including ASCII as \uXXXX.
