Question 1

What is the difference between Unicode and UTF-8?

Accepted Answer

Unicode is a standard that assigns a unique code point to each character across the world's writing systems (e.g., U+AC00 = 가). UTF-8 is one of several encoding forms that store those code points as bytes, using one to four bytes per code point. UTF-16 and UTF-32 are other encoding forms.
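The distinction is easy to see in code. The following minimal Python sketch takes the 가 example from the answer and shows one code point turning into different byte sequences under each encoding; the variable names are illustrative only.

```python
# One character, one code point, three different byte encodings.
ch = "가"

code_point = ord(ch)             # the Unicode code point as an integer
utf8 = ch.encode("utf-8")        # variable width: 3 bytes for this character
utf16 = ch.encode("utf-16-be")   # one 16-bit code unit (big-endian, no BOM)
utf32 = ch.encode("utf-32-be")   # one 32-bit code unit (big-endian, no BOM)

print(f"U+{code_point:04X}")     # U+AC00
print(utf8.hex())                # eab080
print(utf16.hex())               # ac00
print(utf32.hex())               # 0000ac00
```

The code point (U+AC00) never changes; only the bytes used to store it do.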

Question 2

What are surrogate pairs?

Accepted Answer

Characters above U+FFFF (emoji and other supplementary characters) do not fit in a single 16-bit code unit, so they cannot be written with a single 4-digit \uXXXX escape. UTF-16 encodes them as two code units: a high surrogate (U+D800–U+DBFF) followed by a low surrogate (U+DC00–U+DFFF). JSON \uXXXX escapes and Java strings represent supplementary characters the same way.
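As a rough illustration, the arithmetic behind a surrogate pair can be sketched in Python as follows. The function names to_surrogate_pair and from_surrogate_pair are hypothetical helpers written for this example, not part of any library.

```python
from typing import Tuple

def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogate_pair(0x1F600)   # U+1F600 is the 😀 emoji
print(f"\\u{high:04X}\\u{low:04X}")      # \uD83D\uDE00
```

Recombining the pair with from_surrogate_pair(0xD83D, 0xDE00) gives back 0x1F600.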

Question 3

Is Unicode escaping required in JSON?

Accepted Answer

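No. The JSON spec allows but does not require escaping non-ASCII characters as \uXXXX; only the quotation mark, the backslash, and control characters (U+0000–U+001F) must be escaped. If the document is stored as UTF-8, non-ASCII characters can appear as-is. Escaping is still useful in ASCII-only environments or for debugging.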
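Python's standard json module shows both options side by side. This is a small sketch with made-up sample data; json.dumps escapes non-ASCII characters by default, and ensure_ascii=False leaves them as-is.

```python
import json

data = {"greeting": "가"}   # sample data for illustration

escaped = json.dumps(data)                   # non-ASCII escaped as \uXXXX
raw = json.dumps(data, ensure_ascii=False)   # non-ASCII kept as-is

print(escaped)   # {"greeting": "\uac00"}
print(raw)       # {"greeting": "가"}

# Both forms are valid JSON and decode to the same value.
same = json.loads(escaped) == json.loads(raw) == data
```

Either string can be sent to a JSON parser; the escaped form is the safer choice when the transport or storage layer only handles ASCII.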

Question 4

Are ASCII characters also Unicode?

Accepted Answer

Yes. The first 128 Unicode code points (U+0000–U+007F) are identical to ASCII. For example, A is U+0041 and a is U+0061. This tool encodes all characters, including ASCII, as \uXXXX.
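A minimal Python sketch of this kind of escaping is shown below. It is not the tool's actual implementation: the function names escape_all and unescape_all are hypothetical, and the uppercase hex digits and the use of surrogate pairs for supplementary characters are assumptions made for the example.

```python
def escape_all(text: str) -> str:
    """Escape every character, ASCII included, as \\uXXXX (assumed format)."""
    units = text.encode("utf-16-be")   # 2 bytes per UTF-16 code unit
    return "".join(
        f"\\u{int.from_bytes(units[i:i + 2], 'big'):04X}"
        for i in range(0, len(units), 2)
    )

def unescape_all(escaped: str) -> str:
    """Decode a string made up only of \\uXXXX escapes."""
    units = bytes.fromhex(escaped.replace("\\u", ""))
    return units.decode("utf-16-be")

print(escape_all("Aa"))                 # \u0041\u0061
print(escape_all("가"))                 # \uAC00
print(unescape_all("\\u0041\\u0061"))   # Aa
```

Because the escaping works on UTF-16 code units, a character above U+FFFF comes out as two escapes, one for each surrogate.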

Unicode Encoder/Decoder

