🔰surrogate pair
two-code-unit character.
Last updated
two-code-unit character.
Last updated
JS ⟩ value ⟩ primitive ⟩ String ⟩ Unicode ⟩ encoding ⟩ UTF-16 ⟩ surrogate pair
code points beyond the basic multilingual plane (also called astral code points) that are transformed into two UTF-16 code units.
two parts of the pair must be in 0xD800
~ 0xDFFF
. (2^11 = 0x800)
high-surrogate code unit : 0xD800 ~ 0xDBFF. (2^10 = 0x400)
low-surrogate code unit : 0xDC00 ~ 0xDFFF. (2^10 = 0x400)
these code units (called "lone surrogates") are not used to encode single-code-unit characters.