🚧 under construction
JS ⟩ value ⟩ primitive ⟩ String ⟩ Unicode ⟩ encoding ⟩ UTF-16 ⟩ code unit
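A UTF-16 code unit is a 16-bit unsigned integer (0x0000 to 0xFFFF). Code points above U+FFFF do not fit in one code unit and are encoded as two code units, called a surrogate pair.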
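To make the 16-bit limit concrete, here is a minimal sketch of how a code point maps to code units. The helper name toUtf16CodeUnits is hypothetical (it is not a built-in), and it assumes its input is a valid code point outside the surrogate range.

```javascript
// Hypothetical helper (not a built-in): encode one code point as UTF-16 code units.
// Assumes codePoint is a valid Unicode code point outside 0xD800..0xDFFF.
function toUtf16CodeUnits(codePoint) {
  if (codePoint <= 0xffff) {
    return [codePoint]; // fits in a single 16-bit code unit
  }
  const offset = codePoint - 0x10000;     // 20 bits remain
  const high = 0xd800 + (offset >> 10);   // high surrogate carries the top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // low surrogate carries the bottom 10 bits
  return [high, low];
}

const unitsForA = toUtf16CodeUnits(0x41);        // "A"  → [0x41]
const unitsForEmoji = toUtf16CodeUnits(0x1f600); // "😀" → [0xd83d, 0xde00]

console.log(unitsForA, unitsForEmoji);
console.log(String.fromCharCode(...unitsForEmoji) === "\u{1F600}"); // true
```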
string.codePointAt(index)
string.slice(start, end)
string[index]
These all take indexes in code units, not characters, so an index can land in the middle of a surrogate pair.
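A short sketch of what code-unit indexing looks like in practice. The sample string is an assumption chosen for illustration: one emoji (two code units) between two ASCII letters.

```javascript
// "a", U+1F600 (😀, two code units), "b"  → 4 code units in total
const indexed = "a\u{1F600}b";

const unitAt1 = indexed[1];              // "\ud83d", a lone high surrogate
const unitAt2 = indexed[2];              // "\ude00", a lone low surrogate
const sliced = indexed.slice(0, 2);      // "a\ud83d", the pair is cut in half
const pointAt1 = indexed.codePointAt(1); // 0x1f600, index 1 starts the pair
const pointAt2 = indexed.codePointAt(2); // 0xde00, index 2 is mid-pair

console.log(unitAt1 === "\ud83d", unitAt2 === "\ude00");
console.log(sliced.length, pointAt1.toString(16), pointAt2.toString(16));
```

Note that codePointAt returns a whole code point only when the index points at the first half of a pair; the index itself still counts code units.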
string.split("") splits by UTF-16 code units, so it breaks surrogate pairs apart
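For comparison, a minimal sketch of split("") next to spread syntax, which uses the string iterator and therefore keeps surrogate pairs together. The sample string is an assumption for illustration.

```javascript
const word = "hi\u{1F600}"; // "hi😀"

const byCodeUnits = word.split(""); // ["h", "i", "\ud83d", "\ude00"]
const byCodePoints = [...word];     // ["h", "i", "😀"]

console.log(byCodeUnits.length);  // 4
console.log(byCodePoints.length); // 3

// Joining the code units back together restores the original string,
// but any per-element processing in between sees two broken halves.
console.log(byCodeUnits.join("") === word); // true
```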
By default (without the u flag), regular expressions match code units, not code points
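A small sketch of the difference the u flag makes, using the dot pattern on a single emoji. The sample string is an assumption for illustration.

```javascript
const face = "\u{1F600}"; // 😀, one code point, two code units

// Without the u flag, "." matches a single code unit.
const dotDefault = /^.$/.test(face);            // false
const matchesDefault = face.match(/./g).length; // 2

// With the u flag, "." matches a whole code point.
const dotUnicode = /^.$/u.test(face);            // true
const matchesUnicode = face.match(/./gu).length; // 1

console.log(dotDefault, matchesDefault, dotUnicode, matchesUnicode);
```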
(dangerous) use code units
string.length
string.charCodeAt(index)
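A brief sketch of why these two are risky on text outside the Basic Multilingual Plane: length counts code units, and charCodeAt returns one half of a surrogate pair. The sample string is an assumption for illustration.

```javascript
const emoji = "\u{1F600}"; // 😀 looks like one character

const unitCount = emoji.length;         // 2, not 1
const firstUnit = emoji.charCodeAt(0);  // 0xd83d (55357), high surrogate
const secondUnit = emoji.charCodeAt(1); // 0xde00 (56832), low surrogate

console.log(unitCount, firstUnit.toString(16), secondUnit.toString(16));
```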
(safe) use code points
string iteration (for-of loop)
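A minimal sketch of iterating by code point with for-of. The variable names and sample string are assumptions for illustration. Each step of the loop yields a whole code point, even when it spans two code units.

```javascript
const message = "a\u{1F600}b"; // "a😀b"

const codePoints = [];
for (const ch of message) {
  // ch is a full code point here, so codePointAt(0) is always safe
  codePoints.push(ch.codePointAt(0).toString(16));
}

console.log(codePoints);            // hex values: 61, 1f600, 62
console.log([...message].length);   // 3 code points
console.log(message.length);        // 4 code units
```

Iterating by code point still does not group combining sequences into a single user-perceived character; it only guarantees that surrogate pairs stay together.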
Unicode
Eloquent JavaScript ⟩ Strings & Character Codes
UTF-16 characters, code points, grapheme clusters
String.prototype ⟩
.charCodeAt()
.codePointAt()
String ⟩
.fromCodePoint()
.fromCharCode()