🔰Unicode
🚧 under construction -> tidy this page
every Unicode character is assigned a code point.
code points range from U+0000 to U+10FFFF and are divided into 17 planes of 65,536 code points each.
one or more code points can be combined into a single grapheme cluster.
character encoding transforms code points into code units.
JavaScript strings are sequences of UTF-16 code units, so a code point above U+FFFF takes two code units (a surrogate pair).
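The four terms above (code point, plane, code unit, grapheme cluster) can be told apart with built-in string APIs. The following is a minimal sketch, not a definitive implementation: the helper names `planeOf`, `codePointsOf`, `codeUnitsOf` and `graphemesOf` are hypothetical, and `graphemesOf` assumes a runtime that ships `Intl.Segmenter`.

```javascript
// Hypothetical helpers for illustration; only built-in String and Intl APIs are used.

// Each plane holds 0x10000 (65,536) code points, so the plane index
// is the code point shifted right by 16 bits.
function planeOf(codePoint) {
  return codePoint >> 16;
}

// Array.from iterates a string by code point, not by code unit.
function codePointsOf(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// .length and charCodeAt() work on UTF-16 code units.
function codeUnitsOf(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    units.push(text.charCodeAt(i));
  }
  return units;
}

// Assumption: Intl.Segmenter is available in the runtime.
function graphemesOf(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const face = "\u{1F600}"; // U+1F600, outside plane 0
console.log(planeOf(face.codePointAt(0))); // 1
console.log(face.length); // 2 (one surrogate pair)
console.log(codeUnitsOf(face).map((u) => u.toString(16)).join(" ")); // d83d de00

const eAcute = "e\u0301"; // "e" followed by a combining acute accent
console.log(codePointsOf(eAcute).length); // 2
console.log(graphemesOf(eAcute).length); // 1
```

The second example shows why `length` is a poor measure of what a reader sees: two code points render as one grapheme cluster.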
• Script (a writing system) = Cyrillic, Greek, Arabic, Han (Chinese), ...
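Scripts can be tested in JavaScript with Unicode property escapes, which need the `u` flag on the regular expression. A small sketch, assuming only the four scripts named above matter; `scriptOf` is a hypothetical helper, not a built-in.

```javascript
// \p{Script=...} only works in regular expressions with the u (or v) flag.
const scriptPatterns = {
  Cyrillic: /^\p{Script=Cyrillic}$/u,
  Greek: /^\p{Script=Greek}$/u,
  Arabic: /^\p{Script=Arabic}$/u,
  Han: /^\p{Script=Han}$/u,
};

// Hypothetical helper: returns the first matching script name from the
// table above, or null when the character belongs to none of them.
function scriptOf(ch) {
  for (const [name, pattern] of Object.entries(scriptPatterns)) {
    if (pattern.test(ch)) return name;
  }
  return null;
}

console.log(scriptOf("\u0416")); // Cyrillic (Ж)
console.log(scriptOf("\u03BB")); // Greek (λ)
console.log(scriptOf("\u0628")); // Arabic (ب)
console.log(scriptOf("\u6F22")); // Han (漢)
console.log(scriptOf("a")); // null (Latin is not in the table)
```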
main categories and subcategories
• Letter (L): lowercase (Ll), modifier (Lm), titlecase (Lt), uppercase (Lu), other (Lo).
• Number (N): decimal digit (Nd), letter number (Nl), other (No).
• Punctuation (P): connector (Pc), dash (Pd), initial quote (Pi), final quote (Pf), open (Ps), close (Pe), other (Po).
• Mark (M, accents etc.): spacing combining (Mc), enclosing (Me), non-spacing (Mn).
• Symbol (S): currency (Sc), modifier (Sk), math (Sm), other (So).
• Separator (Z): line (Zl), paragraph (Zp), space (Zs).
• Other (C): control (Cc), format (Cf), not assigned (Cn), private use (Co), surrogate (Cs).
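The category codes listed above can be used directly as Unicode property escapes in a regular expression with the `u` flag, for example `\p{L}` for any letter or `\p{Lu}` for an uppercase letter. The sketch below is an illustration under that assumption; `generalCategoryOf` is a hypothetical helper that simply tries each two-letter code in turn.

```javascript
// The two-letter subcategory codes from the list above.
const subcategories = [
  "Ll", "Lm", "Lt", "Lu", "Lo",
  "Nd", "Nl", "No",
  "Pc", "Pd", "Pi", "Pf", "Ps", "Pe", "Po",
  "Mc", "Me", "Mn",
  "Sc", "Sk", "Sm", "So",
  "Zl", "Zp", "Zs",
  "Cc", "Cf", "Cn", "Co", "Cs",
];

// Hypothetical helper: returns the subcategory code of a single
// code point by testing one \p{...} pattern per code.
function generalCategoryOf(ch) {
  for (const code of subcategories) {
    if (new RegExp(`^\\p{${code}}$`, "u").test(ch)) return code;
  }
  return null;
}

console.log(generalCategoryOf("A")); // Lu
console.log(generalCategoryOf("a")); // Ll
console.log(generalCategoryOf("5")); // Nd
console.log(generalCategoryOf("$")); // Sc
console.log(generalCategoryOf(" ")); // Zs
console.log(generalCategoryOf("\u0301")); // Mn (combining acute accent)

// A main category matches all of its subcategories.
console.log(/^\p{L}+$/u.test("Ünïcödé")); // true
```

Building one regular expression per code is slow but keeps the sketch short; a real lookup would cache the patterns.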