🔰Unicode
🚧 under construction -> tidy this page
JS ⟩ value ⟩ primitive ⟩ String ⟩ Unicode
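Every Unicode character is assigned a code point, a number in the range U+0000 to U+10FFFF.
Code points are divided into 17 planes of 65,536 code points each.
One or more code points can combine into a single grapheme cluster, which is what a reader perceives as one character.
A character encoding transforms code points into code units.
JavaScript strings behave as sequences of UTF-16 code units, so a code point outside the first plane takes two code units (a surrogate pair).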
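The terms above can be told apart in a few lines of JavaScript. This is a minimal sketch, not a reference implementation: the sample string and the `planeOf` helper are made up for illustration, and the grapheme count assumes a runtime that ships `Intl.Segmenter`.

```javascript
// Sample string (illustrative): "e" + combining acute accent U+0301, then U+1F600.
const s = "e\u0301\u{1F600}";

// Code units: .length and index access count UTF-16 code units.
const codeUnits = s.length;

// Code points: string iteration walks by code point and joins surrogate pairs.
const codePoints = Array.from(s, (ch) => ch.codePointAt(0));

// Planes: each plane holds 0x10000 code points, so the plane number
// is the code point shifted right by 16 bits. (planeOf is a hypothetical helper.)
const planeOf = (cp) => cp >> 16;

// Surrogate pair: the two code units that encode U+1F600 in UTF-16.
const surrogates = [s.charCodeAt(2), s.charCodeAt(3)];

// Grapheme clusters: user-perceived characters (assumes Intl.Segmenter is available).
const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemes = Array.from(segmenter.segment(s), (seg) => seg.segment);

console.log(codeUnits, codePoints.length, graphemes.length);
```

The three counts differ for the same string: four code units, three code points, two grapheme clusters. Which one is "the length" depends on whether the code is measuring storage, iterating characters, or working out what a user sees.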
• Script (a writing system) = Cyrillic, Greek, Arabic, Han (Chinese), ...
• Letter L: lowercase Ll, modifier Lm, titlecase Lt, uppercase Lu, other Lo.
• Number N: decimal digit Nd, letter number Nl, other No.
• Punctuation P: connector Pc, dash Pd, initial quote Pi, final quote Pf, open Ps, close Pe, other Po.
• Mark M (accents etc.): spacing combining Mc, enclosing Me, non-spacing Mn.
• Symbol S: currency Sc, modifier Sk, math Sm, other So.
• Separator Z: line Zl, paragraph Zp, space Zs.
• Other C: control Cc, format Cf, not assigned Cn, private use Co, surrogate Cs.
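Scripts and the category codes listed here can be used directly in JavaScript regular expressions through Unicode property escapes (`\p{...}` with the `u` flag). The sketch below is illustrative only: the helper names and sample strings are invented, not taken from this page.

```javascript
// Property escapes only work when the regex carries the `u` (or `v`) flag.

// Two-letter codes select one subcategory (hypothetical helper names).
const isUppercaseLetter = (ch) => /^\p{Lu}$/u.test(ch);
const isTitlecaseLetter = (ch) => /^\p{Lt}$/u.test(ch);
const isCurrencySymbol = (ch) => /^\p{Sc}$/u.test(ch);

// One-letter codes select the whole group (L = any letter).
const sample = "a1-\u03A9\u0663"; // a, 1, hyphen, Greek capital omega, Arabic-Indic digit three
const letters = sample.match(/\p{L}/gu);
const digits = sample.match(/\p{Nd}/gu);

// Script=... selects characters by writing system.
const mixed = "\u03B1\u03B2\u03B3 abc \u0433\u0434\u0435"; // Greek, Latin, Cyrillic
const greek = mixed.match(/\p{Script=Greek}+/gu);
const cyrillic = mixed.match(/\p{Script=Cyrillic}+/gu);

console.log(letters, digits, greek, cyrillic);
```

Matching on `\p{L}` or `\p{Nd}` rather than `[a-zA-Z]` or `[0-9]` keeps a pattern working for text outside the Latin script.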