Specials (Unicode block)

Unicode block containing a few special code points and two noncharacters

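Specials is a short Unicode block allocated at the very end of the Basic Multilingual Plane, at U+FFF0..FFFF. Of these 16 code points, five have been assigned as of Unicode 3.0, and the last two are noncharacters: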

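  • U+FFF9 INTERLINEAR ANNOTATION ANCHOR, marks the start of annotated text.
  • U+FFFA INTERLINEAR ANNOTATION SEPARATOR, marks the start of the annotating character(s).
  • U+FFFB INTERLINEAR ANNOTATION TERMINATOR, marks the end of an annotation block.
  • U+FFFC OBJECT REPLACEMENT CHARACTER, a placeholder in the text for another, unspecified object, for example in a compound document.
  • U+FFFD � REPLACEMENT CHARACTER, used to replace an unknown, unrecognized, or unrepresentable character.
  • U+FFFE <noncharacter-FFFE>, not a character.
  • U+FFFF <noncharacter-FFFF>, not a character.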
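The role of U+FFFD described in the list can be illustrated with a decoder that meets a byte it cannot interpret. This is a minimal sketch using Python's built-in `errors="replace"` handler; the sample bytes and the variable names are chosen here for illustration only.

```python
import unicodedata

# "café" encoded as Latin-1: the lone 0xE9 byte is not valid UTF-8.
raw = b"caf\xe9"

# With errors="replace", the UTF-8 decoder substitutes U+FFFD
# for the byte sequence it cannot decode instead of raising.
decoded = raw.decode("utf-8", errors="replace")

print(ascii(decoded))                  # 'caf\ufffd'
print(unicodedata.name(decoded[-1]))   # REPLACEMENT CHARACTER
```

Once the substitution has happened, the original byte cannot be recovered from the text, so U+FFFD in stored data usually points to an earlier decoding error.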
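Specials
Range: U+FFF0..U+FFFF (16 code points)
Plane: Basic Multilingual Plane (BMP)
Scripts: Common
Assigned: 5 code points
Unused: 9 reserved code points, 2 noncharacters
Unicode version history:
  • 1.0.0: 1 (+1)
  • 2.1: 2 (+1)
  • 3.0: 5 (+3)
Notes: [1][2]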

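U+FFFE and U+FFFF are not unassigned in the usual sense; rather, they are guaranteed never to be Unicode characters at all. They can be used to guess a text's encoding scheme, since by definition any text containing them is not a correctly encoded Unicode text. The character U+FEFF BYTE ORDER MARK can be inserted at the beginning of a Unicode text to signal its endianness: a program that reads such a text and encounters 0xFFFE knows that it should swap the byte order of all the following characters.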
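The byte order check described above can be sketched as follows. This is an illustrative example, not a complete encoding detector: the helper names `sniff_utf16_byte_order` and `decode_utf16_with_bom` are hypothetical, and the sketch assumes the input is UTF-16 (the UTF-32 little-endian byte order mark begins with the same two bytes and is not handled).

```python
from typing import Optional

BOM_BE = b"\xfe\xff"  # U+FEFF serialized in big-endian order
BOM_LE = b"\xff\xfe"  # U+FEFF serialized in little-endian order

def sniff_utf16_byte_order(data: bytes) -> Optional[str]:
    """Return "big", "little", or None when no UTF-16 BOM is present."""
    if data.startswith(BOM_BE):
        return "big"
    if data.startswith(BOM_LE):
        return "little"
    return None

def decode_utf16_with_bom(data: bytes) -> str:
    """Decode UTF-16 text using the byte order announced by its BOM."""
    order = sniff_utf16_byte_order(data)
    if order is None:
        raise ValueError("no UTF-16 byte order mark found")
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    return data[2:].decode(codec)  # skip the 2-byte BOM itself

# A reader that assumes big-endian order sees the noncharacter value
# 0xFFFE when the text was actually written in little-endian order.
swapped = int.from_bytes(BOM_LE, "big")

sample_le = BOM_LE + "Specials".encode("utf-16-le")
sample_be = BOM_BE + "Specials".encode("utf-16-be")
print(hex(swapped))                      # 0xfffe
print(decode_utf16_with_bom(sample_le))  # Specials
print(decode_utf16_with_bom(sample_be))  # Specials
```

Because U+FFFE can never be a valid character, reading 0xFFFE as the first code unit is unambiguous: the only sensible interpretation is a byte-swapped U+FEFF.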

Its block name in Unicode 1.0 was Special.[3]

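Specials[1][2][3]
Official Unicode Consortium code chart (PDF)
        0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
U+FFFx  --   --   --   --   --   --   --   --   --   IAA  IAS  IAT  OBJ  �    NC   NC
Notes
1.^ As of Unicode version 15.0.
2.^ Gray areas (marked -- here) indicate unassigned code points.
3.^ Black areas (marked NC here) indicate noncharacters (code points that are guaranteed never to be assigned as encoded characters in the Unicode Standard).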
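The three kinds of cells in the chart can be reproduced programmatically. The sketch below uses Python's `unicodedata` module together with the general rule that a code point whose low 16 bits are FFFE or FFFF, or that lies in U+FDD0..FDEF, is a noncharacter. The function names are hypothetical, and the counts depend on the Unicode version bundled with the interpreter, so they would change if a later version assigned more of this block.

```python
import unicodedata
from collections import Counter

def is_noncharacter(cp: int) -> bool:
    """True for the 66 code points permanently reserved as noncharacters."""
    return (cp & 0xFFFE) == 0xFFFE or 0xFDD0 <= cp <= 0xFDEF

def classify_specials(cp: int) -> str:
    """Classify a code point of the Specials block (U+FFF0..FFFF)."""
    if not 0xFFF0 <= cp <= 0xFFFF:
        raise ValueError("code point is outside the Specials block")
    if is_noncharacter(cp):
        return "noncharacter"
    # General category Cn means "not assigned" in the bundled database.
    if unicodedata.category(chr(cp)) == "Cn":
        return "unassigned"
    return "assigned"

counts = Counter(classify_specials(cp) for cp in range(0xFFF0, 0x10000))
for cp in range(0xFFF0, 0x10000):
    print(f"U+{cp:04X} {classify_specials(cp)}")
```

The noncharacter test has to come first, because the character database also reports the general category Cn for U+FFFE and U+FFFF.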

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Specials block:

Entries are grouped by version, final code points[a], and count. Each entry gives the L2 ID, the WG2 ID (N number) where one exists, and the document.

Version 1.0.0, U+FFFD, 1 code point:
  • (to be determined)

Version 1.0.0, U+FFFE..FFFF, 2 code points:
  • (to be determined)
  • L2/01-295R: Moore, Lisa, Motion 88-M2, Minutes from the UTC/L2 meeting #88, 2001-11-06
  • L2/01-355, N2369 (html, doc): Davis, Mark, Request to allow FFFF, FFFE in UTF-8 in the text of ISO/IEC 10646, 2001-09-26
  • L2/02-154, N2403: Umamaheswaran, V. S., 9.3 Allowing FFFF and FFFE in UTF-8, Draft minutes of WG 2 meeting 41, Hotel Phoenix, Singapore, 2001-10-15/19, 2002-04-22

Version 2.1, U+FFFC, 1 code point:
  • UTC/1995-056: Sargent, Murray, Recommendation to encode a WCH_EMBEDDING character, 1995-12-06
  • UTC/1996-002: Aliprand, Joan; Hart, Edwin; Greenfield, Steve, Embedded Objects, UTC #67 Minutes, 1996-03-05
  • N1365: Sargent, Murray, Proposal Summary – Object Replacement Character, 1996-03-18
  • N1353: Umamaheswaran, V. S.; Ksar, Mike, 8.14, Draft minutes of WG2 Copenhagen Meeting # 30, 1996-06-25
  • L2/97-288, N1603: Umamaheswaran, V. S., 7.3, Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997, 1997-10-24
  • L2/98-004R, N1681: Text of ISO 10646 – AMD 18 for PDAM registration and FPDAM ballot, 1997-12-22
  • L2/98-070: Aliprand, Joan; Winkler, Arnold, Additional comments regarding 2.1, Minutes of the joint UTC and L2 meeting from the meeting in Cupertino, February 25-27, 1998
  • L2/98-318, N1894: Revised text of 10646-1/FPDAM 18, AMENDMENT 18: Symbols and Others, 1998-10-22

Version 3.0, U+FFF9..FFFB, 3 code points:
  • L2/97-255R: Aliprand, Joan, 3.D Proposal for In-Line Notation (ruby), Approved Minutes – UTC #73 & L2 #170 joint meeting, Palo Alto, CA – August 4-5, 1997, 1997-12-03
  • L2/98-055: Freytag, Asmus, Support for Implementing Inline and Interlinear Annotations, 1998-02-22
  • L2/98-070: Aliprand, Joan; Winkler, Arnold, 3.C.5. Support for implementing inline and interlinear annotations, Minutes of the joint UTC and L2 meeting from the meeting in Cupertino, February 25-27, 1998
  • L2/98-099, N1727: Freytag, Asmus, Support for Implementing Interlinear Annotations as used in East Asian Typography, 1998-03-18
  • L2/98-158: Aliprand, Joan; Winkler, Arnold, Inline and Interlinear Annotations, Draft Minutes – UTC #76 & NCITS Subgroup L2 #173 joint meeting, Tredyffrin, Pennsylvania, April 20-22, 1998, 1998-05-26
  • L2/98-286, N1703: Umamaheswaran, V. S.; Ksar, Mike, 8.14, Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20, 1998-07-02
  • L2/98-270: Hiura, Hideki; Kobayashi, Tatsuo, Suggestion to the inline and interlinear annotation proposal, 1998-07-29
  • L2/98-281R (pdf, html): Aliprand, Joan, In-Line and Interlinear Annotation (III.C.1.c), Unconfirmed Minutes – UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998, 1998-07-31
  • L2/98-363, N1861: Sato, T. K., Ruby markers, 1998-09-01
  • L2/98-372, N1884R2 (pdf, doc): Whistler, Ken; et al, Additional Characters for the UCS, 1998-09-22
  • L2/98-416, N1882.zip: Support for Implementing Interlinear Annotations, 1998-09-23
  • L2/98-329, N1920: Combined PDAM registration and consideration ballot on WD for ISO/IEC 10646-1/Amd. 30, AMENDMENT 30: Additional Latin and other characters, 1998-10-28
  • L2/98-421R: Suignard, Michel; Hiura, Hideki, Notes concerning the PDAM 30 interlinear annotation characters, 1998-12-04
  • L2/99-010, N1903 (pdf, html, doc): Umamaheswaran, V. S., 8.2.15, Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25, 1998-12-30
  • L2/98-419 (pdf, doc): Aliprand, Joan, Interlinear Annotation Characters, Approved Minutes -- UTC #78 & NCITS Subgroup L2 # 175 Joint Meeting, San Jose, CA -- December 1-4, 1998, 1999-02-05
  • UTC/1999-021: Duerst, Martin; Bosak, Jon, W3C XML CG statement on annotation characters, 1999-06-08
  • L2/99-176R: Moore, Lisa, W3C Liaison Statement on Annotation Characters, Minutes from the joint UTC/L2 meeting in Seattle, June 8-10, 1999, 1999-11-04
  • L2/01-301: Whistler, Ken, E. Indicated as "strongly discouraged" for plain text interchange, Analysis of Character Deprecation in the Unicode Standard, 2001-08-01

  a. ^ Proposed code points and character names may differ from the final code points and names.

References

  1. ^ Unicode character database. The Unicode Standard. Retrieved 2016-07-09. (Archived from the original on 2022-09-25.)
  2. ^ Enumerated Versions of The Unicode Standard. The Unicode Standard. Retrieved 2016-07-09. (Archived from the original on 2016-06-29.)
  3. ^ 3.8: Block-by-Block Charts (PDF). The Unicode Standard, version 1.0. Unicode Consortium. Retrieved 2022-09-30. (Archived (PDF) from the original on 2016-02-11.)