
ASCII and Unicode are two character encoding standards. They define how characters are represented in binary so that text can be written, stored, transmitted, and read in digital media.
The main difference between the two is how they encode characters and how many bits they use for each one. ASCII originally used 7 bits per character, which gives 128 possible values. Various "extended ASCII" schemes later used the eighth bit to add another 128 characters and address the inadequacy of the original. Unicode, in contrast, assigns every character a numeric code point and offers several encoding forms built on 8-bit, 16-bit, or 32-bit units (UTF-8, UTF-16, and UTF-32).
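
The 7-bit limit of ASCII is easy to check in practice. Below is a minimal Python sketch; the helper name `fits_in_ascii` is made up for this illustration and is not part of any library.

```python
def fits_in_ascii(text):
    """Return True if every character has a code point below 128 (7 bits)."""
    return all(ord(ch) < 0x80 for ch in text)


print(fits_in_ascii("Hello"))        # True
print(fits_in_ascii("h\u00e9llo"))   # False: U+00E9 is outside the ASCII range

# The largest ASCII value, 127, needs exactly 7 bits.
print((127).bit_length())            # 7

# Python's built-in "ascii" codec rejects anything above 127.
try:
    "\u00e9".encode("ascii")
    ascii_encode_failed = False
except UnicodeEncodeError:
    ascii_encode_failed = True
print(ascii_encode_failed)           # True
```
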
One of the main reasons Unicode was created was the growing number of non-standard ASCII extensions, which assigned different characters to the same byte values. Unicode virtually eliminates this problem because every character has a single standardized code point.
Another major advantage of Unicode is that it can accommodate a huge number of characters: its code space has 1,114,112 code points (U+0000 to U+10FFFF). It already covers most written languages and still has plenty of room left, so Unicode is unlikely to be replaced anytime soon.
Unicode is compatible with ASCII. Because ASCII was already in widespread use at the time, Unicode was designed so that its first 128 code points match ASCII exactly. The first 256 code points also match ISO-8859-1 (Latin-1), one of the most popular 8-bit extensions.
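
This compatibility can be verified directly. The sketch below uses only Python's built-in `ascii` and `latin-1` codecs; the function name `codec_matches_unicode` is an assumption made for this example.

```python
def codec_matches_unicode(codec, count):
    """Check that byte value N decodes to Unicode code point N for N < count."""
    return all(ord(bytes([value]).decode(codec)) == value for value in range(count))


# All 128 ASCII values map to the code point with the same number.
print(codec_matches_unicode("ascii", 128))     # True

# The same holds for all 256 ISO-8859-1 (Latin-1) values.
print(codec_matches_unicode("latin-1", 256))   # True

# 'A' is 0x41 in ASCII and U+0041 in Unicode.
print(hex(ord("A")))                           # 0x41
```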

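  1. ASCII uses 7 bits per character (8 in its extended variants), while Unicode is stored using encodings built on 8-, 16-, or 32-bit units.
  2. Unicode code points are standardized, while the many 8-bit extensions of ASCII are not.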
  3. Unicode represents most written languages in the world while ASCII does not.
  4. ASCII has its equivalent within Unicode.

UTF-8, UTF-16, UTF-32

UTF stands for Unicode Transformation Format. It is a family of standards for encoding the Unicode character set into binary, developed so that users have a standardized way of storing characters compactly. Put simply, Unicode is the specification that assigns a number to each character, and a UTF is how those numbers are actually stored.
UTF-8, UTF-16, and UTF-32 are three of the established encodings, and they differ in how many bytes they use for each character. UTF-8 and UTF-16 are variable-width: UTF-8 uses 1 to 4 bytes per character and UTF-16 uses 2 or 4 bytes. UTF-32 is fixed-width and always uses 4 bytes. So the minimum is 1 byte (8 bits) for UTF-8, 2 bytes (16 bits) for UTF-16, and 4 bytes (32 bits) for UTF-32.
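
To make the byte counts concrete, here is a small Python sketch that measures a few sample characters in each encoding. The helper `encoded_sizes` and the choice of sample characters are assumptions for this illustration. The `-le` codec variants are used so that no byte order mark is added to the count.

```python
def encoded_sizes(ch):
    """Return the number of bytes one character takes in each encoding."""
    return {name: len(ch.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}


# U+0041 (A), U+00E9 (e with acute), U+4E2D (a CJK ideograph), U+1F600 (an emoji)
for ch in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# U+0041 {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
# U+00E9 {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
# U+4E2D {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
# U+1F600 {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```
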
Note that UTF-8 is backward compatible with ASCII, while UTF-16 and UTF-32 are not. A file that contains only ASCII characters is byte-for-byte identical whether it is encoded as ASCII or as UTF-8. This is not possible with UTF-16 or UTF-32, where every character takes at least two or four bytes.
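
A quick way to see this is to encode the same ASCII-only string in several ways and compare the bytes. This is a minimal sketch; the sample string is arbitrary.

```python
text = "Hello, ASCII!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")   # little-endian, no byte order mark

# ASCII-only text produces exactly the same bytes in UTF-8.
print(ascii_bytes == utf8_bytes)          # True

# UTF-16 needs two bytes per character, so the output is twice as long.
print(len(ascii_bytes), len(utf16_bytes)) # 13 26

# Each ASCII character is followed by a zero byte in UTF-16 little-endian.
print(utf16_bytes[:4])                    # b'H\x00e\x00'
```
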
UTF-8 is a byte-oriented format, so it works without problems on byte-oriented networks and files. UTF-16 and UTF-32 are built from 16-bit and 32-bit units, so sender and receiver must agree on a byte order (big-endian or little-endian) before those units can travel over a byte-oriented channel.
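
The byte order issue shows up as soon as a character is encoded both ways. The sketch below uses Python's standard codecs and the `codecs` module's byte order mark constants; the sample character U+4E2D is an arbitrary choice.

```python
import codecs

ch = "\u4e2d"

little = ch.encode("utf-16-le")
big = ch.encode("utf-16-be")
print(little.hex(), big.hex())            # 2d4e 4e2d

# Reading the bytes with the wrong byte order silently yields another character.
wrong = little.decode("utf-16-be")
print(hex(ord(wrong)))                    # 0x2d4e

# Python's plain "utf-16" codec prepends a byte order mark (BOM)
# so that a reader can detect which order was used.
with_bom = ch.encode("utf-16")
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(has_bom)                            # True

# UTF-8 has only one possible byte sequence for this character.
print(ch.encode("utf-8").hex())           # e4b8ad
```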

  1. UTF-8 and UTF-16 and UTF-32 are all used for encoding characters
  2. UTF-8 uses a byte at the minimum in encoding the characters while UTF-16 uses two and UTF-32 uses four
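  3. For mostly ASCII text, a UTF-8 encoded file is smaller than the same file in UTF-16 or UTF-32; for text dominated by characters such as CJK ideographs, UTF-16 can be smaller than UTF-8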
  4. UTF-8 is compatible with ASCII while UTF-16 and UTF-32 are incompatible with ASCII
  5. UTF-8 is byte oriented while UTF-16 and UTF-32 are not
  6. UTF-8 generally recovers from errors better than UTF-16 and UTF-32, because a lost or corrupted byte usually damages only one character
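
The error recovery point in item 6 can be illustrated by deleting a single byte from an encoded stream. This is a rough sketch under simple assumptions: the helper `drop_byte` is made up for this example, the sample text is arbitrary, and `errors="replace"` is used so that undecodable input becomes U+FFFD instead of raising an exception.

```python
def drop_byte(data, index):
    """Simulate transmission damage by removing one byte from a byte string."""
    return data[:index] + data[index + 1:]


text = "h\u00e9llo world"

# In UTF-8, losing one byte of a two-byte character damages only that character.
damaged8 = drop_byte(text.encode("utf-8"), 1).decode("utf-8", errors="replace")
print(damaged8 == "h\ufffdllo world")   # True

# In UTF-16, losing one byte shifts every following 16-bit unit,
# so the rest of the text no longer decodes to the original characters.
damaged16 = drop_byte(text.encode("utf-16-le"), 1).decode("utf-16-le", errors="replace")
print("world" in damaged16)             # False
```

UTF-8 can resynchronize because lead bytes and continuation bytes use different bit patterns, so a decoder can find the start of the next character after a damaged one.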



Reference: Difference Between Unicode and ASCII



