3GPP 27.007

5.5       Select TE character set +CSCS

Table 6: +CSCS parameter command syntax

Command              Possible response(s)

+CSCS=[<chset>]

+CSCS?               +CSCS: <chset>

+CSCS=?              +CSCS: (list of supported <chset>s)

Description

Set command informs TA which character set <chset> is used by the TE. TA is then able to convert character strings correctly between TE and MT character sets.

When TA‑TE interface is set to 8‑bit operation and used TE alphabet is 7‑bit, the highest bit shall be set to zero.

NOTE:      It is manufacturer specific how the internal alphabet of MT is converted to/from the TE alphabet.

Read command shows current setting and test command displays conversion schemes implemented in the TA.
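As a rough sketch of how a TE might use the set and test forms, the Python below builds the set command string and parses a test-command response. The helper names (`cscs_set`, `parse_cscs_test`) and the sample response line are assumptions for illustration; the exact quoting and spacing of the returned list depends on the modem and is not fixed by the text above.

```python
import re

def cscs_set(chset: str) -> str:
    """Build the set command with the usual AT prefix, e.g. AT+CSCS="UCS2"."""
    return f'AT+CSCS="{chset}"'

def parse_cscs_test(line: str) -> list[str]:
    """Extract the quoted character set names from a '+CSCS: (...)' line."""
    prefix = "+CSCS:"
    if not line.startswith(prefix):
        raise ValueError(f"not a +CSCS response: {line!r}")
    return re.findall(r'"([^"]*)"', line[len(prefix):])

# Hypothetical response line; a real modem may format the list differently.
sample = '+CSCS: ("GSM","IRA","HEX","UCS2")'
print(cscs_set("UCS2"))
print(parse_cscs_test(sample))
```

Sending the strings over a serial port is deliberately left out, since that part depends on the host platform and driver.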

Defined values

<chset>: character set as a string type (conversion schemes not listed here can be defined by manufacturers)

"GSM"     GSM 7 bit default alphabet (3GPP TS 23.038 [25]); this setting easily causes software flow control (XON/XOFF) problems.

"HEX"           Character strings consist only of hexadecimal numbers from 00 to FF; e.g. "032FE6" equals three 8-bit characters with decimal values 3, 47 and 230; no conversions to the original MT character set shall be done.

If MT is using GSM 7 bit default alphabet, its characters shall be padded with 8th bit (zero) before converting them to hexadecimal numbers (i.e. no SMS‑style packing of 7‑bit alphabet).
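The "HEX" representation can be illustrated with a minimal Python sketch. The function names are hypothetical; the first reproduces the "032FE6" example above, and the second shows 7-bit codes being written as one full octet each (8th bit zero, no SMS-style packing).

```python
def hex_to_values(s: str) -> list[int]:
    """Decode a "HEX" string into the list of 8-bit character values."""
    return list(bytes.fromhex(s))

def septets_to_hex(septets: list[int]) -> str:
    """Write 7-bit codes as two hex digits each; the 8th bit stays zero."""
    if any(not 0 <= s <= 0x7F for s in septets):
        raise ValueError("each value must fit in 7 bits")
    return bytes(septets).hex().upper()

print(hex_to_values("032FE6"))        # [3, 47, 230]
print(septets_to_hex([0x48, 0x69]))   # 4869
```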

"IRA"           International reference alphabet (see ITU‑T Recommendation T.50 [13]).

"PCCPxxx" PC character set Code Page xxx

"PCDN"    PC Danish/Norwegian character set

"UCS2"         16-bit universal multiple-octet coded character set (see ISO/IEC10646 [32]); UCS2 character strings are converted to hexadecimal numbers from 0000 to FFFF; e.g. "004100620063" equals three 16-bit characters with decimal values 65, 98 and 99.
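A small Python sketch of the "UCS2" hexadecimal form follows. It assumes big-endian 16-bit units, which matches the "004100620063" example, and uses Python's `utf-16-be` codec as a stand-in for UCS-2; the two agree for characters in the Basic Multilingual Plane, so anything outside it is rejected here. The function names are hypothetical.

```python
def ucs2_hex_encode(text: str) -> str:
    """Encode text as 4 hex digits per character (BMP characters only)."""
    data = text.encode("utf-16-be")
    if len(data) != 2 * len(text):
        raise ValueError("character outside the 16-bit range of UCS-2")
    return data.hex().upper()

def ucs2_hex_decode(s: str) -> str:
    """Decode a string of 4-hex-digit groups back into text."""
    return bytes.fromhex(s).decode("utf-16-be")

print(ucs2_hex_decode("004100620063"))  # Abc
print(ucs2_hex_encode("Abc"))           # 004100620063
```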

"UTF-8"      Octet (8-bit) lossless encoding of UCS characters (see RFC 3629 [69]); UTF-8 encodes each UCS character as a variable number of octets, where the number of octets depends on the integer value assigned to the UCS character. The input format shall be a stream of octets. It shall not be converted to hexadecimal numbers as in "HEX" or "UCS2". This character set requires an 8-bit TA – TE interface.
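To make the "variable number of octets" concrete, the following Python sketch prints how many octets a few sample characters occupy in UTF-8. The helper name `utf8_octets` and the choice of sample characters are illustrative assumptions.

```python
def utf8_octets(ch: str) -> int:
    """Return the number of octets UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

# U+0041 'A', U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (an emoji)
for ch in ["\u0041", "\u00e9", "\u20ac", "\U0001F600"]:
    octets = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_octets(ch)} octet(s), {octets.hex().upper()}")
```

Because several of these octets have the high bit set, the stream cannot pass through a 7-bit link unchanged, which is why an 8-bit interface is required.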

"8859-n"  ISO 8859 Latin n (1‑6) character set

"8859-C"  ISO 8859 Latin/Cyrillic character set

"8859-A"  ISO 8859 Latin/Arabic character set

"8859-G"  ISO 8859 Latin/Greek character set

"8859-H"  ISO 8859 Latin/Hebrew character set

Implementation

Mandatory when a command using the setting of this command is implemented.

======================================================================================

IRA

http://mercury.webster.edu/aleshunas/COSC%205130/Q-IRA.pdf

A familiar example of data is text or character strings. While textual data are most convenient
for human beings, they cannot, in character form, be easily stored or transmitted by data
processing and communications systems. Such systems are designed for binary data. Thus a
number of codes have been devised by which characters are represented by a sequence of bits.
Perhaps the earliest common example of this is the Morse code. Today, the most commonly used
text code is the International Reference Alphabet (IRA). Each character in this code is
represented by a unique 7-bit binary code; thus, 128 different characters can be represented.
Table Q.1 lists all of the code values. In the table, the bits of each character are labeled from b7,
which is the most significant bit, to b1, the least significant bit. Characters are of two types:
printable and control (Table Q.2). Printable characters are the alphabetic, numeric, and special
characters that can be printed on paper or displayed on a screen. For example, the bit
representation of the character "K" is b7b6b5b4b3b2b1 = 1001011. Some of the control characters
have to do with controlling the printing or displaying of characters; an example is carriage return.
Other control characters are concerned with communications procedures.
IRA-encoded characters are almost always stored and transmitted using 8 bits per
character. The eighth bit is a parity bit used for error detection. The parity bit is the most
significant bit and is therefore labeled b8. This bit is set such that the total number of binary 1s in
each octet is always odd (odd parity) or always even (even parity). Thus a transmission error that
changes a single bit, or any odd number of bits, can be detected.
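The parity scheme described above can be sketched in a few lines of Python, using the character "K" (1001011) from the text. The function names are hypothetical, and the sketch assumes the parity bit occupies b8, as stated above.

```python
def add_parity(code7: int, odd: bool = False) -> int:
    """Return the octet formed by putting a parity bit in b8 above a 7-bit code."""
    if not 0 <= code7 <= 0x7F:
        raise ValueError("code must fit in 7 bits")
    b8 = bin(code7).count("1") % 2   # makes the total number of 1s even
    if odd:
        b8 ^= 1                      # flip it to make the total odd instead
    return (b8 << 7) | code7

def parity_ok(octet: int, odd: bool = False) -> bool:
    """Check that the number of 1 bits in the octet matches the parity rule."""
    return bin(octet).count("1") % 2 == (1 if odd else 0)

k = 0b1001011                        # "K", which contains four 1 bits
print(f"{add_parity(k):08b}")            # 01001011 (even parity)
print(f"{add_parity(k, odd=True):08b}")  # 11001011 (odd parity)
print(parity_ok(add_parity(k) ^ 0b1))    # False: a single flipped bit is detected
```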

GSM

https://en.wikipedia.org/wiki/GSM_03.38

GSM 7-bit default alphabet and extension table of 3GPP TS 23.038 / GSM 03.38

The standard encoding for GSM messages is the 7-bit default alphabet as defined in the 23.038 recommendation.

Seven-bit characters must be encoded into octets following one of three packing modes:

  • CBS: using this encoding, it is possible to send up to 93 characters (packed in up to 82 octets) in one Cell Broadcast Service message.
  • SMS: using this encoding, it is possible to send up to 160 characters (packed in up to 140 octets) in one SMS message in the GSM network.
  • USSD: using this encoding, it is possible to send up to 182 characters (packed in up to 160 octets) in one Unstructured Supplementary Service Data message.
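The character limits in the list above follow from fitting 7-bit characters into 8-bit octets. The Python sketch below computes those limits and shows the basic packing idea, in which each septet is appended starting at the least significant free bit. It is a simplified illustration with hypothetical function names: the mode-specific padding rules of CBS, SMS and USSD are not modelled, and the sample input relies on lowercase Latin letters having the same codes in the GSM default alphabet as in ASCII.

```python
def max_septets(octets: int) -> int:
    """Number of whole 7-bit characters that fit in the given number of octets."""
    return octets * 8 // 7

def pack_septets(septets: list[int]) -> bytes:
    """Pack 7-bit values into octets, least significant bits first."""
    acc = 0
    for i, s in enumerate(septets):
        if not 0 <= s <= 0x7F:
            raise ValueError("each value must fit in 7 bits")
        acc |= s << (7 * i)              # place septet i at bit offset 7*i
    n_octets = (7 * len(septets) + 7) // 8
    return acc.to_bytes(n_octets, "little")

print(max_septets(82), max_septets(140), max_septets(160))  # 93 160 182
print(pack_septets([ord(c) for c in "hello"]).hex().upper())  # E8329BFD06
```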

GSM 8-bit data encoding

8-bit data encoding mode treats the information as raw data. According to the standard, the alphabet for this encoding is user-specific.

UCS-2 Encoding

This encoding allows use of a greater range of characters and languages. UCS-2 can represent the most commonly used Latin and eastern characters, at the cost of using more space per character. In practice, some cell phones (e.g. iPhones) use UTF-16 instead of UCS-2 so that they can display emoticons in short messages.[4]

A single SMS GSM message using this encoding can have at most 70 characters (140 octets).

Note that on many GSM cell phones, there's no specific preselection of the UCS-2 encoding. The default is to use the 7-bit encoding described above, until one enters a character that is not present in the GSM 7-bit table (for example the lowercase 'a' with acute: 'á'). In that case, the whole message gets reencoded using the UCS-2 encoding, and the maximum length of the message sent in only 1 SMS is immediately reduced to 70 characters, instead of 160. On smartphones the message encoding depends on the SMS application used and its setting as well as on the length of the message. Some smartphones even send longer messages as a multimedia message (MMS).
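The fallback behaviour described above can be sketched as follows. This is a simplified illustration, not how any particular phone implements it: the function name is hypothetical, only the basic 128-character table of the GSM default alphabet is checked, and the extension table (characters such as the euro sign) is ignored, so those characters fall through to UCS-2 here.

```python
# Basic GSM 7-bit default alphabet, 16 characters per row (codes 0x00 to 0x7F).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmno"
    "pqrstuvwxyzäöñüà"
)

def choose_sms_encoding(text: str) -> tuple[str, int]:
    """Return (encoding name, single-message character limit) for the text."""
    if all(ch in GSM_BASIC for ch in text):
        return ("GSM 7-bit", 160)
    return ("UCS-2", 70)             # one unsupported character switches everything

print(choose_sms_encoding("hello"))       # ('GSM 7-bit', 160)
print(choose_sms_encoding("hol\u00e1"))   # ('UCS-2', 70), since a-acute is not in the table
```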

To avoid unexpected costs for senders whose subscription includes a limited number of SMS messages, smartphones should display the number of characters used and the maximum number of characters in the composed SMS. When a message exceeds this maximum, it is sent as multiple successive SMS messages, each containing part of the message (and each carrying a sequence number, which also takes up a few leading characters in each part); these parts are reassembled later by the recipient.

Some GSM smartphones will alert the user about the number of SMS messages needed to send the message, when it requires more than one.

Reposted from: https://www.cnblogs.com/yueyuechen/p/6520266.html
