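[Repost] Python String Methods Explained

Reference link: Python string methods | 2 (len, count, center, ljust, rjust, isalpha, isalnum, isspace and join)

I recently had to process nearly 100,000 records containing a large number of strings. Assorted special characters and stray whitespace caused all kinds of subtle bugs, so I wrote this article to study the str methods in depth.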

Methods defined here:

| capitalize(...) | S.capitalize() -> string

# Returns a copy of the string with its first letter capitalized.

| Return a copy of the string S with only its first character capitalized.

>>> test = "abc"

>>> test.capitalize()

'Abc'

| center(...) | S.center(width[, fillchar]) -> string

# Returns a string with the original centered in it. Both sides are padded with spaces by default, or with a fill character of your choice.

| Return S centered in a string of length width. Padding is done using the specified fill character (default is a space)

>>> a = "abcd"

>>> a.center(8)

'  abcd  '

>>> a.center(8,"*")

'**abcd**'

>>> a.center(3)

'abcd' # a width smaller than the string's length leaves it unchanged

| count(...) | S.count(sub[, start[, end]]) -> int

# Returns the number of times the substring occurs in S; a start and end position may be given.

Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.
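A quick sketch of what "non-overlapping" means in practice, together with the optional start and end arguments. The sample strings are made up for illustration.

```python
# count() scans left to right and never reuses a character,
# so overlapping matches are only counted once.
text = "aaaa"
overlap_count = text.count("aa")      # matches at index 0 and 2 only

word = "abcdabcd"
total_b = word.count("b")             # whole string
sliced_b = word.count("b", 2, 5)      # same as word[2:5].count("b")

print(overlap_count, total_b, sliced_b)  # 2 2 0
```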

| decode(...) | S.decode([encoding[,errors]]) -> object

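# Important! This signature is from Python 2; in Python 3, decode() is a method of bytes rather than str.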

| Decodes S using the codec registered for encoding. encoding defaults to the default encoding. errors may be given to set a different error handling scheme. Default is 'strict' meaning that encoding errors raise a UnicodeDecodeError. Other possible values are 'ignore' and 'replace' as well as any other name registered with codecs.register_error that is able to handle UnicodeDecodeErrors.

| encode(...) | S.encode([encoding[,errors]]) -> object

# Important! | Encodes S using the codec registered for encoding. encoding defaults to the default encoding. errors may be given to set a different error handling scheme. Default is 'strict' meaning that encoding errors raise a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and 'xmlcharrefreplace' as well as any other name registered with codecs.register_error that is able to handle UnicodeEncodeErrors.
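The two signatures above come from Python 2, where str holds bytes. As a rough sketch of the same round trip under Python 3 (an assumption on my part: there str.encode() returns bytes and decode() lives on bytes), using an arbitrary sample string:

```python
text = "你好, hello"

# str -> bytes
utf8_bytes = text.encode("utf-8")

# bytes -> str
round_trip = utf8_bytes.decode("utf-8")

# The errors argument decides what happens to characters the codec cannot encode.
ascii_ignore = text.encode("ascii", errors="ignore")    # drops the Chinese characters
ascii_replace = text.encode("ascii", errors="replace")  # substitutes '?' for each one

try:
    text.encode("ascii")        # default errors='strict'
except UnicodeEncodeError:
    strict_failed = True
else:
    strict_failed = False

print(round_trip == text, ascii_ignore, ascii_replace, strict_failed)
```

With 'strict' the failure is loud, which is usually what you want when cleaning data; 'ignore' and 'replace' silently lose information.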

| endswith(...) | S.endswith(suffix[, start[, end]]) -> bool

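# Checks whether S ends with suffix; a position range can be given. Very handy as a loop or branch condition, since it avoids slicing and comparing with ==.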

| Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.
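A small sketch of the tuple form and of the start/end arguments. The file names are invented for the example.

```python
filenames = ["report.csv", "notes.txt", "data.CSV", "image.png"]

# A tuple of suffixes tests several endings in a single call.
text_like = [name for name in filenames
             if name.lower().endswith((".csv", ".txt"))]

# start/end restrict the test to a slice: "report.csv"[0:6] is "report".
ends_in_port = "report.csv".endswith("port", 0, 6)

print(text_like, ends_in_port)
```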

| expandtabs(...) | S.expandtabs([tabsize]) -> string

# Replaces each tab character in the string with spaces up to the next tab stop; tab stops fall every tabsize columns, 8 by default. | Return a copy of S where all tab characters are expanded using spaces. If tabsize is not given, a tab size of 8 characters is assumed.
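A sketch showing that a tab is expanded to the next tab stop rather than to a fixed number of spaces. The sample row is made up.

```python
row = "id\tname\tcity"

expanded_default = row.expandtabs()   # tab stops at columns 8, 16, ...
expanded_4 = row.expandtabs(4)        # tab stops at columns 4, 8, 12, ...

# "id" ends at column 2, so the first tab becomes 6 spaces (default)
# or 2 spaces (tabsize 4); the second tab is padded to the following stop.
print(repr(expanded_default))
print(repr(expanded_4))
```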

| find(...) | S.find(sub [,start [,end]]) -> int

| # Searches S for sub, optionally within a start/end range, and returns the index of sub in S; returns -1 if it is not found. | Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.

>>> a = "abcdabcd"

>>> a.find("b")

1

>>> a.find("b", 2, 7)

5

>>> a.find("b", 3, 7)

5

>>> a.find("b", 6, 7)

-1

| format(...) | S.format(*args, **kwargs) -> string

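| # Formatted string output. There are far too many examples to cover and the official documentation has plenty, so I only give the ones I have actually used.

Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces ('{' and '}'). [Note added 2014.06.28: in format(), literal braces are escaped by doubling them ({{ and }}), not with a backslash.]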

>>> '{:,}'.format(1234567890) # Using the comma as a thousands separator

'1,234,567,890'

>>> 'Correct answers: {:.2%}'.format(19.5/22) # Expressing a percentage

'Correct answers: 88.64%'

>>> import datetime

>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)

>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)

'2010-07-04 12:15:58'

>>> # Can replace center() and similar methods

>>> '{:<30}'.format('left aligned')

'left aligned                  '

>>> '{:>30}'.format('right aligned')

'                 right aligned'

>>> '{:^30}'.format('centered')

'           centered           '

>>> '{:*^30}'.format('centered') # use '*' as a fill char

'***********centered***********'

>>> # Accessing arguments by position:

>>> '{0}, {1}, {2}'.format('a', 'b', 'c')

'a, b, c'

>>> '{}, {}, {}'.format('a', 'b', 'c') # 2.7+ only

'a, b, c'

>>> '{2}, {1}, {0}'.format('a', 'b', 'c')

'c, b, a'

>>> '{2}, {1}, {0}'.format(*'abc') # unpacking argument sequence

'c, b, a'

>>> '{0}{1}{0}'.format('abra', 'cad') # arguments' indices can be repeated

'abracadabra'

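[Updated 2014.06.28: how to make format() output literal braces. See the code below: each brace is doubled. It looks odd at first, and shows how shallow my understanding still was.

>>> print "hi {{}} {{key}}".format(key = "string")

hi {} {key}

]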
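The same brace escaping as a sketch that runs under Python 3 (where print is a function), mixing escaped braces with a real replacement field. The template text is only an illustration.

```python
# '{{' and '}}' produce literal braces; '{key}' is a real replacement field.
template = "hi {{}} {{key}} {key}"
rendered = template.format(key="string")

print(rendered)  # hi {} {key} string
```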

| index(...) | S.index(sub [,start [,end]]) -> int

# Like find(), it looks up the index, but it raises an error when sub is not found, whereas find() returns -1.

| Like S.find() but raise ValueError when the substring is not found.
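A short sketch contrasting the two failure modes, plus one pitfall with an unchecked find() result. The sample string is arbitrary.

```python
text = "abcdabcd"

missing_pos = text.find("x")        # -1, no exception

try:
    text.index("x")
except ValueError:
    index_raised = True
else:
    index_raised = False

# Pitfall: -1 is itself a valid index, so using an unchecked find()
# result silently picks the last character.
wrong_char = text[text.find("x")]

print(missing_pos, index_raised, wrong_char)  # -1 True d
```

When a missing substring should stop the program, index() is the safer choice; find() suits cases where "not found" is a normal outcome that you test for.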

| isalnum(...) | S.isalnum() -> bool

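# Checks whether S consists only of digits and letters [and contains at least one character]; returns True if so. Returns False if S contains symbols or is empty. Chinese characters also give False for the Python 2 byte strings shown here; in unicode strings (and Python 3 str) they count as letters.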

| Return True if all characters in S are alphanumeric and there is at least one character in S, False otherwise.

>>> a = "adad1122"

>>> a.isalnum()

True

>>> a = "3123dddaw''[]"

>>> a.isalnum()

False

>>> a = "你好hello"

>>> a.isalnum()

False
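The last result above depends on the string type. A sketch of the difference, assuming Python 3, where str is Unicode and bytes behaves like the Python 2 str used in this article:

```python
mixed = "你好hello"

as_text = mixed.isalnum()                    # Unicode str: CJK characters count as letters
as_bytes = mixed.encode("utf-8").isalnum()   # bytes: only ASCII letters and digits qualify
empty = "".isalnum()                         # an empty string is always False

print(as_text, as_bytes, empty)  # True False False
```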

| isalpha(...) | S.isalpha() -> bool

# Checks whether all characters are letters [and there is at least one character].

| Return True if all characters in S are alphabetic and there is at least one character in S, False otherwise.

| isdigit(...) | S.isdigit() -> bool

| # Checks whether all characters are digits [and there is at least one character].

| Return True if all characters in S are digits and there is at least one character in S, False otherwise.

| islower(...) | S.islower() -> bool

| # Checks whether all letters are lowercase (digits do not affect the result) [and there is at least one cased character].

| Return True if all cased characters in S are lowercase and there is at least one cased character in S, False otherwise.

| isspace(...) | S.isspace() -> bool

| # Checks whether all characters are whitespace [and there is at least one character].

| Return True if all characters in S are whitespace and there is at least one character in S, False otherwise.

| istitle(...) | S.istitle() -> bool

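# Checks whether every word in S starts with an uppercase letter and all of its remaining letters are lowercase [and there is at least one character].

# Many blogs that endlessly copy one another get this wrong. Trying it yourself matters! | Return True if S is a titlecased string and there is at least one character in S, i.e. uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return False otherwise.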

>>> a = "Abc"

>>> a.istitle()

True

>>> a = "aBc"

>>> a.istitle()

False

>>> a = "AbC"

>>> a.istitle()

False

>>> a = "Abc Cbc"

>>> a.istitle()

True

>>> a = "Abc cbc"

>>> a.istitle()

False

| isupper(...) | S.isupper() -> bool

# Checks whether all letters are uppercase (digits do not affect the result) [and there is at least one cased character].

| Return True if all cased characters in S are uppercase and there is at least one cased character in S, False otherwise.

| join(...) | S.join(iterable) -> string

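# Used all the time! Joins the items of the iterable, using S as the separator. The items of the iterable must themselves be strings (something I had not noticed before).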

| Return a string which is the concatenation of the strings in the iterable. The separator between elements is S.
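A sketch of the "items must be strings" rule and the usual workaround. The list of numbers is just sample data.

```python
values = [1, 2, 3]

try:
    ",".join(values)            # the items are ints, not strings
except TypeError:
    join_failed = True
else:
    join_failed = False

# Convert each item to str first, then join.
joined = ",".join(str(v) for v in values)

print(join_failed, joined)  # True 1,2,3
```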

| ljust(...) | S.ljust(width[, fillchar]) -> string

# Outputs width characters with S left-justified; the remainder is padded with fillchar, a space by default.

| Return S left-justified in a string of length width. Padding is done using the specified fill character (default is a space).

| lower(...) | S.lower() -> string

# Returns a copy with all letters converted to lowercase.

| Return a copy of the string S converted to lowercase.

| lstrip(...) | S.lstrip([chars]) -> string or unicode

# Removes whitespace from the left side of the string, or the given chars if specified (chars is treated as a set of characters, not as a prefix). | Return a copy of the string S with leading whitespace removed. If chars is given and not None, remove characters in chars instead. If chars is unicode, S will be converted to unicode before stripping

| partition(...) | S.partition(sep) -> (head, sep, tail)

| # Takes one string argument and returns a tuple of three elements. If sep does not occur in the string, the result is (S, '', ''); otherwise the first element is the part to the left of sep, the second is sep itself, and the third is the part to the right of sep. | Search for the separator sep in S, and return the part before it, the separator itself, and the part after it. If the separator is not found, return S and two empty strings.
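A sketch of both outcomes, with rpartition() shown alongside for contrast. The key=value line is a made-up example.

```python
line = "key=value=more"

found = line.partition("=")         # splits at the first '='
not_found = line.partition(":")     # separator absent: (S, '', '')
from_right = line.rpartition("=")   # splits at the last '='

print(found)       # ('key', '=', 'value=more')
print(not_found)   # ('key=value=more', '', '')
print(from_right)  # ('key=value', '=', 'more')
```

Because the result always has exactly three elements, it can be unpacked without first checking whether the separator exists.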

| replace(...) | S.replace(old, new[, count]) -> string

# Replacement! Without count, every occurrence of old is replaced; with count, only the first count occurrences are replaced.

| Return a copy of string S with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

>>> a = "  213  213 1312  "

>>> a.replace(" ", "")

'2132131312'

>>> a.replace(" ", "", 3)

'213 213 1312  '

>>> a.replace(" ", "", 5)

'2132131312  '

>>> a

'  213  213 1312  '

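[Updated 2014.05.22]

When I was first learning, the str methods never seemed quite reasonable to me, because a is unchanged after calling a.replace(). It felt unnatural at first. Thinking about it now, the design makes perfect sense.

Why? When you learn about tuple and dict you also learn about immutable objects, and str is one of them. Python is designed this way precisely to guarantee that a never changes: an immutable object itself can never be modified. To keep the result, bind it to a name, for example a = a.replace(" ", "").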
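A minimal sketch of what immutability means for method calls: the method returns a new string, and "updating" a variable is really rebinding the name. The sample value is arbitrary.

```python
a = " 1 2 3 "
result = a.replace(" ", "")     # builds and returns a new string

unchanged = (a == " 1 2 3 ")    # the original object is untouched

# Rebinding: 'alias' still refers to the old string after 'a' is reassigned.
alias = a
a = a.replace(" ", "")

print(result, unchanged, repr(alias), repr(a))
```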

| rfind(...) | S.rfind(sub [,start [,end]]) -> int

# Searches and returns the highest index; a range (as in a slice) may also be given. Returns -1 if not found.

| Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

| Return -1 on failure.

| rindex(...) | S.rindex(sub [,start [,end]]) -> int

| # Same as rfind, but raises an error if not found. | Like S.rfind() but raise ValueError when the substring is not found.

| rjust(...) | S.rjust(width[, fillchar]) -> string

# Outputs width characters with S right-justified; the remainder is padded with fillchar, a space by default.

| Return S right-justified in a string of length width. Padding is done using the specified fill character (default is a space)

| rpartition(...) | S.rpartition(sep) -> (head, sep, tail)

| Search for the separator sep in S, starting at the end of S, and return the part before it, the separator itself, and the part after it. If the separator is not found, return two empty strings and S.

| rsplit(...) | S.rsplit([sep [,maxsplit]]) -> list of strings

| # Same as split(), except that splitting starts from the end. | Return a list of the words in the string S, using sep as the delimiter string, starting at the end of the string and working to the front. If maxsplit is given, at most maxsplit splits are done. If sep is not specified or is None, any whitespace string is a separator.
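The difference from split() only shows up when maxsplit is given. A sketch with an invented file name:

```python
path = "a.b.c.txt"

from_left = path.split(".", 1)     # one split, counted from the left
from_right = path.rsplit(".", 1)   # one split, counted from the right
no_limit_same = path.split(".") == path.rsplit(".")

print(from_left)      # ['a', 'b.c.txt']
print(from_right)     # ['a.b.c', 'txt']
print(no_limit_same)  # True
```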

| rstrip(...) | S.rstrip([chars]) -> string or unicode

# Removes whitespace, or the given chars, from the right side of the string S. | Return a copy of the string S with trailing whitespace removed. If chars is given and not None, remove characters in chars instead. If chars is unicode, S will be converted to unicode before stripping

| split(...) | S.split([sep [,maxsplit]]) -> list of strings

# Used all the time! Splits S into a list using sep as the marker (sep occurs in S); often used together with join(). | Return a list of the words in the string S, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done. If sep is not specified or is None, any whitespace string is a separator and empty strings are removed from the result.
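For messy data, the difference between split() and split(" ") matters a lot. A sketch with a deliberately untidy sample string:

```python
messy = "  a  b \t c\n"

# No sep: any run of whitespace is one separator, empty strings are dropped.
words = messy.split()

# Explicit single space: every space is a separator, so empty strings
# appear and other whitespace (tab, newline) is left in the pieces.
by_space = messy.split(" ")

# split() plus join() normalizes whitespace in one line.
normalized = " ".join(messy.split())

print(words)     # ['a', 'b', 'c']
print(by_space)  # ['', '', 'a', '', 'b', '\t', 'c\n']
print(normalized)
```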

| splitlines(...) | S.splitlines(keepends=False) -> list of strings

| Return a list of the lines in S, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
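A sketch comparing splitlines() with split("\n") on text that mixes line endings. The sample text is made up.

```python
text = "one\ntwo\r\nthree\n"

lines = text.splitlines()            # handles \n and \r\n, no trailing empty item
with_ends = text.splitlines(True)    # keepends=True keeps the line breaks
naive = text.split("\n")             # leaves '\r' behind and adds a trailing ''

print(lines)      # ['one', 'two', 'three']
print(with_ends)  # ['one\n', 'two\r\n', 'three\n']
print(naive)      # ['one', 'two\r', 'three', '']
```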

| startswith(...) | S.startswith(prefix[, start[, end]]) -> bool

# Checks whether S, or a slice of S, starts with prefix. | Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.

| strip(...) | S.strip([chars]) -> string or unicode

# Removes whitespace, or the given chars, from both ends of the string S.

| Return a copy of the string S with leading and trailing whitespace removed. If chars is given and not None, remove characters in chars instead. If chars is unicode, S will be converted to unicode before stripping

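[Updated 2014.07.31: to remove given characters from the ends of a string, use strip() rather than replace(chars, ""); strip is faster. Note that strip only touches the two ends, while replace removes every occurrence.]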
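Two things are easy to get wrong here, so a small sketch: chars is a set of characters rather than a substring, and strip() is not equivalent to replace(). The sample strings are invented.

```python
url = "www.example.com"

# Every leading/trailing character found in the set is removed,
# in any order, until a character outside the set is reached.
core = url.strip("w.moc")

padded = "**a*b**"
ends_only = padded.strip("*")          # inner '*' survives
everywhere = padded.replace("*", "")   # all '*' removed

print(core, ends_only, everywhere)  # example a*b ab
```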

| swapcase(...) | S.swapcase() -> string

# Swaps uppercase and lowercase.

| Return a copy of the string S with uppercase characters converted to lowercase and vice versa.

| title(...) | S.title() -> string

# Returns a string in which each word starts with an uppercase letter and the remaining letters are lowercase. | Return a titlecased version of S, i.e. words start with uppercase characters, all remaining cased characters have lowercase.

| translate(...) | S.translate(table [,deletechars]) -> string

| Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256 or None. If the table argument is None, no translation is applied and the operation simply removes the characters in deletechars.
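The signature above is the Python 2 one (a 256-character table plus deletechars). As a rough Python 3 counterpart, which is my assumption about what most readers run today, str.maketrans() builds the table and its third argument lists the characters to delete:

```python
# Map a->x, b->y, c->z and delete every '-'.
table = str.maketrans("abc", "xyz", "-")
translated = "a-b-c-d".translate(table)

# Deletion only: pass empty strings for the mapping part.
digits_only = "tel: 010-1234".translate(str.maketrans("", "", "tel: -"))

print(translated, digits_only)  # xyzd 0101234
```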

| upper(...) | S.upper() -> string

# Converts lowercase letters to uppercase.

| Return a copy of the string S converted to uppercase.

| zfill(...) | S.zfill(width) -> string

| # zfill() pads the left side with the character 0; commonly used when outputting numbers. | Pad a numeric string S with zeros on the left, to fill a field of the specified width. The string S is never truncated.
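A sketch of zfill() on a few made-up values, including a sign and a string that is already longer than the width, with the format() spelling for comparison:

```python
plain = "42".zfill(5)
signed = "-42".zfill(5)          # zeros go after the sign
too_long = "123456".zfill(5)     # never truncated
via_format = "{:05d}".format(42)

print(plain, signed, too_long, via_format)  # 00042 -0042 123456 00042
```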

| ----------------------------------------------------------------------

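str slicing:

str[0:5] takes the first through fifth characters (indices 0 to 4)
str[:] takes all characters of the string
str[4:] takes from the fifth character to the end
str[:-3] takes from the start up to, but not including, the third character from the end
str[2] takes the third character
str[::-1] creates a string in the reverse order of the original, i.e. reverses the string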
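The slices above, checked on a sample string. I use the name s instead of str here so that the built-in type is not shadowed.

```python
s = "abcdefgh"

first_five = s[0:5]     # indices 0..4; the end index is excluded
whole = s[:]
from_fifth = s[4:]
drop_last_three = s[:-3]
third_char = s[2]
reversed_s = s[::-1]

print(first_five, whole, from_fifth, drop_last_three, third_char, reversed_s)
# abcde abcdefgh efgh abcde c hgfedcba
```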

decode & encode

Not covered for now.

Updated from time to time. If you repost, please include the address of this article:

http://blog.csdn.net/zhanh1218/article/details/21826239

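This article is an original work by @The_Third_Wave. It is updated from time to time; please point out any errors.

Follow on Sina Weibo: @The_Third_Wave

If this post helped you, then for the sake of a healthy web, please bookmark it rather than repost it. If you must repost, please include this footer and the address of this article.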
