Importing data with NumPy genfromtxt
NumPy provides several functions to create arrays from tabular data. We focus here on the genfromtxt function.
In a nutshell, genfromtxt runs two main loops. The first loop converts each line of the file into a sequence of strings. The second loop converts each string to the appropriate data type. This mechanism is slower than a single loop, but gives more flexibility. In particular, genfromtxt can take missing data into account, which faster and simpler functions like loadtxt cannot.
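As a minimal sketch of the missing-data handling described above, the snippet below parses a small comma-separated sample in which one field is empty. The sample text and variable names are made up for illustration; the delimiter keyword is covered later in this document.

```python
import numpy as np
from io import StringIO

# Illustrative sample: the second row has an empty middle field.
text = "1,2,3\n4,,6"

# With the default float output, the empty field is filled with nan.
arr = np.genfromtxt(StringIO(text), delimiter=",")
print(arr)
```

The empty field does not stop the parse; it shows up as nan in the resulting float array.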
When giving examples, we will use the following conventions:

import numpy as np
from io import StringIO
Defining the input
The only mandatory argument of genfromtxt is the source of the data. It can be a string, a list of strings, a generator or an open file-like object with a read method, for example, a file or io.StringIO object. If a single string is provided, it is assumed to be the name of a local or remote file. If a list of strings or a generator returning strings is provided, each string is treated as one line in a file. When the URL of a remote file is passed, the file is automatically downloaded to the current directory and opened.
Recognized file types are text files and archives. Currently, the function recognizes gzip and bz2 (bzip2) archives. The type of the archive is determined from the extension of the file: if the filename ends with ‘.gz’, a gzip archive is expected; if it ends with ‘bz2’, a bzip2 archive is assumed.
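The input forms listed above can be tried side by side. The following is a sketch under a few assumptions: the sample data, the variable names, and the temporary file name data.txt.gz are all invented for illustration, and the remote-URL case is left out because it needs network access.

```python
import gzip
import os
import tempfile
from io import StringIO

import numpy as np

text = "1 2\n3 4"
lines = ["1 2", "3 4"]

# An open file-like object.
from_buffer = np.genfromtxt(StringIO(text))

# A list of strings: each string is treated as one line.
from_list = np.genfromtxt(lines)

# A generator yielding strings, one per line.
line_gen = (line for line in lines)
from_gen = np.genfromtxt(line_gen)

# A file name ending in '.gz' is opened as a gzip archive.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt.gz")  # hypothetical file name
    with gzip.open(path, "wt") as fh:
        fh.write(text)
    from_gzip = np.genfromtxt(path)
```

All four calls should produce the same 2x2 array, since only the source of the lines differs.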
Splitting the lines into columns
The delimiter argument
Once the file is defined and open for reading, genfromtxt splits each non-empty line into a sequence of strings. Empty or commented lines are just skipped. The delimiter keyword is used to define how the splitting should take place.
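A short sketch of this behavior, using made-up sample text: the commented line and the blank line are skipped, and the delimiter keyword tells genfromtxt to split the remaining lines on commas.

```python
import numpy as np
from io import StringIO

# Illustrative input with a comment line and an empty line.
text = """# a comment line
1, 2, 3

4, 5, 6
"""

# Only the two data lines are split and converted.
arr = np.genfromtxt(StringIO(text), delimiter=",")
print(arr.shape)
```

Here '#' is the default comment marker, so no extra argument is needed to skip the first line.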
Quite often, a single character marks the separation between columns. For example, comma-separated files (CSV) use a comma (,) or a semicolon (

