1 Chapter 1: Loading the Data and a First Look
1.1 Loading the data
1.1.1 Task 1: Import numpy and pandas

import numpy as np
import pandas as pd

1.1.2 Task 2: Load the data
(1) Load the data using a relative path

df = pd.read_csv("train.csv")

(2) Load the data using an absolute path

df = pd.read_csv(r"C:\Users\Administrator\Desktop\数据分析\train.csv")

[Hint] If loading with a relative path raises an error, use os.getcwd() to check the current working directory.
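A minimal sketch of that check, assuming the file is expected to sit in the current working directory. The name `data_path` is only illustrative.

```python
import os

# Relative paths such as "train.csv" are resolved against this directory.
cwd = os.getcwd()
print("Current working directory:", cwd)

# Build the absolute path that "train.csv" would resolve to and check whether it exists.
data_path = os.path.join(cwd, "train.csv")
print(data_path, "exists:", os.path.exists(data_path))
```

If the file is reported as missing, either move it into that directory or pass the full path to pd.read_csv().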
[Think] How do pd.read_csv() and pd.read_table() differ?

df = pd.read_csv("train.csv")
print(df.values)
df.values.shape
[[1 0 3 ... 7.25 nan 'S']
 [2 1 1 ... 71.2833 'C85' 'C']
 [3 1 3 ... 7.925 nan 'S']
 ...
 [889 0 3 ... 23.45 nan 'S']
 [890 1 1 ... 30.0 'C148' 'C']
 [891 0 3 ... 7.75 nan 'Q']]
(891, 12)
df = pd.read_table("train.csv")
df.values.shape

(891, 1)
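As shown above, the two functions differ in their default delimiter: read_csv splits on commas, while read_table splits on tabs. train.csv is comma-separated, so read_csv splits each line into 12 columns, whereas read_table finds no tabs and keeps each whole line as a single column, which gives the shape (891, 1).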
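The difference can be reproduced without the Titanic file. The sketch below uses a small made-up comma-separated string in place of train.csv and also shows that passing sep="," to read_table makes it parse the data the same way read_csv does.

```python
import io
import pandas as pd

# A tiny comma-separated sample standing in for train.csv (made-up rows).
csv_text = "PassengerId,Survived,Pclass\n1,0,3\n2,1,1\n3,1,3\n"

# read_csv splits on commas by default.
by_csv = pd.read_csv(io.StringIO(csv_text))
# read_table splits on tabs by default, so each line stays in a single column.
by_table = pd.read_table(io.StringIO(csv_text))
# With an explicit comma separator, read_table parses the data like read_csv.
by_table_comma = pd.read_table(io.StringIO(csv_text), sep=",")

print(by_csv.shape)          # (3, 3)
print(by_table.shape)        # (3, 1)
print(by_table_comma.shape)  # (3, 3)
```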

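[Summary] Loading the data is the first step of any analysis. We will come across different data formats (e.g. .csv, .tsv, .xlsx), but the approach to loading them is the same.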

1.1.3 Task 3: Read the data chunk by chunk, 1000 rows per chunk

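# The task asks for chunks of 1000 rows, but train.csv has only 891 rows,
# so chunksize=500 is used here to make the split visible (500 + 391 rows).
chunks = pd.read_csv("train.csv", chunksize=500)
for chunk in chunks:
    print(chunk)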
     PassengerId  Survived  Pclass  \
0              1         0       3
1              2         1       1
2              3         1       3
3              4         1       1
4              5         0       3
5              6         0       3
6              7         0       1
7              8         0       3
8              9         1       3
9             10         1       2
10            11         1       3
11            12         1       1
12            13         0       3
13            14         0       3
14            15         0       3
15            16         1       2
16            17         0       3
17            18         1       2
18            19         0       3
19            20         1       3
20            21         0       2
21            22         1       2
22            23         1       3
23            24         1       1
24            25         0       3
25            26         1       3
26            27         0       3
27            28         0       1
28            29         1       3
29            30         0       3
..           ...       ...     ...
470          471         0       3
471          472         0       3
472          473         1       2
473          474         1       2
474          475         0       3
475          476         0       1
476          477         0       2
477          478         0       3
478          479         0       3
479          480         1       3
480          481         0       3
481          482         0       2
482          483         0       3
483          484         1       3
484          485         1       1
485          486         0       3
486          487         1       1
487          488         0       1
488          489         0       3
489          490         1       3
490          491         0       3
491          492         0       3
492          493         0       1
493          494         0       1
494          495         0       3
495          496         0       3
496          497         1       1
497          498         0       3
498          499         0       1
499          500         0       3

                                                  Name     Sex   Age  SibSp  \
0                              Braund, Mr. Owen Harris    male  22.0      1
1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1
2                               Heikkinen, Miss. Laina  female  26.0      0
3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1
4                             Allen, Mr. William Henry    male  35.0      0
5                                     Moran, Mr. James    male   NaN      0
6                              McCarthy, Mr. Timothy J    male  54.0      0
7                       Palsson, Master. Gosta Leonard    male   2.0      3
8    Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)  female  27.0      0
9                  Nasser, Mrs. Nicholas (Adele Achem)  female  14.0      1
10                     Sandstrom, Miss. Marguerite Rut  female   4.0      1
11                            Bonnell, Miss. Elizabeth  female  58.0      0
12                      Saundercock, Mr. William Henry    male  20.0      0
13                         Andersson, Mr. Anders Johan    male  39.0      1
14                Vestrom, Miss. Hulda Amanda Adolfina  female  14.0      0
15                    Hewlett, Mrs. (Mary D Kingcome)   female  55.0      0
16                                Rice, Master. Eugene    male   2.0      4
17                        Williams, Mr. Charles Eugene    male   NaN      0
18   Vander Planke, Mrs. Julius (Emelia Maria Vande...  female  31.0      1
19                             Masselmani, Mrs. Fatima  female   NaN      0
20                                Fynney, Mr. Joseph J    male  35.0      0
21                               Beesley, Mr. Lawrence    male  34.0      0
22                         McGowan, Miss. Anna "Annie"  female  15.0      0
23                        Sloper, Mr. William Thompson    male  28.0      0
24                       Palsson, Miss. Torborg Danira  female   8.0      3
25   Asplund, Mrs. Carl Oscar (Selma Augusta Emilia...  female  38.0      1
26                             Emir, Mr. Farred Chehab    male   NaN      0
27                      Fortune, Mr. Charles Alexander    male  19.0      3
28                       O'Dwyer, Miss. Ellen "Nellie"  female   NaN      0
29                                 Todoroff, Mr. Lalio    male   NaN      0
..                                                 ...     ...   ...    ...
470                                  Keefe, Mr. Arthur    male   NaN      0
471                                    Cacic, Mr. Luka    male  38.0      0
472            West, Mrs. Edwy Arthur (Ada Mary Worth)  female  33.0      1
473       Jerwan, Mrs. Amin S (Marie Marthe Thuillard)  female  23.0      0
474                        Strandberg, Miss. Ida Sofia  female  22.0      0
475                        Clifford, Mr. George Quincy    male   NaN      0
476                            Renouf, Mr. Peter Henry    male  34.0      1
477                          Braund, Mr. Lewis Richard    male  29.0      1
478                          Karlsson, Mr. Nils August    male  22.0      0
479                           Hirvonen, Miss. Hildur E  female   2.0      0
480                     Goodwin, Master. Harold Victor    male   9.0      5
481                   Frost, Mr. Anthony Wood "Archie"    male   NaN      0
482                           Rouse, Mr. Richard Henry    male  50.0      0
483                             Turkula, Mrs. (Hedwig)  female  63.0      0
484                            Bishop, Mr. Dickinson H    male  25.0      1
485                             Lefebre, Miss. Jeannie  female   NaN      3
486    Hoyt, Mrs. Frederick Maxfield (Jane Anne Forby)  female  35.0      1
487                            Kent, Mr. Edward Austin    male  58.0      0
488                      Somerton, Mr. Francis William    male  30.0      0
489              Coutts, Master. Eden Leslie "Neville"    male   9.0      1
490               Hagland, Mr. Konrad Mathias Reiersen    male   NaN      1
491                                Windelov, Mr. Einar    male  21.0      0
492                         Molson, Mr. Harry Markland    male  55.0      0
493                            Artagaveytia, Mr. Ramon    male  71.0      0
494                         Stanley, Mr. Edward Roland    male  21.0      0
495                              Yousseff, Mr. Gerious    male   NaN      0
496                     Eustis, Miss. Elizabeth Mussey  female  54.0      1
497                    Shellard, Mr. Frederick William    male   NaN      0
498    Allison, Mrs. Hudson J C (Bessie Waldo Daniels)  female  25.0      1
499                                 Svensson, Mr. Olof    male  24.0      0

     Parch            Ticket      Fare        Cabin Embarked
0        0         A/5 21171    7.2500          NaN        S
1        0          PC 17599   71.2833          C85        C
2        0  STON/O2. 3101282    7.9250          NaN        S
3        0            113803   53.1000         C123        S
4        0            373450    8.0500          NaN        S
5        0            330877    8.4583          NaN        Q
6        0             17463   51.8625          E46        S
7        1            349909   21.0750          NaN        S
8        2            347742   11.1333          NaN        S
9        0            237736   30.0708          NaN        C
10       1           PP 9549   16.7000           G6        S
11       0            113783   26.5500         C103        S
12       0         A/5. 2151    8.0500          NaN        S
13       5            347082   31.2750          NaN        S
14       0            350406    7.8542          NaN        S
15       0            248706   16.0000          NaN        S
16       1            382652   29.1250          NaN        Q
17       0            244373   13.0000          NaN        S
18       0            345763   18.0000          NaN        S
19       0              2649    7.2250          NaN        C
20       0            239865   26.0000          NaN        S
21       0            248698   13.0000          D56        S
22       0            330923    8.0292          NaN        Q
23       0            113788   35.5000           A6        S
24       1            349909   21.0750          NaN        S
25       5            347077   31.3875          NaN        S
26       0              2631    7.2250          NaN        C
27       2             19950  263.0000  C23 C25 C27        S
28       0            330959    7.8792          NaN        Q
29       0            349216    7.8958          NaN        S
..     ...               ...       ...          ...      ...
470      0            323592    7.2500          NaN        S
471      0            315089    8.6625          NaN        S
472      2        C.A. 34651   27.7500          NaN        S
473      0   SC/AH Basle 541   13.7917            D        C
474      0              7553    9.8375          NaN        S
475      0            110465   52.0000          A14        S
476      0             31027   21.0000          NaN        S
477      0              3460    7.0458          NaN        S
478      0            350060    7.5208          NaN        S
479      1           3101298   12.2875          NaN        S
480      2           CA 2144   46.9000          NaN        S
481      0            239854    0.0000          NaN        S
482      0          A/5 3594    8.0500          NaN        S
483      0              4134    9.5875          NaN        S
484      0             11967   91.0792          B49        C
485      1              4133   25.4667          NaN        S
486      0             19943   90.0000          C93        S
487      0             11771   29.7000          B37        C
488      0        A.5. 18509    8.0500          NaN        S
489      1        C.A. 37671   15.9000          NaN        S
490      0             65304   19.9667          NaN        S
491      0  SOTON/OQ 3101317    7.2500          NaN        S
492      0            113787   30.5000          C30        S
493      0          PC 17609   49.5042          NaN        C
494      0         A/4 45380    8.0500          NaN        S
495      0              2627   14.4583          NaN        C
496      0             36947   78.2667          D20        C
497      0         C.A. 6212   15.1000          NaN        S
498      2            113781  151.5500      C22 C26        S
499      0            350035    7.7958          NaN        S

[500 rows x 12 columns]
     PassengerId  Survived  Pclass  \
500          501         0       3
501          502         0       3
502          503         0       3
503          504         0       3
504          505         1       1
505          506         0       1
506          507         1       2
507          508         1       1
508          509         0       3
509          510         1       3
510          511         1       3
511          512         0       3
512          513         1       1
513          514         1       1
514          515         0       3
515          516         0       1
516          517         1       2
517          518         0       3
518          519         1       2
519          520         0       3
520          521         1       1
521          522         0       3
522          523         0       3
523          524         1       1
524          525         0       3
525          526         0       3
526          527         1       2
527          528         0       1
528          529         0       3
529          530         0       2
..           ...       ...     ...
861          862         0       2
862          863         1       1
863          864         0       3
864          865         0       2
865          866         1       2
866          867         1       2
867          868         0       1
868          869         0       3
869          870         1       3
870          871         0       3
871          872         1       1
872          873         0       1
873          874         0       3
874          875         1       2
875          876         1       3
876          877         0       3
877          878         0       3
878          879         0       3
879          880         1       1
880          881         1       2
881          882         0       3
882          883         0       3
883          884         0       2
884          885         0       3
885          886         0       3
886          887         0       2
887          888         1       1
888          889         0       3
889          890         1       1
890          891         0       3

                                                  Name     Sex   Age  SibSp  \
500                                   Calic, Mr. Petar    male  17.0      0
501                                Canavan, Miss. Mary  female  21.0      0
502                     O'Sullivan, Miss. Bridget Mary  female   NaN      0
503                     Laitinen, Miss. Kristina Sofia  female  37.0      0
504                              Maioni, Miss. Roberta  female  16.0      0
505         Penasco y Castellana, Mr. Victor de Satode    male  18.0      1
506      Quick, Mrs. Frederick Charles (Jane Richards)  female  33.0      0
507      Bradley, Mr. George ("George Arthur Brayton")    male   NaN      0
508                           Olsen, Mr. Henry Margido    male  28.0      0
509                                     Lang, Mr. Fang    male  26.0      0
510                           Daly, Mr. Eugene Patrick    male  29.0      0
511                                  Webber, Mr. James    male   NaN      0
512                          McGough, Mr. James Robert    male  36.0      0
513     Rothschild, Mrs. Martin (Elizabeth L. Barrett)  female  54.0      1
514                                  Coleff, Mr. Satio    male  24.0      0
515                       Walker, Mr. William Anderson    male  47.0      0
516                       Lemore, Mrs. (Amelia Milley)  female  34.0      0
517                                  Ryan, Mr. Patrick    male   NaN      0
518  Angle, Mrs. William A (Florence "Mary" Agnes H...  female  36.0      1
519                                Pavlovic, Mr. Stefo    male  32.0      0
520                              Perreault, Miss. Anne  female  30.0      0
521                                    Vovk, Mr. Janko    male  22.0      0
522                                 Lahoud, Mr. Sarkis    male   NaN      0
523    Hippach, Mrs. Louis Albert (Ida Sophia Fischer)  female  44.0      0
524                                  Kassem, Mr. Fared    male   NaN      0
525                                 Farrell, Mr. James    male  40.5      0
526                               Ridsdale, Miss. Lucy  female  50.0      0
527                                 Farthing, Mr. John    male   NaN      0
528                          Salonen, Mr. Johan Werner    male  39.0      0
529                        Hocking, Mr. Richard George    male  23.0      2
..                                                 ...     ...   ...    ...
861                        Giles, Mr. Frederick Edward    male  21.0      1
862  Swift, Mrs. Frederick Joel (Margaret Welles Ba...  female  48.0      0
863                  Sage, Miss. Dorothy Edith "Dolly"  female   NaN      8
864                             Gill, Mr. John William    male  24.0      0
865                           Bystrom, Mrs. (Karolina)  female  42.0      0
866                       Duran y More, Miss. Asuncion  female  27.0      1
867               Roebling, Mr. Washington Augustus II    male  31.0      0
868                        van Melkebeke, Mr. Philemon    male   NaN      0
869                    Johnson, Master. Harold Theodor    male   4.0      1
870                                  Balkic, Mr. Cerin    male  26.0      0
871   Beckwith, Mrs. Richard Leonard (Sallie Monypeny)  female  47.0      1
872                           Carlsson, Mr. Frans Olof    male  33.0      0
873                        Vander Cruyssen, Mr. Victor    male  47.0      0
874              Abelson, Mrs. Samuel (Hannah Wizosky)  female  28.0      1
875                   Najib, Miss. Adele Kiamie "Jane"  female  15.0      0
876                      Gustafsson, Mr. Alfred Ossian    male  20.0      0
877                               Petroff, Mr. Nedelio    male  19.0      0
878                                 Laleff, Mr. Kristo    male   NaN      0
879      Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)  female  56.0      0
880       Shelley, Mrs. William (Imanita Parrish Hall)  female  25.0      0
881                                 Markun, Mr. Johann    male  33.0      0
882                       Dahlberg, Miss. Gerda Ulrika  female  22.0      0
883                      Banfield, Mr. Frederick James    male  28.0      0
884                             Sutehall, Mr. Henry Jr    male  25.0      0
885               Rice, Mrs. William (Margaret Norton)  female  39.0      0
886                              Montvila, Rev. Juozas    male  27.0      0
887                       Graham, Miss. Margaret Edith  female  19.0      0
888           Johnston, Miss. Catherine Helen "Carrie"  female   NaN      1
889                              Behr, Mr. Karl Howell    male  26.0      0
890                                Dooley, Mr. Patrick    male  32.0      0

     Parch            Ticket      Fare        Cabin Embarked
500      0            315086    8.6625          NaN        S
501      0            364846    7.7500          NaN        Q
502      0            330909    7.6292          NaN        Q
503      0              4135    9.5875          NaN        S
504      0            110152   86.5000          B79        S
505      0          PC 17758  108.9000          C65        C
506      2             26360   26.0000          NaN        S
507      0            111427   26.5500          NaN        S
508      0            C 4001   22.5250          NaN        S
509      0              1601   56.4958          NaN        S
510      0            382651    7.7500          NaN        Q
511      0  SOTON/OQ 3101316    8.0500          NaN        S
512      0          PC 17473   26.2875          E25        S
513      0          PC 17603   59.4000          NaN        C
514      0            349209    7.4958          NaN        S
515      0             36967   34.0208          D46        S
516      0        C.A. 34260   10.5000          F33        S
517      0            371110   24.1500          NaN        Q
518      0            226875   26.0000          NaN        S
519      0            349242    7.8958          NaN        S
520      0             12749   93.5000          B73        S
521      0            349252    7.8958          NaN        S
522      0              2624    7.2250          NaN        C
523      1            111361   57.9792          B18        C
524      0              2700    7.2292          NaN        C
525      0            367232    7.7500          NaN        Q
526      0       W./C. 14258   10.5000          NaN        S
527      0          PC 17483  221.7792          C95        S
528      0           3101296    7.9250          NaN        S
529      1             29104   11.5000          NaN        S
..     ...               ...       ...          ...      ...
861      0             28134   11.5000          NaN        S
862      0             17466   25.9292          D17        S
863      2          CA. 2343   69.5500          NaN        S
864      0            233866   13.0000          NaN        S
865      0            236852   13.0000          NaN        S
866      0     SC/PARIS 2149   13.8583          NaN        C
867      0          PC 17590   50.4958          A24        S
868      0            345777    9.5000          NaN        S
869      1            347742   11.1333          NaN        S
870      0            349248    7.8958          NaN        S
871      1             11751   52.5542          D35        S
872      0               695    5.0000  B51 B53 B55        S
873      0            345765    9.0000          NaN        S
874      0         P/PP 3381   24.0000          NaN        C
875      0              2667    7.2250          NaN        C
876      0              7534    9.8458          NaN        S
877      0            349212    7.8958          NaN        S
878      0            349217    7.8958          NaN        S
879      1             11767   83.1583          C50        C
880      1            230433   26.0000          NaN        S
881      0            349257    7.8958          NaN        S
882      0              7552   10.5167          NaN        S
883      0  C.A./SOTON 34068   10.5000          NaN        S
884      0   SOTON/OQ 392076    7.0500          NaN        S
885      5            382652   29.1250          NaN        Q
886      0            211536   13.0000          NaN        S
887      0            112053   30.0000          B42        S
888      2        W./C. 6607   23.4500          NaN        S
889      0            111369   30.0000         C148        C
890      0            370376    7.7500          NaN        Q

[391 rows x 12 columns]

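[Think] What is chunked reading, and why read in chunks?
Chunked reading splits the dataset into pieces and reads it one piece at a time.
The chunksize parameter of read_csv sets the number of rows per chunk. With it set, the call returns an iterable object instead of a DataFrame.

Reasons for reading in chunks:

1. The dataset is large, and loading it all at once makes it hard to get a sense of what it looks like
2. Reading everything at once takes a long time and a lot of memory
3. A plain read raises MemoryError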
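Printing every chunk, as above, is mostly useful for inspection. A more typical use is to compute something per chunk and combine the results, so that only one chunk is in memory at a time. The sketch below assumes a made-up ten-row dataset with hypothetical columns `id` and `fare` in place of a file too large to load.

```python
import io
import pandas as pd

# Made-up data standing in for a large file: 10 rows, one numeric column.
csv_text = "id,fare\n" + "\n".join(f"{i},{i * 1.5}" for i in range(1, 11))

total_rows = 0
total_fare = 0.0
chunk_sizes = []

# chunksize=4 yields DataFrames of at most 4 rows each: 4 + 4 + 2.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    chunk_sizes.append(len(chunk))
    total_rows += len(chunk)
    total_fare += chunk["fare"].sum()

# Combine the per-chunk totals into an overall mean.
mean_fare = total_fare / total_rows
print(chunk_sizes, total_rows, mean_fare)
```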

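1.1.4 Task 4: Change the column headers to Chinese and set the passenger ID as the index [for English-language datasets, translating the headers can make the data easier to get familiar with]
PassengerId => 乘客ID (passenger ID)
Survived => 是否幸存 (survived or not)
Pclass => 乘客等级(1/2/3等舱位) (passenger class: 1st/2nd/3rd)
Name => 乘客姓名 (passenger name)
Sex => 性别 (sex)
Age => 年龄 (age)
SibSp => 堂兄弟/妹个数 (number of siblings/spouses aboard; the Chinese label literally says "cousins")
Parch => 父母与小孩个数 (number of parents/children aboard)
Ticket => 船票信息 (ticket information)
Fare => 票价 (fare)
Cabin => 客舱 (cabin)
Embarked => 登船港口 (port of embarkation)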

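# Chinese column labels, in the same order as the columns in train.csv
titles = ["乘客ID", "是否幸存", "乘客等级(1/2/3等舱位)", "乘客姓名", "性别", "年龄", "堂兄弟/妹个数", "父母与小孩个数", "船票信息", "票价", "客舱", "登船港口"]
index_name = "乘客ID"
df = pd.read_csv("train.csv")
df.columns = titles             # replace the English headers
df = df.set_index(index_name)   # use the passenger ID as the row index
df.head()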
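Assigning a list to df.columns depends on the list being in exactly the same order as the file's columns. An alternative that does not depend on order is DataFrame.rename with an explicit mapping. The sketch below uses a made-up three-column sample rather than the full twelve-column file; `mapping` and `sample` are illustrative names.

```python
import io
import pandas as pd

# Made-up three-column sample; the real file has twelve columns.
csv_text = "PassengerId,Survived,Sex\n1,0,male\n2,1,female\n"
sample = pd.read_csv(io.StringIO(csv_text))

# Map each English header to its Chinese label explicitly.
mapping = {"PassengerId": "乘客ID", "Survived": "是否幸存", "Sex": "性别"}
sample = sample.rename(columns=mapping).set_index("乘客ID")

print(sample.columns.tolist())  # ['是否幸存', '性别']
print(sample.index.name)        # 乘客ID
```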


1.2 A first look
1.2.1 Task 1: View basic information about the data

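df.info()           # print a concise summary
df.describe()       # descriptive statistics
df.values           # the data as a NumPy array
df.shape            # shape (rows, columns)
df.columns          # column labels (Index object)
df.columns.values   # column labels (NumPy array)
df.index            # row labels (Index object)
df.index.values     # row labels (NumPy array)
df.head(n)          # first n rows
df.tail(n)          # last n rows
pd.options.display.max_columns = n  # display at most n columns
pd.options.display.max_rows = n     # display at most n rows
df.memory_usage()                   # memory used by each column, in bytes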
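To see what a few of these return, the sketch below applies them to a small made-up frame (three passengers, two columns) instead of the full dataset.

```python
import numpy as np
import pandas as pd

# Made-up frame with one missing age, indexed by a passenger id.
sample = pd.DataFrame(
    {"Age": [22.0, 38.0, np.nan], "Sex": ["male", "female", "female"]},
    index=pd.Index([1, 2, 3], name="PassengerId"),
)

sample.info()                  # column dtypes and non-null counts
summary = sample.describe()    # statistics for the numeric column only
print(summary)
print(sample.shape)            # (3, 2)
print(sample.columns.values)   # array of column labels
print(sample.index.values)     # array of row labels
print(sample.memory_usage())   # bytes used by the index and each column
```

Note that describe() counts only non-missing values, so the count for Age is 2 rather than 3.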

1.2.2 Task 2: Look at the first 10 rows and the last 15 rows of the table

# Passengers in the first 10 rows
df.head(10)

# Passengers in the last 15 rows
df.tail(15)

1.2.3 Task 3: Check whether the data is null, returning True where it is null and False elsewhere

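df.isna()                    # True where a value is missing, False elsewhere
df.isna().sum()              # number of missing values in each column
df[df.notna().all(axis=1)]   # keep only the rows with no missing values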
 是否幸存    乘客等级(1/2/3等舱位)  乘客姓名    性别  年龄  堂兄弟/妹个数 父母与小孩个数 船票信息    票价  客舱  登船港口
乘客ID
2   1   1   Cumings, Mrs. John Bradley (Florence Briggs Th...   female  38.0    1   0   PC 17599    71.2833 C85 C
4   1   1   Futrelle, Mrs. Jacques Heath (Lily May Peel)    female  35.0    1   0   113803  53.1000 C123    S
7   0   1   McCarthy, Mr. Timothy J male    54.0    0   0   17463   51.8625 E46 S
11  1   3   Sandstrom, Miss. Marguerite Rut female  4.0 1   1   PP 9549 16.7000 G6  S
12  1   1   Bonnell, Miss. Elizabeth    female  58.0    0   0   113783  26.5500 C103    S
22  1   2   Beesley, Mr. Lawrence   male    34.0    0   0   248698  13.0000 D56 S
24  1   1   Sloper, Mr. William Thompson    male    28.0    0   0   113788  35.5000 A6  S
28  0   1   Fortune, Mr. Charles Alexander  male    19.0    3   2   19950   263.0000    C23 C25 C27 S
53  1   1   Harper, Mrs. Henry Sleeper (Myna Haxtun)    female  49.0    1   0   PC 17572    76.7292 D33 C
55  0   1   Ostby, Mr. Engelhart Cornelius  male    65.0    0   1   113509  61.9792 B30 C
63  0   1   Harris, Mr. Henry Birkhardt male    45.0    1   0   36973   83.4750 C83 S
67  1   2   Nye, Mrs. (Elizabeth Ramell)    female  29.0    0   0   C.A. 29395  10.5000 F33 S
76  0   3   Moen, Mr. Sigurd Hansen male    25.0    0   0   348123  7.6500  F G73   S
89  1   1   Fortune, Miss. Mabel Helen  female  23.0    3   2   19950   263.0000    C23 C25 C27 S
93  0   1   Chaffee, Mr. Herbert Fuller male    46.0    1   0   W.E.P. 5734 61.1750 E31 S
97  0   1   Goldschmidt, Mr. George B   male    71.0    0   0   PC 17754    34.6542 A5  C
98  1   1   Greenfield, Mr. William Bertram male    23.0    0   1   PC 17759    63.3583 D10 D12 C
103 0   1   White, Mr. Richard Frasar   male    21.0    0   1   35281   77.2875 D26 S
111 0   1   Porter, Mr. Walter Chamberlain  male    47.0    0   0   110465  52.0000 C110    S
119 0   1   Baxter, Mr. Quigg Edmond    male    24.0    0   1   PC 17558    247.5208    B58 B60 C
124 1   2   Webber, Miss. Susan female  32.5    0   0   27267   13.0000 E101    S
125 0   1   White, Mr. Percival Wayland male    54.0    0   1   35281   77.2875 D26 S
137 1   1   Newsom, Miss. Helen Monypeny    female  19.0    0   2   11752   26.2833 D47 S
138 0   1   Futrelle, Mr. Jacques Heath male    37.0    1   0   113803  53.1000 C123    S
140 0   1   Giglio, Mr. Victor  male    24.0    0   0   PC 17593    79.2000 B86 C
149 0   2   Navratil, Mr. Michel ("Louis M Hoffman")  male    36.5    0   2   230080  26.0000 F2  S
152 1   1   Pears, Mrs. Thomas (Edith Wearne)   female  22.0    1   0   113776  66.6000 C2  S
171 0   1   Van der hoef, Mr. Wyckoff   male    61.0    0   0   111240  33.5000 B19 S
175 0   1   Smith, Mr. James Clinch male    56.0    0   0   17764   30.6958 A7  C
178 0   1   Isham, Miss. Ann Elizabeth  female  50.0    0   0   PC 17595    28.7125 C49 C
... ... ... ... ... ... ... ... ... ... ... ...
738 1   1   Lesurer, Mr. Gustave J  male    35.0    0   0   PC 17755    512.3292    B101    C
742 0   1   Cavendish, Mr. Tyrell William   male    36.0    1   0   19877   78.8500 C46 S
743 1   1   Ryerson, Miss. Susan Parker "Suzette" female  21.0    2   2   PC 17608    262.3750    B57 B59 B63 B66 C
746 0   1   Crosby, Capt. Edward Gifford    male    70.0    1   1   WE/P 5735   71.0000 B22 S
749 0   1   Marvin, Mr. Daniel Warner   male    19.0    1   0   113773  53.1000 D30 S
752 1   3   Moor, Master. Meier male    6.0 0   1   392096  12.4750 E121    S
760 1   1   Rothes, the Countess. of (Lucy Noel Martha Dye...   female  33.0    0   0   110152  86.5000 B77 S
764 1   1   Carter, Mrs. William Ernest (Lucile Polk)   female  36.0    1   2   113760  120.0000    B96 B98 S
766 1   1   Hogeboom, Mrs. John C (Anna Andrews)    female  51.0    1   0   13502   77.9583 D11 S
773 0   2   Mack, Mrs. (Mary)   female  57.0    0   0   S.O./P.P. 3 10.5000 E77 S
780 1   1   Robert, Mrs. Edward Scott (Elisabeth Walton Mc...   female  43.0    0   1   24160   211.3375    B3  S
782 1   1   Dick, Mrs. Albert Adrian (Vera Gillespie)   female  17.0    1   0   17474   57.0000 B20 S
783 0   1   Long, Mr. Milton Clyde  male    29.0    0   0   113501  30.0000 D6  S
790 0   1   Guggenheim, Mr. Benjamin    male    46.0    0   0   PC 17593    79.2000 B82 B84 C
797 1   1   Leader, Dr. Alice (Farnham) female  49.0    0   0   17465   25.9292 D17 S
803 1   1   Carter, Master. William Thornton II male    11.0    1   2   113760  120.0000    B96 B98 S
807 0   1   Andrews, Mr. Thomas Jr  male    39.0    0   0   112050  0.0000  A36 S
810 1   1   Chambers, Mrs. Norman Campbell (Bertha Griggs)  female  33.0    1   0   113806  53.1000 E8  S
821 1   1   Hays, Mrs. Charles Melville (Clara Jennings Gr...   female  52.0    1   1   12749   93.5000 B69 S
824 1   3   Moor, Mrs. (Beila)  female  27.0    0   1   392096  12.4750 E121    S
836 1   1   Compton, Miss. Sara Rebecca female  39.0    1   1   PC 17756    83.1583 E49 C
854 1   1   Lines, Miss. Mary Conover   female  16.0    0   1   PC 17592    39.4000 D28 S
858 1   1   Daly, Mr. Peter Denis   male    51.0    0   0   113055  26.5500 E17 S
863 1   1   Swift, Mrs. Frederick Joel (Margaret Welles Ba...   female  48.0    0   0   17466   25.9292 D17 S
868 0   1   Roebling, Mr. Washington Augustus II    male    31.0    0   0   PC 17590    50.4958 A24 S
872 1   1   Beckwith, Mrs. Richard Leonard (Sallie Monypeny)    female  47.0    1   1   11751   52.5542 D35 S
873 0   1   Carlsson, Mr. Frans Olof    male    33.0    0   0   695 5.0000  B51 B53 B55 S
880 1   1   Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)   female  56.0    0   1   11767   83.1583 C50 C
888 1   1   Graham, Miss. Margaret Edith    female  19.0    0   0   112053  30.0000 B42 S
890 1   1   Behr, Mr. Karl Howell   male    26.0    0   0   111369  30.0000 C148    C
183 rows × 11 columns
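The three calls can be checked on a small made-up frame. The sketch below also compares the boolean-mask filter with DataFrame.dropna(), which by default drops every row containing at least one missing value.

```python
import numpy as np
import pandas as pd

# Made-up frame: row 0 lacks a cabin, row 1 lacks an age, row 2 is complete.
sample = pd.DataFrame(
    {"Age": [22.0, np.nan, 26.0], "Cabin": [np.nan, "C85", "E46"]}
)

mask = sample.isna()                                  # True where a value is missing
missing_per_column = sample.isna().sum()              # count of missing values per column
complete_rows = sample[sample.notna().all(axis=1)]    # rows with no missing values

print(mask)
print(missing_per_column)
print(complete_rows)
# The mask-based filter selects the same rows as dropna().
print(complete_rows.equals(sample.dropna()))          # True
```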

1.3 Saving the data
1.3.1 Task 1: Save the data you loaded and modified as a new file, train_chinese.csv, in the working directory

df.to_csv("train_chinese.csv", encoding="utf-8")

Note: depending on the operating system, the saved file may show garbled characters when opened. If that happens, try passing encoding="GBK" or encoding="utf-8".
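One way to check that an encoding preserves the Chinese headers is to write the file and read it back. The sketch below is an illustration under assumptions: it uses a made-up two-row frame, writes to a temporary directory rather than the working directory, and picks "utf-8-sig", a variant of UTF-8 that prepends a byte-order mark which some spreadsheet programs use to detect the encoding. Whether it is needed depends on the program used to open the file.

```python
import os
import tempfile
import pandas as pd

# Made-up frame with Chinese labels, standing in for the renamed Titanic data.
sample = pd.DataFrame({"是否幸存": [0, 1]}, index=pd.Index([1, 2], name="乘客ID"))

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "train_chinese.csv")
    # Write with an explicit encoding, then read back with the same one.
    sample.to_csv(out_path, encoding="utf-8-sig")
    restored = pd.read_csv(out_path, encoding="utf-8-sig", index_col="乘客ID")

# The round trip should preserve both the labels and the values.
print(restored.columns.tolist(), restored.index.name)
print(restored.equals(sample))
```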
