Loading pandas DataFrames with tf.data

Code

# -*- coding: utf-8 -*-
"""
Created on  2020/11/20 16:39
@Author: CY
@email: 5844104706@qq.com
"""
# A small dataset provided by the Cleveland Clinic Foundation for Heart Disease.
# The CSV has a few hundred rows; each row describes one patient and each column one attribute.
# We will use this information to predict whether a patient has heart disease,
# which is a binary classification problem.
# !pip install -q tensorflow-gpu==2.0.0-rc1
import pandas as pd
import tensorflow as tf

csv_file = tf.keras.utils.get_file('heart.csv', 'https://storage.googleapis.com/applied-dl/heart.csv')
df = pd.read_csv(csv_file)
print(df.head())
print(df.dtypes)

# Convert the thal column (an object dtype in the dataframe) to discrete numerical values.
df['thal'] = pd.Categorical(df['thal'])
df['thal'] = df.thal.cat.codes
print("Converted to discrete values", df.head())
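# (Illustrative aside, not in the original script:) the integer codes follow the order of
# the Categorical's categories. Before the second assignment above you could have printed
# that order to see which thal label each code stands for, for example:
#   print(list(df['thal'].cat.categories))   # only valid while the column is still Categorical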
print("# Read the data with tf.data.Dataset")
# Use tf.data.Dataset.from_tensor_slices to read the values from the pandas dataframe.
# One advantage of tf.data.Dataset is that it lets you write simple and highly efficient
# data pipelines; see the loading data guide for more details.
target = df.pop('target')
dataset = tf.data.Dataset.from_tensor_slices((df.values, target.values))
for feat, targ in dataset.take(5):
    print('Features: {}, Target: {}'.format(feat, targ))
tf.constant(df['thal'])
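# (Illustrative aside, not in the original script:) df.values is float64 here, which is
# why Keras prints a "casting ... float64 to float32" warning during training below. If
# you wanted to avoid that warning, one option (kept commented out so the recorded output
# below stays unchanged) is to cast the features inside the pipeline:
#   dataset = dataset.map(lambda feat, targ: (tf.cast(feat, tf.float32), targ))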
print("随机读取(shuffle)并批量处理数据集。")
train_dataset = dataset.shuffle(len(df)).batch(1)
print("#创建并训练模型")
def get_compiled_model():model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation='relu'),tf.keras.layers.Dense(10, activation='relu'),tf.keras.layers.Dense(1, activation='sigmoid')])model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])return modelmodel = get_compiled_model()
model.fit(train_dataset, epochs=15)
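# (Illustrative aside, not in the original script:) after fitting you could report the
# final metrics with Keras' evaluate API, for example:
#   loss, accuracy = model.evaluate(train_dataset)
#   print('loss: {:.4f}, accuracy: {:.4f}'.format(loss, accuracy))
# Its output is not part of the execution results shown below.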
print("#代替特征列")
#将字典作为输入传输给模型就像创建 tf.keras.layers.Input 层的匹配字典一样简单,
# 应用任何预处理并使用 functional api。 您可以使用它作为 feature columns 的替代方法。
inputs = {key: tf.keras.layers.Input(shape=(), name=key) for key in df.keys()}
x = tf.stack(list(inputs.values()), axis=-1)x = tf.keras.layers.Dense(10, activation='relu')(x)
output = tf.keras.layers.Dense(1, activation='sigmoid')(x)model_func = tf.keras.Model(inputs=inputs, outputs=output)model_func.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
# 与 tf.data 一起使用时,
# 保存 pd.DataFrame 列结构的最简单方法是将 pd.DataFrame 转换为 dict ,并对该字典进行切片。
dict_slices = tf.data.Dataset.from_tensor_slices((df.to_dict('list'), target.values)).batch(16)for dict_slice in dict_slices.take(1):print (dict_slice)model_func.fit(dict_slices, epochs=15)

Execution results

   age  sex  cp  trestbps  chol  ...  oldpeak  slope  ca        thal  target
0   63    1   1       145   233  ...      2.3      3   0       fixed       0
1   67    1   4       160   286  ...      1.5      2   3      normal       1
2   67    1   4       120   229  ...      2.6      2   2  reversible       0
3   37    1   3       130   250  ...      3.5      3   0      normal       0
4   41    0   2       130   204  ...      1.4      1   0      normal       0

[5 rows x 14 columns]
age           int64
sex           int64
cp            int64
trestbps      int64
chol          int64
fbs           int64
restecg       int64
thalach       int64
exang         int64
oldpeak     float64
slope         int64
ca            int64
thal         object
target        int64
dtype: object
Converted to discrete values    age  sex  cp  trestbps  chol  fbs  ...  exang  oldpeak  slope  ca  thal  target
0   63    1   1       145   233    1  ...      0      2.3      3   0     2       0
1   67    1   4       160   286    0  ...      1      1.5      2   3     3       1
2   67    1   4       120   229    0  ...      1      2.6      2   2     4       0
3   37    1   3       130   250    0  ...      0      3.5      3   0     3       0
4   41    0   2       130   204    0  ...      0      1.4      1   0     3       0

[5 rows x 14 columns]
# Read the data with tf.data.Dataset
Features: [ 63.    1.    1.  145.  233.    1.    2.  150.    0.    2.3   3.    0.    2. ], Target: 0
Features: [ 67.    1.    4.  160.  286.    0.    2.  108.    1.    1.5   2.    3.    3. ], Target: 1
Features: [ 67.    1.    4.  120.  229.    0.    2.  129.    1.    2.6   2.    2.    4. ], Target: 0
Features: [ 37.    1.    3.  130.  250.    0.    0.  187.    0.    3.5   3.    0.    3. ], Target: 0
Features: [ 41.    0.    2.  130.  204.    0.    2.  172.    0.    1.4   1.    0.    3. ], Target: 0
Shuffle and batch the dataset.
# Create and train the model
Epoch 1/15
WARNING:tensorflow:Layer dense is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because its dtype defaults to floatx. If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2. To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
303/303 [==============================] - 0s 928us/step - loss: 1.2399 - accuracy: 0.6469
Epoch 2/15
303/303 [==============================] - 0s 980us/step - loss: 0.7994 - accuracy: 0.7063
Epoch 3/15
303/303 [==============================] - 0s 928us/step - loss: 0.7980 - accuracy: 0.7195
Epoch 4/15
303/303 [==============================] - 0s 928us/step - loss: 0.6322 - accuracy: 0.6997
Epoch 5/15
303/303 [==============================] - 0s 877us/step - loss: 0.6632 - accuracy: 0.7129
Epoch 6/15
303/303 [==============================] - 0s 877us/step - loss: 0.6147 - accuracy: 0.7195
Epoch 7/15
303/303 [==============================] - 0s 928us/step - loss: 0.6463 - accuracy: 0.7063
Epoch 8/15
303/303 [==============================] - 0s 928us/step - loss: 0.5977 - accuracy: 0.7162
Epoch 9/15
303/303 [==============================] - 0s 928us/step - loss: 0.5960 - accuracy: 0.7162
Epoch 10/15
303/303 [==============================] - 0s 928us/step - loss: 0.6153 - accuracy: 0.7426
Epoch 11/15
303/303 [==============================] - 0s 989us/step - loss: 0.5872 - accuracy: 0.7294
Epoch 12/15
303/303 [==============================] - 0s 926us/step - loss: 0.5682 - accuracy: 0.7228
Epoch 13/15
303/303 [==============================] - 0s 877us/step - loss: 0.5923 - accuracy: 0.6964
Epoch 14/15
303/303 [==============================] - 0s 928us/step - loss: 0.5583 - accuracy: 0.7162
Epoch 15/15
303/303 [==============================] - 0s 928us/step - loss: 0.5403 - accuracy: 0.7525
# An alternative to feature columns
({'age': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([63, 67, 67, 37, 41, 56, 62, 57, 63, 53, 57, 56, 56, 44, 52, 57])>, 'sex': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1])>, 'cp': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([1, 4, 4, 3, 2, 2, 4, 4, 4, 4, 4, 2, 3, 2, 3, 3])>, 'trestbps': <tf.Tensor: shape=(16,), dtype=int32, numpy=
array([145, 160, 120, 130, 130, 120, 140, 120, 130, 140, 140, 140, 130, 120, 172, 150])>, 'chol': <tf.Tensor: shape=(16,), dtype=int32, numpy=
array([233, 286, 229, 250, 204, 236, 268, 354, 254, 203, 192, 294, 256, 263, 199, 168])>, 'fbs': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0])>, 'restecg': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([2, 2, 2, 0, 2, 0, 2, 0, 2, 2, 0, 2, 2, 0, 0, 0])>, 'thalach': <tf.Tensor: shape=(16,), dtype=int32, numpy=
array([150, 108, 129, 187, 172, 178, 160, 163, 147, 155, 148, 153, 142, 173, 162, 174])>, 'exang': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0])>, 'oldpeak': <tf.Tensor: shape=(16,), dtype=float32, numpy=
array([2.3, 1.5, 2.6, 3.5, 1.4, 0.8, 3.6, 0.6, 1.4, 3.1, 0.4, 1.3, 0.6, 0. , 0.5, 1.6], dtype=float32)>, 'slope': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([3, 2, 2, 3, 1, 1, 3, 1, 2, 3, 2, 2, 2, 1, 1, 1])>, 'ca': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([0, 3, 2, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0, 0, 0])>, 'thal': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([2, 3, 4, 3, 3, 3, 3, 3, 4, 4, 2, 3, 2, 4, 4, 3])>}, <tf.Tensor: shape=(16,), dtype=int64, numpy=array([0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0], dtype=int64)>)
Epoch 1/15
19/19 [==============================] - 0s 2ms/step - loss: 57.2277 - accuracy: 0.2739
Epoch 2/15
19/19 [==============================] - 0s 822us/step - loss: 32.8823 - accuracy: 0.3102
Epoch 3/15
19/19 [==============================] - 0s 2ms/step - loss: 12.6299 - accuracy: 0.4323
Epoch 4/15
19/19 [==============================] - 0s 2ms/step - loss: 4.3907 - accuracy: 0.6964
Epoch 5/15
19/19 [==============================] - 0s 2ms/step - loss: 3.6353 - accuracy: 0.7162
Epoch 6/15
19/19 [==============================] - 0s 2ms/step - loss: 3.5936 - accuracy: 0.7261
Epoch 7/15
19/19 [==============================] - 0s 2ms/step - loss: 3.5603 - accuracy: 0.7228
Epoch 8/15
19/19 [==============================] - 0s 2ms/step - loss: 3.5349 - accuracy: 0.7228
Epoch 9/15
19/19 [==============================] - 0s 2ms/step - loss: 3.5064 - accuracy: 0.7228
Epoch 10/15
19/19 [==============================] - 0s 2ms/step - loss: 3.4748 - accuracy: 0.7228
Epoch 11/15
19/19 [==============================] - 0s 2ms/step - loss: 3.4417 - accuracy: 0.7228
Epoch 12/15
19/19 [==============================] - 0s 2ms/step - loss: 3.4077 - accuracy: 0.7228
Epoch 13/15
19/19 [==============================] - 0s 2ms/step - loss: 3.3728 - accuracy: 0.7228
Epoch 14/15
19/19 [==============================] - 0s 1ms/step - loss: 3.3370 - accuracy: 0.7228
Epoch 15/15
19/19 [==============================] - 0s 2ms/step - loss: 3.3004 - accuracy: 0.7228
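As a quick usage sketch (not part of the original script, so its output does not appear in the results above): because model_func was built from a dict of named Input layers, the same dict_slices dataset can be passed straight to Keras' predict to get a sigmoid probability for each patient.

# Inference sketch; assumes the script above has already been run (model_func and dict_slices exist).
probs = model_func.predict(dict_slices.take(1))  # one batch of 16 patients -> (16, 1) sigmoid outputs
print(probs[:5])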
