TensorFlow Comprehensive Example 3: Classifying Structured Data (CSV, Keras, Feature Columns)

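Table of Contents

1. Dataset
- 1.1 Creating a dataframe from a CSV with Pandas
- 1.2 Splitting the dataframe into training, validation, and test sets
- 1.3 Creating an input pipeline (Dataset) with tf.data
- 1.4 Understanding the input pipeline
2. Feature columns (feature_column)
- 2.1 Numeric columns
- 2.2 Bucketized columns
- 2.3 Categorical columns
- 2.4 Embedding columns
- 2.5 Hashed feature columns
- 2.6 Crossed feature columns
- 2.7 Choosing which columns to use
3. Building and running the model
- 3.1 Creating a feature layer
- 3.2 Creating, compiling, and training the model
4. Complete code
5. Another simple example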

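This article is based mainly on: https://www.tensorflow.org/tutorials/structured_data/feature_columns?hl=zh-cn

This tutorial demonstrates how to classify structured data (for example, tabular data in a CSV). We will use Keras to define the model, and feature columns as a bridge that maps the columns of the CSV to the features used to train the model. The tutorial contains complete code to:

Load a CSV file using Pandas.
Build an input pipeline with tf.data to batch and shuffle the rows.
Map the columns of the CSV to the features used to train the model, using feature columns.
Build, train, and evaluate a model using Keras.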

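1. Dataset

We will use a small dataset provided by the Cleveland Clinic Foundation for Heart Disease. The CSV contains a few hundred rows. Each row describes a patient, and each column describes an attribute. We will use this information to predict whether a patient has heart disease, which is a binary classification task on this dataset.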

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

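1.1 Creating a dataframe from a CSV with Pandas

Pandas is a Python library with many helpful utilities for loading and working with structured data. We will use Pandas to download the dataset from a URL and load it into a dataframe.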

URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
dataframe = pd.read_csv(URL)
dataframe.head()

	age	sex	cp	trestbps	chol	fbs	restecg	thalach	exang	oldpeak	slope	ca	thal	target
0	63	1	1	145	233	1	2	150	0	2.3	3	0	fixed	0
1	67	1	4	160	286	0	2	108	1	1.5	2	3	normal	1
2	67	1	4	120	229	0	2	129	1	2.6	2	2	reversible	0
3	37	1	3	130	250	0	0	187	0	3.5	3	0	normal	0
4	41	0	2	130	204	0	2	172	0	1.4	1	0	normal	0

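1.2 Splitting the dataframe into training, validation, and test sets

The dataset we downloaded is a single CSV file. We split it into training, validation, and test sets.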

train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')

193 train examples
49 validation examples
61 test examples
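The sizes printed above can be checked by hand. The sketch below assumes the CSV has 303 rows (193 + 49 + 61) and that train_test_split rounds a fractional test size up, which is consistent with the numbers shown. split_sizes is a hypothetical helper written for this illustration, not part of scikit-learn.

```python
import math


def split_sizes(n_rows, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the held-out share is rounded up and the training
    set receives the remaining rows.
    """
    n_test = math.ceil(n_rows * test_size)
    return n_rows - n_test, n_test


n_total = 193 + 49 + 61                          # rows implied by the printed sizes
n_train_val, n_test = split_sizes(n_total, 0.2)  # first split: hold out the test set
n_train, n_val = split_sizes(n_train_val, 0.2)   # second split: hold out validation
print(n_train, 'train', n_val, 'validation', n_test, 'test')
```

Because the second split is applied to the remaining 80%, the three sets end up at roughly 64%, 16%, and 20% of the data.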

1.3 Creating an input pipeline (Dataset) with tf.data

Next, we wrap the dataframes with tf.data. This lets us use feature columns as a bridge that maps the columns of the Pandas dataframe to the features used to train the model. If we were working with a very large CSV file (so large that it does not fit in memory), we would use tf.data to read it from disk directly. That is not covered in this tutorial.

# A utility method to create a tf.data dataset from a Pandas Dataframe
def df_to_dataset(dataframe, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop('target')
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds

batch_size = 5  # A small batch size is used for demonstration purposes
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

1.4 Understanding the input pipeline

Now that we have created the input pipeline, let's call it to see the format of the data it returns. We used a small batch size to keep the output readable.

for feature_batch, label_batch in train_ds.take(1):
    print('Every feature:', list(feature_batch.keys()))
    print('A batch of ages:', feature_batch['age'])
    print('A batch of targets:', label_batch)

Every feature: ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']
A batch of ages: tf.Tensor([51 56 42 54 46], shape=(5,), dtype=int64)
A batch of targets: tf.Tensor([0 0 0 1 0], shape=(5,), dtype=int64)

We can see that the dataset returns a dictionary that maps column names (from the dataframe) to column values from rows in the dataframe.
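To make that structure concrete, here is a minimal plain-Python sketch of what shuffling and batching a dictionary of columns amounts to. It does not use tf.data; the helper name batches and the sample rows are made up for illustration.

```python
import random


def batches(columns, labels, batch_size, shuffle=True, seed=0):
    """Yield (feature_dict, label_list) pairs, one pair per batch."""
    order = list(range(len(labels)))
    if shuffle:
        random.Random(seed).shuffle(order)  # shuffle row indices, not the columns
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        feature_batch = {name: [values[i] for i in idx]
                         for name, values in columns.items()}
        yield feature_batch, [labels[i] for i in idx]


# Hypothetical rows; the real dataset has 13 feature columns.
columns = {
    'age': [63, 67, 67, 37, 41, 56, 62],
    'thal': ['fixed', 'normal', 'reversible', 'normal',
             'normal', 'normal', 'reversible'],
}
labels = [0, 1, 0, 0, 0, 0, 1]

for feature_batch, label_batch in batches(columns, labels, batch_size=5):
    print(list(feature_batch.keys()), feature_batch['age'], label_batch)
```

Each batch keeps the column names as dictionary keys, and the last batch is simply shorter when the number of rows is not a multiple of the batch size.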

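2. Feature columns (feature_column)

TensorFlow provides many types of feature columns. In this section, we will create several types of feature columns and demonstrate how they transform a column from the dataframe.

# We will use this batch to demonstrate several types of feature columns
example_batch = next(iter(train_ds))[0]

# A utility method to create a feature column
# and to transform a batch of data
def demo(feature_column):
    feature_layer = layers.DenseFeatures(feature_column)
    print(feature_layer(example_batch).numpy())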

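2.1 Numeric columns

The output of a feature column becomes the input to the model (using the demo function defined above, we will be able to see exactly how each column from the dataframe is transformed). A numeric column is the simplest type of column. It is used to represent real-valued features. When using this column, the model receives the column value from the dataframe unchanged.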

age = feature_column.numeric_column("age")
demo(age)

WARNING:tensorflow:Layer dense_features is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because its dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
[[62.]
 [52.]
 [40.]
 [59.]
 [56.]]

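In the heart disease dataset, most columns from the dataframe are numeric.

2.2 Bucketized columns

Often, you don't want to feed a number directly into the model, but instead split its value into different categories based on numerical ranges. Consider raw data that represents a person's age. Instead of representing age as a numeric column, we could split the age into several buckets using a bucketized column. Notice that the one-hot values below describe which age range each row matches.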

age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
demo(age_buckets)

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]
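The bucketing step can be reproduced without TensorFlow. The following is a minimal sketch using the standard-library bisect module, assuming each bucket includes its lower boundary and excludes its upper one; bucket_one_hot is a hypothetical helper. Ten boundaries produce eleven buckets, and the ages are the ones printed by the numeric-column demo.

```python
from bisect import bisect_right

boundaries = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]


def bucket_one_hot(value, boundaries):
    """Return a one-hot list with len(boundaries) + 1 slots.

    Bucket 0 holds values below boundaries[0]; bucket i holds values in
    [boundaries[i - 1], boundaries[i]); the last bucket holds values at
    or above boundaries[-1].
    """
    index = bisect_right(boundaries, value)
    return [1.0 if i == index else 0.0 for i in range(len(boundaries) + 1)]


for age in [62, 52, 40, 59, 56]:  # ages from the numeric-column demo
    print(age, bucket_one_hot(age, boundaries))
```

Under that assumption the hot positions are 9, 7, 5, 8, and 8, the same positions as in the demo output. Age 40 falls in the [40, 45) bucket rather than [35, 40).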

2.3 Categorical columns

In this dataset, thal is represented as a string (e.g. 'fixed', 'normal', or 'reversible'). We cannot feed strings directly to a model. Instead, we must first map them to numeric values. Categorical vocabulary columns provide a way to represent strings as a one-hot vector (much like the age buckets you saw above). The vocabulary can be passed as a list using categorical_column_with_vocabulary_list, or loaded from a file using categorical_column_with_vocabulary_file.

thal = feature_column.categorical_column_with_vocabulary_list('thal', ['fixed', 'normal', 'reversible'])
thal_one_hot = feature_column.indicator_column(thal)
demo(thal_one_hot)

[[0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 1.]
 [0. 0. 1.]
 [0. 1. 0.]]
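As a rough plain-Python picture of the vocabulary lookup, each string is replaced by a vector with a 1 at its position in the vocabulary. In this sketch, one_hot is a hypothetical helper, and returning all zeros for a string outside the vocabulary is an assumption of the sketch rather than a statement about every TensorFlow configuration.

```python
vocabulary = ['fixed', 'normal', 'reversible']


def one_hot(value, vocabulary):
    """Return a one-hot list over the vocabulary.

    Assumption: a value that is not in the vocabulary yields all zeros.
    """
    return [1.0 if value == entry else 0.0 for entry in vocabulary]


for thal_value in ['normal', 'reversible', 'fixed', 'unknown']:
    print(thal_value, one_hot(thal_value, vocabulary))
```

Read against this sketch, the demo output above corresponds to the thal values normal, reversible, reversible, reversible, normal.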

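In more complex datasets, many columns are categorical (e.g. strings). Feature columns are most valuable when working with categorical data. Although there is only one categorical column in this dataset, we will use it to demonstrate several important types of feature columns that you could use when working with other datasets.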

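2.4 Embedding columns

Suppose that instead of having just a few possible strings, we have thousands (or more) values per category. For a number of reasons, as the number of categories grows large, it becomes infeasible to train a neural network using one-hot encodings. We can use an embedding column to overcome this limitation. Instead of representing the data as a one-hot vector of many dimensions, an embedding column represents it as a lower-dimensional dense vector in which each cell can contain any number, not just 0 or 1. The size of the embedding (8 in the example below) is a parameter that must be tuned.

Key point: an embedding column is best used when a categorical column has many possible values. We use one here for demonstration purposes, so that you have a complete example you can modify for a different dataset in the future.

# Notice that the input to the embedding column is the categorical column we created previously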
thal_embedding = feature_column.embedding_column(thal, dimension=8)
demo(thal_embedding)

[[-0.16302079 -0.19813393 -0.11037839 -0.2307198   0.30720705 -0.5701954
   0.0502194  -0.34920064]
 [ 0.4270712  -0.278063    0.23978122 -0.07503474  0.10773634 -0.06057737
  -0.6062939   0.19062711]
 [ 0.4270712  -0.278063    0.23978122 -0.07503474  0.10773634 -0.06057737
  -0.6062939   0.19062711]
 [ 0.4270712  -0.278063    0.23978122 -0.07503474  0.10773634 -0.06057737
  -0.6062939   0.19062711]
 [-0.16302079 -0.19813393 -0.11037839 -0.2307198   0.30720705 -0.5701954
   0.0502194  -0.34920064]]
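Conceptually, an embedding is a table with one row per category, and encoding a value is a row lookup. The sketch below illustrates this in plain Python under simplifying assumptions: the table is filled with arbitrary random numbers, whereas in a real model these numbers are weights learned during training. embedding_table and embed are hypothetical names.

```python
import random

vocabulary = ['fixed', 'normal', 'reversible']
dimension = 8

# One row of `dimension` numbers per vocabulary entry. The values here are
# arbitrary; a trained model would have learned them.
rng = random.Random(0)
embedding_table = [[rng.uniform(-1.0, 1.0) for _ in range(dimension)]
                   for _ in vocabulary]


def embed(value):
    """Look up the dense vector for a category string."""
    return embedding_table[vocabulary.index(value)]


# Same pattern of thal values as in the one-hot demo.
batch = ['normal', 'reversible', 'reversible', 'reversible', 'normal']
for value in batch:
    print(value, [round(x, 3) for x in embed(value)])
```

This is why the demo output contains only two distinct rows: rows with the same thal value share the same vector.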

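2.5 Hashed feature columns

Another way to represent a categorical column with a large number of values is to use categorical_column_with_hash_bucket. This feature column computes a hash value of the input, then selects one of the hash_bucket_size buckets to encode a string. When using this column, you do not need to provide the vocabulary, and you can choose to make the number of hash buckets significantly smaller than the number of actual categories to save space.

Key point: an important downside of this technique is that there may be collisions, in which different strings are mapped to the same bucket. In practice, this can work well for some datasets regardless.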

thal_hashed = feature_column.categorical_column_with_hash_bucket('thal', hash_bucket_size=1000)
demo(feature_column.indicator_column(thal_hashed))

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
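The idea of hashing into buckets, and of collisions, can be sketched in a few lines. This example uses MD5 from hashlib only because it gives the same result on every run (Python's built-in hash() for strings is randomized per process). TensorFlow uses its own hash function, so the bucket indices printed here will not match what the feature column computes. hash_bucket is a hypothetical helper.

```python
import hashlib


def hash_bucket(value, hash_bucket_size):
    """Map a string to a bucket index in [0, hash_bucket_size).

    Assumption: MD5 stands in for the hash function for illustration only.
    """
    digest = hashlib.md5(value.encode('utf-8')).hexdigest()
    return int(digest, 16) % hash_bucket_size


values = ['fixed', 'normal', 'reversible']
for size in (1000, 2):
    print(size, {v: hash_bucket(v, size) for v in values})
```

With only 2 buckets for 3 strings, at least two strings must share a bucket, which is the collision the key point describes. Separately, the demo output above probably looks all-zero only because each row has 1000 entries and NumPy elides the middle of long rows when printing.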

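2.6 Crossed feature columns

Combining features into a single feature, known as a feature cross, lets a model learn separate weights for each combination of features. Here, we create a new feature that is the cross of age and thal. Note that crossed_column does not build the full table of all possible combinations (which could be very large). Instead, it is backed by a hashed_column, so you can choose how large the table is.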

crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000)
demo(feature_column.indicator_column(crossed_feature))

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
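A feature cross can be pictured as hashing the pair (age bucket, thal value) into a fixed number of slots. The sketch below is an illustration under stated assumptions: the pair is joined into a string and hashed with MD5, which is not how TensorFlow combines and hashes the inputs, and crossed_bucket is a hypothetical helper.

```python
import hashlib
from bisect import bisect_right

boundaries = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]


def crossed_bucket(age, thal, hash_bucket_size=1000):
    """Hash the (age bucket, thal) pair into one of hash_bucket_size slots.

    Assumption: string joining plus MD5 is used purely for illustration.
    """
    age_bucket = bisect_right(boundaries, age)
    key = f'{age_bucket}_X_{thal}'
    digest = hashlib.md5(key.encode('utf-8')).hexdigest()
    return int(digest, 16) % hash_bucket_size


print(crossed_bucket(62, 'normal'))
print(crossed_bucket(63, 'normal'))      # same age bucket and thal: same slot
print(crossed_bucket(62, 'reversible'))  # different thal: usually another slot
```

For this dataset the full cross would have only 11 x 3 = 33 combinations, so hashing matters mainly when the crossed features have many more possible values.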

2.7 Choosing which columns to use

We have seen how to use several types of feature columns. Now we will use them to train a model. The goal of this tutorial is to show you the complete code (i.e. the mechanics) needed to work with feature columns. We have selected a few columns to train our model arbitrarily.

Key point: if your aim is to build an accurate model, try a larger dataset of your own, and think carefully about which features are the most meaningful to include and how they should be represented.

feature_columns = []

# Numeric columns
for header in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope', 'ca']:
    feature_columns.append(feature_column.numeric_column(header))

# Bucketized column
age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
feature_columns.append(age_buckets)

# Categorical column
thal = feature_column.categorical_column_with_vocabulary_list('thal', ['fixed', 'normal', 'reversible'])
thal_one_hot = feature_column.indicator_column(thal)
feature_columns.append(thal_one_hot)

# Embedding column
thal_embedding = feature_column.embedding_column(thal, dimension=8)
feature_columns.append(thal_embedding)

# Crossed column
crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000)
crossed_feature = feature_column.indicator_column(crossed_feature)
feature_columns.append(crossed_feature)

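3. Building and running the model

3.1 Creating a feature layer

Now that we have defined our feature columns, we will use a DenseFeatures layer to feed them into our Keras model.

feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

Earlier, we used a small batch size to demonstrate how feature columns work. We now create a new input pipeline with a larger batch size.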

batch_size = 32
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

3.2 Creating, compiling, and training the model

model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'],
              run_eagerly=True)

model.fit(train_ds,
          validation_data=val_ds,
          epochs=5)

7/7 [==============================] - 0s 42ms/step - loss: 0.5361 - accuracy: 0.7254 - val_loss: 0.7132 - val_accuracy: 0.5102
<tensorflow.python.keras.callbacks.History at 0x7ffbf4973410>

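Key point: you will typically see the best results with deep learning on much larger and more complex datasets. When working with a small dataset like this one, we recommend using a decision tree or random forest as a strong baseline. The goal of this tutorial is not to train an accurate model, but to demonstrate the mechanics of working with structured data, so that you have code to use as a starting point when working with your own datasets in the future.

Next steps
The best way to learn more about classifying structured data is to try it yourself. We suggest finding another dataset to work with and training a model to classify it using code similar to the above. To improve accuracy, think carefully about which features to include in your model and how they should be represented.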

4. Complete code

# -*- coding: utf-8 -*-
"""
AUTHOR: lujinhong
CREATED ON: 2020-08-28 11:53
PROJECT: lujinhong-commons-python3
DESCRIPTION: TODO
"""
import ssl
# Disable HTTPS certificate verification for the dataset download (use with care)
ssl._create_default_https_context = ssl._create_unverified_context
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

## 1. Dataset
URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
dataframe = pd.read_csv(URL)
dataframe.head()

train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)

# A utility method to create a tf.data dataset from a Pandas Dataframe
def df_to_dataset(dataframe, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop('target')
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds

batch_size = 32
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

## 2. Build the feature_columns
feature_columns = []

# Numeric columns
for header in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope', 'ca']:
    feature_columns.append(feature_column.numeric_column(header))

# Bucketized column
age = feature_column.numeric_column("age")
age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
feature_columns.append(age_buckets)

# Categorical column
thal = feature_column.categorical_column_with_vocabulary_list('thal', ['fixed', 'normal', 'reversible'])
thal_one_hot = feature_column.indicator_column(thal)
feature_columns.append(thal_one_hot)

# Embedding column
thal_embedding = feature_column.embedding_column(thal, dimension=8)
feature_columns.append(thal_embedding)

# Crossed column
crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000)
crossed_feature = feature_column.indicator_column(crossed_feature)
feature_columns.append(crossed_feature)

## 3. Build and run the model
feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'],
              run_eagerly=True)

model.fit(train_ds,
          validation_data=val_ds,
          epochs=5)

Epoch 1/5
Consider rewriting this model with the Functional API.
7/7 [==============================] - 0s 42ms/step - loss: 0.5361 - accuracy: 0.7254 - val_loss: 0.7132 - val_accuracy: 0.5102
<tensorflow.python.keras.callbacks.History at 0x7ffbf4973410>

5. Another simple example

# -*- coding: utf-8 -*-
"""
AUTHOR: lujinhong
CREATED ON: 2020-08-28 10:26
PROJECT: lujinhong-commons-python3
DESCRIPTION: TODO
"""
import tensorflow as tf
import pandas as pd
print(tf.__version__)
import ssl
# Disable HTTPS certificate verification for the dataset download (use with care)
ssl._create_default_https_context = ssl._create_unverified_context

## 1. Prepare the dataset
df_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
df_eval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
y_train = df_train.pop('survived')
y_eval = df_eval.pop('survived')

ds_train = tf.data.Dataset.from_tensor_slices((dict(df_train), y_train)).batch(2)
ds_eval = tf.data.Dataset.from_tensor_slices((dict(df_eval), y_eval))

## 2. Build the feature_columns
CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck', 'embark_town', 'alone']
NUMERIC_COLUMNS = ['age', 'fare']

feature_columns = []
# One-hot encode the categorical features; embedding_column could be used instead to embed them.
for feature_name in CATEGORICAL_COLUMNS:
    vocabulary = df_train[feature_name].unique()
    feature_columns.append(
        tf.feature_column.indicator_column(
            tf.feature_column.categorical_column_with_vocabulary_list(feature_name, vocabulary)))

for feature_name in NUMERIC_COLUMNS:
    feature_columns.append(tf.feature_column.numeric_column(feature_name, dtype=tf.float32))

# Besides the features above, crossed features could also be added.

## 3. Build and run the model
feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

model = tf.keras.Sequential([
    feature_layer,
    # tf.keras.layers.Dense(128, activation='relu'),
    # tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(ds_train)

2.3.0
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'sex': <tf.Tensor 'ExpandDims_8:0' shape=(None, 1) dtype=string>, 'age': <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=float64>, 'n_siblings_spouses': <tf.Tensor 'ExpandDims_6:0' shape=(None, 1) dtype=int64>, 'parch': <tf.Tensor 'ExpandDims_7:0' shape=(None, 1) dtype=int64>, 'fare': <tf.Tensor 'ExpandDims_5:0' shape=(None, 1) dtype=float64>, 'class': <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=string>, 'deck': <tf.Tensor 'ExpandDims_3:0' shape=(None, 1) dtype=string>, 'embark_town': <tf.Tensor 'ExpandDims_4:0' shape=(None, 1) dtype=string>, 'alone': <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=string>}
Consider rewriting this model with the Functional API.
314/314 [==============================] - 0s 1ms/step - loss: 6.1425 - accuracy: 0.5965
<tensorflow.python.keras.callbacks.History at 0x7ffbf4d03390>