pyspark ValueError: Some of types cannot be determined after inferring

Scenario: converting a pandas DataFrame to a Spark DataFrame fails with ValueError: Some of types cannot be determined after inferring

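The error is raised because Spark cannot infer the type of at least one column. This typically happens when every value in that column is None, so there is nothing to infer a type from.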
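Before changing any types, it can help to see which columns are likely to blame. The following is a minimal sketch that uses pandas only; `all_null_columns` and the `sample` frame are hypothetical names made up for illustration, and the check simply lists columns in which every value is missing. Note that an all-NaN float column will also be listed even though Spark can usually type it, so treat the result as a list of candidates.

```python
import pandas as pd


def all_null_columns(pdf):
    """Return the names of columns in which every value is missing (None/NaN)."""
    return [col for col in pdf.columns if pdf[col].isna().all()]


# Hypothetical sample data: response_msg contains no usable values at all.
sample = pd.DataFrame({
    "request_vin": ["VIN001", "VIN002"],
    "response_msg": [None, None],
})

suspects = all_null_columns(sample)
print(suspects)  # ['response_msg']
```

Running the same check on the real DataFrame (`b` in this post) should point at the columns that need an explicit type or a cast.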

Solution: simply cast all of the columns to str before handing the DataFrame to Spark.

# b is the pandas DataFrame that is about to be converted to a Spark DataFrame.
# Columns to cast to str so that Spark always sees a concrete type.
str_columns = [
    'request_market', 'request_vin', 'request_brandCode', 'request_token',
    'response_msg', 'response_brandCode', 'response_data_source',
    'response_title', 'response_img', 'result', 'api_path', 'response_code',
    'create_time', 'takeup_time', 'response_length', 'response_feedback',
    'response_carsmodel', 'response_query_time', 'response_data',
]
for col in str_columns:
    b[col] = b[col].astype(str)

# Check the resulting column types.
b.dtypes
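One side effect of `astype(str)` worth knowing: in the pandas versions I am aware of (up to 2.x), missing values become the literal strings 'None' or 'nan' rather than staying null. If real nulls should survive on the Spark side, an alternative is to keep them as None and give Spark the schema explicitly, so that nothing has to be inferred. The sketch below is one way to do that and is not the method used above; `string_schema`, `stringify_keep_nulls`, and `to_spark_as_strings` are hypothetical helper names, `spark` is assumed to be an existing SparkSession, and exact null handling may vary across pandas and Spark versions.

```python
import pandas as pd


def string_schema(columns):
    """Build a schema string such as 'a: string, b: string' (hypothetical helper)."""
    return ", ".join(f"{name}: string" for name in columns)


def stringify_keep_nulls(pdf):
    """Cast every value to str, then restore the originally missing cells as null."""
    return pdf.astype(str).where(pdf.notna(), None)


def to_spark_as_strings(spark, pdf):
    """Create a Spark DataFrame with every column declared as string.

    `spark` is assumed to be an existing SparkSession. Because the schema is
    passed explicitly, Spark does not need to infer any column type.
    """
    cleaned = stringify_keep_nulls(pdf)
    return spark.createDataFrame(cleaned, schema=string_schema(cleaned.columns))
```

With a running session this would be called as `sdf = to_spark_as_strings(spark, b)`. Declaring everything as string mirrors the approach in this post; real types such as `int` or `timestamp` can be put into the schema string instead once the data is known to be clean.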
