Contents

  • 1. Problem description
  • 2. Solution
  • Reference

1. Problem description

from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

@udf(returnType=StringType())
def bad_funify(s):
    return s + " is fun!"

countries2 = spark.createDataFrame([("Thailand", 3), (None, 4)], ["country", "id"])
countries2.withColumn("fun_country", bad_funify("country")).show()

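The goal is to use a UDF to add a new column, fun_country, to a DataFrame that has two fields, country and id. The new column should hold a string of the form "<country> is fun!". However, some rows have no value in the country field. Note that inside the Python UDF the missing value arrives as Python None (Spark displays it as null), and the call fails with the following error: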

PythonException: An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 619, in main
    process()
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 611, in process
    serializer.dump_stream(out_iter, outfile)
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 211, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 132, in dump_stream
    for obj in iterator:
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 200, in _batched
    for item in iterator:
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 452, in mapper
    result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 452, in <genexpr>
    result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 87, in <lambda>
    return lambda *a: f(*a)
  File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/util.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "<ipython-input-1051-5a6c51e7c332>", line 5, in bad_funify
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
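
The last frame of the traceback points at the body of bad_funify, so the failure can be reproduced without Spark at all. The sketch below is only an illustration: bad_funify_plain and try_funify are hypothetical names for an undecorated copy of the UDF body and a small wrapper that captures the error message.

```python
def bad_funify_plain(s):
    # Same body as bad_funify, without the @udf decorator.
    return s + " is fun!"


def try_funify(value):
    # Return the result, or the error text if the concatenation fails.
    try:
        return bad_funify_plain(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_funify("Thailand"))  # Thailand is fun!
print(try_funify(None))        # TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
```

The second call fails with the same message as the Spark job, which suggests the error comes from plain Python string concatenation rather than from Spark itself.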

2. Solution

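This turns out to be a silly problem. For the row with a missing country, the UDF evaluates None + " is fun!", which Python does not support. When country is empty, fun_country should simply be empty as well, so the fix is to add a None check. After changing the UDF to good_funify: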

@udf(returnType=StringType())
def good_funify(s):
    # Pass missing values through instead of concatenating onto None.
    return None if s is None else s + " is fun!"

countries2.withColumn("fun_country", good_funify("country")).show()

+--------+---+----------------+
| country| id|     fun_country|
+--------+---+----------------+
|Thailand|  3|Thailand is fun!|
|    null|  4|            null|
+--------+---+----------------+
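
If several UDFs need the same guard, the None check could be factored out into a small decorator. The following is a minimal sketch in plain Python, not part of the original fix: none_safe and funify_plain are hypothetical names, and the rows simply mirror the sample data above.

```python
import functools


def none_safe(func):
    """Hypothetical helper: return None when any positional argument is None."""
    @functools.wraps(func)
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return func(*args)
    return wrapper


@none_safe
def funify_plain(s):
    # No explicit None check needed here; the decorator handles it.
    return s + " is fun!"


rows = [("Thailand", 3), (None, 4)]
results = [funify_plain(country) for country, _ in rows]
print(results)  # ['Thailand is fun!', None]
```

In a Spark job the same wrapper would presumably sit between the @udf decorator and the function definition; that combination is an assumption and is not exercised in this sketch.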

Reference

[1] Navigating None and null in PySpark
