Pandas UDFs in PySpark: ModuleNotFoundError: No module named 'pyarrow'
Labels: Apache Spark
Created 08-13-2020 03:02 AM
I am trying to use pandas UDFs in my code. Internally, they use Apache Arrow for data conversion. I am getting the error below for the pyarrow module, even though I import it explicitly in my application code.
File "/opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p3996.4056429/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 361, in main
func, profiler, deserializer, serializer = read_udfs(pickleSer, infile, eval_type)
File "/opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p3996.4056429/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 236, in read_udfs
arg_offsets, udf = read_single_udf(pickleSer, infile, eval_type, runner_conf)
File "/opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p3996.4056429/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 175, in read_single_udf
return arg_offsets, wrap_scalar_pandas_udf(func, return_type)
File "/opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p3996.4056429/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 84, in wrap_scalar_pandas_udf
arrow_return_type = to_arrow_type(return_type)
File "/opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p3996.4056429/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1585, in to_arrow_type
import pyarrow as pa
ModuleNotFoundError: No module named 'pyarrow'
I also tried enabling Arrow manually, but still no luck:
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
Created 08-13-2020 08:27 AM
@AnandG Is the pyarrow module installed? Please try installing it with:
#pip install pyarrow
(Use pip3 for Python 3; you may need to upgrade pip as well.)
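Note that the traceback comes from pyspark/worker.py, so the import is failing in the Python worker on the executors, not in the driver where the application code imports pyarrow. The module therefore has to be available to the interpreter the executors use on every node. Here is a minimal sketch for checking that. The helper names `module_status` and `executor_module_status` and the `partitions` default are made up for illustration; they are not part of PySpark.

```python
import importlib.util
import socket
import sys


def module_status(name):
    """Return (hostname, interpreter path, importable?) for the current process."""
    found = importlib.util.find_spec(name) is not None
    return socket.gethostname(), sys.executable, found


def executor_module_status(spark, name, partitions=8):
    """Run module_status inside Spark tasks and collect the distinct results.

    `spark` is an existing SparkSession. With only a few partitions, some
    nodes may not receive a task, so treat the result as a spot check.
    """
    rdd = spark.sparkContext.parallelize(range(partitions), partitions)
    return rdd.map(lambda _: module_status(name)).distinct().collect()


if __name__ == "__main__":
    # Driver-side check: the interpreter the application code itself runs in.
    print("driver:", module_status("pyarrow"))
```

From a pyspark shell or an application with an active session, calling `executor_module_status(spark, "pyarrow")` should return one tuple per distinct host and interpreter. Any entry ending in `False` points to a node or Python installation where pyarrow is missing. If the executor interpreter path differs from the one on the driver, check which Python the executors are configured to use (for example via `PYSPARK_PYTHON` or `spark.pyspark.python`) and install pyarrow for that one.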
Cheers!
