Created on ‎04-26-2023 03:21 AM
Spark Python Integration Test Result Exceptions
This article lists exceptions raised when importing PySpark, together with the Python and Spark versions on which each one occurs. Keep an eye on this article; I will add more exceptions and solutions over time.
1. TypeError: an integer is required (got type bytes)
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 46, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 146, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 127, in _make_cell_set_template_code
TypeError: an integer is required (got type bytes)
Python Versions:
- Python 3.8
- Python 3.9
Spark Versions:
- Spark Version: 2.3.0
- Spark Version: 2.3.1
- Spark Version: 2.3.2
- Spark Version: 2.3.3
- Spark Version: 2.3.4
- Spark Version: 2.4.0
- Spark Version: 2.4.1
- Spark Version: 2.4.2
- Spark Version: 2.4.3
- Spark Version: 2.4.4
- Spark Version: 2.4.5
- Spark Version: 2.4.6
- Spark Version: 2.4.7
- Spark Version: 2.4.8
2. TypeError: 'bytes' object cannot be interpreted as an integer
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 46, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 146, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 127, in _make_cell_set_template_code
TypeError: 'bytes' object cannot be interpreted as an integer
Python Versions:
- Python 3.10
Spark Versions:
- Spark Version: 2.3.0
- Spark Version: 2.3.1
- Spark Version: 2.3.2
- Spark Version: 2.3.3
- Spark Version: 2.3.4
- Spark Version: 2.4.0
- Spark Version: 2.4.1
- Spark Version: 2.4.2
- Spark Version: 2.4.3
- Spark Version: 2.4.4
- Spark Version: 2.4.5
- Spark Version: 2.4.6
- Spark Version: 2.4.7
- Spark Version: 2.4.8
3. TypeError: code expected at least 16 arguments, got 15
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 46, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 146, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 127, in _make_cell_set_template_code
TypeError: code expected at least 16 arguments, got 15
Python Versions:
- Python 3.11
Spark Versions:
- Spark Version: 2.3.0
- Spark Version: 2.3.1
- Spark Version: 2.3.2
- Spark Version: 2.3.3
- Spark Version: 2.3.4
- Spark Version: 2.4.0
- Spark Version: 2.4.1
- Spark Version: 2.4.2
- Spark Version: 2.4.3
- Spark Version: 2.4.4
- Spark Version: 2.4.5
- Spark Version: 2.4.6
- Spark Version: 2.4.7
- Spark Version: 2.4.8
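Cases 1 to 3 all fail in the same frame, _make_cell_set_template_code, in the cloudpickle module bundled with Spark 2.3.x and 2.4.x. The article does not state the cause. A likely explanation is that this bundled cloudpickle builds a code object by calling the types.CodeType constructor with positional arguments, and that constructor's parameter list changed in Python 3.8 (a positional-only argument count was added) and again in Python 3.11 (more required fields). The sketch below does not use Spark; it only illustrates that signature drift and the version-stable code.replace() alternative. The names sample and rename_code are made up for illustration.

```python
import sys
import types

def sample(a, b=1):
    return a + b

code = sample.__code__
assert isinstance(code, types.CodeType)

# Python 3.8 added a positional-only argument count to code objects,
# which shifted the positional parameters of the CodeType constructor.
has_posonly = hasattr(code, "co_posonlyargcount")

# Python 3.11 added further fields, such as the qualified name.
has_qualname = hasattr(code, "co_qualname")

def rename_code(code_obj, new_name):
    """Copy a code object under a new name without calling CodeType(...) positionally."""
    if hasattr(code_obj, "replace"):  # code.replace() exists from Python 3.8
        return code_obj.replace(co_name=new_name)
    raise RuntimeError("code.replace() needs Python 3.8 or newer")

renamed = rename_code(code, "sample_renamed")
print(sys.version_info[:2], has_posonly, has_qualname, renamed.co_name)
```

Code that passes a fixed number of positional arguments to the constructor breaks whenever the parameter list changes, which would be consistent with the three different messages seen on Python 3.8/3.9, 3.10, and 3.11.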
4. TypeError: code() argument 13 must be str, not int
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 51, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/context.py", line 30, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 71, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 209, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 172, in _make_cell_set_template_code
TypeError: code() argument 13 must be str, not int
Python Versions:
- Python 3.11
Spark Versions:
- Spark Version: 3.0.0
- Spark Version: 3.0.1
- Spark Version: 3.0.3
5. SyntaxError: invalid syntax
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 53, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 34, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/java_gateway.py", line 31, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/find_spark_home.py", line 68
print("Could not find valid SPARK_HOME while searching {0}".format(paths), file=sys.stderr)
^
SyntaxError: invalid syntax
Python Versions:
- Python 2.7
Spark Versions:
- Spark Version: 3.1.1
- Spark Version: 3.1.2
- Spark Version: 3.1.3
- Spark Version: 3.2.0
- Spark Version: 3.2.1
- Spark Version: 3.2.2
- Spark Version: 3.2.3
6. ImportError: No module named 'typing'
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 53, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 34, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/java_gateway.py", line 32, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 67, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle/__init__.py", line 4, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/cloudpickle/cloudpickle.py", line 54, in <module>
ImportError: No module named 'typing'
Python Versions:
- Python 3.4
Spark Versions:
- Spark Version: 3.1.1
- Spark Version: 3.1.2
- Spark Version: 3.1.3
- Spark Version: 3.2.0
- Spark Version: 3.2.1
- Spark Version: 3.2.2
- Spark Version: 3.2.3
- Spark Version: 3.3.0
- Spark Version: 3.3.1
7. AttributeError: 'NoneType' object has no attribute 'items'
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "<frozen importlib._bootstrap>", line 968, in _find_and_load
File "<frozen importlib._bootstrap>", line 957, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 664, in _load_unlocked
File "<frozen importlib._bootstrap>", line 634, in _load_backward_compatible
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 53, in <module>
File "<frozen importlib._bootstrap>", line 968, in _find_and_load
File "<frozen importlib._bootstrap>", line 957, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 664, in _load_unlocked
File "<frozen importlib._bootstrap>", line 634, in _load_backward_compatible
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 48, in <module>
File "<frozen importlib._bootstrap>", line 968, in _find_and_load
File "<frozen importlib._bootstrap>", line 957, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 664, in _load_unlocked
File "<frozen importlib._bootstrap>", line 634, in _load_backward_compatible
File "/opt/spark/python/lib/pyspark.zip/pyspark/traceback_utils.py", line 23, in <module>
File "/opt/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 390, in namedtuple
AttributeError: 'NoneType' object has no attribute 'items'
Python Versions:
- Python 3.5
Spark Versions:
- Spark Version: 3.1.1
- Spark Version: 3.1.2
- Spark Version: 3.1.3
- Spark Version: 3.2.0
- Spark Version: 3.2.1
- Spark Version: 3.2.2
- Spark Version: 3.2.3
- Spark Version: 3.3.2
- Spark Version: 3.4.0
8. SyntaxError: invalid syntax
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 71
def since(version: Union[str, float]) -> Callable[[F], F]:
^
SyntaxError: invalid syntax
Python Versions:
- Python 2.7
Spark Versions:
- Spark Version: 3.3.0
- Spark Version: 3.3.1
- Spark Version: 3.3.2
- Spark Version: 3.4.0
9. SyntaxError: invalid syntax
Exception:
Traceback (most recent call last):
File "/opt/pyspark_udf_example.py", line 3, in <module>
from pyspark.sql import SparkSession
File "<frozen importlib._bootstrap>", line 968, in _find_and_load
File "<frozen importlib._bootstrap>", line 957, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 664, in _load_unlocked
File "<frozen importlib._bootstrap>", line 634, in _load_backward_compatible
File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 53, in <module>
File "<frozen importlib._bootstrap>", line 968, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 896, in _find_spec
File "<frozen importlib._bootstrap_external>", line 1171, in find_spec
File "<frozen importlib._bootstrap_external>", line 1147, in _get_spec
File "<frozen importlib._bootstrap_external>", line 1128, in _legacy_get_spec
File "<frozen importlib._bootstrap>", line 444, in spec_from_loader
File "<frozen importlib._bootstrap_external>", line 565, in spec_from_file_location
File "/opt/spark/python/lib/pyspark.zip/pyspark/conf.py", line 110
_jconf: Optional[JavaObject]
^
SyntaxError: invalid syntax
Python Versions:
- Python 3.5
Spark Versions:
- Spark Version: 3.3.0
- Spark Version: 3.3.1
- Spark Version: 3.3.2
- Spark Version: 3.4.0
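The nine cases above can be collected into a small lookup, which may be handy when checking a planned Python/Spark pair against these results. This is a minimal sketch: the data is transcribed from the lists above, while the names CASES and observed_failures are illustrative. A pair that is missing from the table was simply not reported here; that does not mean it is supported.

```python
# Spark versions as listed in the cases above.
SPARK_2X = ["2.3.0", "2.3.1", "2.3.2", "2.3.3", "2.3.4",
            "2.4.0", "2.4.1", "2.4.2", "2.4.3", "2.4.4",
            "2.4.5", "2.4.6", "2.4.7", "2.4.8"]
SPARK_31_32 = ["3.1.1", "3.1.2", "3.1.3", "3.2.0", "3.2.1", "3.2.2", "3.2.3"]
SPARK_33_34 = ["3.3.0", "3.3.1", "3.3.2", "3.4.0"]

# (case number, exception message, Python versions, Spark versions)
CASES = [
    (1, "TypeError: an integer is required (got type bytes)", ["3.8", "3.9"], SPARK_2X),
    (2, "TypeError: 'bytes' object cannot be interpreted as an integer", ["3.10"], SPARK_2X),
    (3, "TypeError: code expected at least 16 arguments, got 15", ["3.11"], SPARK_2X),
    (4, "TypeError: code() argument 13 must be str, not int", ["3.11"], ["3.0.0", "3.0.1", "3.0.3"]),
    (5, "SyntaxError: invalid syntax", ["2.7"], SPARK_31_32),
    (6, "ImportError: No module named 'typing'", ["3.4"], SPARK_31_32 + ["3.3.0", "3.3.1"]),
    (7, "AttributeError: 'NoneType' object has no attribute 'items'", ["3.5"], SPARK_31_32 + ["3.3.2", "3.4.0"]),
    (8, "SyntaxError: invalid syntax", ["2.7"], SPARK_33_34),
    (9, "SyntaxError: invalid syntax", ["3.5"], SPARK_33_34),
]

def observed_failures(python_version, spark_version):
    """Return (case number, message) for every case listing this pair.

    python_version is "major.minor" (e.g. "3.8"); spark_version is the
    full version string (e.g. "2.4.8").
    """
    return [(number, message)
            for number, message, pythons, sparks in CASES
            if python_version in pythons and spark_version in sparks]

for number, message in observed_failures("3.8", "2.4.8"):
    print("case {}: {}".format(number, message))
```

A list of records is used instead of a dictionary because some pairs appear in more than one case, for example Python 3.5 with Spark 3.4.0 in cases 7 and 9.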
Note: All of the above exceptions occurred while testing a sample PySpark UDF example with different Python versions.
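The original /opt/pyspark_udf_example.py is not included in this article. As a hypothetical stand-in for similar compatibility checks, a minimal UDF smoke test could look like the sketch below. The function names, the local[1] master, and the sample data are assumptions, not the script used for the results above.

```python
def to_upper(value):
    """Plain Python function that is later wrapped as a Spark UDF."""
    return None if value is None else value.upper()

def run_udf_smoke_test():
    """Start a local SparkSession and apply the UDF to a tiny DataFrame."""
    # Imported inside the function: with an incompatible Python/Spark pair,
    # this import is where the tracebacks above are raised.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = (SparkSession.builder
             .master("local[1]")           # assumption: local test run
             .appName("udf-smoke-test")    # hypothetical application name
             .getOrCreate())
    try:
        df = spark.createDataFrame([("alice",), ("bob",)], ["name"])
        upper_udf = udf(to_upper, StringType())
        rows = df.withColumn("name_upper", upper_udf(df["name"])).collect()
        return [row["name_upper"] for row in rows]
    finally:
        spark.stop()
```

Calling run_udf_smoke_test() under each interpreter of interest (for example through spark-submit) exercises both the PySpark import and UDF serialization, which are the steps most sensitive to the Python version.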
Created on ‎03-25-2024 05:44 PM
Hi
Is it possible to obtain the source code of pyspark_udf_example.py (/opt/pyspark_udf_example.py)?
We would like to run some compatibility tests against our Python and Spark versions.
Thank you.
Created on ‎03-25-2024 10:34 PM
Hi @Leonm
We have already published the Python versions supported by each Spark version in the article below:
https://community.cloudera.com/t5/Community-Articles/Spark-Python-Supportability-Matrix/ta-p/379144
Please let me know if you still need the PySpark UDF example for testing.
Created on ‎03-25-2024 10:40 PM
Yes, please.
We may need that UDF example code to test our environment in the future.
Thank you for the help.
Created on ‎03-26-2024 06:20 AM
You can find examples in the following GitHub repository:
Created on ‎03-26-2024 06:21 PM
Thanks a lot!
Created on ‎12-07-2024 10:17 PM
I am using the following versions:
Spark 2.4.8
Python 3.6.8
I get the above error (traceback below) only when spark-submit is run from NiFi or Oozie; it works fine when run from the shell. Is there a solution, or a configuration that I missed?
from pyspark.sql import SparkSession
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 51, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 72, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 145, in <module>
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
TypeError: an integer is required (got type bytes)
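The traceback above ends in the same cloudpickle frame and message as case 1, which the article lists for Python 3.8 and 3.9 rather than 3.6. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that NiFi or Oozie launches spark-submit with a different Python interpreter than the interactive shell does. A small diagnostic placed at the top of the script, before the pyspark import, would show which interpreter each launcher actually uses. The function name interpreter_report is illustrative; PYSPARK_PYTHON, PYSPARK_DRIVER_PYTHON, and SPARK_HOME are the standard Spark environment variables.

```python
import os
import sys

def interpreter_report():
    """Collect the interpreter and Spark-related environment seen by this process."""
    return {
        "executable": sys.executable,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
        "PYSPARK_DRIVER_PYTHON": os.environ.get("PYSPARK_DRIVER_PYTHON"),
        "SPARK_HOME": os.environ.get("SPARK_HOME"),
    }

# Print before importing pyspark, so the report still appears when the import fails.
for key, value in sorted(interpreter_report().items()):
    print("{}={}".format(key, value))
```

Comparing this output between the shell run and the NiFi or Oozie run should show whether the Python version differs between the two environments.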