Member since
06-02-2020
331
Posts
64
Kudos Received
49
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1104 | 07-11-2024 01:55 AM |
 | 3143 | 07-09-2024 11:18 PM |
 | 2704 | 07-09-2024 04:26 AM |
 | 2064 | 07-09-2024 03:38 AM |
 | 2344 | 06-05-2024 02:03 AM |
12-07-2024
10:17 PM
I am using the following versions: Spark 2.4.8 and Python 3.6.8. I get the following error only when I run spark-submit from NiFi or Oozie; it works fine when I run it from the shell. Is there a solution, or a configuration I missed?
from pyspark.sql import SparkSession
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 51, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 72, in <module>
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
File "<frozen zipimport>", line 259, in load_module
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 145, in <module>
File "/opt/cloudera/parcels/CDH-7.1.9-1.cdh7.1.9.p14.53489573/lib/spark/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
TypeError: an integer is required (got type bytes)
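One thing that may be worth checking here: the `<frozen zipimport>` frames in the traceback only show up on Python 3.8 and later, and this `TypeError` in cloudpickle is commonly reported when PySpark 2.4.x runs on Python 3.8+. So NiFi or Oozie may be launching a different interpreter than the shell does. A minimal sketch to confirm that, meant to sit at the top of the job script before the `pyspark` import. The helper names are made up for illustration, and the 3.7 upper bound is an assumption about PySpark 2.4.x.

```python
import sys


def interpreter_report():
    """Return the path and version of the Python interpreter running this script."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


def compatible_with_pyspark_24(version_info=sys.version_info):
    # Assumption: PySpark 2.4.x works on Python 3 only up to 3.7.
    return tuple(version_info[:2]) <= (3, 7)


if __name__ == "__main__":
    # Compare this line between the shell run and the NiFi/Oozie run.
    print(interpreter_report(), "compatible_with_pyspark_24 =", compatible_with_pyspark_24())
```

If the two runs report different interpreters, pointing `PYSPARK_PYTHON` (or `spark.pyspark.python`) at the 3.6.8 interpreter is the usual way to pin it.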
10-25-2024
06:57 AM
We’re attempting to run a basic Spark job to read/write data from Solr, using the following versions:
CDP version: 7.1.9
Spark: Spark3
Solr: 8.11
Spark-Solr Connector: opt/cloudera/parcels/SPARK3/lib/spark3/spark-solr/spark-solr-3.9.3000.3.3.7191000.0-78-shaded.jar
When we attempt to interact with Solr through Spark, the execution stalls indefinitely without any errors or results (similar to the issue that @hadoopranger mentioned). Other components, such as Hive and HBase, integrate smoothly with Spark, and we are using a valid Kerberos ticket that successfully connects with other Hadoop components. Additionally, testing REST API calls via both curl and Python’s requests library confirms we can access Solr and retrieve data using the Kerberos ticket. The issue seems isolated to Solr’s connection with Spark, as we have had no problems with other systems.
Has anyone encountered a similar issue, or does anyone have suggestions for potential solutions? @RangaReddy @hadoopranger
10-08-2024
03:57 AM
Yes, upgrading Spark to the newest Spark version, SPARK3-3.3.2.3.3.7190.5-2-1.p0.54391297, fixed the issue.
10-01-2024
03:39 AM
1 Kudo
@ayukus0705, did the response help resolve your query? If it did, kindly mark the relevant reply as the solution, as it will help others locate the answer more easily in the future.
09-10-2024
05:34 PM
1 Kudo
In CDP Public Cloud CDW Impala, access is only possible over HTTP+SSL, so you have to edit the ODBC driver config file:
C:\Program Files\Microsoft Power BI Desktop\bin\ODBC Drivers\Cloudera ODBC Driver for Impala\lib\cloudera.impalaodbc.ini
[Driver]
AllowHostNameCNMismatch = 0
CheckCertRevocation = 0
TransportMode = http
AuthMech=3
Reference: https://community.cloudera.com/t5/Community-Articles/How-to-Connect-to-CDW-Impala-VW-Using-the-Power-BI-Desktop/ta-p/393013#toc-hId-1805728480
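To double-check that the edit took effect, the `[Driver]` section can be read back and compared against the expected values. This is only a rough sketch: it assumes the file parses as a plain INI file with Python's `configparser`, and the function name and the choice of which keys to treat as required are mine, not the driver's.

```python
import configparser

# Settings from the post that switch the driver to HTTP transport.
REQUIRED = {"TransportMode": "http", "AuthMech": "3"}


def check_driver_section(ini_text, required=REQUIRED):
    """Return {key: actual_value} for every required key that is missing or different."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(ini_text)
    driver = parser["Driver"] if parser.has_section("Driver") else {}
    return {key: driver.get(key) for key, value in required.items() if driver.get(key) != value}


SAMPLE = """[Driver]
AllowHostNameCNMismatch = 0
CheckCertRevocation = 0
TransportMode = http
AuthMech=3
"""

if __name__ == "__main__":
    # An empty dict means both required settings are present with the expected values.
    print(check_driver_section(SAMPLE))
```

To run it against the real file, read the file contents into a string and pass that in place of `SAMPLE`.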
08-18-2024
10:11 PM
PySpark 3.5.2 supports Python >= 3.8 and <= 3.11. Ref: https://pypi.org/project/pyspark/3.5.2/
07-15-2024
08:20 PM
Hi @RangaReddy,
1. Exception stack trace: Currently we run our Spark jobs on YARN using the same code, and we never get this issue. Could it be caused by a lack of memory?
2. We didn't hard-code the client mode anywhere. It was working fine on YARN, but not with Kubernetes.
3. We tried providing the following, but it didn't work. We also downloaded these jars and placed them in the jars folder, but no luck.
--packages org.apache.hadoop:hadoop-aws:3.3.4 \
--packages com.amazonaws:aws-java-sdk-bundle:1.12.262 \
--packages org.apache.spark:spark-hadoop-cloud_2.12:3.5.1 \
--packages org.apache.hadoop:hadoop-client-api:3.3.4 \
--packages org.apache.hadoop:hadoop-client-runtime:3.3.4 \
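A side note on the flags above: as far as I know, spark-submit expects `--packages` once, with a comma-separated list of Maven coordinates, and when the flag is repeated only the last value is likely to take effect. This may or may not be related to the failure, but it is cheap to rule out. A small sketch that builds the single flag from the same coordinates; the helper name is made up for illustration.

```python
def packages_args(coordinates):
    """Return spark-submit arguments that carry all coordinates in one --packages flag."""
    return ["--packages", ",".join(coordinates)]


# Coordinates copied from the post above.
COORDINATES = [
    "org.apache.hadoop:hadoop-aws:3.3.4",
    "com.amazonaws:aws-java-sdk-bundle:1.12.262",
    "org.apache.spark:spark-hadoop-cloud_2.12:3.5.1",
    "org.apache.hadoop:hadoop-client-api:3.3.4",
    "org.apache.hadoop:hadoop-client-runtime:3.3.4",
]

if __name__ == "__main__":
    # Prints the flag and its value, ready to paste into the spark-submit command.
    print(" ".join(packages_args(COORDINATES)))
```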
07-11-2024
01:55 AM
Hi @MoatazNader,
Yes, you can create, update, and delete Iceberg table data using Impala in CDP 7.1.9.
Creating table: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/iceberg-how-to/topics/iceberg-table-creation.html
Insert Data: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/iceberg-how-to/topics/iceberg-insert-table-data.html
Update/Delete: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/iceberg-how-to/topics/iceberg-best-practice-row-modifications.html
Reference: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/iceberg-how-to/topics/iceberg-table-creation.html
07-09-2024
11:49 PM
1 Kudo
Introduction
The Spark and Iceberg Supportability Matrix summarizes which Apache Spark versions each Apache Iceberg release supports, together with each release's date and status.
Apache Iceberg History
The development of Iceberg was started in 2017 by Netflix. The project was open-sourced and donated to the Apache Software Foundation in November 2018. In May 2020, the Iceberg project graduated to become a top-level Apache project.
Apache Iceberg 0.7.0 was released on Oct 26, 2019 (Incubating).
Apache Iceberg 0.8.0 was released on May 7, 2020 (Incubating).
Apache Iceberg 0.9.0 was released on Jul 14, 2020.
Apache Iceberg 0.9.1 was released on Aug 11, 2020.
Apache Iceberg 0.10.0 was released on Nov 12, 2020.
Apache Iceberg 0.11.0 was released on Jan 27, 2021.
Apache Iceberg 0.11.1 was released on Apr 3, 2021.
Apache Iceberg 0.12.0 was released on Aug 15, 2021.
Apache Iceberg 0.12.1 was released on Nov 8, 2021.
Apache Iceberg 0.13.0 was released on Feb 4, 2022.
Apache Iceberg 0.13.1 was released on Feb 14, 2022.
Apache Iceberg 0.13.2 was released on Jun 15, 2022.
Apache Iceberg 0.14.0 was released on Jul 16, 2022.
Apache Iceberg 0.14.1 was released on Sep 12, 2022.
Apache Iceberg 1.0.0 was released on Nov 3, 2022.
Apache Iceberg 1.1.0 was released on Nov 28, 2022.
Apache Iceberg 1.2.0 was released on Mar 20, 2023.
Apache Iceberg 1.2.1 was released on Apr 11, 2023.
Apache Iceberg 1.3.0 was released on May 30, 2023.
Apache Iceberg 1.3.1 was released on Jul 25, 2023.
Apache Iceberg 1.4.0 was released on Oct 4, 2023.
Apache Iceberg 1.4.1 was released on Oct 23, 2023.
Apache Iceberg 1.4.2 was released on Nov 2, 2023.
Apache Iceberg 1.4.3 was released on Dec 27, 2023.
Apache Iceberg 1.5.0 was released on Mar 11, 2024.
Apache Iceberg 1.5.1 was released on Apr 25, 2024.
Apache Iceberg 1.5.2 was released on May 9, 2024.
Apache Spark and Iceberg Supportability Matrix Table
The following table lists, for each Iceberg version, the release date, status, default Spark version, and supported Spark version(s):
Iceberg Version | Release Date | Status | Default Spark Version | Supported Spark Version(s) |
---|---|---|---|---|
0.7.0 | Oct 26, 2019 | Incubating | 2.4 | 2.4 |
0.8.0 | May 07, 2020 | Incubating | 2.4 | 2.4 |
0.9.0 | Jul 14, 2020 | | | 2.4, 3.0 |
0.9.1 | Aug 11, 2020 | | | 2.4, 3.0 |
0.10.0 | Nov 12, 2020 | | | 2.4, 3.0 |
0.11.0 | Jan 27, 2021 | | | 2.4, 3.0 |
0.11.1 | Apr 03, 2021 | | | 2.4, 3.0 |
0.12.0 | Aug 15, 2021 | | | 2.4, 3.0, 3.1 |
0.12.1 | Nov 08, 2021 | | | 2.4, 3.0, 3.1 |
0.13.0 | Feb 04, 2022 | | 3.2 | 2.4, 3.0, 3.1, 3.2 |
0.13.1 | Feb 14, 2022 | | 3.2 | 2.4, 3.0, 3.1, 3.2 |
0.13.2 | Jun 15, 2022 | | 3.2 | 2.4, 3.0, 3.1, 3.2 |
0.14.0 | Jul 16, 2022 | | 3.3 | 2.4, 3.0, 3.1, 3.2, 3.3 |
0.14.1 | Sep 12, 2022 | | 3.3 | 2.4, 3.0, 3.1, 3.2, 3.3 |
1.0.0 | Nov 03, 2022 | | 3.3 | 2.4, 3.0, 3.1, 3.2, 3.3 |
1.1.0 | Nov 28, 2022 | | 3.3 | 2.4, 3.1, 3.2, 3.3 |
1.2.0 | Mar 20, 2023 | | 3.3 | 2.4, 3.1, 3.2, 3.3 |
1.2.1 | Apr 11, 2023 | | 3.3 | 2.4, 3.1, 3.2, 3.3 |
1.3.0 | May 30, 2023 | | 3.4 | 3.1, 3.2, 3.3, 3.4 |
1.3.1 | Jul 25, 2023 | | 3.4 | 3.1, 3.2, 3.3, 3.4 |
1.4.0 | Oct 04, 2023 | | 3.5 | 3.2, 3.3, 3.4, 3.5 |
1.4.1 | Oct 23, 2023 | | 3.5 | 3.2, 3.3, 3.4, 3.5 |
1.4.2 | Nov 02, 2023 | | 3.5 | 3.2, 3.3, 3.4, 3.5 |
1.4.3 | Dec 27, 2023 | | 3.5 | 3.2, 3.3, 3.4, 3.5 |
1.5.0 | Mar 11, 2024 | | 3.5 | 3.3, 3.4, 3.5 |
1.5.1 | Apr 25, 2024 | | 3.5 | 3.3, 3.4, 3.5 |
1.5.2 | May 09, 2024 | | 3.5 | 3.3, 3.4, 3.5 |
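If you need to query the matrix from a script, for example to find which Iceberg releases can be used with a given Spark version, the supported-versions column of the table above can be transcribed into a small lookup. This is a minimal sketch; the dictionary and function names are illustrative and not part of any Iceberg or Spark API.

```python
# Supported Spark versions per Iceberg release, transcribed from the matrix table.
SUPPORTED_SPARK = {
    "0.7.0": ("2.4",),
    "0.8.0": ("2.4",),
    "0.9.0": ("2.4", "3.0"),
    "0.9.1": ("2.4", "3.0"),
    "0.10.0": ("2.4", "3.0"),
    "0.11.0": ("2.4", "3.0"),
    "0.11.1": ("2.4", "3.0"),
    "0.12.0": ("2.4", "3.0", "3.1"),
    "0.12.1": ("2.4", "3.0", "3.1"),
    "0.13.0": ("2.4", "3.0", "3.1", "3.2"),
    "0.13.1": ("2.4", "3.0", "3.1", "3.2"),
    "0.13.2": ("2.4", "3.0", "3.1", "3.2"),
    "0.14.0": ("2.4", "3.0", "3.1", "3.2", "3.3"),
    "0.14.1": ("2.4", "3.0", "3.1", "3.2", "3.3"),
    "1.0.0": ("2.4", "3.0", "3.1", "3.2", "3.3"),
    "1.1.0": ("2.4", "3.1", "3.2", "3.3"),
    "1.2.0": ("2.4", "3.1", "3.2", "3.3"),
    "1.2.1": ("2.4", "3.1", "3.2", "3.3"),
    "1.3.0": ("3.1", "3.2", "3.3", "3.4"),
    "1.3.1": ("3.1", "3.2", "3.3", "3.4"),
    "1.4.0": ("3.2", "3.3", "3.4", "3.5"),
    "1.4.1": ("3.2", "3.3", "3.4", "3.5"),
    "1.4.2": ("3.2", "3.3", "3.4", "3.5"),
    "1.4.3": ("3.2", "3.3", "3.4", "3.5"),
    "1.5.0": ("3.3", "3.4", "3.5"),
    "1.5.1": ("3.3", "3.4", "3.5"),
    "1.5.2": ("3.3", "3.4", "3.5"),
}


def is_supported(iceberg_version, spark_version):
    """True if the matrix lists spark_version for the given Iceberg release."""
    return spark_version in SUPPORTED_SPARK.get(iceberg_version, ())


def iceberg_versions_for(spark_version):
    """Iceberg releases, oldest first, that list the given Spark version."""
    return [v for v, sparks in SUPPORTED_SPARK.items() if spark_version in sparks]
```

For example, `iceberg_versions_for("2.4")` ends at 1.2.1, which matches the table: Spark 2.4 no longer appears from Iceberg 1.3.0 onward.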
References
Iceberg Releases
Github Iceberg
Thank you for taking the time to read this article. We hope you found it informative and helpful in enhancing your understanding of the topic. If you have any questions or feedback, please feel free to contact me. Remember, your support motivates us to continue creating valuable content. If this article helped you, please consider giving it a like and providing a kudos. We appreciate your support!