Member since
06-02-2020
331
Posts
64
Kudos Received
49
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1104 | 07-11-2024 01:55 AM | |
3143 | 07-09-2024 11:18 PM | |
2704 | 07-09-2024 04:26 AM | |
2065 | 07-09-2024 03:38 AM | |
2344 | 06-05-2024 02:03 AM |
02-22-2024
10:51 PM
1 Kudo
Spark and Java versions Supportability Matrix

1. Introduction:

Apache Spark is a powerful open-source distributed computing system widely used for big data processing and analytics. However, choosing the right Java version for your Spark application is crucial for optimal performance, security, and compatibility. This article dives deep into the officially supported Java versions for Spark, along with helpful advice on choosing the right one for your project.

2. Matrix Table

Spark Version | Supported Java Version(s) | Java 8 | Java 11 | Java 17 | Java 21 | Deprecated Java Version(s) |
---|---|---|---|---|---|---|
3.5.1 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u371 support is deprecated |
3.5.0 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u371 support is deprecated |
3.4.2 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u362 support is deprecated |
3.4.1 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u362 support is deprecated |
3.4.0 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u362 support is deprecated |
3.3.3 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u201 support is deprecated |
3.3.2 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u201 support is deprecated |
3.3.1 | Java 8*/11/17 | Yes | Yes | Yes | No | Java 8 prior to version 8u201 support is deprecated |
3.3.0 | Java 8*/11/17^ | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.2.4 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.2.3 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.2.2 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.2.1 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.2.0 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u201 support is deprecated |
3.1.3 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.1.2 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.1.1 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.0.3 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.0.2 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.0.1 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
3.0.0 | Java 8*/11 | Yes | Yes | No | No | Java 8 prior to version 8u92 support is deprecated |
2.4.8 | Java 8* | Yes | No | No | No | |
2.4.7 | Java 8* | Yes | No | No | No | |
2.4.6 | Java 8* | Yes | No | No | No | |
2.4.5 | Java 8* | Yes | No | No | No | |
2.4.4 | Java 8* | Yes | No | No | No | |
2.4.3 | Java 8* | Yes | No | No | No | |
2.4.2 | Java 8* | Yes | No | No | No | |
2.4.1 | Java 8* | Yes | No | No | No | |
2.4.0 | Java 8* | Yes | No | No | No | |

* means Cloudera recommended Java version.
^ means Upstream Spark is supported.

Note: According to the Cloudera documentation, Spark 3.3.0 only supports Java 8 and 11. However, the official Spark documentation lists Java 8, 11, and 17 as compatible versions.

References: Apache Spark - A Unified engine for large-scale data analytics; CDS 3.3 Powered by Apache Spark Requirements; SPARK-24417; Support Matrix Cloudera

3. Problems Arising from Unsupported Spark & Java Versions

Utilizing incompatible or unsupported versions of Spark and Java can introduce various challenges and impediments in the operation of your Spark environment.

Performance Degradation: The usage of an incompatible Java version could lead to performance degradation or inefficiencies. This is attributable to the inability to leverage the latest optimizations and features provided by newer Java releases, resulting in the suboptimal performance of Spark jobs.

Compatibility Issues: Spark's functionality may be compromised or rendered unstable when interfacing with specific versions of Java. This can manifest as unexpected errors or failures during runtime, hindering the smooth execution of Spark applications.

Feature Limitations: Newer iterations of Spark may rely on features or enhancements exclusive to certain Java versions. Employing outdated or unsupported Java versions may curtail your ability to exploit these advanced features, constraining the capabilities and functionalities of your Spark applications.

4. End of Life (EOL) dates for Java versions:

Java Version | EOL Date |
---|---|
Java 8 | 31 December 2020 (Public Updates), still supported with Long Term Support (LTS) until December 2030. |
Java 11 | 30 September 2023 (Public Updates), still supported with Long Term Support (LTS) until January 2032. |
Java 17 | 30 September 2026 (Public Updates), Long Term Support (LTS) until September 2029. |
Java 21 | 30 September 2028 (Public Updates), Long Term Support (LTS) until September 2031. |

Reference(s): Oracle Java SE Support Roadmap; Java version history

5. JDK & Scala compatibility

Minimum Scala versions:

JDK Version | Scala 3 | Scala 2.13 | Scala 2.12 | Scala 2.11 |
---|---|---|---|---|
22 (ea) | 3.3.2 | 2.13.12 | 2.12.19 | |
21 (LTS) | 3.3.1 | 2.13.11 | 2.12.18 | |
20 | 3.3.0 | 2.13.11 | 2.12.18 | |
19 | 3.2.0 | 2.13.9 | 2.12.16 | |
18 | 3.1.3 | 2.13.7 | 2.12.15 | |
17 (LTS) | 3.0.0 | 2.13.6 | 2.12.15 | |
11 (LTS) | 3.0.0 | 2.13.0 | 2.12.4 | 2.11.12 |
8 (LTS) | 3.0.0 | 2.13.0 | 2.12.0 | 2.11.0 |

`*` = forthcoming; support available in nightly builds

Thank you for taking the time to read this article. We hope you found it informative and helpful in enhancing your understanding of the topic. If you have any questions or feedback, please feel free to reach out to me. Remember, your support motivates us to continue creating valuable content. If this article helped you, please consider giving it a like and providing a kudos. We appreciate your support!
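As a rough illustration of how the Spark and Java matrix in section 2 of this article could be used in a pre-flight check, here is a minimal Python sketch. It only transcribes the Yes/No columns of that table. The names `SUPPORTED_JAVA`, `EXCEPTIONS`, and `is_supported` are hypothetical and not part of Spark or any Cloudera tool. The 3.3.0 entry follows the table's Cloudera-side value (Java 17 marked "No"), even though upstream Spark lists Java 17 for that release.

```python
# Java major versions marked "Yes" per Spark release line,
# transcribed from the matrix table in this article.
SUPPORTED_JAVA = {
    "3.5": {8, 11, 17},
    "3.4": {8, 11, 17},
    "3.3": {8, 11, 17},
    "3.2": {8, 11},
    "3.1": {8, 11},
    "3.0": {8, 11},
    "2.4": {8},
}

# Individual releases whose row differs from the rest of their release line.
# 3.3.0 is marked "No" for Java 17 in the table (upstream Spark lists it).
EXCEPTIONS = {"3.3.0": {8, 11}}


def is_supported(spark_version: str, java_major: int) -> bool:
    """Return True if the table marks java_major as supported for spark_version."""
    if spark_version in EXCEPTIONS:
        return java_major in EXCEPTIONS[spark_version]
    release_line = ".".join(spark_version.split(".")[:2])
    if release_line not in SUPPORTED_JAVA:
        raise ValueError(f"Spark {spark_version} is not covered by the matrix")
    return java_major in SUPPORTED_JAVA[release_line]


if __name__ == "__main__":
    for version, java in [("3.5.1", 17), ("3.3.0", 17), ("2.4.8", 11)]:
        print(version, java, is_supported(version, java))
```

The lookup works per release line (for example "3.4") because every row within a line has the same Yes/No values in the table, with 3.3.0 as the only exception.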
02-16-2024
02:54 AM
1 Kudo
Thank you, my friend. A week ago, I read through your configurations in the official documentation and experimented with them. However, I encountered an error along the lines of 'class not found.' Currently, I've identified the root cause: I'm using HDP 3.1.0, which includes PySpark 2.3.2.3.1.0.0-78. Therefore, I upgraded it to PySpark 3, while still using the standalone-metastore-1.21.2.3.1.0.0-78-hive3.jar file by default. That's the reason why, when using the configuration, I encountered the 'class not found' error. Now, I've replaced that JAR file with hive-metastore-2.3.9.jar. Everything is working fine now. Once again, thank you, my friend.
02-04-2024
08:05 PM
1 Kudo
Hi @Meepoljd Please let me know if you still need any help with this issue. If any of the above solutions helped, please mark it using Accept as Solution.
02-04-2024
08:11 AM
Hi @zhuw.bigdata To locate Spark logs, follow these steps:
1. Access the Spark UI: Open the Spark UI in your web browser.
2. Identify Nodes: Navigate to the Executors tab to view information about the driver and executor nodes involved in the Spark application.
3. Determine Log Directory: Within the Spark UI, find the Hadoop settings section and locate the value of the yarn.nodemanager.log-dirs property. This specifies the base directory for Spark logs on the cluster.
4. Access Log Location: Using a terminal or SSH, log in to the relevant node (driver or executor) where the logs you need are located.
5. Navigate to Application Log Directory: Within the yarn.nodemanager.log-dirs directory, access the subdirectory for the specific application using the pattern application_${appid}, where ${appid} is the unique application ID of the Spark job.
6. Find Container Logs: Within the application directory, locate the individual container log directories named container_${contid}, where ${contid} is the container ID.
7. Review Log Files: Each container directory contains the following log files generated by that container: stderr (standard error output), stdout (standard output), and syslog (system-level logs).
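The directory walk described above can be sketched in a few lines of Python, run on the node that holds the logs. This is only an illustration under stated assumptions: `find_container_logs` is a hypothetical helper, and the log directories and application ID you pass in must come from your own cluster (the value of yarn.nodemanager.log-dirs and the ID shown in the Spark UI).

```python
from pathlib import Path


def find_container_logs(log_dirs, app_dir_name):
    """Map each container directory name to the log file names it holds.

    log_dirs     -- the directories configured in yarn.nodemanager.log-dirs
    app_dir_name -- the application directory name, e.g. "application_<appid>"
    """
    found = {}
    for base in log_dirs:
        app_dir = Path(base) / app_dir_name
        if not app_dir.is_dir():
            # This log directory holds nothing for the application on this node.
            continue
        for container_dir in sorted(app_dir.glob("container_*")):
            if container_dir.is_dir():
                found[container_dir.name] = sorted(
                    p.name for p in container_dir.iterdir() if p.is_file()
                )
    return found
```

For example, `find_container_logs(["/path/from/yarn.nodemanager.log-dirs"], "application_<appid>")` would return a dictionary keyed by container directory, which you can then inspect or print. The path and ID in this call are placeholders, not real values.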
02-04-2024
08:02 AM
Hi @zenaskun001 Could you please provide more details so that we can check your issue? Please check the following:
1. The Spark interpreter is installed by default. Check whether it is present in the $ZEPPELIN_HOME/interpreters location.
2. After a proper installation, restart Zeppelin and its related components, such as Spark.
3. After restarting the Zeppelin service, log in and check whether the interpreter is installed.
4. As a last step, check the Zeppelin logs under /var/log/zeppelin.
02-04-2024
07:46 AM
1 Kudo
Hi @Taries I hope you are doing well. Do you need any further help with this issue? If any of the above solutions helped in your case, please accept it as the solution. It will help others.
02-04-2024
07:44 AM
Hi @yagoaparecidoti Unfortunately, Cloudera does not support installing or using open source Spark, because some customisations need to be made on the Cloudera side to support integration with other components.
02-04-2024
07:40 AM
Hi @Kunank_kesar Could you please check the following things to resolve the above issue:
1. Have you closed the Spark session properly? As a best practice, close the Spark session explicitly if it is not already closed.
2. Have you added logging to the application code to check what the driver is doing while the application keeps running instead of stopping?
3. As a last step, go to the driver machine, collect thread dumps, and check whether the driver is performing any operation internally.
01-25-2024
09:18 AM
Hi Ranga, Thank you for the reply. Please find my answers below:
1. We are working on CDP.
2. The Spark submit is run with the cache file, property files, and Java options. Please let me know if I need to share more details.
3. We are using a principal and a Kerberos cache.
4. The below article has been followed, and I have validated the configuration.
We are using "authentication=SPNEGO", and the Phoenix string (phoenixurl in the code) is as below:
    <ZK>:2181;authentication=SPNEGO/hbase-secure:<principal>:<cache>
We are using the above Phoenix string with the code below:
    dataset.write().format("org.apache.phoenix.spark").mode(SaveMode.Overwrite)
        .option("table", tableName.concat(row.get(0).toString()))
        .option("zkUrl", phoenixurl)
        .save();
The same code works fine on CDP. When we migrate the same app to the Hadoop instance enabled with Ozone, we get the above issue related to Kerberos. Please support. Looking forward to your suggestion.
01-23-2024
07:10 AM
Hi @parimalpatil You can mostly ignore the RedactorAppender error. It has nothing to do with the real failure unless the stack traces at the bottom point to something related to any Ozone roles.
This Log4j appender redacts log messages using redaction rules before delegating to other appenders. Please share the complete failure log so that we can check and update you. The workaround is to add the JAR file to the classpath of the roles where you see the RedactorAppender error. You can add it through CM UI -> Configuration -> search for "role_env_safety_valve" for the role that reports the error: OZONE_CLASSPATH=$OZONE_CLASSPATH:/opt/cloudera/parcels/CDH/jars/logredactor-2.0.8.jar