Member since
06-02-2020
331
Posts
67
Kudos Received
49
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2790 | 07-11-2024 01:55 AM |
| | 7831 | 07-09-2024 11:18 PM |
| | 6554 | 07-09-2024 04:26 AM |
| | 5890 | 07-09-2024 03:38 AM |
| | 5589 | 06-05-2024 02:03 AM |
02-08-2022
04:01 AM
Hi @loridigia If dynamic allocation is not enabled for the cluster/application and you set --conf spark.executor.instances=1, then Spark will launch only one executor. Apart from that executor, you will also see the AM/driver in the Executors tab of the Spark UI.
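For example, a minimal spark-submit sketch that pins the executor count to one (the class and jar names are hypothetical placeholders):

```bash
# Explicitly disable dynamic allocation and request exactly one executor.
# com.example.MyApp and my-app.jar are placeholders for your application.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=false \
  --conf spark.executor.instances=1 \
  --class com.example.MyApp \
  my-app.jar
```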
12-07-2021
10:29 PM
1 Kudo
In this article, we will learn how to configure and use the Zeppelin JDBC (Phoenix) interpreter with a working example.
1. Configuring the JDBC (Phoenix) interpreter: Log in to the Zeppelin UI -> click on the user name (in my case, admin) at the right-hand corner. It will display a menu -> click on Interpreter.
Click on + Create at the right-hand side of the screen.
It will display a popup menu. Enter the Interpreter Name as jdbc and select the Interpreter Group as jdbc. It will then populate Properties in table format.
Click on the + button, add the Phoenix-related properties below according to your cluster, and click on the Save button.
| Property | Value |
|---|---|
| phoenix.driver | org.apache.phoenix.jdbc.PhoenixDriver |
| phoenix.url | jdbc:phoenix:localhost:2181:/hbase |
| phoenix.user | |
| phoenix.password | |
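Before wiring these into Zeppelin, you can sanity-check the Phoenix URL from a shell on a cluster node; a sketch assuming the Phoenix client scripts are on the PATH (the script location varies by distribution):

```bash
# Connect with the Phoenix client using the same ZooKeeper quorum and znode
# as the interpreter's phoenix.url (adjust host, port, and znode for your cluster).
sqlline.py localhost:2181:/hbase
```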
2. Creating the Notebook:
Click the Notebook dropdown menu in the top left-hand corner, select Create new note, enter the Note Name as Phoenix_Test, and select the Default Interpreter as jdbc. Finally, click on the Create button.
3. Running the Phoenix queries using jdbc (Phoenix) interpreter in Notebook:
%jdbc(phoenix)
CREATE TABLE IF NOT EXISTS Employee (
id INTEGER PRIMARY KEY,
name VARCHAR(255),
salary FLOAT
)
%jdbc(phoenix)
UPSERT INTO Employee VALUES(1, 'Ranga Reddy', 24000)
%jdbc(phoenix)
UPSERT INTO Employee (id, name, salary) VALUES(2, 'Nishantha', 10000)
%jdbc(phoenix)
SELECT * FROM Employee
4. Final Results:
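If everything is configured correctly, the final SELECT should return the two upserted rows, roughly like this (exact rendering depends on the interpreter):

```
ID | NAME        | SALARY
1  | Ranga Reddy | 24000.0
2  | Nishantha   | 10000.0
```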
Happy Learning.
11-09-2021
10:48 PM
@EBH, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
10-28-2021
11:34 PM
@SimonBergerard, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
10-28-2021
12:05 AM
Hi @Marwn Please check the application logs to identify why the application startup is taking X minutes. Without the application logs, it is very difficult to provide a solution.
10-21-2021
10:36 AM
@LegallyBind Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
10-17-2021
11:54 PM
Hi @Paop We don't have enough information (how much data, the spark-submit command, etc.) to provide a solution. Please raise a case for this issue.
10-07-2021
04:28 AM
Hi @shivanageshch EMR is not part of Cloudera. If you are using a CDP/HDP cluster, go through the following tutorial.
Livy Configuration: Add the following properties to the livy.conf file:
# Use this keystore for the SSL certificate and key.
livy.keystore = <path-to-ssl_keystore>
# Specify the keystore password.
livy.keystore.password = <keystore_password>
# Specify the key password.
livy.key-password = <key_password>
Access Livy Server: After enabling SSL, the Livy server should be accessible over the HTTPS protocol: https://<livy host>:<livy port>
References:
1. https://docs.cloudera.com/cdp-private-cloud-base/latest/security-encrypting-data-in-transit/topics/livy-configure-tls-ssl.html
Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.
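Once Livy is restarted with these settings, a quick check from the command line (assuming Livy's default port 8998; prefer --cacert with your CA certificate over -k outside of testing):

```bash
# List Livy sessions over HTTPS; -k skips certificate verification (testing only).
curl -k https://<livy host>:8998/sessions
```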
09-24-2021
03:53 AM
Hi @Tomas79 While launching spark-shell, you need to add the spark.yarn.access.hadoopFileSystems parameter. Also ensure the dfs.namenode.kerberos.principal.pattern parameter is set to * in the core-site.xml file. For example, # spark-shell --conf spark.yarn.access.hadoopFileSystems="hdfs://c1441-node2.coelab.cloudera.com:8020"
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/09/24 07:23:25 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
Spark context Web UI available at http://c2441-node2.supportlab.cloudera.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1632395260786_0004).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.4.0.7.1.6.0-297
/_/
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_232)
Type in expressions to have them evaluated.
Type :help for more information.
scala> val textDF = spark.read.textFile("hdfs://c1441-node2.coelab.cloudera.com:8020/tmp/ranga_clusterb_test.txt")
textDF: org.apache.spark.sql.Dataset[String] = [value: string]
scala> textDF.show(false)
+---------------------+
|value |
+---------------------+
|Hello Ranga, |
| |
+---------------------+
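For reference, a minimal core-site.xml sketch of the dfs.namenode.kerberos.principal.pattern setting mentioned above:

```xml
<!-- Accept NameNode principals from the remote cluster's realm. -->
<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>
```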
09-15-2021
10:32 PM
@RangaReddy The link is exactly what I need. Thanks for your help.