Member since: 06-02-2020
Posts: 331
Kudos Received: 64
Solutions: 49
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 965 | 07-11-2024 01:55 AM |
|  | 2713 | 07-09-2024 11:18 PM |
|  | 2412 | 07-09-2024 04:26 AM |
|  | 1786 | 07-09-2024 03:38 AM |
|  | 2076 | 06-05-2024 02:03 AM |
07-15-2021
06:50 AM
1 Kudo
Hi @PrernaU
1. By default, CDP uses PAM authentication, so you can remove the following two properties:
pamRealm = org.apache.zeppelin.realm.PamRealm
pamRealm.service = sshd
2. Then configure `admin=admin, admins` under `zeppelin.shiro.user.block`.
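For reference, a rough sketch of where those realm entries live in Zeppelin's shiro.ini (the section layout here is the standard Shiro convention, not copied from your cluster; in CDP this file is usually managed through the Zeppelin Shiro safety valve in Cloudera Manager rather than edited on disk):
[main]
# Step 1: remove (or comment out) the PAM realm properties
# pamRealm = org.apache.zeppelin.realm.PamRealm
# pamRealm.service = sshd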
06-28-2021
07:22 PM
Hi @javidshaik Yes, per the Cloudera documentation, running multiple versions under the same Cloudera Manager Server is not supported.
06-28-2021
04:04 AM
Hi @javidshaik I have checked with the internal team. You can migrate from Spark 2.3 to Spark 2.4; the details are in the document below:

| Spark Version | Supported CDH Versions |
| --- | --- |
| 2.4 Release 2 | CDH 5.10 and any higher CDH 5.x versions |
| 2.4 Release 1 | CDH 5.10 and any higher CDH 5.x versions |

https://docs.cloudera.com/documentation/spark2/latest/topics/spark2_requirements.html
Note that a Spark 2.3 -> 2.4 upgrade carries a higher potential for risk. If you are satisfied with my answer, please Accept as Solution.
06-28-2021
03:00 AM
Hi @roshanbi Could you please share what you are trying to do, along with some additional details about the steps involved?
06-28-2021
02:59 AM
Hi @KhASQ For watermarking, use any framework or database to persist the watermark value once the job completes successfully. If you are using Kafka, you can store Kafka-related watermarks (offsets) in Kafka itself. For other sources, choose any RDBMS or an HBase table, as in the sketch below.
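A minimal Scala sketch of the RDBMS option; the JDBC URL, credentials, and the `job_watermark` table are hypothetical placeholders, not part of any Cloudera API:
import java.sql.DriverManager

// Persist the last successfully processed watermark for a job.
// Assumes a pre-created table: job_watermark(job_name VARCHAR PRIMARY KEY, watermark BIGINT).
def saveWatermark(jobName: String, watermark: Long): Unit = {
  val conn = DriverManager.getConnection(
    "jdbc:mysql://dbhost:3306/metadata", "user", "password") // hypothetical URL/credentials
  try {
    val upd = conn.prepareStatement(
      "UPDATE job_watermark SET watermark = ? WHERE job_name = ?")
    upd.setLong(1, watermark)
    upd.setString(2, jobName)
    // Insert the row first if the job has never run (UPSERT syntax varies by RDBMS).
    if (upd.executeUpdate() == 0) {
      val ins = conn.prepareStatement(
        "INSERT INTO job_watermark (job_name, watermark) VALUES (?, ?)")
      ins.setString(1, jobName)
      ins.setLong(2, watermark)
      ins.executeUpdate()
    }
  } finally {
    conn.close()
  }
}

// Call it only after the batch succeeds, e.g. saveWatermark("orders-ingest", batchMaxTimestamp)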
06-28-2021
02:39 AM
Hi @javidshaik CDH 5.x and HDP 2.x clusters have reached end of support, so it is better to upgrade your cluster to CDH 6.x or CDP 7.x. Both CDH 6.x and CDP 7.x support Spark 2.4. Please refer to the following documentation: https://www.cloudera.com/legal/policies/support-lifecycle-policy.html
06-27-2021
09:04 AM
Hi @roshanbi Please find the difference:
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Dataset
// spark.read.textFile returns a Dataset[String]
val textFileDF: Dataset[String] = spark.read.textFile("/path")
// spark.sparkContext.textFile returns an RDD[String]
val textFileRDD: RDD[String] = spark.sparkContext.textFile("/path")
If you are satisfied, please Accept as Solution.
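If you ever need to move between the two, both conversions are built in; a short sketch, assuming an active SparkSession named `spark` and the two values defined above:
import spark.implicits._

// A Dataset exposes its underlying RDD directly:
val asRdd: RDD[String] = textFileDF.rdd
// An RDD[String] can be lifted to a Dataset via the session's implicits:
val asDs: Dataset[String] = textFileRDD.toDS()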
06-25-2021
04:28 PM
1 Kudo
Hi @roshanbi
val ds = Seq(1, 2, 3).toDS()
This creates a sequence of numbers and then converts it into a Dataset. There are multiple ways to create a Dataset; the above is just one of them. If you have created a DataFrame with a case class and want to convert it into a Dataset, you can use dataframe.as[ClassName], as in the sketch below. You can find different ways of creating a Dataset here: https://www.educba.com/spark-dataset/ Please let me know if you have any doubts, and Accept as Solution once you are satisfied with the above answer.
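A short sketch of the as[ClassName] approach; the `Person` case class and sample rows are hypothetical, and the column names must match the case class fields:
import org.apache.spark.sql.Dataset
import spark.implicits._

// Hypothetical case class describing the DataFrame's schema
case class Person(name: String, age: Int)

val df = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")
val ds: Dataset[Person] = df.as[Person]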
06-24-2021
08:23 AM
Hi @roshanbi If you are satisfied with my answer, please Accept as Solution.
06-24-2021
08:21 AM
Hi @roshanbi Apache Spark offers several ways to select a column.
Scala Spark:
// Scala
import org.apache.spark.sql.functions.{expr, col, column}
import spark.implicits._ // needed for the $"..." and symbol syntax
// 6 ways to select a column
df.select(df.col("ColumnName"))
df.select(col("ColumnName"))
df.select(column("ColumnName"))
df.select('ColumnName)
df.select($"ColumnName")
df.select(expr("ColumnName"))
PySpark:
# Python
from pyspark.sql.functions import expr, col, column
# 4 ways to select a column
df.select(df.ColumnName)
df.select(col("ColumnName"))
df.select(column("ColumnName"))
df.select(expr("ColumnName"))