Applying avg spark

Explorer

Hi guys, I am trying this source code:

val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
val df = spark.read.option("header",true).csv("C:/Users/mhattabi/Desktop/Clean _data/Mud_Pumps _Cleaned/Set_1_Mud_Pumps_Merged.csv")
df("DateTime").cast("timestamp")
df("ADCH_Mud Pumps.db.MudPump.2.On.value").cast("integer")
val result = df.select(
    col("*"),
    date_format(df("DateTime"), "yyyy-MM-dd hh:mm").alias("DateTime"))
  .groupBy(df("DateTime"))
  .agg(avg(df("ADCH_Mud Pumps.db.MudPump.2.On.value")))
result.show(5)

It fails with an error on this attribute, although it really exists in my data set. What should I do? Thanks.

[Attached image: 13015-1.png]

Re: Applying avg spark

Super Guru

@Maher Hattabi

In your dataset, do you really have a "." in your column name in the CSV file? Is it possible to cleanse your data (by removing the "." from the column name) in the CSV file before getting to this step?
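If editing the CSV file itself is not convenient, the same cleanup could be done on the header names in code. The following is only a minimal sketch: the object name `ColumnNames` and the helper `sanitize` are hypothetical, and replacing "." with "_" is just one possible convention.

```scala
object ColumnNames {
  // Hypothetical helper: replace every "." with "_" so that Spark does not
  // interpret the column name as a path into a nested (struct) field.
  def sanitize(name: String): String = name.replace('.', '_')

  def main(args: Array[String]): Unit = {
    // Header names taken from the question above.
    val header = Seq("DateTime", "ADCH_Mud Pumps.db.MudPump.2.On.value")
    header.map(sanitize).foreach(println)
  }
}
```

With a DataFrame `df` in scope, the helper could be applied to all columns at once with `df.toDF(df.columns.map(ColumnNames.sanitize): _*)`.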

Re: Applying avg spark

Explorer

Hi,

Yes, there is a "." in the column name. Can this cause a problem in such an operation? Thanks.

Re: Applying avg spark

Super Guru

@Maher Hattabi

Yes. Last I knew, you cannot have a "." in your column name, and this is still unresolved. Please see the following link.

https://issues.apache.org/jira/browse/SPARK-5632 --> this says it was resolved in 1.4, but it actually is not. It points to the following JIRA, which is still unresolved.

https://issues.apache.org/jira/browse/SPARK-18084
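
For readers who cannot rename the column: Spark parses a "." in a column reference as access to a nested field, and wrapping the whole name in backticks is the usual way to make it resolve as a single literal name. The sketch below is not a confirmed fix for the original data set. It assumes Spark 2.x, uses a tiny in-memory stand-in for the CSV (the sample rows are invented), and the names `Minute` and `avg_on` are hypothetical. It also applies the casts inline, because `cast` returns a new column rather than modifying `df`, and it uses `HH` (24-hour clock) where the question used `hh`, which is an assumption about the intended grouping.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, date_format}

object AvgPerMinute {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
    import spark.implicits._

    // Invented sample rows standing in for the CSV from the question.
    val df = Seq(
      ("2017-01-01 10:00:05", "1"),
      ("2017-01-01 10:00:35", "0"),
      ("2017-01-01 10:01:10", "1")
    ).toDF("DateTime", "ADCH_Mud Pumps.db.MudPump.2.On.value")

    // Backticks make Spark treat the dotted name as one column
    // instead of a path into nested fields.
    val pump = col("`ADCH_Mud Pumps.db.MudPump.2.On.value`")

    // Truncate each timestamp to the minute; "Minute" is a hypothetical alias
    // chosen so it does not clash with the existing DateTime column.
    val minute = date_format(col("DateTime").cast("timestamp"), "yyyy-MM-dd HH:mm").alias("Minute")

    val result = df
      .groupBy(minute)
      .agg(avg(pump.cast("integer")).alias("avg_on"))

    result.show(5)
    spark.stop()
  }
}
```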

Re: Applying avg spark

Expert Contributor

Can you also share a sample of the CSV file, say 2-3 rows, so it can be tested?
