02-04-2018 03:13 PM
@elango vaithiyanathan If you have integer or float values stored with a string data type, you can use those string columns in aggregations directly. For example, if the values 10 and 10.5 in the age column are stored as strings, you can apply aggregate functions to age as-is:

from pyspark.sql.functions import col, sum
gr_df2 = gr_df.agg(sum(col('age')))

Alternatively, you can cast the string column to int, double, etc. inside the aggregation:

gr_df2 = gr_df.agg(sum(col('age').cast("int")))

This casts the age column to integer and then applies the aggregate function. You can also register a temp table on the DataFrame and run SQL queries against it:

gr_df2.registerTempTable("people")
hc.sql("select * from people").show(10, False)