About thangarajanp

thangarajanp · ‎10-20-2017

Try spark-submit --master spark://<master-ip>:<spark-port> to submit the job (this is the master URL form for a standalone cluster).

thangarajanp · ‎10-20-2017

Try this code:

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext
    conf1 = SparkConf().setAppName('sort_desc')
    sc1 = SparkContext(conf=conf1)
    sql_context = SQLContext(sc1)
    csv_file_path = 'emp.csv'
    employee_rdd = sc1.textFile(csv_file_path).map(lambda line: line.split(','))
    print(type(employee_rdd))
    employee_rdd_sorted = employee_rdd.sortByKey(ascending=False)
    employee_df = employee_rdd.toDF(['dept','ctc'])
    employee_df_sorted = employee_rdd_sorted.toDF(['dept','ctc'])
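To check the expected ordering on a small sample without starting Spark, the same parse-and-sort step can be mimicked in plain Python. This is only a rough sketch: the sample rows and the helper names parse_line and sort_by_key_desc are made up for illustration and are not part of the PySpark API or of emp.csv.

```python
# Plain-Python stand-in for the parse + sortByKey(ascending=False) steps.
# The sample rows are hypothetical; they are not taken from emp.csv.
sample_lines = [
    "sales,50000",
    "hr,42000",
    "engineering,90000",
]


def parse_line(line):
    """Split a CSV line into a (dept, ctc) pair, like the map() step."""
    dept, ctc = line.split(",")
    return dept, ctc


def sort_by_key_desc(pairs):
    """Order pairs by their first element (the key), descending."""
    return sorted(pairs, key=lambda kv: kv[0], reverse=True)


pairs = [parse_line(line) for line in sample_lines]
sorted_pairs = sort_by_key_desc(pairs)

for dept, ctc in sorted_pairs:
    print(dept, ctc)
```

Note that the key here is the dept string, so the order is lexicographic, and ctc stays a string. To sort by ctc instead, it would need to be converted to a number and used as the key.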

thangarajanp · ‎03-30-2017

Great, working fine now.

thangarajanp · ‎03-29-2017

Hi, I have a query that contains many joins. I want to create a DataFrame or Dataset from the query (not from a single table) in Scala.

thangarajanp · ‎03-16-2017

Hi all, in my NiFi flow I have two processors: GetFTP and PutS3Object. Suppose there is one file, a.txt, on the FTP server. After it is written to S3, a.txt has the timestamp 12:00:00 in S3. Some time later, a second file, b.txt, is put on the FTP server. S3 now has both files, but the timestamp in S3 has changed for both a.txt and b.txt, as below:
a.txt 12:01:00
b.txt 12:01:00

thangarajanp · ‎11-28-2016

Hi Raf Mohammed, if you want to do real-time analysis on Twitter data, do not go with Hive or traditional reporting tools. Use Flume to pull the data, store it in Elasticsearch, and do the visualization in Kibana. If you want real-time analytics such as sentiment analysis, try Flume + Spark Streaming + Elasticsearch + Kibana. @Raf Mohammed

Member Since	‎10-14-2016 02:50 AM
Last Visited	‎10-11-2019 06:55 AM
Posts	12
Kudos received	1

Cloudera Community

Re: Twitter Data is in HIVE - How do I visualise i...

Re: INFO yarn.ApplicationMaster: Unregistering App...

Re: Spark sort by key with descending order

Re: Dataframe/Dataset from Query

Dataframe/Dataset from Query

How to use Conflict Resolution Strategy in PutS3Ob...
