Member since
02-01-2019
650
Posts
143
Kudos Received
117
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2612 | 04-01-2019 09:53 AM |
| | 1376 | 04-01-2019 09:34 AM |
| | 6474 | 01-28-2019 03:50 PM |
| | 1484 | 11-08-2018 09:26 AM |
| | 3610 | 11-08-2018 08:55 AM |
12-18-2017
06:39 PM
@venkateswara reddy bukkasamudram Please refer to: https://community.hortonworks.com/questions/33690/hdpcd-exam-network-issues.html
12-18-2017
06:29 PM
@Ashnee Sharma: Do we know the data size of this table? `select * from rasdb.dim_account` pulls the complete result set back to the driver, so we need to make sure the table data fits into driver memory.
12-15-2017
01:17 PM
@Ashnee Sharma What is your driver memory? `java.lang.OutOfMemoryError: GC overhead limit exceeded` usually means the driver is spending most of its time in garbage collection. Try increasing the driver memory according to the data size.
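A minimal sketch of raising driver memory at submit time; the `4g` value, class name, and jar name are placeholders, so size the heap to the data you actually collect to the driver:

```shell
# Hypothetical submit command: --driver-memory raises the driver heap.
# Adjust 4g to your data size; MyApp/myapp.jar are placeholder names.
spark-submit --driver-memory 4g --class com.example.MyApp myapp.jar
```

The same flag works for an interactive session: `spark-shell --driver-memory 4g`.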
11-12-2017
05:57 PM
Try updating your local /etc/hosts with the sandbox hostname and IP. @Aditya Srivastava
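For reference, a hedged example of such an entry; the IP address here is a placeholder and must be replaced with whatever address your sandbox VM actually reports:

```shell
# Append the sandbox hostname to /etc/hosts (needs sudo).
# 192.168.56.101 is a placeholder -- use your sandbox VM's real IP.
echo '192.168.56.101  sandbox.hortonworks.com sandbox' | sudo tee -a /etc/hosts
```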
11-12-2017
05:48 PM
@Swaapnika Guntaka: 9092 is not the default Kafka port in HDP; it is 6667. Could you please check the port and re-run the producer?
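A sketch of re-running the console producer against the HDP default port; the broker host and topic name are assumptions, substitute your own:

```shell
# Hypothetical re-run on the HDP default broker port 6667.
# Replace the host and topic with your actual broker and topic names.
kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic test
```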
10-20-2017
03:18 PM
@Guilherme Colla Click on each host and start the datanode service.
10-20-2017
03:16 PM
1 Kudo
@karthick baskaran Here is the command to get the number of lines in a file. Spark will internally load your text file and keep it in an RDD/DataFrame/Dataset. spark-shell (Spark 1.6.x):
scala> val textFile = sc.textFile("README.md")
scala> textFile.count() // Number of items in this RDD
10-20-2017
01:53 PM
1 Kudo
@Guilherme Colla: Looks like you don't have any live DataNodes. Can you check the status of the DataNodes and start them if they are down or haven't started?
10-20-2017
01:48 PM
1 Kudo
@karthick baskaran For Part 1 (record counts): a simple `rdd.count()` or `df.count()` should give you the record count. For Part 2 (duplicate check): you could load the data into a DataFrame and run `distinct()` against it, or use `dropDuplicates` [https://spark.apache.org/docs/1.6.1/api/java/org/apache/spark/sql/DataFrame.html#dropDuplicates()]
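The same duplicate check can be illustrated on a plain Scala collection as a local stand-in for the DataFrame case; the sample records below are made up:

```scala
// Local stand-in for df.distinct() / dropDuplicates: compare the size
// of the collection before and after removing duplicates.
val records = Seq("acct-1", "acct-2", "acct-1", "acct-3")
val deduped = records.distinct                 // keeps first occurrence of each
val hasDuplicates = deduped.size != records.size
println(s"duplicates present: $hasDuplicates") // true for this sample
```

At DataFrame scale the comparison is the same idea: `df.count()` versus `df.dropDuplicates().count()`.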