Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2309 | 08-28-2018 02:00 AM
 | 2160 | 07-31-2018 06:55 AM
 | 5070 | 07-26-2018 03:02 AM
 | 2433 | 07-19-2018 02:30 AM
 | 5863 | 05-21-2018 03:42 AM
03-07-2017
07:37 AM
Hello Saranvisa, Thanks for your reply. I guess it was not balanced until now; since I downsized the YARN container memory to fit the host memory, all hosts have exactly the same memory usage. Regards,
03-02-2017
04:39 PM
@Akira191 1. Go to Cloudera Manager -> Spark -> Instances and identify the node where the Spark server is installed. 2. Log in to that node via the CLI and go to the path "/opt/cloudera/parcels/CDH-<version>/lib/spark/bin". It lists the binaries for spark-shell, pyspark, spark-submit, etc., which are used to log in to Spark and submit jobs. If spark-sql is there, you can run the command you mentioned. In your case the spark-sql binary appears to be missing, which is why you are getting this error. You will need to talk to your admin.
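As a quick check, a sketch like the following lists which Spark launch scripts are present. The parcel path (with the CDH-<version> placeholder from the steps above) must be replaced with your actual installed parcel version:

```shell
# List which Spark launch scripts exist under the CDH parcel bin directory.
# Replace CDH-<version> with your installed parcel version.
SPARK_BIN="/opt/cloudera/parcels/CDH-<version>/lib/spark/bin"
for tool in spark-shell pyspark spark-submit spark-sql; do
  if [ -x "$SPARK_BIN/$tool" ]; then
    echo "$tool: present"
  else
    echo "$tool: missing"
  fi
done
```

If spark-sql shows up as missing, that confirms the diagnosis above.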
03-01-2017
12:14 AM
Hey guys! Great news! Problem finally solved after 1.5 days of troubleshooting! While digging through the error logs, I saw a message in the Cloudera Agent log saying "ValueError: too many values to unpack", and then searched online for solutions. In conclusion, the errors were caused by: 1) System time synchronization was disabled, so the nodes' system times were not in sync. 2) The latest OpenJDK update broke the Cloudera Agent. Solution: 1) Enable system time synchronization (as suggested by @saranvisa): service ntpd start 2) Uninstall OpenJDK on each node: rpm -qa | grep jdk then yum remove <each item from the previous step> 3) Run "Re-run Upgrade Wizard" in Cloudera Manager and wait for Inspect Hosts to finish. Done! Thanks so much for the help, guys! Reference: https://community.cloudera.com/t5/Cloudera-Manager-Installation/Problem-with-cloudera-agent/td-p/47698/page/2 https://community.cloudera.com/t5/Cloudera-Manager-Installation/Mismatched-CDH-versions-host-has-NONE-but-role-expect-5/m-p/48780#U48780
02-26-2017
06:47 PM
@saranvisa Yes, you are right. I can't connect to the host, and I will apply it. Thanks a lot.
02-24-2017
11:01 AM
One more way to get low-level versions for everything: Hosts --> All Hosts --> Inspect All Hosts (button). This will return a report of CM and CDH packages.
02-24-2017
05:21 AM
Use an event deserializer. You can use BlobDeserializer if you want to parse the whole file into one event, or LINE for one event per line of text input. Refer to https://flume.apache.org/FlumeUserGuide.html#event-deserializers
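A minimal spooling-directory source sketch showing both options (the agent/source names "a1"/"src1" and the spool path are made up for illustration; the deserializer values are the ones documented in the Flume User Guide):

```
# One event per line of text (LINE is the default deserializer)
a1.sources.src1.type = spooldir
a1.sources.src1.spoolDir = /var/log/incoming
a1.sources.src1.deserializer = LINE

# Or, to emit the whole file as a single event:
# a1.sources.src1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
```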
02-20-2017
06:01 AM
Finally I got my cluster up and running! As @mbigelow said, two of my three JNs were up and running but were badly declared in the hdfs-site.xml dfs.namenode.shared.edits.dir property. After changing it, the NameNode service starts! Now everything appears to be in order. I hope my problem can help this community. Thanks @saranvisa and @mbigelow!
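For reference, a correctly declared dfs.namenode.shared.edits.dir lists every JournalNode in a single qjournal URI. A sketch of the shape (the hostnames and nameservice name here are placeholders, not values from the post):

```xml
<!-- hdfs-site.xml: all JournalNodes must appear in the same qjournal URI -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
```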
02-15-2017
06:46 AM
Thanks for the response, really good and detailed. Could you give a bit of a lower-level answer as well: say, how would I efficiently add data from a DataFrame in Spark to a table in Hive? The goal is to improve speed by using Spark instead of Hive or Impala for DB insertions. Thanks.
02-03-2017
12:30 PM
saranvisa is correct in that you should set a minimum, and the max should not push past a single node's memory limit, since a single container cannot run across nodes. There is still a mismatch between what is in the configs and what YARN is using and reporting. On the RM machine, get the process id for the RM: sudo su yarn -c "jps", and then get the process info for that id: ps -ef | grep <id>. Does that show that it is using the configs from the path that you changed? The path should be listed in -classpath.