Member since 08-16-2016

| Posts | Kudos Received | Solutions |
|---|---|---|
| 642 | 131 | 68 |
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 3976 | 10-13-2017 09:42 PM |
| | 7471 | 09-14-2017 11:15 AM |
| | 3796 | 09-13-2017 10:35 PM |
| | 6031 | 09-13-2017 10:25 PM |
| | 6600 | 09-13-2017 10:05 PM |
			
    
	
		
		
04-14-2017 10:38 AM

It's true that you can aggregate logs to HDFS while the job is still running; however, the minimum log-upload interval (yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds) you can set is 3600 seconds, i.e. one hour. The design is trying to protect the NameNode from being spammed.

You may have to use an external service to do the log aggregation: either write your own or find other tools.

Below is the proof from yarn-default.xml in the hadoop-common source code (cdh5-2.6.0_5.7.1):

```xml
<property>
  <description>Defines how often NMs wake up to upload log files.
  The default value is -1. By default, the logs will be uploaded when
  the application is finished. By setting this configure, logs can be uploaded
  periodically when the application is running. The minimum rolling-interval-seconds
  can be set is 3600.
  </description>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>-1</value>
</property>
```
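To turn rolling aggregation on, the site-level override would look roughly like this in yarn-site.xml (a sketch based on the property shown above; 3600 is the minimum the NodeManager accepts):

```xml
<!-- yarn-site.xml override: upload logs hourly while the application runs -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```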
			
    
	
		
		
04-12-2017 02:28 PM

There is probably an issue with the client connecting to the DataNode. The NameNode is reporting that you have one live DataNode, but the client is failing to place any replica on it. I would expect the client to get a different error if it were failing to write out the first replica. Check the NameNode UI to validate that the DataNode is live, and check the NameNode and DataNode logs to see if there is more information on what the issue is.
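The liveness check can also be done from the command line; a sketch, assuming an HDFS client is installed (the guard lets it degrade gracefully on hosts without one):

```shell
# Troubleshooting sketch for the replica-placement issue above.
# Only runs the HDFS commands if the client is actually on the PATH.
if command -v hdfs > /dev/null 2>&1; then
  hdfs dfsadmin -report          # live/dead DataNodes as the NameNode sees them
  hdfs fsck / -files -blocks     # block-level health of the filesystem
  CHECK="ran hdfs checks"
else
  CHECK="hdfs CLI not found; run this on a cluster gateway node"
fi
echo "$CHECK"
```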
			
    
	
		
		
04-11-2017 06:22 AM

I moved it for you. 🙂
			
    
	
		
		
04-03-2017 10:51 PM

Thanks 🙂
			
    
	
		
		
03-22-2017 07:57 AM

We are only running HDFS, so we only needed to upgrade that. Since it was a dev environment, we shut all of HDFS down, downloaded hadoop-2.6.0-cdh5.8.4.tar.gz from http://archive.cloudera.com/cdh5/cdh/5/, and ran with that.

(We are actually running HDFS on Mesos, so the artifacts get packaged up into an uberjar with the Mesos executor, but there's no real magic there. I think it just uses the contents of hadoop/common and hadoop/hdfs and some of the run scripts.)
			
    
	
		
		
03-05-2017 02:26 PM

When I checked the jobs/queries that ran prior to the alert on the JournalNode, I found one Hive query that ran over six months of data and recreated its Hive table from scratch, which accounted for a good percentage of the edit logs. I contacted the query owner and he reduced his running window from six months to two months, which solved the issue for us.
			
    
	
		
		
03-02-2017 04:39 PM

@Akira191

1. Go to Cloudera Manager -> Spark -> Instances and identify the node where the Spark server is installed.
2. Log in to that node using the CLI and go to the path "/opt/cloudera/parcels/CDH-<version>/lib/spark/bin"; it will list binaries such as spark-shell, pyspark, and spark-submit, which are used to start a Spark shell and submit jobs.

If the directory contains spark-sql, then you can run the command you mentioned. In your case the spark-sql binary is probably missing, which is why you are getting this error. You need to talk to your admin.
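Step 2 above can be scripted as a small check; a sketch, assuming a CDH 5 parcel layout (the `CDH` symlink normally points at the active parcel, but verify the path on your cluster):

```shell
# Hypothetical check for the spark-sql binary inside the active CDH parcel.
# The parcel path is an assumption; substitute your installed version if needed.
SPARK_BIN="/opt/cloudera/parcels/CDH/lib/spark/bin"
if [ -x "$SPARK_BIN/spark-sql" ]; then
  RESULT="spark-sql is available"
else
  RESULT="spark-sql is missing; ask your admin"
fi
echo "$RESULT"
```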
			
    
	
		
		
03-02-2017 07:49 AM

The only suggestion I have is to try running some tests to see if you can weed out any bad disks. DFSIO and Terasort may hit on it, but may not. You can use 'dd' or other software to test the raw disks. Beyond that you may be chasing ghosts (spending more time than it's worth on an ephemeral problem).
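A minimal 'dd' sequential write/read pass per disk might look like this (a sketch; /tmp stands in for a DataNode data directory, and the page cache can inflate the read figure):

```shell
# Rough per-disk throughput probe with dd; point TESTFILE at each data mount in turn.
TESTFILE=$(mktemp /tmp/ddtest.XXXXXX)
# Sequential write: 16 MB of zeros (dd prints its throughput stats on stderr)
dd if=/dev/zero of="$TESTFILE" bs=1M count=16 2> /tmp/dd_write.log
# Sequential read back, discarding the data
dd if="$TESTFILE" of=/dev/null bs=1M 2> /tmp/dd_read.log
WRITE_STATS=$(tail -n 1 /tmp/dd_write.log)
READ_STATS=$(tail -n 1 /tmp/dd_read.log)
echo "write: $WRITE_STATS"
echo "read:  $READ_STATS"
rm -f "$TESTFILE" /tmp/dd_write.log /tmp/dd_read.log
```

An abnormally slow disk on this test is a candidate for replacement; repeat across mounts to compare like with like.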
			
    
	
		
		
03-02-2017 03:01 AM

Thanks! Removing all the OpenJDK alternatives helped. 🙂