Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3099 | 10-13-2017 09:42 PM
 | 5280 | 09-14-2017 11:15 AM
 | 2754 | 09-13-2017 10:35 PM
 | 4354 | 09-13-2017 10:25 PM
 | 4959 | 09-13-2017 10:05 PM
07-06-2017
02:50 AM
I have increased the heap size. It was set to the default of 256 MB, which I guess was causing the problem. I will report back if it keeps working alright 🙂 Thanks very much for your response. It helped.
07-05-2017
02:11 AM
Yes, install a web browser on the same machine and try accessing it from that browser. Do this first, instead of trying from your host (if you are using VirtualBox or another virtualization tool). You can ignore the "unable to retrieve non-local non-loopback ip address" error.
07-02-2017
07:57 PM
I have re-run the test, and Kudu performs much better this time (though it is still a little slower than Parquet). Thanks for @mpercy's suggestion. I changed two things for the re-run: 1. increased the partitions for the fact table from 60 to 768 (affects all queries); 2. changed the 'or' predicate in query3.sql into an 'in' predicate, so the predicate can be pushed down to Kudu (affects only query 3). Below are the re-run results (column 'kudu60' is the previous result, with 60 fact-table partitions; column 'kudu768' is the new result, with 768 partitions).
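For illustration, a minimal sketch of the second change, assuming the benchmark is driven through impala-shell (an assumption about the test setup) and using made-up table and column names: rewriting an OR chain over one column as a single IN predicate lets it be pushed down to Kudu.

```bash
# Hypothetical sketch of the query3.sql change; fact_table and dim_key
# are illustrative names, not from the original benchmark.
impala-shell -q "
  SELECT COUNT(*)
  FROM fact_table
  WHERE dim_key IN (1, 2, 3)   -- single IN predicate: pushed down to Kudu
"
# was: WHERE dim_key = 1 OR dim_key = 2 OR dim_key = 3  (not pushed down)
```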
06-30-2017
06:34 AM
Interesting story. The decommission process will not complete until every block has at least one good replica on other DataNodes (a good replica is one that is not stale and is on a DataNode that is not being decommissioned or already decommissioned). The DirectoryScanner in a DataNode scans the entire data directory, reconciling inconsistencies between the in-memory block map and the on-disk replicas, so it would eventually pick up the added replica; it is just a matter of time.
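As a rough sketch of how to observe both of these with stock HDFS tooling (not specific to the original poster's cluster):

```bash
# Per-DataNode decommission state ("Decommission in progress", etc.):
hdfs dfsadmin -report

# DirectoryScanner cadence: dfs.datanode.directoryscan.interval is in
# seconds (21600 = 6 hours by default), which bounds how long the
# "matter of time" can be before an on-disk replica is reconciled.
hdfs getconf -confKey dfs.datanode.directoryscan.interval
```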
06-29-2017
06:43 PM
This usually means that another adapter is being picked up by the test. For me it was the loopback, which doesn't have a speed or mode, so the health test fails. Use ethtool to examine your adapters and find the one that doesn't report a speed, then add a regex to exclude it under Network Interface Collection Exclusion Regex on the Host configuration screen. My regex for the loopback adapter (lo) is ^lo$
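A quick sketch of the inspection step (interface names will vary per host):

```bash
# A physical NIC reports a speed; the loopback does not, which is
# what fails the health test.
ethtool eth0   # e.g. "Speed: 1000Mb/s"
ethtool lo     # no Speed/Duplex reported

# Then exclude the offender in Cloudera Manager:
# Hosts > Configuration > Network Interface Collection Exclusion Regex
#   ^lo$
```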
06-28-2017
08:53 AM
1 Kudo
Thank you very much @mbigelow. I was able to fix both the missing-blocks issue and the replication factor change. Missing blocks: all the DataNodes were included in a rack, but a configuration issue in our cluster caused the problem. Replication factor: yes, we needed to change the client configuration value, re-deploy the client configuration files, and restart HDFS, YARN, and all other client services that require this update. The following links were useful for changing the client configuration files: https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_mc_mod_configs.html http://grokbase.com/t/cloudera/scm-users/126wjwf5da/setting-the-replication-value
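One caveat worth noting: the client-side dfs.replication value only applies to files written after the change. For data already in HDFS, a sketch of the usual companion command (the path is hypothetical):

```bash
# Change the replication factor of existing files in place;
# -w waits until re-replication actually finishes.
hdfs dfs -setrep -w 3 /path/to/existing/data
```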
06-23-2017
06:30 AM
This has to do with the YARN memory settings. The amount of memory allocated to YARN is only 8 GB. I don't know what the minimum container size is; probably around 1.3 GB. The combination of the two determines the number of containers that can be launched, which for your cluster works out to 6. Anything beyond that has to wait for resources to be freed up. https://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/ https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html
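A back-of-the-envelope check with the numbers above (both property values are illustrative, not read from the poster's cluster):

```bash
# yarn.nodemanager.resource.memory-mb  ~ 8192  (8 GB given to YARN)
# yarn.scheduler.minimum-allocation-mb ~ 1365  (~1.3 GB per container)
echo $(( 8192 / 1365 ))   # -> 6 concurrent containers
```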
06-23-2017
02:54 AM
1 Kudo
In the logs for the ApplicationMaster/SparkDriver (which were around 4 GB) I found a StackOverflowError from the Spark reporter thread, and this Spark issue, https://issues.apache.org/jira/browse/SPARK-18750, matches my error. The job was launched using dynamicAllocation and requested an insane number of containers (16000, at 20 GB/8 cores each), and apparently this can cause a StackOverflowError in the Spark thread managing the executors. An easy workaround is to disable dynamicAllocation and use a fixed number of executors. With 10 executors the job runs fine.
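A minimal sketch of the workaround, assuming the job is launched with spark-submit (the jar name is hypothetical; the resource sizes mirror the numbers above):

```bash
spark-submit \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 10 \
  --executor-memory 20g \
  --executor-cores 8 \
  app.jar   # hypothetical application jar
```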
06-20-2017
08:40 PM
The partition column is a logical entity related to the table; it cannot be present in the data or in the schema of the Hive table. So how can I put the data into a DataFrame and a case class?