Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 14120 | 11-14-2017 01:11 AM
 | 60564 | 11-03-2017 06:53 AM
 | 4313 | 11-03-2017 06:18 AM
 | 13536 | 09-12-2017 05:51 AM
 | 1988 | 09-08-2017 02:50 AM
11-03-2017
06:43 AM
This just means that your shell action exited with a non-zero error code. If you want to know the reason, you need to add logging inside the shell script to see what happened. Be aware that the script executes locally on a data node, so the log it writes will end up on that particular data node.
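For example, a minimal sketch of such logging, assuming the script runs as an Oozie shell action and can write to HDFS from the data node (the command and paths below are placeholders):

```
#!/bin/bash
# Hypothetical sketch: capture everything the script prints into a local log,
# then copy that log to HDFS so it can be read without having to find out
# which data node executed the action.
LOG=/tmp/my_shell_action_$(date +%s).log
{
  echo "step 1: starting"
  my_real_command arg1 arg2      # placeholder for the actual work
  echo "step 1: exit code $?"
} > "$LOG" 2>&1

# Best-effort copy of the log to a central location in HDFS.
hdfs dfs -mkdir -p /user/$(whoami)/oozie-shell-logs
hdfs dfs -put -f "$LOG" /user/$(whoami)/oozie-shell-logs/
```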
11-03-2017
06:38 AM
Alternatively, you could look into "yarn queue" and resource allocation. This will not "restrict" the number of mappers or reducers, but it will control how many can run concurrently by giving the job access to only a subset of the available resources.
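As an illustration (a sketch only), assuming your scheduler defines a small queue named "etl" and your driver goes through ToolRunner, the job can be pinned to that queue at submit time; the jar, class and queue names are placeholders:

```
# Submit to a capped queue so the job only gets that queue's share of the cluster.
hadoop jar my-job.jar com.example.MyDriver \
  -Dmapreduce.job.queuename=etl \
  /input /output
```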
11-03-2017
06:31 AM
First: save the NameNode directory content. Second: can you launch the second NameNode on its own? Does it start? If yes, you should be able to start the DataNodes and get access to the data.
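A rough sketch of those two steps, assuming a package-based CDH install (with Cloudera Manager you would start the role from the CM UI instead); /data/dfs/nn stands in for your actual dfs.namenode.name.dir:

```
# 1. Save the NameNode metadata directory before touching anything else.
tar czf /root/nn-metadata-backup.tgz /data/dfs/nn

# 2. On the second NameNode host, try to start only that NameNode.
sudo service hadoop-hdfs-namenode start

# 3. If it comes up, start the DataNodes and check that they register.
sudo -u hdfs hdfs dfsadmin -report
```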
11-03-2017
06:18 AM
1 Kudo
Hi, the concept of Hive partitions does not map to HBase tables, so if you want HBase as the storage you will need to work around your use case. You could use a single HBase table with a row key built from the partition value; that way you can query the table by row key and avoid a full scan (see the sketch below). Or you could have one HBase table per "partition" (which also means one Hive table per partition). Or you may conclude that HBase does not answer your need and stay in Hive? Regards, Mathieu
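To illustrate the row-key option (a sketch only; the table name, separator and date format are invented for the example):

```
# With row keys of the form <partition-value>#<original-key>, a prefix scan
# touches only the rows of one "partition" instead of the whole table.
echo "scan 'events', {ROWPREFIXFILTER => '2017-11-03#'}" | hbase shell
```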
10-25-2017
02:57 AM
I think what you are looking for is a configuration located inside the "core-site.xml" file (in the HDFS configuration). Search for "proxyuser" in the Cloudera documentation (a quick check is sketched below). Regards, Mathieu
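For instance, you could first check what is already configured; the command assumes the client configuration lives under /etc/hadoop/conf, and the property names in the comment are the standard hadoop.proxyuser.* settings:

```
# Look for existing proxyuser entries in the active core-site.xml.
# The relevant properties are typically:
#   hadoop.proxyuser.<user>.hosts
#   hadoop.proxyuser.<user>.groups
grep -A1 "hadoop.proxyuser" /etc/hadoop/conf/core-site.xml
```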
09-12-2017
05:51 AM
Not sure this information is available. You could go with the "yarn logs" command, or take the basic route on the command line (see the sketch below):
- pdsh to distribute the same command on every data node
- a find on the container id
Regards, Mathieu
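Something along these lines; the application id, host list and log directory are placeholders that depend on your configuration:

```
# Option 1: aggregated logs, if YARN log aggregation is enabled.
yarn logs -applicationId application_1500000000000_0001 | less

# Option 2: brute force across the data nodes with pdsh + find,
# assuming the NodeManager logs live under /var/log/hadoop-yarn.
pdsh -w datanode[01-20] \
  "find /var/log/hadoop-yarn -type d -name 'container_*1500000000000_0001*'"
```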
09-08-2017
02:50 AM
I believe this 30s wait time is hard-coded into the Cloudera agent. I don't think it can be altered short of a really dirty modification, which I wouldn't recommend. Regards, Mathieu
08-11-2017
06:15 AM
As far as I understand how Impala works, that is the expected behaviour. It is indeed intended for speeding up later queries that use the same sets of data.
07-25-2017
12:52 AM
Hi, I personally don't know of that possibility. But as a workaround you can reference a morphline on a network share accessible from all nodes (I guess you already know that). Regards, Mathieu
06-12-2017
04:45 AM
From my understanding, when you use the Sentry HDFS synchronization plugin you only need to set the following on the warehouse directory: hive:hive ownership and 771 permissions (see the sketch below).
https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_hiveserver2_security.html#concept_vxf_pgx_nm
https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concept_z5b_42s_p4__section_lvc_4g4_rp
The plugin then manages the other permissions according to the privileges granted in Sentry. If you set the permissions yourself, there is no point in using the Sentry HDFS synchronization plugin.
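Concretely, the documented baseline looks roughly like this (a sketch; /user/hive/warehouse is the default warehouse location and may differ on your cluster):

```
# Give the warehouse to hive:hive with mode 771 and let the Sentry HDFS
# sync plugin derive the per-database/table permissions from the Sentry grants.
sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
sudo -u hdfs hdfs dfs -chmod -R 771 /user/hive/warehouse
```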