Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14251 | 11-14-2017 01:11 AM |
| | 60651 | 11-03-2017 06:53 AM |
| | 4333 | 11-03-2017 06:18 AM |
| | 13578 | 09-12-2017 05:51 AM |
| | 1998 | 09-08-2017 02:50 AM |
11-03-2017
06:38 AM
Alternatively, you could look into YARN queues and resource allocation. This will not restrict the number of mappers or reducers, but it will control how many can run concurrently by giving the job access to only a subset of the available resources.
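As a rough illustration of why a queue limits concurrency rather than the total task count, here is a minimal sketch. It assumes memory is the only constrained resource and ignores queue elasticity, vcores, and the ApplicationMaster container; the function name and all figures are hypothetical.

```python
def max_concurrent_containers(cluster_memory_mb: int,
                              queue_capacity_pct: int,
                              container_memory_mb: int) -> int:
    """Upper bound on containers that fit in a queue's share of cluster memory.

    Simplifying assumption: only memory is limited, and the queue cannot
    borrow capacity from other queues.
    """
    queue_memory_mb = cluster_memory_mb * queue_capacity_pct // 100
    return queue_memory_mb // container_memory_mb


# Hypothetical cluster: 100 GiB of YARN memory, a queue with 25% capacity,
# and 2 GiB map containers.
print(max_concurrent_containers(102400, 25, 2048))  # 12
```

A job with 500 mappers submitted to such a queue would still run all 500, but at most about 12 at a time under these assumptions.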
11-03-2017
06:18 AM
1 Kudo
Hi, The concept of a Hive partition does not map to HBase tables. So if you want to have HBase as the storage, you will need to work around this in your use case. You could try to use a single HBase table whose row key is constructed with the partition value. That way you should be able to query your HBase table using the row key and avoid a full scan of the table. Or you could have one HBase table per "partition" (this also means one Hive table per partition). Or you could decide that HBase does not meet your needs and stay in Hive. Regards, Mathieu
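The single-table option could look like the sketch below. It assumes the partition value is placed as a prefix of the row key, followed by a separator and a record identifier; the helper names, the `|` separator, and the sample values are all hypothetical. The code only builds keys and scan bounds and does not talk to HBase.

```python
def make_row_key(partition_value: str, record_id: str, sep: str = "|") -> bytes:
    """Build a row key that starts with the partition value (assumed layout)."""
    return f"{partition_value}{sep}{record_id}".encode("utf-8")


def partition_scan_range(partition_value: str, sep: str = "|") -> tuple:
    """Return (start_row, stop_row) covering every key of one 'partition'.

    The stop row is the prefix with its last byte incremented, so it is
    exclusive. This assumes the separator is an ASCII character.
    """
    start = f"{partition_value}{sep}".encode("utf-8")
    stop = start[:-1] + bytes([start[-1] + 1])
    return start, stop


key = make_row_key("2017-11-03", "42")
start, stop = partition_scan_range("2017-11-03")
print(key, start, stop)
print(start <= key < stop)  # True: the key falls inside the scan range
```

Passing such start and stop rows to a scan limits it to one "partition" instead of reading the whole table.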
10-25-2017
02:57 AM
I think what you are looking for is a configuration located in the "core-site.xml" file (in the HDFS configuration). Search for "proxyuser" in the Cloudera documentation. Regards, Mathieu
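For orientation, Hadoop's proxy user settings follow the naming pattern `hadoop.proxyuser.<user>.hosts` and `hadoop.proxyuser.<user>.groups`. The sketch below only renders such properties as a core-site.xml fragment; the service user `myservice`, the host, and the group are hypothetical, and on a Cloudera Manager cluster the values would be set through CM rather than by editing the file by hand.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(superuser: str, hosts: str = "*", groups: str = "*") -> dict:
    """Property names and values allowing `superuser` to impersonate other users."""
    return {
        f"hadoop.proxyuser.{superuser}.hosts": hosts,
        f"hadoop.proxyuser.{superuser}.groups": groups,
    }


def to_core_site_fragment(props: dict) -> str:
    """Render the properties as a <configuration> XML fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical service account, gateway host, and group.
print(to_core_site_fragment(
    proxyuser_properties("myservice", hosts="gateway1.example.com", groups="analysts")
))
```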
10-19-2017
07:15 PM
There are a couple of places that need tuning at the query level:
1. Statistics for the table are a must for good performance.
2. When joining two tables, make sure the large table comes last and the first table is the smaller one.
3. You can also use hints to improve query performance.
4. The Hive table's file format is a big factor.
5. Choose carefully when to use partitioning vs. bucketing.
6. Allocate enough memory to HiveServer2 and the metastore.
7. Heap size.
8. Load balancer on the host: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_cm_ha_hosts.html#concept_qkr_bfd_pr
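Points 1 and 2 above can be sketched as a small helper that emits HiveQL statistics statements and orders join inputs from smallest to largest. This is only an illustration: the table names and row counts are made up, the statements assume unpartitioned tables, and the exact `ANALYZE` syntax accepted depends on the Hive version.

```python
def analyze_statements(table: str) -> list:
    """HiveQL statements that gather table-level and column-level statistics."""
    return [
        f"ANALYZE TABLE {table} COMPUTE STATISTICS",
        f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS",
    ]


def join_order(row_counts: dict) -> list:
    """Order tables smallest first, so the largest table comes last in the join."""
    return [name for name, _ in sorted(row_counts.items(), key=lambda kv: kv[1])]


# Hypothetical tables and row counts.
counts = {"sales": 500_000_000, "stores": 2_000, "products": 150_000}
for table in counts:
    for stmt in analyze_statements(table):
        print(stmt + ";")
print(join_order(counts))  # ['stores', 'products', 'sales']
```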
10-14-2017
07:05 AM
Do you need the --override? I reran my Tutorial 1 and it didn't append new records. I thought it would. Why do you think it allowed it?
09-15-2017
09:43 AM
Finding logs manually on a machine sounds very brute force; I was thinking more of an API or CLI option to find logs. Anyway, the main issue we're trying to solve is giving all developers access to logs in the prod environment. Our node managers are locked down and not accessible (on any port or via the web) to developers, and that is unlikely to change. So we're trying to find a way to proxy the logs. I discovered that there is a jobhistory proxy to look at completed jobs / YARN apps, but I couldn't get it working for a running app. Is there any trick or way to access a running app's logs like the above? http://resourcemanager.xyz.com:19888/jobhistory/logs//dataNode.com:8041/container_id_000001/container_id_000001/root
09-08-2017
02:50 AM
I believe this wait time of 30s is hard-coded into the Cloudera agent. I don't think we can alter it other than by doing a really dirty modification, which I wouldn't recommend. Regards, Mathieu
08-11-2017
07:57 AM
Thank you for the detailed answer
07-25-2017
07:43 AM
The hbase-indexer morphlines.conf is managed by CM and is automatically distributed to each node in the /var/run/cloudera-scm-agent/process directory when hbase-indexer starts. You'll want to specify a relative path name in the morphline-hbase-mapper.xml, and it will be picked up from the process directory: https://www.cloudera.com/documentation/enterprise/latest/topics/search_hbase_batch_indexer.html#concept_q3l_2tb_4r -pd
06-12-2017
04:45 AM
From my understanding, when you use the Sentry HDFS synchronization plugin you only need to set the following ownership and permissions: hive:hive / 771 https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_hiveserver2_security.html#concept_vxf_pgx_nm https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concept_z5b_42s_p4__section_lvc_4g4_rp Then the plugin manages the other permissions according to the permissions granted in Sentry. If you set the permissions yourself, then there is no point in using the Sentry HDFS synchronization plugin.
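To make the 771 mode concrete, the short sketch below decodes it into the familiar `rwx` notation using only the Python standard library. The helper name is hypothetical, and the code does not touch HDFS; it only interprets the octal value for a directory.

```python
import stat


def describe_dir_mode(octal_mode: str) -> str:
    """Render an octal permission string as an ls-style mode for a directory."""
    mode = int(octal_mode, 8)
    return stat.filemode(stat.S_IFDIR | mode)


print(describe_dir_mode("771"))  # drwxrwx--x
```

With owner and group both set to hive, this means the hive user and hive group have full access, while everyone else can only traverse the directory.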