About saranvisa

saranvisa · ‎01-31-2017

@AnisurRehman

1. Please refer to this official link to learn more about Sqoop. Change the version in the URL to match your Sqoop version: https://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html
2. Yes, bulk import is possible. Please refer to the "sqoop-import-all-tables" topic in the above link.
3. About incremental: please refer to "incremental import" in the above link.
4. About Impala for Sqoop:
   a. Sqoop uses mappers from MapReduce (no reducers by default). It refers to the Hive db/table only to identify the target location; it never uses the Hive/Impala engine or processing methods to import. So specifying Impala or Hive doesn't make any difference, and Sqoop provides the hive-import option by default. The bottom line is that you can continue to use the Hive options in the Sqoop script.
   b. After the data import, it is up to you whether to use Hive or Impala, depending on your requirement. But as you mentioned, you can use Impala in certain situations, so please use Impala only when it is necessary (some priority tables).

Thanks
Kumar
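As a rough illustration of points 2 and 3, the two commands might look like the sketch below. The JDBC URL, user name, table, check column, and target directory are all placeholders, not values from this thread. The script only prints the commands, so they can be checked against the user guide for your Sqoop version before anything is run.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder (assumption), not from the thread.
JDBC_URL="jdbc:mysql://dbhost.example.com/sales"
DB_USER="etl_user"
TABLE="orders"
CHECK_COLUMN="order_id"
LAST_VALUE="0"
TARGET_DIR="/user/etl/orders"

# Point 2: bulk import of every table in the source database into Hive.
bulk_import_cmd() {
  echo "sqoop import-all-tables --connect $JDBC_URL --username $DB_USER -P --hive-import"
}

# Point 3: append-mode incremental import of rows whose check column
# is greater than the last imported value.
incremental_import_cmd() {
  echo "sqoop import --connect $JDBC_URL --username $DB_USER -P --table $TABLE --target-dir $TARGET_DIR --incremental append --check-column $CHECK_COLUMN --last-value $LAST_VALUE"
}

# Print the commands for review instead of running them.
bulk_import_cmd
incremental_import_cmd
```

The -P flag makes Sqoop prompt for the password, which keeps it out of the script and the shell history.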

saranvisa · ‎01-30-2017

@csguna It is authorized_key, which has nothing to do with HDFS here, so it is user:linux group (instead of the hdfs group).

MasterOfPuppets · ‎01-26-2017

I've set hive.prewarm.enabled=true and it did not improve the slow latency to start and initialize executors. It still takes about 15 seconds to initialize things. Any ideas?

alex.behm · ‎01-20-2017

Thanks!

mbigelow · ‎01-17-2017

On the setting changes: stats, as stated, will help with counts, as that info is precalculated and stored in the metadata. The CBO and stats also help a lot with joins. It is possible that the OS cache has more to do with the improvement if this was a subsequent run with little activity. You could look at Hive on Spark for better, more consistent performance. Set hive.execution.engine = spark;

On the times: the big impact between job submission and start is the scheduler. That is a deep topic. It is best if you read up on the schedulers, review your settings, and ask any specific questions that come up, preferably in a new topic.

The other factor, not captured in the job stats, is the time it takes to return the results to the client. This will vary depending on the client and there isn't much to do about it. In general, small result sets can be handled by the Hive CLI. You can increase the client heap if needed. Otherwise use HS2 connections like beeline or HUE.
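The settings discussed above could be collected into a small HiveQL file, roughly as sketched below. The table name `orders` and the HiveServer2 URL are placeholders, and whether each property helps depends on your Hive version and cluster, so treat this as a starting point rather than a recommended configuration. The script writes the file and prints a beeline command instead of running it.

```shell
#!/bin/sh
# Sketch only: the table name and HiveServer2 URL are placeholders (assumptions).
HQL_FILE=$(mktemp)

cat > "$HQL_FILE" <<'EOF'
-- Gather table and column statistics so counts and the CBO can use them.
ANALYZE TABLE orders COMPUTE STATISTICS;
ANALYZE TABLE orders COMPUTE STATISTICS FOR COLUMNS;

-- Answer simple aggregates such as COUNT(*) from stored statistics.
set hive.compute.query.using.stats=true;
-- Enable the cost-based optimizer and let it read column statistics.
set hive.cbo.enable=true;
set hive.stats.fetch.column.stats=true;
-- Run on Spark instead of MapReduce.
set hive.execution.engine=spark;

SELECT COUNT(*) FROM orders;
EOF

# Print the command for review instead of running it.
echo "beeline -u jdbc:hive2://hs2.example.com:10000/default -f $HQL_FILE"
```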

xzheng · ‎01-12-2017

@saranvisa This health check result indicates that NodeManager is not getting enough heap space compared to its workload. Typically, when the workload grows in the cluster and the Java daemon needs more heap, you need to give more heap to the Role. You could:

1. Increase the heap given to Node Manager through Node Manager's configuration page ('Java Heap Size of NodeManager in Bytes').
2. Alternatively, though not recommended, you could tune the threshold you found to tolerate a higher GC ratio for Node Manager.

I would recommend you go to the specific Node Manager's role instance page in Cloudera Manager and browse through the charts available for Node Manager. There will be a chart named 'JVM heap memory usage' showing the heap consumption of that particular Node Manager. Then you will have a better idea of how much memory the Role is using, and can increase the heap given to it to a higher value if needed.

saranvisa · ‎01-12-2017

Since you have mentioned the words "user role", I want to clarify this. You have to understand the difference between Group, User and Role. Groups and Users are to be created in both Linux (as the root user) and Hue (as the admin user), but Roles are to be created only in Hue.

Ex: Log in as root in Linux and apply the commands below.

Group:
groupadd hive;
groupadd hue;
groupadd impala;
groupadd analyst;
groupadd admin;
# In your case, your Groups are supposed to be: Auditor, Read-Only, Limited Operator, Operator, Configurator, Cluster Administrator, BDR Administrator, Navigator Administrator, User Administrator, Key Administrator, Full Administrator

User:
useradd kumar;
# User belongs to Group
usermod -a -G hive,hue,impala,admin,analyst kumar;
passwd kumar;

# Role assigned to Group:
Now, log in to Hue -> Security (Menu) -> Sentry Tables -> Add Roles (as the Hive user)
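If you need to repeat the Linux part on several hosts, the commands above can be wrapped in a small script, roughly as sketched below. The `run`, `ensure_group`, and `ensure_user` helpers and the DRY_RUN switch are illustrative names, not part of any tool. By default the script only prints what it would do; set DRY_RUN=0 and run it as root to apply the changes.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a group only if it does not already exist.
ensure_group() {
  getent group "$1" >/dev/null 2>&1 || run groupadd "$1"
}

# Create a user only if it does not already exist.
ensure_user() {
  id -u "$1" >/dev/null 2>&1 || run useradd "$1"
}

for g in hive hue impala analyst admin; do
  ensure_group "$g"
done

ensure_user kumar
run usermod -a -G hive,hue,impala,admin,analyst kumar
# Set the password interactively afterwards: passwd kumar
```

Checking for the group or user first lets the script be re-run without failing on names that already exist.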

saranvisa · ‎01-12-2017

@cplusplus1 You can get the xml files in the path below, but I would not recommend updating them directly; instead, you can update your configuration using CM.

/var/run/cloudera-scm-agent/process/*-hive-HIVESERVER2

By default, Sentry requires configuration changes in Hive, Impala, YARN and Hue (you can add additional services as needed and change their configuration).

Ex: You can follow this method:
CM -> Hive -> Configuration
Select Scope > HiveServer2.
Select Category > Main.
Uncheck the HiveServer2 Enable Impersonation checkbox.
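The path above usually matches several directories, one per role start, so a quick read-only way to find the most recent one might look like the sketch below. It assumes the newest directory by modification time is the one in use, and that you have permission to read the process directory (typically root). The `latest_hs2_dir` function name is illustrative.

```shell
#!/bin/sh
# Sketch only: read-only lookup, nothing is modified.
PROCESS_ROOT="${PROCESS_ROOT:-/var/run/cloudera-scm-agent/process}"

# Print the most recently modified *-hive-HIVESERVER2 directory, if any.
latest_hs2_dir() {
  ls -1dt "$PROCESS_ROOT"/*-hive-HIVESERVER2 2>/dev/null | head -n 1
}

hs2_dir=$(latest_hs2_dir)
if [ -n "$hs2_dir" ]; then
  echo "Newest HiveServer2 process directory: $hs2_dir"
  # List the xml files found there, if any are readable.
  ls "$hs2_dir"/*.xml 2>/dev/null || true
else
  echo "No *-hive-HIVESERVER2 directory found under $PROCESS_ROOT"
fi
```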

saranvisa · ‎01-06-2017

FYI... Everything is fine with kadmin.local, but kadmin was not working properly. The same issue was raised by someone else on Stack Overflow. I just followed the instructions there, and the issue has been fixed now: http://stackoverflow.com/questions/23779468/kerberos-kadmin-not-working-properly

cdhnaidu · ‎01-03-2017

Hi, I created a user called "commonuser" and a group called "commonuser" in Hue and on the Linux machine. I created a role called "commonuser" in the Sentry app to access databases and gave it the "select" privilege. Then I logged in as commonuser in Hue. In the Hue Hive editor the databases are visible, but not in the Hue Impala editor. In Impala only the default database, without any tables, is visible, as shown in the screenshot below. Please advise me on the issue.

Last Visited	‎08-10-2019 05:12 PM

Member Since	‎09-02-2016 11:35 AM
Posts	523
Kudos received	97

Cloudera Community

Re: Promoting Metadata

Re: Mix on premise and cloud nodes

Re: impala-shell

Re: How do I see user usage stats by table in Impa...

Re: Replica Not FoundException

Re: How to bulk upload Impala -Direct table[Oracle...

Re: Configure NameNode HA Cluster - How to generat...

Re: Hive hive.prewarm.enabled property

Re: CREATE TABLE AS SELECT returns error 'Failed t...

Re: Hive Queries run slowly

Re: NodeManager Health is bad: Issue due to garbag...

Re: How to create the following user roles

Re: how to find the hive-sitexml and hdfs-site.xml...

Re: Getting error when add new service in Cluster ...

Re: Regarding Hue and Hive. Can't see database on ...