Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2724 | 08-28-2018 02:00 AM |
| | 2696 | 07-31-2018 06:55 AM |
| | 5685 | 07-26-2018 03:02 AM |
| | 2981 | 07-19-2018 02:30 AM |
| | 6465 | 05-21-2018 03:42 AM |
05-10-2018
07:54 PM
@alex.behm wrote: To debug wrong results, it's very helpful for us to get an Impala query profile of the query that returns wrong results. Would you be able to provide that to help us debug?

Please see this URL for the Impala query profile. Thanks
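For anyone hitting the same question, one way to capture such a profile (a generic sketch; the hostname and port are placeholders for your own Impala daemon) is from impala-shell: run the failing query, then issue PROFILE:

impala-shell -i <impalad-host>:21000
-- run the query that returns wrong results, then:
PROFILE;   -- prints the full profile of the last statement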
05-08-2018
01:13 AM
Do not specify the driver in the Sqoop arguments. Using the --driver parameter always forces Sqoop to use the Generic JDBC Connector, regardless of whether a more specialized connector is available. For example, where the MySQL specialized connector would normally be used because the URL starts with jdbc:mysql://, specifying the --driver option forces Sqoop to use the generic connector instead. As a result, in most cases you should not need the --driver option at all. Thanks, Ankit Gaurav Sinha
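As a quick sketch (the hostname, database, table, and paths below are placeholders, not from the original question), letting Sqoop choose the MySQL connector from the JDBC URL instead of forcing the generic one:

sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/etl/orders
# note: no --driver option; the jdbc:mysql:// URL already selects the specialized MySQL connector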
05-03-2018
01:33 AM
Thank you for your support
05-01-2018
04:36 AM
Related bug reports, unfortunately only fixed in V2 and not backported:
https://issues.apache.org/jira/browse/HIVE-17358?jql=project%20%3D%20HIVE%20AND%20text%20~%20%22inse...
https://issues.apache.org/jira/browse/HIVE-11723?jql=project%20%3D%20HIVE%20AND%20text%20~%20%22inse...
04-30-2018
04:10 AM
@bhaveshsharma03 In fact there is no standard answer to this question, as it depends purely on your business model, cluster size, Sqoop export/import frequency, data volume, hardware capacity, etc. I can give a few points based on my experience; I hope they help:
1. 75% of our Sqoop scripts (non-priority) use the default mappers, for various reasons: we don't want to use all the available resources for Sqoop alone.
2. We also don't want to apply every possible performance tuning method to those non-priority jobs, as it may disturb the RDBMS (source/target) too.
3. Get in touch with the RDBMS owner to find their non-busy hours, identify the priority Sqoop scripts (based on your business model), and apply the performance tuning methods to the priority scripts based on data volume (not only rows; hundreds of columns also matter). A rough sketch follows below. Repeat this if you have more than one database.
4. Regarding who is responsible: in most cases, if you have a small cluster used by very few teams, developers and admins can work together, but if you have a very large cluster used by many teams, then it is out of the admin's scope... again, it depends.
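Sketch for point 3 (table names, split column, and mapper count are purely illustrative, not a recommendation):

# non-priority job: keep the default 4 mappers
sqoop import --connect jdbc:mysql://dbhost/app --table small_table --username sqoop_user -P

# priority job on a large table: more mappers plus an evenly distributed split column
sqoop import --connect jdbc:mysql://dbhost/app --table big_table --username sqoop_user -P \
  --num-mappers 16 --split-by order_id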
04-27-2018
06:59 PM
Those are the same steps I've taken, except that restarting the -db service did not create a new data directory. Maybe I should re-check the permissions. I've also been working on creating the data directory manually with initdb, etc., but somewhere I'm missing a password. I'm about to rework pg_hba.conf to let cloudera-scm in without a password. If I can reload the dump (I used pg_dumpall in order to get the roles and permissions as well), then I think I can get this thing going. UGH... this has been an amazingly frustrating process.
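For reference, the pg_hba.conf change I have in mind looks roughly like this (the user name, address, and dump file name are guesses for my own setup, not a general recommendation):

# pg_hba.conf: let the cloudera-scm user in locally without a password
# TYPE  DATABASE  USER          ADDRESS        METHOD
host    all       cloudera-scm  127.0.0.1/32   trust

# then, after reloading the Postgres config, restore the pg_dumpall dump
# (role/database creation in the dump generally needs a superuser)
psql -h 127.0.0.1 -U cloudera-scm -f all_databases.sql postgres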
04-18-2018
06:55 AM
@dpugazhe Generally the / mount on Linux servers is small. Could you share the output of df -h from your Linux box? I would suggest changing the location of the parcels and logs. For example, if you have a larger mount on your Linux box called /xxxx, change /var/lib and /var/log to /xxxx/hadoop/lib and /xxxx/hadoop/log, and do the same for the parcels. As you are using Cloudera Manager, these changes can be done quickly. To do that:
1. Stop Cloudera Manager services.
2. Move the old logs to the new partition.
3. Delete the old logs.
4. Start Cloudera Manager services.
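A rough outline of those four steps on the command line (the /xxxx mount and paths are placeholders; the log directories should then be repointed to the new paths in the Cloudera Manager configuration):

# 1. stop Cloudera Manager services on the host
service cloudera-scm-server stop
service cloudera-scm-agent stop

# 2. move the old logs to the new partition
mkdir -p /xxxx/hadoop/log
cp -a /var/log/cloudera-scm-server /xxxx/hadoop/log/

# 3. delete the old logs once the copy is verified
rm -rf /var/log/cloudera-scm-server/*

# 4. start Cloudera Manager services again
service cloudera-scm-agent start
service cloudera-scm-server start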
04-18-2018
01:08 AM
Managed to check the home directory, list it, and view the content of the file using the following commands:
curl "http://192.168.1.7:14000/webhdfs/v1?op=gethomedirectory&user.name=root"
curl 'http://192.168.1.7:14000/webhdfs/v1/user/root/t?op=LISTSTATUS&user.name=root'
curl -i -L "http://192.168.1.7:14000/webhdfs/v1/user/root/t?op=OPEN&user.name=root"
04-17-2018
06:37 AM
@Harsh J Thank you. Unfortunately I have access only to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces though (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.