Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2724 | 08-28-2018 02:00 AM |
| | 2696 | 07-31-2018 06:55 AM |
| | 5685 | 07-26-2018 03:02 AM |
| | 2981 | 07-19-2018 02:30 AM |
| | 6465 | 05-21-2018 03:42 AM |
05-10-2018
07:54 PM
@alex.behm wrote: To debug wrong results, it's very helpful for us to get an Impala query profile of the query that returns wrong results. Would you be able to provide that to help us debug?

Please see this URL for the Impala query profile. Thanks
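For anyone hitting the same question, one way to capture such a profile (a generic sketch; the hostname and port are placeholders for your own Impala daemon) is from impala-shell: run the failing query, then issue PROFILE:

impala-shell -i <impalad-host>:21000
-- run the query that returns wrong results, then:
PROFILE;   -- prints the full profile of the last statement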
05-08-2018
01:13 AM
Do not specify the driver in the Sqoop arguments. Using the --driver parameter always forces Sqoop to use the Generic JDBC Connector, regardless of whether a more specialized connector is available. For example, where the MySQL specialized connector would normally be used because the URL starts with jdbc:mysql://, specifying the --driver option forces Sqoop to use the generic connector instead. As a result, in most cases you should not need the --driver option at all. Thanks, Ankit Gaurav Sinha
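As a quick sketch (the hostname, database, table, and paths below are placeholders, not from the original question), letting Sqoop choose the MySQL connector from the JDBC URL instead of forcing the generic one:

sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/etl/orders
# note: no --driver option; the jdbc:mysql:// URL already selects the specialized MySQL connector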
05-03-2018
01:33 AM
Thank you for your support
05-01-2018
04:36 AM
Related bug reports, unfortunately only fixed in V2 and not backported:
https://issues.apache.org/jira/browse/HIVE-17358?jql=project%20%3D%20HIVE%20AND%20text%20~%20%22inse...
https://issues.apache.org/jira/browse/HIVE-11723?jql=project%20%3D%20HIVE%20AND%20text%20~%20%22inse...
04-30-2018
04:10 AM
@bhaveshsharma03 In fact there is no standard answer to this question, as it depends purely on your business model, cluster size, Sqoop export/import frequency, data volume, hardware capacity, etc. I can give a few points based on my experience; I hope they help:
1. 75% of our Sqoop scripts (non-priority) use the default mappers, for various reasons: we don't want to use all the available resources for Sqoop alone.
2. We also don't want to apply every possible performance tuning method to those non-priority jobs, as it may disturb the RDBMS (source/target) too.
3. Get in touch with the RDBMS owner to find their non-busy hours, identify the priority Sqoop scripts (based on your business model), and apply the performance tuning methods to the priority scripts based on data volume (not only rows; hundreds of columns also matter). A rough sketch follows below. Repeat this if you have more than one database.
4. Regarding who is responsible: in most cases, if you have a small cluster used by very few teams, developers and admins can work together, but if you have a very large cluster used by many teams, then it is out of the admin's scope... again, it depends.
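Sketch for point 3 (table names, split column, and mapper count are purely illustrative, not a recommendation):

# non-priority job: keep the default 4 mappers
sqoop import --connect jdbc:mysql://dbhost/app --table small_table --username sqoop_user -P

# priority job on a large table: more mappers plus an evenly distributed split column
sqoop import --connect jdbc:mysql://dbhost/app --table big_table --username sqoop_user -P \
  --num-mappers 16 --split-by order_id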
04-27-2018
06:59 PM
Those are the same steps I've taken, except that restarting the -db service did not create a new data directory. Maybe I should re-check the permissions. I've also been working on creating the data directory manually with initdb, etc., but somewhere I'm missing a password. I'm about to rework pg_hba.conf to let cloudera-scm in without a password. If I can reload the dump (I used pg_dumpall in order to get the roles and permissions as well), then I think I can get this thing going. UGH... this has been an amazingly frustrating process.
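For reference, the pg_hba.conf change I have in mind looks roughly like this (the user name, address, and dump file name are guesses for my own setup, not a general recommendation):

# pg_hba.conf: let the cloudera-scm user in locally without a password
# TYPE  DATABASE  USER          ADDRESS        METHOD
host    all       cloudera-scm  127.0.0.1/32   trust

# then, after reloading the Postgres config, restore the pg_dumpall dump
# (role/database creation in the dump generally needs a superuser)
psql -h 127.0.0.1 -U cloudera-scm -f all_databases.sql postgres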
04-18-2018
06:55 AM
@dpugazhe Generally the / mount on Linux servers is small. Could you share the output of df -h from your Linux box? I would suggest changing the location of the parcels and logs. For example, if you have a larger mount on your Linux box called /xxxx, change /var/lib and /var/log to /xxxx/hadoop/lib and /xxxx/hadoop/log, and do the same for the parcels. As you are using Cloudera Manager, these changes can be done quickly. To do that:
1. Stop Cloudera Manager services.
2. Move the old logs to the new partition.
3. Delete the old logs.
4. Start Cloudera Manager services.
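A rough outline of those four steps on the command line (the /xxxx mount and paths are placeholders; the log directories should then be repointed to the new paths in the Cloudera Manager configuration):

# 1. stop Cloudera Manager services on the host
service cloudera-scm-server stop
service cloudera-scm-agent stop

# 2. move the old logs to the new partition
mkdir -p /xxxx/hadoop/log
cp -a /var/log/cloudera-scm-server /xxxx/hadoop/log/

# 3. delete the old logs once the copy is verified
rm -rf /var/log/cloudera-scm-server/*

# 4. start Cloudera Manager services again
service cloudera-scm-agent start
service cloudera-scm-server start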
04-18-2018
01:08 AM
Managed to check the home directory, list it, and view the content of the file using the following commands:
curl "http://192.168.1.7:14000/webhdfs/v1?op=gethomedirectory&user.name=root"
curl 'http://192.168.1.7:14000/webhdfs/v1/user/root/t?op=LISTSTATUS&user.name=root'
curl -i -L "http://192.168.1.7:14000/webhdfs/v1/user/root/t?op=OPEN&user.name=root"
04-17-2018
06:37 AM
@Harsh J Thank you. Unfortunately I have access only to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces though (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.