Member since: 09-23-2015
Posts: 800
Kudos Received: 898
Solutions: 185
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5183 | 08-12-2016 01:02 PM
 | 2147 | 08-08-2016 10:00 AM
 | 2524 | 08-03-2016 04:44 PM
 | 5346 | 08-03-2016 02:53 PM
 | 1369 | 08-01-2016 02:38 PM
01-25-2016
11:36 AM
2 Kudos
Wrote this as an answer because of the comment character limit: yes, first go into Ambari, or perhaps better the OS, and look for the tez.lib.uris property in the Tez configuration file:

less /etc/tez/conf/tez-site.xml

You should find something like this:

<value>/hdp/apps/${hdp.version}/tez/tez.tar.gz</value>

If this is not available, you may have a different problem (Tez client not installed, or some configuration issue). You can then check whether these files exist in HDFS:

hadoop fs -ls /hdp/apps/

Find the version number, for example 2.3.2.0-2950:

[root@sandbox ~]# hadoop fs -ls /hdp/apps/2.3.2.0-2950/tez
Found 1 items
-r--r--r--   3 hdfs hadoop   56926645 2015-10-27 14:40 /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz

You can check whether this file is somehow corrupted:

hadoop fs -get /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz

and then try to untar it to see if that works. If the file doesn't exist in HDFS, you can find it in the installation directory of HDP (/usr/hdp/2.3.2.0-2950/tez/lib/tez.tar.gz on the local filesystem) and put it into HDFS.
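If you do need to put it back, here is a minimal sketch, assuming HDP version 2.3.2.0-2950 and the standard /hdp/apps layout; adjust the version and permissions for your cluster:

# run as a user that can write to /hdp/apps, e.g. hdfs
su - hdfs -c "hadoop fs -mkdir -p /hdp/apps/2.3.2.0-2950/tez"
su - hdfs -c "hadoop fs -put /usr/hdp/2.3.2.0-2950/tez/lib/tez.tar.gz /hdp/apps/2.3.2.0-2950/tez/"
su - hdfs -c "hadoop fs -chmod -R 555 /hdp/apps/2.3.2.0-2950/tez"
su - hdfs -c "hadoop fs -chmod 444 /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz"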
01-25-2016
10:20 AM
1 Kudo
There are different possibilities. Normally this means the Tez libraries are not present in HDFS. Are you using the sandbox? You should check whether the Tez client is installed on your Pig client node, whether tez-site.xml contains the tez.lib.uris property, and whether the Tez libraries are actually in HDFS and valid (download them and untar them to check): /hdp/apps/<hdp_version>/tez/tez.tar.gz

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ref-ffec9e6b-41f4-47de-b5cd-1403b4c4a7c8.1.html
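A quick way to run those checks (paths assume a default HDP layout; replace <hdp_version> with the version on your cluster):

# is the property set on the client node?
grep -A1 tez.lib.uris /etc/tez/conf/tez-site.xml

# is the archive in HDFS and readable?
hadoop fs -ls /hdp/apps/<hdp_version>/tez/tez.tar.gz

# download and test-extract it to make sure it is not corrupted
hadoop fs -get /hdp/apps/<hdp_version>/tez/tez.tar.gz /tmp/tez.tar.gz
tar -tzf /tmp/tez.tar.gz > /dev/null && echo "archive looks OK"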
01-25-2016
10:00 AM
Hmmm, weird, the order shouldn't really make a difference. I assume he added a reducer by doing that; that's the only explanation I have. Adding a DISTRIBUTE BY would most likely also have helped. But sorting is good for predicate pushdown, so as long as everything works ... 🙂
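For reference, a minimal sketch of an insert that combines both (table and column names are hypothetical):

hive -e "
-- hypothetical tables and columns, for illustration only
INSERT OVERWRITE TABLE sales_orc PARTITION (sale_date)
SELECT id, amount, sale_date
FROM sales_staging
DISTRIBUTE BY sale_date   -- sends each partition's rows to the same reducer
SORT BY id;               -- sorted rows help ORC predicate pushdown on id
"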
01-14-2016
06:23 PM
1 Kudo
Apart from "apreduce.reduce.java.opts=-Xmx4096m" missing an m (which I don't think is the problem): how many days are you loading? You are essentially doing dynamic partitioning, so the task needs to keep memory for every day (partition) it loads into. If you have a lot of days, this might be the reason. Possible solutions (rough sketches below):

a) Try to load one day and see if that makes it better.
b) Use dynamic sorted partitioning (slide 16); this should theoretically fix the problem if this is the cause.
c) Use manual distribution (slide 19).

http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
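Option (b) boils down to one Hive setting; a minimal sketch, assuming a hypothetical table partitioned by day (hive.optimize.sort.dynamic.partition is the setting behind dynamic sorted partitioning; table and column names are placeholders):

hive -e "
SET hive.exec.dynamic.partition.mode=nonstrict;
-- (b) sorted dynamic partitioning: rows arrive sorted by partition key,
--     so each task keeps only one ORC writer open at a time
SET hive.optimize.sort.dynamic.partition=true;
INSERT INTO TABLE events_orc PARTITION (day)
SELECT id, payload, day FROM events_staging;
"

Option (c), manual distribution, would roughly look like this:

hive -e "
SET hive.exec.dynamic.partition.mode=nonstrict;
-- (c) manual distribution: route each day's rows to its own reducer
INSERT INTO TABLE events_orc PARTITION (day)
SELECT id, payload, day FROM events_staging
DISTRIBUTE BY day;
"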
01-13-2016
12:25 PM
1 Kudo
That is very curious. I have seen lots of stripes being created because of memory problems, but normally the writer only gets down to 5000 rows and then runs out of memory. Which version of Hive are you using? What are your memory settings for the Hive tasks? And if the file is small, is it possible that the table is partitioned and the task is writing into a large number of partitions at the same time? Can you share the LOAD command and the table layout?
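If it helps, a sketch of how to pull that information together (the table name and ORC file path are placeholders, and hive --orcfiledump assumes your Hive version ships the ORC file dump service):

hive -e "SHOW CREATE TABLE mytable;"       # table layout, including partition columns
hive -e "DESCRIBE FORMATTED mytable;"      # storage format and table properties
hive --orcfiledump /apps/hive/warehouse/mytable/000000_0   # stripe count and sizes of one ORC file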
01-11-2016
05:32 PM
ah nice undercover magic. I will try and see what happens if I switch the active off.
01-11-2016
05:29 PM
I have seen the question for HA NameNodes, however HA Resource Managers still confuse me. In Hue, for example, you are told to add a second resource manager entry with the same logical name, i.e. Hue supports adding two resource manager URLs and will manually try both.

How does that work in Falcon? How can I enter an HA Resource Manager entry into the interfaces of the cluster entity document? For NameNode HA I would use the logical name, and the program would then read the hdfs-site.xml. I have seen the other, similar questions for Oozie, but I am not sure it was answered, or I didn't really understand it.

https://community.hortonworks.com/questions/2740/what-value-should-i-use-for-jobtracker-for-resourc.html

So assuming my active resource manager is mycluster1.com:8050 and the standby is mycluster2.com:8050 ...
01-07-2016
02:05 PM
2 Kudos
You could use a shell action, add the keytab to the Oozie files (file tag), and do the kinit yourself before running the java command. Obviously not that elegant, and you have a keytab sitting somewhere in HDFS, but it should work. I did something similar with a shell action running a Scala program and doing a kinit before (not against Hive, but running kinit and then connecting to HDFS).

Ceterum censeo, I would always suggest using a Hive server with LDAP/PAM authentication. Beeline and the hive2 action have a password file option now, and it makes life so much easier. As a database guy, Kerberos for a JDBC connection just always makes problems. Here is the Oozie shell action, by the way:

<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>runJavaCommand.sh</exec>
<file>${nameNode}/scripts/runJavaCommand.sh#runJavaCommand.sh</file>
<file>${nameNode}/securelocation/user.keytab#user.keytab</file>
</shell>
Then just add a kinit into the script before running Java:
kinit -kt user.keytab user@EXAMPLE.COM
java org.apache.myprogram
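A minimal sketch of what runJavaCommand.sh could look like, using the keytab and principal from the example above (the jar name and classpath handling are placeholders):

#!/bin/bash
# sketch only - adjust names and classpath for your environment
set -e

# obtain a Kerberos ticket from the keytab shipped via the <file> tag
kinit -kt user.keytab user@EXAMPLE.COM

# run the program; myprogram.jar is a placeholder
java -cp myprogram.jar:$(hadoop classpath) org.apache.myprogram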
01-04-2016
01:59 PM
It looks like a very useful command for debugging. Never used it before. Shame it seems to be broken.