Created 11-11-2016 01:47 PM
I am running an insert into table ... select from ... query on Hive. Whether I set the execution engine to Tez or MR, I get BlockMissingException errors. They all look similar to this one:
Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741827_1003 file=/hdp/apps/2.5.0.0-1245/tez/tez.tar.gz
When I go into HDFS, the files are there; they exist. So I thought maybe it was a permissions issue, but all my related proxyusers are set to hosts=* and groups=* just to rule that out.
I have a 2.5 cluster hosted on Ubuntu 12.04.
Can anyone point me in a direction of what I might be missing here?
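As an aside, the block ID and file path can be pulled straight out of that exception message and fed to `hdfs fsck` to see which DataNodes (if any) hold the block. This is just a sketch of the text extraction; the `hdfs fsck` call itself is left as a comment since it needs a live cluster:

```shell
# Extract the block ID and file path from the exception text shown above.
msg='Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741827_1003 file=/hdp/apps/2.5.0.0-1245/tez/tez.tar.gz'
blk=$(echo "$msg" | grep -o 'blk_[0-9_]*')
path=$(echo "$msg" | sed -n 's/.*file=//p')
echo "$blk"    # blk_1073741827_1003
echo "$path"   # /hdp/apps/2.5.0.0-1245/tez/tez.tar.gz

# On the cluster you could then run, for example:
#   hdfs fsck "$path" -files -blocks -locations
```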
Created 11-11-2016 02:49 PM
It turns out the permissions issues were ones I had created myself while troubleshooting, because the original issue is back. It is this one:
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741947_1125 file=/tmp/hive/hiveuser/_tez_session_dir/34896acf-3209-4aa0-a244-5a28b5b15b92/hive-hcatalog-core.jar
So that is my real question, I guess!
Created 11-11-2016 06:00 PM
Is there an HDFS Balancer process running?
Created 11-11-2016 06:14 PM
No, there isn't. That would have been nice, wouldn't it?
Created 11-14-2016 02:49 PM
I finally figured this out and thought it would be friendly of me to post the solution. One of those problems where, when you finally get it, you think, "Ugh, that was so obvious." One important note: if you are having trouble with Hive, make sure to check the YARN logs too!
My solution to this and so many other issues was ensuring all my nodes had all the other nodes' IP addresses in their hosts files. This ensures Ambari picks up the correct IP for each hostname.
I am on Ubuntu so I did the following:
$ vim /etc/hosts
And then the file came out looking like this:
127.0.0.1 localhost
#127.0.1.1 ambarihost.com ambarihost

# Assigning static IP here so ambari gets it right
192.168.0.20 ambarihost.com ambarihost

#Other hadoop nodes
192.168.0.21 kafkahost.com kafkahost
192.168.0.22 hdfshost.com hdfshost
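To double-check a node after editing, you can verify that each hostname maps to its static IP rather than a loopback address. A minimal sketch of that check, run here against a temp copy of the file above (on a real node, point HOSTS_FILE at /etc/hosts instead):

```shell
# Build a temp copy of the hosts file from the answer above (demo only;
# on a real node set HOSTS_FILE=/etc/hosts).
HOSTS_FILE=$(mktemp)
cat > "$HOSTS_FILE" <<'EOF'
127.0.0.1 localhost
#127.0.1.1 ambarihost.com ambarihost
192.168.0.20 ambarihost.com ambarihost
192.168.0.21 kafkahost.com kafkahost
192.168.0.22 hdfshost.com hdfshost
EOF

# Print the first non-comment IP listed for a hostname.
resolve() {
  awk -v h="$1" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }' "$HOSTS_FILE"
}

resolve ambarihost.com   # expect 192.168.0.20, not 127.0.1.1
resolve kafkahost.com    # expect 192.168.0.21
resolve hdfshost.com     # expect 192.168.0.22
```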