Created on 01-19-2017 06:00 AM - edited 09-16-2022 03:55 AM
I created a Cloudera cluster for Impala.
If I run the following command via impala-shell:
create table test as select 1;
The following error is returned:
WARNINGS: Failed to open HDFS file for writing: hdfs://[DNhostname]:8020/user/hive/warehouse/test/_impala_insert_staging/[...]/[...].0. Error(255): Unknown error 255
However, if I run:
create table test (testcol int); insert into test select 1;
...The table is created and data is inserted without a hitch.
Any ideas why the first statement fails while the second pair succeeds, and what I could do to fix it? I may have messed something up with directory permissions, either locally or on HDFS, though I've set 'dfs.permissions' to false to turn off HDFS permission checking. I don't know what to check on the local folders to ensure the correct user(s) have the right permissions. In either case, I don't see why permissions would cause the 'CREATE TABLE AS SELECT' statement to fail but not the 'CREATE' + 'INSERT'.
I should also mention that 'DNhostname' is the hostname of the HDFS datanode/impala daemon that I'm SSHed into, not the hostname of the namenode. This worries me because 'DNhostname' was originally where my namenode was located; I moved it to a different host for reasons outside the scope of this question. Is it possible that 'CREATE TABLE AS SELECT' is still expecting the namenode to be 'DNhostname' for some reason?
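For what it's worth, this is roughly how I've been checking ownership and permissions on the warehouse directories (I'm not certain these are the right things to look at; the paths come from the error above):
hdfs dfs -ls /user/hive/warehouse
hdfs dfs -ls -R /user/hive/warehouse/test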
Created 01-19-2017 09:05 AM
Was the Hive Metastore updated after moving the namenode? The Hive Metastore database maintains the locations of tables and databases, so after a namenode move it is necessary to update those locations in HMS.
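A sketch of one way to do this with the Hive metatool (assuming the tool is available on the metastore host; the bracketed hostnames are placeholders):
# list the filesystem root(s) currently recorded in HMS
hive --service metatool -listFSRoot
# rewrite the old namenode URI to the new one in HMS
hive --service metatool -updateLocation hdfs://[NNhostname]:8020 hdfs://[DNhostname]:8020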
Created 01-19-2017 07:19 AM
This has nothing to do with Impala. If I am correct, the query "create table test as select 1" will not work in most databases (at least 95% of them), because "select 1" returns its data under the column name '1', and that is not a valid column name:
create table test (1 int); -- invalid column name
Also, I supplied the 'int' data type myself; "select 1" does not return a datatype either. As everyone knows, a column name and a data type are both mandatory to create a table, but "select 1" returns neither a valid column name nor a datatype.
But the query below will work, because it gets the column names and datatypes from the base table:
create table db.table2 as select * from db.table1
Hope this helps!
Thanks
Kumar
Created on 01-19-2017 08:28 AM - edited 01-19-2017 08:28 AM
After messing around a little more, I found something strange.
If I switch to a different database before running the 'create table as select' statement, the table is created and populated without error:
create database testdb;
use testdb; create table test as select 1;
Created on 01-19-2017 08:58 AM - edited 01-19-2017 09:03 AM
I think I've found the problem but I'm unsure how to best fix it.
If I run:
desc database default;
...the 'location' property shows an hdfs:// URL pointing at the datanode, not the namenode. On my second, working cluster, the 'location' property shows the URL of the namenode.
The problem is that there doesn't appear to be an ALTER DATABASE statement. How can I change the location property of the default database? It already contains many internal and external tables; I could back them up and recreate the default database from scratch, but if there's a way to modify the location property directly, that would be great.
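The only other option I can think of is editing the metastore database directly; a hypothetical sketch, assuming the standard Hive metastore schema where the DBS table carries a DB_LOCATION_URI column ([NNhostname] is a placeholder, and I'd back up the metastore before trying anything like this):
-- run against the Hive metastore database itself, not Impala
UPDATE DBS
SET DB_LOCATION_URI = 'hdfs://[NNhostname]:8020/user/hive/warehouse'
WHERE NAME = 'default';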
Created 01-19-2017 02:50 PM
Are you running with HDFS HA (i.e., a setup with an active and a standby NameNode)? It is a known issue that Impala CTAS may not work with HDFS HA. As a workaround, CREATE + INSERT should work.
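A quick sanity check: see which filesystem URI the cluster is actually configured with. Under HA this should print a nameservice ID (e.g. hdfs://nameservice1) rather than a single host:
hdfs getconf -confKey fs.defaultFS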