Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

No files matching path error while loading data in Hive table

avatar
Super Collaborator

Using Beeline running in the HDP 2.5 TP Sandbox and created a table, say, json_staging and then issuing the command:

LOAD DATA LOCAL INPATH '/root/hbase_data.json' INTO TABLE json_staging;

It fails with the message

Error: Error while compiling statement: FAILED: SemanticException Line 1:23 Invalid path "/root/hbase_data.json": No files matching path file:/root/hbase_data.json (state=42000,code=40000)

The file does exist at the specified location in the Linux file system and all users have read permission.

The workaround is to first copy the file to /home/hive directory. Then I can specify the file as 'hbase_data.json' and it is loaded successfully. It fails if the file is anywhere else. Can anyone please explain the reason?

1 ACCEPTED SOLUTION

avatar
Master Guru

Hmmm access issue? If its in the root folder only root can access it. Hive error messages can be pretty generic here and not distinguish between access rights and files exists. If you run beeline the read will be executed by the hive server and that one is running under the hive user so only he could access the data.

Try it in /tmp if it works there too you would know.

View solution in original post

6 REPLIES 6

avatar
Master Guru

Hmmm access issue? If its in the root folder only root can access it. Hive error messages can be pretty generic here and not distinguish between access rights and files exists. If you run beeline the read will be executed by the hive server and that one is running under the hive user so only he could access the data.

Try it in /tmp if it works there too you would know.

avatar
Super Collaborator

I guess I got confused by the error message, thank you @Benjamin Leonhardi

avatar
Super Guru

The file needs to be owned by hive:hdfs or the user that has proper hive privileges. It can be located anywhere.

avatar
Super Collaborator

thank you so much @Constantin Stanca

avatar

Hi

I am not getting any error but the data which has loaded in table is showing null values:-

load data inpath '/priyanka/txn' into table transaction; No rows affected (0.568 seconds)

select * from transaction; +-----------------+-----------------+---------------------+----------------------+-------------------+-------------------+--+ | transaction.sr | transaction.id | transaction.amount | transaction.product | transaction.city | transaction.date | +-----------------+-----------------+---------------------+----------------------+-------------------+-------------------+--+ | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | NULL | +-----------------+-----------------+---------------------+----------------------+-------------------+-------------------+--+

While placing the file in /tmp, it is again showing all the values as NULL @Benjamin Leonhardi

avatar
Cloudera Employee

Another possible reason is that the file is not existing on the node where Hive server is running. The file needs to be present on the node where Hive server is running. Sometimes Hive server might not be running on the same node from where you are invoking the beeline client.