Member since 12-19-2014
01-26-2015 07:56 AM
1 Kudo
Just wanted to finalize this thread. I was able to successfully reference a file in HDFS for the data load, so that seems like a good option in place of LOCAL. Not sure what you wanted to do about the nuance, but maybe an indicator in the 'file not found' error message noting that beeline won't accept LOCAL unless the beeline client is running on the HiveServer2 node? (At my level of understanding, I'd imagine most beeline clients won't be running on the same node as the HiveServer2.)

"....file not found. NOTE: LOCAL is not supported in beeline unless the beeline client is running on the same node as the HIVESERVER2. A workaround is to load the file to HDFS and then load from there."

Just a thought. Thanks for your help and prompt response.

cloud@c-192-199-76-8:~> hadoop fs -ls /tmp
Found 7 items
drwxrwxrwx   - hdfs   supergroup           0 2015-01-26 14:54 /tmp/.cloudera_health_monitoring_canary_files
drwxr-xr-x   - cloud  supergroup           0 2015-01-22 21:47 /tmp/hive-cloud
drwxrwxrwx   - hive   supergroup           0 2014-10-28 21:34 /tmp/hive-hive
-rw-r--r--   3 meee   meee        1180101268 2015-01-23 19:52 /tmp/hourly_TEMP_2014.csv
drwxr-xr-x   - hdfs   supergroup           0 2014-11-07 18:06 /tmp/input
drwxrwxrwt   - mapred hadoop               0 2014-11-19 16:45 /tmp/logs
drwxr-xr-x   - hdfs   supergroup           0 2014-11-07 18:33 /tmp/output

cloud@c-10-206-76-8:~> beeline -u jdbc:hive2://c-192-199-76-8.int.cis.trcloud:10000/default --verbose=true -n meee
issuing: !connect jdbc:hive2://c-192-199-76-8.int.cis.trcloud:10000/default meee''
scan complete in 3ms
Connecting to jdbc:hive2://c-192-199-76-8.int.cis.trcloud:10000/default
Connected to: Apache Hive (version 0.13.1-cdh5.2.0)
Driver: Hive JDBC (version 0.13.1-cdh5.2.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1-cdh5.2.0 by Apache Hive
0: jdbc:hive2://c-192-199-76-8.int.cis.trcloud> load data inpath '/tmp/hourly_TEMP_2014.csv' into table temps_txt;
No rows affected (0.556 seconds)
0: jdbc:hive2://c-192-199-76-8.int.cis.trcloud> select avg(degrees) from temps_txt;
+--------------------+--+
|        _c0         |
+--------------------+--+
| 56.87016100866962  |
+--------------------+--+
1 row selected (77.389 seconds)
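For anyone landing here later, the workaround boils down to two steps. This is a sketch reusing the table and file names from this thread (adjust paths and hostnames for your own cluster; these commands need a live Hadoop/Hive setup):

```
# 1. Copy the local file into HDFS (run on any node with a hadoop client)
hadoop fs -put /home/cloud/hourly_TEMP_2014.csv /tmp/

# 2. In beeline, load from HDFS -- note: no LOCAL keyword. LOAD DATA INPATH
#    *moves* the file from /tmp into the table's warehouse directory.
#    load data inpath '/tmp/hourly_TEMP_2014.csv' into table temps_txt;
```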
01-23-2015 01:06 PM
1 Kudo
Thanks Szehon. The file is not on the HiveServer2 machine. I will try moving the file into HDFS and then loading from there to see if that works instead. It might be that the "LOCAL" option simply isn't available to beeline clients that aren't running on the HiveServer2 host.
01-23-2015 08:15 AM
Good morning. After my local Hadoop User Group meeting last night, I decided to switch over from the native "hive" shell to "beeline." I can't remember the exact reasons, but the wonderful speaker made a point of saying users needed to back away from the native "hive" shell for some very good reasons that I've forgotten after three beers last night.
In any case, I took the advice and fired up beeline this morning. Everything seems to be working well, but when trying to load data, I get an "Invalid Path" error. Below you can see that when I don't fully qualify the file name, the working directory resolves to where the process properties and such are stored. That's fine.
Connected to: Apache Hive (version 0.13.1-cdh5.2.0)
Driver: Hive JDBC (version 0.13.1-cdh5.2.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1-cdh5.2.0 by Apache Hive
0: jdbc:hive2://c-10-206-76-8.int.cis.trcloud> load data local inpath 'hourly_TEMP_2014.csv' into table temps_txt;
Error: Error while compiling statement: FAILED: SemanticException Line 1:23 Invalid path ''hourly_TEMP_2014.csv'': No files matching path file:/var/run/cloudera-scm-agent/process/24-hive-HIVESERVER2/hourly_TEMP_2014.csv (state=42000,code=40000)
This is fine, I can fully qualify the file. However, even when I do that, I'm still told the file is not found.
0: jdbc:hive2://c-10-206-76-8.int.cis.trcloud> load data local inpath '/home/cloud/hourly_TEMP_2014.csv' into table temps_txt;
Error: Error while compiling statement: FAILED: SemanticException Line 1:23 Invalid path ''/home/cloud/hourly_TEMP_2014.csv'': No files matching path file:/home/cloud/hourly_TEMP_2014.csv (state=42000,code=40000)
0: jdbc:hive2://c-10-206-76-8.int.cis.trcloud> !quit
Closing: 0: jdbc:hive2://c-10-206-76-8.int.cis.trcloud:10000/default
cloud@c-10-206-76-8:~> ls -l /home/cloud/* | grep TEMP
-rw-rw-r-- 1 cloud cloud 1180101268 Jan 22 21:28 /home/cloud/hourly_TEMP_2014.csv
When I issue these commands via the "hive" shell, the file location resolves fine - both relatively and fully qualified. I'm going to upgrade my small cluster to CDH 5.3.0 to see if the Hive version + backports change the behavior, but figured I'd post this to see if anyone has seen this issue with the 5.2.0 release.
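The error messages here are consistent with LOCAL paths being resolved by the HiveServer2 process rather than by the beeline client: the relative name turned into a file: URL under /var/run/cloudera-scm-agent/process/24-hive-HIVESERVER2, i.e. the server's working directory. A tiny sketch of that resolution rule (plain Python, nothing Hive-specific; the function name and server cwd are illustrative, not an actual Hive API):

```python
import os

def resolve_local_inpath(path, server_cwd):
    """Mimic how a LOCAL inpath appears to be resolved: relative names are
    joined against the *server* process's working directory, not the
    client's, and absolute names are still checked on the server's disk."""
    if os.path.isabs(path):
        return "file:" + path
    return "file:" + os.path.join(server_cwd, path)

# Relative name -> resolved under the server's cwd (path from the error above)
print(resolve_local_inpath(
    "hourly_TEMP_2014.csv",
    "/var/run/cloudera-scm-agent/process/24-hive-HIVESERVER2"))
# -> file:/var/run/cloudera-scm-agent/process/24-hive-HIVESERVER2/hourly_TEMP_2014.csv

# Absolute name -> unchanged, but looked up on the server's filesystem
print(resolve_local_inpath("/home/cloud/hourly_TEMP_2014.csv", "/any/cwd"))
# -> file:/home/cloud/hourly_TEMP_2014.csv
```

Either way, the file has to exist on the HiveServer2 host, which is why the HDFS workaround sidesteps the problem entirely.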
Thanks for your time.
(Oh, also in the "Labels" section of this forum, there are only CDH 4.x options to choose from and it's a required field. So I selected 4.6.x, even though this is related to CDH 5.2.0. Just thought I'd note that. Or it could be that I'm in the wrong area. Wouldn't be the first time.)
Labels:
- Apache Hive
12-19-2014 12:16 PM
For various reasons I'm too embarrassed to talk about, we've run into this a few times with dev clusters in our private cloud and DNS getting mangled. We've found that if the python script that smark provided works, the DNS is set up correctly and the agent will start. Thanks smark for sharing.

python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
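That one-liner is Python 2 (print statement). For anyone on a newer box, an equivalent Python 3 sketch of the same DNS sanity check is:

```python
import socket

# Resolve this host's fully qualified domain name, then resolve that name
# back to an IP address. If either step fails, or the answers don't match
# what the cluster expects, the agent's DNS setup is likely mangled.
fqdn = socket.getfqdn()
print(fqdn, socket.gethostbyname(fqdn))
```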