Support Questions

Nayland · ‎11-20-2017

Hello all,

Hive is complaining when I try to import a CSV into a table I created called "stocks". The table is set up as follows:

hive> describe stocks;
OK
exchng string
symbol string
ymd string
price_open float
price_high float
price_low float
price_close float
volume int
price_adj_close float

Then I try to load data from a csv as follows:

hive> load data inpath '/user/data/stocks/stocks.csv'
> overwrite into table human_resources.stocks;

I then get the following error:

Loading data to table human_resources.stocks
Failed with exception Unable to move source hdfs://quickstart.cloudera:8020/user/data/stocks/stocks.csv to destination hdfs://quickstart.cloudera:8020/user/data/stocks/stocks.csv
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
hive> describe table stocks;
FAILED: SemanticException [Error 10001]: Table not found table

I don't think that the file is corrupted. You can see the file in the link below and it's just a normal csv file - in fact, it was provided by the author of the Hive book I'm working through.

http://www.vaughn-s.net/hadoop/stocks.csv

The image file I'm using is cloudera-quickstart-vm-5.10.0-0-vmware; I'm not sure whether I need to update.

AutoIN · ‎11-24-2017

The problem is that you are running the LOAD query with the OVERWRITE option while the source data file (the CSV) sits in the same directory as the table's location. As the error message shows, the source and destination paths are identical:

Unable to move
source hdfs://quickstart.cloudera:8020/user/data/stocks/stocks.csv to
destination hdfs://quickstart.cloudera:8020/user/data/stocks/stocks.csv

The solution is to move the source data file into a different HDFS directory and load the data into the table from there. Alternatively, if the table is newly created, you can leave the OVERWRITE part out of the query.
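As a rough sketch of the first option: the script below assumes a staging directory /user/data/staging (a made-up path; any HDFS directory outside the table's location should do) and that the hdfs and hive clients are on your PATH. The helper names in_table_dir and stage_and_load are only for illustration. The source path and table directory are taken from the error message.

```shell
#!/bin/sh
# Sketch only: STAGING is a hypothetical directory, not something from the original post.

SRC=/user/data/stocks/stocks.csv   # source file, from the error message
TABLE_DIR=/user/data/stocks        # table location, from the error message
STAGING=/user/data/staging         # assumption: any directory outside TABLE_DIR

# Succeeds when the file sits directly inside the given directory.
in_table_dir() {
    [ "$(dirname "$1")" = "${2%/}" ]
}

# Moves the file out of the table directory, then loads it from staging.
stage_and_load() {
    file=$(basename "$SRC")
    hdfs dfs -mkdir -p "$STAGING" &&
    hdfs dfs -mv "$SRC" "$STAGING/" &&
    hive -e "LOAD DATA INPATH '$STAGING/$file' OVERWRITE INTO TABLE human_resources.stocks;"
}

if in_table_dir "$SRC" "$TABLE_DIR"; then
    echo "source is inside the table directory; staging it first"
    # Only attempt the move where the Hadoop and Hive clients are installed.
    if command -v hdfs >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
        stage_and_load
    fi
fi
```

The check compares the parent directory of the source file with the table location, which is the situation the MoveTask error above points to.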

Note: In general, if your data is already in the table's location, you don't need to load it again. You can simply define the table with the EXTERNAL keyword, which leaves the files in place but creates the table definition in the Hive metastore.

Example:

$ cat /tmp/sample.txt
1 a
2 b
3 c

$ hdfs dfs -mkdir /data1
$ hdfs dfs -chown hive:hive /data1
$ hdfs dfs -put /tmp/sample.txt /data1

$ hive
hive> CREATE EXTERNAL TABLE weather6 (col1 INT, col2 STRING)
    > COMMENT 'Employee details'
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
    > STORED AS TEXTFILE
    > LOCATION '/data1';

hive> select * from weather6;
OK
1	a
2	b
3	c
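
Applied to the stocks data from the question, the same idea might look roughly like the sketch below. The table name stocks_ext and the helper stocks_ext_ddl are made up for illustration, the columns are copied from the DESCRIBE output in the question, and the comma delimiter is an assumption based on the file being a CSV.

```shell
#!/bin/sh
# Sketch only: stocks_ext is a hypothetical table name, not one from the original post.

# Prints the DDL for an external table over the directory that already holds stocks.csv.
stocks_ext_ddl() {
cat <<'EOF'
CREATE EXTERNAL TABLE IF NOT EXISTS human_resources.stocks_ext (
  exchng STRING,
  symbol STRING,
  ymd STRING,
  price_open FLOAT,
  price_high FLOAT,
  price_low FLOAT,
  price_close FLOAT,
  volume INT,
  price_adj_close FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/data/stocks';
EOF
}

# Show the statement, and run it only where the hive client is installed.
stocks_ext_ddl
if command -v hive >/dev/null 2>&1; then
    hive -e "$(stocks_ext_ddl)"
fi
```

Because the table is external, dropping it later removes only the metastore entry and leaves stocks.csv where it is.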

Nayland · ‎11-25-2017

Whoops, I had completely overlooked that. Thanks!

Bhagya · ‎06-11-2019

Hi,
While creating a table, I'm getting the following error:

hive> CREATE TABLE ETL.FeedControl
    > (
    > feedName STRING,
    > sourceSystem STRING,
    > fileName STRING,
    > fileFormat STRING,
    > landingPath STRING,
    > rejectThreshold INT,
    > addBusDateCol STRING,
    > header_trailer_Flag STRING,
    > rawZonePath STRING,
    > outputTableName STRING,
    > sourceTableName STRING,
    > dataSourceType STRING
    > )
    > row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    > with serdeproperties (
    > "separatorChar" = ",",
    > "quoteChar"     = "'",
    > "escapeChar"    = "\\"
    > )
    > stored as textfile;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.net.ConnectException Call From localhost/127.0.0.1 to caaspoc64:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused)

Can you please explain what the error means and how to resolve it?
Thanks in advance.