Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

ParseException missing EOF error message with Hive Query

avatar
Explorer

Hello Community,

The Hive script I have created keeps throwing the following error:

Time taken: 2.634 seconds
FAILED: ParseException line 17:2 missing EOF at 'COLUMN' near ')'
18/01/29 10:29:53 [main]: ERROR ql.Driver: FAILED: ParseException line 17:2 missing EOF at 'COLUMN' near ')'
org.apache.hadoop.hive.ql.parse.ParseException: line 17:2 missing EOF at 'COLUMN' near ')'

Can someone please take a look at the Hive script and let me know where I might be going wrong?

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) COLUMN FORMAT DELIMITED FIELDS TERMINATED BY (',') LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'; 
DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(   
    acorn_category int
) COLUMN FORMAT DELIMITED FIELDS TERMINATED BY (',') LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/';
INSERT OVERWRITE TABLE HiveSampleOut
Select 
   acorn_category,
   count(*) as acorn_categorycount 
FROM HiveSampleIn Group by acorn_category

Cheers

Carlton

9 REPLIES 9

avatar
Super Guru

@Carlton Patterson,

Can you try setting it to ROW FORMAT DELIMITED instead of COLUMN FORMAT DELIMITED and try running the query.

Thanks,

Aditya

avatar
Explorer

Aditya,

Thanks for getting in touch,

I change script to the following:

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'; 
TBLPROPERTIES ("skip.header.line.count" = "1");
DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(   
    acorn_category int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/';
TBLPROPERTIES ("skip.header.line.count" = "1");
INSERT OVERWRITE TABLE HiveSampleOut
Select 
   acorn_category,
   count(*) as acorn_categorycount 
FROM HiveSampleIn Group by acorn_category

But I'm now getting the following errors:

Time taken: 2.684 seconds
FAILED: ParseException line 18:46 character '<EOF>' not supported here
18/01/29 11:37:24 [main]: ERROR ql.Driver: FAILED: ParseException line 18:46 character '<EOF>' not supported here
org.apache.hadoop.hive.ql.parse.ParseException: line 18:46 character '<EOF>' not supported here

avatar
Explorer

Hi Aditya,

I tried your suggestion, but now I'm getting the following error:

FAILED: IllegalArgumentException java.net.URISyntaxException: Illegal character in scheme name at index 0: 
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein
18/01/29 12:18:20 [main]: ERROR ql.Driver: FAILED: IllegalArgumentException java.net.URISyntaxException: Illegal character in scheme name at index 0: 
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein
java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 0: 
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein

The new code looks like the following:

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/' 
TBLPROPERTIES ("skip.header.line.count" = "1");
DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(   
    acorn_category int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '
wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/'
TBLPROPERTIES ("skip.header.line.count" = "1");
INSERT OVERWRITE TABLE HiveSampleOut
Select 
   acorn_category,
   count(*) as acorn_categorycount 
FROM HiveSampleIn Group by acorn_category

avatar
Explorer

Hello Community,

I have updated the script as follows:

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/' 
TBLPROPERTIES ("skip.header.line.count" = "1");
DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(   
    acorn_category int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/'
INSERT OVERWRITE TABLE HiveSampleOut
Select 
   acorn_category 
FROM HiveSampleIn Group by acorn_category
TBLPROPERTIES ("skip.header.line.count" = "1");

However, I'm getting the following error message:

Time taken: 0.499 seconds
FAILED: ParseException line 7:0 missing EOF at 'INSERT' near ''wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/''
18/01/29 15:34:09 [main]: ERROR ql.Driver: FAILED: ParseException line 7:0 missing EOF at 'INSERT' near ''wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/''
org.apache.hadoop.hive.ql.parse.ParseException: line 7:0 missing EOF at 'INSERT' near ''wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/''

Any help will be greatly appreciated.

Cheers

avatar
Explorer

Hello Community,

While waiting for some assistance, I have created the the following HQL script, but I'm still have the problem:

DROP TABLE IF EXISTS HiveSampleIn;
CREATE EXTERNAL TABLE HiveSampleIn
(
 anonid INT,
 eprofileclass INT,
 fueltypes STRING,
 acorn_category INT,
 acorn_group STRING,
 acorn_type INT,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/';
tblproperties ("skip.header.line.count"="2");
DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(   
    acorn_category int
) 
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/';
INSERT OVERWRITE TABLE HiveSampleOut
Select 
   acorn_category
FROM HiveSampleIn Group by acorn_category;

I'm getting the following error:

Time taken: 2.663 seconds
FAILED: ParseException line 21:44 missing EOF at ',' near ')'
18/01/29 21:37:15 [main]: ERROR ql.Driver: FAILED: ParseException line 21:44 missing EOF at ',' near ')'
org.apache.hadoop.hive.ql.parse.ParseException: line 21:44 missing EOF at ',' near ')'

Any help with this will be greatly appreciated.

Cheers

Carlton

avatar
Expert Contributor

CREATE EXTERNAL TABLE HiveSampleIn

...

LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'; -- Getting terminated at here

tblproperties ("skip.header.line.count"="2"); -- This might be throwing ParseException

avatar
Explorer

Hi Naresh,

Thanks for getting in touch.

Can you think of any reason why tblproperties ("skip.header.line.count"="2"); is throwing the error? And any suggestions on how to fix it?

Much appreciated.

Carlton

avatar
Expert Contributor

pls try to use below command for creating HiveSampleIn Table

CREATE EXTERNAL TABLE HiveSampleIn
(
 anonid INT,
 eprofileclass INT,
 fueltypes STRING,
 acorn_category INT,
 acorn_group STRING,
 acorn_type INT,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'
tblproperties ("skip.header.line.count"="2");

avatar
New Contributor

This could be because you are parsing actual data in the place of header,supposing your first row has header and second row onwards has data.

Hence it can't parse data(int, string) as header(string).

So try changing it to ("skip.header.line.count"="1");

Hope this helps.