Created 04-02-2018 03:58 AM
I have been trying to upload a Wikipedia clickstream file into a Hive table with no luck.
I am using a recently downloaded HDP Sandbox. After a few smaller file uploads and table creations went fine, I thought I'd go for a big one.
The file is 1.2 GB and tab-separated, and the upload preview wouldn't work with it, so I converted it to comma-separated with double-quoted fields.
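A conversion like this can be sketched in Python as follows. This is only an illustration, not necessarily how the file in this thread was converted: the function name `tsv_to_csv` and the file paths are hypothetical, and it assumes the input is UTF-8 with no quoting rules of its own (quote characters in the data are treated as plain text).

```python
import csv

def tsv_to_csv(src_path, dst_path):
    """Rewrite a tab-separated file as comma-separated with every field double-quoted.

    Streams row by row, so a multi-gigabyte file does not have to fit in memory.
    Returns the number of rows written.
    """
    count = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # QUOTE_NONE: any quote characters in the input are ordinary data.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        # QUOTE_ALL: wrap every output field in double quotes;
        # embedded double quotes are doubled by the csv module.
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
        for row in reader:
            writer.writerow(row)
            count += 1
    return count

# Hypothetical usage (paths are placeholders):
# tsv_to_csv("2016_03_en_clickstream.tsv", "2016_03_en_clickstream.csv")
```

Quoting every field keeps commas inside page titles from being read as column breaks.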
From the Files view, I uploaded the file into a new folder as maria_dev and set permissions to rw-rw-rw-.
In Hive 2.0, I created a new database and tried several times to create a table by uploading from that file. The preview picks up the columns without trouble.
Every time I hit 'create' it goes through the usual steps, then ends with 'deleting live table 2016_03_en_clickstream.csv'.
?? what ??
I get a message "missing translation: uploadFromHdfs" but I think this is a red herring.
The job log shows creating the gibberish-name temp table, then deleting it, then deleting the table.
Can someone clue me in here?
Created 04-02-2018 05:11 PM
Are you trying to create an ORC table in the create table interface? When creating an ORC-formatted table, a temporary external table has to be created first, and then the ORC table can be created from that external table.
Created 04-02-2018 05:21 PM
Yes, it is an ORC table.
Just as with the others, it creates a temp table with a gibberish name, it appears in the table list, then the 'live table' appears in the table list, then it deletes the temp table.
My problem is that it then deletes the 'live' table, with the courteous message 'deleting live table,' leaving me with nothing and no errors except an 'undefined' with no information.
Created 04-02-2018 05:25 PM
Try creating as an external table first, and then create an ORC table. Then read from the external table into the ORC table.
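The three steps above could look roughly like the HiveQL that this small Python helper prints. Treat it as a sketch under assumptions, not a tested recipe for this sandbox: the helper `build_orc_load_statements`, the table names `clickstream_ext` and `clickstream_orc`, the HDFS folder, and the column list are all hypothetical placeholders. It assumes the CSV is comma-separated with double-quoted fields and sits alone in its own HDFS folder, since `LOCATION` points at a directory rather than a file. `OpenCSVSerde` reads every column as a string, so the external table declares only `STRING` columns and the types are applied with `CAST` when loading the ORC table.

```python
def build_orc_load_statements(hdfs_dir, columns,
                              ext_table="clickstream_ext",
                              orc_table="clickstream_orc"):
    """Return HiveQL for: external CSV table, ORC table, and the copy between them.

    columns is a list of (name, hive_type) pairs; all names here are illustrative.
    """
    # External table: OpenCSVSerde exposes every column as STRING.
    ext_cols = ", ".join(f"`{name}` STRING" for name, _ in columns)
    # ORC table: use the real target types.
    orc_cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    # Convert each string column to its target type while copying.
    casts = ", ".join(f"CAST(`{name}` AS {hive_type})" for name, hive_type in columns)

    create_ext = (
        f"CREATE EXTERNAL TABLE {ext_table} ({ext_cols}) "
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' "
        f"STORED AS TEXTFILE LOCATION '{hdfs_dir}'"
    )
    create_orc = f"CREATE TABLE {orc_table} ({orc_cols}) STORED AS ORC"
    load = f"INSERT INTO TABLE {orc_table} SELECT {casts} FROM {ext_table}"
    return [create_ext, create_orc, load]


# Assumed column layout; check it against the actual file before using it.
example_columns = [("prev", "STRING"), ("curr", "STRING"),
                   ("type", "STRING"), ("n", "BIGINT")]
for statement in build_orc_load_statements("/user/maria_dev/clickstream",
                                           example_columns):
    print(statement + ";")
```

Running the printed statements in order from the query editor, instead of the upload wizard, has the advantage that if one step fails, the error is reported for that specific statement.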
Created 04-02-2018 05:42 PM
So is the Hive web interface just too unreliable to use? I don't get it. It's a simple CSV file.