
How does Upload Table in Hive View transfer data?


I used the Upload Table action in the Hive View. It worked fine and uploaded the table. But I was surprised when I analyzed the internal jobs created by this action. There were 4 jobs (roughly sketched after the list):

1. Create target table

2. Create temporary table

3. Copy data from temporary table to target table

4. Delete temporary table
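For context, I would expect jobs 1, 3 and 4 to run statements roughly of the following shape. This is only my reconstruction for illustration, not the exact text the view generated; "geolocation_target" stands in for whatever target table name you chose in the view, and the temporary table name is the one from job 2.

-- 1. create the target table (name and storage format are whatever was chosen in the view)
CREATE TABLE geolocation_target
(truckid STRING, driverid STRING, event STRING, latitude DOUBLE, longitude DOUBLE, city STRING, state STRING, velocity INT, event_ind INT, idling_ind INT);

-- 3. copy rows from the temporary table into the target table
INSERT INTO TABLE geolocation_target SELECT * FROM lpsaabddmakzdzbwxapezrkqcvkcqy;

-- 4. drop the temporary table
DROP TABLE lpsaabddmakzdzbwxapezrkqcvkcqy;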

What was surprising is that none of these jobs pointed to the source file, so it was not clear at which point (and how) the process was reading data from the source file. I expected the temporary table created in step 2 to be an EXTERNAL table pointing to the source file (i.e., with a LOCATION clause); that would make sense. However, that is not what happened. Below is the statement executed by the second job.

CREATE TABLE lpsaabddmakzdzbwxapezrkqcvkcqy 
(truckid STRING, driverid STRING, event STRING, latitude DOUBLE, longitude DOUBLE, city STRING, state STRING, velocity INT, event_ind INT, idling_ind INT) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '1' STORED AS TEXTFILE;

As we can see, it just creates an internal (managed) table, which is not associated with any source file.
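What I expected instead was something along these lines. This is just a sketch to show the idea of an external staging table; the LOCATION path is made up and the delimiter is only an example:

CREATE EXTERNAL TABLE lpsaabddmakzdzbwxapezrkqcvkcqy
(truckid STRING, driverid STRING, event STRING, latitude DOUBLE, longitude DOUBLE, city STRING, state STRING, velocity INT, event_ind INT, idling_ind INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE
LOCATION '/tmp/hive-uploads/geolocation/';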

2 REPLIES

Hi @Dmitry Otblesk, there is a step between your #2 and #3 above where data is inserted into the temporary table from the view. Please see https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.1/bk_ambari-views/content/upload_table.html
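Conceptually, the missing piece is a populate step between the two CREATE TABLE statements, roughly equivalent to the following. This is illustrative only; the view does not necessarily issue this exact HiveQL, and the sample row is made up:

-- populate the temporary table with the rows read from the uploaded file
INSERT INTO TABLE lpsaabddmakzdzbwxapezrkqcvkcqy VALUES
('A1', 'D1', 'Normal', 39.74, -104.98, 'Denver', 'CO', 45, 0, 0);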


@slachterman

>there is a step between your #2 and #3 above where data is inserted

Yes, I assumed there would be such a step, but there was not. There were only 4 jobs in the history, the ones I already listed above. I also clicked the Clear Filters button on the History page to make sure that all jobs were shown.

Another interesting note: the screenshot of a similar process in the Hadoop tutorial also shows only 4 internal jobs, not 5.

So is this step between steps 2 and 3 executed in some other way (not as a job)?