I used the Upload Table action in the Hive view. It worked fine and uploaded the table. But I was surprised when I analyzed the internal jobs created by this action. There were 4 jobs:
1. Create target table
2. Create temporary table
3. Copy data from temporary table to target table (my guess at this statement is sketched right after this list)
4. Delete temporary table
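For reference, my guess at what jobs 3 and 4 execute. Both table names below are placeholders of my own, not what the view actually generates:

-- Job 3 (my guess): copy rows from the temporary table into the target table
-- (both names are placeholders; the real temporary name is auto-generated)
INSERT INTO TABLE target_table SELECT * FROM temp_table;

-- Job 4 (my guess): drop the temporary table
DROP TABLE temp_table;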
What was surprising: none of these jobs pointed to a source file, so it was not clear at which point (and how) the process was reading data from the source file. I had reasonably expected the temporary table created in step 2 to be an EXTERNAL table pointing to the source file (so it would have a LOCATION clause); that would make sense. However, that did not happen. Below is the statement executed by the second job.
CREATE TABLE lpsaabddmakzdzbwxapezrkqcvkcqy (
  truckid STRING,
  driverid STRING,
  event STRING,
  latitude DOUBLE,
  longitude DOUBLE,
  city STRING,
  state STRING,
  velocity INT,
  event_ind INT,
  idling_ind INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '1'
STORED AS TEXTFILE;
As we can see, it just creates a regular internal (managed) table, which is not associated with any source file.
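For illustration, this is roughly the kind of statement I had expected for the temporary table instead. The LOCATION directory and the comma delimiter here are my own assumptions, purely for the sake of example:

-- Hypothetical: what I expected job 2 to look like
CREATE EXTERNAL TABLE lpsaabddmakzdzbwxapezrkqcvkcqy (
  truckid STRING,
  driverid STRING,
  event STRING,
  latitude DOUBLE,
  longitude DOUBLE,
  city STRING,
  state STRING,
  velocity INT,
  event_ind INT,
  idling_ind INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- invented path: a directory where the view would have put the uploaded file
LOCATION '/user/admin/uploads/trucks';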
Hi @Dmitry Otblesk, there is a step between your #2 and #3 above where data is inserted into the temporary table from the view. Please see https://docs.hortonworks.com/HDPDocuments/Ambari-184.108.40.206/bk_ambari-views/content/upload_table.html
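If that step were issued as HiveQL, it might look something like the sketch below (the path is hypothetical). Note that a plain LOAD DATA only moves the file into the table's directory, so it would not necessarily register as a separate job:

-- Hypothetical: loading the uploaded file into the temporary table
-- (the HDFS path is invented; LOAD DATA moves the file rather than
-- launching a MapReduce/Tez job)
LOAD DATA INPATH '/user/admin/uploads/trucks.csv'
INTO TABLE lpsaabddmakzdzbwxapezrkqcvkcqy;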
>there is a step between your #2 and #3 above where data is inserted
Yes, I assumed there would be such a step, but there was not. There were only the 4 jobs in the history that I already listed above. I also clicked the Clear Filters button on the History page to make sure that all jobs were shown.
Another interesting note: the screenshot of a similar process in the Hadoop tutorial also shows only 4 internal jobs, not 5.
So is this step between steps 2 and 3 executed differently somehow (not as a job)?