Member since
07-31-2017
2
Posts
0
Kudos Received
0
Solutions
07-31-2017
02:31 PM
@Bala Vignesh N V
I loaded the external table using the following query _____________________________________________________________________________________ drop table if exists table1; CREATE EXTERNAL TABLE table1 (col1 String,col2 String,col3 String,col4 String,col5 String,col6 String,col7 String) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' lines terminated by '\n' STORED AS TEXTFILE LOCATION '/user/maria_dev/'; select * from table1; The count of the rows for this table is 4,32,65,807. _____________________________________________________________________________ Then I loaded another table with the same query as above. This table has 3,79,405 rows. Now , i joined the two queries using the following query SELECT distinct tb1.col1,tb1.col2,tb1.col3,tb1.col4,tb1.col5,tb1.col6,
tb2.col1,tb2.col2,tb2.col3,tb2.col4,tb2.col5,tb2.col6 FROM table1 tb1 LEFT OUTER JOIN table2 tb2
ON (tb1.col1 = tb2.col1); The count for this comes out to be 65618892. But when i save this query in a external table as follows the count of rows come out to be 11520062. ____________________________________________________________________________________ CREATE EXTERNAL TABLE resultant (col1 STRING,col2 STRING,col3 STRING,col4 STRING,col5 STRING, col6 STRING,col7 STRING,col8 STRING,col9 STRING,col10 STRING,col11 STRING, col12 STRING); INSERT OVERWRITE TABLE resultant SELECT distinct tb1.col1,tb1.col2,tb1.col3,tb1.col4,tb1.col5,tb1.col6, tb2.col1,tb2.col2,tb2.col3,tb2.col4,tb2.col5,tb2.col6 FROM table1 tb1 LEFT OUTER JOIN table2 tb2 ON (tb1.col1 = tb2.col1); _____________________________________________________________________________________ This is how i loaded the tables. And i used count(*) and show tblproperties yourTableName to calculate the rows.
... View more
07-31-2017
11:52 AM
I am using hortonworks vm for hive computations.I am joining two hive tables and want to store the result in another hive table. When I run my query it returns me 65618892 rows but when i save the same query in the hive external table it only stores 11520062 rows and ends the query successfully without any error. I need help to store all of my data set in the hive external table.
... View more
Labels:
- Labels:
-
Apache Hadoop
-
Apache Hive