Member since: 08-15-2019
Posts: 29
Kudos Received: 111
Solutions: 5
My Accepted Solutions
Views | Posted |
---|---|
1667 | 09-08-2017 11:30 PM |
2612 | 06-08-2017 07:24 PM |
6573 | 03-28-2017 05:20 PM |
3823 | 03-17-2017 04:27 AM |
3279 | 03-09-2017 11:48 PM |
06-08-2017
07:24 PM
6 Kudos
If the query is run from the Hive CLI, the property change should take effect immediately. If Beeline or other clients access HiveServer2, you have to restart the server for the property to be picked up. If the cluster is managed via Ambari, it is best to update the property there, since Ambari will list the services that need a restart.
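As a quick check (a hedged sketch; hive.exec.parallel is just a placeholder property, substitute the one you changed), you can confirm whether a client session actually sees the updated value before deciding a restart is needed:

```sql
-- In Beeline, print the value the current session sees; if it still
-- shows the old value after the config change, HiveServer2 has not
-- picked it up and needs a restart.
SET hive.exec.parallel;

-- A session-level override applies only to this session and does not
-- require a server restart:
SET hive.exec.parallel=true;
```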
05-02-2017
06:01 PM
1 Kudo
UPDATE is supported for ACID tables in Hive: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update
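A minimal sketch of the ACID requirements (table and column names here are made up for illustration): on Hive 1.x/2.x the table must be stored as ORC, bucketed, and marked transactional before UPDATE will work:

```sql
-- ACID DML requires a transactional table (ORC; bucketing is also
-- required on Hive 1.x/2.x)
CREATE TABLE customers (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

UPDATE customers SET name = 'renamed' WHERE id = 42;
```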
04-28-2017
11:12 PM
12 Kudos
Hive has an improved explain plan when hive.explain.user is set to true. The indentation and arrows now clearly depict the execution flow. Retrieving the join order used to require a lot of back-and-forth references in the old format; now it is as easy as following the arrows, even with the details of the mappers and reducers included.
Let us take a look at a sample explain plan and retrieve the join order.
Plan optimized by CBO.
Vertex dependency in root stage
Map 2 <- Map 1 (BROADCAST_EDGE), Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
Reducer 3 <- Map 2 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 3 vectorized, llap
File Output Operator [FS_101]
Group By Operator [GBY_100] (rows=1 width=736)
Output:["_col0","_col1","_col2","_col3"],aggregations:["min(VALUE._col0)","min(VALUE._col1)","min(VALUE._col2)","min(VALUE._col3)"]
<-Map 2 [SIMPLE_EDGE] llap
SHUFFLE [RS_45]
Group By Operator [GBY_44] (rows=1 width=736)
Output:["_col0","_col1","_col2","_col3"],aggregations:["min(_col1)","min(_col3)","min(_col14)","min(_col19)"]
Map Join Operator [MAPJOIN_92] (rows=80573328660391072 width=114)
Conds:MAPJOIN_91._col11, _col7=RS_41._col0, _col0(Inner),Output:["_col1","_col3","_col14","_col19"]
<-Map 9 [BROADCAST_EDGE] llap
BROADCAST [RS_41]
PartitionCols:_col0, _col0
Select Operator [SEL_23] (rows=1960358 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_86] (rows=1960358 width=104)
predicate:id is not null
TableScan [TS_21] (rows=1960358 width=104)
default@title,t,Tbl:COMPLETE,Col:NONE,Output:["id","title"]
<-Map Join Operator [MAPJOIN_91] (rows=73248479012737488 width=125)
Conds:MAPJOIN_90._col10=RS_38._col0(Inner),Output:["_col1","_col3","_col7","_col11","_col14"]
<-Map 8 [BROADCAST_EDGE] llap
BROADCAST [RS_38]
PartitionCols:_col0
Select Operator [SEL_20] (rows=2 width=40)
Output:["_col0"]
Filter Operator [FIL_85] (rows=2 width=40)
predicate:((UDFToString(role) = 'actress') and id is not null)
TableScan [TS_18] (rows=4 width=40)
default@role_type,rt,Tbl:COMPLETE,Col:NONE,Output:["id","role"]
<-Map Join Operator [MAPJOIN_90] (rows=66589524931926816 width=138)
Conds:MAPJOIN_89._col0, _col6=RS_35._col0, _col0(Inner),Output:["_col1","_col3","_col7","_col10","_col11","_col14"]
<-Map 7 [BROADCAST_EDGE] llap
BROADCAST [RS_35]
PartitionCols:_col0, _col0
Select Operator [SEL_17] (rows=764741 width=105)
Output:["_col0","_col1"]
Filter Operator [FIL_84] (rows=764741 width=105)
predicate:((UDFToString(gender) = 'f') and (name like '%An%') and id is not null)
TableScan [TS_15] (rows=3058966 width=105)
default@name,n,Tbl:COMPLETE,Col:NONE,Output:["id","name","gender"]
<-Map Join Operator [MAPJOIN_89] (rows=60535930444216152 width=152)
Conds:MAPJOIN_88._col7, _col4=RS_32._col0, _col1(Inner),Output:["_col0","_col1","_col3","_col6","_col7","_col10","_col11"]
<-Map 6 [BROADCAST_EDGE] llap
BROADCAST [RS_32]
PartitionCols:_col0, _col1
Select Operator [SEL_14] (rows=11636372 width=8)
Output:["_col0","_col1"]
Filter Operator [FIL_83] (rows=11636372 width=8)
predicate:(movie_id is not null and company_id is not null)
TableScan [TS_12] (rows=11636372 width=8)
default@movie_companies,mc,Tbl:COMPLETE,Col:NONE,Output:["movie_id","company_id"]
<-Map Join Operator [MAPJOIN_88] (rows=55032662847395656 width=167)
Conds:MAPJOIN_87._col0, _col2=RS_29._col0, _col2(Inner),Output:["_col0","_col1","_col3","_col4","_col6","_col7","_col10"]
<-Map 4 [BROADCAST_EDGE] llap
BROADCAST [RS_29]
PartitionCols:_col0, _col2
Select Operator [SEL_8] (rows=6110483 width=116)
Output:["_col0","_col1","_col2","_col4"]
Filter Operator [FIL_81] (rows=6110483 width=116)
predicate:((note) IN ('(voice)', '(voice: Japanese version)', '(voice) (uncredited)', '(voice: English version)') and person_id is not null and person_role_id is not null and movie_id is not null and role_id is not null)
TableScan [TS_6] (rows=12220966 width=116)
default@cast_info,ci,Tbl:COMPLETE,Col:NONE,Output:["person_id","movie_id","person_role_id","note","role_id"]
<-Map Join Operator [MAPJOIN_87] (rows=50029692413268030 width=184)
Conds:(Inner),(Inner),Output:["_col0","_col1","_col2","_col3","_col4"]
<-Map 1 [BROADCAST_EDGE] llap
BROADCAST [RS_24]
Select Operator [SEL_2] (rows=701938 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_79] (rows=701938 width=104)
predicate:person_id is not null
TableScan [TS_0] (rows=701938 width=104)
default@aka_name,an,Tbl:COMPLETE,Col:NONE,Output:["person_id","name"]
<-Map 5 [BROADCAST_EDGE] llap
BROADCAST [RS_26]
Select Operator [SEL_11] (rows=34365 width=259)
Output:["_col0"]
Filter Operator [FIL_82] (rows=34365 width=259)
predicate:((UDFToString(country_code) = '[us]') and id is not null)
TableScan [TS_9] (rows=68731 width=259)
default@company_name,cn,Tbl:COMPLETE,Col:NONE,Output:["id","country_code"]
<-Select Operator [SEL_5] (rows=2074019 width=104)
Output:["_col0","_col1"]
Filter Operator [FIL_80] (rows=2074019 width=104)
predicate:id is not null
TableScan [TS_3] (rows=2074019 width=104)
default@char_name,chn,Tbl:COMPLETE,Col:NONE,Output:["id","name"]
1. Start reading from the bottom
2. Look for the keyword 'TableScan'; the line below it has the details of the table being joined
3. Look for the keyword 'Join Operator'; the indentation of this line tells us which tables are part of this join
4. The joined table group forms a candidate for another join if the line starts with '<-'
5. Keep moving toward the top in the same way to get the complete join order
The join order in the above example boils down to (title(t),(((movie_companies(mc),(cast_info(ci),(company_name(cn),aka_name(an),char_name(chn)))),name(n)),role_type(rt)))
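The table-collection part of the steps above can be sketched as a small script (an illustrative sketch, not part of Hive; it assumes each 'TableScan' line is followed by a 'default@<table>,<alias>,...' detail line, as in the plan above):

```python
import re

def scan_tables(plan_text):
    """Collect (table, alias) pairs from a Hive explain plan, bottom-up.

    Assumes the textual format shown above: every 'TableScan' line is
    followed by a detail line like 'default@title,t,Tbl:COMPLETE,...'.
    """
    lines = plan_text.splitlines()
    tables = []
    for i, line in enumerate(lines):
        if "TableScan" in line and i + 1 < len(lines):
            # db@table,alias, ... -> capture table and alias
            m = re.match(r"\s*\w+@(\w+),(\w+),", lines[i + 1])
            if m:
                tables.append((m.group(1), m.group(2)))
    # Step 1 says to read from the bottom, so reverse the scan order.
    return list(reversed(tables))
```

Feeding it the two outermost scans of the plan above, for example, yields the aliases in bottom-up order; working out the actual join nesting still needs the indentation of each 'Join Operator' line, which this sketch deliberately leaves to the reader.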
Labels:
04-28-2017
05:39 PM
1 Kudo
@Funamizu Koshi Could you please post the full stack trace of the error message from hivemetastore.log? The default location is /var/log/hive
04-28-2017
05:10 PM
7 Kudos
It looks like the data was removed from HDFS, but no DROP PARTITION query was executed on the Hive side. The 'show partitions <table_name>' command retrieves its data from the Hive metastore database, which holds the information about table partitions but not the actual data, which is stored in HDFS. As for the data missing from HDFS, it might be worth checking hdfs-audit.log to trace the lifecycle of the directory/file.
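To bring the metastore back in sync with HDFS (a sketch; 'mytable' and the partition spec are placeholders), the stale partition can be dropped explicitly, or on Hive 3.x MSCK can repair the metadata in bulk:

```sql
-- Remove the stale partition entry from the metastore
ALTER TABLE mytable DROP IF EXISTS PARTITION (dt='2017-04-28');

-- On Hive 3.x, MSCK can also drop partitions whose HDFS
-- directories no longer exist
MSCK REPAIR TABLE mytable DROP PARTITIONS;
```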
03-28-2017
05:20 PM
10 Kudos
@dvt isoft I am not sure which client you are using to connect. It might be worth checking whether the same connection (and therefore the same session) is used to run successive queries, or whether the client establishes a new connection for each query. Otherwise, the Hive ODBC driver is able to run SELECT queries on temporary tables, and as you already mentioned, you don't need to enable it explicitly starting with Hive 0.14.
03-17-2017
04:27 AM
9 Kudos
Check "Number of nodes used by Hive's LLAP" in the Hive configs (Advanced) section. From the error it looks like it is set to 4, in which case 4 NodeManagers should be running in the cluster, but it looks like the cluster has only 3.
03-09-2017
11:48 PM
7 Kudos
In the View Config, try setting sslTrustStore=/var/log/ambari-server/hive-next-view/hive.jks;trustStorePassword=******** as a Hive Session param.
03-02-2017
11:09 PM
2 Kudos
To get the number of rows in the result set you could use this query: SELECT count(*) FROM (SELECT t1.c1, t1.c2 FROM t1 WHERE NOT (t1.c1 IS NULL)) AS table_1;
02-28-2017
05:35 AM
2 Kudos
Another way to back up/restore specific Hive tables is to use 'show create table' to back up the DDL, which can later be used to recreate the table. The saved HDFS files can then be copied back into the warehouse directory to finish the table restore. beeline -u jdbc:hive2://<cluster-name>:<port#>/<db-name> --outputformat=csv -e "show create table <table-name>;" > <table_name>.ddl
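The restore side might look like this (a sketch, not a tested procedure; the backup path, warehouse path, and table name are placeholders, and the commands assume a live cluster):

```shell
# Recreate the table from the saved DDL
beeline -u "jdbc:hive2://<cluster-name>:<port#>/<db-name>" -f my_table.ddl

# Copy the saved data files back under the table's warehouse directory
hdfs dfs -put /backup/my_table/* /apps/hive/warehouse/<db-name>.db/my_table/

# For a partitioned table, register the restored partitions in the metastore
beeline -u "jdbc:hive2://<cluster-name>:<port#>/<db-name>" -e "MSCK REPAIR TABLE my_table;"
```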