01-04-2018
03:26 PM
I am using HDP 2.6.3 in the Hortonworks sandbox and testing data ingestion into a partitioned and bucketed Hive table using the Flume Hive sink. When I stream data to a single partition of the table, it works perfectly and the data is ingested successfully. I also noticed that 2 metastore connections are opened for my 1000 events (the log shows thread: 28, users=2):

04 Jan 2018 04:21:11,068 INFO [hive-sink1-call-runner-0] (org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open:443) - Trying to connect to metastore with URI thrift://sandbox-hdp.hortonworks.com:9083
04 Jan 2018 04:21:11,366 INFO [hive-sink1-call-runner-0] (org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open:539) - Connected to metastore.
04 Jan 2018 04:21:12,397 WARN [hive-sink1-call-runner-0] (org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.acquire:422) - Unexpected increment of user count beyond one: 2 HCatClient: thread: 28 users=2 expired=false closed=false
04 Jan 2018 04:21:13,084 INFO [hive-sink1-call-runner-0] (org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open:443) - Trying to connect to metastore with URI thrift://sandbox-hdp.hortonworks.com:9083
04 Jan 2018 04:21:13,088 INFO [hive-sink1-call-runner-0] (org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open:539) - Connected to metastore.

But when I tried to ingest data into random partitions of the table, transactions started getting aborted after a few ingestions. The 1000 events contain around 70 unique partitions.
I also noticed that there were more than 2 Hive metastore connections:

04 Jan 2018 03:46:19,755 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hive.HiveWriter.abortTxn:355) - Aborting Txn id 60964 on End Point {metaStoreUri='thrift://sandbox-hdp.hortonworks.com:9083', database='default', table='table', partitionVals=[201712634] }
04 Jan 2018 03:46:19,773 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hive.HiveWriter.abortTxn:355) - Aborting Txn id 61044 on End Point {metaStoreUri='thrift://sandbox-hdp.hortonworks.com:9083', database='default', table='table', partitionVals=[201712436] }
04 Jan 2018 03:46:19,824 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hive.HiveWriter.abortTxn:355) - Aborting Txn id 61034 on End Point {metaStoreUri='thrift://sandbox-hdp.hortonworks.com:9083', database='default', table='table', partitionVals=[201712657] }
04 Jan 2018 03:46:19,843 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hive.HiveWriter.abortTxn:355) - Aborting Txn id 61223 on End Point {metaStoreUri='thrift://sandbox-hdp.hortonworks.com:9083', database='default', table='table', partitionVals=[201712718] }
04 Jan 2018 04:00:29,016 WARN [hive-sink1-call-runner-0] (org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.acquire:422) - Unexpected increment of user count beyond one: 25 HCatClient: thread: 29 users=25 expired=false closed=false
04 Jan 2018 04:00:29,022 WARN [hive-sink1-call-runner-0] (org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.acquire:422) - Unexpected increment of user count beyond one: 26 HCatClient: thread: 29 users=26 expired=false closed=false

I am not able to understand the issue. Is it caused by the increase in Hive metastore connections? If yes, how can we increase that limit, or is there a workaround?
Secondly, is the issue that access to multiple partitions or multiple buckets locks the table and leads to the transaction aborts? If so, please suggest a solution.
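
To quantify how the aborts are spread across partitions, the HiveWriter.abortTxn messages can be tallied per partition value. The following is a minimal Python sketch, not part of the Flume or Hive tooling; the regular expression is an assumption based only on the abort lines quoted in this post, and the function name count_aborts and the log file argument are hypothetical.

```python
import re
import sys
from collections import Counter

# Pattern inferred from the "Aborting Txn id ..." lines quoted in the post.
# This is an assumption; other Flume/Hive versions may format the line differently.
ABORT_RE = re.compile(
    r"Aborting Txn id (\d+) on End Point \{.*?partitionVals=\[([^\]]*)\]"
)


def count_aborts(log_text):
    """Return a Counter mapping partition value -> number of aborted transactions."""
    counts = Counter()
    for match in ABORT_RE.finditer(log_text):
        partition = match.group(2).strip()
        counts[partition] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_aborts.py /path/to/flume.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for partition, n in count_aborts(fh.read()).most_common():
            print(partition, n)
```

Comparing the number of distinct partitions in this tally with the roughly 70 unique partitions in the input would show whether every partition is affected or only some of them.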