Member since: 12-09-2015
Posts: 106
Kudos Received: 40
Solutions: 20
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1637 | 12-26-2018 08:07 PM
| 1228 | 08-17-2018 06:12 PM
| 749 | 08-09-2018 08:35 PM
| 7954 | 01-03-2018 12:31 AM
| 472 | 11-07-2017 05:53 PM
09-12-2017
09:47 PM
When choosing the number of buckets I would think about your "largest" write, since the parallelism of the write is limited to the bucket count. Note that update/delete (and merge) are writes. From the read side I would optimize the bucket count for reading a fully (major) compacted table, though since ACID tables require a system-defined sort order they do not support SMB joins, and I'm not sure what else can benefit from bucketing. Since the writes vary greatly, I would not worry much about file sizes in delta directories (meaning I don't think there is a good way to get this right) and instead make sure that compaction runs frequently enough to mitigate this. hive.tez.dynamic.semijoin.reduction should be available in Hive 2 on 2.6.1. The implementation of Merge uses a Right Outer Join (if you have an Insert clause), and semijoin reduction is designed to help this when the inner side (the target of the merge) is large.
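As a purely hypothetical sketch (the table, columns and bucket count are made up for illustration), the two pieces would look something like this:

-- bucket count sized for the largest expected write; table and columns are placeholders
CREATE TABLE target_tbl (id BIGINT, val STRING)
CLUSTERED BY (id) INTO 16 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- semijoin reduction to help the Right Outer Join that Merge generates
SET hive.tez.dynamic.semijoin.reduction=true;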
09-12-2017
08:46 PM
1 Kudo
What version are you on? Is hive.tez.dynamic.semijoin.reduction available/enabled? Small files are a transient issue - compaction will merge them into fewer. A side note: hive.merge.cardinality.check=false is probably a bad idea. It should make very little difference for perf but could lead to data corruption if the condition it checks for is violated (i.e. if more than 1 row from the source matches the same row on the target).
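For reference, assuming a recent Hive 2 build, the session-level settings I mean are:

-- check/enable semijoin reduction
SET hive.tez.dynamic.semijoin.reduction=true;
-- keep the cardinality check on; disabling it trades a small perf gain for possible corruption
SET hive.merge.cardinality.check=true;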
09-06-2017
11:07 PM
1 Kudo
ACID tables require a system-determined sort order, so you should not specify Sort By. Also, since ACID tables have to be bucketed, the system determines which rows go to which writer based on the "Clustered By (...) into N buckets" clause of the DDL, so you should not need Distribute By either.
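To illustrate (names are placeholders, not your actual schema): the bucketing lives in the DDL, and the insert needs no Distribute By or Sort By:

CREATE TABLE acid_target (id BIGINT, val STRING)
CLUSTERED BY (id) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- no DISTRIBUTE BY / SORT BY: the system routes rows to the bucket writers itself
INSERT INTO TABLE acid_target SELECT id, val FROM staging_src;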
08-18-2017
06:47 PM
1 Kudo
key may be a reserved keyword - have you tried quoting it? You also want to list the columns from the source table explicitly so that you have the same number of projections in the source query and the target table.
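Something along these lines (I'm assuming a simple two-column layout since I don't have your DDL):

-- backticks quote the reserved word; explicit column lists keep source and target aligned
INSERT INTO TABLE target_tbl (`key`, val)
SELECT s.`key`, s.val FROM source_tbl s;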
08-02-2017
06:29 PM
Since you already created directories in the delta_23569546_23569546_0000 format, the compactor can't understand them. If for each X in delta_X_X you only have 1 directory (which should be the case), you can just rename it by stripping the suffix. This should let the compactor proceed. This will interfere with ongoing queries, of course.
08-01-2017
03:54 PM
1 Kudo
This suffix is a feature when you are using LLAP and there is no way to avoid it. Is upgrading to HDP 2.6 an option? The compactor in 2.6 is able to handle it. If you make the target table transactional=false, it won't create any delta directories. If you use transactional=true but don't go through LLAP on 2.5, you won't see this suffix.
07-13-2017
06:19 PM
If the file is not splittable it will be processed by 1 task.
07-10-2017
05:21 PM
1 Kudo
This is not supported. Transactional table data cannot simply be copied from cluster to cluster. Each cluster maintains a global transaction ID sequence which is embedded in the data files and file names of transactional tables, so copying the data files confuses the target system. The only way to do this right now is to copy the data to a non-ACID table on the source cluster using "Insert ... Select ..." and then use import/export to transfer it to the target side.
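A rough sketch of that workaround, with placeholder names and paths:

-- on the source cluster: stage the data in a plain (non-ACID) table
CREATE TABLE plain_copy (id BIGINT, val STRING) STORED AS ORC;
INSERT INTO TABLE plain_copy SELECT * FROM acid_table;
EXPORT TABLE plain_copy TO '/tmp/plain_copy_export';

-- after copying the export directory over, on the target cluster:
IMPORT TABLE plain_copy FROM '/tmp/plain_copy_export';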
07-10-2017
05:16 PM
3 Kudos
ACID in Hive is enabled globally and per table. There is no such thing as enabling it per job. Existing queries will not be affected if they started before ACID was enabled.
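For reference, a minimal sketch of what "globally and per table" means (the table is hypothetical; the settings are the standard ones):

-- global side: enable the transaction manager (hive-site.xml or session)
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- per-table side: the table itself must be declared transactional
CREATE TABLE acid_example (id INT, val STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');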
07-03-2017
08:32 PM
Have you considered the SQL Merge statement? It's designed specifically for this.
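A minimal sketch of what that could look like, with made-up table and column names and an assumed 'op' flag on the source:

MERGE INTO target_tbl AS t
USING updates_src AS s
ON t.id = s.id
WHEN MATCHED AND s.op = 'D' THEN DELETE             -- delete flagged rows
WHEN MATCHED THEN UPDATE SET val = s.val            -- update the rest of the matches
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);  -- insert new rows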
06-22-2017
02:29 PM
Is your target table partitioned? If so, have you tried hive.optimize.sort.dynamic.partition? Providing the target table DDL would be useful.
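The setting I have in mind, in case it helps:

-- sorts rows by the dynamic partition key so each writer keeps only one file open at a time
SET hive.optimize.sort.dynamic.partition=true;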
06-16-2017
09:05 PM
Only managed tables can be transactional - not external.
05-23-2017
03:57 PM
1 Kudo
The advantages depend on your particular use case. If you use a single SQL Insert over JDBC to add a large batch once an hour, for example, versus the streaming API doing the same, there won't be any advantage. If you have batches that you want to insert every minute, for example, then streaming will be much better. Generally, if your data is available as a continuous stream, streaming will allow you to land it in the target table at very small time intervals and make it immediately visible to readers. The same can't be done efficiently with JDBC. The streaming API has been integrated with NiFi, Flume and Storm - so there are ready-made tools for ingesting event streams into Hive versus the build-it-yourself JDBC approach.
05-15-2017
06:22 PM
Hive streaming API is a Java library so it should be possible to use it from any Java process.
04-19-2017
07:43 PM
2 Kudos
The short answer is you can ignore these. When you are using Hive Streaming Ingest (https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest), which is used by Storm/Flume/NiFi, Hive creates these files for its internal housekeeping to maintain transactional consistency. They should normally be removed as soon as the TransactionBatch is closed (usually once transaction Y of delta_X_Y/ finishes). The flush_length file may remain around if the writer process crashes before the TransactionBatch is closed. They will eventually be cleaned up by the Compactor process.
04-04-2017
03:17 PM
Could you give the exact table DDL and "ls -R" directory listing of the partition?
03-28-2017
04:10 PM
This is not supported yet.
03-21-2017
02:18 AM
1 Kudo
I'm a developer of this system - I believe this is safe (though I've not run a full test to prove it).
03-16-2017
03:51 PM
It may be simpler to modify your table to add a primary key to NEXT_COMPACTION_QUEUE - any synthetic PK will work.
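As a rough sketch only, assuming the metastore is backed by MySQL (the table name is the one from your question; adapt the syntax to your RDBMS and back up the metastore DB first):

-- run against the metastore database, not through Hive
ALTER TABLE NEXT_COMPACTION_QUEUE
  ADD COLUMN SYNTHETIC_PK BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY;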
03-14-2017
06:29 PM
This is not streaming, but the SQL Merge command may be useful here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Merge
03-06-2017
06:38 PM
Order By produces a total order for the result set. Sort By sorts the output of each reducer independently, so in general this will not produce the same answer.
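A tiny illustration with a hypothetical table t:

-- total order over the whole result set
SELECT id, val FROM t ORDER BY id;

-- each reducer's output is sorted on its own, so the overall result is generally not in order
SELECT id, val FROM t SORT BY id;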
02-23-2017
07:33 PM
2 Kudos
PutHiveStreaming relies on the Streaming API, which has 2 relevant concepts: the number of events per transaction and the number of transactions per batch. Generally, the more events you write per transaction, the faster the ingest. I don't see the first of these properties in the NiFi doc referenced above - perhaps there is some NiFi-specific property that controls this.
02-17-2017
05:03 PM
Can you perhaps change the column names but still provide the types? Without them it's not possible to set up a repro case.
02-16-2017
09:59 PM
@Jeet Shah, could you provide table definition, Hive version and query run please?
02-16-2017
05:08 PM
This _tmp file should be created in the Mapper of the compaction job. Is there anything about it in the job logs?
02-16-2017
04:55 PM
@Davide Ferrari could you post "dfs -lsr" of your table dir please? Are you able to see the _tmp... file?
02-16-2017
04:43 PM
This looks like https://issues.apache.org/jira/browse/HIVE-15309, in which case it can be ignored.
02-16-2017
04:00 PM
1 Kudo
This has been fixed in https://issues.apache.org/jira/browse/HIVE-15181. In the meantime you can set hive.direct.sql.max.query.length=1 and hive.direct.sql.max.elements.in.clause=1000 for the standalone Metastore process(es).
02-15-2017
10:42 PM
@Navendu Garg, could you share your table definition (show create table T)?