Member since: 12-09-2015
Posts: 106
Kudos Received: 40
Solutions: 20
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3062 | 12-26-2018 08:07 PM |
 | 2784 | 08-17-2018 06:12 PM |
 | 1480 | 08-09-2018 08:35 PM |
 | 12516 | 01-03-2018 12:31 AM |
 | 1060 | 11-07-2017 05:53 PM |
11-07-2017
05:51 PM
If your competing read and insert target a single partition, this should be safe, since Hive uses the file system 'rename' operation at the end of the insert to make new files visible, and rename is atomic on HDFS. If your insert is a dynamic partition insert, then you are writing multiple partitions, and each partition's data is made visible by its own 'rename' operation. This means that a concurrent read could see a set of files that reflects only part of the insert. Insert overwrite actually deletes existing files, so it can conflict with a concurrent read.
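To illustrate the write-then-rename pattern described above, here is a minimal sketch on a local file system. It is not Hive's code and does not talk to HDFS; the function names and the file name `000000_0` are hypothetical, and `os.replace` merely stands in for the rename step.

```python
import os
import tempfile


def publish_atomically(final_path: str, data: bytes) -> None:
    """Write data to a staging file, then rename it into place.

    Readers listing the directory see either no file or the complete
    file at final_path, never a half-written one at that path.
    """
    directory = os.path.dirname(final_path) or "."
    fd, tmp_path = tempfile.mkstemp(prefix=".staging_", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # A single rename makes the finished file visible.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the staging file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def publish_partitions(base_dir: str, partition_data: dict) -> None:
    """Publish one file per partition, one rename per partition.

    Each rename is atomic on its own, but the loop as a whole is not:
    a reader scanning base_dir between two iterations sees only the
    partitions published so far.
    """
    for partition, data in partition_data.items():
        part_dir = os.path.join(base_dir, partition)
        os.makedirs(part_dir, exist_ok=True)
        publish_atomically(os.path.join(part_dir, "000000_0"), data)
```

The second function is the point of the example: a multi-partition write is a sequence of individually atomic steps, which is why a reader can observe only part of a dynamic partition insert.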
09-29-2017
04:07 PM
Setting

hive.support.concurrency=true
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
hive.lock.manager=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager

will install a lock manager (there are several; the ZooKeeper-based one is the default) without enabling full Acid. If you instead use hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager, then hive.lock.manager is ignored and you will be using the Metastore-based lock manager that Acid uses, but as long as you don't create your tables with "transactional=true", all your tables remain the same. I believe external tables should be locked in this case.
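The interaction between these three settings can be summarised as a small decision function. This is only an illustrative model of the rule described above, not Hive's implementation; the function name `describe_locking` and its return strings are hypothetical, and the defaults for unset properties are my assumption.

```python
DB_TXN_MANAGER = "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager"
DUMMY_TXN_MANAGER = "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager"
ZK_LOCK_MANAGER = (
    "org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager"
)


def describe_locking(conf: dict) -> str:
    """Return which lock manager the given settings would select.

    Assumption: concurrency support is off and DummyTxnManager is used
    when the corresponding properties are not set.
    """
    concurrency = str(conf.get("hive.support.concurrency", "false")).lower()
    if concurrency != "true":
        return "no lock manager installed"
    txn_manager = conf.get("hive.txn.manager", DUMMY_TXN_MANAGER)
    if txn_manager == DB_TXN_MANAGER:
        # hive.lock.manager is ignored in this case.
        return "metastore-based lock manager (hive.lock.manager ignored)"
    # With DummyTxnManager, hive.lock.manager decides; ZooKeeper is the default.
    return conf.get("hive.lock.manager", ZK_LOCK_MANAGER)
```

For example, a configuration with `hive.support.concurrency=true` and `DbTxnManager` reports the metastore-based lock manager regardless of what `hive.lock.manager` is set to.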
09-26-2017
05:04 PM
This may be due to not having https://issues.apache.org/jira/browse/HIVE-10632 in your build. Some data in the internal tables was not cleaned up when tables were dropped, which is causing the Initiator to try to schedule compactions.
09-14-2017
04:10 PM
There isn't. Perhaps @thejas has a recommendation.
09-06-2017
11:07 PM
1 Kudo
Acid tables require a system-determined sort order, so you should not specify Sort By. Also, since Acid tables have to be bucketed, the system should determine which rows go to which writer based on the "Clustered By (...) into N buckets" clause of the DDL, so it should not need Distribute By either.
08-02-2017
06:29 PM
Since you already created directories in the delta_23569546_23569546_0000 format, the compactor can't understand them. If for each X in delta_X_X you have only one directory (which should be the case), you can just rename it by stripping the suffix. This should let the compactor proceed. It will interfere with ongoing queries, of course.
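A rough sketch of how the rename plan could be computed before touching any files. It only works on directory names and performs no renames itself; the function name `plan_renames` is hypothetical, and the pattern assumes the suffix is a trailing run of digits, as in the example name above.

```python
import re
from collections import Counter

# Matches delta_<X>_<Y>_<suffix>; only names with X == Y are considered.
SUFFIXED_DELTA = re.compile(r"^(delta_(\d+)_(\d+))_(\d+)$")


def plan_renames(dir_names):
    """Map each suffixed delta_X_X_NNNN name to its delta_X_X target.

    Raises ValueError if two directories would map to the same target,
    or if the target name already exists, since stripping the suffix is
    only safe when there is exactly one directory per X.
    """
    candidates = []
    for name in dir_names:
        match = SUFFIXED_DELTA.match(name)
        if match and match.group(2) == match.group(3):
            candidates.append((name, match.group(1)))

    target_counts = Counter(target for _, target in candidates)
    existing = set(dir_names)
    plan = {}
    for name, target in candidates:
        if target_counts[target] > 1 or target in existing:
            raise ValueError(f"more than one directory for {target}")
        plan[name] = target
    return plan
```

Given a partition directory listing, the returned mapping can be reviewed and then applied with your usual HDFS tooling (for example `hdfs dfs -mv`), ideally while no queries are running against the table.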
08-01-2017
03:54 PM
1 Kudo
This suffix is a feature when you are using LLAP, and there is no way to avoid it. Is upgrading to HDP 2.6 an option? The compactor in 2.6 is able to handle it. If you make the target table transactional=false, it won't create any delta directories. If you use transactional=true but don't go through LLAP on 2.5, you won't see this suffix.
07-10-2017
05:21 PM
1 Kudo
This is not supported. Transactional table data cannot simply be copied from cluster to cluster. Each cluster maintains a global transaction ID sequence, which is embedded in the data files and file names of transactional tables. Copying the data files confuses the target system. The only way to do this right now is to copy the data to a non-acid table on the source cluster using "Insert ... Select..." and then use import/export to transfer it to the target side.
07-10-2017
05:16 PM
3 Kudos
ACID in Hive is enabled globally and per table. There is no such thing as enabling it per job. Existing queries will not be affected if they started before ACID was enabled.
06-22-2017
02:29 PM
Is your target table partitioned? If so, have you tried hive.optimize.sort.dynamic.partition? Providing the target table DDL may be useful.