About gauthier1

gauthier1 · 02-19-2019

We solved this issue by using dynamic partitioning instead of specifying the "Partition values" field.

gauthier1 · 11-20-2018

Cluster information: HDP 3.0, NiFi 1.7.0, 3 NiFi nodes

Context: We are using the NiFi PutHive3Streaming processor to ingest data into a partitioned ORC Hive table (5-level partitioning). The input FlowFile is in JSON format. Here is the processor configuration:

Problem: Data ingestion works fine (data is written to HDFS in the right path and is readable from Hive queries), but when the Hive compactor is triggered, it looks for the wrong partitions: all partition values are lowercased, even though some should be uppercase.

For example, we have the partition dev.table1.partition1=MH/partition2=P2/partition3=0025/year=2018/month=01, and we get the following logs in hivemetastore.log:

2018-11-20T09:42:34,701 INFO [Thread-10]: compactor.Initiator (Initiator.java:run(97)) - Checking to see if we should compact dev.table1.partition1=mh/partition2=p2/partition3=0025/year=2018/month=01
2018-11-20T09:42:34,710 INFO [Thread-10]: compactor.Initiator (Initiator.java:run(142)) - Can't find partition dev.table1.partition1=mh/partition2=p2/partition3=0025/year=2018/month=01, assuming it has been dropped and moving on.

It looks like Hive has wrong information about the table's partitions, yet this is what we get when running DESCRIBE FORMATTED on the partition.

With lowercased partition values:

describe formatted dev.table1(partition1='mh',partition2='p2',partition3='0025',year=2018,month=01);
Error: Error while compiling statement: FAILED: SemanticException [Error 10006]: Partition not found {partition1=mh, partition2=p2, partition3=0025, year=2018, month=01} (state=42000,code=10006)

With the right partition values:

describe formatted dev.table1 partition(partition1='MH',partition2='P2',partition3='0025',year=2018,month=01);
[...]
| # Detailed Partition Information | NULL | NULL |
| Partition Value: | [MH, P2, 0025, 2019, 01] | NULL |
| Database: | dev | NULL |
| Table: | table1 | NULL |
| CreateTime: | Wed Nov 07 15:16:15 CET 2018 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Location: | hdfs://nnhdfs/warehouse/tablespace/managed/hive/dev.db/table1/partition1=MH/partition2=P2/partition3=0025/year=2019/month=01 | NULL |
| Partition Parameters: | NULL | NULL |
| | transient_lastDdlTime | 1541600175 |

Here we see that Hive has the correct metadata for the partition. We don't understand why the partition values are turned to lowercase when the compactor looks for partitions to compact. Is this normal behavior, and if not, do you have any idea where the problem comes from? Our temporary workaround is to use lowercase partition values, which is not very satisfying.
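To make the mismatch concrete, here is a minimal Python sketch. It is an illustration only, not Hive code: it assumes partition names are compared as case-sensitive strings, which is what the logs and the DESCRIBE FORMATTED results suggest. The helper partition_name() is hypothetical.

```python
# Illustration only: models partition names as plain case-sensitive strings.
# This is NOT Hive's implementation; partition_name() is a hypothetical helper.

def partition_name(spec):
    """Build a 'key=value/key=value' name from ordered (key, value) pairs."""
    return "/".join(f"{key}={value}" for key, value in spec)

# Partition spec taken from the example in the post.
spec = [
    ("partition1", "MH"),
    ("partition2", "P2"),
    ("partition3", "0025"),
    ("year", "2018"),
    ("month", "01"),
]

# Names as they were registered during ingestion (original case).
registered = {partition_name(spec)}

# Name as it appears in the compactor Initiator log (values lowercased).
looked_up = partition_name(spec).lower()

print(partition_name(spec))
print(looked_up)
print(looked_up in registered)  # False: the lowercased name does not match
```

With a case-sensitive comparison, the lowercased name never matches the registered one, which is consistent with the "Can't find partition" message.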

gauthier1 · 03-27-2018

@ke chen Starting with Ambari 2.6, the way repo versions are specified has changed, as you can see here. You have to register a VDF file like this one with the version_definitions API (in the VDF file, replace repository-info.os.repo.baseurl with your local repo URL):

curl -u user:pwd -H "X-Requested-By: ambari" -X POST http://ambari.server:8080/api/v1/version_definitions \
  -d '{ "VersionDefinition": { "version_url": "your.vdf.file.xml" } }'

You will get an id for the version, and you will have to reference it when creating the cluster:

{
  "blueprint": "<blueprint name>",
  "repository_version_id": 1,
  "default_password": "password",
  "host_groups": [ ... ]
}

More details in the release notes: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-release-notes/content/ambari_relnotes-2.6.0.0-behavioral-changes.html
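If you script these calls, the two request bodies can be generated rather than written by hand. The following is a minimal sketch that only builds the JSON and sends nothing. The helper names are hypothetical, the field names are the ones shown in this post, and the empty host_groups list is a placeholder you must fill in for your own cluster.

```python
import json

# Hypothetical helpers: they only assemble the JSON bodies shown in the post.
# No request is sent to Ambari here.

def version_definition_payload(vdf_url):
    """Body for POST /api/v1/version_definitions."""
    return {"VersionDefinition": {"version_url": vdf_url}}

def cluster_template(blueprint, repository_version_id, default_password, host_groups):
    """Cluster creation template referencing the registered version id."""
    return {
        "blueprint": blueprint,
        "repository_version_id": repository_version_id,
        "default_password": default_password,
        "host_groups": host_groups,
    }

vdf_body = json.dumps(version_definition_payload("your.vdf.file.xml"))

# host_groups is left empty as a placeholder; "<blueprint name>" is the
# placeholder used in the post and must be replaced with your blueprint.
template_body = json.dumps(
    cluster_template("<blueprint name>", 1, "password", []),
    indent=2,
)

print(vdf_body)
print(template_body)
```

The repository_version_id passed to cluster_template should be the id returned when the VDF file was registered, not necessarily 1.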

Last Visited	09-02-2019 09:26 AM

Member Since	03-27-2018 09:05 AM
Posts	5
Kudos received	1

Cloudera Community

Re: NiFi PutHive3Streaming - Hive compactor lookin...

NiFi PutHive3Streaming - Hive compactor looking fo...

Re: Ambari Blueprint Installation using local repo...