Member since 05-31-2017
4 Posts
0 Kudos Received
0 Solutions
11-10-2018
10:25 PM
Some time has passed, but for those who subsequently find this thread: Pig 0.17.0 has been included in CDH 6: https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_60_packaging.html#cdh_601_packaging
09-01-2017
06:22 PM
I'm using NiFi 1.2.0 and trying to load CSV data into a Hive table. My flow looks like this:

GetHDFS (get CSV files from HDFS) -> UpdateAttribute (set the schema.name attribute) -> QueryRecord (select all columns from the CSV and add an extra column "loaded_ts"; the Hive table is partitioned on this field) -> ConvertCSVToAvro (required step before the next processor, PutHiveStreaming) -> PutHiveStreaming

1. When I create a non-partitioned table in Hive, everything works and the data is loaded into the Hive table:

CREATE TABLE `default.nifi_stream_table`(
`id` string,
`company` string,
`city` string,
`state` string,
`country` string,
`loaded_ts` string)
CLUSTERED BY (id) INTO 16 BUCKETS
STORED AS ORC
TBLPROPERTIES('transactional'='true');

2. When I create a partitioned table in Hive, the data appears to pass through the PutHiveStreaming processor without any errors, and on HDFS I can see that buckets containing data have been created in the Hive warehouse, but "select * from default.nifi_stream_table" returns nothing:

CREATE TABLE `default.nifi_stream_table`(
`id` string,
`company` string,
`city` string,
`state` string,
`country` string)
PARTITIONED BY (`loaded_ts` string)
CLUSTERED BY (id) INTO 16 BUCKETS
STORED AS ORC
TBLPROPERTIES('transactional'='true');

In the NiFi PutHiveStreaming processor I've tried every combination of these two properties:

Partition Columns: no value set / loaded_ts
Auto-Create Partitions: false / true

Any ideas what I'm doing wrong?
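
For reference, the QueryRecord step in the flow above is roughly the following. This is a sketch, not the exact processor configuration: the date format and the use of NiFi Expression Language to produce the value are assumptions made for illustration. FLOWFILE is the table name QueryRecord uses for the incoming records.

```sql
-- Sketch of the QueryRecord query property (assumed, not copied from the flow).
-- FLOWFILE is the table name QueryRecord exposes for the incoming records.
-- The Expression Language call and the date format are illustrative assumptions.
SELECT id, company, city, state, country,
       '${now():format("yyyy-MM-dd")}' AS loaded_ts
FROM FLOWFILE
```

The point is only that loaded_ts arrives at PutHiveStreaming as an ordinary record field, in addition to being the partition column of the Hive table.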
05-31-2017
03:33 AM
Any ideas about where scripts are located?