Member since 11-26-2022
3 Posts
0 Kudos Received
0 Solutions
12-03-2022
03:14 PM
Hi Team,

I have loaded a Kafka data file into an HDFS location in Avro JSON format, and then:

1- Created a Hive external table with 2 partitions (p_consume_dt string, p_consume_hr).
2- Ran: MSCK REPAIR TABLE <tablename>;
3- Validated LOCATION with: SHOW CREATE TABLE <tablename>

Everything is OK up to this point, but when I execute SELECT * FROM <tablename> it throws this error:

java Exception:java.io.IOException:Not a data file:44:43

Please help.
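One thing that may be worth ruling out, offered as an assumption rather than a diagnosis: if the table is read through an Avro reader, that reader expects Avro object container files, which begin with the four bytes `Obj` followed by 0x01. JSON-encoded Avro records do not carry that header. Below is a minimal Python sketch for checking a file that has been copied out of HDFS to local disk; the function names are hypothetical and not part of any Hive or Avro API.

```python
# First four bytes of an Avro object container file: 'O', 'b', 'j', 0x01.
AVRO_MAGIC = b"Obj\x01"


def has_avro_magic(head: bytes) -> bool:
    """Return True if the given leading bytes start with the Avro container magic."""
    return head[:4] == AVRO_MAGIC


def file_is_avro_container(path: str) -> bool:
    """Read the first four bytes of a local file and compare them to the magic."""
    with open(path, "rb") as fh:
        return has_avro_magic(fh.read(4))


# Example usage (the path is a placeholder for a locally copied data file):
# print(file_is_avro_container("/tmp/sample_kafka_file"))
```

If the check returns False for the files under the table LOCATION, the data is probably plain JSON text rather than a binary Avro container, and the table's storage format would need to match that.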
Labels: Apache Hadoop, Apache Hive, Apache Kafka
12-03-2022
09:35 AM
I am working with a Kafka dataset in which multiple types of message data are arriving.

Sample data:

EventType-1
{
  "type": "record",
  "name": "Dispatch_Accepted",
  "namespace": "accepted.avro",
  "fields": [
    { "name": "John", "type": "string", "doc": "Name of the user account" },
    { "name": "email", "type": "string", "doc": "The email of the user logging message on the blog" },
    { "name": "timestamp", "type": "long", "doc": "time in seconds" }
  ],
  "doc:": "A basic schema of Dispatch_Rejected"
}

EventType-2
{
  "type": "record",
  "name": "Dispatch_Rejected",
  "namespace": "rejected.avro",
  "fields": [
    { "name": "Merry", "type": "string", "doc": "Name of the user" },
    { "name": "email", "type": "string", "doc": "The email of the user logging message on the blog" },
    { "name": "timestamp", "type": "long", "doc": "time in seconds" }
  ],
  "doc:": "A basic schema Rejected data"
}

The schema of the data is validated against Confluent Schema Registry (working fine). I need to filter on the schema name (Dispatch_Rejected and Dispatch_Accepted) and create a separate data file for each, so I am using the QueryRecord processor with the queries below:

<Dispatch_Rejected>=Select * from FLOWFILE WHERE name='Dispatch_Rejected'
<Dispatch_Accepted>=Select * from FLOWFILE WHERE name='Dispatch_Accepted'

This is not working; it can't identify the schema name. The controller service is working fine.

1- How can I pick up the schema name from the controller service?
2- Do I need to assign the value ${schema.name} to another variable <My_schema> and write the SELECT statements like this?

<Dispatch_Rejected>=Select * from FLOWFILE WHERE My_Schema.name='Dispatch_Rejected'
<Dispatch_Accepted>=Select * from FLOWFILE WHERE My_Schema.name='Dispatch_Accepted'

Summary: I want to filter the data based on eventType and create separate data files.

Please help.
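To make the intended routing concrete, here is a minimal Python sketch of the logic only, not a NiFi configuration. It assumes each message arrives together with its schema name as metadata that sits outside the record fields (similar in spirit to a `schema.name` attribute), which is why a WHERE clause over the record's own columns would not see it. The function name and the sample field values are hypothetical.

```python
from collections import defaultdict


def route_by_schema_name(messages):
    """Group record payloads by the schema name attached to each message.

    `messages` is an iterable of (schema_name, record) pairs, where the
    schema name is metadata about the record and not one of its fields.
    """
    routed = defaultdict(list)
    for schema_name, record in messages:
        routed[schema_name].append(record)
    return dict(routed)


# Hypothetical sample messages; the field values are made up for illustration.
messages = [
    ("Dispatch_Accepted", {"John": "u1", "email": "u1@example.com", "timestamp": 1670000000}),
    ("Dispatch_Rejected", {"Merry": "u2", "email": "u2@example.com", "timestamp": 1670000100}),
    ("Dispatch_Accepted", {"John": "u3", "email": "u3@example.com", "timestamp": 1670000200}),
]

routed = route_by_schema_name(messages)
for name, records in routed.items():
    print(name, len(records))
```

Each key of the result corresponds to one output data file, so the two event types end up separated without inspecting any field inside the records.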
11-26-2022
10:10 AM
Dear Expert,

Here is my scenario for NiFi development. I have generated a '|'-delimited file, and I have to filter the records based on the value in the 3rd column and produce a separate dataset for each distinct value.

Example (note: the rows have no header or column names):

1|1|Class|Electronic
1|1|Class|CS
1|1|Teacher|B
1|1|Teacher|B
1|1|Student|abc
1|1|Student|abc
1|4|Student|ex
1|3|Student|xyz

So the desired output should be:

Dataset 1
1|1|Class|Electronic
1|1|Class|CS

Dataset 2
1|1|Teacher|B
1|1|Teacher|B

Dataset 3
1|1|Student|abc
1|1|Student|abc
1|4|Student|ex
1|3|Student|xyz

Later on I will create files from the above data and load them into an HDFS location.

Regards
Mars

Labels: Apache NiFi
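The split described above can be sketched in plain Python to pin down the expected result. This only illustrates the grouping logic, using the sample rows from the post; it is not a NiFi flow, and the function name is hypothetical.

```python
def split_by_third_column(lines, delimiter="|"):
    """Group delimited lines by the value in their third column.

    Groups are keyed by that value, and rows keep their original order
    within each group. Blank lines are skipped.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key = line.split(delimiter)[2]  # third column, zero-based index 2
        groups.setdefault(key, []).append(line)
    return groups


rows = [
    "1|1|Class|Electronic",
    "1|1|Class|CS",
    "1|1|Teacher|B",
    "1|1|Teacher|B",
    "1|1|Student|abc",
    "1|1|Student|abc",
    "1|4|Student|ex",
    "1|3|Student|xyz",
]

datasets = split_by_third_column(rows)
for key, members in datasets.items():
    print(key)
    for row in members:
        print("  " + row)
```

Each key (Class, Teacher, Student) maps to one dataset, which could then be written out as its own file before loading into HDFS.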