Member since: 02-11-2021
Posts: 21
Kudos Received: 1
Solutions: 0
12-05-2022
01:54 AM
Is there a way in NiFi to create Hive tables using the schema from the CDP Schema Registry?
Labels: Apache Hive, Apache NiFi
09-22-2022
09:03 PM
When converting an incoming JSON flow file to another format with QueryRecord (CSV writer or Avro writer) using the infer-schema strategy, the NiFi processor tries to convert a string field to a date based on the first few characters of the incoming string.
NiFi error output:
Successfully parsed a JSON object from input but failed to convert into a Record object with the given schema
- Caused by: org.apache.nifi.serialization.MalformedRecordException: Successfully parsed a JSON object from input but failed to convert into a Record object with the given schema
- Caused by: org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Failed Conversion of Field [sharepoint_documents_name__c] from String [10-7-20 2x10GE LAN PHY XXXXXXXXXXXX] to LocalDate
- Caused by: java.time.format.DateTimeParseException: Text '10-7-20 2x10GE LAN PHY XXXXXXXXXXX' could not be parsed at index 0
Is there a workaround, or a fix available in a newer version? I am using NiFi 1.16.1.
Labels: Apache NiFi
07-30-2022
05:18 PM
I achieved this using the Wait and Notify processors, which works for my use case.
07-25-2022
02:31 AM
Is there a simple way to restrict the number of threads available to a NiFi process group?
Labels: Apache NiFi
07-25-2022
02:26 AM
Not yet resolved. @SAMSAL's solution is a potential workaround. I am looking for a proper solution, and for the reason the conversion produces this output, as I have several places to change.
07-15-2022
05:43 AM
@SAMSAL I need to retain the JSON format as-is in the output. I have updated the expected output.
Expected output:
aaaa|{country=CHINA, city=null, street=null, latitude=null, postalCode=null, geocodeAccuracy=null, state=null, longitude=null}
07-15-2022
04:00 AM
JsonTreeReader to CSV writer produces a MapRecord string for the inner JSON.
Data:
"id" : "aaaa", "billingaddress" : { "city" : null, "country" : "CHINA", "geocodeAccuracy" : null, "latitude" : null, "longitude" : null, "postalCode" : null, "state" : null, "street" : null }
This produces CSV output with the MapRecord string attached:
aaaa|MapRecord[{country=CHINA, city=null, street=null, latitude=null, postalCode=null, geocodeAccuracy=null, state=null, longitude=null}]
Expected output:
aaaa|{country=CHINA, city=null, street=null, latitude=null, postalCode=null, geocodeAccuracy=null, state=null, longitude=null}
I want to keep the inner JSON as-is in the CSV output. Are there any workarounds for this issue?
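One possible stopgap, sketched in Python as a post-processing step outside NiFi (an assumption, not something NiFi itself provides), is to strip the MapRecord[...] wrapper from each CSV line with a regular expression. The helper name strip_maprecord is hypothetical, and the pattern assumes the wrapped value contains no nested braces.

```python
import re

# Matches "MapRecord[{...}]" and captures the "{...}" part.
# Assumption: the wrapped value has no nested "{" or "}" characters.
_MAPRECORD = re.compile(r"MapRecord\[(\{.*?\})\]")


def strip_maprecord(line: str) -> str:
    """Return the line with every MapRecord[...] wrapper removed (hypothetical helper)."""
    return _MAPRECORD.sub(r"\1", line)


if __name__ == "__main__":
    sample = "aaaa|MapRecord[{country=CHINA, city=null, street=null}]"
    print(strip_maprecord(sample))
```

This only rewrites the text after the fact; it does not explain why the writer emits the wrapper in the first place.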
Labels: Apache NiFi
06-21-2022
03:24 AM
1 Kudo
NiFi version: 1.16.1
nifi-hive3-nar-1.16.2
While loading data into Hive tables using PutHive3Streaming, some of the table loads fail with the errors below. I tried changing the commit size, but it didn't help. Out of 50 Hive table loads, only 5 fail with this error, and the failure is repeatable. The record count for each of these tables is less than 1 million.
2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
org.apache.nifi.processors.hive.PutHive3Streaming$ShouldRetryException: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
    at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:512)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.streaming.TransactionError: Aborted transaction cannot be committed: Transaction txnid:10346416 already aborted
    at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:877)
    at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commit(HiveStreamingConnection.java:841)
    at org.apache.hive.streaming.HiveStreamingConnection.commitTransaction(HiveStreamingConnection.java:513)
    at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:499)
    ... 16 common frames omitted
Caused by: org.apache.hadoop.hive.metastore.api.TxnAbortedException: Transaction txnid:10346416 already aborted
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result$commit_txn_resultStandardScheme.read(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$commit_txn_result.read(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_commit_txn(ThriftHiveMetastore.java:5192)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.commit_txn(ThriftHiveMetastore.java:5179)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.commitTxn(HiveMetaStoreClient.java:2491)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
    at com.sun.proxy.$Proxy249.commitTxn(Unknown Source)
    at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.commitImpl(HiveStreamingConnection.java:859)
    ... 19 common frames omitted
2022-06-21 09:53:23,114 ERROR [Timer-Driven Process Thread-21] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=9c9e3916-2000-1698-a0da-2dc44149819f] Failed to abort Hive Streaming transaction { metaStoreUri: thrift://vsgcnredhad12.in.reach.com:9083,thrift://vsgcnredhad13.in.reach.com:9083, database: tigfin_nifi, table: t_ap_invoice_lines_all } due to exception org.apache.hive.streaming.StreamingException: Transaction state is not OPEN. Missing beginTransaction?
    at org.apache.hive.streaming.HiveStreamingConnection.checkState(HiveStreamingConnection.java:500)
    at org.apache.hive.streaming.HiveStreamingConnection.abortTransaction(HiveStreamingConnection.java:519)
    at org.apache.nifi.processors.hive.PutHive3Streaming.abortConnection(PutHive3Streaming.java:652)
    at org.apache.nifi.processors.hive.PutHive3Streaming.lambda$onTrigger$0(PutHive3Streaming.java:559)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1662)
    at org.apache.nifi.processors.hive.PutHive3Streaming.onTrigger(PutHive3Streaming.java:412)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Labels: Apache NiFi
06-15-2022
07:01 AM
I had a look at it, but in my opinion that is not the right solution. ExecuteSQL converts every date to UTC; if the dates are already in UTC, it converts them to UTC once again, which is not right.
06-15-2022
01:54 AM
Version 1.16.2 works. I don't see an option to accept the solution.
06-15-2022
01:23 AM
I used InvokeHTTP in version 1.12 without an SSL certificate and it worked fine; however, the new version 1.16 does not. Is there a setting to ignore certificate validation? For example, I connect to SQL Server through a JDBC connection with the property trustServerCertificate=true, after which it works without certificates.
06-08-2022
10:23 PM
When querying a PostgreSQL database, the ExecuteSQL processor (1.16.1) converts dates to UTC. However, my target system requires the data to be stored in the local time zone. This happens only when the Avro logical types option is set to true; when it is set to false, the local time is returned. I have a few hundred date fields and cannot convert each of them at the source. How do I force ExecuteSQL, with Avro logical types enabled, to return the date as-is from the source?
When Avro logical types is set to false: "install_date" : "2018-07-20 13:31:25.733"
When Avro logical types is set to true: "install_date" : "2018-07-20T05:31:25.733Z",
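To make the difference concrete, here is a minimal Python sketch (run outside NiFi) showing that the two values above are the same instant if the local zone is UTC+8. The +8 hour offset is an assumption inferred only from the two sample values, and the helper name utc_to_local_string is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local zone is UTC+8, inferred from 13:31 local vs 05:31 UTC above.
LOCAL_TZ = timezone(timedelta(hours=8))


def utc_to_local_string(value: str) -> str:
    """Convert '2018-07-20T05:31:25.733Z' to local wall-clock text with milliseconds."""
    utc = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    local = utc.astimezone(LOCAL_TZ)
    # %f prints microseconds (6 digits); drop the last 3 to keep milliseconds.
    return local.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


if __name__ == "__main__":
    print(utc_to_local_string("2018-07-20T05:31:25.733Z"))
```

This illustrates the shift only; it is not a way to change what ExecuteSQL itself returns.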
Labels: Apache NiFi
06-08-2022
12:51 AM
@acoast83 Even with 1.16 I am facing this issue. Is there a global workaround?
06-07-2022
07:09 AM
@mburgess Is there a way I can access the lineage duration from an UpdateAttribute processor? I would like to store it in a database along with other meta information for analysis.
06-06-2022
07:47 AM
@geepark I am also facing a similar issue. How did you get this resolved?
PutHive3Streaming[id=38c56b98-0181-1000-0000-000077757de2] Failed to properly initialize Processor. If still scheduled to run, NiFi will attempt to initialize and run the Processor again after the 'Administrative Yield Duration' has elapsed. Failure is due to java.lang.NoClassDefFoundError: org/apache/hadoop/tracing/SpanReceiverHost
java.lang.NoClassDefFoundError: org/apache/hadoop/tracing/SpanReceiverHost
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:634)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
03-02-2021
08:55 AM
How can I do the same for the very first occurrence of [ and the last occurrence of ]?
02-26-2021
01:12 AM
@mburgess The challenge is that I don't want to handle specific columns, as I have more than 100 tables and maybe 500 date fields. It has to be generic. ExecuteSQL has the logical data type option, which automatically converts every datetime to a Hive datetime and works well for SQL database sources. For JSON input, manually setting up each conversion is a tough option.
02-24-2021
08:58 AM
Hi experts, I need some help handling ISO 8601 date values. I get Salesforce data in which all the date fields come in the format "2015-12-08T08:19:00.000+0000", which is not accepted by PutHive3Streaming when the target data type is timestamp or date. Hive streaming tries to convert the value to a timestamp but throws a number format exception. How can I convert these fields automatically? Each table has at least 10-20 date fields in this format, and I have more than 100 tables. I use InvokeHTTP to get the JSON data from Salesforce. Any help on this is really appreciated.
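As a rough illustration of the generic conversion being asked for, the Python sketch below (outside NiFi) walks a parsed JSON document and rewrites every string that matches the Salesforce layout above into the yyyy-MM-dd HH:mm:ss.SSS layout, without naming individual fields. The function names are hypothetical, the wall-clock value is kept as given (no time zone shift), and whether PutHive3Streaming accepts the resulting layout is not confirmed here.

```python
import json
import re
from datetime import datetime

# Matches values shaped like "2015-12-08T08:19:00.000+0000".
ISO_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}$")


def to_timestamp_text(value: str) -> str:
    """Reformat one matching value; the offset is parsed but not applied."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    # %f prints 6 digits; drop the last 3 to keep milliseconds.
    return parsed.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


def convert_dates(node):
    """Recursively convert every matching string in dicts and lists."""
    if isinstance(node, dict):
        return {key: convert_dates(val) for key, val in node.items()}
    if isinstance(node, list):
        return [convert_dates(val) for val in node]
    if isinstance(node, str) and ISO_PATTERN.match(node):
        return to_timestamp_text(node)
    return node


if __name__ == "__main__":
    record = json.loads('{"Name": "x", "CreatedDate": "2015-12-08T08:19:00.000+0000"}')
    print(convert_dates(record))
```

Matching on the value pattern rather than on field names is what keeps the approach generic across tables; the trade-off is that any ordinary string that happens to match the pattern is converted too.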
Labels: Apache Hive, Apache NiFi
02-24-2021
08:23 AM
How can I convert some of the fields based on this validation? Can I perform validation based on a particular data type alone? Thanks.
02-24-2021
08:06 AM
I am also having the same issue. I have multiple fields and multiple files coming in this format. Is there a generic way to handle this? Are there any samples for the validate processor and the conversion of these fields? Thanks in advance.
02-18-2021
03:19 AM
NiFi version: 1.12
I am able to load data using the PutHive3Streaming processor for tables without a timestamp column. However, if the Hive table contains a timestamp field, the load fails with the error below.
WARN [Timer-Driven Process Thread-7] o.a.n.processors.hive.PutHive3Streaming PutHive3Streaming[id=01751042-d64d-1d24-cd20-5b2a9e1ea92b] Error [java.lang.NumberFormatException: For input string: "2018-10-20 21:30:03.493"]
Any help here is highly appreciated.
Labels: Apache Hive, Apache NiFi