Member since
07-10-2017
78
Posts
6
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4468 | 10-17-2017 12:17 PM |
 | 7114 | 09-13-2017 12:36 PM |
 | 5437 | 07-14-2017 09:57 AM |
 | 3552 | 07-13-2017 12:52 PM |
06-13-2018
02:38 PM
Hi @Oleg Parkhomenko, The following link describes how to secure YARN queues so that only specific users can submit jobs to a specific queue. It is done with Ranger: https://community.hortonworks.com/articles/10797/apache-ranger-and-yarn-setup-security.html Normally, if you are in a Kerberos environment, you should not have jobs running as dr.who. Miche
... View more
06-13-2018
02:29 PM
Hi @rajat puchnanda, Based on your example, you are trying to do a "join". NiFi is not an ETL tool but rather a flow manager: it lets you move data across systems and do some very simple transformations, like CSV to Avro. You should not do computations or joins with NiFi. For your use case, it would be better to use another tool like Hive, Spark, ... Best regards, Michel
06-13-2018
02:09 PM
Hi @rajat puchnanda, If by merging you mean doing a union, you can use the MergeContent processor, provided the two CSV files have the same structure. Best regards, Michel
06-13-2018
02:07 PM
Hi @Oleg Parkhomenko, You should be able to kill all the queued jobs with this script:
for app in `yarn application -list | awk '$6 == "ACCEPTED" { print $1 }'`; do yarn application -kill "$app"; done
Just put it in a .sh script and run it as a user that is allowed to kill applications. Best regards, Michel
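The loop above can also be sketched in Python. This is only a minimal sketch, assuming the `yarn` CLI is on the PATH and that every data row of the listing starts with an ID of the form `application_...`; the function names are illustrative. It asks YARN to filter by state with `-appStates ACCEPTED` instead of testing the sixth whitespace-separated column, which shifts when an application name contains spaces.

```python
import subprocess
from typing import List


def parse_application_ids(listing: str) -> List[str]:
    """Return the application IDs found in `yarn application -list` output."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with the application ID; header lines do not.
        if fields and fields[0].startswith("application_"):
            ids.append(fields[0])
    return ids


def kill_accepted_applications(dry_run: bool = True) -> List[str]:
    """Kill every application that is still in the ACCEPTED state."""
    listing = subprocess.run(
        ["yarn", "application", "-list", "-appStates", "ACCEPTED"],
        capture_output=True, text=True, check=True,
    ).stdout
    app_ids = parse_application_ids(listing)
    for app_id in app_ids:
        if dry_run:
            # Default: only report what would be killed.
            print(f"would kill {app_id}")
        else:
            subprocess.run(["yarn", "application", "-kill", app_id], check=True)
    return app_ids
```

Calling `kill_accepted_applications()` only prints the candidates; pass `dry_run=False` to actually kill them, as a user that is allowed to do so.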
06-13-2018
02:01 PM
Hi, Timeouts usually happen because the cluster is undersized, because there are no dedicated nodes for HBase, or because ingestion is so fast that HBase has to split regions very often.
- Do you manage a lot of data with HBase? If yes, did you pre-split your table?
- If I were you, I would also look at CPU, memory and disk I/O usage. If you don't have any dedicated nodes for HBase, other Hadoop components like Spark, Hive, etc. can have an impact.
As a general best practice, you should have dedicated nodes for HBase with enough CPU and several disks. Best regards, Michel
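To illustrate the pre-split question above, here is a minimal sketch that computes evenly spaced split points. It assumes row keys start with a hex-encoded prefix (for example a hash), which may not match your key design; the function name and the region count are hypothetical. The resulting list could then be supplied as split keys when the table is created, for instance through the `SPLITS` option of the HBase shell `create` command.

```python
from typing import List


def hex_split_points(num_regions: int, prefix_len: int = 2) -> List[str]:
    """Return num_regions - 1 evenly spaced hex prefixes to use as split keys."""
    if num_regions < 2:
        return []  # a single region needs no split point
    key_space = 16 ** prefix_len  # number of distinct hex prefixes
    if num_regions > key_space:
        raise ValueError("more regions than distinct prefixes")
    return [
        format(i * key_space // num_regions, f"0{prefix_len}x")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    print(hex_split_points(4))
```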
11-15-2017
10:16 AM
@Arti Wadhwani Did you get an answer to your question? I'm trying to do the same thing (connecting with ZooKeeper discovery and specifying the Tez queue), but it doesn't work.
11-06-2017
03:51 PM
Hi @Ennio Sisalli, Before running your query that saves the result in HDFS, can you try setting the following parameter: set hive.cli.print.header=true; Best regards, Michel
10-17-2017
12:25 PM
Hello, I am trying to ingest data into Hive with NiFi (JSON data => ConvertJSONToSQL => PutHiveQL) and I get this error message from the PutHiveQL processor:
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.ParseException:line 1:221 cannot recognize input near '?' ',' '?' in value row constructor
If I look at the input flowfile of PutHiveQL, it has the correct insert query:
INSERT INTO nifilog (objectid, platform, bulletinid, bulletincategory, bulletingroupid, bulletinlevel, bulletinmessage, bulletinnodeid, bulletinsourceid, bulletinsourcename, bulletinsourcetype, bulletintimestamp) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Each flowfile has all the needed attributes: sql.args.N.type and sql.args.N.value. Any idea how to debug/solve this?
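One way to narrow a problem like this down is to rebuild the statement with literal values and run it by hand (for example in beeline), to check whether Hive rejects the statement itself or only its parameterized form. The following is a minimal debugging sketch, not part of NiFi: it assumes the flowfile attributes have been copied into a Python dict and that `sql.args.N.type` holds a JDBC type code from `java.sql.Types` (for example 12 for VARCHAR, 4 for INTEGER). The function name is hypothetical, and the naive quoting is for manual debugging only.

```python
from typing import Dict

# java.sql.Types codes whose values must be quoted in a SQL literal:
# CHAR, VARCHAR, LONGVARCHAR, DATE, TIME, TIMESTAMP.
QUOTED_JDBC_TYPES = {1, 12, -1, 91, 92, 93}


def inline_sql_args(statement: str, attributes: Dict[str, str]) -> str:
    """Replace each '?' with the literal taken from sql.args.N.* attributes."""
    # Naive: assumes '?' never appears inside a string literal in the statement.
    parts = statement.split("?")
    result = [parts[0]]
    for n, tail in enumerate(parts[1:], start=1):
        # A missing type attribute raises KeyError, which points at the gap.
        jdbc_type = int(attributes[f"sql.args.{n}.type"])
        value = attributes.get(f"sql.args.{n}.value")
        if value is None:
            literal = "NULL"
        elif jdbc_type in QUOTED_JDBC_TYPES:
            literal = "'" + value.replace("'", "\\'") + "'"
        else:
            literal = value
        result.append(literal + tail)
    return "".join(result)
```

If the inlined statement runs fine by hand, the problem is more likely in how the parameters are bound than in the query text.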
Labels:
- Apache NiFi
10-17-2017
12:17 PM
The solution is to use the "SiteToSiteBulletinReportingTask" reporting task. It can send all the bulletins to a NiFi instance, which can be the same instance that generates them. It sends them as JSON to a specific input port, where you can then process them. It has all the attributes needed. Here is an example:
[{"objectId":"9c8e75e6-eb5a-4a52-9d4a-a3d3b7f0c80f",
"platform":"nifi",
"bulletinId":305,
"bulletinCategory":"Log Message",
"bulletinGroupId":"24a8726b-015f-1000-ffff-ffffae66ea1c",
"bulletinLevel":"ERROR",
"bulletinMessage":"PutHDFS[id=24b463f8-015f-1000-ffff-ffffd09bd856] PutHDFS[id=24b463f8-015f-1000-ffff-ffffd09bd856] failed to invoke @OnScheduled method due to java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.; processor will not be scheduled to run for 30 seconds: java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.",
"bulletinNodeId":"ede4721c-30fe-4879-b22e-20bfe602c615",
"bulletinSourceId":"24b463f8-015f-1000-ffff-ffffd09bd856",
"bulletinSourceName":"PutHDFS",
"bulletinSourceType":"PROCESSOR",
"bulletinTimestamp":"2017-10-17T08:16:48.945Z"},
... View more
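Once the bulletins arrive as JSON, pulling out the error messages is straightforward. Here is a minimal sketch, assuming the payload is a complete JSON array with the field names shown in the example above; the function name is hypothetical, and inside NiFi itself you would more likely use its own JSON processors.

```python
import json
from typing import Dict, List, Optional


def error_bulletins(payload: str) -> List[Dict[str, Optional[str]]]:
    """Return source, timestamp and message for each ERROR-level bulletin."""
    bulletins = json.loads(payload)  # expects a JSON array of bulletin objects
    return [
        {
            "source": b.get("bulletinSourceName"),
            "timestamp": b.get("bulletinTimestamp"),
            "message": b.get("bulletinMessage"),
        }
        for b in bulletins
        if b.get("bulletinLevel") == "ERROR"
    ]
```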
10-16-2017
12:50 PM
Hi @Abdelkrim Hadjidj, Thanks for your reply. My objective is to get the error message, which can be many things (host not found, parsing error, connection refused, etc.) for the same failure relationship. Michel