Member since: 07-19-2018
Posts: 613
Kudos Received: 101
Solutions: 117
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5095 | 01-11-2021 05:54 AM |
| | 3421 | 01-11-2021 05:52 AM |
| | 8789 | 01-08-2021 05:23 AM |
| | 8385 | 01-04-2021 04:08 AM |
| | 36687 | 12-18-2020 05:42 AM |
02-18-2020
04:45 PM
@JohnYaya If you can show a sample, it would be very helpful. If your header and footer are static and always predictable for a specific file, you can use ReplaceText to get the "in between" lines, matching against those static header/footer lines with a very creative regex...
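To illustrate the regex idea behind that ReplaceText approach, here is a minimal Python sketch. The `HEADER`/`FOOTER` marker lines are hypothetical stand-ins for whatever static lines your actual file uses:

```python
import re

# Hypothetical file layout: one fixed header line, data lines, one fixed footer line.
text = """HEADER v1
alpha,1
beta,2
FOOTER end
"""

# Match the static header line, capture everything up to the static footer line.
match = re.search(r"^HEADER .*?\n(.*)^FOOTER .*$", text, re.DOTALL | re.MULTILINE)
body = match.group(1) if match else ""
print(body)  # the "in between" lines only
```

The same capture-group pattern can be pasted into ReplaceText, keeping only group 1 and discarding the header and footer.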
02-18-2020
04:37 PM
Try: $.outputs['EM_CLASSIFICATION']. The pattern is $.mainobject['key name'], which accesses the value for that item in the outputs array. Fun with JSON...
02-18-2020
09:59 AM
As mentioned above, make sure you are sending the full JSON object. Then, in your EvaluateJsonPath processor, click + for each value you want to map to an attribute, and enter the key → value pairs as below:
test1 → $.outputs.test1
test2 → $.outputs.test2
test3 → $.outputs.test3
test4 → $.outputs.test4
test5 → $.outputs.test5
test6 → $.outputs.test6
If this works for you, please accept it as the solution to close the topic.
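Each EvaluateJsonPath property above is essentially a key lookup into the JSON object. A minimal Python sketch of the same mapping, with a made-up flowfile payload for illustration:

```python
import json

# Hypothetical flowfile content: a full JSON object with an "outputs" section.
flowfile_content = '{"outputs": {"test1": "a", "test2": "b", "test3": "c"}}'
data = json.loads(flowfile_content)

# Each property (e.g. test1 -> $.outputs.test1) maps one key to one attribute.
attributes = {key: data["outputs"][key] for key in ("test1", "test2", "test3")}
print(attributes)
```

If the processor receives only a fragment rather than the full object, the `$.outputs.*` paths find nothing, which is why sending the whole object matters.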
02-12-2020
05:29 AM
@AndyTech Can you share some details about the Hive table (schema, number of rows, data size, etc.)? Can you describe your Hive setup (configuration, number of nodes, Tez/YARN container sizes, queue setup, etc.)? Can you speak to the current speed benchmark and your expected speed? Depending on the data itself, partitioning can have some performance impact. There are a bunch of things you can do in Hive itself that will have a huge impact on performance. With a working table and a working query, especially on Parquet, I would want to investigate Hive performance tuning before making any changes to the data structure. Additionally, there would be some discussion of Parquet vs. ORC, the latter being known as the faster format in Hive. Share some details and I or others will comment further.
02-12-2020
05:16 AM
@vikrant_kumar24 There are many ways to do this. I have added a template to my NiFi templates for you. This flow takes a CSV input, splits the lines, extracts two columns, builds an insert statement, and executes that statement (it requires a Database Connection Pool controller service). The only real tricky part is the regex for mapping the columns in the ExtractText processor. https://github.com/steven-dfheinz/NiFi-Templates Once you are able to parse the CSV to attributes, adding more attributes for metadata and including those details in the insert query should be very easy. Hope this helps get you started. Additionally, if you search here, you will find loads of posts with other suggested methods for processing CSV to SQL.
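The flow described above (split CSV lines, extract columns, insert into a table) can be sketched in a few lines of Python. This is an illustration only, not the template itself; the CSV content and the `items` table are made up, and an in-memory SQLite database stands in for the DBCP-backed database:

```python
import csv
import io
import sqlite3

# Hypothetical two-column CSV input; in the template this is the
# SplitText + ExtractText (regex) step.
csv_input = "id,name\n1,alpha\n2,beta\n"

conn = sqlite3.connect(":memory:")  # stand-in for the connection pool service
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")

for row in csv.DictReader(io.StringIO(csv_input)):
    # Parameterized insert rather than string concatenation, to avoid quoting issues.
    conn.execute("INSERT INTO items (id, name) VALUES (?, ?)",
                 (row["id"], row["name"]))

print(conn.execute("SELECT COUNT(*) FROM items").fetchone()[0])  # 2
```

Extra metadata columns would just be additional keys pulled from the row (or from flowfile attributes) and appended to the parameter tuple.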
02-12-2020
05:02 AM
1 Kudo
@Peruvian81 I would delete only enough to get NiFi restarted. Then I would go into the flow and look at what caused it to fill up. This assumes, of course, that you have enough space on the drive to begin with. Next, I would recommend following the documented NiFi steps for disk configuration and, based on your flow, expanding the content repository if necessary and if possible. One last thing to consider: your flow may just need to terminate large flowfiles once they are completed at the end of the line. If they are held in a queue and no longer needed, they are taking up valuable space.
02-12-2020
04:57 AM
@AarifAkhter
You need to create permissions within mysql for your ranger user.
An example of this is:
CREATE DATABASE ranger;
CREATE USER 'ranger'@'hdp.cloudera.com' IDENTIFIED BY 'ranger';
GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'hdp.cloudera.com' WITH GRANT OPTION;
FLUSH PRIVILEGES;
where you substitute your Ranger host's name for hdp.cloudera.com above, e.g.:
ip-xxx-xx-xx-xx.ec2.internal
I would also recommend to install using FQDNs (Fully Qualified Domain Names).
Please accept this answer as the solution to close the topic.
02-04-2020
12:28 PM
@wengelbrecht Thank you, that is exactly what I needed to see. I am having an issue with parquet-hadoop 1.10 and need to get a 1.12 version working in NiFi and Hive...
02-04-2020
11:41 AM
@MattWho do you know what version of Parquet is supported by the new readers?
02-04-2020
11:38 AM
@wengelbrecht Do you know which version of Parquet this reader is supposed to support?