Member since
02-07-2019
2729
Posts
239
Kudos Received
31
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| 1464 | 08-21-2025 10:43 PM | |
| 2128 | 04-15-2025 10:34 PM | |
| 5495 | 10-28-2024 12:37 AM | |
| 2120 | 09-04-2024 07:38 AM | |
| 3993 | 06-10-2024 10:24 PM |
11-13-2023
10:39 PM
We did this at our end and ended up re-cycling the provenance repository much faster than usual. The huge amount of data that an output of a tailfile generates can fill up both your content and provenance repositories.
... View more
11-11-2023
06:40 AM
There is no magic solution for those scenarios and no one solution fits all out of Nifi that I can think of. You have to understand the nature of the input before you start consuming it and you have to provide the solution catered to this input. Sometimes if you are lucky you can combine multiple scenarios into one flow but that still depends on the complexity of the input. Even thought in your first scenario the second option I proposed seem to be simple enough and it did the job, for your second example its more complex and I dont think the out of the box GrokReader will be able to handle such complexity, therefore the first option of using the ExtractText Processor will work better because you can customize your regex as needed. For example, based on the text you provided: JohnCena32 Male New York USA813668 I can use the following regex: [A-Z][a-z]+[A-Z][a-z]+\d+\s(?:Male|Female|M|F)\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)?\s[A-Za-z]+\d+ In the ExtractText processor I will define a dynamic property for each attribute (city, age, firstname...etc.) and surround the segment of the pattern that corresponds to the value with a parenthesis to extract as matching group. For Example: Age: [A-Z][a-z]+[A-Z][a-z]+(\d+)\s(?:Male|Female|M|F)\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)?\s[A-Za-z]+\d+ FirstName: ([A-Z][a-z]+)[A-Z][a-z]+\d+\s(?:Male|Female|M|F)\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)?\s[A-Za-z]+\d+ Gender: [A-Z][a-z]+[A-Z][a-z]+\d+\s((?:Male|Female|M|F))\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)?\s[A-Za-z]+\d+ Country: [A-Z][a-z]+[A-Z][a-z]+\d+\s(?:Male|Female|M|F)\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)?\s([A-Za-z]+)\d+ And so on... This should give you the attribute you need. Then you can use the AttributeToJson processor to get the json output and finally if you want to convert the data to the proper type you can either user JoltTransformation or QueryRecord with cast as shown above. One final note: If you know how to use some external libraries in python for example or groovy or any of the supported code script in the ExecuteScript processor then you can use that to write your custom code to create the required fllowfile\attributes that will help you downstream to generate the final output. If that helps please accept solution. Thanks
... View more
11-10-2023
01:33 AM
@Venin, Welcome to the Cloudera Community. This post may be helpful with your query: https://community.cloudera.com/t5/Support-Questions/Downloading-and-Installing-HDP-for-Windows-Hortonworks/m-p/372948
... View more
11-09-2023
06:20 AM
this link seen only recommend using dns server , but here two way to using dns is build linux bind dns local or using exists AD DNS
... View more
11-08-2023
09:23 PM
@Wadok88, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
... View more
11-08-2023
04:03 AM
1 Kudo
Hello Vidya, would you support me in this as well? regards, Mahrous Badr
... View more
11-08-2023
01:17 AM
1 Kudo
To add to the point of @ggangadharan, there are lots of good articles/posts why the float and even the double datatype has these problems. Note that this is not Hive / Hadoop or Java specific. https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency https://dzone.com/articles/never-use-float-and-double-for-monetary-calculatio https://www.red-gate.com/hub/product-learning/sql-prompt/the-dangers-of-using-float-or-real-datatypes Miklos
... View more
11-07-2023
12:56 AM
@divyesh, Welcome to the Cloudera Community. As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
... View more
11-06-2023
02:39 AM
@FrederickYip, Welcome to our community! To help you get the best possible answer, I have tagged in our experts @vaishaakb who may be able to assist you further. Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
... View more
10-30-2023
02:54 AM
I have set --rpc_max_message_size=134217728 and restart the service,but I got the error again;I find the parameter has been changed to --rpc_max_message_size=134217728
... View more