Member since 01-07-2019
220 Posts
23 Kudos Received
30 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5170 | 08-19-2021 05:45 AM
 | 1851 | 08-04-2021 05:59 AM
 | 899 | 07-22-2021 08:09 AM
 | 3750 | 07-22-2021 08:01 AM
 | 3529 | 07-22-2021 07:32 AM
12-28-2019
02:15 AM
1 Kudo
The first thing that comes to mind is to not select $.data.* but something like $.data.file_attachment or $.data.file_attachment.* instead. Does this bring you (closer) to the answer? If there are still simple things you want to change in the text, you could use this workaround: in the UpdateAttribute processor, use something like replaceAll. Hope this helps, but I am also curious whether there are other things relevant here.
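To make the difference between these paths concrete, here is a minimal sketch in plain Python. It does not use a JSONPath library; it only imitates what each path would select. The sample document is made up: only the field names `data` and `file_attachment` come from the paths above, everything else is an assumption.

```python
import json

# Hypothetical document; only `data` and `file_attachment` come from the
# paths discussed above, the other fields are invented for illustration.
doc = json.loads("""
{
  "data": {
    "id": 7,
    "file_attachment": {"name": "report.pdf", "size": 1024}
  }
}
""")


def select_data_wildcard(d):
    """Rough equivalent of $.data.* : every direct child value of data."""
    return list(d["data"].values())


def select_file_attachment(d):
    """Rough equivalent of $.data.file_attachment : that one child only."""
    return d["data"]["file_attachment"]


def select_file_attachment_wildcard(d):
    """Rough equivalent of $.data.file_attachment.* : its child values."""
    return list(d["data"]["file_attachment"].values())


print(select_data_wildcard(doc))
print(select_file_attachment(doc))
print(select_file_attachment_wildcard(doc))
```

The wildcard on `data` returns everything under it, so unrelated fields come along; the narrower paths return only the attachment or its values.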
12-27-2019
02:13 AM
Perhaps I missed it, but what is the exact error that you see? And just in case, what command do you use? Have you successfully run similar commands? Also, you mentioned Oozie; does that mean you can run the command outside Oozie? And with the same user?
12-25-2019
08:51 AM
For access to the versions intended for production, you would indeed need to be a customer. However, for a quick look there are some more accessible paths, for instance the HDP sandbox or the CDP free trial. These can be used by anyone without being a customer, but they may not contain the latest version.
12-25-2019
08:45 AM
1 Kudo
Just a heads up: splitting the file into individual records may provide additional flexibility, but if the case is straightforward enough, I would recommend using processors (like RouteText) that avoid creating a flowfile for each line.
12-24-2019
04:19 AM
1 Kudo
NiFi is meant for moving data; it can get data from and send data to APIs. It will not be great to just let it sit there and try to call it whenever you need a bit of data. Kafka may be a closer fit for this.
12-24-2019
04:05 AM
Though I have not tried it, I suspect this is possible. ListS3 typically provides the input for FetchS3. If you want to imitate this, consider manually running ListS3 and carefully inspecting the flowfiles it creates (content and attributes/metadata). From there you can probably simulate the flowfile, e.g. with GenerateFlowFile, to test whether you can use FetchS3 without ListS3. If this succeeds, you can update your flow to make sure it provides the right inputs.
12-24-2019
04:00 AM
Not sure if this helps, but it looks like you are hitting a general connectivity error, possibly not even related to MySQL. Are you able to do anything via NiFi on that remote node?
12-24-2019
03:51 AM
As you seem to understand, this is not a valid CSV file, hence custom parsing is required. I have copied the answer you gave here:

----

To search for and replace the missing double quotes, I used the ExecuteScript processor with Python, such as:

    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import StreamCallback
    from org.apache.nifi.processors.script import ExecuteScript
    from org.python.core.util.FileUtil import wrap
    from io import StringIO
    import re

    # Define a subclass of StreamCallback for use in session.write()
    class PyStreamCallback(StreamCallback):
        def __init__(self):
            pass

        def process(self, inputStream, outputStream):
            # Read all rows of the incoming flowfile content
            with wrap(inputStream) as f:
                lines = f.readlines()
            outer_new_value_list = []
            for csv_row in lines:
                field_value_list = csv_row.split('|')
                inner_new_value_list = []
                for field in field_value_list:
                    # More than two quotes means there are quotes inside the field
                    if field.count('"') > 2:
                        # Double every inner quote (or caret), leaving the
                        # first and last character of the field untouched
                        replaced_field = re.sub(r'(?!^|.$)["^]', '""', field)
                        inner_new_value_list.append(replaced_field)
                    else:
                        inner_new_value_list.append(field)
                row = '|'.join([str(elem) for elem in inner_new_value_list])
                outer_new_value_list.append(row)
            # Rows still carry their original line endings from readlines()
            with wrap(outputStream, 'w') as filehandle:
                filehandle.writelines("%s" % line for line in outer_new_value_list)
    # end class

    flowFile = session.get()
    if flowFile is not None:
        flowFile = session.write(flowFile, PyStreamCallback())
        session.transfer(flowFile, ExecuteScript.REL_SUCCESS)
    # implicit return at the end
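To try the quote-repair logic outside NiFi, the per-row part of the script above can be pulled into a plain Python function. This is only a sketch for local testing: the function name `fix_row` and the sample row are made up, and the regex is copied unchanged from the script (note that its character class also matches a caret, not only a double quote).

```python
import re

# Same pattern as in the script above: a double quote or caret that is
# neither the first nor the last character of the field.
INNER_QUOTE = re.compile(r'(?!^|.$)["^]')


def fix_row(row, delimiter='|'):
    """Double the inner quotes of every field that has more than two quotes."""
    fixed = []
    for field in row.split(delimiter):
        if field.count('"') > 2:
            field = INNER_QUOTE.sub('""', field)
        fixed.append(field)
    return delimiter.join(fixed)


# Hypothetical sample row with unescaped quotes inside the second field
sample = '1|"say "hi" now"|x'
print(fix_row(sample))
```

Fields with two or fewer quotes pass through unchanged, so well-formed rows are not modified.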
12-24-2019
03:39 AM
1 Kudo
ExtractText is for getting some text from the content and putting it in an attribute. This does not sound like what you want. It also matches the regex against the whole flowfile, so again that is probably not what you want. If you only want to keep certain lines from a flowfile, the processor to use seems to be RouteText. Here is an example of this: https://community.cloudera.com/t5/Support-Questions/Filtering-records-from-a-file-using-NiFi/td-p/184346
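To illustrate the kind of per-line filtering meant here, below is a minimal sketch in plain Python. It is not NiFi code and does not reproduce RouteText's configuration; the function name, the sample content and the `^ERROR` pattern are all assumptions chosen for the example. The point is that each line is tested against the regex on its own and the matching lines stay together in one output.

```python
import io
import re


def keep_matching_lines(text_in, text_out, pattern):
    """Copy only the lines that match `pattern`; return how many were kept."""
    regex = re.compile(pattern)
    kept = 0
    for line in text_in:  # one line at a time, never the whole content
        if regex.search(line):
            text_out.write(line)
            kept += 1
    return kept


# Hypothetical content of a single flowfile
source = io.StringIO("INFO start\nERROR disk full\nINFO done\nERROR timeout\n")
target = io.StringIO()
count = keep_matching_lines(source, target, r"^ERROR")
print(count)
print(target.getvalue(), end="")
```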
12-24-2019
03:31 AM
This is quite a broad question and hard to troubleshoot all at once. A general approach:
1. Analyze your custom code and identify all bits of complexity and all dependencies
2. Run the standard example
3. Introduce, in a very minimal fashion, one complexity/dependency from your custom code into the standard example
4. If it works, repeat step 3
This should allow you to narrow down exactly what is causing the problem (and hopefully guide the resolution).