Member since 10-31-2018 | 4 Posts | 0 Kudos Received | 0 Solutions
01-30-2019 11:23 PM
I wrote this question up in full here: https://stackoverflow.com/questions/54450058/nifi-running-python-web-scraper-through-executecommandstream-executeprocess-p

The gist: I have a Python web-scraping script in my Docker container, and I want a NiFi processor to run it and send the scraped output down my pipeline. The problem is that I can't get it to scrape without throwing "command not found" errors, and I have no idea how to get the system to recognize my Python script. Python 3 is installed in the container. The SO link above explains the issue in full.

I've looked at https://community.hortonworks.com/questions/178561/can-anyone-provide-an-example-of-a-python-script-e.html, which is a good starting place but doesn't address this issue.
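For reference, here is a minimal sketch of the kind of script this setup usually expects, not the actual scraper from the SO post. It assumes the processor is ExecuteStreamCommand with Command Path set to the interpreter (for example `python3`, or its absolute path inside the container) and Command Arguments set to the script path followed by the URL, separated by the Argument Delimiter (`;` by default). The script location `/opt/scripts/scrape.py`, the `LinkParser` class, and the `extract_links` helper are hypothetical names. Whatever the script prints to stdout becomes the content of the outgoing flowfile.

```python
#!/usr/bin/env python3
"""Hypothetical scraper sketch: prints the links found on a page as JSON."""
import json
import sys
import urllib.request
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all href values from the given HTML string, in document order."""
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def main(url):
    """Fetch the page and write the result to stdout for NiFi to pick up."""
    with urllib.request.urlopen(url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    json.dump({"url": url, "links": extract_links(html)}, sys.stdout)


# Only run when a URL argument is supplied, e.g.
#   python3 /opt/scripts/scrape.py https://example.com
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Naming the interpreter explicitly avoids depending on the script being executable or on its shebang line, which is one common source of "command not found" when a script is launched from a service rather than from an interactive shell.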
Labels: Apache NiFi, Docker
12-11-2018 10:00 PM
This question is a follow-up to @Timothy Spann's guide series on Stanford NLP and its use in NiFi.

Problem: I have NiFi up in AWS, and I also have the Stanford CoreNLP jar file running in an ECS task. I can't get them connected. My current flow is this:

1) GenerateFlowFile - with custom text: "Testing because I have no idea how this works?" (just under 50B)
2) InvokeHTTP - POST, and url = http://xx.xxx.xx.xxx:port (IP and port; throws no errors)
3) ???? - I currently have the original and response relationships connected to a LogAttribute, to see what comes out.

For response, when I list the queue, the flowfile appears to have nothing in it in the content viewer, and when I download the file, it just gives me the Apache Tika license agreement. Original just carries that message as an attribute.

How do I call *entity* analysis? I know the NLP service is running over in that ECS task. I have no idea how to form a correct URL call, or what type of processor must come after InvokeHTTP. If I am asking the wrong question (or a dumb question), please let me know. Thanks
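As a point of comparison outside NiFi, the sketch below shows how a request for entity analysis is typically formed, assuming the jar is running as the CoreNLP HTTP server (`StanfordCoreNLPServer`). That server takes the raw text as the POST body and reads the annotators from a URL-encoded `properties` query parameter. The host `corenlp.example:9000` and the helper names are hypothetical stand-ins; the real IP and port are whatever the ECS task exposes.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Annotator chain commonly used for named-entity recognition in CoreNLP.
NER_ANNOTATORS = "tokenize,ssplit,pos,lemma,ner"


def corenlp_url(base_url, annotators=NER_ANNOTATORS):
    """Build the server URL with a URL-encoded JSON `properties` parameter."""
    props = json.dumps(
        {"annotators": annotators, "outputFormat": "json"},
        separators=(",", ":"),
    )
    return base_url.rstrip("/") + "/?" + urlencode({"properties": props})


def annotate(base_url, text, timeout=60):
    """POST raw text to the server and return the parsed JSON annotation."""
    request = urllib.request.Request(
        corenlp_url(base_url), data=text.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def named_tokens(annotation):
    """Return (word, tag) pairs for tokens whose NER tag is not "O"."""
    return [
        (token["word"], token["ner"])
        for sentence in annotation.get("sentences", [])
        for token in sentence.get("tokens", [])
        if token.get("ner", "O") != "O"
    ]


# Hypothetical host; prints the fully encoded URL.
print(corenlp_url("http://corenlp.example:9000"))
```

The encoded URL that `corenlp_url` prints is the form that would go into InvokeHTTP's Remote URL property, with the flowfile content as the POST body. Since the response is JSON, a JSON-aware processor such as EvaluateJsonPath is one option for the step after InvokeHTTP, though that depends on what the rest of the flow needs.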
Labels: Apache NiFi
12-05-2018 05:11 PM
I have a connected SFTP server, and I am trying to route files based on type: `.csv`, `.tsv`, and `.xlsx`. For now, I'm just uploading test files through the command line. My flow is:

GetSFTP (with correct hostname, etc.) -> RouteOnAttribute -> LogAttribute (will send the files elsewhere soon; this is just for testing)

My problem, I think, is that I created a property in `RouteOnAttribute` incorrectly:

screen-shot-2018-12-05-at-120805-pm.png

Am I correct in assuming that this does not pick up the `.csv` because the extension is not technically part of the filename? What would be the correct expression to route on the file type? Thanks!
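To make the intended routing concrete, here is a small Python sketch of the decision itself. It assumes the `filename` attribute carries the full remote file name including its extension, in which case a NiFi Expression Language property along the lines of `${filename:endsWith('.csv')}` (or `${filename:toLower():endsWith('.csv')}` to ignore case) would express the same check. The `ROUTES` table and `route` function are hypothetical names for illustration.

```python
import os

# Extension -> relationship name; one RouteOnAttribute property per entry.
ROUTES = {".csv": "csv", ".tsv": "tsv", ".xlsx": "xlsx"}


def route(filename):
    """Pick a relationship from the file's final extension, ignoring case."""
    extension = os.path.splitext(filename)[1].lower()
    return ROUTES.get(extension, "unmatched")


for name in ("report.csv", "DATA.TSV", "budget.xlsx", "notes.txt"):
    print(name, "->", route(name))
```

Only the final extension is considered, so a name like `archive.csv.gz` falls through to `unmatched`, which mirrors how an `endsWith` check behaves.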
Labels: Apache NiFi