Member since 12-20-2017
Posts: 9
Kudos Received: 1
Solutions: 0
04-23-2018
09:33 PM
Thanks, I did see that, but it looked a bit hard to follow how to do it from scratch.
04-23-2018
09:03 PM
Hello. Using this template https://github.com/Teradata/kylo/blob/master/samples/templates/nifi-1.0/template-starter-pyspark.xml I managed to get a pyspark job running. However, the Spark script doesn't accept data from a flowfile; it has hardcoded paths for the input and output files. I also had to tell Spark to use a specific Anaconda Python environment by setting PYSPARK_PYTHON in Spark's conf/spark-env.sh file, as follows: export PYSPARK_PYTHON="/path/to/python/env/python". It would be nice to know how to create a script and template that accept flowfiles, though. If anyone has a template with an example of that, that would be great. Cheers, Tim
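A minimal sketch of how the hardcoded paths might be replaced with command-line arguments, so that a NiFi processor could pass in values taken from flowfile attributes. The argument names, the CSV format, and the placeholder transformation are assumptions for illustration; none of this is taken from the Kylo template.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the input and output locations instead of hardcoding them."""
    parser = argparse.ArgumentParser(description="Hypothetical PySpark transform job")
    parser.add_argument("--input", required=True, help="path or URI to read from")
    parser.add_argument("--output", required=True, help="path or URI to write to")
    return parser.parse_args(argv)


def run(input_path, output_path):
    # Imported here so the argument parsing can be exercised without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("flowfile-transform-sketch").getOrCreate()
    try:
        # Assumption: the input is CSV with a header row.
        df = spark.read.csv(input_path, header=True, inferSchema=True)
        # Placeholder transformation: drop rows that contain nulls.
        cleaned = df.dropna()
        cleaned.write.mode("overwrite").csv(output_path, header=True)
    finally:
        spark.stop()


# Only start the job when arguments were actually supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    args = parse_args()
    run(args.input, args.output)
```

Saved under a hypothetical name such as transform.py, this could be launched with spark-submit transform.py --input <in> --output <out>, with the two values filled in from flowfile attributes by whichever processor starts the job.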
04-21-2018
11:13 AM
1 Kudo
Hey all, I'm after some information on how I can use NiFi to get a file from S3, send it to pyspark, transform it, and move it to another folder in a different bucket. I have used this template https://gist.github.com/ijokarumawak/26ff675039e252d177b1195f3576cf9a to get data moving between buckets, which works fine. But I'm a bit unsure of the next steps: how to pass a file to pyspark, run a script to transform it, then put it in another location. I have been looking at this https://pierrevillard.com/2016/03/09/transform-data-with-apache-nifi/ which I will try to understand. If you know of or have any examples of how I might do this, or could describe how I might set it up, that would be great. Thanks, Tim
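One possible shape for the transform step, as a sketch only: NiFi's ExecuteStreamCommand processor pipes the flowfile content to the command's standard input and takes the command's standard output as the new content, so a plain Python script can sit between the S3 fetch and the S3 put. The upper-casing below is a stand-in for a real transformation and is not taken from either linked page.

```python
import sys


def transform(text):
    """Placeholder transform (assumption for illustration): upper-case every line."""
    return "".join(line.upper() for line in text.splitlines(keepends=True))


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Read the whole flowfile content from stdin and write the result to stdout.
    stdout.write(transform(stdin.read()))


# Skip the run when stdin is an interactive terminal with nothing piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

This keeps the whole file in memory, which is fine for small files; for large ones, handing a path to a Spark job is probably the better fit.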
Labels: Apache NiFi
01-31-2018
08:22 PM
Hi @Abdelkrim Hadjidj Would you mind expanding on exactly how to use templates for flow version control? Is it a case of creating a template and downloading it through the UI into a repo folder? I also tried tracking and reverting changes to flow.xml.gz, but that didn't seem to revert the changes in the UI when I refreshed it. Just wondering if I'm missing something here. Thanks, Tim
12-21-2017
02:37 AM
# localhost name resolution is handled within DNS itself.
#127.0.0.1 localhost
#::1 localhost
127.0.0.1 localhost sandbox.hortonworks.com sandbox-hdp.hortonworks.com sandbox-hdf.hortonworks.com
@Jay Kumar SenSharma Above is my etc/hosts. Should it be 172.17.0.2 instead?
12-21-2017
01:36 AM
@Jay Kumar SenSharma Cool, thanks. I can see the log, and there is indeed an error: ...
12-21-2017
01:15 AM
@Jay Kumar SenSharma Thanks. Yes, I did the restart all, and the result of checking whether port 50070 was open is shown in the picture. Are you able to tell me how to check the NameNode logs?
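For anyone without the screenshot, a small sketch of one way to check whether a port such as 50070 accepts connections. The host name localhost is an assumption; with the Docker sandbox the address may differ.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # 50070 is the port mentioned above; the host is an assumption.
    print(port_is_open("localhost", 50070))
```

A False result only says nothing accepted the connection at that address; it doesn't say why.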
12-21-2017
12:01 AM
Hey Jay, thanks for the reply 🙂 Am I doing this correctly?
12-20-2017
11:29 PM
Hello, I'm just trying to work through the tutorial, using the Docker version. All was going well until I got to the "explore Ambari" part. It seems some connections are getting refused, yet Ambari is running fine, and I can access the sandbox through the terminal and stop/start Ambari. Please have a look at the picture below.
Labels: Hortonworks Data Platform (HDP)