Member since: 01-16-2018
Posts: 25
Kudos Received: 2
Solutions: 0
04-01-2019
07:04 AM
I have created a MapReduce program that reads files from the local file system, processes them, and stores the results in HDFS. When I run it on an HDP cluster, it reads a null value once all the files in the folder have been processed and throws a NullPointerException. The problem occurs only on the AWS HDP cluster; in our local dev environment, where all the components are installed separately, the same program runs perfectly. We are using HDP 2.6.2, and all the components in the dev environment match the versions in HDP 2.6.2.
Labels: Apache Hadoop
09-29-2018
05:31 AM
Is there any way to do the same task using ZooKeeper?
09-28-2018
12:24 PM
We are working on an application that uses Spark, HDFS, and Kafka.
We want to deploy this application on an existing HDP cluster. What would be the quickest approach? What I want to do is create a script that coordinates with Ambari and finds out which components are already installed on the existing HDP cluster. For example, if the cluster does not contain Spark, the script will automatically download it (from the Hortonworks repo) and configure it on that cluster; otherwise it will simply load all the tasks/jobs.
Can I use ZooKeeper to detect which services are installed and their state (running/stopped/maintenance)?
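Service inventory and state are normally tracked by Ambari itself, so a script of the kind described above could read them from Ambari's REST API. A minimal sketch, assuming the services endpoint (`/api/v1/clusters/<cluster>/services?fields=ServiceInfo/state,ServiceInfo/maintenance_state`) returns items that each carry a `ServiceInfo` object. The sample payload and the `summarize_services` and `missing_services` helpers are hypothetical, and the HTTP call itself is left out:

```python
import json

# Hypothetical sample of what the Ambari services endpoint might return.
SAMPLE_RESPONSE = """
{
  "items": [
    {"ServiceInfo": {"service_name": "HDFS", "state": "STARTED", "maintenance_state": "OFF"}},
    {"ServiceInfo": {"service_name": "KAFKA", "state": "INSTALLED", "maintenance_state": "ON"}}
  ]
}
"""


def summarize_services(response_text):
    """Map each service name to its (state, maintenance_state) pair."""
    payload = json.loads(response_text)
    summary = {}
    for item in payload.get("items", []):
        info = item["ServiceInfo"]
        summary[info["service_name"]] = (
            info["state"],
            info.get("maintenance_state", "OFF"),
        )
    return summary


def missing_services(summary, required):
    """Return the required services that are not installed, in sorted order."""
    return sorted(set(required) - set(summary))


if __name__ == "__main__":
    services = summarize_services(SAMPLE_RESPONSE)
    print(services)
    print(missing_services(services, ["HDFS", "KAFKA", "SPARK2"]))
```

In Ambari, a state of INSTALLED generally means the service is installed but stopped, while STARTED means it is running.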
Labels: Apache Ambari
06-19-2018
08:56 AM
I'd already seen that, but it only covers reading, not changing. I need to read the variable's value and then modify it.
06-19-2018
05:10 AM
I want to access a process group variable from an ExecuteScript processor and then change its value. I am using Python. I have read the article that describes how to access flowfile attributes, but it does not cover variables. My requirement is that when a job completes successfully, a value must be stored in a variable.
Labels: Apache NiFi
06-05-2018
05:07 AM
I am trying to read a zip file in a NiFi ExecuteScript processor, using Python as the scripting language. When I run the script, it throws "no viable alternative input" at line 25 (flowFile = session.get()). What is the real cause behind this? Here is my script:

from zipfile import ZipFile
from org.apache.nifi.processor.io import InputStreamCallback
import java.io
import json
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class ReadVersion(InputStreamCallback):
    def __init__(self):
        self.flowFile = None
        self.version = ''
        self.error = ''

    def process(self,inputStream):
        try:
            zipname = self.flowFile.getAttribute('filename')
            zippath = self.flowFile.getAttribute('absolute.path')
            zfile = zipfile.ZipFile(zippath+zipname)
            with zipFile(zfile,'r') as zip:
                pageview = zip.read('pageview.json').decode("utf-8")
                pageview = json.loads(clients)
                pam = zip.read('pam.json')
                pam= json.loads(Company.decode("utf-8") )

flowFile = session.get()
if (flowFile != None):
    callback = ReadVersion()
    callback.flowFile = flowFile
    session.read(flowFile, callback)
    if (callback.version != ''):
        flowFile = session.putAttribute(flowFile,'MSVersion',callback.version)
        session.transfer(flowFile, REL_SUCCESS)
    if (callback.error == 'error'):
        session.transfer(ff, REL_FAILURE)
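
The parse error is probably not about `session.get()` itself: in the script above the `try:` block has no `except` or `finally` clause, so the parser gives up at the first statement that follows it. The script also imports `ZipFile` but then calls `zipfile.ZipFile` and `zipFile`, passes the undefined names `clients` and `Company` to `json.loads`, and transfers an undefined `ff`. A minimal sketch of the zip-reading part in plain Python, leaving out the NiFi-specific `session` and relationship handling; `read_json_members` and `build_demo_zip` are hypothetical helpers:

```python
import io
import json
from zipfile import ZipFile


def read_json_members(zip_source, names):
    """Return (parsed, error); parsed maps each member name to its decoded JSON."""
    parsed = {}
    try:
        with ZipFile(zip_source, 'r') as archive:
            for name in names:
                parsed[name] = json.loads(archive.read(name).decode('utf-8'))
    except Exception as exc:
        # A try block must be closed by an except or finally clause.
        return {}, str(exc)
    return parsed, ''


def build_demo_zip():
    """Build a small in-memory archive (hypothetical contents) for the demo."""
    buffer = io.BytesIO()
    with ZipFile(buffer, 'w') as archive:
        archive.writestr('pageview.json', json.dumps({'views': 3}))
        archive.writestr('pam.json', json.dumps({'company': 'demo'}))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    data, error = read_json_members(build_demo_zip(), ['pageview.json', 'pam.json'])
    print(data, error)
```

Returning the error as a value mirrors the `callback.error` field in the original script, so the caller can decide between the success and failure relationships after the read completes.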
Labels: Apache NiFi
06-04-2018
07:23 AM
I am trying to create tables in Hive from Apache NiFi, but I could not find a dedicated processor for that. I did find a processor named PutHiveQL, which can be used for DDL/DML operations, but I could not find a property in which to write the query. If it is the right processor for this purpose, how can it be used in my case?
Labels: Apache NiFi
04-06-2018
11:54 AM
I am trying to split an array of records using the SplitJson processor, but it fails to split them. I am unable to find the correct expression for my JSON. Here is my JSON:
[
{"id":"30fb76fa-acbe-463b-830e-66f203bb0911","session_id":"804e8d5b-c266-92b7-4a1d-eed3650d3b4a","tag_name":"call"},
{"id":"23986d19-c91f-4d98-8cfd-08fb26c5ff85","session_id":"804e8d5b-c266-92b7-4a1d-eed3650d3b4a","tag_name":"direct-call"},
{"id":"7c374ae9-b96a-4383-85ce-6d45cbc5f8a4","session_id":"804e8d5b-c266-92b7-4a1d-eed3650d3b4a","tag_name":"homepage"},
{"id":"599bf3e0-2c76-4d38-8349-91cc04e34c33","session_id":"b8f17ef9-d7df-dec0-71e3-3ed28991d396","tag_name":"bounce"},
{"id":"55791f8a-3243-48b3-bb4a-70404a21148d","session_id":"b8f17ef9-d7df-dec0-71e3-3ed28991d396","tag_name":"homepage"}
]
I want to split each record into a separate flowfile, so there will be five flowfiles. What is the correct JsonPath expression?
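
For a top-level array like this one, a JsonPath Expression such as `$.*` (or `$[*]`) is the usual candidate for SplitJson, since it selects every element of the array. A rough Python sketch of the effect the split is meant to have, using two of the records above with the `session_id` field dropped for brevity; `split_records` is a hypothetical helper, not part of NiFi:

```python
import json


def split_records(array_text):
    """Serialize each element of a top-level JSON array as its own document."""
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    return [json.dumps(record, sort_keys=True) for record in records]


if __name__ == "__main__":
    sample = '''[
      {"id": "30fb76fa-acbe-463b-830e-66f203bb0911", "tag_name": "call"},
      {"id": "23986d19-c91f-4d98-8cfd-08fb26c5ff85", "tag_name": "direct-call"}
    ]'''
    for document in split_records(sample):
        print(document)
```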
Labels: Apache NiFi
04-05-2018
11:30 AM
I am building a job in which I have to validate phone numbers, and we want to use the 'google-libphonenumber' npm package. I am using JavaScript in an ExecuteScript processor. What is the correct way to include the npm package?
Labels: Apache NiFi
04-02-2018
06:08 AM
1 Kudo
I am using Apache NiFi to convert JSON to CSV. I want to change the headers of the generated CSV. Is there a specific processor for this? I know how to achieve this using the ExecuteScript processor, but is there an easier approach? For example, "_id", "name", "time" to "id", "browser_name", "duration".
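
The rename itself is a simple mapping from old to new column names. A minimal Python sketch of that mapping applied to CSV text, for instance as the core of a scripted step; `rename_headers` and the sample row are hypothetical:

```python
import csv
import io

# Old header name -> new header name, taken from the example above.
HEADER_MAP = {"_id": "id", "name": "browser_name", "time": "duration"}


def rename_headers(csv_text, header_map):
    """Rewrite only the header row; unmapped columns and data rows stay as-is."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [header_map.get(column, column) for column in rows[0]]
    output = io.StringIO()
    csv.writer(output, lineterminator="\n").writerows(rows)
    return output.getvalue()


if __name__ == "__main__":
    print(rename_headers("_id,name,time\n1,firefox,42\n", HEADER_MAP))
```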
Tags: apache-nifi, csv
Labels: Apache NiFi
03-29-2018
09:19 AM
Thanks, setting the Max Bin Age property works.
03-29-2018
09:07 AM
No, I am not getting any error. It just hangs and does nothing. I mean the flowfiles reach the merge processor after the record conversion, but after that nothing happens. You can see this in the new image.
03-29-2018
08:56 AM
@Abdelkrim Hadjidj I tried. In this case it results in failure.
03-29-2018
08:12 AM
I am using Apache NiFi to retrieve data in bulk from MongoDB in JSON format and convert it to CSV, but the problem is that a separate CSV is generated for each JSON record. How can I merge all the CSVs in NiFi? I have tried the MergeRecord processor, but multiple CSVs are still generated. I am not sure whether all my settings for MergeRecord are valid.
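
What the merge is expected to produce can be sketched in plain Python: one header row followed by the data rows of every fragment. `merge_csv_fragments` is a hypothetical helper; it assumes all fragments share the same header and that no field contains an embedded newline:

```python
def merge_csv_fragments(fragments):
    """Concatenate CSV fragments, keeping the header row only once."""
    merged = []
    header = None
    for fragment in fragments:
        lines = fragment.strip().splitlines()
        if not lines:
            continue  # skip empty fragments
        if header is None:
            header = lines[0]
            merged.append(header)
        elif lines[0] != header:
            raise ValueError("fragments have different headers")
        merged.extend(lines[1:])
    return ("\n".join(merged) + "\n") if merged else ""


if __name__ == "__main__":
    print(merge_csv_fragments(["id,name\n1,a\n", "id,name\n2,b\n"]))
```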
Tags: apache-nifi, csv
Labels: Apache NiFi
03-28-2018
07:47 AM
I am developing a JSON-to-CSV converter job in NiFi, and I have to generate a UUID for each JSON element and add the generated UUID to the flowfile. I could not find a suitable processor for generating UUIDs. My question is: how can I generate a UUID for each incoming flowfile?
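
NiFi already assigns every flowfile a `uuid` core attribute, and the Expression Language provides a `UUID()` function (for example, an UpdateAttribute property set to `${UUID()}`). For a per-element ID inside the content, the equivalent step in a script can be sketched as follows; `add_uuid` and the `uuid` field name are assumptions:

```python
import json
import uuid


def add_uuid(records, field="uuid"):
    """Return copies of the records, each with a freshly generated random UUID."""
    return [dict(record, **{field: str(uuid.uuid4())}) for record in records]


if __name__ == "__main__":
    elements = [{"name": "a"}, {"name": "b"}]
    print(json.dumps(add_uuid(elements), indent=2))
```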
Labels: Apache NiFi
03-23-2018
01:29 AM
1 Kudo
@Rahul Soni Yes Mr. Soni, thanks, it's working. Thanks for your time.
03-21-2018
10:16 AM
I am trying to fetch data from MongoDB using NiFi, and I am using the GetMongo processor. I am trying to write a query inside GetMongo, but I have no idea how to write a query inside this processor. This is my sample query:
db.Person.find({
  $and:[
    {"createdAt":{$ne:null}},
    {"updatedAt":{$ne:null}}
  ]
})
When I write this query, the processor shows an exclamation mark with the message "Query validated against .... is invalid because JSON was expecting a value but find db." Please help me with the correct expression; a good sample would be helpful.
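
The validation message suggests that the Query property expects only the JSON filter document, not the shell call `db.Person.find(...)`; the database and collection are set in their own processor properties. Operators such as `$and` and `$ne` also need to be quoted to form valid JSON. A small Python sketch that builds the filter and prints it as JSON text; `build_not_null_filter` is a hypothetical helper:

```python
import json


def build_not_null_filter(fields):
    """Build a MongoDB filter requiring each listed field to be non-null."""
    return {"$and": [{field: {"$ne": None}} for field in fields]}


if __name__ == "__main__":
    # Prints the filter as strict JSON, with quoted operators and null values.
    print(json.dumps(build_not_null_filter(["createdAt", "updatedAt"])))
```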
Labels: Apache NiFi
03-20-2018
11:24 AM
I want to convert JSON coming from MongoDB to CSV, and I am using the ConvertRecord processor for this purpose. I have configured all the required controller services: I am using the AvroSchemaRegistry controller to validate the JSON and the JsonTreeReader controller to read it, but when I run the job it throws an error saying that the schema was not found. I have two questions. First, what is the most appropriate way to do this job, i.e. converting JSON to CSV? Second, how do I pass the schema into the JsonTreeReader?
This is the ConvertRecord processor setting.
This is my JSON demo file:
{"id":1001, "name":"vivek"}
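
A schema for the demo record can be written as an Avro record with one field per JSON key. The sketch below builds such a schema in Python and checks the demo record against it; the record name `person`, the chosen types, and the `matches_schema` helper are assumptions. In NiFi, the schema text would typically be added to the AvroSchemaRegistry as a property whose name the reader then refers to through its schema name setting.

```python
import json

# Assumed Avro schema for the demo record; the record name is hypothetical.
DEMO_SCHEMA = {
    "type": "record",
    "name": "person",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
    ],
}

# Python types accepted for the two Avro primitive types used here.
PYTHON_TYPES = {"int": int, "string": str}


def matches_schema(record, schema):
    """Check that the record has exactly the schema's fields with matching types."""
    fields = {field["name"]: field["type"] for field in schema["fields"]}
    if set(record) != set(fields):
        return False
    return all(
        isinstance(record[name], PYTHON_TYPES[kind])
        for name, kind in fields.items()
    )


if __name__ == "__main__":
    print(json.dumps(DEMO_SCHEMA, indent=2))
    print(matches_schema(json.loads('{"id":1001, "name":"vivek"}'), DEMO_SCHEMA))
```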
Labels: Apache NiFi