Member since: 02-26-2017
Posts: 19
Kudos Received: 0
Solutions: 0
02-08-2018
06:55 AM
Have you installed NiFi previously on the same machine, or are you trying to upgrade?
02-08-2018
06:50 AM
Try using the ExecuteScript processor with Python to replicate the jobs you want.
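As a minimal sketch of that idea (the transformation itself is a hypothetical example, not from the thread): keep the job logic in a plain Python function so it can be tested outside NiFi, then wire it into ExecuteScript's Jython engine.

```python
import json

def transform(text):
    """Illustrative job logic: parse a JSON record and add a derived field."""
    record = json.loads(text)
    record["name_upper"] = record.get("name", "").upper()
    return json.dumps(record)

# Inside NiFi's ExecuteScript (Jython engine) this function would be attached
# to the flowfile roughly like this -- session/flowFile/REL_SUCCESS are
# NiFi-provided objects, shown here for orientation only:
#
#   from org.apache.nifi.processor.io import StreamCallback
#   class Callback(StreamCallback):
#       def process(self, inputStream, outputStream):
#           text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
#           outputStream.write(transform(text).encode('utf-8'))
#   flowFile = session.write(flowFile, Callback())
#   session.transfer(flowFile, REL_SUCCESS)
```

Keeping `transform()` free of NiFi classes makes the replicated job logic easy to verify before deploying the script.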
02-08-2018
06:43 AM
Please post the processor's properties. Also check whether the znodes are properly configured with the NiFi server.
08-11-2017
09:44 AM
To add more description: a web service is exposed that provides JSON data, which needs to be streamed into HDFS via Flume. Can anyone provide information on the tools and technologies available to achieve this? @Artem Ervits @Neeraj Sabharwal @Josh Elser Flume is one of the options we considered (HTTP Source with JSONHandler). Do you think this would be a viable option, or is Flume the wrong choice?
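For reference, a minimal Flume agent sketch of the HTTP Source + JSONHandler idea (agent, channel, port, and path names here are placeholders and would need to match the actual cluster):

```properties
# HTTP source receiving JSON; Flume's JSONHandler expects its event-list JSON format
a1.sources = http-src
a1.channels = mem-ch
a1.sinks = hdfs-sink

a1.sources.http-src.type = http
a1.sources.http-src.port = 44444
a1.sources.http-src.handler = org.apache.flume.source.http.JSONHandler
a1.sources.http-src.channels = mem-ch

a1.channels.mem-ch.type = memory
a1.channels.mem-ch.capacity = 10000

a1.sinks.hdfs-sink.type = hdfs
a1.sinks.hdfs-sink.channel = mem-ch
a1.sinks.hdfs-sink.hdfs.path = /data/ingest/%Y-%m-%d
a1.sinks.hdfs-sink.hdfs.fileType = DataStream
a1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
```

One caveat worth noting: Flume's HTTP Source is push-based — the web service (or an intermediate poller) has to POST events to Flume; Flume will not pull from the service on its own.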
08-10-2017
12:54 PM
Thanks @Hellmar Becker. I'm a fan of NiFi and have used it before. For this particular piece of work, though, I'm supposed to use Flume or another solution with a very small footprint. Thanks for answering.
08-09-2017
12:47 PM
I need to pull data exposed by a web service. I'm planning to use Flume to get the data from the web service and store it in HDFS, from where it will be loaded into Hive tables. I need to know: 1) Is Flume a good option for this? 2) Is there an alternative approach that is better than Flume? Data from the web service and another source (Oracle) needs to be merged and loaded into Hive. Your views and ideas are welcome.
Labels:
- Apache Flume
05-15-2017
06:30 PM
Hi, I have a requirement to handle 10k files in NiFi in parallel. What's the best way to handle this scenario? For example, if I use the GetFile processor pointed at a loading directory, will it take the 10k files one by one, or will they be handled in parallel? Is there a property or configuration so that the 10k files are loaded in parallel? It doesn't necessarily have to be GetFile; ListFile/FetchFile would also do. Another question: does NiFi create a single JVM instance to handle all the flowfiles being generated, or are multiple JVMs created for different processes?
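On both questions: NiFi runs all processors of an instance inside a single JVM per node, and per-processor parallelism is controlled by the "Concurrent Tasks" setting on the Scheduling tab (e.g. ListFile enumerates the 10k files quickly, then FetchFile drains the queue with several concurrent tasks). The behaviour — one process, a bounded worker pool draining a listing — can be sketched in plain Python (file names, the per-file work, and the worker count are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def process_file(name):
    # Stand-in for FetchFile + downstream processing of one file.
    return name.upper()

def process_all(names, workers=8):
    # A fixed-size pool, similar in spirit to a processor's "Concurrent Tasks":
    # many queued items, a bounded number handled at once, all inside one
    # process (as NiFi handles all flowfiles inside one JVM per node).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_file, names))
```

So the 10k files are not handled by 10k threads or JVMs at once; throughput comes from the queue being drained by a bounded set of concurrent tasks.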
Labels:
- Apache NiFi
03-20-2017
12:51 PM
@Josh Elser FetchHBaseRow configuration: HBase Client Service: HBase_1_1_2_ClientService, Table Name: hbase_table, Row Identifier: id, Columns: cf1, Destination: flowfile-content. Configured "HBase_1_1_2_ClientService" with the Kerberos principal and keytab.
03-20-2017
12:32 PM
@Bryan Bende @Josh Elser Thanks. Yes, as you said, I initially tried GetHBase and learned it isn't built for that purpose. I then tried FetchHBaseRow, but it fails saying the 'scan' method is not found. Now I'm trying Python. The route is HandleHttpRequest -> ExecuteStreamCommand -> HandleHttpResponse, where ExecuteStreamCommand runs a Python script that reads the request data, parses it, and queries HBase to retrieve the values. But Python doesn't have a built-in library to fetch HBase data. Is there any library available to query HBase from Python apart from HappyBase? I tried the plain HBase REST interface, but since HBase is Kerberos-enabled, it doesn't allow connections from external systems. Any inputs are highly appreciated.
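A minimal HappyBase-style sketch of such a script (the host, table, and row key below are placeholders; note HappyBase goes through the HBase Thrift gateway, and with Kerberos the Thrift server must be configured for SASL — verify support against your environment). The byte-decoding helper is plain Python and independent of HBase:

```python
def decode_row(raw_row):
    """Convert HBase's {b'cf:qualifier': b'value'} dict into plain strings,
    dropping the column-family prefix."""
    return {k.decode("utf-8").split(":", 1)[1]: v.decode("utf-8")
            for k, v in raw_row.items()}

# Hedged usage against a real cluster (requires a running HBase Thrift server):
#
#   import happybase
#   conn = happybase.Connection("thrift-host.example.com")  # hypothetical host
#   table = conn.table("hbase_table")
#   print(decode_row(table.row(b"id-123")))
```

Separating the decoding from the connection keeps the ExecuteStreamCommand script testable without a cluster.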
03-14-2017
08:16 AM
@Artem Ervits Yes, I tried kinit and then tried to execute the jobs; I'm still getting the same error.
export OOZIE_URL=http://rdctl01.sn1:11000/oozie
oozie job -run -config job.properties -verbose -debug
03-12-2017
10:10 AM
I'm using Oozie 4.2 with Kerberos. While submitting jobs, it throws the error below: Error: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, Authentication failed, status: 500, message: Internal Server Error. The Oozie server has started and shows the proper status in the Ambari dashboard. I'm able to run the Hive jobs individually; only while submitting jobs through Oozie do I get the authentication error. Can you please help with this?
Labels:
- Apache Oozie
03-04-2017
06:35 AM
I'm trying to set up a web service using the HandleHttpRequest -> InvokeHTTP -> FetchHBaseRow -> HandleHttpResponse processors. While starting the NiFi process group, I get this error: [Timer-Driven Process Thread-6] o.a.n.p.standard.HandleHttpRequest
org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server. Any ideas on the cause? Please let me know if you need any more info.
Labels:
- Apache NiFi
02-28-2017
11:23 AM
@Josh Elser
The web service must be able to respond to calls from outside the Hadoop cluster. The original idea was to develop a Java wrapper that reads the external request, queries HBase, and is deployed as a web service on a web server; but that requires a web server to deploy and run it. There is no rule that we must use NiFi, but since the capability is there, we can make the best use of it.
For GetHBase, the HBase client service is created and connected. The response has to be the HBase table data, so no insert/update will happen in HBase. It seems that GetHBase is intended only for pulling data into the flow:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.hbase.GetHBase/
I have yet to try the other options. If you have come across any similar scenarios and can share the processor connections, that would be great.
02-28-2017
11:07 AM
@mqureshi Yeah, sort of joining two Hive tables, but with null values since it's a full outer join.
02-27-2017
06:41 PM
I have an HBase table which needs to be exposed as a web service using NiFi. My intended flow is HandleHttpRequest -> ExtractText -> (GetHBase/FetchHBase) -> HandleHttpResponse, but I'm unable to connect the processors from request -> HBase -> response at all. Is there anything I'm missing here? Is it possible to query an HBase table with a request URL parameter and send the output data in JSON format to an external system via a REST API? Is there any limitation (content/size) when using NiFi to send content via REST?
Labels:
- Apache HBase
- Apache NiFi
02-27-2017
06:33 PM
Scenario: two Hive tables, created and loaded with data from entirely different systems. I need to merge the two tables into a single table by performing a full outer join, which will obviously contain null values. The merged table was planned to have a partition column, but with nulls in the picture we are thinking of removing it. Now the real question: how do we integrate this merged Hive table with HBase? It has nulls, and HBase doesn't understand null, so null data can't be retrieved (since it's a full join, even the primary key column contains nulls). How can this Hive table be integrated with an HBase table? All columns and rows must be preserved, and rows with null values must not be filtered out. Is there any way to achieve this, or should we be thinking of some other alternative? Any out-of-the-box alternatives are most welcome.
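One common workaround (a sketch of the general technique, not something from this thread) is to build a surrogate, null-safe composite row key before loading into HBase, replacing NULL key parts with an explicit sentinel so no row is dropped and no two keys collide. The same idea in plain Python (the sentinel and separator are arbitrary choices):

```python
NULL_TOKEN = "\\N"  # sentinel standing in for SQL NULL; any reserved marker works

def make_row_key(*key_parts, sep="|"):
    """Build a null-safe composite row key: NULL parts become an explicit
    token, so rows produced by the full outer join are all representable."""
    return sep.join(NULL_TOKEN if p is None else str(p) for p in key_parts)
```

In Hive the equivalent would be something like `CONCAT_WS('|', COALESCE(CAST(k1 AS STRING), '\\N'), COALESCE(CAST(k2 AS STRING), '\\N'))` computed in the join query, which sidesteps HBase's inability to store a null row key.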
Labels:
- Apache HBase
- Apache Hive
02-27-2017
06:19 PM
@dsun Thanks. ExecuteStreamCommand is the processor we are going to use. NiFi's role will be limited to creating a data mart from all the sources, and Oozie will take care of the rest of the flow.
02-27-2017
06:14 PM
@Sunile Manjee
Thanks for the recommendations. I'm planning to use NiFi for data ingestion alone and the rest of the flow using Oozie. We are building a system similar to ETL (staging and refine layers), so multiple Hive tables will be created and merged, and the resulting data will be sent to external systems. The plan is NiFi for data ingestion and Oozie for workflow scheduling. Coming to the common processing logic, correct me if I'm wrong: say PutEmail sits in one separate process group, and the entire flow links to that group whenever a failure occurs.
02-26-2017
01:32 PM
I'm pretty new to NiFi. The solution we have designed uses NiFi for data ingestion and Oozie for scheduling. Using Oozie, Hive tables are loaded and merged, and a Hive-to-Hive schema copy is performed. 1) Is it possible/recommended to achieve all of this with NiFi alone? 2) What areas need to be taken care of with this approach? 3) The flow developed in NiFi has many common processors (UpdateAttribute, PutEmail, etc.). Is there a way to set up common processors as a separate template and reuse them wherever necessary? Note: the data flow is scheduled to run once a day.
Labels:
- Apache NiFi
- Apache Oozie