Member since: 07-25-2016
Posts: 55
Kudos Received: 28
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
5778 | 07-26-2016 12:31 AM |
11-11-2016
07:32 PM
Thanks, that was very helpful.
11-11-2016
07:13 PM
1 Kudo
Hi, I am using QueryDatabaseTable (on NiFi 1.0) to fetch rows from a MySQL table. To avoid redundant executions, I have scheduled it to run on the primary node. However, doing so prevents me from scheduling this processor at a fixed time, since there is then no option for it (such as the CRON option, where you could specify a fixed time, say 9 AM every day). I want to schedule it at a fixed time, like 9 AM every day. Is that possible? Is there any workaround? Thanks, Obaid
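For reference, NiFi's CRON-driven scheduling takes a Quartz-style expression with a leading seconds field, so "9 AM every day" would be written as `0 0 9 * * ?`. Whether that strategy can be combined with primary-node execution on NiFi 1.0 is not confirmed here. The Python sketch below only illustrates what such a schedule means; `next_daily_run` is a hypothetical helper, not part of NiFi.

```python
from datetime import datetime, timedelta

# Quartz-style fields: seconds minutes hours day-of-month month day-of-week
DAILY_9AM_CRON = "0 0 9 * * ?"


def next_daily_run(now: datetime, hour: int = 9) -> datetime:
    """Return the next time strictly after `now` that falls on hour:00:00."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# Example: a post written in the evening would next fire the following morning.
next_run = next_daily_run(datetime(2016, 11, 11, 19, 13))
```

A daily schedule like this fires once per day regardless of how long the previous run took, which is the behavior the question asks for.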
Labels: Apache NiFi
11-11-2016
05:29 PM
1 Kudo
Hi All, I tried using ExecuteSQL to select rows from my MySQL table:
CREATE TABLE table_name (
domain_id mediumint(8) UNSIGNED NOT NULL,
....
run_date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
)
I get the following error. It is caused by the domain_id column, i.e. I get the error only if I include domain_id in the select query; otherwise it works just fine:
09:24:25 PST ERROR fa363402-bbc0-1802-ffff-ffff96aa31f4 <host>:9090
ExecuteSQL[id=fa363402-bbc0-1802-ffff-ffff96aa31f4] ExecuteSQL[id=fa363402-bbc0-1802-ffff-ffff96aa31f4] failed to process session due to org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 141419: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 141419
Is this a bug, or am I not using the processor correctly? Is there any workaround for this issue? Thanks, Obaid
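One possible reading of the error, sketched in plain Python rather than with the Avro library: the field is declared as the union ["null","long"], but the value arrives tagged as a 32-bit integer, and a writer that matches union branches strictly by runtime type finds no branch for it. `TypedValue`, `resolve_union`, and `widen_to_long` are hypothetical stand-ins for illustration, not Avro or NiFi APIs, and whether this is what actually happens inside ExecuteSQL is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TypedValue:
    """Hypothetical stand-in for a JVM value that carries its runtime type."""
    type_name: str  # "int" for a 32-bit value, "long" for a 64-bit value
    value: int


def resolve_union(branches: Sequence[str], datum: Optional[TypedValue]) -> int:
    """Return the index of the branch whose name equals the datum's type."""
    name = "null" if datum is None else datum.type_name
    if name not in branches:
        shown = None if datum is None else datum.value
        raise ValueError(f"Not in union {list(branches)}: {shown}")
    return list(branches).index(name)


def widen_to_long(datum: TypedValue) -> TypedValue:
    """Re-tag a 32-bit value as 64-bit, as a cast or conversion step would."""
    return TypedValue("long", datum.value)


schema = ["null", "long"]
domain_id = TypedValue("int", 141419)  # value taken from the error message

# Strict matching rejects the 32-bit value; the widened value is accepted.
branch = resolve_union(schema, widen_to_long(domain_id))
```

If this reading is right, one workaround that may be worth trying is casting the column in the query so the driver returns a 64-bit value, for example `CAST(domain_id AS SIGNED)` in MySQL. That is untested against this NiFi version.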
Labels: Apache NiFi
11-11-2016
12:24 AM
Hi all, We have been using Jenkins for scheduling jobs. It makes it easy to schedule a job (or several jobs, with dependencies between them) and to keep track of each run, e.g. you get an alert if a job fails. For operations teams, Jenkins is therefore an easy platform for managing and tracking jobs. I have the following questions:
1. For scheduling jobs, which is the better tool: Jenkins or NiFi?
2. How can a dataflow be operationalized the way a Jenkins job can? That is, if any individual dataflow fails, the operations team gets an alert, so they have complete visibility into each run.
3. Can I (or should I, i.e. does it sound reasonable to) use Jenkins to launch dataflows on NiFi, just so the operations team has a single UI for tracking all jobs?
4. How can we track the status of each dataflow run on NiFi?
Thanks, Obaid
Labels: Apache NiFi
11-09-2016
04:20 PM
Hi all, I have just started playing around with Apache Ambari. The goal is to install HDF 2.x. I have one question: I have an existing ZooKeeper cluster that is already being used by applications, and I want to use this cluster for HDF. Is there a way to tell Ambari to use an existing ZooKeeper cluster (without having Ambari install anything, since the ZooKeeper cluster is already up and running)? Thanks, Obaid
11-04-2016
07:42 PM
Hi All, I have a CSV file which contains 3 empty lines at the end of the file. Is there a way to remove these from the end of the file? The file has multiple rows, and I don't split the file, so I was wondering whether I could remove the empty lines without splitting it. Thanks, Obaid
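The transformation being asked for can be sketched in plain Python as below: it drops only the blank lines at the very end and leaves everything else, including blank lines in the middle, untouched. `strip_trailing_blank_lines` is a hypothetical helper for illustration. Inside NiFi, something similar might be achievable with a ReplaceText processor evaluating the entire text against a regex, but that is an assumption not tested here.

```python
def strip_trailing_blank_lines(text: str) -> str:
    """Remove empty or whitespace-only lines from the end of the text."""
    lines = text.splitlines(keepends=True)
    # Pop from the end until the last line has visible content.
    while lines and lines[-1].strip() == "":
        lines.pop()
    return "".join(lines)


# Example: two data rows followed by three empty lines.
cleaned = strip_trailing_blank_lines("a,b\n1,2\n\n\n\n")
```

Working on whole lines with their original line endings preserved means the last data row keeps whatever newline style the file already used.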
Labels: Apache NiFi
11-03-2016
01:35 PM
Sure, no problem.
11-03-2016
01:33 PM
Great, thanks for your response. Do you think there is a relationship between cores and RAM, meaning that if you have X cores you should have X+ RAM, etc.? Is there any dependency or good practice? We can think in terms of minimum requirements, assuming we will be running a lot of lightweight flows (batch, scheduled). More cores will let us run more flows, so I am wondering whether 32 GB of RAM will be enough for 20 cores if I go for HDF 2.x.x. If all 20 cores become busy in the future, would RAM be an issue? Thanks
11-01-2016
10:31 PM
Hi All,
- This seems like an obvious question, so forgive me if it is redundant: what hardware configuration would be suitable for setting up HDF 2.x on VMs for an 8-node cluster?
- I found an old document which does help: link
- It seems like NiFi might need more cores than RAM. My current setup of 12 GB and 6 cores per node is not working (note: the master has 6 GB RAM, which seems like a bottleneck).
- After going through the link, I am thinking of the following, but I am not sure if it is optimal: 24 cores, 20 GB RAM, and 250-500 GB disk. Does this seem like an optimal configuration (considering the ratio of cores to RAM)?
To give more context, I currently don't have any specific throughput requirements and am using NiFi for some batch jobs, log processing, etc. However, I do want a stable cluster setup that we could keep using in the future if usage increases. Thanks, Obaid
10-29-2016
06:20 PM
Thanks a lot, @Attila Kanto, for the detailed response. Let me ask another, cost-related question, since cost is an important factor in deciding which technology to use: how would you compare EMR and Cloudbreak (or Hortonworks Data Cloud) in terms of cost? Obaid