Member since: 12-27-2018
Posts: 25
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
| 702 | 01-16-2019 07:21 PM
02-27-2019
05:45 PM
1 Kudo
I receive JSON data from Kafka and parse it with the from_json() method, which expects a schema from me. My JSON structure is like this:
{
  "Items": {
    "key1": [ { "id": "", "name": "", "val": "" } ],
    "key2": [ { "id": "", "name": "", "val": "" } ],
    "key3": [ { "id": "", "name": "", "val": "" } ]
  }
}
key1, key2, and key3 are dynamic, so they may change. For example, another JSON might be:
{
  "Items": {
    "hortoworks": [ { "id": "", "name": "", "val": "" } ],
    "community": [ { "id": "", "name": "", "val": "" } ],
    "question": [ { "id": "", "name": "", "val": "" } ]
  }
}
These key names are unknown in advance, but the "id-name-val" fields inside them are always the same. I must define a JSON schema to read this data from Kafka in Spark Structured Streaming. How can I do this?
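One common approach for dynamic key names is to model the "Items" level as a map from key name to an array of fixed {id, name, val} structs, which Spark's MapType supports. A minimal sketch of the idea below uses plain Python to show the shape; the DDL string is my assumption about the exact field types and would need adjusting to the real data:

```python
import json

# Spark schema for the dynamic "Items" level, written as a DDL string that
# could be passed to from_json(); the STRING field types are assumptions:
#   df.select(from_json(col("value").cast("string"), SCHEMA_DDL).alias("data"))
SCHEMA_DDL = "Items MAP<STRING, ARRAY<STRUCT<id: STRING, name: STRING, val: STRING>>>"

def parse_items(raw: str) -> dict:
    """Parse one message; the dynamic level is just a map of key -> records."""
    return json.loads(raw)["Items"]

msg = '{"Items": {"hortonworks": [{"id": "1", "name": "n", "val": "v"}]}}'
items = parse_items(msg)
# The key names differ per message, but every value is a list of id/name/val records.
```

With a MapType schema in place, functions like map_keys() and explode() can then pull the dynamic key names out as ordinary columns.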
Labels:
- Apache Kafka
- Apache Spark
02-20-2019
04:44 PM
NiFi ships with an HBase 1.1.2 client. Can I use this HBase controller service with an HBase 2.x cluster?
Labels:
- Apache HBase
- Apache NiFi
02-17-2019
08:08 PM
I have disabled Hyper-V. It worked. Thank you so much.
02-16-2019
11:54 AM
I downloaded HDP 3.0.1 and then installed it in VirtualBox (version greater than 5.1), but I am getting this error. Virtualization is enabled in the BIOS, and Hyper-V was installed. What should I do? Note: I run VirtualBox as Administrator.
Labels:
- Hortonworks Data Platform (HDP)
01-31-2019
10:07 PM
@David Miller You are right, but S2S and similar options are not suitable for my scenario, for several reasons.
01-31-2019
06:30 PM
Yes, I am getting each node's address and its status from "rest-api/controller/cluster" to build the HTTP endpoint. After I prepare the URL, I will send the FlowFile to the other active nodes. For example, the result of "rest-api/controller/cluster" is:
{
"cluster": {
"nodes": [
{
"nodeId": "node1Id" ,
"address": "node1Address",
"apiPort": 9999,
"status": "CONNECTED",
"heartbeat": "value",
"connectionRequested": "value",
"roles": [],
"activeThreadCount": 0,
"queued": "value",
"events": [{
"timestamp": "value",
"category": "value",
"message": "value"
}],
"nodeStartTime": "value"
},
{
"nodeId": "node2Id" ,
"address": "node2Address",
"apiPort": 9999,
"status": "CONNECTED",
"heartbeat": "value",
"connectionRequested": "value",
"roles": ["PRIMARY"],
"activeThreadCount": 0,
"queued": "value",
"events": [{
"timestamp": "value",
"category": "value",
"message": "value"
}],
"nodeStartTime": "value"
}
],
"generated": "value"
}
}
I get the node1Address and node2Address values (if their status is CONNECTED) from this JSON. Then I create a URL like this: http://node1Address:9999/contentListener. Next, I will send a POST request from PostHTTP to the ListenHTTP processor (running on each node) using this URL. ListenHTTP on each node will listen on this path and receive the data.
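The URL-building step described above can be sketched like this. This is a minimal sketch, not NiFi code: it assumes the cluster JSON shape shown in the example, and /contentListener is the ListenHTTP base path mentioned in the post:

```python
import json

def listener_urls(cluster_json: str, path: str = "contentListener") -> list:
    """Build one ListenHTTP URL per CONNECTED node from /controller/cluster output."""
    cluster = json.loads(cluster_json)["cluster"]
    return [
        "http://{}:{}/{}".format(n["address"], n["apiPort"], path)
        for n in cluster["nodes"]
        if n["status"] == "CONNECTED"
    ]

sample = json.dumps({"cluster": {"nodes": [
    {"address": "node1Address", "apiPort": 9999, "status": "CONNECTED"},
    {"address": "node2Address", "apiPort": 9999, "status": "DISCONNECTED"},
]}})
urls = listener_urls(sample)  # only CONNECTED nodes are kept
```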
01-30-2019
09:34 PM
@David Miller "But it sounds like you are trying to do something that would be better served by just running a completely independent flow on each node." -- Yeah, that's exactly what I want to do. So I'm reviewing all possible scenarios. Thank you.
01-30-2019
09:10 PM
@Matt Clarke I will look for answers to these questions. Then I'll re-edit this question. Thank you so much.
01-30-2019
09:03 PM
@Matt Clarke Yes, I know that some data can get stuck in connection queues. But I have to use PostHTTP (on the primary node) and ListenHTTP to distribute data across the cluster, because I want to send the same FlowFile to the same node. "Partition by Attribute" could be used for this, and it is great for this scenario. But if any node goes down, some data will be waiting in the unavailable queue, and I don't want that, because I would lose access to the data sitting in that queue. Round robin is also available, but it doesn't send the same data to the same node. So I will use PostHTTP and ListenHTTP to distribute data across the cluster. Thank you.
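The "same FlowFile goes to the same node" requirement is essentially deterministic key-based partitioning, which is what "Partition by Attribute" does internally. A minimal sketch of the idea, with a hypothetical attribute value and node list (not NiFi's actual implementation):

```python
import hashlib

def node_for(attribute_value: str, nodes: list) -> str:
    """Pick a node deterministically from a FlowFile attribute value."""
    digest = hashlib.md5(attribute_value.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

nodes = ["node1", "node2"]
# The same attribute value always maps to the same node:
first = node_for("url1", nodes)
again = node_for("url1", nodes)
```

The trade-off described in the post follows directly: because the mapping is fixed, data keyed to a downed node has nowhere else to go and queues up.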
01-30-2019
06:40 PM
I will send requests to ListenHTTP (running on each node) using PostHTTP (running only on the primary node). For this, I am getting the node addresses from the NiFi REST API (/controller/cluster), and I prepare a URL like this: http://nodeAddress:port/contentListener. Can I successfully send requests to the ListenHTTP processors on each node from PostHTTP using this URL?
Labels:
- Apache NiFi
01-25-2019
01:28 PM
@Matt Clarke Does "secure NiFi" mean a secure NiFi cluster? Each NiFi instance has a nifi.properties file. I will read this file and prepare the URL after retrieving the node IP and port number from it. Then I will pass this URL to InvokeHTTP and get the cluster summary from the REST API. If the primary node goes down and a new primary node is elected, this solution will keep working. Is that correct?
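Reading the host and port out of nifi.properties can be sketched like this. It assumes the standard nifi.web.http.host / nifi.web.http.port property names (for a secured instance the https variants apply instead), and the sample values below are placeholders:

```python
def parse_properties(text: str) -> dict:
    """Parse Java-style key=value properties lines into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

# Placeholder content; a real nifi.properties would be read from disk.
SAMPLE = "nifi.web.http.host=192.0.2.10\nnifi.web.http.port=8080\n"
props = parse_properties(SAMPLE)
base_url = "http://{}:{}/nifi-api".format(
    props.get("nifi.web.http.host") or "localhost",
    props.get("nifi.web.http.port") or "8080",
)
```

Because each node reads its own nifi.properties, the URL stays correct no matter which node is currently primary.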
01-24-2019
07:57 PM
@Matt Clarke Firstly, thank you for your response, Matt. Yes, I know that. But the processor that does the work runs only on the primary node. It sends a GET request to "192.xxx.xx.50:8080". When a new primary node is elected, this address changes (for example, to "192.xxx.xx.68:8080"). On the new primary node, the processor will not be able to send requests to the NiFi API, because the existing address ("192.xxx.xx.50:8080") points to the node that went down. Given this, can I send the request using the "localhost:8080" address?
01-24-2019
05:57 PM
I want to get node details from the NiFi REST API on the primary node. For this, I can use the "192.xxx.xx.xx:8080" address. But what happens if the primary node goes down? After a new primary node is elected, I cannot access the REST API using the above address. Can I use the "localhost:8080" address for this purpose on all NiFi nodes? Does this work?
Labels:
- Apache NiFi
01-16-2019
07:21 PM
I solved the problem. I used this Jolt specification:
[
{
"operation": "shift",
"spec": {
"*": {
"nodeId": "node_ids",
"status": "node_status"
}
}
}
]
01-16-2019
07:21 PM
I have NiFi cluster summary JSON data like this:
[{
"nodeId": "29bed24c-6d73-4652-930b-0065cad4ef66",
"address": "b1518aec7e38",
"apiPort": 8080,
"status": "CONNECTED",
"heartbeat": "01/16/2019 16:24:32 UTC",
"roles": ["Primary Node", "Cluster Coordinator"],
"activeThreadCount": 0,
"queued": "0 / 0 bytes",
"events": [{
"timestamp": "01/16/2019 15:21:50 UTC",
"category": "INFO",
"message": "Received first heartbeat from connecting node. Node connected."
}, {
"timestamp": "01/16/2019 15:21:48 UTC",
"category": "INFO",
"message": "Connection requested from existing node. Setting status to connecting."
}],
"nodeStartTime": "01/16/2019 15:20:44 UTC"
},
{
"nodeId": "22aofpp4-87rf-asf4-930b-0065cad4ef66",
"address": "b67uf98tkl5",
"apiPort": 8080,
"status": "CONNECTED",
"heartbeat": "01/16/2019 16:24:32 UTC",
"roles": [],
"activeThreadCount": 0,
"queued": "0 / 0 bytes",
"events": [{
"timestamp": "01/16/2019 15:21:50 UTC",
"category": "INFO",
"message": "Received first heartbeat from connecting node. Node connected."
}, {
"timestamp": "01/16/2019 15:21:48 UTC",
"category": "INFO",
"message": "Connection requested from existing node. Setting status to connecting."
}],
"nodeStartTime": "01/16/2019 15:20:44 UTC"
}]
I want only the nodeId and status fields from this JSON. The output I want:
{
"nodeIds": ["node1Id", "node2Id"],
"status": ["node1Status", "node2Status"]
}
For example, with the data above the output would be:
{
"nodeIds": ["29bed24c-6d73-4652-930b-0065cad4ef66", "22aofpp4-87rf-asf4-930b-0065cad4ef66"],
"status": ["CONNECTED", "CONNECTED"]
}
But I couldn't write a Jolt specification for this. How can I do it?
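To make the target shape concrete, here is a plain-Python equivalent of the desired transform (a sketch only, not a Jolt answer; the sample entries are abbreviated):

```python
import json

def summarize(cluster_summary: str) -> dict:
    """Collect the nodeId and status of every entry in the summary array."""
    nodes = json.loads(cluster_summary)
    return {
        "nodeIds": [n["nodeId"] for n in nodes],
        "status": [n["status"] for n in nodes],
    }

sample = json.dumps([
    {"nodeId": "node1Id", "status": "CONNECTED", "apiPort": 8080},
    {"nodeId": "node2Id", "status": "CONNECTED", "apiPort": 8080},
])
out = summarize(sample)
```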
Labels:
- Apache NiFi
01-16-2019
05:39 PM
Thank you @Geoffrey Shelton Okot. I don't have any Kubernetes container for NiFi. As far as I know, NiFi does not ship with a Kubernetes YAML file. But I think Hortonworks will provide a Kubernetes container for NiFi.
01-10-2019
05:46 PM
If any node goes down, can NiFi create a new node automatically? The spare node would be ready to use: when a node goes down (fails, disconnects, etc.), I just want to use this new node in place of the failed one. Can NiFi do this automatically? Is this possible? Thank you.
Labels:
- Apache NiFi
01-05-2019
08:45 AM
"Partition by Attribute" was great for me! Thank you Matt!
01-03-2019
02:21 PM
I have a NiFi cluster (two instances). I generate URLs using "GenerateFlowFile", then split them using "SplitText". For example:

-> "GenerateFlowFile" generates this:
url1 url2 url3 url4

-> In "SplitText" (split count 2):
url1 url2
---------
url3 url4

-> I want to send these two FlowFiles to the other NiFi instances in a fixed order. For this, I am using round robin, but I always want the same order. What I want:
url1, url2 ---> node1
url3, url4 ---> node2
url1, url2 ---> node1
url3, url4 ---> node2
url1, url2 ---> node1
url3, url4 ---> node2

-> But the data is being sent this way:
url1, url2 ---> node1
url3, url4 ---> node2
url3, url4 ---> node1
url1, url2 ---> node2

How can I send the same data to the same NiFi node?
Labels:
- Apache NiFi