About rajak

MattWho · ‎05-29-2018

@Raja K

There are two Java processes associated with a running NiFi instance. The first JVM process is the NiFi bootstrap. It is what you are interacting with when you issue the "start, stop, restart, dump, etc..." commands. During the startup phase, the bootstrap starts the second JVM process, which is the main NiFi process.

A lot goes on during startup of the main NiFi process. For example:

- NiFi must unpack all the nars in the NiFi lib directory to the NiFi work directory and load the relevant code into memory.
- NiFi must join the cluster. If the entire cluster has just been started/restarted, NiFi goes into an election phase (the cluster needs to elect a cluster coordinator, a primary node, and a cluster flow). The election will not occur until all nodes have joined (based on nifi.cluster.flow.election.max.candidates=) or until the election wait time expires (nifi.cluster.flow.election.max.wait.time=5 mins). If max.candidates is not set, you will always have a minimum 5 minute latency here.
- Nodes joining the cluster must then compare their local flow with the cluster flow to make sure they match exactly.
- Once the node successfully passes these steps, it must start loading FlowFiles back into the flow (these are FlowFiles that were still queued in connections when NiFi was last stopped/restarted).
- Processor components are then scheduled to run based on the cluster flow processor state (all processors that are in a running state are started, and primary-node-only processors are started on the elected primary node).
- Only then is the cluster as a whole in a state where users should be given access to the UI for interactive purposes. This does not mean that the flows were not already running before the UI was actually available to the user.

There are other things happening as well, but those are the biggest pieces. Tailing nifi-app.log and watching for "The UI is available at the following URLs:" is the easiest way to see when the UI will be available to users.
Those URLs are based on the following properties from the nifi.properties file:

nifi.web.http.host= <-- unsecured NiFi
nifi.web.https.host= <-- secured NiFi

If they are left blank, NiFi will bind to all available interfaces that the JVM finds on the host machine (as your URLs show). There is no redirection going on here.

Hope this helps explain what is occurring during startup. The size of your dataflow(s), the election process, the number of queued FlowFiles being loaded back into JVM memory (queued connections), etc. will all have some impact on startup time. Your output above only has the timestamp for when the UI became available (2018-05-29 06:58:05,832), so it is not clear how much latency you are really seeing between the actual start command and that timestamp. The first entry in nifi-app.log should include (org.apache.nifi.NiFi Launching NiFi...).

Thank you,
Matt
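To put a number on that latency, the two log entries mentioned above can be compared directly. The following is only a minimal Python sketch, not something from NiFi itself: it assumes every relevant nifi-app.log line starts with a timestamp in the same format as the one quoted above (2018-05-29 06:58:05,832), and the function names are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Marker strings quoted in the post above.
LAUNCH_MARKER = "Launching NiFi"
UI_MARKER = "The UI is available at the following URLs:"

# Assumed timestamp layout, e.g. "2018-05-29 06:58:05,832" (23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23


def parse_timestamp(line: str) -> Optional[datetime]:
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def startup_latency(lines: Iterable[str]) -> Optional[timedelta]:
    """Time between the first launch entry and the next 'UI is available' entry.

    Returns None if either entry is missing.
    """
    launched = None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # continuation lines (stack traces, URLs) carry no timestamp
        if launched is None and LAUNCH_MARKER in line:
            launched = ts
        elif launched is not None and UI_MARKER in line:
            return ts - launched
    return None
```

Passing an open file handle for nifi-app.log to startup_latency() would return the elapsed time for the first startup recorded in that file. If the log covers several restarts, only the first one is measured.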

MMG · ‎03-16-2018

Also, I ran Flume in debug mode:

[cloudera@quickstart flume-ng]$ ./bin/flume-ng agent --conf conf -conf-file conf/flumekafka.conf --name agent1 -Dflume.root.logger=DEBUG,console

I am getting the below snippet repeatedly, with the 'Updated cluster metadata version' value changing:

2018-03-16 12:01:51,797 (lifecycleSupervisor-1-4) [DEBUG - org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendGroupCoordinatorRequest(AbstractCoordinator.java:470)] Sending coordinator request for group flume to broker quickstart.cloudera:9092 (id: 33)
2018-03-16 12:01:51,801 (lifecycleSupervisor-1-4) [DEBUG - org.apache.kafka.clients.consumer.internals.AbstractCoordinator.handleGroupMetadataResponse(AbstractCoordinator.java:483)] Received group coordinator response ClientResponse(receivedTimeMs=1521226911801, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@10f78e4b, request=RequestSend(header={api_key=10,api_version=0,correlation_id=15,client_id=consumer-1}, body={group_id=flume}), createdTimeMs=1521226911798, sendTimeMs=1521226911798), responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
2018-03-16 12:01:51,879 (lifecycleSupervisor-1-4) [DEBUG - org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:627)] Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=16,client_id=consumer-1}, body={topics=[Airports]}), isInitiatedByNetworkClient, createdTimeMs=1521226911878, sendTimeMs=0) to node 33
2018-03-16 12:01:51,891 (lifecycleSupervisor-1-4) [DEBUG - org.apache.kafka.clients.Metadata.update(Metadata.java:180)] Updated cluster metadata version 10 to Cluster(nodes = [quickstart.cloudera:9092 (id: 33)], partitions = [Partition(topic = Airports, partition = 6, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 10, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 1, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 11, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 13, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 5, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 7, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 12, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 8, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 17, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 16, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 4, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 3, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 14, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 2, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 15, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 19, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 18, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 0, leader = 33, replicas = [33,], isr = [33,], Partition(topic = Airports, partition = 9, leader = 33, replicas = [33,], isr = [33,]]) S

rajak · ‎10-22-2017

If I have 50 users, and 10 of them are updating and inserting into the same base table simultaneously while the other 40 are just querying simultaneously, will there be any locking? How does concurrency work in this case? I do not want to turn on the transactional or ACID features. Please let me know. Thanks, Raja

Last Visited	‎03-21-2018 02:20 PM

Member Since	‎10-22-2017 09:29 AM
Posts	7
Kudos received	1

Cloudera Community

Re: Nifi -taking so much time(misleading) to redir...

Re: Flume ingestion error ( need solution)

Re: Hive view definition update