Member since: 01-08-2018
Posts: 10
Kudos Received: 0
Solutions: 0
10-31-2018
06:05 AM
hi @Srini K, Here is the simplest process for collecting JMX data with the JMX Prometheus exporter:
1/ download jmx_prometheus_javaagent.jar (highly recommend version 3+) and copy it onto all NiFi nodes, then grant read access to the NiFi user
2/ create a config file, let's name it nifi.yml, containing at least an "export everything" rule like this one:
lowercaseOutputLabelNames: true
lowercaseOutputName: true
rules:
- pattern: ".*"
3/ update NiFi bootstrap.conf in Ambari by adding the line below (replace N by an available argument number and choose any free port), then restart the NiFi nodes:
java.arg.N=-javaagent:/<path_to_jmx_exporter>/jmx_prometheus_javaagent.jar=<port>:/<path_to_jmx_exporter>/nifi.yml
4/ configure the Prometheus server to poll the nodes on the port chosen above
5/ for visualisation I'd recommend Grafana, customising an existing dashboard such as https://grafana.com/dashboards/3066
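For step 4, a minimal Prometheus scrape job could look like the sketch below. The job name, hostnames, and port 9099 are placeholders of mine, not values from the post; adjust them to the port you picked in step 3.

```yaml
# prometheus.yml — hypothetical scrape job for the NiFi nodes
# (job name, hostnames, and port are illustrative assumptions)
scrape_configs:
  - job_name: 'nifi-jmx'
    scrape_interval: 30s
    static_configs:
      - targets:
          - 'nifi-node1:9099'
          - 'nifi-node2:9099'
          - 'nifi-node3:9099'
```

After reloading Prometheus, the NiFi targets should appear on its Targets page and the JMX metrics become queryable for a Grafana dashboard.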
01-15-2018
04:28 PM
Hi, The handiest option for JMX is https://github.com/prometheus/jmx_exporter In bootstrap.conf set:
java.arg.N=-javaagent:/path/to/prometheus/jmx_exporter/jmx_prometheus_javaagent.jar=<chosen_port>:/path/to/prometheus/jmx_exporter/nifi.yml
There is also a dedicated NiFi exporter: https://github.com/msiedlarek/nifi_exporter It can also be useful to push events to a Push Gateway (business view): https://github.com/mkjoerg/nifi-prometheus-reporter Rgds,
01-15-2018
05:50 AM
@PJ This might be due to an IO issue on the JN host ("Remote journal x.x.x.x:8485"). Is it always the same JN which is lagging at failure? If so, you should check the IO load on that machine, using iotop for instance. It can also be the result of a very large volume of transactions. What is the value of dfs.namenode.accesstime.precision?
01-14-2018
04:37 PM
@PJ You have fewer than (nbJournalNodes/2)+1 JournalNodes online. If you have set 3 JournalNodes in your cluster, you need at least 2 working JournalNodes (a majority quorum) to leave safe mode. Explanation of the JournalNode quorum: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
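The quorum arithmetic above can be checked with a quick shell sketch (integer division gives floor(N/2)+1, the majority the QJM needs):

```shell
#!/bin/sh
# Quorum for N JournalNodes is a majority: floor(N/2) + 1.
# With 3 JNs you need 2 up; the cluster tolerates N - quorum failures.
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "JNs=$n quorum=$quorum tolerated_failures=$tolerated"
done
```

So with 3 JournalNodes a single JN can be down, which is why losing two of them keeps the NameNode in safe mode.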
01-14-2018
04:24 PM
@Alvin Jin A basic "export all" config is as follows:
lowercaseOutputLabelNames: true
lowercaseOutputName: true
rules:
- pattern: ".*"
If you want to refine your config and choose which beans to fetch, these blog posts might be useful: http://www.whiteboardcoder.com/2017/04/prometheus-and-jmx.html https://blog.godatadriven.com/hbase-prometheus-monitoring To browse beans I'd recommend using jmxterm https://github.com/jiaqi/jmxterm
su <ownerJVM> -c "$JAVA_HOME/bin/java -jar jmxterm-1.0.0-uber.jar"
$> jvms
<pidJVM> (m) - <JVM command line>
<pidJMXterm> ( ) - jmxterm-1.0.0-uber.jar
$> open <pidJVM>
$> beans
$> info -d <domain> -b <Mbean>
$> get -d <domain> -b <Mbean> <parameters>
$> close
01-14-2018
04:18 PM
@Karan Alang It's probably because the -javaagent option was loaded twice or more during env loading. It happens when a variable is set like this in the env config: KAFKA_OPTS="$KAFKA_OPTS -javaagent:..." or JAVA_OPTS="$JAVA_OPTS -javaagent:..." and the file is sourced twice during startup, so the agent ends up appended twice.
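One way to guard against the double append (a sketch; the agent path and port are placeholders, and the case-pattern guard is my own suggestion, not from the post) is to add the agent only when it is not already in the variable:

```shell
#!/bin/sh
# Append the Prometheus javaagent to KAFKA_OPTS at most once, even if
# this snippet is sourced several times during startup.
AGENT="-javaagent:/path/to/jmx_prometheus_javaagent.jar=7071:/path/to/kafka.yml"

append_agent() {
  case "$KAFKA_OPTS" in
    *"$AGENT"*) : ;;                          # already present: do nothing
    *) KAFKA_OPTS="$KAFKA_OPTS $AGENT" ;;     # first load: append it
  esac
}

# Simulate the env file being sourced twice.
append_agent
append_agent
echo "$KAFKA_OPTS"
```

After both calls the agent appears only once, so the JVM no longer fails with a port-already-bound or duplicate-agent error.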
01-09-2018
11:37 PM
@Vijay Parmar Hi, If you just want an ephemeral table you can use CREATE TEMPORARY EXTERNAL TABLE. Using Spark is also possible once the ORC data is loaded as an RDD:
JSON: rdd.toDF.toJSON.saveAsTextFile()
AVRO: import com.databricks.spark.avro._; rdd.toDF.write.avro()
Here is a nice gist that explains it for Scala: https://gist.github.com/mannharleen/b1f2e60457cb2b08a2f14db40b7ffa0f
Writing JSON in PySpark is write.format('json').save()
Here is the API for spark-avro, available in Scala and Python: https://github.com/databricks/spark-avro
Writing Avro in PySpark is write.format("com.databricks.spark.avro").save()
01-08-2018
10:50 PM
Hi, The simplest way is probably to: create two Hive tables in JSON and AVRO format using the correct SerDe, then INSERT OVERWRITE them from the original ORC table. SerDes: https://github.com/rcongiu/Hive-JSON-Serde https://cwiki.apache.org/confluence/display/Hive/AvroSerDe Also, orc-contents offers a basic ORC to JSON dump, file by file: https://orc.apache.org/docs/tools.html Rgds,
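A sketch of the two-table approach, assuming the rcongiu JSON SerDe linked above is on the Hive classpath; table and column names here are placeholders, not from the original question:

```sql
-- Hypothetical example: copy an ORC table into JSON and Avro tables.
-- my_table_orc / my_table_json / my_table_avro are placeholder names.
CREATE TABLE my_table_json
  ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
  STORED AS TEXTFILE
AS SELECT * FROM my_table_orc LIMIT 0;   -- copy the schema only

CREATE TABLE my_table_avro
  STORED AS AVRO
AS SELECT * FROM my_table_orc LIMIT 0;   -- copy the schema only

-- Then materialise the data in each format.
INSERT OVERWRITE TABLE my_table_json SELECT * FROM my_table_orc;
INSERT OVERWRITE TABLE my_table_avro SELECT * FROM my_table_orc;
```

The LIMIT 0 CTAS trick just clones the column layout; the INSERT OVERWRITE statements then rewrite the ORC rows through each SerDe into the target format.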