Member since: 01-08-2018
Posts: 10
Kudos Received: 0
Solutions: 0
10-31-2018
06:05 AM
hi @Srini K, Here is the simplest process for collecting JMX data with the JMX Prometheus exporter:
1/ download jmx_prometheus_javaagent.jar (highly recommend version 3+) and copy it onto all NiFi nodes, then grant read access to the NiFi user
2/ create a config file, let's name it nifi.yml, containing at least an "export everything" rule like this one:
lowercaseOutputLabelNames: true
lowercaseOutputName: true
rules:
- pattern: ".*"
3/ update NiFi bootstrap.conf in Ambari by adding the line below (replace N by an available argument number and choose any free port), then restart the NiFi nodes:
java.arg.N=-javaagent:/<path_to_jmx_exporter>/jmx_prometheus_javaagent.jar=<port>:/<path_to_jmx_exporter>/nifi.yml
4/ configure the Prometheus server to poll the nodes on the port chosen above
5/ for visualisation I'd recommend Grafana, customising an existing dashboard such as https://grafana.com/dashboards/3066
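For step 4, a minimal Prometheus scrape job could look like the sketch below. The job name, hostnames, and port 9099 are placeholders of mine, not values from the post; adjust them to the port you picked in step 3.

```yaml
# prometheus.yml — hypothetical scrape job for the NiFi nodes
# (job name, hostnames, and port are illustrative assumptions)
scrape_configs:
  - job_name: 'nifi-jmx'
    scrape_interval: 30s
    static_configs:
      - targets:
          - 'nifi-node1:9099'
          - 'nifi-node2:9099'
          - 'nifi-node3:9099'
```

After reloading Prometheus, the NiFi targets should appear on its Targets page and the JMX metrics become queryable for a Grafana dashboard.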
01-15-2018
04:28 PM
Hi, The handiest option for JMX is https://github.com/prometheus/jmx_exporter In bootstrap.conf set:
java.arg.N=-javaagent:/path/to/prometheus/jmx_exporter/jmx_prometheus_javaagent.jar=<chosen_port>:/path/to/prometheus/jmx_exporter/nifi.yml
There is also a dedicated NiFi exporter: https://github.com/msiedlarek/nifi_exporter It can also be useful to push events to a Push Gateway (business view): https://github.com/mkjoerg/nifi-prometheus-reporter Rgds,
01-15-2018
05:50 AM
@PJ This might be due to an IO issue on the JN host ("Remote journal x.x.x.x:8485"). Is it always the same JN which is lagging at failure? If so, you should check the IO load on that machine, using iotop for instance. It can also be the result of a very large volume of transactions. What is the value of dfs.namenode.accesstime.precision?
01-14-2018
04:37 PM
@PJ You have fewer than (nbJournalNodes/2)+1 JournalNodes online. If you have set 3 JournalNodes in your cluster, you need at least 2 working JournalNodes (a majority quorum) to leave safe mode. Explanation of the JournalNode quorum: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
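The quorum arithmetic above can be checked with a quick shell sketch (integer division gives floor(N/2)+1, the majority the QJM needs):

```shell
#!/bin/sh
# Quorum for N JournalNodes is a majority: floor(N/2) + 1.
# With 3 JNs you need 2 up; the cluster tolerates N - quorum failures.
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "JNs=$n quorum=$quorum tolerated_failures=$tolerated"
done
```

So with 3 JournalNodes a single JN can be down, which is why losing two of them keeps the NameNode in safe mode.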
01-14-2018
04:24 PM
@Alvin Jin A basic "export all" config is as follows:
lowercaseOutputLabelNames: true
lowercaseOutputName: true
rules:
- pattern: ".*"
If you want to refine your config and choose which beans to fetch, these blog posts might be useful: http://www.whiteboardcoder.com/2017/04/prometheus-and-jmx.html https://blog.godatadriven.com/hbase-prometheus-monitoring To browse beans I'd recommend using jmxterm https://github.com/jiaqi/jmxterm
su <ownerJVM> -c "$JAVA_HOME/bin/java -jar jmxterm-1.0.0-uber.jar"
$> jvms
<pidJVM> (m) - <JVM command line>
<pidJMXterm> ( ) - jmxterm-1.0.0-uber.jar
$> open <pidJVM>
$> beans
$> info -d <domain> -b <Mbean>
$> get -d <domain> -b <Mbean> <parameters>
$> close
01-14-2018
04:18 PM
@Karan Alang It's probably because the -javaagent option was loaded twice or more during env loading. It happens when a variable is set like this in the env config: KAFKA_OPTS="$KAFKA_OPTS -javaagent:..." or JAVA_OPTS="$JAVA_OPTS -javaagent:..." and the file is sourced twice during startup, so the agent ends up appended twice.
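One way to guard against the double append (a sketch; the agent path and port are placeholders, and the case-pattern guard is my own suggestion, not from the post) is to add the agent only when it is not already in the variable:

```shell
#!/bin/sh
# Append the Prometheus javaagent to KAFKA_OPTS at most once, even if
# this snippet is sourced several times during startup.
AGENT="-javaagent:/path/to/jmx_prometheus_javaagent.jar=7071:/path/to/kafka.yml"

append_agent() {
  case "$KAFKA_OPTS" in
    *"$AGENT"*) : ;;                          # already present: do nothing
    *) KAFKA_OPTS="$KAFKA_OPTS $AGENT" ;;     # first load: append it
  esac
}

# Simulate the env file being sourced twice.
append_agent
append_agent
echo "$KAFKA_OPTS"
```

After both calls the agent appears only once, so the JVM no longer fails with a port-already-bound or duplicate-agent error.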
01-09-2018
11:37 PM
@Vijay Parmar Hi, If you just want an ephemeral table you can use CREATE TEMPORARY EXTERNAL TABLE. Using Spark is also possible once the ORC data is loaded as an RDD:
JSON: rdd.toDF.toJSON.saveAsTextFile()
AVRO: import com.databricks.spark.avro._; rdd.toDF.write.avro()
Here is a nice gist that explains it for Scala: https://gist.github.com/mannharleen/b1f2e60457cb2b08a2f14db40b7ffa0f
Writing JSON in PySpark is write.format('json').save()
Here is the API for spark-avro, available in Scala and Python: https://github.com/databricks/spark-avro
Writing Avro in PySpark is write.format("com.databricks.spark.avro").save()
01-08-2018
10:50 PM
Hi, The simplest way is probably to: create two Hive tables in JSON and AVRO format using the correct SerDe, then INSERT OVERWRITE them from the original ORC table. SerDes: https://github.com/rcongiu/Hive-JSON-Serde https://cwiki.apache.org/confluence/display/Hive/AvroSerDe Also, orc-contents offers a basic ORC to JSON dump, file by file: https://orc.apache.org/docs/tools.html Rgds,
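A sketch of the two-table approach, assuming the rcongiu JSON SerDe linked above is on the Hive classpath; table and column names here are placeholders, not from the original question:

```sql
-- Hypothetical example: copy an ORC table into JSON and Avro tables.
-- my_table_orc / my_table_json / my_table_avro are placeholder names.
CREATE TABLE my_table_json
  ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
  STORED AS TEXTFILE
AS SELECT * FROM my_table_orc LIMIT 0;   -- copy the schema only

CREATE TABLE my_table_avro
  STORED AS AVRO
AS SELECT * FROM my_table_orc LIMIT 0;   -- copy the schema only

-- Then materialise the data in each format.
INSERT OVERWRITE TABLE my_table_json SELECT * FROM my_table_orc;
INSERT OVERWRITE TABLE my_table_avro SELECT * FROM my_table_orc;
```

The LIMIT 0 CTAS trick just clones the column layout; the INSERT OVERWRITE statements then rewrite the ORC rows through each SerDe into the target format.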