Member since: 06-20-2016
Posts: 488
Kudos Received: 433
Solutions: 118
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3119 | 08-25-2017 03:09 PM
 | 1974 | 08-22-2017 06:52 PM
 | 3424 | 08-09-2017 01:10 PM
 | 8090 | 08-04-2017 02:34 PM
 | 8129 | 08-01-2017 11:35 AM
12-15-2016
12:42 PM
@NAVEEN KUMAR I would post that as a separate question, since it is different from the one you posed. (You will also get more exposure.)
12-15-2016
12:06 PM
1 Kudo
You should place a ListFile processor in front of your FetchFile. Alternatively, you could use GetFile with these properties:

Input Directory: /a/b/c
File Filter: [^\.].* (or change the regex if you are filtering specific files by filename pattern)

Note that GetFile does not have a configuration option to move the file after processing (creating the flowfile).
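To see what that File Filter pattern actually admits, here is a small standalone sketch (plain Python, not NiFi itself) of how the regex behaves when matched against a whole filename, which is how the property is applied:

```python
import re

# GetFile's "File Filter" property is a regex matched against the filename.
# The pattern [^\.].* accepts any name that does not start with a dot,
# which skips hidden files such as editor swap/temp files.
FILE_FILTER = re.compile(r"[^\.].*")

def accepted(filename: str) -> bool:
    # fullmatch mirrors whole-filename matching
    return FILE_FILTER.fullmatch(filename) is not None

print(accepted("data.csv"))     # True
print(accepted(".data.swp"))    # False
```

Swapping in a pattern like `.*\.csv` would restrict pickup to specific filename suffixes instead.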
12-15-2016
11:35 AM
From looking at your YARN log, it looks like the local directories for YARN data are full. See this post for cleaning this up: https://community.hortonworks.com/questions/35751/manage-yarn-local-log-dirs-space.html

2016-12-15 15:50:10,986 WARN nodemanager.DirectoryCollection (DirectoryCollection.java:checkDirs(248)) - Directory /hadoop/yarn/local error, used space above threshold of 90.0%, removing from list of valid directories
2016-12-15 15:50:10,987 WARN nodemanager.DirectoryCollection (DirectoryCollection.java:checkDirs(248)) - Directory /hadoop/yarn/log error, used space above threshold of 90.0%, removing from list of valid directories
2016-12-15 15:50:10,989 ERROR nodemanager.LocalDirsHandlerService (LocalDirsHandlerService.java:updateDirsAfterTest(356)) - Most of the disks failed. 1/1 local-dirs are bad: /hadoop/yarn/local; 1/1 log-dirs are bad: /hadoop/yarn/log
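The check the NodeManager is applying in those log lines boils down to filesystem utilization versus a configured threshold (the `yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage` property, 90.0 by default). A minimal sketch of that check, using Python's standard library rather than YARN's own code:

```python
import shutil

# Sketch of the NodeManager disk health check: a directory is marked bad
# once used space on its filesystem crosses the utilization threshold
# (default 90.0%). A bad dir is removed from the list of valid directories,
# and the node stops accepting containers when most dirs go bad.
def dir_is_healthy(path: str, max_used_pct: float = 90.0) -> bool:
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total
    return used_pct < max_used_pct

print(dir_is_healthy("/"))  # True while the root filesystem is under 90% used
```

Once old application logs and local dirs are cleaned up (per the linked post), the directories pass this check again on the next health-check cycle.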
12-12-2016
08:05 PM
Hope you give it a go with the free 5 day trial -- looking forward to seeing how it goes.
12-12-2016
08:01 PM
1 Kudo
The first 3 Cluster Configurations ("bundles") shown below include SparkR as part of Spark 1.6 and Spark 2.0: https://spark.apache.org/docs/1.6.0/sparkr.html
12-12-2016
07:42 PM
There is no out-of-the-box UDF to do this in Hive. You need to build it yourself using Hive's map function. Examples:

http://stackoverflow.com/questions/23025380/how-to-transpose-pivot-data-in-hive
http://stackoverflow.com/questions/37436710/is-there-a-way-to-transpose-data-in-hive
http://hadoopmania.blogspot.com/2015/12/transposepivot-table-in-hive.html
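The core transformation those links implement in Hive (collect key/value rows into a map per id, then project map entries out as columns) can be sketched in plain Python; the row shape and field names here are illustrative, not taken from the question:

```python
from collections import defaultdict

# Long-format rows: (id, key, value) -- the shape you would pivot in Hive
# by building a map per id and then selecting m['key'] per key as a column.
rows = [
    ("u1", "city", "Austin"),
    ("u1", "age", "34"),
    ("u2", "city", "Boston"),
    ("u2", "age", "29"),
]

def pivot(rows):
    # Group values by id into one dict per id (the "map" step),
    # which a final SELECT would flatten into wide columns.
    out = defaultdict(dict)
    for rid, key, value in rows:
        out[rid][key] = value
    return dict(out)

print(pivot(rows)["u1"])  # {'city': 'Austin', 'age': '34'}
```

In Hive the grouping step is a GROUP BY plus a map-building UDF, and the projection step is one `m['key']` expression per output column.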
12-12-2016
02:41 PM
1 Kudo
You can use NiFi's native reporting tasks to monitor this: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Reporting_Tasks For your needs, there is a memory monitor which checks the amount of Java heap used in a particular JVM memory pool. If the amount of space used exceeds a configurable threshold, it will warn (via a log message and a System-Level Bulletin) that the memory pool is exceeding this threshold. See this post for more on NiFi reporting: https://community.hortonworks.com/questions/69004/nifi-monitoring-processor-and-nifi-service.html#answer-69684
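The check behind that memory-monitor reporting task reduces to comparing a pool's used bytes against a configured threshold, which can be expressed as an absolute size or as a percentage of the pool. A hedged sketch of that comparison (plain Python; the real task parses size units like "512 MB" and runs on a schedule):

```python
# Sketch of a memory-pool threshold check: warn when used space in a JVM
# memory pool crosses a threshold given either as a percentage of pool
# capacity ("65%") or as an absolute byte count ("650").
def exceeds_threshold(used: int, capacity: int, threshold: str) -> bool:
    threshold = threshold.strip()
    if threshold.endswith("%"):
        limit = capacity * float(threshold[:-1]) / 100.0
    else:
        # assume plain bytes here for simplicity
        limit = float(threshold)
    return used > limit

print(exceeds_threshold(700, 1000, "65%"))  # True  -> log warning + bulletin
print(exceeds_threshold(600, 1000, "65%"))  # False -> stay quiet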
12-12-2016
02:33 PM
3 Kudos
JVM heap usage is measured by these three Ambari service metrics:

NameNode Heap (HDFS): the percentage of NameNode JVM heap used.
ResourceManager Heap (YARN): the percentage of ResourceManager JVM heap used.
HBase Master Heap (HBase): the percentage of HBase Master JVM heap used.

You can add these native widgets to the Ambari dashboard if you do not see them. See section 2.1.2 in: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Users_Guide/bk_Ambari_Users_Guide-20160509.pdf Alternatively, you can leverage Ambari 2.2's new Grafana dashboarding capabilities to create much more granular and customized dashboards and reporting components from Ambari service metrics:

http://hortonworks.com/blog/hood-ambari-metrics-grafana/
https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Users_Guide/content/_using_grafana.html
https://community.hortonworks.com/articles/2558/how-to-use-grafana-to-visualize-metrics-exposed-by.html
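The percentage those heap widgets display is simply used heap over maximum heap, the same used/max pair the JVM reports for heap memory. A one-function sketch (the example figures are illustrative, not from a real cluster):

```python
# The Ambari heap widgets report used heap as a percentage of max heap.
def heap_used_pct(used_bytes: int, max_bytes: int) -> float:
    return 100.0 * used_bytes / max_bytes

# e.g. a NameNode with a 1 GB max heap and 412 MB currently used:
print(round(heap_used_pct(412 * 1024**2, 1024**3), 1))  # 40.2
```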
12-12-2016
01:13 PM
Yes. (They are installed under the covers when installing Atlas service).
12-10-2016
03:18 PM
1 Kudo
You can be 100% sure that one forked flowfile will not affect another. When a flowfile is passed from one processor to another, the upstream processor passes a reference (to the flowfile's content in the content repository) to the second processor. When one processor forks the same flowfile to two different processors, the flowfile is CLONED: a reference to one clone is passed to one processor, and a reference to the other clone is passed to the second processor. Note that viewing the provenance of your live flow shows these reference-clone details. This document explains the flowfile life cycle, including pass-by-reference: https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#pass-by-reference
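A toy model of that pattern (plain Python dicts standing in for NiFi's internals, so names like `claim` are illustrative): the flowfile record holds a reference into a content repository, and forking clones the record so per-branch attribute changes stay isolated while the immutable content is shared.

```python
import copy

# Toy model: a flowfile is a small record holding a *reference* (a claim id)
# into the content repository; content itself is immutable and shared.
content_repo = {"claim-1": b"original payload"}

flowfile = {"uuid": "ff-1", "claim": "claim-1", "attributes": {"path": "/in"}}

def fork(ff):
    # Each downstream connection gets its own clone of the flowfile record,
    # so attribute changes on one branch cannot leak into the other.
    return copy.deepcopy(ff), copy.deepcopy(ff)

branch_a, branch_b = fork(flowfile)
branch_a["attributes"]["path"] = "/branch-a"   # does not touch branch_b

print(branch_b["attributes"]["path"])  # /in
# Both clones still point at the same immutable content:
print(branch_a["claim"] == branch_b["claim"])  # True
```

This mirrors why the forked branches are safe: the records diverge, the content reference does not need to be copied because it is never mutated in place.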