About matt123

matt123 · ‎06-17-2017

${filename}.${now():toNumber()}

I am a newbie when it comes to NiFi. My cluster is managed by Cloudera Manager. I am trying to do a simple POC for checking duplicate files. I would like my file to be written to HDFS as filename_now_date_timestamp (HH:MM:SS). I am trying the expression above but am unable to achieve this. Could anyone help me out?
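To make the difference concrete, here is a minimal Python sketch (not NiFi code) of the two naming schemes. It assumes that now():toNumber() yields milliseconds since the epoch, which is why the expression above produces a long number instead of a readable time. The function names, the sample file name, and the hyphen-separated pattern are hypothetical choices for illustration; hyphens are used instead of colons on the assumption that colons can be troublesome in HDFS path names.

```python
from datetime import datetime, timezone


def epoch_millis_name(filename, when):
    """Mimic ${filename}.${now():toNumber()}: name, a dot, then epoch milliseconds."""
    return f"{filename}.{int(when.timestamp() * 1000)}"


def timestamped_name(filename, when, pattern="%Y-%m-%d_%H-%M-%S"):
    """Build filename_date_time with a readable, sortable timestamp."""
    return f"{filename}_{when.strftime(pattern)}"


if __name__ == "__main__":
    # Fixed instant so the two results can be compared side by side.
    when = datetime(2017, 6, 17, 14, 5, 9, tzinfo=timezone.utc)
    print(epoch_millis_name("data.csv", when))
    print(timestamped_name("data.csv", when))
```

In NiFi itself the readable form would presumably come from the format() function of the Expression Language, which takes a Java SimpleDateFormat pattern. In such a pattern HH:mm:ss means hours, minutes, and seconds, whereas MM means month.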

matt123 · ‎03-02-2017

@mbigelow Can't thank you enough, mate.

matt123 · ‎03-02-2017

@mbigelow My English is not that good, so I assume from your answer that I can set more than 8 GB in yarn.scheduler.maximum-allocation-mb. Please correct me if I am wrong.

matt123 · ‎03-02-2017

@mbigelow Thanks for the explanation with the example. It's clear. One last clarification: the default is yarn.scheduler.maximum-allocation-mb = 8024. Will I be able to increase it beyond 8 GB if I have enough RAM in my system?

matt123 · ‎03-01-2017

Thanks

matt123 · ‎03-01-2017

@mbigelow Could you please clarify this: "You could also increase the mapper memory as you increase the io.sort.mb."

1. Is it mandatory to increase the mapper memory as we increase io.sort.mb? Does it have a dependency?

2. Say I increase the mapper memory. Do I then have to increase yarn.scheduler.maximum-allocation-mb as a follow-up, because of the settings below? Can I do this?

yarn.nodemanager.vmem-pmem-ratio = 2.1
yarn.nodemanager.resource.memory-mb = 8192
mapreduce.map.java.opts = 2.5 GB
mapreduce.map.memory.mb = 3 GB
mapreduce.task.io.sort.mb = 4 GB

3. yarn.scheduler.maximum-allocation-mb = 8024. Will I be able to increase it beyond 8 GB if I have enough RAM in my system?

Thanks for the help.
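The dependencies between these settings can be sketched as a small Python check. This is only an illustration under stated assumptions, not a Cloudera or Hadoop tool: the function names and the map_heap_mb key (standing in for the -Xmx value inside mapreduce.map.java.opts) are hypothetical, and all values are in MB. It assumes the sort buffer is allocated inside the map JVM heap, the heap has to fit inside the map container, the container request cannot exceed yarn.scheduler.maximum-allocation-mb, and mapreduce.task.io.sort.mb is capped at 2047.

```python
def check_map_memory(settings):
    """Return a list of problems found in a dict of memory settings (all in MB)."""
    problems = []
    sort_mb = settings["mapreduce.task.io.sort.mb"]
    heap_mb = settings["map_heap_mb"]  # hypothetical key: -Xmx from mapreduce.map.java.opts
    container_mb = settings["mapreduce.map.memory.mb"]
    max_alloc_mb = settings["yarn.scheduler.maximum-allocation-mb"]

    if sort_mb > 2047:
        problems.append("io.sort.mb is above the 2047 MB cap")
    if sort_mb >= heap_mb:
        problems.append("io.sort.mb does not fit inside the map JVM heap")
    if heap_mb >= container_mb:
        problems.append("map JVM heap does not fit inside the map container")
    if container_mb > max_alloc_mb:
        problems.append("map container exceeds yarn.scheduler.maximum-allocation-mb")
    return problems


def vmem_limit_mb(container_mb, vmem_pmem_ratio=2.1):
    """Virtual memory allowed for a container of the given physical size."""
    return container_mb * vmem_pmem_ratio


if __name__ == "__main__":
    # Values taken from the post above, converted to MB.
    posted = {
        "mapreduce.task.io.sort.mb": 4096,
        "map_heap_mb": 2560,
        "mapreduce.map.memory.mb": 3072,
        "yarn.scheduler.maximum-allocation-mb": 8024,
    }
    for problem in check_map_memory(posted):
        print(problem)
    print(vmem_limit_mb(3072))
```

With the posted values the check flags only the sort buffer: 4 GB is above the cap and larger than the 2.5 GB heap. The 3 GB container is already below the 8024 MB scheduler maximum, so under these assumptions the maximum would only need raising for containers larger than that.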

matt123 · ‎02-28-2017

Thanks for the information. Are Hadoop metrics collected by default, or do we have to enable them? Could you please tell me?

Also, one more quick clarification: if there is too much spill in a MapReduce job, does that mean we have to increase mapreduce.task.io.sort.mb? If so, what would an ideal number be? Can I start with 1000?

matt123 · ‎02-28-2017

Hi, I have just started learning Hadoop and have no idea how to check whether a MapReduce job is spilling or not. If it is, correct me if I am wrong, we have to increase the io.sort size. Please help me out with this.

Also, what other parameters need to be checked in the mapred-site.xml and hadoop-env.sh files if there is too much spill?
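One common rule of thumb, sketched here in Python as an assumption and not as an official method, is to compare two job counters: "Spilled Records" and "Map output records". Map output is written to disk at least once, so a ratio near 1 on the map side suggests a single spill per record, while a higher ratio suggests records were spilled and merged more than once. The function names and the sample counter values are hypothetical.

```python
def spill_ratio(spilled_records, map_output_records):
    """Ratio of the "Spilled Records" counter to the "Map output records" counter."""
    if map_output_records == 0:
        return 0.0
    return spilled_records / map_output_records


def has_extra_spills(spilled_records, map_output_records):
    """True when records were written to disk more than once on average."""
    return spill_ratio(spilled_records, map_output_records) > 1.0


if __name__ == "__main__":
    # Hypothetical counter values copied by hand from a job's counter page.
    print(spill_ratio(300, 100))
    print(has_extra_spills(300, 100))
```

The job-level "Spilled Records" counter can also include reduce-side spills, so the map task counters are probably the safer input for this comparison.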

Member Since	‎02-28-2017 07:39 PM
Last Visited	‎03-02-2017 07:24 PM
Posts	11

Cloudera Community

Nifi - PutHDFS - now() time format

Re: How to see Mapreduce Spill Disk Activity

How to see Mapreduce Spill Disk Activity