Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1543 | 07-09-2019 12:53 AM |
 | 9300 | 06-23-2019 08:37 PM |
 | 8052 | 06-18-2019 11:28 PM |
 | 8678 | 05-23-2019 08:46 PM |
 | 3477 | 05-20-2019 01:14 AM |
12-03-2018
09:33 PM
This may be a very basic question, but I ask because it is unclear from the data you've posted: have you accounted for replication? 50 GiB of HDFS file lengths summed up (the hdfs dfs -du values) with 3x replication would be ~150 GiB of actual used space on the physical storage. The /dfs/dn directory is where the file block replicas are stored. Nothing unnecessary is retained in HDFS; however, a commonly overlooked item is older snapshots retaining data blocks that are no longer necessary. Deleting such snapshots frees up the occupied space for HDFS files deleted after the snapshot was made. If you're unable to grow your cluster but need to store more data, then you may sacrifice availability of data by lowering your default replication to 2x or 1x (via the dfs.replication config for new data writes, and hdfs dfs -setrep n for existing data).
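The replication arithmetic above can be sketched as follows; the 50 GiB figure comes from the post, while `/data` is a placeholder path, and the `hdfs` commands are shown only as comments since they require a live cluster:

```shell
# Logical size is what `hdfs dfs -du -s /data` reports (file lengths summed).
LOGICAL_GIB=50
REPLICATION=3
# Raw space consumed under /dfs/dn across the cluster is logical size x replication:
RAW_GIB=$((LOGICAL_GIB * REPLICATION))
echo "${RAW_GIB} GiB of physical storage used"
# To lower replication on existing data (at the cost of availability):
#   hdfs dfs -setrep -w 2 /data
# For new writes, lower dfs.replication in the client configuration.
```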
11-26-2018
01:42 AM
Thank you very much @Harsh J! If I understood correctly, these parameters control the maximum amount of memory allocated for the Oozie launcher: oozie.launcher.mapreduce.map.java.opts, oozie.launcher.mapreduce.reduce.java.opts, oozie.launcher.yarn.app.mapreduce.am.command-opts. What are the equivalent parameters to control the memory allocated for the action itself instead (e.g. a Sqoop action), as shown in the image?
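For reference, the launcher properties named above are typically set inside the action's `<configuration>` block in workflow.xml; this is a sketch with illustrative placeholder values, not a recommended sizing:

```xml
<!-- Sketch: Oozie launcher memory properties inside an action's configuration.
     The -Xmx values here are placeholders, not recommendations. -->
<configuration>
  <property>
    <name>oozie.launcher.mapreduce.map.java.opts</name>
    <value>-Xmx1g</value>
  </property>
  <property>
    <name>oozie.launcher.yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx1g</value>
  </property>
</configuration>
```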
11-24-2018
11:34 PM
1 Kudo
Check your input file; if it is separated by a character other than ',', please use --input-fields-terminated-by <char>. It will work. Let me know in case you still have an issue. Thanks, Shashi
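A sketch of the suggestion above, assuming tab-separated input; the connect string, credentials, table, and directory are all hypothetical placeholders, and the command needs a live cluster and database to run:

```shell
# Example: the files under /user/me/data are tab-delimited, so tell Sqoop
# explicitly instead of letting it assume ',':
sqoop export \
  --connect jdbc:mysql://db.example.com/mydb \
  --username me \
  --table mytable \
  --export-dir /user/me/data \
  --input-fields-terminated-by '\t'
```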
11-23-2018
08:52 AM
2 Kudos
The broker port in that command is incorrect: you're supplying the ZooKeeper port (2181) in an argument that requires the broker client port (9092). Follow our guide at https://www.cloudera.com/documentation/kafka/latest/topics/kafka_command_line.html for using the command-line tools.
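To illustrate the port mix-up described above with one of the standard Kafka console tools (the hostname is a placeholder, and the exact tool in the original command is not shown in the post):

```shell
# Wrong: 2181 is ZooKeeper's client port, not a broker address
#   kafka-console-consumer --bootstrap-server broker1.example.com:2181 --topic test
# Right: the Kafka broker's client port defaults to 9092
kafka-console-consumer --bootstrap-server broker1.example.com:9092 \
  --topic test --from-beginning
```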
11-19-2018
04:05 AM
Hi @Harsh J, your mentioning the Safety Valve gave me an idea! I thought maybe the UI in CM was not setting one or both of those key/values, so I set them manually and it worked! Now every container requested by Spark pipe() has the same owner as the Spark application itself (no more nobody or yarn!). There must be something in the UI that won't map one of those two configs back to yarn-site.xml:
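The post does not quote the two key/values that were set. As an assumption only, a commonly used pair for making a nonsecure YARN cluster run containers as the submitting user (rather than nobody) looks like this in a yarn-site.xml Safety Valve; verify against your own thread before applying:

```xml
<!-- Assumed sketch: both property names are guesses at the configs discussed
     in this thread, not quoted from it. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>
```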
11-18-2018
04:48 PM
Have you found any way to run YARN containers as the user who launched them? I have also set these two and have all the nodes synced with LDAP, but it still runs as nobody despite the fact that I can see it says the yarn user request is ...
11-13-2018
05:39 AM
Hi, I am facing the same error/problem. What was the property you mis-spelled or wrote incorrectly? Please mention it. Maoj
11-05-2018
05:41 PM
@Harsh J, is there a way to avoid giving the column names manually? I have 150 columns per table and more than 200 tables, which is a huge number.
11-01-2018
07:46 AM
@Harsh J A couple more questions: filtering by queue via the ResourceManager REST interface (http://<rm http address:port>/ws/v1/cluster/apps?queue=root.queue1) shows only 'running' applications. Is there any way to show finished applications? Even when mentioning 'states=running,finished' it still shows only running applications. We are using Cloudera 5.14. Also, is it possible to upgrade from CDH 5.14 to CDH 6.0 without a complete reinstall? We are using parcels.
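The REST query above can be sketched with curl; the host, port, and queue name are placeholders taken from the post, and note that the `states` filter matches YARN's application state names (e.g. RUNNING, FINISHED):

```shell
# Applications currently running in a queue:
curl "http://rm-host:8088/ws/v1/cluster/apps?queue=root.queue1&states=RUNNING"
# Finished applications (the states filter uses YARN's state names):
curl "http://rm-host:8088/ws/v1/cluster/apps?queue=root.queue1&states=FINISHED"
```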
10-22-2018
06:43 PM
Thank you so much! I changed the group of '/tmp/logs' to hadoop and restarted the JobHistoryServer role, and now everything is OK. So happy!
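The fix described above can be sketched as shell commands; the hadoop group and /tmp/logs path come from the post, while running as the hdfs superuser is an assumption, and the commands need a live cluster:

```shell
# Change the group of the aggregated-log directory, as the HDFS superuser:
sudo -u hdfs hdfs dfs -chgrp -R hadoop /tmp/logs
# Then restart the JobHistoryServer role (e.g. via Cloudera Manager).
```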