Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1969 | 07-09-2019 12:53 AM |
| | 11879 | 06-23-2019 08:37 PM |
| | 9144 | 06-18-2019 11:28 PM |
| | 10132 | 05-23-2019 08:46 PM |
| | 4578 | 05-20-2019 01:14 AM |
07-03-2018
07:11 PM
2 Kudos
Cell TTLs are an HFile V3 feature (as far as persistence goes). CDH5 HBase uses HFile V2 by default for backward compatibility with older CDH5 HBase versions. To persist cell TTLs properly into HFiles, you must manually enable HFile V3. You are likely missing the following configuration property:

<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>

This must be manually added into both of the CM fields noted below:
HBase - Configuration - 'HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml'
HBase - Configuration - 'HBase Client Advanced Configuration Snippet (Safety Valve) for hbase-site.xml'

Note that enabling this will not let you roll back your new V3 HFiles to V2 in the future. That said, we do have a lot of users running V3 without any issues.
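As a side note (not part of the original answer): the format version applies to newly written store files, so existing HFiles remain V2 until they are rewritten. A hedged sketch of forcing a rewrite via major compaction from the HBase shell, using a hypothetical table name:

```
# Hypothetical table name; run after the hfile.format.version=3 change and an HBase restart.
# A major compaction rewrites the existing store files, which are then persisted as HFile V3.
echo "major_compact 'my_table'" | hbase shell
```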
06-09-2018
09:45 PM
It is a recommendation based on the fact that active and standby are merely states of the NameNode, not different daemons. The NameNode doesn't check its own hardware against that of the other NameNode, if that's what you are worried about.
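A small illustration of this (not from the original reply), using hypothetical NameNode IDs nn1/nn2 as defined under dfs.ha.namenodes.<nameservice>:

```
# Both hosts run the exact same NameNode daemon; only the reported HA state differs.
hdfs haadmin -getServiceState nn1   # e.g. prints "active"
hdfs haadmin -getServiceState nn2   # e.g. prints "standby"
```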
06-09-2018
07:33 AM
2 Kudos
The username "dr.who" is the default identity of anyone connecting to HDFS or YARN via the web server (REST) APIs in an unsecured, non-kerberos cluster where a connecting user's identity cannot be authentically determined. If your cluster is exposed to the internet, and/or you are unable to recognize any of these jobs, I'd recommend shutting down the service immediately and investigating a possible external attack. I'd also strongly recommend following: https://blog.cloudera.com/blog/2017/01/how-to-secure-internet-exposed-apache-hadoop/ and securing your cluster in such a deployment.
06-03-2018
09:28 PM
The path you are specifying (/home/cloudera/Documents/ard.txt) appears to be a local filesystem path from your Linux machine. The URI you are supplying (hdfs://quickstart.cloudera:8020) tells the program to use the Hadoop Distributed FileSystem, not the local filesystem. You may want to ensure such a path exists on HDFS via a command such as 'hadoop fs -ls /home/cloudera/Documents/ard.txt', or use the commands 'hadoop fs -mkdir -p /home/cloudera/Documents/' and 'hadoop fs -copyFromLocal /home/cloudera/Documents/ard.txt hdfs:///home/cloudera/Documents/ard.txt' to place it on HDFS at the same path first, as shown in the sketch below. P.S. "Home" paths on HDFS are typically /user/{username}, not /home as on Linux.
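For clarity, a sketch of the suggested sequence (the paths are the ones from your post; the final listing is only a hypothetical check):

```
# Create the directory on HDFS and copy the local file up to the same path
hadoop fs -mkdir -p /home/cloudera/Documents/
hadoop fs -copyFromLocal /home/cloudera/Documents/ard.txt hdfs:///home/cloudera/Documents/ard.txt
# Confirm the file is now visible on HDFS before re-running your program
hadoop fs -ls /home/cloudera/Documents/ard.txt
```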
05-22-2018
12:20 AM
1 Kudo
@Smitha is right here. The below step specifically is incorrect:

> jar cvfm MySource.jar manifest.mf MySource.class

Your class is declared within a package (com.cloudera.flume.source), but the jar command is placing it at the top level of the archive. The ideal way would be to do this:

~> mkdir -p com/cloudera/flume/source/
~> mv MySource.class com/cloudera/flume/source/
~> jar cvf MySource.jar com/cloudera/flume/source/MySource.class

Doing the above steps within your sequence would ensure the class gets placed in its declared package instead of at the top level. More generally, you can avoid these kinds of trivial packaging mistakes by using a formal build tool such as Maven, or an IDE such as IntelliJ or Eclipse that can build archives from source projects. These package jars for you in the required form, maintaining the package structure correctly, among several other benefits.
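As an optional sanity check (not part of the original steps), you can list the jar contents to confirm the class landed under the declared package rather than at the archive root:

```
jar tf MySource.jar
# Expected output should include:
#   com/cloudera/flume/source/MySource.class
```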
05-21-2018
08:57 PM
The container memory usage limits are driven not by the available host memory but by the resource limits applied in the container configuration. For example, if you've configured a map task to use 1 GiB of physical memory (pmem) but its code at runtime uses more than 1 GiB, it will get killed. The common resolution is to grant it more than 1 GiB, so it can do its higher-memory work without exceeding what it is given. Another resolution, in certain cases, is to investigate whether the excess memory use is justified, which can be discussed with the developer of the application. The apparent randomness may depend on the amount of data the container code processes and what it ends up doing with it. Have you tried increasing the memory properties of containers, via fields such as "Map Task Memory" and "Reduce Task Memory" if it's MR jobs you are having issues with, or by passing higher values to the --executor-memory argument of spark-submit if it's Spark jobs instead? This is all assuming you are seeing an error of the below form, since the relevant log isn't shared in your post:

… Container killed by YARN for exceeding memory limits. 1.1 GB of 1 GB physical memory used …
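As a rough sketch of per-job overrides on the command line, as an alternative to the CM fields mentioned above (the job names, class names and values below are hypothetical, not from your post):

```
# MapReduce: raise the container size and the JVM heap (heap around 80% of the container is a common rule of thumb)
# (assumes the driver parses generic -D options via ToolRunner)
hadoop jar my-job.jar MyDriver \
  -Dmapreduce.map.memory.mb=2048 \
  -Dmapreduce.map.java.opts=-Xmx1638m \
  -Dmapreduce.reduce.memory.mb=4096 \
  -Dmapreduce.reduce.java.opts=-Xmx3276m \
  <job arguments>

# Spark: raise the executor memory instead
spark-submit --executor-memory 2g --class MyApp my-app.jar
```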
05-21-2018
07:41 PM
The problem appears to stem from a version incompatibility. The latest kudu-python release on PyPI is 1.7.0, which references some changes that are not yet in the most recent packages available from the Kudu website (1.4.0-cdh5.12.0). Since the installed devel packages of Kudu 1.4.x lack some of the newer headers referenced by the 1.7.x Kudu Python client, the error pops up. You can install a more compatible version of the kudu-python client instead by running:

pip install Cython kudu-python==1.2.0

where 1.2.0 is the prior kudu-python release and should work with the 1.4.0 devel package you already have installed.
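A quick hedged check after installing, to confirm the pinned version is what pip actually resolved:

```
pip show kudu-python   # the "Version:" field should read 1.2.0
```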
05-20-2018
07:42 PM
1 Kudo
Do you perchance have any snapshots, held from before the 'hdfs dfs -setrep 2' command was executed, under the target path (/backups)? If you do have a snapshot and the over-replicated count is still stuck, this behaviour can be explained: replication factor is a file-based attribute, and the older snapshot still references the higher replication factor, disallowing deletion of the now-excess blocks. You can run the below to discover existing snapshots, as the 'hdfs' user (or equivalent superuser):

~> hdfs lsSnapshottableDir
~> # For every directory printed above as $DIR:
~> hdfs dfs -ls $DIR/.snapshot/
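If you do find an old snapshot you no longer need, deleting it should let the NameNode reclaim the excess replicas. A hedged example, with a placeholder snapshot name and the /backups path from your post:

```
# Run as the owner of the snapshottable directory or as the HDFS superuser
hdfs dfs -deleteSnapshot /backups <snapshotName>
```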
05-17-2018
01:12 AM
2 Kudos
The pattern of your issue isn't clear - could you help answer a few more questions?
- Is this consistently occurring on all your NodeManagers?
- Did this start occurring after you upgraded? If yes, what were the earlier version and the upgraded version?
- Did this instead start occurring after an abrupt restart of the daemon or the host?
- Do you have NodeManager logs covering the earliest time period this issue was observed? Could you share those here?

Overall this appears to be related to the NodeManager's container recovery feature (a corruption of the data this feature stores on the NodeManager's local filesystem), and you should be able to bypass the issue if you (re)move the contents of the /var/lib/hadoop-yarn/yarn-nm-recovery/ directory on the affected NodeManagers (see the sketch after the trace below). This effectively resets the maintained state, which should be OK to perform on a NodeManager that is down.

Full trace for posterity:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j org.fusesource.leveldbjni.internal.NativeDB$DBJNI.Get(JLorg/fusesource/leveldbjni/internal/NativeReadOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;J)J+0
j org.fusesource.leveldbjni.internal.NativeDB.get(Lorg/fusesource/leveldbjni/internal/NativeReadOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;)[B+22
j org.fusesource.leveldbjni.internal.NativeDB.get(Lorg/fusesource/leveldbjni/internal/NativeReadOptions;Lorg/fusesource/leveldbjni/internal/NativeBuffer;)[B+10
j org.fusesource.leveldbjni.internal.NativeDB.get(Lorg/fusesource/leveldbjni/internal/NativeReadOptions;[B)[B+20
j org.fusesource.leveldbjni.internal.JniDB.get([BLorg/iq80/leveldb/ReadOptions;)[B+27
j org.fusesource.leveldbjni.internal.JniDB.get([B)[B+26
j org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadVersion()Lorg/apache/hadoop/yarn/server/records/Version;+9
j org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion()V+1
j org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(Lorg/apache/hadoop/conf/Configuration;)V+10
j org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(Lorg/apache/hadoop/conf/Configuration;)V+2
j org.apache.hadoop.service.AbstractService.init(Lorg/apache/hadoop/conf/Configuration;)V+80
j org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(Lorg/apache/hadoop/conf/Configuration;)V+98
j org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(Lorg/apache/hadoop/conf/Configuration;)V+20
j org.apache.hadoop.service.AbstractService.init(Lorg/apache/hadoop/conf/Configuration;)V+80
j org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(Lorg/apache/hadoop/conf/Configuration;Z)V+50
j org.apache.hadoop.yarn.server.nodemanager.NodeManager.main([Ljava/lang/String;)V+39
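For reference, a sketch of the reset step described above, to be run on the affected NodeManager while its role is stopped (the backup name is arbitrary):

```
# Move the recovery state store aside rather than deleting it outright
mv /var/lib/hadoop-yarn/yarn-nm-recovery /var/lib/hadoop-yarn/yarn-nm-recovery.bak
# A fresh state store is created when the NodeManager is started again
```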
05-17-2018
01:05 AM
In unsecured mode, all YARN container processes execute as the local Linux user "yarn". This cannot be changed unless you either enable Kerberos-based security or explicitly turn on the LinuxContainerExecutor [1], which will also require ensuring that local Linux accounts exist for every job-submitting user. The HADOOP_USER_NAME value affects only the 'hadoop' command and other related Apache Hadoop/ecosystem commands. Since the 'scp' program is not a Hadoop program, it is not influenced by the username carried in that variable; it instead runs as the Linux user that runs the shell script - which is "yarn", due to the above.

[1] - https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_other_hadoop_security.html#topic_18_3 and 'Always Use Linux Container Executor' under CM -> YARN -> Configuration
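To illustrate the distinction, a hypothetical snippet (not from your script) running inside a container's shell script on an unsecured cluster:

```
export HADOOP_USER_NAME=alice

hadoop fs -touchz /tmp/example_file        # HDFS operation: performed as "alice"
scp data.txt someuser@remotehost:/tmp/     # plain Linux program: still runs as the local "yarn" user
```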