On HDP 2.4, some services may have corrupted jar and tar.gz files on HDFS. The specific files I have seen broken are as follows:

  • hive.tar.gz
  • mapreduce.tar.gz
  • hadoop-streaming.jar
  • pig.tar.gz
  • spark-hdp-assembly.jar
  • sqoop.tar.gz
  • tez.tar.gz

All of these are found in the /hdp/apps/<hdp-version> directory. On my install, they all had zero size (reported as 0.1 kB on HDFS File View). This led to errors in a variety of services, including the following:

  • gzip: /foo/bar/yarn/local/filecache/11_tmp/tmp_mapreduce.tar.gz: unexpected end of file
  • tar: This does not look like a tar archive
  • Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
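Before changing anything, it can help to confirm which files under /hdp/apps/<hdp-version> are actually affected. The sketch below is one way to do that, not part of the original fix: small_files is a hypothetical helper name, and the 1024-byte default threshold is an assumption based on the files above showing as roughly empty. It filters the output of hdfs dfs -ls -R, where the size is the fifth column and the path is the last.

```shell
# Hypothetical helper (not from the original article): read `hdfs dfs -ls -R`
# output on stdin and print "size path" for every regular file smaller than
# the given byte threshold. The default of 1024 bytes is an assumption.
small_files() {
    awk -v max="${1:-1024}" '
        # Keep only full listing entries that are not directories
        # (directory modes start with "d") and whose size is below max.
        NF >= 8 && $1 !~ /^d/ && $5 + 0 < max { print $5, $NF }
    '
}

# Example usage (replace <hdp-version> with your installed version):
#   hdfs dfs -ls -R /hdp/apps/<hdp-version> | small_files 1024
```

Any jar or tar.gz that shows up in this list is a candidate for replacement. The filter assumes paths contain no spaces, which holds for the files listed above.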

There may be other errors, but those are the ones I personally experienced. The fix is fairly easy: each corrupt file has a healthy copy on the local file system, and that copy must be uploaded to HDFS in place of the corrupt one. For example, to replace the Tez tarball, run the following (substituting your actual HDP version for <hdp-version>):

$ hdfs dfs -rm /hdp/apps/<hdp-version>/tez/*

$ hdfs dfs -put /usr/hdp/current/tez-client/lib/tez.tar.gz /hdp/apps/<hdp-version>/tez/

$ hdfs dfs -chmod 444 /hdp/apps/<hdp-version>/tez/tez.tar.gz

Problems caused by a corrupt Tez tarball should now be fixed.
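The same remove, put, and chmod steps apply to the other files in the list. As a rough sketch, they could be wrapped in a small helper like the one below. The function name replace_hdfs_file is hypothetical, and the local integrity check with gzip -t is an addition of mine rather than part of the steps above. Unlike the Tez example, it removes only the single file being replaced, not everything in the directory. Only the Tez local path is known from this article; the local paths for the other files have to be looked up on your own install.

```shell
# Hypothetical helper (a sketch, not an official tool): replace one file in an
# HDFS directory with a copy from the local file system.
# Usage: replace_hdfs_file <local-file> <hdfs-dir>
replace_hdfs_file() {
    local src="$1" dest_dir="$2"
    local name
    name="$(basename "$src")"

    # Refuse to upload a local copy that is missing or empty.
    if [ ! -s "$src" ]; then
        echo "local file missing or empty: $src" >&2
        return 1
    fi

    # For gzip archives, check integrity locally before uploading.
    case "$name" in
        *.tar.gz)
            if ! gzip -t "$src" 2>/dev/null; then
                echo "local archive is corrupt: $src" >&2
                return 1
            fi
            ;;
    esac

    # Same three steps as the Tez example: remove, upload, make read-only.
    hdfs dfs -rm -f "$dest_dir/$name" &&
    hdfs dfs -put "$src" "$dest_dir/" &&
    hdfs dfs -chmod 444 "$dest_dir/$name"
}

# Example usage for Tez (replace <hdp-version> with your installed version):
#   replace_hdfs_file /usr/hdp/current/tez-client/lib/tez.tar.gz "/hdp/apps/<hdp-version>/tez"
```

Checking the local copy first avoids overwriting a broken file on HDFS with another broken file.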

Comments

I am having the same issue with broken links. Any idea why the links break? It's happening every week.