
org.apache.solr.hadoop.HdfsFindTool not available in Cloudera 6.1


New Contributor

After upgrading Cloudera to 6.1, org.apache.solr.hadoop.HdfsFindTool seems to be no longer available in search-mr-job.jar:

hadoop jar /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/solr/contrib/mr/search-mr-job.jar org.apache.solr.hadoop.HdfsFindTool
WARNING: Use "yarn jar" to launch YARN applications.
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.solr.hadoop.HdfsFindTool
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:306)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:227)

 

Are there any alternatives? Is there a way to run an older version alongside the 6.1 and newer stack? Is this class contained in any other library?

 


Re: org.apache.solr.hadoop.HdfsFindTool not available in Cloudera 6.1

Master Guru
The search-based HDFS find tool has been removed and is superseded in C6 by the native "hdfs dfs -find" command, documented here: https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/FileSystemShell.html#find

Re: org.apache.solr.hadoop.HdfsFindTool not available in Cloudera 6.1

New Contributor

The problem is that the native find does not seem to accept the parameters that the old tool, or the OS-level find, accepts, and I cannot find documentation of the supported expressions and their syntax:

 

hdfs dfs -find /example/path -mtime +14
find: Unexpected argument: -mtime

Re: org.apache.solr.hadoop.HdfsFindTool not available in Cloudera 6.1

New Contributor

So, if I am correct, the native hdfs find accepts only two groups of expressions (quoting the documentation):

 

The following primary expressions are recognised:

-name pattern
-iname pattern

Evaluates as true if the basename of the file matches the pattern using standard file system globbing. If -iname is used then the match is case insensitive.

-print
-print0

Always evaluates to true. Causes the current pathname to be written to standard output. If the -print0 expression is used then an ASCII NULL character is appended.

Which makes it useless for finding files older than a given age, which was the main use case for HdfsFindTool. Is there any chance of HdfsFindTool being brought back? Or is there a workaround to make it work with newer Cloudera releases?
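
One possible workaround, offered as a sketch rather than a supported replacement for HdfsFindTool: list the tree recursively with hdfs dfs -ls -R and filter on the modification date column. The helper name older_than is hypothetical. The filter assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path), dates printed as yyyy-MM-dd, and paths that contain no whitespace.

```shell
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints the
# paths of regular files whose modification date is strictly before the cutoff.
# Argument: cutoff date in yyyy-MM-dd form (same format as the listing).
older_than() {
  cutoff="$1"
  # $1 = permissions, $6 = modification date, $8 = path.
  # Lines not starting with "-" (directories, "Found N items") are skipped.
  # Appending "" forces a string comparison; ISO dates sort lexicographically.
  awk -v cutoff="$cutoff" '$1 ~ /^-/ && ($6 "") < (cutoff "") { print $8 }'
}
```

For the -mtime +14 case above, usage might look like: hdfs dfs -ls -R /example/path | older_than "$(date -d '14 days ago' +%Y-%m-%d)". The date -d form assumes GNU date. Because the comparison is on whole days, files modified on the cutoff date itself are not printed.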