parquet-tools :: No FileSystem for scheme hdfs

Contributor

Hello Friends:

On a relatively new installation of CDH 6.1 (parcels), with one node for CDH manager and a second node for the Master and Slave services (combined), I'm getting this error:

org.apache.hadoop.fs.UnsupportedFileSystemException:
    No FileSystem for scheme "hdfs"

after running this:

user$ /opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/bin/parquet-tools \
          cat hdfs://tmp/1.parquet

Here is the output of hadoop classpath:

/etc/hadoop/conf:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/hadoop/libexec/../../hadoop-yarn/.//*

Some pertinent environment variables:

user$ env | egrep -i 'hadoop|classpath'
HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop

Finally, there are two Java distributions installed: OpenJDK, and the one installed by the CDH 6.x installation wizard. I tried running the above parquet-tools command with each Java distribution exported, but both yield the same error. Here are the Java distributions:

user$ ls -al /usr/java /usr/lib/jvm
/usr/java:
total 12
drwxr-xr-x  3 root root 4096 Feb  1 01:52 .
drwxr-xr-x 14 root root 4096 Jan 21 21:01 ..
lrwxrwxrwx  1 root root   21 Feb  1 01:52 current.d -> jdk1.8.0_141-cloudera
drwxrwxr-x  8 root root 4096 Jan 21 21:01 jdk1.8.0_141-cloudera

/usr/lib/jvm:
total 24
drwxr-xr-x  4 root root  4096 Jan 21 20:44 .
dr-xr-xr-x 44 root root 12288 Feb  6 19:02 ..
lrwxrwxrwx  1 root root    26 Jan 21 20:44 java -> /etc/alternatives/java_sdk
lrwxrwxrwx  1 root root    32 Jan 21 20:44 java-1.8.0 -> /etc/alternatives/java_sdk_1.8.0
lrwxrwxrwx  1 root root    40 Jan 21 20:44 java-1.8.0-openjdk -> /etc/alternatives/java_sdk_1.8.0_openjdk
drwxr-xr-x  7 root root  4096 Jan 21 20:44 java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.i386
drwxr-xr-x  7 root root  4096 Jan 21 20:44 java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64
lrwxrwxrwx  1 root root    34 Jan 21 20:44 java-openjdk -> /etc/alternatives/java_sdk_openjdk
lrwxrwxrwx  1 root root    21 Jan 21 20:44 jre -> /etc/alternatives/jre
lrwxrwxrwx  1 root root    27 Jan 21 20:44 jre-1.8.0 -> /etc/alternatives/jre_1.8.0
lrwxrwxrwx  1 root root    35 Jan 21 20:44 jre-1.8.0-openjdk -> /etc/alternatives/jre_1.8.0_openjdk
lrwxrwxrwx  1 root root    49 Jan 21 20:44 jre-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.i386 -> java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.i386/jre
lrwxrwxrwx  1 root root    51 Jan 21 20:44 jre-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64 -> java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre
lrwxrwxrwx  1 root root    29 Jan 21 20:44 jre-openjdk -> /etc/alternatives/jre_openjdk

Note that the cluster is configured to prefer CDH's Java.

Any ideas?

P.S. Apart from this, the entire cluster is (and has been) running perfectly.

Thank you!

1 ACCEPTED SOLUTION

Guru

Hi @prismalytics,

As documented in the parquet-tools README on the Apache GitHub, parquet-tools needs to be run through the hadoop jar command to read a file on the HDFS filesystem.

---
#Run from hadoop

See Commands Usage for command to use

hadoop jar ./parquet-tools-<VERSION>.jar <command> my_parquet_file.lzo.parquet
---

Could you please run the hadoop jar command as follows?

hadoop jar /opt/cloudera/parcels/<CDH-VERSION>/jars/parquet-tools-<VERSION>.jar <command> <hdfs path to parquet file>

For example:

hadoop jar /opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/jars/parquet-tools-1.9.0-cdh6.1.0.jar cat hdfs://tmp/1.parquet
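The <VERSION> placeholder in the command has to be filled in by hand. As a minimal sketch, assuming the jar follows the parquet-tools-<VERSION>.jar naming shown in this thread, a small shell function could look up the jar in a given directory and print the resulting hadoop jar command. The name parquet_tools_cmd is hypothetical and not part of CDH or parquet-tools, and the function only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with CDH or parquet-tools).
# Prints the "hadoop jar" command line for the first parquet-tools jar
# found in the given directory; it does not execute anything.
parquet_tools_cmd() {
    local jar_dir="$1" subcommand="$2" target="$3"
    local jar=""
    local candidate

    # Pick the first file matching the naming pattern used in this thread.
    for candidate in "$jar_dir"/parquet-tools-*.jar; do
        if [ -f "$candidate" ]; then
            jar="$candidate"
            break
        fi
    done

    if [ -z "$jar" ]; then
        echo "no parquet-tools jar found in $jar_dir" >&2
        return 1
    fi

    printf 'hadoop jar %s %s %s\n' "$jar" "$subcommand" "$target"
}
```

Calling it as parquet_tools_cmd /opt/cloudera/parcels/CDH/jars cat <hdfs path to parquet file> would print the full command line, which can be checked before it is run.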

 

Thanks and hope this helps,

Li

Li Wang, Technical Solution Manager



Contributor

Hi @lwang:

Yes, your resolution worked, with one minor tweak: the path needs hdfs:/// instead of hdfs://:

user$ hadoop jar /opt/cloudera/parcels/CDH/jars/parquet-tools-1.9.0-cdh6.1.0.jar cat hdfs:///tmp/1.parquet

or, if fully qualifying the HDFS host, the following (where hdfs:// will do):

user$ hadoop jar /opt/cloudera/parcels/CDH/jars/parquet-tools-1.9.0-cdh6.1.0.jar cat hdfs://vps00:8020/tmp/1.parquet
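The extra slash matters because of generic URI syntax: in scheme://authority/path, whatever comes directly after the two slashes is read as the authority (host and optional port). So hdfs://tmp/1.parquet names a host called tmp with the path /1.parquet, whereas hdfs:///tmp/1.parquet has an empty authority, in which case Hadoop should fall back to the default filesystem configured in fs.defaultFS. The sketch below splits the three URIs from this thread with plain shell parameter expansion to make the difference visible. The function names uri_authority and uri_path are hypothetical, and the parsing is deliberately simplified (no query strings or fragments).

```shell
#!/usr/bin/env bash
# Simplified illustration only: splits scheme://authority/path by hand.
# uri_authority and uri_path are hypothetical names, not Hadoop tools.

uri_authority() {
    local rest="${1#*://}"       # drop "scheme://"
    printf '%s\n' "${rest%%/*}"  # keep everything before the first "/"
}

uri_path() {
    local rest="${1#*://}"       # drop "scheme://"
    case "$rest" in
        */*) printf '/%s\n' "${rest#*/}" ;;  # keep everything after the first "/"
        *)   printf '/\n' ;;                 # no path component at all
    esac
}

for uri in hdfs://tmp/1.parquet hdfs:///tmp/1.parquet hdfs://vps00:8020/tmp/1.parquet; do
    printf '%-36s authority=[%s] path=[%s]\n' \
        "$uri" "$(uri_authority "$uri")" "$(uri_path "$uri")"
done
```

For the first URI the loop reports the authority tmp and the path /1.parquet, which is why the two-slash form without a host does not point at /tmp/1.parquet.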

Thank you so very much! =:)

Guru
Great to hear the issue got resolved! Thanks for the feedback!

Li Wang, Technical Solution Manager

