
Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

New Contributor

Hi,

 

I am getting the error below when accessing a Google Cloud Storage bucket for the first time from a Cloudera CDH 6.3.3 Hadoop cluster. I am running the command on the edge node, where the Google Cloud SDK is installed. For now, Google Cloud Storage is reachable only through an HTTP proxy.
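Because the bucket is reachable only through the HTTP proxy, it can help to confirm the proxy path from the edge node independently of both gsutil and Hadoop. The following is a minimal sketch, assuming Python 3 is available there. The proxy URL in the usage comment is a hypothetical placeholder, and `build_proxy_opener` and `probe` are illustrative names, not part of any Cloudera or Google tooling. Keep in mind that a JVM does not read the `http_proxy`/`https_proxy` environment variables by default, so a proxy that gsutil may pick up from its own settings is not necessarily used by the `hadoop` client.

```python
import urllib.error
import urllib.request


def build_proxy_opener(proxy_url):
    """Build a urllib opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def probe(opener, url="https://storage.googleapis.com/", timeout=10):
    """Return the HTTP status code if any response came back, else None."""
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx answer still means an HTTP response was received.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, failed proxy tunnel, etc.
        return None


# Hypothetical usage; replace the placeholder with the real proxy address:
# opener = build_proxy_opener("http://proxy.example.internal:3128")
# print(probe(opener))
```

A `None` result would point at the network or proxy layer rather than at bucket permissions or the connector configuration.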

 

Cloudera CDH 6.3.3 cluster is on-prem.

 

Below is the command that I run:

 

hadoop --loglevel trace fs -ls gs://distcppoc-2021-08-09/

 

Error is:

 

ls: Error accessing: bucket: distcppoc-2021-08-09

 

Last few lines when the Hadoop command is run:

21/08/10 21:07:42 DEBUG security.UserGroupInformation: hadoop login commit
21/08/10 21:07:42 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: <username>
21/08/10 21:07:42 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: <username>" with name <username>
21/08/10 21:07:42 DEBUG security.UserGroupInformation: User entry: "<username>"
21/08/10 21:07:42 DEBUG security.UserGroupInformation: UGI loginUser:<username> (auth:SIMPLE)
21/08/10 21:07:42 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
21/08/10 21:07:42 TRACE core.TracerId: ProcessID(fmt=%{tname}/%{ip}): computed process ID of "FSClient/<ip>"
21/08/10 21:07:42 TRACE core.TracerPool: TracerPool(Global): adding tracer Tracer(FSClient/<IP>)
21/08/10 21:07:42 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
21/08/10 21:07:42 TRACE core.Tracer: Created Tracer(FSClient/<ip>) for FSClient
21/08/10 21:07:42 DEBUG fs.FileSystem: Loading filesystems
21/08/10 21:07:42 DEBUG fs.FileSystem: file:// = class org.apache.hadoop.fs.LocalFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: viewfs:// = class org.apache.hadoop.fs.viewfs.ViewFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: ftp:// = class org.apache.hadoop.fs.ftp.FTPFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: har:// = class org.apache.hadoop.fs.HarFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: http:// = class org.apache.hadoop.fs.http.HttpFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: https:// = class org.apache.hadoop.fs.http.HttpsFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-common-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: s3n:// = class org.apache.hadoop.fs.s3native.NativeS3FileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-aws-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: gs:// = class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/gcs-connector-hadoop3-1.9.10-cdh6
21/08/10 21:07:42 DEBUG fs.FileSystem: hdfs:// = class org.apache.hadoop.hdfs.DistributedFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-hdfs-client-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: webhdfs:// = class org.apache.hadoop.hdfs.web.WebHdfsFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-hdfs-client-3.0.0-cdh6.3.3.jar
21/08/10 21:07:42 DEBUG fs.FileSystem: swebhdfs:// = class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars/hadoop-hdfs-client-3.0.0-cdh6.3.3.j
21/08/10 21:07:42 DEBUG fs.FileSystem: Looking for FS supporting gs
21/08/10 21:07:42 DEBUG fs.FileSystem: looking for configuration option fs.gs.impl
21/08/10 21:07:42 DEBUG fs.FileSystem: Filesystem gs defined in configuration option
21/08/10 21:07:42 DEBUG fs.FileSystem: FS for gs is class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
ls: Error accessing: bucket: distcppoc-2021-08-09
21/08/10 21:07:43 TRACE core.TracerPool: TracerPool(Global): removing tracer Tracer(FsShell/<ip>)
21/08/10 21:07:43 DEBUG util.ShutdownHookManager: Completed shutdown in 0.004 seconds; Timeouts: 0
21/08/10 21:07:43 DEBUG util.ShutdownHookManager: ShutdownHookManger completed shutdown.
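The `fs.FileSystem` lines in the output above show which filesystem class and jar the client registered for each URI scheme. As a rough sketch, assuming the trace output has been saved to a text file and Python 3 is available, the mapping can be extracted like this (`loaded_filesystems` is an illustrative name):

```python
import re

# Matches lines such as:
#   ... DEBUG fs.FileSystem: gs:// = class com.example.SomeFileSystem from /path/to/some.jar
FS_LINE = re.compile(r"fs\.FileSystem: (\w+):// = class (\S+) from (\S+)")


def loaded_filesystems(log_text):
    """Return {scheme: (class_name, jar_path)} for every registered filesystem."""
    found = {}
    for line in log_text.splitlines():
        match = FS_LINE.search(line)
        if match:
            scheme, class_name, jar_path = match.groups()
            found[scheme] = (class_name, jar_path)
    return found
```

In the log above, the `gs` entry resolves to `com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem`, so the connector class is being found; the failure happens after the filesystem is selected.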

 

Below are the properties added to the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml in Cloudera Manager --> HDFS --> Configurations:

 

 

<property>
    <name>fs.gs.working.dir</name>
    <value>/</value>
</property>
<property>
    <name>fs.gs.path.encoding</name>
    <value>uri-path</value>
</property>
<property>
    <name>fs.gs.auth.service.account.email</name>
    <value>serviceaccount@dummyemail.iam.gserviceaccount.com</value>
</property>
<property>
    <name>fs.gs.auth.service.account.private.key.id</name>
    <value>52d6ad0c6ecb7f6da9</value>
</property>
<property>
    <name>fs.gs.auth.service.account.private.key</name>
    <value>MIIEvgIBADANBgkq<FULL PRIVATE KEY>MMASBjSOTA1j+jL</value>
</property>
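To check that these properties actually reached the client configuration the `hadoop` command reads on the edge node, something like the following could be used. This is a minimal sketch, assuming Python 3 and a readable core-site.xml; `gcs_properties` is an illustrative helper name, and it masks the private key so the output can be shared safely.

```python
import xml.etree.ElementTree as ET


def gcs_properties(xml_text):
    """Return {name: value} for all fs.gs.* properties, masking key material."""
    # A safety-valve snippet holds bare <property> elements, so wrap it in a root.
    if "<configuration" not in xml_text:
        xml_text = "<configuration>" + xml_text + "</configuration>"
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if not name.startswith("fs.gs."):
            continue
        if name.endswith("private.key"):
            # Never print the key itself; its length is enough for a sanity check.
            value = "<masked, %d chars>" % len(value)
        props[name] = value
    return props
```

On a CDH edge node the client configuration is typically under /etc/hadoop/conf, so the function could be fed the contents of core-site.xml from there; if the `fs.gs.*` entries are missing, the client configuration may not have been redeployed after the change.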

 

I then restarted the HDFS services.

The gsutil command works fine when run from the on-prem cluster.

 

 

Command: gsutil ls gs://distcppoc-2021-08-09
Output: gs://distcppoc-2021-08-09/sftp.png

 

GCS Connector is installed on all the Cloudera Cluster Hadoop nodes at the below location:

 

Location: /opt/cloudera/parcels/CDH-6.3.3-1.cdh6.3.3.p4762.13062148/jars
Jar file: gcs-connector-hadoop3-1.9.10-cdh6.3.3-shaded.jar

 

Can I get some help here?

P.S.: This is the first time I am posting a question, so please correct me if I have asked it in the wrong way.


Re: Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

Cloudera Employee

Does the user you are accessing the bucket with have all the appropriate read and write permissions on that particular bucket?

Re: Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

New Contributor

Hi @Atahar 

 

Yes, I have made the user a 'StorageAdmin' just to avoid any confusion.

 

Thanks

Re: Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

Cloudera Employee

So does the storageadmin have all the required permissions on the bucket?

Re: Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

Community Manager

@bell1985, were you able to resolve your issue? If you have, can you please provide the solution or mark the appropriate reply as a solution? If you are still experiencing the issue, can you provide the information @Atahar has requested?


Regards,

Vidya Sargur,
Community Manager



Re: Error in accessing google cloud storage bucket via hadoop fs -ls that runs on Cloudera Hadoop CDH 6.3.3 integrated with Kerberos/SSL/LDAP cluster

New Contributor

Hi Atahar,

 

Yes, all the permissions are given to the storageadmin role.

 

Thanks