How can I put data into HDFS in Cloudera Altus on AWS

Explorer

Hi All,

I need to copy data from S3 to HDFS in my Cloudera Altus Data Engineering cluster. I can SSH to the EC2 instance as the centos user, but I cannot authenticate as any user other than the Hadoop service users (hdfs, yarn, oozie, etc.). I tried to use distcp, but I cannot run MapReduce jobs as the hdfs user.

Is there any way to authenticate as the 'altus' user, the same user that Cloudera Altus uses for Spark/MapReduce/Hive jobs?

Thanks in advance for any advice.

Regards,

Bart

2 REPLIES

Re: How can I put data into HDFS in Cloudera Altus on AWS

Contributor

Hi Bart,

By design, Altus Data Engineering and Data Warehouse clusters are intended to keep their long-term storage in the cloud provider's object store, such as AWS S3 or Microsoft's ADLS. If you're looking for a cloud-based cluster that also supports traditional HDFS usage, Altus Director may be better suited to your use case. Director lets you create a full-fledged CDH cluster in the cloud that can use both S3/ADLS and cluster-local HDFS.

Listed below are a few links about Altus Director, in case you would like to know more about its features and functionality, as well as how to install and configure it:

https://www.cloudera.com/products/product-components/cloudera-director.html

https://www.cloudera.com/documentation/director/latest/topics/director_intro.html

Re: How can I put data into HDFS in Cloudera Altus on AWS

Explorer
I understand. Then I need to modify my Scala source code so that it can read from S3 as well. Thanks!
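
For anyone landing on this thread later, here is a minimal sketch of what reading from S3 in a Scala job might look like. It assumes the job is a Spark job, that the cluster's S3A connector and credentials are already configured, and that the input is plain text. The bucket name `my-bucket`, the key `input/data.txt`, and the helper `s3aPath` are hypothetical placeholders, not anything from the original code.

```scala
import org.apache.spark.sql.SparkSession

object ReadFromS3 {
  // Build an s3a:// URI from a bucket name and a key prefix.
  // Both arguments are illustrative; substitute your own bucket and key.
  def s3aPath(bucket: String, prefix: String): String =
    s"s3a://$bucket/${prefix.stripPrefix("/")}"

  def main(args: Array[String]): Unit = {
    // Assumption: Spark is on the classpath and S3 credentials are
    // supplied by the environment (for example, an instance role).
    val spark = SparkSession.builder().appName("ReadFromS3").getOrCreate()

    // Point the reader at S3 instead of an hdfs:// path.
    val input = s3aPath("my-bucket", "input/data.txt")
    val lines = spark.read.textFile(input)

    println(s"Line count: ${lines.count()}")
    spark.stop()
  }
}
```

In many cases the only change needed in an existing job is the path scheme, since Spark reads `s3a://` and `hdfs://` paths through the same API.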