How can I put data into HDFS in Cloudera Altus on AWS

Hi All,

I need to copy data from S3 into HDFS on my Cloudera Altus Data Engineering cluster. I can SSH to the EC2 instances as the centos user, but I cannot authenticate as any user other than the Hadoop service users (hdfs, yarn, oozie, etc.). I tried DistCp, but I cannot run a MapReduce job as the hdfs user; the command I tried is sketched below.
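
Roughly, this is what I ran (the bucket name and paths below are just placeholders for my real ones):

    # DistCp run as the hdfs service user ("my-bucket" and both paths
    # are placeholders). DistCp submits a MapReduce job, and that job
    # submission is the step that fails for me.
    sudo -u hdfs hadoop distcp \
        s3a://my-bucket/source-data \
        hdfs:///user/bart/source-data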

Is there any way to authenticate as the 'altus' user, the same user that Cloudera Altus uses to run Spark/MapReduce/Hive jobs?

Thanks in advance for any advice.

Regards,
Bart

Cloudera Employee

Hi Bart,

By design, Altus Data Engineering and Data Warehouse clusters keep their long-term storage in the cloud provider's object store, such as AWS S3 or Microsoft ADLS. If you're looking for a cloud-based cluster with traditional HDFS usage, Altus Director may be a better fit for your use case: Director can create a full-fledged CDH cluster in the cloud that is able to use both S3/ADLS and cluster-local HDFS.
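
In case it helps, here is a rough sketch of a copy from S3 into cluster-local HDFS on such a Director-managed cluster. The bucket name and paths are placeholders, and it assumes the cluster's s3a connector is already configured with credentials for the bucket:

    # One-time bulk copy from S3 into cluster-local HDFS with DistCp
    # ("my-bucket" and both paths are placeholders):
    hadoop distcp s3a://my-bucket/source-data hdfs:///user/etl/source-data

    # Jobs can also read from S3 directly through the s3a connector:
    hadoop fs -ls s3a://my-bucket/source-data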

Listed below are a few links on Altus Director, in case you would like to know more about its features and functionality, as well as how to install and configure it:

https://www.cloudera.com/products/product-components/cloudera-director.html

https://www.cloudera.com/documentation/director/latest/topics/director_intro.html