Posts: 12
Registered: ‎11-05-2018
How can I put data into HDFS in Cloudera Altus on AWS

Hi All,


    I need to copy data from S3 into HDFS on my Cloudera Altus Data Engineering cluster. I can ssh to the EC2 instances as the centos user, but I cannot authenticate as any user other than the Hadoop service users (hdfs, yarn, oozie, etc.). I tried distcp, but I cannot run MapReduce jobs as the hdfs user.
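    For reference, this is roughly what I tried (bucket name and paths below are placeholders, not my real ones):

```shell
# Attempted as the hdfs service user after ssh-ing in as centos.
# Fails because the hdfs user is not allowed to submit
# MapReduce jobs on the Altus cluster.
sudo -u hdfs hadoop distcp \
  s3a://my-bucket/input/ \
  hdfs:///user/hdfs/input/
```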


    Is there any way to authenticate as the 'altus' user, the same one Cloudera Altus uses for Spark/MapReduce/Hive jobs?


Thanks in advance for any advice.





Other Answers: 1
Cloudera Employee
Posts: 44
Registered: ‎08-22-2014

Hi Bart,


By design, Altus Data Engineering and Data Warehouse clusters are intended to keep their long-term storage in the cloud provider's object store, such as AWS S3 or Microsoft's ADLS.  If you're looking for a cloud-based cluster with the ability for traditional HDFS usage, Altus Director may be better suited for your use case, as Director allows for the creation of a full-fledged CDH cluster residing within the cloud that can utilize both S3/ADLS and cluster-local HDFS.
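In the meantime, rather than staging data in HDFS, Altus jobs can read from and write to S3 directly using s3a:// URIs. As a rough sketch (bucket, paths, jar, and class name are all placeholders for your own):

```shell
# Hypothetical example: a Spark job that reads its input
# directly from S3 and writes results back to S3, so no
# copy into cluster-local HDFS is needed.
spark-submit \
  --class com.example.WordCount \
  my-app.jar \
  s3a://my-bucket/input/ \
  s3a://my-bucket/output/
```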


Listed below are a few links regarding Altus Director in the event you would like to know more about its features and functionality, as well as how to install and configure it: