
How can I put data into HDFS in Cloudera Altus on AWS


Hi All,


    I need to copy data from S3 into HDFS in my Cloudera Altus Data Engineering cluster. I can SSH to the EC2 instances as the centos user, but I cannot authenticate as any user other than the Hadoop service users (hdfs, yarn, oozie, etc.). I tried to use distcp, but I cannot run MapReduce jobs as the hdfs user.


    Is there any way to authenticate as the 'altus' user, the same user that Cloudera Altus uses for Spark/MapReduce/Hive jobs?


Thanks in advance for any advice.
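In case it helps: if the copy has to run from inside the cluster, a small S3-to-HDFS copy can also be sketched with the Hadoop FileSystem API instead of distcp, which avoids launching a MapReduce job (and therefore avoids needing a YARN-submitting user). This is only a sketch, and the bucket and paths below are placeholders:

```scala
// Sketch only: copy objects from S3 into HDFS via the Hadoop FileSystem API.
// Runs in a single JVM, so no MapReduce job is submitted.
// Bucket and paths are hypothetical.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

object S3ToHdfsCopy {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()                // picks up core-site.xml on the cluster
    val src  = new Path("s3a://my-bucket/input/") // hypothetical S3 source
    val dst  = new Path("/user/hdfs/input/")      // hypothetical HDFS target

    val srcFs = src.getFileSystem(conf)           // s3a filesystem (instance-profile credentials on EC2)
    val dstFs = FileSystem.get(conf)              // default filesystem (HDFS)

    // Recursive copy; 'false' keeps the source objects in place.
    FileUtil.copy(srcFs, src, dstFs, dst, false, conf)
  }
}
```

For large datasets distcp remains the better tool, since it parallelizes the copy across the cluster; this single-process approach only suits modest amounts of data.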







Hi Bart,


By design, Altus Data Engineering and Data Warehouse clusters are intended to keep their long-term storage in the cloud provider's object store, such as AWS S3 or Microsoft's ADLS.  If you're looking for a cloud-based cluster with traditional HDFS usage, Altus Director may be better suited to your use case: Director can create a full-fledged CDH cluster in the cloud that is able to use both S3/ADLS and cluster-local HDFS.


Listed below are a few links about Altus Director in case you would like to know more about its features and functionality, as well as how to install and configure it:



I understand. Then I need to modify my Scala source code so it can read from S3 as well. Thanks!
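For reference, reading directly from S3 in Spark usually just means switching the input path to an s3a:// URI; on an Altus cluster the S3 credentials are normally supplied by the EC2 instance profile. A minimal sketch (the bucket and path are placeholders):

```scala
// Sketch: read input straight from S3 with the s3a connector instead of HDFS.
// Bucket/path are hypothetical; credentials are assumed to come from the
// EC2 instance profile, as on an Altus cluster.
import org.apache.spark.sql.SparkSession

object ReadFromS3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ReadFromS3")
      .getOrCreate()

    // Same API as reading from HDFS; only the URI scheme changes.
    val lines = spark.read.textFile("s3a://my-bucket/input/")
    println(s"line count: ${lines.count()}")

    spark.stop()
  }
}
```

The same URI-scheme substitution applies to other readers (parquet, csv, json), so existing HDFS-based code typically needs only its paths changed.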