CDP private cloud trial version on AWS - How to use CLI

New Contributor

Hi,

I have successfully installed the CDP Private Cloud trial version on AWS using the instructions below.
https://www.cloudera.com/tutorials/how-to-create-a-cdp-private-cloud-base-development-cluster.html

Question:
How do I access the CDP components (Spark, Hive, etc.) from my local laptop through the CLI?
As part of the installation, 4 hosts were created in AWS, but I do not know how to access those hosts through the CLI from my laptop. I am using Ubuntu on my laptop.
I am able to access all 4 AWS hosts through ssh using the command below, but after logging in to an AWS host I cannot invoke pyspark because I get the error below.

ssh command:
ssh -i '/home/<localuser>/.ssh/cdp-trial-key.pem' centos@<aws_ip_address>

Error:
org.apache.hadoop.security.AccessControlException: Permission denied: user=centos, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

My objective:
I need to learn all the components in CDP, write a few lines of code using Python and Spark, access HDFS, etc. But I am new to this setup. Could you please help with solutions?

Thank you.

1 ACCEPTED SOLUTION

Master Collaborator

Hi @Ragavend,

 

Happy to hear that you are exploring CDP Private Cloud and taking on the learning of the platform in a lab environment. To answer your questions:

1. A few steps are needed to access the CDP Private Cloud CLI. Instructions are here: https://docs.cloudera.com/management-console/1.3.3/private-cloud-cli/topics/mc-private-cloud-cli-cli.... Note that you will need to allow external connections to your AWS EC2 instances in order to issue commands from your laptop to the CDP cluster. This assumes you are talking about the CDP CLI. If you mean the AWS CLI (a different tool entirely), please see the many AWS tutorials available.

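2. In order to run pyspark, the user who is executing the job needs to be able to create a log directory on HDFS. The error shows that user centos has no WRITE access to /user, which is owned by hdfs:supergroup. So, instead of running your command as the default OS login user (i.e. centos), try running it as your CDP admin user.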
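To make the error from the question easier to read, here is a minimal Python sketch that parses the message and applies the POSIX-style owner/group/other check that HDFS permissions follow. It is not part of any Cloudera or Hadoop tooling: `explain_denial` and `user_groups` are hypothetical names, and it assumes the `centos` user is not a member of `supergroup`.

```python
import re

# Message layout as it appears in the post:
# Permission denied: user=<u>, access=<A>, inode="<path>":<owner>:<group>:<mode>
_DENIED = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]*)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwxtT-]{9})'
)

# Only the three basic actions are handled in this sketch.
_NEEDED = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(message, user_groups=()):
    """Report which permission class applied and whether it grants the access.

    user_groups is an assumption supplied by the caller: the groups the
    requesting user belongs to (empty by default).
    """
    m = _DENIED.search(message)
    if m is None:
        raise ValueError("not a recognised 'Permission denied' message")
    f = m.groupdict()
    bits = f["mode"][1:]  # drop the leading file-type character ('d' or '-')
    if f["user"] == f["owner"]:
        cls, triplet = "owner", bits[0:3]
    elif f["group"] in user_groups:
        cls, triplet = "group", bits[3:6]
    else:
        cls, triplet = "other", bits[6:9]
    granted = _NEEDED[f["access"]] in triplet
    return {"user": f["user"], "path": f["path"], "class": cls,
            "triplet": triplet, "granted": granted}


msg = ('org.apache.hadoop.security.AccessControlException: Permission denied: '
       'user=centos, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x')
result = explain_denial(msg)
print(result)
```

For the message in the question, `centos` is neither the owner (`hdfs`) nor, by assumption, in `supergroup`, so the "other" bits `r-x` apply and WRITE is refused. That is consistent with the advice above: run pyspark as a user that is allowed to write under `/user`, or have an HDFS administrator set up a home directory for the user you log in as.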

 

Hope this helps.

 

Regards,

Alex


2 REPLIES

New Contributor

Hi @aakulov 

 

Welcome and thank you very much for the reply.

 

Regards,

Ragav