03-07-2017
02:22 AM
Prerequisite:
Create an account in S3 and obtain the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

AWS Command Line:
For the AWS command line to work, have AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY configured in ~/.aws/credentials. Something like:

[default]
aws_access_key_id=$AWS_ACCESS_KEY_ID
aws_secret_access_key=$AWS_SECRET_ACCESS_KEY

You might also want to set the region and output format in ~/.aws/config. Something like:

[default]
region=us-west-2
output=json
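As a minimal sketch of the two files above (assuming a POSIX shell; the local aws-demo directory stands in for ~/.aws, and the $PLACEHOLDERS stand in for your actual keys):

```shell
# Sketch: generate the AWS CLI credentials and config files described above.
# Writes into ./aws-demo here; for a real setup, target ~/.aws instead.
mkdir -p aws-demo

# Quoted heredoc delimiter keeps the $PLACEHOLDERS literal.
cat > aws-demo/credentials <<'EOF'
[default]
aws_access_key_id=$AWS_ACCESS_KEY_ID
aws_secret_access_key=$AWS_SECRET_ACCESS_KEY
EOF

cat > aws-demo/config <<'EOF'
[default]
region=us-west-2
output=json
EOF
```

In practice you can also just run `aws configure`, which prompts for the key pair, region, and output format and writes both files for you.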
Steps:
1. Create a bucket in S3. You can create it on the Amazon Console (see CreatingABucket.html) or from the command line:
   aws s3 mb s3://$BUCKET_NAME
2. Modify the below properties in core-site.xml:
   - fs.defaultFS to s3a://$BUCKET_NAME
   - fs.s3a.access.key to $AWS_ACCESS_KEY_ID
   - fs.s3a.secret.key to $AWS_SECRET_ACCESS_KEY
   - fs.AbstractFileSystem.s3a.impl to org.apache.hadoop.fs.s3a.S3A (HADOOP-11262)
3. You might also want to set the properties below if you need to run some example jobs:
   - tez.staging-dir in tez-site.xml to hdfs://$NN_HOST:8020/tmp/$user_name/staging (TEZ-3276)
   - hive.exec.scratchdir to hdfs://$NN_HOST:8020/tmp/hive (for running Hive on Tez)
4. Restart HDFS, YARN, and MapReduce2.

You should now be able to use S3 as the default filesystem.
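The core-site.xml changes in step 2 can be sketched as the following fragment; $BUCKET_NAME and the two key placeholders stand in for your own values:

```xml
<!-- Sketch of the core-site.xml properties described above. -->
<property>
  <name>fs.defaultFS</name>
  <value>s3a://$BUCKET_NAME</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>$AWS_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>$AWS_SECRET_ACCESS_KEY</value>
</property>
<!-- Needed so the s3a scheme resolves as an AbstractFileSystem (HADOOP-11262). -->
<property>
  <name>fs.AbstractFileSystem.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3A</value>
</property>
```

Note that storing the secret key in plain text in core-site.xml exposes it to anyone who can read the file; treat these values as sensitive.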