Community Articles

Find and share helpful community-sourced technical articles.
Labels (2)
avatar
Guru

Steps to Create Table in Hive on S3A with Ranger

  • Create a bucket with a unique name, I've used "myhivebucket" and do not change any details in the permissions
  • Complete the "Create bucket" wizard by clicking on "create bucket" button
  • Make the following entries in custom hdfs-site.xml
    • fs.s3a.access.key = <access key>
    • fs.s3a.secret.key = <access secret>
    • fs.s3a.impl = org.apache.hadoop.fs.s3a.S3AFileSystem
  • To retrieve the value for access key and secret, follow these steps:
    • Login to https://aws.amazon.com/console
    • Click on "Sign in to the console" tab
    • Login with appropriate credentials
    • Once logged in, you should see your login name on the top right corner of the AWS page
    • Click on the drop-down arrow beside your login name and select "My Security Credentials"
    • This should take you to a page titled "Your Security Credentials"
    • From this page, collapse the option that says "Access Keys(Access Key ID and Secret Access Key)"
    • You have to click on "create a new access key" because of Amazon limitation described here
    • This lets you download the key/secret in this format (this is not case sensitive)
      • AWSAccessKeyId=XXXXXXXXXXXXXXXXXXXXX
      • AWSSecretKey=XXXXXxxxxxXXXXXxxxxxXXXXX/xxxxx
    • Value for "fs.s3a.access.key" will be the value for "AWSAccessKeyId"
    • Value for "fs.s3a.secret.key" will be the value for "AWSSecretKey"
  • Login to ranger admin interface and create a policy for hive/desired user to allow the desired permissions

14627-ranger-policy-hive.png

  • Now login to hive with the kerberos credentials, as required via beeline and create table ensuring that the location is on s3a
[hive@xlnode-standalone ~]$ beeline -u "jdbc:hive2://xlnode-standalone.hwx.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" 
   WARNING: Use "yarn jar" to launch YARN applications.
   Connecting to jdbc:hive2://xlnode-standalone.hwx.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
   Connected to: Apache Hive (version 1.2.1000.2.4.3.0-227)
   Driver: Hive JDBC (version 1.2.1000.2.4.3.0-227)
   Transaction isolation: TRANSACTION_REPEATABLE_READ
   Beeline version 1.2.1000.2.4.3.0-227 by Apache Hive
   0: jdbc:hive2://xlnode-standalone.hwx.com:218> create table mys3test (col1 int, col2 string) row format delimited fields terminated by ',' stored as textfile location 's3a://myhivebucket/test';
   No rows affected (12.04 seconds)
   0: jdbc:hive2://xlnode-standalone.hwx.com:218>
  • Now try and insert some rows
0: jdbc:hive2://xlnode-standalone.hwx.com:218> insert into mys3test values (1,'test'),(2,'test');
   Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [hive] does not have [UPDATE] privilege on [default/mys3test] (state=42000,code=40000)
   0: jdbc:hive2://xlnode-standalone.hwx.com:218>
  • The above error is intentional, since we do not have "UPDATE" privilege assigned via ranger, we cannot insert the values yet, allow the permission and INSERT again

14629-ranger-policy-hive-update.png

  • Validate INSERT/UPDATE and SELECT
0: jdbc:hive2://xlnode-standalone.hwx.com:218> insert into mys3test values (1,'test'),(2,'test');
   INFO  : Tez session hasn't been created yet. Opening session
   INFO  : Dag name: insert into mys3test ...1,'test'),(2,'test')(Stage-1)
   INFO  : 
   INFO  : Status: Running (Executing on YARN cluster with App id application_1492107639289_0002)
   INFO  : Map 1: -/- 
   INFO  : Map 1: 0/1 
   INFO  : Map 1: 0(+1)/1 
   INFO  : Map 1: 0(+1)/1 
   INFO  : Map 1: 0(+1)/1 
   INFO  : Map 1: 0(+1)/1 
   INFO  : Map 1: 0/1 
   INFO  : Map 1: 1/1 
   INFO  : Loading data to table default.mys3test from s3a://myhivebucket/test/.hive-staging_hive_2017-04-13_19-27-13_226_6105571528298793138-1/-ext-10000
   INFO  : Table default.mys3test stats: [numFiles=1, numRows=2, totalSize=14, rawDataSize=12]
   No rows affected (53.854 seconds)
   0: jdbc:hive2://xlnode-standalone.hwx.com:218> select * from mys3test;
   +----------------+----------------+--+
   | mys3test.col1  | mys3test.col2  |
   +----------------+----------------+--+
   | 1              | test           |
   | 2              | test           |
   +----------------+----------------+--+
   2 rows selected (3.554 seconds)
   0: jdbc:hive2://xlnode-standalone.hwx.com:218> 

ranger-policy-hive-update.png
3,926 Views