Configure Hue on EMR with Hive meta store on S3


Hi,

I am trying to launch an EMR cluster (release 5.3.1) with Hive 2.1.1 and Hue 3.11.

I tried to follow the instructions on this page: http://gethue.com/introducing-s3-support-in-hue/.

I am launching the cluster from a Python script using boto3, with the following configuration JSON:

[
  {
    "Classification": "core-site",
    "Properties": {
      "fs.s3a.awsAccessKeyId": "<aws key>",
      "fs.s3a.awsSecretAccessKey": "<aws secret key>"
    }
  },
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.metastore.warehouse.dir": "s3://<bucket_name>/<hive-folder>",
      "javax.jdo.option.ConnectionURL": "jdbc:mysql://<rds-url>:3306/hivedb?createDatabaseIfNotExist=true",
      "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
      "javax.jdo.option.ConnectionUserName": "<db_user>",
      "javax.jdo.option.ConnectionPassword": "db_pass",
      "hive.exec.scratchdir": "/hive_temp/",
      "hive.exec.stagingdir": "${hive.exec.scratchdir}/${user.name}/.staging",
      "hive.exec.dynamic.partition.mode": "nonstrict",
      "hive.exec.parallel": "true",
      "hive.exec.compress.intermediate": "true",
      "hive.optimize.index.filter": "true",
      "hive.optimize.index.groupby": "true",
      "hive.cluster.delegation.key.update-interval": "31536000000",
      "hive.cluster.delegation.token.renew-interval": "31536000000",
      "hive.cluster.delegation.token.max-lifetime": "31536000000"
    }
  },
  {
    "Classification": "hue-ini",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "desktop",
        "Properties": {
          "user_access_history_size": "50",
          "time_zone": "Europe/Berlin"
        },
        "Configurations": [
          {
            "Classification": "database",
            "Properties": {
              "name": "hue_db",
              "user": "hue_user",
              "password": "hue_pass",
              "host": "<rds_host>",
              "port": "3306",
              "engine": "mysql"
            },
            "Configurations": []
          }
        ]
      },
      ## HUE AWS
      {
        "Classification": "aws",
        "Properties": {},
        "Configurations": [
          {
            "Classification": "aws_accounts",
            "Properties": {},
            "Configurations": [
              {
                "Classification": "default",
                "Properties": {
                  "allow_environment_credentials": "False",
                  "region": "eu-central-1"
                }
              }
            ]
          }
        ]
      }
    ]
  }
]

This returns an error:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the RunJobFlow operation: Classification 'aws_accounts' is not valid for parent classification 'aws'.
Did I nest the JSON incorrectly?
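To double-check the nesting before calling boto3, one option is to print every classification together with its parents. The sketch below is only a debugging aid: the helper name classification_paths is made up for illustration, and the sample is a trimmed copy of the hue-ini part of the configuration with the properties left out. It shows how the JSON is nested, not which nestings EMR accepts for a given release.

```python
import json


def classification_paths(configurations, parent=()):
    """Yield each classification as a tuple of names from the top level down."""
    for cfg in configurations:
        path = parent + (cfg["Classification"],)
        yield path
        # Child entries live under the optional "Configurations" key.
        yield from classification_paths(cfg.get("Configurations", []), path)


# Trimmed copy of the hue-ini section (properties omitted for brevity).
sample = json.loads("""
[
  {
    "Classification": "hue-ini",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "desktop",
        "Properties": {},
        "Configurations": [
          {"Classification": "database", "Properties": {}, "Configurations": []}
        ]
      },
      {
        "Classification": "aws",
        "Properties": {},
        "Configurations": [
          {
            "Classification": "aws_accounts",
            "Properties": {},
            "Configurations": [
              {"Classification": "default", "Properties": {}}
            ]
          }
        ]
      }
    ]
  }
]
""")

paths = [" > ".join(p) for p in classification_paths(sample)]
for line in paths:
    print(line)
```

The output lists "hue-ini > aws > aws_accounts" as a child of "aws", which matches the parent/child pair named in the ValidationException, so the JSON is parsed the way it was written and the rejection comes from the service side.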
In addition, if I remove the Hue AWS part, the cluster launches without errors, but after logging into Hue I see a misconfiguration error:
Hive  - Failed to access Hive warehouse: s3://<my_bucket>/<hive_directory>

Also, when I open the Query editor, I get the error "Could not connect to <master-node-ip>:1000".
Another point worth mentioning: Hue and Hive are on the same RDS MySQL server, and the Hue user has full access to the Hive DB.
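A quick way to narrow down the connection error is to test, from the master node, whether anything is listening on the port Hue is trying to reach. This is a minimal sketch using only the standard library; the function name port_open is made up for illustration. HiveServer2 listens on port 10000 by default, so it is worth probing both the port shown in the message and 10000.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example usage (replace the placeholder with the real master node address):
# for port in (1000, 10000):
#     print(port, port_open("<master-node-ip>", port))
```

If the port is closed, the problem is more likely that HiveServer2 is not running (or is bound to a different port) than anything in the Hue settings.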

I also tried adding the AWS access keys to the Hue AWS part of the JSON, but I get the same Hive misconfiguration error.

Thanks in advance for your help.