Member since: 01-22-2016
Posts: 41
Kudos Received: 10
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 1665 | 05-22-2017 04:03 PM |
| | 19585 | 09-26-2016 09:00 AM |
| | 4659 | 05-23-2016 09:11 PM |
| | 7492 | 05-23-2016 08:00 AM |
05-23-2017
10:00 AM
@Pardeep This code in theory runs perfectly for me, with the HDFS stdout showing: Replication 3 set: /apps/hive/warehouse....
However, once the script has finished, the blocks still remain under-replicated. Any idea as to what else I could do?
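In case it helps anyone hitting the same symptom, a couple of standard HDFS commands can confirm whether the replication change has actually taken effect (the warehouse path here is taken from the truncated output above and may need adjusting):

```shell
# Report blocks that are still under-replicated under the path in question
hdfs fsck /apps/hive/warehouse -blocks | grep -i "under replicated"

# Re-apply the replication factor; -w waits until replication
# actually completes instead of returning immediately
hdfs dfs -setrep -w 3 /apps/hive/warehouse
```

If `-setrep -w` hangs for a long time, the cluster may simply not have enough live DataNodes to satisfy the target replication factor.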
05-22-2017
04:03 PM
There was an issue with the Ranger KMS UI which prevented me from making any changes to the policy. Instead, I used the REST API to update the policy, which worked successfully. The change I made was to add the hdfs user to the 'GENERATE_EEK' policy. API documentation and resources:
https://community.hortonworks.com/articles/76118/how-to-access-ranger-kms-policies-via-rest-api.html
https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.6+-+REST+APIs+for+Service+Definition%2C+Service+and+Policy+Management
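For reference, a sketch of the kind of call I mean, using Ranger's public v2 policy endpoint. The hostname, credentials and the policy id are placeholders; look the id up in your own policy listing first:

```shell
# Fetch the existing KMS policy as JSON (policy id 5 is a placeholder)
curl -u admin:password -X GET \
  "http://rangeradmin.example.com:6080/service/public/v2/api/policy/5" \
  -o policy.json

# Edit policy.json to add "hdfs" to the users list of the policy item
# that grants GENERATE_EEK, then push the updated policy back:
curl -u admin:password -X PUT -H "Content-Type: application/json" \
  -d @policy.json \
  "http://rangeradmin.example.com:6080/service/public/v2/api/policy/5"
```

The GET-edit-PUT round trip avoids having to hand-build the whole policy document, which is easy to get wrong.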
05-19-2017
09:43 AM
I've recently upgraded the cluster to HDP 2.5.3, as well as Ambari to 2.4.2.0; however, I'm now facing problems running Hive queries. Each query that invokes Tez (e.g. `insert`) results in the following error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): java.util.concurrent.ExecutionException: org.apache.hadoop.security.authorize.AuthorizationException: User:hdfs not allowed to do 'GENERATE_EEK' on 'hive'

Here are my commands:

$ kinit -kt /etc/security/keytabs/automation.keytab
$ beeline -u 'jdbc:hive2://hiverserver2:10000/default;principal=hive/hiverserver2@ACTIVE.DIRECTORY' -f hive_script.hql

This is obviously something that was working before the upgrade. Why is it running the script as the hdfs user? I have not added the `hdfs` user to the 'GENERATE_EEK' policy in the Ranger KMS UI, as this is not advised (and also not permitted). Are there any settings that need to be adjusted after the upgrade?
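One thing worth checking when debugging this kind of identity mismatch: `kinit -kt` without an explicit principal picks a default principal name, which may not be the entry you expect from the keytab. Listing the keytab and passing the principal explicitly removes the ambiguity (the principal name below is a placeholder):

```shell
# Show which principals the keytab actually contains
klist -kt /etc/security/keytabs/automation.keytab

# Authenticate explicitly as one of the principals listed
kinit -kt /etc/security/keytabs/automation.keytab automation@ACTIVE.DIRECTORY

# Confirm which principal the current ticket belongs to
klist
```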
10-06-2016
02:55 PM
Hi @vasanath rajendran, check out this article: https://community.hortonworks.com/questions/58096/is-there-a-working-python-hive-library-that-connec.html#answer-58343
09-26-2016
09:00 AM
2 Kudos
This connection string will work as long as the user running the script has a valid Kerberos ticket:

import pyhs2

with pyhs2.connect(host='beelinehost@hadoop.com',
                   port=10000,
                   authMechanism="KERBEROS") as conn:
    with conn.cursor() as cur:
        print cur.getDatabases()

Username, password and any other configuration parameters are not passed through the KDC.
09-23-2016
03:14 PM
1 Kudo
I have tried using the following Python libraries to connect to a kerberised Hive instance:

- PyHive
- Impyla
- Pyhs2

None of them seem to be able to connect. Here is the error message I see when using Impyla:

>>> from impala.dbapi import connect
>>> conn = connect(host='hdpmaster.hadoop', port=10000, database='default', auth_mechanism='GSSAPI', kerberos_service_name='user1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/local/lib/python2.7/dist-packages/impala/hiveserver2.py", line 658, in connect
    transport.open()
  File "/usr/local/lib/python2.7/dist-packages/thrift_sasl/__init__.py", line 72, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server not found in Kerberos database)

Does anyone have a working connection string?

Thanks,
Dale
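For anyone hitting the same "Server not found in Kerberos database" failure: it usually means the client built a service principal that does not exist in the KDC. With Impyla, `kerberos_service_name` becomes the first component of that principal, so 'user1' above makes the client request a ticket for user1/hdpmaster.hadoop@REALM, whereas HiveServer2 normally runs as 'hive'. A quick way to test which principal actually exists (the realm below is a placeholder):

```shell
# Ask the KDC for a service ticket for each candidate principal;
# the one that exists succeeds, the other fails with the same
# "Server not found in Kerberos database" error
kvno user1/hdpmaster.hadoop@ACTIVE.DIRECTORY
kvno hive/hdpmaster.hadoop@ACTIVE.DIRECTORY
```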
Labels:
- Apache Hive
- Apache Impala
07-20-2016
03:08 PM
@skonduru Interestingly, setting "sAMAccountName": "$sAMAccountName.substring(0,20)" failed for me when installing Kerberos. And wouldn't this also result in an inconsistent naming convention? E.g. the sAMAccountName for HDFS would be hdfs/node01.hadoop.p, but the sAMAccountName for ZooKeeper would be zookeeper/node01.had. Is there a better way to achieve a consistent naming convention?
07-19-2016
02:35 PM
Is rangerrepouser listed in the Ranger UI? For an HA configuration to work, you need to add the properties below in the repo config (i.e. additional entries in the advanced section). They can be copied from hdfs-site.xml. Note that the rpc-address keys include the nameservice, and the failover proxy provider is keyed by the nameservice as well:

dfs.nameservices = <ha_name>
dfs.ha.namenodes.<ha_name> = <nn1>,<nn2>
dfs.namenode.rpc-address.<ha_name>.<nn1> = <nn1_host:8020>
dfs.namenode.rpc-address.<ha_name>.<nn2> = <nn2_host:8020>
dfs.client.failover.proxy.provider.<ha_name> = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
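Filled in with a hypothetical nameservice name (mycluster) and NameNode ids, the entries would look like this (hostnames are placeholders; in hdfs-site.xml the rpc-address and failover proxy provider keys are suffixed with the nameservice name):

```
dfs.nameservices = mycluster
dfs.ha.namenodes.mycluster = nn1,nn2
dfs.namenode.rpc-address.mycluster.nn1 = master01.hadoop:8020
dfs.namenode.rpc-address.mycluster.nn2 = master02.hadoop:8020
dfs.client.failover.proxy.provider.mycluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
```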
07-19-2016
12:57 PM
Also make sure the parameters in Ambari are correct for the plug-ins, restart HDFS and Ranger, and then make sure the parameters in the Ranger UI are correct. Are you using HDFS HA?
07-18-2016
09:09 AM
How did this go for you, @skonduru? Did you have to take the additional steps to limit the value?