Member since
05-10-2016
184
Posts
60
Kudos Received
6
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
4105 | 05-06-2017 10:21 PM | |
4109 | 05-04-2017 08:02 PM | |
5024 | 12-28-2016 04:49 PM | |
1243 | 11-11-2016 08:09 PM | |
3338 | 10-22-2016 03:03 AM |
11-08-2016
07:18 PM
Can you trying creating an encryption zone on a path which already exists on HDFS ? The error says that it cannot find /enc_zone2. Trying doing "hdfs dfs -mkdir -p /enc_zone2 and then execute this command.
... View more
11-07-2016
06:20 PM
2 Kudos
Error org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Access denied for user 'hive'@'node1.hortonworks.com' (using password: YES)
SQL Error code: 1045
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
at org.apache.hive.beeline.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:80)
at org.apache.hive.beeline.HiveSchemaTool.getConnectionToMetastore(HiveSchemaTool.java:133)
at org.apache.hive.beeline.HiveSchemaTool.testConnectionToMetastore(HiveSchemaTool.java:187)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:291)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:277)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:526)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233) Background The above output is an attempt of Ambari trying to initiate the hive metastore process based on the parameters supplied in the following properites javax.jdo.option.ConnectionDriverName
javax.jdo.option.ConnectionURL
javax.jdo.option.ConnectionUserName
This is mostly not required if all the credentials are aligned, however, its possible we might end up with such issues when we are trying to use customized DB access. The above error is relevant to mysql db as metastore repository, however, same principal can be applied to other DBs as well. Mysql's authentication mechanism allows user/host credentials to be defined in a certain order. Here is an example. mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select user,host from user;
+------+-----------------------+
| user | host |
+------+-----------------------+
| hive | % |
| hive | 127.0.0.1 |
| root | 127.0.0.1 |
| hive | ::1 |
| root | ::1 |
| hive | localhost |
| root | localhost |
| hive | node1.hortonworks.com |
| root | node1.hortonworks.com |
| hive | node2.hortonworks.com |
| hive | node3.hortonworks.com |
+------+-----------------------+
11 rows in set (0.00 sec)
We can verify the error as indicated by the ambari log output at mysql level, using the superuser access, because this is what Ambari server is trying to do. [root@node1 ~]# mysql -u hive -h node1.hortonworks.com -p
Enter password:
ERROR 1045 (28000): Access denied for user 'hive'@'node1.hortonworks.com' (using password: YES)
How-to-fix Observe the output before exception from Ambari log, look at the username, password and the command string which is being attempted. resource_management.core.exceptions.Fail: Execution of 'export HIVE_CONF_DIR=/usr/hdp/current/hive-metastore/conf/conf.server ; /usr/hdp/current/hive-server2-hive2/bin/schematool -initSchema -dbType mysql -userName hive -passWord [PROTECTED] -verbose' returned 1. which: no hbase in (/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent) If you are not using the root user, ensure that you add another property to hive-site.xml i.e., javax.jdo.option.ConnectionPassword with password for the user that is configured for property javax.jdo.option.ConnectionUserName At mysql level, login as root user and grant with the following syntax (you might need to execute 'flush privileges' post this, not necessary. We do need to note that if you do not specify "identified by", the error would still remain even though the user 'hive' has been configured to be identified by 'password' which was set while creating the user. mysql> grant all on *.* to 'hive'@'node1.hortonworks.com' identified by 'changeme';
Query OK, 0 rows affected (0.00 sec) Even if local to the host, test out if you are able to login with the supplied credentials [root@node1 ~]# mysql -u hive -h node1.hortonworks.com -p
Enter password:
... View more
Labels:
11-01-2016
06:30 PM
1 Kudo
good catch Neeraj. You can try disabling the ranger plugin via Ambari to test if it works otherwise.
... View more
10-26-2016
11:05 PM
Problem Using rhive with following parameters to connect to hive library(RHive)
Sys.setenv(HADOOP_CONF_DIR="/etc/hadoop/conf")
Sys.setenv(RHIVE_HIVESERVER_VERSION="2")
Sys.setenv(HIVE_HOME="/usr/hdp/current/hive-client")
Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")
rhive.init()
> rhive.connect()
2016-10-26 13:06:22,836 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-10-26 13:06:23,720 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-10-26 13:06:24,631 WARN [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-10-26 13:06:25,237 INFO [Thread-5] jdbc.Utils (Utils.java:parseURL(309)) - Supplied authorities: 127.0.0.1:10000
2016-10-26 13:06:25,240 INFO [Thread-5] jdbc.Utils (Utils.java:parseURL(397)) - Resolved authority: 127.0.0.1:10000
2016-10-26 13:06:25,292 INFO [Thread-5] jdbc.HiveConnection (HiveConnection.java:openTransport(209)) - Will try to open client transport with JDBC Uri: jdbc:hive2://127.0.0.1:10000/default
Exception in thread "Thread-5" java.lang.RuntimeException: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.connect(HiveJdbcClient.java:332)
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.run(HiveJdbcClient.java:314)
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:255)
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:246)
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:592)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:195)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at com.nexr.rhive.hive.DatabaseConnection.connect(DatabaseConnection.java:52)
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.connect(HiveJdbcClient.java:325)
... 1 more
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:266)
at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:202)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:402)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:297)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1253)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1238)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy20.open(Unknown Source)
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:258)
... 12 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.hive.service.cli.session.HiveSessionImpl.open(HiveSessionImpl.java:137)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
... 20 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException:Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:596)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
... 25 more
Resolution The mechanism to pass on a user name resolves the issue. Here is the example of the passing the connection string which works for rhive. library(RHive)
Sys.setenv(HADOOP_CONF_DIR="/etc/hadoop/conf")
Sys.setenv(RHIVE_HIVESERVER_VERSION="2")
Sys.setenv(HIVE_HOME="/usr/hdp/current/hive-client")
Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")
rhive.init()
rhiveConnection<-rhive.connect(host="127.0.0.1", port=10000, hiveServer2=NA, defaultFS=NULL, updateJar=FALSE, user="hive", password=NULL, db="default", properties = character(0))
NOTE: Ensure following parameters are set correctly JAVA_HOME PATH should pick our JAVA_HOME's binary so "export PATH=$JAVA_HOME/bin:${PATH}" Rhive looks for binaries, thus when setting the HIVE_HOME and HADOOP_HOME, ensure that hive-client and hadoop-client directories are being picked. Verify environment variables for rhive using rhive.env() Very important to use doube quotes "" for enclosing the string/character values, like user/database name
... View more
Labels:
10-25-2016
08:06 PM
2 Kudos
Observed in Teradata's distribution of Hortonworks, when ACID is enabled via Ambari, following error can show up. Error: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract (state=,code=0)
org.apache.hive.service.cli.HiveSQLException: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:247)
at org.apache.hive.beeline.Commands.execute(Commands.java:848)
at org.apache.hive.beeline.Commands.sql(Commands.java:713)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:983)
at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:718)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.RuntimeException: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.postgresql.jdbc3.Jdbc3ResultSet.isClosed(Jdbc3ResultSet.java)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:1047)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:1059)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.getOpenTxns(TxnHandler.java:279)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_open_txns(HiveMetaStore.java:5567)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy8.get_open_txns(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getValidTxns(HiveMetaStoreClient.java:1829)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
at com.sun.proxy.$Proxy9.getValidTxns(Unknown Source)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidTxns(DbTxnManager.java:327)
at org.apache.hadoop.hive.ql.Driver.recordValidTxns(Driver.java:994)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:441)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:406)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
... 18 more
Closing: 0: jdbc:hive2://localhost:10000/default
This has nothing to do with the distribution itself, however, it is a result of the JDBC driver copied in this location: /usr/hdp/2.x.x.x-xxx/hive/lib Resolution Find the version of your metadata database, ideally it could be one of the following (existing databases)
MySQL Oracle PostgreSQL Find the compliant JDBC drivers
For MySQL (finding the version of DB) or ask your Database Administrator who manages this DB [root@sandbox lib]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 41
Server version: 5.1.73 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
In the above scenario, version of MySQL DB is 5.1.73. Ensure that the relevant JDBC driver is downloaded HERE For PostgreSQL (finding version of DB) or ask your Database Administrator who manages this DB [root@sandbox lib]# su - postgres
-bash-4.1$ psql -c "select version()"
version
-------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.4.20 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)
-bash-4.1$
The version of PostgreSQL db here is 8.4.20. You can get the drivers for PostgreSQL JDBC HERE Additionally, ensure that only JDBC4 type of drivers are being used. In the error raised above JDBC3 type drivers were being used which are not as efficient as JDBC4. Download the relevant JDBC4 jars and replace them with the drivers in /usr/hdp/2.x.x.x.xxxx/hive/lib Ensure that hive.txn.manager is set to "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager", then restart Ambari server and all other services which indicate that a restart is required Test out by running some queries via Hive View, Beeline OR Hive Shell
... View more
Labels:
10-22-2016
03:39 AM
Thats the default entry when configuring via Ambari. This is a pain everyone goes through 🙂
... View more
10-22-2016
03:03 AM
The values should be * for both the properties.
... View more
10-22-2016
03:03 AM
1 Kudo
@Tao ZhongTry adding "hadoop.proxyuser.hdfs.hosts" and "hadoop.proxyuser.hdfs.hosts" properties in custom core-site.xml via ambari or in the file you aren't using ambari. Restart HDFS/MapR/Yarn services.
... View more
10-13-2016
11:15 PM
Goals Use custom kerberos Principal for configuring HAWQ login Verifying the usage by creating, loading and query an object with non-default principal NOTE HAWQ does not allow these parameters to be changed during the installation as configuration files are only generated after initialization of the HAWQ init process, so this is applicable only when you have a HAWQ cluster up and running with kerberos enabled. Steps Part 1: Kerberos Ensure that the HAWQ is up and running Login to kerberos server and create a custom headless principal kadmin.local: addprinc -randkey custom-postgres@HORTONWORKS.COM
WARNING: no policy specified for custom-postgres@HORTONWORKS.COM; defaulting to no policy
Principal "custom-postgres@HORTONWORKS.COM" created.
Export the key into a keytab file, ensure that the name remains as "hawq.service.keytab" or else this will be required to change kadmin.local: xst -k hawq.service.keytab custom-postgres@HORTONWORKS.COM
....
kadmin.local: quit
Copy over the keytab file over to all the hosts on which you either have HAWQ master, standby or Segments. For fail safe, keep a backup of the existing "hawq.service.keytab". Verify the difference between the old and the new keytabs [gpadmin@master master]$ klist -kt /etc/security/keytabs/hawq.service.keytab.orig
Keytab name: FILE:/etc/security/keytabs/hawq.service.keytab.orig
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
2 10/10/2016 22:02:01 postgres@HORTONWORKS.COM
2 10/10/2016 22:02:01 postgres@HORTONWORKS.COM
2 10/10/2016 22:02:01 postgres@HORTONWORKS.COM
2 10/10/2016 22:02:01 postgres@HORTONWORKS.COM
2 10/10/2016 22:02:01 postgres@HORTONWORKS.COM
[gpadmin@master master]$ klist -kt /etc/security/keytabs/hawq.service.keytab
Keytab name: FILE:/etc/security/keytabs/hawq.service.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
2 10/13/2016 15:42:36 custom-postgres@HORTONWORKS.COM
2 10/13/2016 15:42:36 custom-postgres@HORTONWORKS.COM
2 10/13/2016 15:42:36 custom-postgres@HORTONWORKS.COM
2 10/13/2016 15:42:36 custom-postgres@HORTONWORKS.COM
2 10/13/2016 15:42:36 custom-postgres@HORTONWORKS.COM
2 10/13/2016 15:42:37 custom-postgres@HORTONWORKS.COM
Part 2: HAWQ Now that we have successfully created the new custom principal, we need to modify HAWQs configuration file. This is simplified. Login to HAWQ master as gpadmin user Verify the service name for HAWQ's existing service name [gpadmin@master master]$ hawq config -s krb_srvname
GUC : krb_srvname
Value : postgres
[gpadmin@master master]$
Verify the Kerberos cache file name [gpadmin@master master]$ hawq config -s krb5_ccname
GUC : krb5_ccname
Value : /tmp/postgres.ccname
Retrieve the filename for Kerberos cache file, this is unique for gpadmin user and can be retrieved once you login using the new kerberos principal. Login as gpadmin user and run kdestroy before running the following [gpadmin@master master]$ klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_1002)
Execute the following commands to modify HAWQ configuration throughout master and segment nodes [gpadmin@master master]$ hawq config -c krb_srvname -v custom_postgres --skipvalidation
GUC krb_srvname already exist in hawq-site.xml
Update it with value: custom_postgres
GUC : krb_srvname
Value : custom_postgres
Modify the cache file name and location according to our setup & finding [gpadmin@master master]$ hawq config -c krb5_ccname -v /tmp/krb5cc_1002 --skipvalidation
GUC krb5_ccname already exist in hawq-site.xml
Update it with value: /tmp/krb5cc_1002
GUC : krb5_ccname
Value : /tmp/krb5cc_1002
NOTE New values are not updated until the service have been restarted once. Proceed with restarting HAWQ services via Ambari or using hawq stop and hawq start. After the restart, verify if the values for the properties were modified [gpadmin@master master]$ hawq config -s krb_srvname && hawq config -s krb5_ccname
GUC : krb_srvname
Value : custom-postgres
GUC : krb5_ccname
Value : /tmp/krb5cc_1002
Login to the HAWQ database using psql prompt and verify if you are able to create objects [gpadmin@master ~]$ psql -p 10432 hdb
psql (8.4.20, server 8.2.15)
WARNING: psql version 8.4, server version 8.2.
Some psql features might not work.
Type "help" for help.
hdb=# drop table test;
DROP TABLE
hdb=# create table test (col1 int);
CREATE TABLE
hdb=#
Validate if you are able to insert and read the data hdb=# insert into test select * from generate_series(1,100);
INSERT 0 100
hdb=# select * from test limit 5;
col1
------
1
2
3
4
5
(5 rows)
hdb=#
If the changes are not made to HAWQ configuration and custom principals are used for hawq.service.keytab generation, you will get an error similar to the following hdb=# create table test (col1 int);
WARNING: failed to login to Kerberos, command line: kinit -k -t /etc/security/keytabs/hawq.service.keytab -c /tmp/postgres.ccname postgres
WARNING: failed to login to Kerberos, command line: kinit -k -t /etc/security/keytabs/hawq.service.keytab -c /tmp/postgres.ccname postgres
CONTEXT: Dropping file-system object -- Relation Directory: '16385/16388/16514'
WARNING: could not remove relation directory 16385/16388/16514: Permission denied
CONTEXT: Dropping file-system object -- Relation Directory: '16385/16388/16514'
ERROR: could not create relation directory hdfs://master.hortonworks.com.hortonworks.com:8020/hawq_default/16385/16388/16514: Permission denied
hdb=# \q
... View more
Labels:
10-07-2016
02:17 AM
4 Kudos
Goals
Get basic understanding on how variables are set at HiveCLI and Beeline utilities Retrieving values from table based on variabels Creating views based on variables Executing multiple statements using session variables via HiveCLI and Beeline Manipulating variables at runtime within one session (possible only at hiveCLI) HiveCLI Variables can be set at the session level or while launching the hive shell. Setting session variable while launching hive shell [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
hive> set myvar;
myvar=2016-01-01
hive>
OR Setting the variable at the runtime within Hive CLI hive> set myvar=2061-01-20
> ;
hive> set myvar;
myvar=2061-01-20
hive>
Performing SELECTs based on variable to retrieve data hive> select * from t_t1 limit 2;
OK
string_1 2016-01-01
string_2 2016-01-02
Time taken: 4.904 seconds, Fetched: 2 row(s)
hive> describe t_t1;
OK
f1 string
d1 date
Time taken: 0.62 seconds, Fetched: 2 row(s)
hive> set myvar=2016-01-02;
hive> select * from t_t1 where d1 = '${hiveconf:myvar}';
OK
string_2 2016-01-02
Time taken: 1.735 seconds, Fetched: 1 row(s)
Creating VIEWs based on tables using variables. hive> create view v_t1 as select * from t_t1 where d1 = '${hiveconf:myvar}';
OK
Time taken: 2.823 seconds
hive> select * from v_t1;
OK
string_2 2016-01-02
Time taken: 0.699 seconds, Fetched: 1 row(s)
hive> set myvar;
myvar=2016-01-02
hive>
NOTE: Views are created only on static values, even if it is null or empty. Using variables at session level to execute multiple statements. [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01 -e 'select * from t_t1 where d1="${hiveconf:myvar}";'
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 9.942 seconds, Fetched: 1 row(s)
Executing multiple statements at session level during runtime [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01 -e 'select * from t_t1 where d1="${hiveconf:myvar}"; select * from t_t1 where d1="${hiveconf:myvar}" AND d1 is not null limit 2'
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 7.965 seconds, Fetched: 1 row(s)
OK
string_1 2016-01-01
Time taken: 2.595 seconds, Fetched: 1 row(s
Manipulating variable with multiple values in one session #Create a file query.sql with following entries
set myvar=2016-01-01;
select * from t_t1 where d1="${hiveconf:myvar}";
set myvar=2016-01-02;
select * from t_t1 where d1="${hiveconf:myvar}";
# Execute with "-f" option to execute a file via Hive Shell
[hive@sandbox ~]$ hive -f query.sql
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 7.969 seconds, Fetched: 1 row(s)
OK
string_2 2016-01-02
Time taken: 0.7 seconds, Fetched: 1 row(s) Beeline Setting session variable while launching the beeline client (Variables are represented using # in the JDBC url) [hive@sandbox ~]$ beeline -u "jdbc:hive2://localhost:10000/default;#myvar=2016-01-01" -n hive -p ''
0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-01 |
+-------------------+--+
1 row selected (0.21 seconds)
Setting variable at Runtime is not an option with beeline 0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-02;
Error: Error while processing statement: Cannot modify myvar at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1)
NOTE: One can workaround this problem by setting the property "hive.security.authorization.sqlstd.confwhitelist.append" at customer hiveserver2-site via Ambari or in the xml file. For our variable, we can set this in the following way. hive.security.authorization.sqlstd.confwhitelist.append=myvar This change requires a restart. After restart of the services, a new beeline session will allow you to set the properties at runtime. 0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-03 |
+-------------------+--+
1 row selected (0.008 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-04;
No rows affected (0.003 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-04 |
+-------------------+--+
1 row selected (0.006 seconds)
0: jdbc:hive2://localhost:10000/default>
Performing SELECTs based on session variables; 0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-01;
No rows affected (0.003 seconds)
0: jdbc:hive2://localhost:10000/default> select * from t_t1 where d1 = "${hiveconf:myvar}";
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_1 | 2016-01-01 |
+-----------+-------------+--+
1 row selected (1.61 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-03;
No rows affected (0.004 seconds)
0: jdbc:hive2://localhost:10000/default> select * from t_t1 where d1 = '${hiveconf:myvar}';
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_3 | 2016-01-03 |
+-----------+-------------+--+
1 row selected (1.547 seconds)
NOTE:- To enforce the values set by you for the variable "myvar" above method can be used. Creating VIEWS based on the variable (Same as in HiveCLI) Executing multiple statements using variable at session level [hive@sandbox ~]$ beeline -u "jdbc:hive2://localhost:10000/default;#myvar=2016-01-01" -n hive -p '' -e 'select * from t_t1 where d1 = "${myvar}"; select * from t_t1 where d1 != "${myvar}";'
Connecting to jdbc:hive2://localhost:10000/default;#myvar=2016-01-01
Connected to: Apache Hive (version 1.2.1.2.3.2.0-2950)
Driver: Hive JDBC (version 1.2.1.2.3.2.0-2950)
Transaction isolation: TRANSACTION_REPEATABLE_READ
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_1 | 2016-01-01 |
+-----------+-------------+--+
1 row selected (0.702 seconds)
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_2 | 2016-01-02 |
| string_3 | 2016-01-03 |
| string_4 | 2016-01-04 |
| string_5 | 2016-01-05 |
| string_6 | 2016-01-06 |
| string_7 | 2016-01-07 |
| string_8 | 2016-01-08 |
| string_9 | 2016-01-09 |
+-----------+-------------+--+
8 rows selected (1.595 seconds)
Beeline version 1.2.1.2.3.2.0-2950 by Apache Hive
Closing: 0: jdbc:hive2://localhost:10000/default;#myvar=2016-01-01
Multiple statements can be executed in the beeline by modifying the session variable at runtime in the same way as with Hive CLI Try it yourselves 0: jdbc:hive2://localhost:10000/default> show create table t_t1;
+-----------------------------------------------------------------+--+
| createtab_stmt |
+-----------------------------------------------------------------+--+
| CREATE EXTERNAL TABLE `t_t1`( |
| `f1` string, |
| `d1` date) |
| ROW FORMAT DELIMITED |
| FIELDS TERMINATED BY ',' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION |
| 'hdfs://sandbox.hortonworks.com:8020/Data2/testdata' |
| TBLPROPERTIES ( |
| 'COLUMN_STATS_ACCURATE'='false', |
| 'numFiles'='1', |
| 'numRows'='-1', |
| 'rawDataSize'='-1', |
| 'totalSize'='162', |
| 'transient_lastDdlTime'='1475686495') |
+-----------------------------------------------------------------+--+
18 rows selected (0.362 seconds)
0: jdbc:hive2://localhost:10000/default> insert into t_t1 values ('Name1','2016-01-01'), ('Name2','2016-01-02');
,
Goals Get basic understanding on how variables are set at HiveCLI and Beeline utilities Retrieving values from table based on variabels Creating views based on variables Executing multiple statements using session variables via HiveCLI and Beeline Manipulating variables at runtime within one session (possible only at hiveCLI) HiveCLI Variables can be set at the session level or while launching the hive shell.
Setting session variable while launching hive shell [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
hive> set myvar;
myvar=2016-01-01
hive>
OR
Setting the variable at the runtime within Hive CLI hive> set myvar=2061-01-20
> ;
hive> set myvar;
myvar=2061-01-20
hive>
Performing SELECTs based on variable to retrieve data hive> select * from t_t1 limit 2;
OK
string_1 2016-01-01
string_2 2016-01-02
Time taken: 4.904 seconds, Fetched: 2 row(s)
hive> describe t_t1;
OK
f1 string
d1 date
Time taken: 0.62 seconds, Fetched: 2 row(s)
hive> set myvar=2016-01-02;
hive> select * from t_t1 where d1 = '${hiveconf:myvar}';
OK
string_2 2016-01-02
Time taken: 1.735 seconds, Fetched: 1 row(s)
Creating VIEWs based on tables using variables. hive> create view v_t1 as select * from t_t1 where d1 = '${hiveconf:myvar}';
OK
Time taken: 2.823 seconds
hive> select * from v_t1;
OK
string_2 2016-01-02
Time taken: 0.699 seconds, Fetched: 1 row(s)
hive> set myvar;
myvar=2016-01-02
hive>
NOTE: Views are created only on static values, even if it is null or empty.
Using variables at session level to execute multiple statements. [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01 -e 'select * from t_t1 where d1="${hiveconf:myvar}";'
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 9.942 seconds, Fetched: 1 row(s)
Executing multiple statements at session level during runtime [hive@sandbox ~]$ hive -hiveconf myvar=2016-01-01 -e 'select * from t_t1 where d1="${hiveconf:myvar}"; select * from t_t1 where d1="${hiveconf:myvar}" AND d1 is not null limit 2'
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 7.965 seconds, Fetched: 1 row(s)
OK
string_1 2016-01-01
Time taken: 2.595 seconds, Fetched: 1 row(s
Manipulating variable with multiple values in one session #Create a file query.sql with following entries
set myvar=2016-01-01;
select * from t_t1 where d1="${hiveconf:myvar}";
set myvar=2016-01-02;
select * from t_t1 where d1="${hiveconf:myvar}";
# Execute with "-f" option to execute a file via Hive Shell
[hive@sandbox ~]$ hive -f query.sql
Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
OK
string_1 2016-01-01
Time taken: 7.969 seconds, Fetched: 1 row(s)
OK
string_2 2016-01-02
Time taken: 0.7 seconds, Fetched: 1 row(s) Beeline
Setting session variable while launching the beeline client (Variables are represented using # in the JDBC url) [hive@sandbox ~]$ beeline -u "jdbc:hive2://localhost:10000/default;#myvar=2016-01-01" -n hive -p ''
0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-01 |
+-------------------+--+
1 row selected (0.21 seconds)
Setting variable at Runtime is not an option with beeline 0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-02;
Error: Error while processing statement: Cannot modify myvar at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1)
NOTE: One can workaround this problem by setting the property "hive.security.authorization.sqlstd.confwhitelist.append" at customer hiveserver2-site via Ambari or in the xml file. For our variable, we can set this in the following way. hive.security.authorization.sqlstd.confwhitelist.append=myvar This change requires a restart. After restart of the services, a new beeline session will allow you to set the properties at runtime. 0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-03 |
+-------------------+--+
1 row selected (0.008 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-04;
No rows affected (0.003 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar;
+-------------------+--+
| set |
+-------------------+--+
| myvar=2016-01-04 |
+-------------------+--+
1 row selected (0.006 seconds)
0: jdbc:hive2://localhost:10000/default>
Performing SELECTs based on session variables; 0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-01;
No rows affected (0.003 seconds)
0: jdbc:hive2://localhost:10000/default> select * from t_t1 where d1 = "${hiveconf:myvar}";
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_1 | 2016-01-01 |
+-----------+-------------+--+
1 row selected (1.61 seconds)
0: jdbc:hive2://localhost:10000/default> set myvar=2016-01-03;
No rows affected (0.004 seconds)
0: jdbc:hive2://localhost:10000/default> select * from t_t1 where d1 = '${hiveconf:myvar}';
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_3 | 2016-01-03 |
+-----------+-------------+--+
1 row selected (1.547 seconds)
NOTE:- To enforce the values set by you for the variable "myvar" above method can be used.
Creating VIEWS based on the variable (Same as in HiveCLI)
Executing multiple statements using variable at session level [hive@sandbox ~]$ beeline -u "jdbc:hive2://localhost:10000/default;#myvar=2016-01-01" -n hive -p '' -e 'select * from t_t1 where d1 = "${myvar}"; select * from t_t1 where d1 != "${myvar}";'
Connecting to jdbc:hive2://localhost:10000/default;#myvar=2016-01-01
Connected to: Apache Hive (version 1.2.1.2.3.2.0-2950)
Driver: Hive JDBC (version 1.2.1.2.3.2.0-2950)
Transaction isolation: TRANSACTION_REPEATABLE_READ
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_1 | 2016-01-01 |
+-----------+-------------+--+
1 row selected (0.702 seconds)
+-----------+-------------+--+
| t_t1.f1 | t_t1.d1 |
+-----------+-------------+--+
| string_2 | 2016-01-02 |
| string_3 | 2016-01-03 |
| string_4 | 2016-01-04 |
| string_5 | 2016-01-05 |
| string_6 | 2016-01-06 |
| string_7 | 2016-01-07 |
| string_8 | 2016-01-08 |
| string_9 | 2016-01-09 |
+-----------+-------------+--+
8 rows selected (1.595 seconds)
Beeline version 1.2.1.2.3.2.0-2950 by Apache Hive
Closing: 0: jdbc:hive2://localhost:10000/default;#myvar=2016-01-01
Multiple statements can be executed in the beeline by modifying the session variable at runtime in the same way as with Hive CLI
... View more
Labels: