Member since
05-10-2016
184
Posts
60
Kudos Received
6
Solutions
07-06-2021
10:26 AM
@diplompils It is not necessary that file is lost if you are getting the output as false for recoverLease command. Usually file can't be deleted until it has lease acquired and not explicitly deleted using rm command. You can try below- hdfs debug recoverLease -path <file> -retries 10 Or you may check - https://issues.apache.org/jira/browse/HDFS-8576
... View more
04-22-2020
10:38 PM
Good article and is still relevant for me. I would also add the following for Kudu tables create table hive.TABLE_PARAMS_BKUP as select * from hive.TABLE_PARAMS ;
UPDATE hive.TABLE_PARAMS
SET PARAM_VALUE = 'masterhostname01.fqdn,masterhostname02.fqdn,masterhostname03.fqdn'
WHERE PARAM_KEY = 'kudu.master_addresses'
and PARAM_VALUE like '%oldmasterhostname%'; And also for Sentry URI's that contain the HDFS namespace create table sentry.SENTRY_DB_PRIVILEGE_BKUP as select * from sentry.SENTRY_DB_PRIVILEGE ;
UPDATE sentry.SENTRY_DB_PRIVILEGE
SET URI = REPLACE(URI, 'hdfs://oldcluster-ns', 'hdfs://newcluster-ns/')
WHERE URI like '%oldcluster-ns%';
... View more
08-21-2018
09:27 PM
Environment CentOS 7 Ambari 2.7.0.0 Error: After ambari-server setup, and deploying the desired configuration via Ambari, found this error in the ambari-server.log.
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px 'Courier New'; color: #28fe14; background-color: #000000; background-color: rgba(0, 0, 0, 0.9)}
span.s1 {font-variant-ligatures: no-common-ligatures}
span.s2 {font-variant-ligatures: no-common-ligatures; color: #000000; background-color: rgba(40, 254, 20, 0.9)}
2018-08-21 20:09:05,733 WARN [ambari-client-thread-36] HttpChannel:507 - /api/v1/clusters//requests/3
org.springframework.security.web.firewall.RequestRejectedException: The request was rejected because the URL was not normalized.
at org.springframework.security.web.firewall.StrictHttpFirewall.getFirewalledRequest(StrictHttpFirewall.java:123)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:193)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:347)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:263)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1621)
at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:73)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1621)
at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:53)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1621)
at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:130)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1621)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:51)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1621)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:541)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1592)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1239)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:481)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1561)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1141)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:561)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:221)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:210)
at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:140)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:564)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:122)
at org.eclipse.jetty.util.thread.strategy.ExecutingExecutionStrategy.invoke(ExecutingExecutionStrategy.java:58)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:201)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:133)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:672)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:590)
at java.lang.Thread.run(Thread.java:745)
Ambari UI would be stuck at something like this: Workaround Try the "retry" option, for me it went past the "waiting" state and through to the next error, which is a know issue File "/usr/lib/ambari-agent/lib/resource_management/core/source.py", line 197, in get_content raise Fail("Failed to download file from {0} due to HTTP error: {1}".format(self.url, str(ex)))resource_management.core.exceptions.Fail: Failed to download file from http://xlhive3.openstacklocal:8080/resources/mysql-connector-java.jar due to HTTP error: HTTP Error 404: Not Found Try downloading and appropriately naming the mysql-connector jar. If that doesn't work, the copy the jar under /var/lib/ambari-server/resources Restart ambari-server and retry the setup
... View more
Labels:
04-27-2018
02:07 AM
Problem While using LLAP, (can happen anytime, even though it was working a second ago), you might see a MetaException complaining about incorrect MySQL server syntax. Here is the the error output with 10 retries by default, so this could get lengthy. 0: jdbc:hive2://xlautomation-1.h.c:10500/defa> analyze table L3_MONTHLY_dw_dimRisk compute statistics for columns;
Getting log thread is interrupted, since query is done!
Error: Error while compiling statement: FAILED: RuntimeException Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient (state=42000,code=40000)
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: RuntimeException Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:277)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:263)
at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:303)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
at org.apache.hive.beeline.Commands.execute(Commands.java:871)
at org.apache.hive.beeline.Commands.sql(Commands.java:729)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1000)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:835)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:793)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:493)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:476)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: RuntimeException Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:376)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:278)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:312)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:517)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:504)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1497)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1482)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1667)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3627)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3679)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:358)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1300)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1272)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
... 15 more
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException:null
at sun.reflect.GeneratedConstructorAccessor105.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1665)
... 26 more
Caused by: MetaException(message:javax.jdo.JDODataStoreException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'OPTION SQL_SELECT_LIMIT=DEFAULT' at line 1
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:388)
at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:213)
at org.apache.hadoop.hive.metastore.ObjectStore.getObjectCount(ObjectStore.java:1294)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionCount(ObjectStore.java:1277)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy28.getPartitionCount(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updateMetrics(HiveMetaStore.java:6960)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:451)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7034)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:140)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.GeneratedConstructorAccessor105.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1665)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3627)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3679)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:358)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1300)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1272)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:278)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:312)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:517)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:504)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1497)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1482)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745) Assumptions We are using mysql as backed Metastore DB Error shows up only while running LLAP HS2 works fine Issue encountered on HDP-2.6.4 distro Issue Known bug with older version of mysql-connector https://bugs.mysql.com/bug.php?id=66659. This seems to have been addressed in version "mysql-connector-java-5.1.26" and above. Solution(s) Use a new mysql-connector version compatible with your Mysql backend DB version If you do not see issues with HS2, then copy the mysql-connector JAR on this host where HSI (HiveServerInteractive) is installed, in this location "/usr/hdp/2.6.4.0-91/hive2/lib/". There is no softlink so it should be renamed to "mysql-connector-java.jar" Validating the version On host which serves HiveServer2 instance, look for mysql-connector* [root@xlautomation-2 ~]# find /usr/hdp/2.6.4.0-91/ -name "mysql-connector*"
/usr/hdp/2.6.4.0-91/hive2/lib/mysql-connector-java-5.1.45-bin.jar
/usr/hdp/2.6.4.0-91/hive2/lib/mysql-connector-java.jar
/usr/hdp/2.6.4.0-91/hive/lib/mysql-connector-java.jar
Use the jar -xvf command by copying the files to a temp directory, should you want to validate the version. You should be able to do so, using: [root@xlautomation-1]# mkdir abc
[root@xlautomation-1]# mv mysql-connector-java.jar ./abc
[root@xlautomation-1]# /usr/jdk64/jdk1.8.0_112/bin/jar -xvf mysql-connector-java.jar
[root@xlautomation-1]# cd META-INF/
[root@xlautomation-1]# head /tmp/abc/META-INF/MANIFEST.MF
Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.1
Created-By: 4.4.6 20120305 (Red Hat 4.4.6-4) (Free Software Foundation
, Inc.)
Built-By: mockbuild
Bundle-Vendor: Sun Microsystems Inc.
Bundle-Classpath: .
Bundle-Version: 5.1.17
Bundle-Name: Sun Microsystems' JDBC Driver for MySQL
Bundle-ManifestVersion: 2
Default MySQL connector for HiveServer2 comes with a higher version, you can validate this using [root@xlautomation-1 META-INF]# head /tmp/bcd/META-INF/MANIFEST.MF
Manifest-Version: 1.0
Ant-Version: Apache Ant 1.8.2
Created-By: 1.8.0_92-b14 (Oracle Corporation)
Built-By: pb2user
Specification-Title: JDBC
Specification-Version: 4.2
Specification-Vendor: Oracle Corporation
Implementation-Title: MySQL Connector Java
Implementation-Version: 5.1.45
Implementation-Vendor-Id: com.mysql
... View more
Labels:
05-29-2018
08:41 PM
Thanks Shyam, solved my issue with only copying the latest jar file .
... View more
01-26-2018
11:01 PM
2 Kudos
Simple illustration of locking in Hive when ACID is enabled Considerations for illustration Cluster Version: HDP -2.5.6.0 Hive Version: Hive 1.2.1000 Enabled with following properties in place
hive.support.concurrency=true hive.compactor.initiator.on=true hive.compactor.worker.threads=1 hive.exec.dynamic.partition.mode=nonstrict hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager (if non-Ambari cluster) Types of Supported Locks
S = SHARED or SHARED_READ X = EXCLUSIVE Tables used for testing orc_tab (ORC format table with col1 int and col2 string), non-transactional orc_tab_bucketed(ORC format table with col1 int and col2 string, transactional) txt_tab (TEXT format table with col1 int, col2 string, non-transactional, for loading purposes) Either tables have closed to 5 GB data on a Single node cluster SCENARIO 1 (Non Transactional Table) SELECT blocks ALTER SELECT starts first followed by ALTER Session A beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2" Session B beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab ADD COLUMNS (col3 string)" Session C +----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info |
| 31.1 | default | orc_tab | NULL | ACQUIRED | | SHARED_READ | NULL | 1517003062122 | 1517003062122 | hive | xlpatch.openstacklocal | hive_20180126214422_aaeb4b28-5170-4131-b509-ef0213c8b842 |
| 32.1 | default | orc_tab | NULL | WAITING | 31.1 | EXCLUSIVE | NULL | 1517003063314 | NULL | hive | xlpatch.openstacklocal | hive_20180126214422_a65af104-05d1-4c19-ab54-7bb37b4cdbfa |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+ SCENARIO 2 (Non Transactional Table) SELECT blocks INSERT OVERWRITE SELECT starts first followed by INSERT OVERWRITE Session A beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2" Session B beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT OVERWRITE TABLE orc_tab SELECT col1,col2 from txt_tab" Session C 0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info || 36.1 | default | orc_tab | NULL | ACQUIRED | | SHARED_READ | NULL | 1517003567582 | 1517003567582 | hive | xlpatch.openstacklocal | hive_20180126215247_7537e30b-d5bf-4fc8-aa23-8e860efe1ac8 || 37.1 | default | txt_tab | NULL | WAITING | | SHARED_READ | NULL | 1517003568897 | NULL | hive | xlpatch.openstacklocal | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5 || 37.2 | default | orc_tab | NULL | WAITING | 36.1 | EXCLUSIVE | NULL | 1517003568897 | NULL | hive | xlpatch.openstacklocal | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5 |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+ SCENARIO 3 (Non Transactional Table) SELECT blocks INSERT SELECT starts first followed by INSERT Session A beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2" Session B beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO orc_tab SELECT col1,col2 from txt_tab limit 20" Session C 0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info || 38.1 | default | orc_tab | NULL | ACQUIRED | | SHARED_READ | NULL | 1517004119030 | 1517004119030 | hive | xlpatch.openstacklocal | hive_20180126220158_775842e7-5e34-42d0-b574-874076fd5204 || 39.1 | default | txt_tab | NULL | WAITING | | SHARED_READ | NULL | 1517004120971 | NULL | hive | xlpatch.openstacklocal | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224 || 39.2 | default | orc_tab | NULL | WAITING | 38.1 | EXCLUSIVE | NULL | 1517004120971 | NULL | hive | xlpatch.openstacklocal | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224 |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.028 seconds) SCENARIO 4 (Transactional Table) SELECT does not block INSERT SELECT starts first followed by INSERT Session A beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2" Session B beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO orc_tab_bucketed SELECT col1,col2 from txt_tab limit 20" Session C 0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info || 42.1 | default | orc_tab_bucketed | NULL | ACQUIRED | | SHARED_READ | NULL | 1517004495025 | 1517004495025 | hive | xlpatch.openstacklocal | hive_20180126220814_cae3893a-8e97-49eb-8b07-a3a60c4a6dc2 || 43.1 | default | txt_tab | NULL | ACQUIRED | | SHARED_READ | 3 | 0 | 1517004495874 | hive | xlpatch.openstacklocal | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9 || 43.2 | default | orc_tab_bucketed | NULL | ACQUIRED | | SHARED_READ | 3 | 0 | 1517004495874 | hive | xlpatch.openstacklocal | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9 |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.02 seconds) SCENARIO 5 (Transactional Table) SELECT does not block INSERT OVERWRITE SELECT starts first followed by INSERT OVERWRITE Session A beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2" Session B beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab_bucketed ADD COLUMNS (col3 string)" Session C 0: jdbc:hive2://localhost:10000/default> show locks;Getting log thread is interrupted, since query is done!+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info || 53.1 | default | orc_tab_bucketed | NULL | ACQUIRED | | SHARED_READ | NULL | 1517005855005 | 1517005855005 | hive | xlpatch.openstacklocal | hive_20180126223053_db2d0054-6cb6-48fb-b732-6ca677007695 || 54.1 | default | orc_tab_bucketed | NULL | WAITING | 53.1 | EXCLUSIVE | NULL | 1517005855870 | NULL | hive | xlpatch.openstacklocal | hive_20180126223054_6294af5a-15da-4178-9a83-40f150e08cb1 |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+3 rows selected (0.064 seconds) Synopsis Without "transactional" feature set to true
EXCLUSIVE lock (ALTER) waits for SHARED (SELECT) EXCLUSIVE lock (INSERT OVERWRITE) waits for SHARED (SELECT) EXCLUSIVE lock (INSERT) waits for SHARED (SELECT) With "transactional" enabled
EXCLUSIVE lock (ALTER) waits for SHARED (SELECT) INSERT/SELECT both take SHARED lock
... View more
Labels:
09-27-2017
10:23 PM
3 Kudos
Goal Create a new Ambari view for Hive Interactive. Use this link to get detailed information on configuring views. https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-views/content/settings_and_cluster_configuration.html Steps
Navigate to Ambari page with admin privileges and click on the username dropdown icon
Select the views link to explore all the views available in Ambari
Collapse the "Hive" dropdown and click on "Create Instance" to create a new view for LLAP/Interactive
Give the name of this instance per your requirement
Ensure that under "Settings" tab, "User Interactive Mode" is set to true
If the cluster is kerberized, use proper auth method and principal name
Also update the proper JDBC URL with principal name NOTE: If ranger is enabled, ensure that the user trying to access the database objects does have the permissions to browse the contents of the database(s).
... View more
Labels:
08-03-2017
08:41 PM
6 Kudos
Goal: Demonstrate how to change the database location in HDFS and Metastore There are circumstances wherein we can consider moving the database location. By default, the location for default and custom databases is defined within the value of hive.metastore.warehouse.dir, which is /apps/hive/warehouse. Here are the illustrated steps to change a custom database location, for instance "dummy.db", along with the contents of the database. Verify the details of the database we would like to move to a new location [hive@xlautomation-2 ~]$ beeline -u "jdbc:hive2://xlautomation-2.h.c:10000/default;principal=hive/xlautomation-2.h.c@H.C"
0: jdbc:hive2://xlautomation-2.h.c:10000/defa> create database dummy;
No rows affected (0.394 seconds)
0: jdbc:hive2://xlautomation-2.h.c:10000/defa> describe database dummy;
+----------+----------+--------------------------------------------------------------+-------------+-------------+-------------+--+
| db_name | comment | location | owner_name | owner_type | parameters |
+----------+----------+--------------------------------------------------------------+-------------+-------------+-------------+--+
| dummy | | hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/dummy.db | hive | USER | |
+----------+----------+--------------------------------------------------------------+-------------+-------------+-------------+--+
1 row selected (0.561 seconds)
NOTE: The example provides the database location i.e. /apps/hive/warehouse/dummy.db which needs to be updated. Verified the same using dummy table to test whether the location update was indeed successful 0: jdbc:hive2://xlautomation-2.h.c:10000/defa> create table dummy.test123 (col1 string, col2 string) row format delimited fields terminated by ',' stored as textfile;
No rows affected (0.691 seconds)
0: jdbc:hive2://xlautomation-2.h.c:10000/defa> insert into dummy.test123 values (1,1),(2,2),(3,3),(4,4),(5,5),(6,6);
INFO : Session is already open
INFO : Dag name: insert into dummy.tes...3),(4,4),(5,5),(6,6)(Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1499274604190_0034)
INFO : Loading data to table dummy.test123 from hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/dummy.db/test123/.hive-staging_hive_2017-08-03_16-20-11_965_647196527379814552-1/-ext-10000
INFO : Table dummy.test123 stats: [numFiles=1, numRows=6, totalSize=24, rawDataSize=18]
No rows affected (2.47 seconds)
0: jdbc:hive2://xlautomation-2.h.c:10000/defa> select * from dummy.test123;
+---------------+---------------+--+
| test123.col1 | test123.col2 |
+---------------+---------------+--+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
+---------------+---------------+--+
6 rows selected (0.451 seconds)
Create a new storage DIR of our choice (we used newdummy.db) and replicate the permission at the directory level. [hive@xlautomation-2 ~]$ hdfs dfs -mkdir -p /apps/hive/warehouse/newdummy.db
[hive@xlautomation-2 ~]$ hdfs dfs -mkdir -p /apps/hive/warehouse/newdummy.db
[hive@xlautomation-2 ~]$ hdfs dfs -chmod 777 /apps/hive/warehouse/newdummy.db
Verify if the DB (dir) level permissions are the same [hive@xlautomation-2 ~]$ hdfs dfs -ls /apps/hive/warehouse | egrep dummy.db
drwxrwxrwx - hive hdfs 0 2017-08-03 16:19 /apps/hive/warehouse/dummy.db
drwxrwxrwx - hive hdfs 0 2017-08-03 16:27 /apps/hive/warehouse/newdummy.db
Copy all the underlying contents from /apps/hive/warehouse/dummy.db/ into the new directory [hive@xlautomation-2 ~]$ hdfs dfs -cp -f -p /apps/hive/warehouse/dummy.db/* /apps/hive/warehouse/newdummy.db/ Caution: The usage of "cp" with "p" to preserve the permission is prone to the following error cp: Access time for hdfs is not configured. Please set dfs.namenode.accesstime.precision configuration parameter. This is because the value of dfs.namenode.accesstime.precision is set to 0 by default, in hortonworks HDP distribution. Since this is a client level configuration, it can be configured in hdfs-site.xml on a non-ambari managed cluster in client i.e., from 0 to 3600000. We can verify this at the client level by running the following command. [hive@xlautomation-2 ~]$ hdfs getconf -confKey dfs.namenode.accesstime.precision
3600000
Once the change is made, copy the contents of database folder /dummy.db/* to the new location i.e., /newdummy.db/ as HDFS user. We are overwriting (-f) any existing files within new directory and (-p) preserving the permissions [hdfs@xlautomation-2 ~]$ hdfs dfs -cp -f -p /apps/hive/warehouse/dummy.db/* /apps/hive/warehouse/newdummy.db/ Check the permissions once the copy is completed [hdfs@xlautomation-2 ~]$ hdfs dfs -ls /apps/hive/warehouse/dummy.db/
Found 1 items
drwxrwxrwx - hive hdfs 0 2017-08-03 16:20 /apps/hive/warehouse/dummy.db/test123
[hdfs@xlautomation-2 ~]$
[hdfs@xlautomation-2 ~]$
[hdfs@xlautomation-2 ~]$ hdfs dfs -ls /apps/hive/warehouse/newdummy.db/
Found 1 items
drwxrwxrwx - hive hdfs 0 2017-08-03 16:20 /apps/hive/warehouse/newdummy.db/test123
With the privileged user access to metastore db (hive in our case) we may need to update three tables i.e., DBS, SDS and FUNC_RU as they log the locations for database, table and function in that order. In our example, since we do not have any functions, we will just update SDS and DBS tables mysql> update SDS set location= replace(location,'hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/dummy.db','hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/newdummy.db') where location like '%dummy.db%';
Query OK, 3 rows affected (0.53 sec)
Rows matched: 3 Changed: 3 Warnings: 0
mysql> update DBS set db_location_uri= replace(db_location_uri,'hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/dummy.db','hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/newdummy.db') where db_location_uri like '%dummy.db%';
Query OK, 1 row affected (0.06 sec)
Rows matched: 1 Changed: 1 Warnings: 0
NOTE: If you want to try and run this before committing the changes in metastore, use begin; before and end; after your UPDATE statements. This update statement will replace all the occurrences of specified string within DBS and SDS tables. Check if the changes made to the tables were permanent, the location should be updated to */newdummy.db 0: jdbc:hive2://xlautomation-2.h.c:10000/defa> describe database dummy;
+----------+----------+-----------------------------------------------------------------+-------------+-------------+-------------+--+
| db_name | comment | location | owner_name | owner_type | parameters |
+----------+----------+-----------------------------------------------------------------+-------------+-------------+-------------+--+
| dummy | | hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/newdummy.db | hive | USER | |
+----------+----------+-----------------------------------------------------------------+-------------+-------------+-------------+--+
1 row selected (0.444 seconds)
Verify the data from the table and also confirm its location 0: jdbc:hive2://xlautomation-2.h.c:10000/defa> describe formatted dummy.test123;
+-------------------------------+-------------------------------------------------------------------------+-----------------------------+--+
| col_name | data_type | comment |
+-------------------------------+-------------------------------------------------------------------------+-----------------------------+--+
| # col_name | data_type | comment |
| | NULL | NULL |
| col1 | string | |
| col2 | string | |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | dummy | NULL |
| Owner: | hive | NULL |
| CreateTime: | Thu Aug 03 16:19:33 UTC 2017 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Protect Mode: | None | NULL |
| Retention: | 0 | NULL |
| Location: | hdfs://xlautomation-1.h.c:8020/apps/hive/warehouse/newdummy.db/test123 | NULL |
| Table Type: | MANAGED_TABLE | NULL |
| Table Parameters: | NULL | NULL |
| | COLUMN_STATS_ACCURATE | {\"BASIC_STATS\":\"true\"} |
| | numFiles | 1 |
| | numRows | 6 |
| | rawDataSize | 18 |
| | totalSize | 24 |
| | transient_lastDdlTime | 1501777214 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL |
| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL |
| OutputFormat: | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL |
| Compressed: | No | NULL |
| Num Buckets: | -1 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| Storage Desc Params: | NULL | NULL |
| | field.delim | , |
| | serialization.format | , |
+-------------------------------+-------------------------------------------------------------------------+-----------------------------+--+
33 rows selected (0.362 seconds)
0: jdbc:hive2://xlautomation-2.h.c:10000/defa> select * from dummy.test123;
+---------------+---------------+--+
| test123.col1 | test123.col2 |
+---------------+---------------+--+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
+---------------+---------------+--+
6 rows selected (0.275 seconds)
0: jdbc:hive2://xlautomation-2.h.c:10000/defa>
Considerations Remove the old database directory only when you are sure the tables are readable To check if hive or other privileged user has access to modify contents in metastore database, login to mysql and run the following commands (ensure that you are logged on to the node that hosts metastore database) mysql> show grants for hive;
+--------------------------------------------------------------------------------------------------------------+
| Grants for hive@% |
+--------------------------------------------------------------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%' IDENTIFIED BY PASSWORD '*7ACE763ED393514FE0C162B93996ECD195FFC4F5' |
| GRANT ALL PRIVILEGES ON `hive`.* TO 'hive'@'%' |
+--------------------------------------------------------------------------------------------------------------+
2 rows in set (0.02 sec)
mysql> select user,host from user;
+------+--------------------+
| user | host |
+------+--------------------+
| hive | % |
| root | 127.0.0.1 |
| root | localhost |
| root | xlautomation-2.h.c |
+------+--------------------+
4 rows in set (0.00 sec)
All the operations mentioned above was performed on a kerberized cluster hive --service metatool -updateLocation did not succeed in updating the location, it is successful when changing the namenode uri to HA short name configuration For any external tables whose locations are different, it should ideally not affect its access. Copy output of "hdfs dfs -ls -R /apps/hive/warehouse/dummy.db" to ensure that you have a copy of the permissions before getting rid of the directory.
... View more
Labels:
05-04-2017
08:27 PM
GOAL To change the default log location for HUE from /var/log/hue to elsewhere. Steps Default HUE service writes the log files to /var/log/hue directory. This is somehow enforced in the code which states this The ``log_dir`` will replace the %LOG_DIR% in log.conf. If not specified, we look for the DESTKOP_LOG_DIR environment variable, and then default to the DEFAULT_LOG_DIR. However, trying to set either variables i.e. DESKTOP_LOG_DIR and DEFAULT_LOG_DIR don't seem to work. We can however change the %LOG_DIR% with an absolute path within /etc/hue/conf/log.conf. Here is an output of the log.conf replaced with absolute path for the log directory. #args=('%LOG_DIR%/access.log', 'a', 1000000, 3)
args=('/opt/log/hue/access.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/error.log', 'a', 1000000, 3)
args=('/opt/log/hue/error.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/%PROC_NAME%.log', 'a', 1000000, 3)
args=('/opt/log/hue/%PROC_NAME%.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/shell_output.log', 'a', 1000000, 3)
args=('/opt/log/hue/shell_output.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/shell_input.log', 'a', 1000000, 3)
args=('/opt/log/hue/shell_input.log', 'a', 1000000, 3)
NOTE: Do ensure that the new log directory exists and has hue as user and group. Post the changes, a restart of hue service should be good enough to route the new logs in the new location.
... View more
Labels: