Member since: 10-04-2016
Posts: 243
Kudos Received: 281
Solutions: 43
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 1184 | 01-16-2018 03:38 PM |
| 6153 | 11-13-2017 05:45 PM |
| 3060 | 11-13-2017 12:30 AM |
| 1524 | 10-27-2017 03:58 AM |
| 28464 | 10-19-2017 03:17 AM |
10-19-2017
03:17 AM
1 Kudo
Try using rdd.sortByKey(false). This will sort the RDD in descending order by key.
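For context, here is a minimal sketch of how this looks in the Scala spark-shell; the sample key/value pairs are made up purely for illustration:

// Runs in spark-shell, where the SparkContext sc is already defined.
// The sample pairs below are made up purely for illustration.
val counts = sc.parallelize(Seq(("apple", 3), ("banana", 7), ("cherry", 1)))

// sortByKey() defaults to ascending order; passing false sorts by key in descending order.
val descending = counts.sortByKey(false)
descending.collect().foreach(println)
// prints: (cherry,1), (banana,7), (apple,3)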
10-18-2017
11:54 PM
3 Kudos
This article is an extension to the official HDP documentation. In addition to following the steps listed in that document, perform the following checks to ensure the Atlas Hive hook is configured correctly and does not cause errors once you start executing queries in Hive.
1. In hive-site.xml, verify that hive.server2.async.exec.threads is not set to 1. If it is, increase it to 100.
2. In hive-site.xml, verify that the Atlas hook's maximum thread pool size (atlas.hook.hive.maxThreads) is not set to 1. Increase it to 5 to begin with; you may need to raise it further depending on the load.
Recommended values:
<property>
<name>atlas.hook.hive.maxThreads</name>
<value>5</value>
</property>
<property>
<name>hive.server2.async.exec.threads</name>
<value>100</value>
</property>
10-18-2017
11:39 PM
2 Kudos
Scenario: The cluster is using both the Hive and Atlas components. Sometimes a simple query like 'show databases' fails with the error stack shown below:

beeline> show databases;
Getting log thread is interrupted, since query is done!
Error: Error while processing statement: FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@e871c01 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]) (state=08S01,code=12)
java.sql.SQLException: Error while processing statement: FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@e871c01 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807])
  at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:282)
  at org.apache.hive.beeline.Commands.execute(Commands.java:848)
  at org.apache.hive.beeline.Commands.sql(Commands.java:713)
  at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:983)
  at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:823)
  at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:781)
  at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485)
  at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

HiveServer2 Log:

2017-10-10 14:00:38,985 INFO [HiveServer2-Background-Pool: Thread-273112]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=PostHook.org.apache.atlas.hive.hook.HiveHook from=org.apache.hadoop.hive.ql.Driver>
2017-10-10 14:00:38,986 ERROR [HiveServer2-Background-Pool: Thread-273112]: ql.Driver (SessionState.java:printError(962)) - FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807])
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]
  at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
  at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
  at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
  at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
  at org.apache.atlas.hive.hook.HiveHook.run(HiveHook.java:174)

Root Cause

Users are often led to believe that this issue can be fixed by removing 'org.apache.atlas.hive.hook.HiveHook' from the hive.exec.post.hooks property:

hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook, org.apache.atlas.hive.hook.HiveHook

However, when you are using both Atlas and Hive, 'org.apache.atlas.hive.hook.HiveHook' should not be removed. The error clearly indicates that the issue is caused by an improper thread pool configuration: in this case the maximum thread pool size is 1 while the waiting queue size is 10000. (A small standalone illustration of this failure mode follows the configuration example below.)

Solution

1. In hive-site.xml, verify the value of the property "hive.server2.async.exec.threads". If it is set to 1, increase it to 100.
2. In hive-site.xml, increase the thread pool values for the Atlas hook, for example:

<property>
<name>atlas.hook.hive.maxThreads</name>
<value>5</value>
</property>
<property>
<name>atlas.hook.hive.minThreads</name>
<value>1</value>
</property>
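To see why this configuration matters, here is a small standalone Scala sketch (not taken from the Atlas hook source) that reproduces the same failure mode: a ThreadPoolExecutor with a maximum pool size of 1 and a bounded queue rejects new tasks once the single worker is busy and the queue is full, which is exactly what the RejectedExecutionException above reports.

import java.util.concurrent.{ArrayBlockingQueue, RejectedExecutionException, ThreadPoolExecutor, TimeUnit}

object PoolRejectionDemo extends App {
  // A pool with a single worker thread and a deliberately tiny bounded queue,
  // mirroring the "pool size = 1" situation from the error (with a queue of 2 instead of 10000).
  val pool = new ThreadPoolExecutor(1, 1, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue[Runnable](2))

  val slowTask = new Runnable {
    override def run(): Unit = Thread.sleep(10000) // keeps the single worker occupied
  }

  try {
    // Task 1 occupies the worker, tasks 2 and 3 fill the queue,
    // and task 4 is rejected by the default AbortPolicy.
    (1 to 4).foreach(_ => pool.execute(slowTask))
  } catch {
    case e: RejectedExecutionException =>
      println(s"Rejected, same failure mode as the Hive/Atlas error above: $e")
  } finally {
    pool.shutdownNow()
  }
}

Increasing atlas.hook.hive.maxThreads, as recommended above, widens the pool so that submitted hook tasks are executed instead of being rejected.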
10-18-2017
11:35 PM
Thank you. I had to increase the maxThreads value to get it working.
10-17-2017
05:29 AM
3 Kudos
Try this page for the various wiki articles on setting up the environment and committing your code: https://cwiki.apache.org/confluence/display/AMBARI/Ambari+Development
10-17-2017
02:24 AM
@Turing nix - I have updated my original answer to address your concerns about auditing. Please refer to it, and kindly consider accepting the answer if it helps you. Thank you.
10-16-2017
10:15 PM
6 Kudos
If you have recently upgraded from HDP-2.6.1 to HDP-2.6.2, you may run into this issue. I have seen it on a Kerberized cluster with doAs=true in Hive. Even if you carried over the same configurations for Zeppelin, the interpreter, and Hive from 2.6.1 to 2.6.2, a simple query like 'show databases' may fail with the stack trace shown below:

org.apache.zeppelin.interpreter.InterpreterException: Error in doAs
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:415)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:633)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:733)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:502)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
  at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1884)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:407)
  ... 13 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: java.net.ConnectException: Connection refused
  at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:218)
  at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:156)
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
  at java.sql.DriverManager.getConnection(DriverManager.java:664)
  at java.sql.DriverManager.getConnection(DriverManager.java:208)
  at org.apache.commons.dbcp2.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:79)
  at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205)
  at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
  at org.apache.commons.dbcp2.PoolingDriver.connect(PoolingDriver.java:129)
  at java.sql.DriverManager.getConnection(DriverManager.java:664)
  at java.sql.DriverManager.getConnection(DriverManager.java:270)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:362)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.access$000(JDBCInterpreter.java:89)
  at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:410)
  at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:407)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
  ... 14 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
  at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssuming

Solution

1. Edit the JDBC interpreter to add the following property and value. Go to http://zeppelinhost:port/#/interpreter (replace the zeppelinhost and port details), scroll down to the jdbc interpreter, and click edit:
hive.proxy.user.property = hive.server2.proxy.user
2. Restart the interpreter.
3. Run the notebook again; the issue will be resolved.

Explanation: When the Zeppelin server is running with authentication enabled, the interpreter can use Hive's user proxy feature, i.e. send an extra parameter when creating and running a session ("hive.server2.proxy.user=": "${loggedInUser}"). This is configured by specifying the parameter as noted in Step 1 above.
10-16-2017
09:49 PM
3 Kudos
@Ramya Jayathirtha Verify the Hive service status and try connecting to Hive using beeline to make sure Hive itself is working fine. Then try adding the following parameter and value in your JDBC interpreter: hive.proxy.user.property = hive.server2.proxy.user
10-16-2017
09:40 PM
1 Kudo
Have you checked this known issue? https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_release-notes/content/known_issues.html

Description of Problem: After upgrading from Ambari 2.4.2 to Ambari 2.5.2 and a subsequent HDP stack upgrade from 2.5 to 2.6, the jdbc(hive) interpreter fails to work correctly in Zeppelin.

Error Message: You might see one of the following errors in the Zeppelin stack trace after running jdbc(hive):
Error in doAs
Failed to validate proxy privilege of zeppelin

Workaround:
1. Make sure hadoop.proxyuser.zeppelin.groups=* and hadoop.proxyuser.zeppelin.hosts=* are set in HDFS core-site.xml (see the snippet after these steps). If not, configure these properties and restart all stale services. (AMBARI-21772 is currently tracking this item.)
2. Make sure hive.url is configured correctly in Zeppelin's JDBC hive interpreter. Note: the configured URL might be wrong, especially on secured and/or wire-encrypted clusters, due to a known issue that will be addressed in a future release.
3. Restart HS2.
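For reference, those two proxyuser settings look like this in core-site.xml (the same property format used in the other posts above):

<property>
<name>hadoop.proxyuser.zeppelin.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.zeppelin.hosts</name>
<value>*</value>
</property>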
10-15-2017
10:08 PM
3 Kudos
When the FSImage file is large (30 GB or more), contributing factors such as RPC bandwidth, network congestion, and request queue length can sometimes make the upload/download take a long time. This in turn can lead the ZooKeeper Failover Controller to believe that the NameNode is not responding: it reports the SERVICE_NOT_RESPONDING state and then triggers a failover transition. The logs display the following statements:

2017-09-04 05:02:26,017 INFO namenode.TransferFsImage (TransferFsImage.java:receiveFile(575)) -
Combined time for fsimage download and fsync to all disks took 237.14s.
The fsimage download took 237.14s at 141130.21 KB/s.
Synchronous (fsync) write to disk of /opt/hadoop/hdfs/namenode/image/current/fsimage.ckpt_0000000012106114957
took 0.00s. Synchronous (fsync) write to disk of
/var/hadoop/hdfs/namenode/image/current/fsimage.ckpt_0000000012106114957 took 0.00s.
2017-09-04 05:02:26,018 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) -
Remote journal 192.168.1.1:8485 failed to write txns 12106579989-12106579989.
Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException):
IPC's epoch 778 is less than the last promised epoch 779
2017-09-04 05:02:26,019 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(211)) -
Transport-level exception trying to monitor health of namenode at nn1.test.com/192.168.1.2:8023:
java.io.EOFException End of File Exception between local host is: "nn1.test.com/192.168.1.2";
destination host is: "nn1.test.com/192.168.1.2":8023; : java.io.EOFException;
For more details see: http://wiki.apache.org/hadoop/EOFException
2017-09-04 05:02:26,020 INFO ha.HealthMonitor (HealthMonitor.java:enterState(249)) -
Entering state SERVICE_NOT_RESPONDING
2017-09-04 05:02:26,021 INFO ha.ZKFailoverController (
ZKFailoverController.java:setLastHealthState(852)) -
Local service NameNode at nn1.test.com/192.168.1.2:8023 Entered state: SERVICE_NOT_RESPONDING

If the contributing factors are not addressed and the FSImage file size continues to be high, such failovers can become very frequent (3 or more times a week).
Root Cause

This issue occurs in the following scenarios:
- The FSImage upload/download makes the disk/network too busy, which causes request queues to build up and the NameNode to appear unresponsive.
- In overloaded clusters, the NameNode is too busy to process heartbeats and spuriously marks DataNodes as dead. This scenario also leads to spurious failovers.

Solution

To resolve this issue, do the following:

1. Add image transfer throttling. Throttling uses less bandwidth for image transfers, so although the transfer takes longer, the NameNode remains more responsive throughout. Throttling can be enabled by setting dfs.image.transfer.bandwidthPerSec in hdfs-site.xml; the value is always in bytes per second. The following example limits the transfer bandwidth to 50 MB/s:

<property>
<name>dfs.image.transfer.bandwidthPerSec</name>
<value>50000000</value>
</property>

2. Enable the DataNode lifeline protocol. This will reduce spurious failovers. The lifeline protocol is a feature recently added by the Apache Hadoop community (see Apache HDFS Jira HDFS-9239). It introduces a new lightweight RPC message that the DataNodes use to report their health to the NameNode. It was developed in response to problems seen in some overloaded clusters where the NameNode was too busy to process heartbeats and spuriously marked DataNodes as dead. For a non-HA cluster, the feature can be enabled with the following configuration in hdfs-site.xml:

<property>
<name>dfs.namenode.lifeline.rpc-address</name>
<value>mynamenode.example.com:8050</value>
</property>
(Replace mynamenode.example.com with the hostname or IP address of your NameNode; the port number can also be different.)

For an HA cluster, the lifeline RPC address can be enabled with the following setup, replacing mycluster, nn1, and nn2 appropriately:

<property>
<name>dfs.namenode.lifeline.rpc-address.mycluster.nn1</name>
<value>mynamenode1.example.com:8050</value>
</property>
<property>
<name>dfs.namenode.lifeline.rpc-address.mycluster.nn2</name>
<value>mynamenode2.example.com:8050</value>
</property>
Additional lifeline protocol settings are documented in the HDFS-9239 release note; however, these can be left at their default values for most clusters.

Note: Changing the lifeline protocol settings requires a restart of the NameNodes, DataNodes, and ZooKeeper Failover Controllers to take full effect. If you have a NameNode HA setup, you can restart the NameNodes one at a time, followed by a rolling restart of the remaining components, to avoid cluster downtime.

For some amazing tips on scaling HDFS, refer to this 4-part guide.