Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1118 | 02-12-2020 03:17 PM |
| | 883 | 08-10-2017 09:42 AM |
| | 7140 | 07-28-2017 03:57 AM |
| | 1321 | 07-19-2017 02:43 AM |
| | 1021 | 07-13-2017 11:42 AM |
03-25-2021
03:35 AM
Hi @rajkumar_singh, I'm getting the same issue. Were you able to fix it? May I ask how you resolved it?
09-14-2020
12:16 AM
@RajaChintala You have to make the change below: HADOOP_OPTS="-Dsun.security.krb5.debug=true" For more details, see these docs: https://docs.cloudera.com/documentation/enterprise/5-10-x/topics/cdh_sg_debug_sun_kerberos_enable.html https://spark.apache.org/docs/2.0.0/running-on-yarn.html#troubleshooting-kerberos
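As a rough, hedged sketch of how the flag can be applied for a single client run (the keytab path and principal below are placeholders, not values from this thread):
# enable Kerberos debug output for one client invocation
export HADOOP_OPTS="$HADOOP_OPTS -Dsun.security.krb5.debug=true"
# get a ticket with your own keytab/principal, then run any Hadoop client command;
# the krb5 debug trace is printed to the console
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
hdfs dfs -ls /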
05-08-2020
07:47 PM
The full format of an HDFS URI is: hdfs://NAMESERVICE/path/to/your/file
hdfs://NAMESERVICE/path/to/your/directory or hdfs://ANYNAMENODE:PORT/path/to/your/file
hdfs://ANYNAMENODE:PORT/path/to/your/directory It is OK to omit the nameservice/namenode part (to use the defaultFS). To do that correctly, you need to keep the `hdfs://` prefix and an absolute path that starts with a `/`. That is: hdfs:///path/to/your/file (note that it looks like `hdfs:` followed by three `/`s).
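As a quick illustration (the nameservice, host, and port below are hypothetical), all three forms point at the same path:
# HA nameservice form
hdfs dfs -ls hdfs://mycluster/user/foo/data
# explicit namenode host and RPC port
hdfs dfs -ls hdfs://namenode1.example.com:8020/user/foo/data
# defaultFS form: hdfs: followed by three slashes
hdfs dfs -ls hdfs:///user/foo/data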
02-19-2020
09:20 AM
Writing this so that it can help someone in the future: I was installing Hive and getting an error that the Hive metastore wasn't able to connect, and I resolved it by recreating the Hive metastore database. Somehow the user that had been created in the MySQL Hive metastore wasn't working properly and could not authenticate. So I dropped the metastore DB and dropped the user, then recreated the metastore DB, recreated the user, and granted all privileges; after that it worked without issues.
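For reference, a minimal sketch of that sequence, assuming a MySQL-backed metastore; the database name, user, and password are hypothetical placeholders, not values from my setup.
# drop and recreate the metastore database and its user (MySQL 5.7+ syntax)
mysql -u root -p -e "DROP DATABASE IF EXISTS metastore; CREATE DATABASE metastore;"
mysql -u root -p -e "DROP USER IF EXISTS 'hive'@'%'; CREATE USER 'hive'@'%' IDENTIFIED BY 'StrongPassword1'; GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%'; FLUSH PRIVILEGES;"
# then recreate the metastore schema (run on the metastore host; exact path depends on your install)
schematool -dbType mysql -initSchema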
02-17-2020
08:21 AM
Hi @rajkumar_singh, thanks to your advice I found my hiveserver2Interactive.log file, and the error is in tasks 4-5. I noticed two things: first, unfortunately, before the first NPE in hiveserver2Interactive.log there is another NPE; second, in task 3 we have "Cannot get a table snapshot for factmachine_mv", but I don't think that is the source of the error. hiveserver2Interactive.log: 2020-02-17T15:11:13,525 INFO [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/ numRows 1231370
2020-02-17T15:11:13,525 INFO [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/ rawDataSize 552885130
2020-02-17T15:11:13,525 WARN [HiveServer2-Background-Pool: Thread-5750]: metadata.Hive (Hive.java:alterTable(778)) - Cannot get a table snapshot for factmachine_mv
2020-02-17T15:11:13,609 INFO [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (SessionState.java:printInfo(1227)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130]
2020-02-17T15:11:13,609 INFO [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (BasicStatsTask.java:aggregateStats(271)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130]
2020-02-17T15:11:13,620 INFO [HiveServer2-Background-Pool: Thread-5750]: mapred.FileInputFormat (FileInputFormat.java:listStatus(259)) - Total input files to process : 1
2020-02-17T15:11:14,881 INFO [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (Driver.java:launchTask(2710)) - Starting task [Stage-5:DDL] in serial mode
2020-02-17T15:11:14,902 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (CalcitePlanner.java:genLogicalPlan(385)) - Starting generating logical plan
2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12232)) - Completed phase 1 of Semantic Analysis
2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables
2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries
2020-02-17T15:11:14,905 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables
2020-02-17T15:11:14,935 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries
2020-02-17T15:11:14,936 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables
2020-02-17T15:11:14,936 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables
2020-02-17T15:11:14,944 INFO [HiveServer2-Background-Pool: Thread-5750]: ql.Context (Context.java:getMRScratchDir(551)) - New scratch dir is hdfs://lvdcluster/tmp/hive/hive/c0988fe9-1a5e-4a6a-9ddd-7b782d0cdba1/hive_2020-02-17_15-11-14_897_3869104234201454658-21
2020-02-17T15:11:14,945 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12237)) - Completed getting MetaData in Semantic Analysis
2020-02-17T15:11:14,945 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.BaseSemanticAnalyzer (CalcitePlanner.java:canCBOHandleAst(875)) - Not invoking CBO because the statement has lateral views
2020-02-17T15:11:14,946 ERROR [HiveServer2-Background-Pool: Thread-5750]: exec.TaskRunner (TaskRunner.java:runSequential(108)) - Error in executeTask
java.lang.NullPointerException: null
at org.apache.calcite.plan.RelOptMaterialization.<init>(RelOptMaterialization.java:68) ~[calcite-core-1.16.0.3.1.4.0-315.jar:1.16.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.addMaterializedView(HiveMaterializedViewsRegistry.java:235) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.createMaterializedView(HiveMaterializedViewsRegistry.java:187) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.MaterializedViewTask.execute(MaterializedViewTask.java:59) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2712) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2383) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
2020-02-17T15:11:14,947 INFO [HiveServer2-Background-Pool: Thread-5750]: reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false
2020-02-17T15:11:14,948 ERROR [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (SessionState.java:printError(1250)) - FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.reexec.ReExecutionOverlayPlugin$LocalHook.run(ReExecutionOverlayPlugin.java:45)
at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296)
at org.apache.hadoop.hive.ql.HookRunner.runFailureHooks(HookRunner.java:283)
at org.apache.hadoop.hive.ql.Driver.invokeFailureHooks(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2434)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
) For the moment, my solution is to use a simple view, because I noticed that even though the materialized view doesn't work, a simple view works with this query.
02-13-2020
03:06 AM
Accidentally, I marked this answer as resolved. @rajkumar_singh I am getting the output below after executing the "hdfs groups <username>" command: <username>@<kerberos principal> : domain users dev_sudo As I am not very familiar with the cluster configuration, could you please help me understand the output of this command?
08-02-2017
08:15 AM
7 Kudos
Creating and running temporary functions is discouraged when running queries on LLAP for security reasons: many users share the same LLAP instances, so temporary functions can create conflicts. You can still create temp functions using add jar together with hive.llap.execution.mode=auto; with the exclusive LLAP execution mode (hive.llap.execution.mode=only) you will run into ClassNotFoundException, whereas hive.llap.execution.mode=auto allows part of the query (the map tasks) to run in Tez containers. Here are the steps to create a custom permanent function in LLAP (tested on HDP 2.6.0). 1. Create a jar for the UDF (in this case a simple UDF): git clone https://github.com/rajkrrsingh/SampleCode
mvn clean package 2. Upload target/SampleCode.jar to the node where HiveServer2 Interactive (HSI) is running (in my case I copied it to the /tmp directory). 3. Add the jar to hive_aux_jars (go to Ambari --> Hive --> Configs --> hive-interactive-env template): export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH:/tmp/SampleCode.jar
4. Add the jar to the Auxillary JAR list (go to Ambari --> Hive --> Configs --> Auxillary JAR list): Auxillary JAR list=/tmp/SampleCode.jar 5. Restart LLAP. 6. Create the permanent custom function; connect to HSI using beeline:
create FUNCTION CustomLength as 'com.rajkrrsingh.hiveudf.CustomLength';
describe function CustomLength;
select CustomLength(description) from sample_07 limit 1;
7. Check where SampleCode.jar was localized: [root@hdp26 container_e06_1501140901077_0019_01_000002]# pwd
/hadoop/yarn/local/usercache/hive/appcache/application_1501140901077_0019/container_e06_1501140901077_0019_01_000002
[root@hdp26 container_e06_1501140901077_0019_01_000002]# find . -iname sample*
./app/install/lib/SampleCode.jar
12-25-2016
08:16 AM
Problem Description: we often need to read and write underlying files with a user-defined reader and writer. If the custom reader and writer are written in Java, or in a language that runs on the JVM, we can simply add them to hive_aux_jars or add them with the add jar option at the session level. But if they are written in a native language and shipped as a *.so file, we will get java.lang.UnsatisfiedLinkError. We can work around this problem by adding the library path in hive-env:
1. Open Ambari --> Hive --> Advanced --> Advanced hive-env --> hive-env template. 2. Modify: {% if sqla_db_used or lib_dir_available %}
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:{{jdbc_libs_dir}}"
export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:{{jdbc_libs_dir}}"
{% endif %}
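As a hedged sketch of the same idea (the directory below is a placeholder for wherever your *.so files live), the native library path can be appended in the same hive-env template:
# append the directory that holds the custom native (*.so) libraries
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/custom-serde/native"
export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/local/custom-serde/native"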
12-24-2016
06:43 PM
SYMPTOM:
The oozie-hive job is running very slowly; sometimes jobs get stuck in the final stage and never complete. ROOT CAUSE: Oozie prepares the hive-site.xml for the hive action, and it has the mapred parameter mapreduce.job.reduces set to 1 by default; the reason is that Oozie prepares the action configuration after reading core-site.xml, hdfs-site.xml, mapred-site.xml, etc. With mapreduce.job.reduces=1 the job runs with a single reducer and therefore takes a long time to complete. WORKAROUND: set mapreduce.job.reduces to -1 (see the sketch below). RESOLUTION: there is an Oozie fix, https://issues.apache.org/jira/browse/OOZIE-2205, which enhances the actionConf passed to the hive action.
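A minimal hedged sketch of the workaround, assuming the hive action runs an .hql script you control (the file name is a placeholder): put the override at the top of that script so Hive computes the reducer count itself.
# force Hive to recompute the number of reducers instead of using the
# mapreduce.job.reduces=1 inherited from the Oozie-prepared configuration
cat > my_hive_action.hql <<'HQL'
SET mapreduce.job.reduces=-1;
-- ...the action's original Hive statements follow here...
HQL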
12-24-2016
05:08 PM
SYMPTOM: the Hive CLI hangs for a long time; an impatient user hit CTRL+C to exit it and complained about the Hive CLI slowness. ROOT CAUSE: the user is running the Hive CLI on a Kerberos-enabled cluster. We asked them to enable debug logging at the console using hive --hiveconf hive.root.logger=debug,console and saw the following GSS exception caused by ticket expiration. WARN hive.metastore: Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:426)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1531)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3000)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3019)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1237)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:484)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
During Hive CLI startup it tries to connect to the metastore, but it does not have a valid TGT and therefore fails with the GSS exception. WORKAROUND: NA RESOLUTION: to make it fail fast we can use the following properties in the Hive configuration: hive.metastore.connect.retries - the number of times the client will try to connect to the metastore (24 by default); hive.metastore.client.connect.retry.delay - the delay after a failure (5s by default).
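A hedged example of applying those properties for a single CLI run; the values are illustrative, not recommendations from the original article.
# fail fast after 3 attempts, 1s apart, with debug logging on the console
hive --hiveconf hive.metastore.connect.retries=3 \
     --hiveconf hive.metastore.client.connect.retry.delay=1s \
     --hiveconf hive.root.logger=DEBUG,console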
12-24-2016
05:33 PM
You may try VBoxManage internalcommands repairhd --format OVA --filename <image> Use the "--dry-run" option first to check what the tool would do. Once the image is repaired, try to import it again.
01-03-2017
09:41 PM
Alicia, please see my answer above from Oct 24. If you are running Spark on YARN, you have to go through the YARN RM UI to get to the Spark UI for a running job; the link for the YARN UI is available from the Ambari YARN service. For a completed job, you need to go through the Spark History Server; its link is available from the Ambari Spark service.
12-24-2016
09:11 AM
1 Kudo
While investigating a performance issue with topology assignments, I worked out these high-level steps that Storm uses for topology assignment. 1. For backward compatibility with old topologies, ClientJarTransformerRunner starts and invokes StormShadeTransformer, which writes the transformed jar to /tmp/<some_random_string>.jar 2. StormSubmitter starts uploading the topology jar to the Nimbus inbox using NimbusClient: o.a.s.StormSubmitter - Uploading topology jar /tmp/27ed633ac9aa11e6a850fa163e19dd06.jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
Start uploading file '/tmp/27ed633ac9aa11e6a850fa163e19dd06.jar' to '/hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar'
o.a.s.StormSubmitter - Successfully uploaded topology jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
3. nimbus client submit topology to Nimbus using thrift call o.a.s.StormSubmitter - Submitting topology wordcount in distributed mode with conf {"storm.zookeeper.topology.auth.scheme":"digest","storm.zookeeper.topology.auth.payload":"-5184467572710101881:-6542959882697852797","topology.workers":3,"topology.debug":true}
o.a.s.StormSubmitter - Finished submitting topology: wordcount
4. nimbus received topology submission from zookeeper o.a.s.d.nimbus [INFO] Received topology submission for wordcount with conf {"topology.max.task.parallelism" nil, "topology.submitter.principal" "", "topology.acker.executors" nil, "topology.eventlogger.executors" 0, "topology.workers" 3, "topology.debug" true, "storm.zookeeper.superACL" nil, "topology.users" (), "topology.submitter.user" "storm", "topology.kryo.register" nil, "topology.kryo.decorators" (), "storm.id" "wordcount-1-1482564367", "topology.name" "wordcount"}
5.nimbus create assignments in zookeeper and set a watch 2016-12-24 07:26:08.696 o.a.s.d.nimbus [INFO] Setting new assignment for topology id wordcount-1-1482564367: #org.apache.storm.daemon.common.Assignment{:master-code-dir "/hadoop/storm", :node->host {"3cb18e51-aa66-424c-8165-e9101ab134bb" "rkk3.hdp.local"}, :executor->node+port {[8 8] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [12 12] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [2 2] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [7 7] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [22 22] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [3 3] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [24 24] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [1 1] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [18 18] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [6 6] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [28 28] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [20 20] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [9 9] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [23 23] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [11 11] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [16 16] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [13 13] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [19 19] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [21 21] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [5 5] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [27 27] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [29 29] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [26 26] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [10 10] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [14 14] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [4 4] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [15 15] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [25 25] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [17 17] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700]}, :executor->start-time-secs {[8 8] 1482564368, [12 12] 1482564368, [2 2] 1482564368, [7 7] 1482564368, [22 22] 1482564368, [3 3] 1482564368, [24 24] 1482564368, [1 1] 1482564368, [18 18] 1482564368, [6 6] 1482564368, [28 28] 1482564368, [20 20] 1482564368, [9 9] 1482564368, [23 23] 1482564368, [11 11] 1482564368, [16 16] 1482564368, [13 13] 1482564368, [19 19] 1482564368, [21 21] 1482564368, [5 5] 1482564368, [27 27] 1482564368, [29 29] 1482564368, [26 26] 1482564368, [10 10] 1482564368, [14 14] 1482564368, [4 4] 1482564368, [15 15] 1482564368, [25 25] 1482564368, [17 17] 1482564368}, :worker->resources {["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700] [0.0 0.0 0.0], ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701] [0.0 0.0 0.0]}} 6. supervisors got watchevent and read from the assignments 2016-12-24 07:26:09.577 o.a.s.d.supervisor [DEBUG] All assignment: {6701 {:storm-id "wordcount-1-1482564367", :executors ([8 8] [12 12] [2 2] [22 22] [24 24] [18 18] [6 6] [28 28] [20 20] [16 16] [26 26] [10 10] [14 14] [4 4]), :resources [0.0 0.0 0.0]}, 6700 {:storm-id "wordcount-1-1482564367", :executors ([7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]), :resources [0.0 0.0 0.0]}} 7. supervisors start downloading the topology jar after download it start launching workers
2016-12-24 07:26:12.728 o.a.s.d.supervisor [INFO] Launching worker with assignment {:storm-id "wordcount-1-1482564367", :executors [[7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]], :resources #object[org.apache.storm.generated.WorkerResources 0x28e35c1e "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this supervisor 3cb18e51-aa66-424c-8165-e9101ab134bb on port 6700 with id ac690504-6b52-4c88-a5bd-50fa78992368
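For context, a hedged example of the submission command that produces the log lines above; the jar path, main class, and topology name are placeholders.
# submit a topology; Storm then walks through the assignment steps described above
storm jar /tmp/wordcount-topology.jar org.example.WordCountTopology wordcount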
12-24-2016
07:06 AM
SYMPTOM: the hiveserver2 logs are filled with the following exceptions: 2016-12-22 16:36:49,643 WARN ipc.Client (Client.java:run(685)) - Exception encountered while connecting to the server :
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:727)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
at org.apache.hadoop.ipc.Client.call(Client.java:1402)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:226)
at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
ROOT CAUSE: hiveserver2 is configured with the Ranger plugin, which writes audit events to both the database and HDFS. The hiveserver2 thread hiveServer2.async.multi_dest.batch_hiveServer2.async.multi_dest.batch.hdfs_destWriter is trying to write audit events to HDFS but fails because the TGT has expired. WORKAROUND: disable writing audit events to HDFS. RESOLUTION: this has been fixed in https://issues.apache.org/jira/browse/RANGER-1136, so apply that patch to avoid the issue.
12-23-2016
06:56 PM
Kafka Producer (Python) yum install -y python-pip
pip install kafka-python
//kafka producer sample code
vim kafka_producer.py
from kafka import KafkaProducer
from kafka.errors import KafkaError
producer = KafkaProducer(bootstrap_servers=['rkk1.hdp.local:6667'])
topic = "kafkatopic"
producer.send(topic, b'test message')
producer.flush()  # send() is asynchronous; flush before the script exits so the message is actually delivered
//run it
python kafka_producer.py
//test it
[root@rkk1 ~]# /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper `hostname`:2181 --topic kafkatopic
{metadata.broker.list=rkk1.hdp.local:6667,rkk2.hdp.local:6667,rkk3.hdp.local:6667, request.timeout.ms=30000, client.id=console-consumer-41051, security.protocol=PLAINTEXT}
test message
Kafka Producer (Scala)
mkdir kafkaproducerscala
cd kafkaproducerscala/
mkdir -p src/main/scala
cd src/main/scala
vim KafkaProducerScala.scala
object KafkaProducerScala extends App {
import java.util.Properties
import org.apache.kafka.clients.producer._
val props = new Properties()
props.put("bootstrap.servers", "rkk1:6667")
props.put("acks","1")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
val producer = new KafkaProducer[String, String](props)
val topic="kafkatopic"
for(i<- 1 to 50) {
val record = new ProducerRecord(topic, "key"+i, "value"+i)
producer.send(record)
}
producer.close()
}
cd -
vim build.sbt
val kafkaVersion = "0.9.0.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.kafka" % "kafka-clients" % kafkaVersion
resolvers += Resolver.mavenLocal
sbt package
sbt run
12-23-2016
06:37 PM
SYMPTOM: the Hive metastore crashes with an OutOfMemoryError during ACID compactions. ERROR [Thread-13]: compactor.Cleaner (Cleaner.java:run(140)) - Caught an exception in the main loop of compactor cleaner, java.lang.OutOfMemoryError: Java heap space
ROOT CAUSE: we enabled heap dumps on OutOfMemoryError; after analyzing the heap dump we found a large number of FileSystem$Cache$Key and FileSystem objects, which were causing a memory leak. WORKAROUND: set fs.hdfs.impl.disable.cache=true and fs.file.impl.disable.cache=true RESOLUTION: this has been fixed in https://issues.apache.org/jira/browse/HIVE-13151, so apply that patch to avoid the issue.
09-11-2018
08:02 PM
@Rajkumar Singh Can we override these values in the PublishKafka processor instead of making the change at the broker/producer level? If not, then why does the "PublishKafka_0_10 1.5.0.3.1.0.0-564" processor in NiFi have a "Max Request Size" field that we are allowed to modify? The default value of this field is 1 MB. Thanks!
12-23-2016
06:18 AM
3 Kudos
We often need to read a Parquet file, its metadata, or its footer; parquet-tools, which ships with the parquet-hadoop library, can help us do that. These are simple steps to build parquet-tools and demonstrate its use. Prerequisites: Maven 3, git, JDK 7/8.
// Build parquet-tools
git clone https://github.com/Parquet/parquet-mr.git
cd parquet-mr/parquet-tools/
mvn clean package -Plocal
// Show the schema of a parquet file
java -jar parquet-tools-1.6.0.jar schema sample.parquet
// Read a parquet file
java -jar parquet-tools-1.6.0.jar cat sample.parquet
// Read the first few lines of a parquet file
java -jar parquet-tools-1.6.0.jar head -n5 sample.parquet
// Show the metadata of a parquet file
java -jar parquet-tools-1.6.0.jar meta sample.parquet
12-23-2016
05:49 AM
SYMPTOM: user complains that he is seeing some special characters (^@^@^@^@^@^@^) in GC logs similar to this concurrent-mark-sweep perm gen total 29888K, used 29792K [0x00000007e0000000, 0x00000007e1d30000, 0x0000000800000000)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2016-12-10T00:09:31.648-0500: 912.497: [GC2016-12-10T00:09:31.648-0500: 912.497: [ParNew: 574321K->15632K(629120K), 0.0088960 secs] 602402K->43713K(2027264K), 0.0090030 secs]
ROOT CAUSE: a query was running on hiveserver2 in MR mode and spun up a local map join task; the forked child process inherited all of the HiveServer2 Java arguments and was therefore writing to the same HiveServer2 GC log file. This introduced the special characters into the GC log, and some GC events were also skipped while writing to the file. WORKAROUND: NA RESOLUTION: run the query in Tez mode, which forces the map join to run inside a task container, or set hive.exec.submit.local.task.via.child=false so that no child process is forked to run the local map task; the latter can be risky, because if the map join goes OOM it can stall the service. A hedged example follows below.
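A hedged sketch of applying the resolution for a single session; the JDBC URL is a placeholder and the property values are taken from the text above.
# run the statement under Tez so the map join stays inside a task container
beeline -u "jdbc:hive2://hs2-host.example.com:10000" \
        --hiveconf hive.execution.engine=tez \
        --hiveconf hive.exec.submit.local.task.via.child=false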
12-22-2016
04:42 PM
1 Kudo
SYMPTOM: The following JmxTool command fails with a "port already in use" error: /usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec --jmx-url service:jmx:rmi:///jndi/rmi://`hostname`:1099/jmxrmi ROOT CAUSE: JmxTool is invoked through kafka-run-class.sh after kafka-env.sh has been read; since kafka-env.sh already has export JMX_PORT=1099 (the broker's port), the tool tries to bind the same port on the same host, which results in the error. WORKAROUND: NA RESOLUTION: edit kafka-run-class.sh and replace the section
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
fi
with the following, which gives client tools their own JMX port instead of the broker's:
# JMX port to use
if [ $ISKAFKASERVER = "true" ]; then
JMX_REMOTE_PORT=$JMX_PORT
else
JMX_REMOTE_PORT=$CLIENT_JMX_PORT
fi
if [ $JMX_REMOTE_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_REMOTE_PORT"
fi
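A hedged usage sketch once the edited script is in place: export CLIENT_JMX_PORT (the variable introduced in the resolution above) to any free port before running the tool; the port value here is an arbitrary placeholder.
export CLIENT_JMX_PORT=9998
/usr/hdp/current/kafka-broker/bin/kafka-run-class.sh kafka.tools.JmxTool \
  --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec \
  --jmx-url service:jmx:rmi:///jndi/rmi://`hostname`:1099/jmxrmi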
02-16-2017
07:42 PM
Does setting hive.mv.file.thread=0 reduce the performance of the insert query? Can you explain what setting this configuration has to do with the HDP 2.5 upgrade? Was move-task parallelism not present in the previous version?
12-22-2016
12:51 PM
1 Kudo
SYMPTOM: there are lots of heap dump files created under the directory controlled by the Java flag -XX:HeapDumpPath; the user complains that HiveServer2 generated these dumps, since he had set this flag up for HiveServer2. Looking at the HiveServer2 logs, there was no trace of services starting or stopping, and no sign of any kind of failure. ROOT CAUSE: in a very preliminary analysis of the heap dump I examined the process arguments, which look like this:
hive 12345 1234 9.8 27937992 26020536 ? Sl Dec01 19887:12 \_ /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx24576m -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native:/usr/hdp/2.4.2.-258/hadoop/lib/native/Linux-amd... hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.4.2.-258/hive/lib/hive-common-1.2.1.2.3.4.74-1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan file:/tmp/hive/261a57a5-caab-4f98-9fa2-f50209ba29e9/*****/-local-10006/plan.xml -jobconffile file:/tmp/hive/K*****/-local-10007/jobconf.xml
The Java process arguments suggest that this is not HiveServer2; it looks like a map-side join spun up by CliDriver. We then looked at the hive-env file, which seems messed up, especially HADOOP_CLIENT_OPTS:
if [ "$SERVICE" = "cli" ]; then
if [ -z "$DEBUG" ]; then
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
else
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
fi
fi
# The heap size of the jvm stared by hive shell script can be controlled via:
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
So after looking it over, we are sure that it was the Hive CLI process that spun up the map-side join and went OOM, not HiveServer2. WORKAROUND: NA RESOLUTION: ask the user to modify hive-env and set HADOOP_CLIENT_OPTS appropriately (a hedged sketch follows below).
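One hedged way to make the owner of future dumps obvious (the directory name is a hypothetical choice, not from the original case) is to give CLI-spawned local-task JVMs their own dump location in HADOOP_CLIENT_OPTS:
# hive-env fragment: separate dump directory for CLI/local-task JVMs
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive/cli-heap-dumps"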
08-27-2018
02:00 PM
We got the same issue multiple times on our HDP 2.3 cluster.
12-22-2016
06:32 AM
2 Kudos
@Yukti Agrawal
Access the RM UI, click on the applicationId, and then on the corresponding logs link. Among the logs you will see a file dag_*.dot; this should give you the query/MR graph that is being executed. Another option: if the execution engine is Tez, you can also leverage the Tez UI to view the actual query.
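As a hedged command-line alternative (the application id below is a placeholder), the same aggregated logs can be pulled and searched for the dag file:
# fetch the aggregated YARN logs for the application and look for the dag_*.dot entry
yarn logs -applicationId application_1482564367_0001 | grep -n "dag_"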
12-21-2016
05:58 PM
1 Kudo
SYMPTOM:
The Hive metastore process is going down frequently with the following exceptions: 2016-12-16 01:30:45,016 ERROR [Thread[Thread-5,5,main]]: thrift.TokenStoreDelegationTokenSecretManager (TokenStoreDelegationTokenSecretManager.java:run(331)) - ExpiredTokenRemover thread received unexpected exception. org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:131)
at org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:76)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.removeExpiredTokens(TokenStoreDelegationTokenSecretManager.java:256)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager$ExpiredTokenRemover.run(TokenStoreDelegationTokenSecretManager.java:319)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started
at org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47)
at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131)
at org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88)
at org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80)
at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:392)
at org.apache.hadoop.hive.metastore.ObjectStore.getToken(ObjectStore.java:6412)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
at com.sun.proxy.$Proxy0.getToken(Unknown Source)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:123)
... 4 more
2016-12-16 01:30:45,017 INFO [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5643)) - Shutting down hive metastore.
ROOT CAUSE: when the Hive metastore starts, the DelegationTokenSecretManager maintains the same ObjectStore instance, which leads to this concurrency issue. WORKAROUND: NA RESOLUTION: this concurrency issue has been fixed in HIVE-11616 and HIVE-13090, so apply those patches.
12-21-2016
05:43 PM
1 Kudo
SYMPTOM: HiveServer2 is hung and is not able to execute even a simple query like show tables. During the investigation we took some jstacks and realized that the following thread was progressing very slowly. "HiveServer2-Handler-Pool: Thread-86129" #86129 prio=5 os_prio=0 tid=0x00007f3ad9e1a800 nid=0x1003b runnable [0x00007f3a73b0a000]
java.lang.Thread.State: RUNNABLE
at java.util.HashMap$TreeNode.find(HashMap.java:1851)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.find(HashMap.java:1861)
at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
at java.util.HashMap.putVal(HashMap.java:637)
at java.util.HashMap.put(HashMap.java:611)
at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:290)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:746)
at org.apache.hadoop.hive.ql.ppd.OpProcFactory$JoinerPPD.process(OpProcFactory.java:464)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10167)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:406)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:290)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
- locked <0x00000005c1e1bc18> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy45.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: before Hive 2, the compilation stage in hiveserver2 is single-threaded, so only one query can compile at a time and other queries remain in a wait state. We observed that during compilation this thread was executing the expression factory for predicate pushdown processing, in which each processor determines whether an expression is a candidate for predicate pushdown optimization for the given operator. The user was running a huge query at the time, consisting of more than 300 'case ... then ...' conditions, which was taking too long. WORKAROUND: ask the customer to set hive.optimize.ppd=false at the session level while running this query, and ask them to rewrite the SQL in a more optimized way. RESOLUTION: set hive.optimize.ppd=false at the session level (a hedged example follows below).
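A minimal hedged sketch of the session-level workaround; the JDBC URL is a placeholder and the query itself is elided.
# disable predicate pushdown only for the problematic session
beeline -u "jdbc:hive2://hs2-host.example.com:10000" -e "
set hive.optimize.ppd=false;
-- ...the original 300+ branch CASE query runs here...
"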
12-21-2016
05:24 PM
1 Kudo
SYMPTOM: hiveserver2 is very slow to respond even to simple queries, taking a long time to complete or appearing unresponsive. We took some incremental jstacks and found one thread making a metastore call and progressing very slowly. Thread 24233: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.socketRead(java.io.FileDescriptor, byte[], int, int, int) @bci=8, line=116 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=170 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=141 (Compiled frame)
- oracle.net.ns.Packet.receive() @bci=157, line=311 (Compiled frame)
- oracle.net.ns.DataPacket.receive() @bci=1, line=105 (Compiled frame)
- oracle.net.ns.NetInputStream.getNextPacket() @bci=48, line=305 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[], int, int) @bci=33, line=249 (Compiled frame)
- oracle.net.ns.NetInputStream.read(byte[]) @bci=5, line=171 (Compiled frame)
- oracle.net.ns.NetInputStream.read() @bci=5, line=89 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket() @bci=11, line=123 (Compiled frame)
- oracle.jdbc.driver.T4CSocketInputStreamWrapper.read() @bci=55, line=84 (Compiled frame)
- oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1() @bci=6, line=429 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.receive() @bci=16, line=397 (Compiled frame)
- oracle.jdbc.driver.T4CTTIfun.doRPC() @bci=116, line=257 (Compiled frame)
- oracle.jdbc.driver.T4C8Oall.doOALL(boolean, boolean, boolean, boolean, boolean, oracle.jdbc.internal.OracleStatement$SqlKind, int, byte[], int, oracle.jdbc.driver.Accessor[], int, oracle.jdbc.driver. Accessor[], int, byte[], char[], short[], int, oracle.jdbc.driver.DBConversion, byte[], java.io.InputStream[][], byte[][][], oracle.jdbc.oracore.OracleTypeADT[][], oracle.jdbc.driver.OracleStatement, byte[], char[], short[], oracle.jdbc.driver.T4CTTIoac[], int[], int[], int[], oracle.jdbc.driver.NTFDCNRegistration, oracle.jdbc.driver.ByteArray, long[], int[], boolean) @bci=903, line=587 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean, int) @bci=780, line=225 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.doOall8(boolean, boolean, boolean, boolean, boolean) @bci=23, line=53 (Compiled frame)
- oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe() @bci=37, line=774 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.executeMaybeDescribe() @bci=106, line=925 (Compiled frame)
- oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout() @bci=250, line=1111 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeInternal() @bci=145, line=4798 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatement.executeQuery() @bci=18, line=4845 (Compiled frame)
- oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery() @bci=4, line=1501 (Compiled frame)
- com.jolbox.bonecp.PreparedStatementHandle.executeQuery() @bci=68, line=174 (Compiled frame)
- org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery() @bci=4, line=375 (Compiled frame)
- org.datanucleus.store.rdbms.SQLController.executeStatementQuery(org.datanucleus.ExecutionContext, org.datanucleus.store.connection.ManagedConnection, java.lang.String, java.sql.PreparedStatement) @ bci=120, line=552 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.JoinListStore.listIterator(org.datanucleus.state.ObjectProvider, int, int) @bci=329, line=770 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.listIterator(org.datanucleus.state.ObjectProvider) @bci=4, line=93 (Compiled frame)
- org.datanucleus.store.rdbms.scostore.AbstractListStore.iterator(org.datanucleus.state.ObjectProvider) @bci=2, line=83 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.loadFromStore() @bci=77, line=264 (Compiled frame)
- org.datanucleus.store.types.wrappers.backed.List.iterator() @bci=8, line=492 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToFieldSchemas(java.util.List) @bci=21, line=1199 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor, boolean) @bci=39, line=1266 (Compiled frame)
- org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(org.apache.hadoop.hive.metastore.model.MStorageDescriptor) @bci=3, line=1281 (Compiled frame)
ROOT CAUSE: after further investigation, we found that the metastore DB was running outside the cluster and was shared by other applications. When there was load on the metastore database, it took too long to respond to queries. Since this happens in the compilation stage of the query, which is single-threaded, any other incoming query that lands on HiveServer2 must wait until this query has finished compiling. WORKAROUND: NA RESOLUTION: run the metastore database inside the cluster and don't share it with other applications if you run a heavy workload on HiveServer2, and check the network between the HiveServer2 node and the MySQL node for any bottleneck.
12-21-2016
05:07 PM
1 Kudo
SYMPTOM: the beeline connection to hiveserver2 fails intermittently with the following exception: "The initCause method cannot be used. To set the cause of this exception, use a constructor with a Throwable[] argument." Looking further into the HiveServer2 logs, we saw a stack trace like this: java.sql.SQLException: Could not retrieve transation read-only status server
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2259)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getProductName(MetaStoreDirectSql.java:171)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.determineDbType(MetaStoreDirectSql.java:152)
at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:122)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:300)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:263)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy7.setConf(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.setConf(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:56)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5798)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 35 more
Caused by: java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1086)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3972)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3943)
at com.jolbox.bonecp.ConnectionHandle.isReadOnly(ConnectionHandle.java:867)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:422)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getNucleusConnection(RDBMSStoreManager.java:1382)
at org.datanucleus.api.jdo.JDOPersistenceManager.getDataStoreConnection(JDOPersistenceManager.java:2245)
... 51 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 381,109 milliseconds ago. The last packet sent successfully to the server was
381,109 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3988)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2598)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2828)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2777)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1651)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3966)
... 56 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3969)
ROOT CAUSE: After analyzing a TCP dump, we found that the MySQL server was dropping network packets very frequently.
WORKAROUND: NA
RESOLUTION: Check and fix the network issue between the HiveServer2 host and the MySQL host.
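A repeated TCP connect probe from the HiveServer2 host can help separate a flaky network path from a MySQL server that is resetting connections. This is a minimal sketch; the host, port, attempt count, and interval are placeholders for your environment.
import java.net.InetSocketAddress;
import java.net.Socket;

public class MySqlConnectivityProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder MySQL host and port.
        String host = "mysql-host.example.com";
        int port = 3306;
        for (int i = 0; i < 100; i++) {
            long start = System.nanoTime();
            try (Socket socket = new Socket()) {
                // 5 second connect timeout; failures and resets show up as exceptions.
                socket.connect(new InetSocketAddress(host, port), 5000);
                long ms = (System.nanoTime() - start) / 1_000_000;
                System.out.println("attempt " + i + ": connected in " + ms + " ms");
            } catch (Exception e) {
                System.out.println("attempt " + i + ": FAILED - " + e);
            }
            Thread.sleep(1000);
        }
    }
}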
12-21-2016
04:17 PM
2 Kudos
SYMPTOM: HiveServer2 remains in a hung state; jstack reveals the following trace:
"HiveServer2-Handler-Pool: Thread-139105" prio=10 tid=0x00007ff34e080800 nid=0x3d43e in Object.wait() [0x00007ff30974e000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1417)
- locked <0x00000003e1c5f298> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy23.checkAccess(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.checkAccess(ClientNamenodeProtocolTranslatorPB.java:1469)
at sun.reflect.GeneratedMethodAccessor93.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy24.checkAccess(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.checkAccess(DFSClient.java:3472)
at org.apache.hadoop.hdfs.DistributedFileSystem$53.doCall(DistributedFileSystem.java:2270)
at org.apache.hadoop.hdfs.DistributedFileSystem$53.doCall(DistributedFileSystem.java:2267)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.access(DistributedFileSystem.java:2267)
at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.shims.Hadoop23Shims.checkFileAccess(Hadoop23Shims.java:1006)
at org.apache.hadoop.hive.common.FileUtils.checkFileAccessWithImpersonation(FileUtils.java:378)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:417)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:752)
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:252)
at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:837)
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:628)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:504)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
- locked <0x0000000433e8e4b8> (a java.lang.Object)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:400)
at sun.reflect.GeneratedMethodAccessor148.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy37.executeStatement(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:263)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: hive.security.authorization.enabled is set to true and the user is running a CREATE EXTERNAL TABLE command pointing at a non-existent directory, so the authorizer checks the parent directory and all files/directories inside it recursively. The issue is reported in https://issues.apache.org/jira/browse/HIVE-10022.
WORKAROUND: Restart HiveServer2 and create the table in a directory that has only a few files under it.
RESOLUTION: The fix is available as HOTFIX-332; if you are using Ranger-based authorization, also get the fix for RANGER-1126.
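On an affected version, it can be worth checking how many files and directories the authorizer would have to walk before pointing an external table at an existing location. The sketch below is illustrative only: the HDFS path is a placeholder, and the program expects the cluster's Hadoop configuration and client jars on the classpath (for example, run it via the hadoop jar command).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class HdfsFileCount {
    public static void main(String[] args) throws Exception {
        // Placeholder path: the intended LOCATION of the external table.
        Path dir = new Path("/data/external/my_table");
        FileSystem fs = FileSystem.get(new Configuration());
        // Recursive listing approximates the number of objects the authorizer must check.
        RemoteIterator<LocatedFileStatus> it = fs.listFiles(dir, true);
        long count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println("files under " + dir + ": " + count);
    }
}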
12-21-2016
02:21 PM
3 Kudos
ENV: HDP-2.5, Java: openjdk version "1.8.0_111"
The following Storm topology consists of a KafkaSpout and a SinkTypeBolt (a minimal wiring sketch is included after the build steps below).
Step 1: Create pom.xml with the following dependencies:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>hadoop</groupId>
<artifactId>KafkaSpoutStorm</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>stormkafka</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<repositories>
<repository>
<id>HDPReleases</id>
<name>HDP Releases</name>
<url>http://repo.hortonworks.com/content/repositories/public</url>
<layout>default</layout>
</repository>
<repository>
<id>HDPJetty</id>
<name>Hadoop Jetty</name>
<url>http://repo.hortonworks.com/content/repositories/jetty-hadoop/</url>
<layout>default</layout>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-core</artifactId>
<version>1.0.1.2.5.3.0-37</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.10</artifactId>
<version>0.10.0.2.5.3.0-37</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
<version>1.0.1.2.5.3.0-37</version>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-hdfs</artifactId>
<version>1.0.1.2.5.3.0-37</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>1.4</version>
<configuration>
<createDependencyReducedPom>true</createDependencyReducedPom>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.rajkrrsingh.storm.Topology</mainClass>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>2.4</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
<resources>
<resource>
<directory>src/main/java</directory>
<includes>
<include>**/*.properties</include>
</includes>
</resource>
</resources>
</build>
</project>
Step 2: Clone the git repo to get the complete code:
git clone https://github.com/rajkrrsingh/KafkaSpoutStorm.git
Step 3: Modify default_config.properties according to your cluster.
Step 4: Build with Maven; this creates a fat jar in the target folder:
mvn clean package
Step 5: Run it on the Storm cluster:
storm jar KafkaSpoutStorm-0.0.1-SNAPSHOT.jar com.rajkrrsingh.storm.Topology
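For reference, below is a minimal sketch of how a KafkaSpout-plus-bolt topology of this shape is typically wired with the storm-kafka API. The class names, ZooKeeper quorum, and topic are illustrative placeholders; the actual Topology and SinkTypeBolt implementations are in the repository linked in Step 2.
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.spout.SchemeAsMultiScheme;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

public class ExampleKafkaTopology {

    // Stand-in for the repo's SinkTypeBolt: simply logs each Kafka message.
    public static class PrintBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            System.out.println("received: " + input.getString(0));
        }
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: no output stream declared.
        }
    }

    public static void main(String[] args) throws Exception {
        // ZooKeeper quorum, topic, and ids are placeholders; the real project
        // reads them from default_config.properties.
        ZkHosts zkHosts = new ZkHosts("zk1.example.com:2181");
        SpoutConfig spoutConfig = new SpoutConfig(zkHosts, "test-topic", "/kafka-spout", "kafka-spout-id");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
        builder.setBolt("sink-bolt", new PrintBolt(), 1).shuffleGrouping("kafka-spout");

        Config conf = new Config();
        conf.setNumWorkers(1);
        StormSubmitter.submitTopology("kafka-spout-example", conf, builder.createTopology());
    }
}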