Created 05-23-2023 02:48 PM
Hi, I am running CDH 7.1.7p0. I have an external ORC database on an S3 bucket. I did not configure fs.s3a.server-side-encryption-algorithm in the Hadoop core-site.xml. However, I randomly get this error when running Hive queries. Rerunning the same query may succeed or fail with the same error message.
ERROR : Vertex failed, vertexName=Map 14, vertexId=vertex_1684599244994_0012_7_02, diagnostics=[Vertex vertex_1684599244994_0012_7_02 [Map 14] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ss1 initializer failed, vertex=vertex_1684599244994_0012_7_02 [Map 14], java.lang.RuntimeException: ORC split generation failed with exception: java.lang.RuntimeException: java.io.IOException: Cannot find password option fs.s3a.bucket.sg-cdp.fs.s3a.server-side-encryption-algorithm
The database and query were part of the Hive benchmark tool from GitHub:
https://github.com/hortonworks/hive-testbench
Can you help me resolve this problem? Thank you.
Here are the s3a settings in core-site.xml; as you can see, there is no encryption-related setting.
<name>fs.s3a.endpoint</name>
<name>fs.s3a.access.key</name>
<name>fs.s3a.secret.key</name>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<name>fs.s3a.path.style.access</name>
<name>fs.s3a.connection.ssl.enabled</name>
<name>fs.s3a.ssl.channel.mode</name>
<name>fs.s3a.threads.max</name>
<name>fs.s3a.connection.maximum</name>
<name>fs.s3a.retry.limit</name>
<name>fs.s3a.retry.interval</name>
<name>fs.s3a.committer.name</name>
<name>fs.s3a.committer.magic.enabled</name>
<name>fs.s3a.committer.threads</name>
<name>fs.s3a.committer.staging.unique-filenames</name>
<name>fs.s3a.committer.staging.conflict-mode</name>
Created on 06-02-2023 06:06 AM - edited 06-02-2023 06:07 AM
Sorry about the typo in the article.
It should say "uncheck", not "uncomment". So please uncheck "Generate HADOOP_CREDSTORE_PASSWORD" in Hive and Hive on Tez, save, and restart Hive and Hive on Tez.
Created 05-31-2023 01:43 PM
I am sorry for the delay. This is the exception I was looking for.
Specifically, we can see that the failure was due to Hive trying to pick up the password from its own jceks rather than using the plain-text password.
It is still unclear why it falls back to jceks instead of using the password. I will check the existing known issues. For the time being, can you try configuring the password through jceks as indicated in this article: https://my.cloudera.com/knowledge/How-to-configure-HDFS-and-Hive-to-use-different-JCEKS-and?id=32605...
You can ignore steps 9 to 11 from the article.
sg-cdp is the bucket name, so instead of fs.s3a.bucket.scc-803070-bucket-1.security.credential.provider.path, please use the property fs.s3a.bucket.sg-cdp.security.credential.provider.path.
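As a rough illustration of the property name above, the per-bucket entry would look something like the fragment below. This is only a sketch: the jceks location shown is a made-up placeholder, not a path from this thread or from the article, so substitute the credential store you actually created when following the article's steps.

```xml
<!-- Sketch only: the value is a hypothetical placeholder path, replace it with your own jceks file -->
<property>
  <name>fs.s3a.bucket.sg-cdp.security.credential.provider.path</name>
  <value>jceks://hdfs/user/hive/s3-sg-cdp.jceks</value>
</property>
```

The only part that comes from this thread is the property name with the sg-cdp bucket in it; where the property is set should follow the article.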
Created 06-01-2023 04:42 PM
Thank you for getting back to me.
I am not sure about step 6. Do I manually edit hive-site.xml and tez-site.xml?
Uncomment the property "Generate HADOOP_CREDSTORE_PASSWORD" from Hive service and Hive on Tez service. This is the flag to enable or disable the generation of HADOOP_CREDSTORE_PASSWORD (generate_jceks_password).
Instead of manually editing the Hive and Hive on Tez XML, I checked the box 'generate_jceks_password' in Cloudera Manager for both Hive and Hive on Tez.
I tried a few queries, and the password problem has not reappeared yet. I will run all 99 TPC-DS queries again to see if it comes back.
Thank you.
Created 06-05-2023 09:38 AM
If I uncheck "Generate HADOOP_CREDSTORE_PASSWORD" in Hive and Hive on Tez, save, and restart Hive and Hive on Tez, the password problem continues when I run the TPC-DS queries.
When I check it and restart Hive and Hive on Tez, the password problem goes away. I am confused about what the right procedure is.
Created 06-05-2023 02:57 PM
That's interesting.
Let me make sure my understanding is correct.
1. So after configuring the password through jceks, there are failures if "Generate HADOOP_CREDSTORE_PASSWORD" is unchecked. Can I please have the full exception? I just want to check whether the full exception from the latest failure points to anything else.
2. Did you also remove the properties you had already added, i.e. "<name>fs.s3a.access.key</name>" and "<name>fs.s3a.secret.key</name>", from core-site.xml?
Created 05-29-2023 11:30 AM
Not sure if this is the log you are referring to:
[root@ce-n3 hive]# ls -lt
total 111680
-rw-r--r-- 1 hive hive 97636958 May 29 14:22 hadoop-cmf-hive_on_tez-HIVESERVER2-ce-n3.rtp.lab.xyz.com.log.out
Today, I ran these 3 tests:
1) 2023-05-29 13:13:06 connect to hive using beeline -u 'jdbc:hive2://localhost:10000', the query completed successfully
2) 2023-05-29 13:19:59 connect to hive (it defaults to connecting via ZooKeeper service discovery on port 2181), run the same query as #1, it fails with encryption password not found
Connecting to jdbc:hive2://ce-n1.rtp.lab.xyz.com:2181,ce-n2.rtp.lab.xyz.com:2181,ce-n3.rtp.lab.xyz.com:2181/default;password=root;serviceDiscoveryMode=zooKeeper;user=root;zooKeeperNamespace=hiveserver2
3) 2023-05-29 13:28:48 connect to hive using beeline -u 'jdbc:hive2://localhost:10000', run query2, fails with encryption password not found.
I do not see an option to attach a file here, even after expanding the toolbar. Maybe I overlooked it?
Anyway, in the above log, I see Thrift errors (even when there is no activity):
2023-05-29 13:07:08,636 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-173]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_372]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_372]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
... 4 more
2023-05-29 13:07:08,641 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-173]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_372]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_372]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 22
Created 05-29-2023 05:18 AM
@ac-ntap Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information @venkatsambath has requested? Thanks.
Regards,
Diana Torres
Created 05-29-2023 02:39 PM
The problem is not resolved. I posted more information above as requested by venkatsambath.
Created 06-01-2023 06:41 AM
Hi @ac-ntap, this exception is not related. Please ignore it and kindly review my other comment: "https://community.cloudera.com/t5/Support-Questions/Hive-query-failed-with-java-io-IOException-Canno..."