Hive query failed with java.io.IOException: Cannot find password option fs.s3a.bucket.sg-cdp.fs.s3a.server-side-encryption-algorithm

avatar
Explorer

Hi, I am running CDH 7.1.7p0. I have an external ORC database on an S3 bucket. I did not configure fs.s3a.server-side-encryption-algorithm in the Hadoop core-site.xml. However, I randomly get this error when running Hive queries. Rerunning the same query may succeed, or it may fail with the same error message.

ERROR : Vertex failed, vertexName=Map 14, vertexId=vertex_1684599244994_0012_7_02, diagnostics=[Vertex vertex_1684599244994_0012_7_02 [Map 14] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ss1 initializer failed, vertex=vertex_1684599244994_0012_7_02 [Map 14], java.lang.RuntimeException: ORC split generation failed with exception: java.lang.RuntimeException: java.io.IOException: Cannot find password option fs.s3a.bucket.sg-cdp.fs.s3a.server-side-encryption-algorithm


The database and queries are part of the Hive benchmark tool from GitHub:
https://github.com/hortonworks/hive-testbench
Can you help me resolve this problem? Thank you.

Here are the s3a settings in core-site.xml. As you can see, there is no encryption-related setting.
<name>fs.s3a.endpoint</name>
<name>fs.s3a.access.key</name>
<name>fs.s3a.secret.key</name>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<name>fs.s3a.path.style.access</name>
<name>fs.s3a.connection.ssl.enabled</name>
<name>fs.s3a.ssl.channel.mode</name>
<name>fs.s3a.threads.max</name>
<name>fs.s3a.connection.maximum</name>
<name>fs.s3a.retry.limit</name>
<name>fs.s3a.retry.interval</name>
<name>fs.s3a.committer.name</name>
<name>fs.s3a.committer.magic.enabled</name>
<name>fs.s3a.committer.threads</name>
<name>fs.s3a.committer.staging.unique-filenames</name>
<name>fs.s3a.committer.staging.conflict-mode</name>
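One way to double-check a listing like the one above is to parse the file and list the fs.s3a.* property names, flagging any that mention encryption. The following is only a minimal sketch: the sample XML, its placeholder values, and the function names are assumptions made for illustration and are not taken from the actual cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a real core-site.xml; the values are placeholders.
SAMPLE_CORE_SITE = """<configuration>
  <property><name>fs.s3a.endpoint</name><value>s3.example.invalid</value></property>
  <property><name>fs.s3a.impl</name><value>org.apache.hadoop.fs.s3a.S3AFileSystem</value></property>
  <property><name>fs.s3a.path.style.access</name><value>true</value></property>
</configuration>"""


def s3a_property_names(xml_text):
    """Return the names of all fs.s3a.* properties in a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    names = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith("fs.s3a."):
            names.append(name)
    return names


def encryption_related(names):
    """Keep only property names that mention encryption (simple substring check)."""
    return [n for n in names if "encryption" in n]


names = s3a_property_names(SAMPLE_CORE_SITE)
print("s3a properties:", names)
print("encryption-related:", encryption_related(names))
```

To run it against a real file, read the file contents and pass them to s3a_property_names in place of the sample string.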

1 ACCEPTED SOLUTION

avatar

Sorry about the typo in the article:

"1. Uncomment the property "Generate HADOOP_CREDSTORE_PASSWORD" from Hive service and Hive on Tez service. This is the flag to enable or disable the generation of HADOOP_CREDSTORE_PASSWORD (generate_jceks_password)."

It should say "uncheck", not "uncomment". So please uncheck "Generate HADOOP_CREDSTORE_PASSWORD" in Hive and Hive on Tez, save, and restart Hive and Hive on Tez.

19 REPLIES

avatar

I am sorry for the delay. This is the exception I was looking for.

Specifically, we can see that the failure was due to Hive trying to pick up the password from its own JCEKS rather than using the plain-text password.

 

It is still unclear why it falls back to JCEKS instead of using the password. I will check against existing known issues. For the time being, can you try configuring the password through JCEKS as indicated in this article: https://my.cloudera.com/knowledge/How-to-configure-HDFS-and-Hive-to-use-different-JCEKS-and?id=32605...

 

You can ignore steps 9 to 11 of the article.

sg-cdp is the bucket name, so instead of fs.s3a.bucket.scc-803070-bucket-1.security.credential.provider.path, please use the property fs.s3a.bucket.sg-cdp.security.credential.provider.path.
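The property name above follows the S3A per-bucket pattern, fs.s3a.bucket.<bucket>.<option>, where <option> is the base option name without its fs.s3a. prefix. As a rough illustration, the substitution can be sketched like this; the helper name per_bucket_key is hypothetical and not part of Hadoop.

```python
S3A_PREFIX = "fs.s3a."


def per_bucket_key(bucket, base_key):
    """Build the per-bucket form of an fs.s3a.* option name.

    Hypothetical helper for illustration only; it just rewrites the string.
    """
    if not base_key.startswith(S3A_PREFIX):
        raise ValueError(f"not an fs.s3a.* option: {base_key}")
    suffix = base_key[len(S3A_PREFIX):]  # option name without the fs.s3a. prefix
    return f"{S3A_PREFIX}bucket.{bucket}.{suffix}"


key = per_bucket_key("sg-cdp", "fs.s3a.security.credential.provider.path")
print(key)
```

Note that the option in the reported error, fs.s3a.bucket.sg-cdp.fs.s3a.server-side-encryption-algorithm, repeats the fs.s3a. prefix after the bucket name, so it does not match the form this helper produces.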

avatar
Explorer

Thank you for getting back to me. 

I am not sure about step 6. Do I manually edit hive-site.xml and tez-site.xml?

Uncomment the property "Generate HADOOP_CREDSTORE_PASSWORD" from Hive  service and Hive on Tez service. This is the flag to enable or disable the generation of HADOOP_CREDSTORE_PASSWORD (generate_jceks_password).


Instead of manually editing the Hive and Hive on Tez XML files, I checked the box 'generate_jceks_password' in Cloudera Manager for both Hive and Hive on Tez.

I tried a few queries, and the password problem has not reappeared yet. I will run all 99 TPC-DS queries again to see if it comes back.
Thank you.

avatar

Sorry about the typo in the article:

"1. Uncomment the property "Generate HADOOP_CREDSTORE_PASSWORD" from Hive service and Hive on Tez service. This is the flag to enable or disable the generation of HADOOP_CREDSTORE_PASSWORD (generate_jceks_password)."

It should say "uncheck", not "uncomment". So please uncheck "Generate HADOOP_CREDSTORE_PASSWORD" in Hive and Hive on Tez, save, and restart Hive and Hive on Tez.

avatar
Explorer

If I uncheck "Generate HADOOP_CREDSTORE_PASSWORD" in Hive and Hive on Tez, save, and restart Hive and Hive on Tez, the password problem continues when I run the TPC-DS queries.
When I check it and restart Hive and Hive on Tez, the password problem goes away. I am confused about which procedure is correct.

avatar

That's interesting.

Let me make sure my understanding is correct.

1. So after configuring the password through JCEKS, there are failures if "Generate HADOOP_CREDSTORE_PASSWORD" is unchecked. Can I please have the full exception? I just want to check whether the full exception from the latest failure points to anything else.

2. Did you also remove the properties you had already added, i.e. "<name>fs.s3a.access.key</name>" and "<name>fs.s3a.secret.key</name>", from core-site.xml?

avatar
Explorer

I am not sure if this is the log you are referring to:
[root@ce-n3 hive]# ls -lt
total 111680
-rw-r--r-- 1 hive hive 97636958 May 29 14:22 hadoop-cmf-hive_on_tez-HIVESERVER2-ce-n3.rtp.lab.xyz.com.log.out
Today, I ran these 3 tests:
1) 2023-05-29 13:13:06: connected to Hive using beeline -u 'jdbc:hive2://localhost:10000'; the query completed successfully.
2) 2023-05-29 13:19:59: connected to Hive (it defaults to connecting via thrift:2181) and ran the same query as #1; it failed with the encryption password not found error.
Connecting to jdbc:hive2://ce-n1.rtp.lab.xyz.com:2181,ce-n2.rtp.lab.xyz.com:2181,ce-n3.rtp.lab.xyz.com:2181/default;password=root;serviceDiscoveryMode=zooKeeper;user=root;zooKeeperNamespace=hiveserver2
3) 2023-05-29 13:28:48: connected to Hive using beeline -u 'jdbc:hive2://localhost:10000' and ran query2; it failed with the encryption password not found error.
I cannot find an option to attach a file here, even after expanding the toolbar. Maybe I overlooked it?
Anyway, in the above log, I see Thrift errors (even when there is no activity):
2023-05-29 13:07:08,636 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-173]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_372]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_372]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
... 4 more
2023-05-29 13:07:08,641 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-173]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 22
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) ~[hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) [hive-exec-3.1.3000.7.1.7.0-551.jar:3.1.3000.7.1.7.0-551]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_372]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_372]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 22


 

avatar
Community Manager

@ac-ntap Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information @venkatsambath has requested? Thanks.


Regards,

Diana Torres,
Community Moderator


avatar
Explorer

The problem is not resolved. I posted more information above, as requested by venkatsambath.

avatar

Hi @ac-ntap, this exception is not related. Please ignore it and kindly review my other comment: "https://community.cloudera.com/t5/Support-Questions/Hive-query-failed-with-java-io-IOException-Canno..."
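Since the Thrift "Invalid status 22" entries are unrelated, it may help to filter the HiveServer2 log down to the lines that mention the actual error. Below is a minimal sketch of such a filter; the function name is hypothetical, and the sample lines are short excerpts copied from this thread rather than a full log.

```python
PASSWORD_ERROR = "Cannot find password option"

# Short excerpts from this thread, used as stand-in log lines.
SAMPLE_LOG = [
    "2023-05-29 13:07:08,636 ERROR org.apache.thrift.server.TThreadPoolServer: "
    "[HiveServer2-Handler-Pool: Thread-173]: Error occurred during processing of message.",
    "java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 22",
    "java.lang.RuntimeException: java.io.IOException: Cannot find password option "
    "fs.s3a.bucket.sg-cdp.fs.s3a.server-side-encryption-algorithm",
]


def password_option_errors(lines):
    """Return only the lines that contain the 'Cannot find password option' error."""
    return [line.rstrip("\n") for line in lines if PASSWORD_ERROR in line]


matches = password_option_errors(SAMPLE_LOG)
for line in matches:
    print(line)
```

The same function accepts an open file handle, so a real log can be scanned by opening it and passing the handle in place of SAMPLE_LOG.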