Member since
05-24-2019
345
Posts
13
Kudos Received
6
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1053 | 07-15-2025 08:50 AM |
| | 860 | 12-13-2024 08:17 AM |
| | 2005 | 07-22-2024 01:41 PM |
| | 3488 | 09-21-2023 06:52 AM |
| | 8046 | 02-17-2023 10:46 AM |
01-09-2026
10:46 PM
@Hadoop16 FYI

This error occurs because of a token-delegation gap between Hive and the HDFS Router. In a Kerberized cluster, when Hive (running on a DataNode/compute node via Tez or MapReduce) attempts to write to HDFS, it needs a delegation token. When you use an HDFS Router address, Hive must be explicitly told to obtain a token specifically for the Router's service principal, which may differ from the backend NameNodes'.

➤ The Root Cause

The error `Client cannot authenticate via:[TOKEN, KERBEROS]` at the FileSinkOperator stage indicates that the tasks running on your worker nodes do not have a valid token to "speak" to the Router at router_host:8888. When Hive plans the job, it usually fetches tokens for the default filesystem. If your `fs.defaultFS` is set to a regular NameNode but your table location is an RBF address, Hive might not be fetching the secondary token required for the Router.

➤ The Fix: Configure Token Requirements

You need to ensure that Hive and the underlying MapReduce/Tez framework know to fetch tokens for the Router's URI.

1. Add the Router URI to Hive's token list. In your Hive session (or globally in hive-site.xml), define the Router as a "known" filesystem that requires tokens:

```
SET hive.metastore.token.signature=hdfs://router_host:8888;
SET mapreduce.job.hdfs-servers=hdfs://router_host:8888,hdfs://nameservice-backend;
```

2. Configure the HDFS client to "trust" the Router for tokens. In core-site.xml or hdfs-site.xml, enable the Router to act as a proxy for the backend NameNodes so it can pass the tokens correctly:

```
<property>
  <name>dfs.federation.router.delegation.token.enable</name>
  <value>true</value>
</property>
```

➤ Critical Kerberos Configuration

Because the Router is an intermediary, it must be allowed to impersonate the user (Hive) when talking to the backend. Ensure your proxyuser settings in core-site.xml include the Router's service principal. Assuming your Router runs as the hdfs or router user:

```
<property>
  <name>hadoop.proxyuser.router.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.router.hosts</name>
  <value>*</value>
</property>
```

➤ Diagnostic Verification

To prove whether the token is missing, run these commands from the datanode_host mentioned in your error logs, as the same user that runs the Hive job:

```
# Check if you can manually get a token for the router
hdfs fetchdt --renewer hdfs hdfs://router_host:8888 router.token

# Check the contents of your current credentials cache
klist -f
```

If `fetchdt` fails, the issue is with the Router's ability to issue tokens. If it succeeds but Hive fails, the issue is that Hive's job submission is not including the Router URI in the `mapreduce.job.hdfs-servers` list.
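If the session-level `SET` resolves the error, you can make the second property permanent for all jobs. A minimal sketch of the equivalent mapred-site.xml entry, reusing the same placeholder names (`router_host:8888` and `nameservice-backend` are illustrative, not from your cluster):

```
<property>
  <name>mapreduce.job.hdfs-servers</name>
  <value>hdfs://router_host:8888,hdfs://nameservice-backend</value>
</property>
```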
08-08-2025
10:21 AM
Hive logging is configured in /etc/hive/conf/hive-log4j2.properties. Look for these properties:

- `property.hive.log.dir`
- `property.hive.log.file`

Together they give the log location you are looking for. Thanks, -JMP
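To illustrate how the two properties combine into the actual log path, here is a minimal Python sketch; the values `/var/log/hive` and `hiveserver2.log` are illustrative defaults, not read from your cluster:

```python
# Sketch: resolve the Hive log file location from log4j2-style properties,
# as found in /etc/hive/conf/hive-log4j2.properties.
# The property values below are examples, not your actual configuration.
import os

sample = """
# illustrative fragment of hive-log4j2.properties
property.hive.log.dir = /var/log/hive
property.hive.log.file = hiveserver2.log
"""

def parse_properties(text):
    """Parse simple key = value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

props = parse_properties(sample)
log_path = os.path.join(props["property.hive.log.dir"],
                        props["property.hive.log.file"])
print(log_path)  # -> /var/log/hive/hiveserver2.log
```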
07-17-2025
09:50 AM
I will check with the development team to see where they are failing with Hive 4 support and let you know. Thanks
07-15-2025
08:50 AM
@LSIMS PageWriter.java#L99 is where your operation is failing. We need simple steps to reproduce the same error on a brand-new environment to investigate further.
07-04-2025
11:19 AM
@Jackallboy Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information requested? Thanks.
04-03-2025
01:55 AM
@JoseManuel Thank you for the suggestion. I have added the Hive on Tez roles to all NODEMANAGER nodes, but unfortunately, the issue persists. The Spark3 action is still failing.
01-22-2025
12:20 PM
I tested this in 7.1.9 SP1 CHF 4 with Postgres.
01-17-2025
10:19 AM
@Kalpit What stack trace do you get on Hive 3.1.3? Have you tried installing Apache Hive 4.0.1? We tested and did not get any errors:

```
[hive@ccycloud ~]$ beeline -e "CREATE TABLE demo (name string); INSERT INTO demo (name) VALUES ('Dean'); SELECT * FROM demo;"
...
!connect jdbc:hive2://localhost:10000/default hive [passwd stripped]
Connecting to jdbc:hive2://localhost:10000/default
Connected to: Apache Hive (version 4.0.1)
Driver: Hive JDBC (version 4.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Executing command: CREATE TABLE demo (name string); INSERT INTO demo (name) VALUES ('Dean');
going to print operations logs
printed operations logs
Getting log thread is interrupted, since query is done!
INFO : Compiling command(queryId=hive_20250117101442_d8c29cd4-dc66-4e86-bfc3-acfcda4dbf6d): CREATE TABLE demo (name string)
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20250117101442_d8c29cd4-dc66-4e86-bfc3-acfcda4dbf6d); Time taken: 0.057 seconds
INFO : Operation CREATETABLE obtained 1 locks
INFO : Executing command(queryId=hive_20250117101442_d8c29cd4-dc66-4e86-bfc3-acfcda4dbf6d): CREATE TABLE demo (name string)
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20250117101442_d8c29cd4-dc66-4e86-bfc3-acfcda4dbf6d); Time taken: 0.411 seconds
No rows affected (0.6 seconds)
going to print operations logs
printed operations logs
going to print operations logs
INFO : Compiling command(queryId=hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2): INSERT INTO demo (name) VALUES ('Dean')
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2); Time taken: 0.225 seconds
INFO : Operation QUERY obtained 0 locks
INFO : Executing command(queryId=hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2): INSERT INTO demo (name) VALUES ('Dean')
INFO : Query ID = hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: INSERT INTO demo (name) VALUES ('Dean') (Stage-1)
INFO : HS2 Host: [ccycloud.database.root.comops.site], Query ID: [hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2], Dag ID: [dag_1737137683361_0001_1], DAG Session ID: [application_1737137683361_0001]
INFO : Status: Running (Executing on YARN cluster with App id application_1737137683361_0001)
INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
INFO : Starting task [Stage-0:MOVE] in serial mode
INFO : Loading data to table default.demo from file:/tmp/warehouse/demo/.hive-staging_hive_2025-01-17_10-14-43_081_2234726835117592123-3/-ext-10000
INFO : Starting task [Stage-3:STATS] in serial mode
INFO : Executing stats task
INFO : Table default.demo stats: [numFiles=1, numRows=1, totalSize=5, rawDataSize=4, numFilesErasureCoded=0]
INFO : Completed executing command(queryId=hive_20250117101443_db567dbe-f0dd-4e5a-bbce-89de35c306f2); Time taken: 0.955 seconds
printed operations logs
Getting log thread is interrupted, since query is done!
1 row affected (1.196 seconds)
INFO : Compiling command(queryId=hive_20250117101722_7e3b0555-950a-401c-8cca-34c08582d8b3): SELECT * FROM demo
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:demo.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20250117101722_7e3b0555-950a-401c-8cca-34c08582d8b3); Time taken: 0.09 seconds
INFO : Operation QUERY obtained 0 locks
INFO : Executing command(queryId=hive_20250117101722_7e3b0555-950a-401c-8cca-34c08582d8b3): SELECT * FROM demo
INFO : Completed executing command(queryId=hive_20250117101722_7e3b0555-950a-401c-8cca-34c08582d8b3); Time taken: 0.0 seconds
+------------+
| demo.name  |
+------------+
| Dean       |
+------------+
1 row selected (0.171 seconds)
Beeline version 4.0.1 by Apache Hive
Closing: 0: jdbc:hive2://localhost:10000/default
[hive@ccycloud ~]$
```
12-11-2024
06:33 AM
Hello,

These are NOT errors:

```
INFO conf.Configuration: resource-types.xml not found.
INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
```

As for this:

```
INFO mapreduce.Job: map 0% reduce 0%
```

How many mappers were specified for the import? Try locating the running containers in YARN and take a few jstacks to find out whether the mapper is stuck waiting on your source database; if so, make sure there are no firewall/network rules preventing the flow of data. Are you able to execute `sqoop eval` against the source DB? If so, try using these options:

- `-jt local`
- `-m 1`
- `--verbose`

If the job completes, that would confirm a communication issue from your NodeManagers to the source DB.
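Put together, a hedged sketch of how those flags attach to a Sqoop import; the connection string, table, and username here are placeholders, not from your environment:

```
# Placeholder JDBC URL, table, and user; replace with your source DB details.
# -jt local is a generic Hadoop option (it must precede the tool arguments)
# and runs the job with the local runner, bypassing the NodeManagers.
# -m 1 uses a single mapper, so no split-by column is needed.
sqoop import -jt local \
  --connect jdbc:mysql://source-db:3306/sales \
  --username sqoop_user -P \
  --table orders \
  -m 1 \
  --verbose
```

If this local, single-mapper run succeeds while the normal run hangs at `map 0%`, that points at the NodeManager-to-database network path rather than Sqoop itself.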