Member since: 03-30-2023
Posts: 16
Kudos Received: 4
Solutions: 0
05-13-2024
02:50 AM
1 Kudo
Hi Team, it looks like you have 3 DataNodes with rack topology. Have you checked how many racks there are? With rack awareness, a block's replicas are placed so that at most two replicas land in the same rack (on different DataNodes) and at least one replica goes to another rack. You can also try triggering the block report manually from the DataNode to the NameNode: hdfs dfsadmin -triggerBlockReport <datanode>:<ipc_port>
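For example, a quick way to confirm the rack mapping the NameNode currently sees, and then to trigger a block report from one DataNode (the hostname below is a placeholder; the DataNode IPC port defaults to 9867 on Hadoop 3):

hdfs dfsadmin -printTopology
hdfs dfsadmin -triggerBlockReport dn1.example.com:9867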
02-25-2024
05:38 PM
1 Kudo
Hi, it seems the table and partition can't be created, and the files on the DataNodes can't be located by the NameNode.

1. Is there a way to re-point those files (the non-DFS used data) to the actual directory?

Configured Capacity: 1056759873536 (984.18 GB)
DFS Used: 475136 (464 KB)
Non DFS Used: 433030918144 (403.29 GB)
DFS Remaining: 623711703040 (580.88 GB)
DFS Used%: 0.00%
DFS Remaining%: 59.02%

Datanode directory:

bash-4.2$ cd /hadoop/dfs/data
bash-4.2$ ls -l
total 10485776
drwxrwsr-x. 4 hadoop root 4096 Feb 23 11:15 current
-rw-r--r--. 1 hadoop root 58 Feb 26 09:34 in_use.lock
-rw-rw-r--. 1 hadoop root 10737418240 Aug 28 05:26 tempfile
drwxrwsr-x. 2 hadoop root        4096 Feb 23 13:05 test

2. Next, how can we proceed with creating the tables and partitions?

Logs of the namenode:

2024-02-26 06:52:26,604 DEBUG security.UserGroupInformation: PrivilegedAction as:presto (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: *DIR* NameNode.rename: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 DEBUG security.UserGroupInformation: Failed to get groups for user presto by java.io.IOException: No groups found for user presto
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: DIR* NameSystem.renameTo: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: DIR* FSDirectory.renameTo: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 WARN hdfs.StateChange: DIR* FSDirectory.unprotectedRenameTo: failed to rename /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 because destination's parent does not exist
2024-02-26 06:52:26,604 DEBUG ipc.Server: Served: rename, queueTime= 0 procesingTime= 0
2024-02-26 06:52:26,604 DEBUG ipc.Server: IPC Server handler 5 on 9820: responding to Call#55244 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 10.128.38.29:59164
2024-02-26 06:52:26,604 DEBUG ipc.Server: IPC Server handler 5 on 9820: responding to Call#55244 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 10.128.38.29:59164 Wrote 36 bytes.
2024-02-26 06:52:26,607 DEBUG ipc.Server: got #55245
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 for RpcKind RPC_PROTOCOL_BUFFER
2024-02-26 06:52:26,607 DEBUG security.UserGroupInformation: PrivilegedAction as:presto (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
2024-02-26 06:52:26,607 DEBUG security.UserGroupInformation: Failed to get groups for user presto by java.io.IOException: No groups found for user presto
2024-02-26 06:52:26,607 DEBUG metrics.TopMetrics: a metric is reported: cmd: getfileinfo user: presto (auth:SIMPLE)
2024-02-26 06:52:26,607 DEBUG top.TopAuditLogger: ------------------- logged event for top service: allowed=true ugi=presto (auth:SIMPLE) ip=/10.128.38.29 cmd=getfileinfo src=/operator_metering/storage/metering_health_check dst=null perm=null
2024-02-26 06:52:26,607 DEBUG ipc.Server: Served: getFileInfo, queueTime= 0 procesingTime= 0
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: responding to Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: responding to Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 Wrote 34 bytes.
2024-02-26 06:52:26,608 DEBUG ipc.Server: got #55246
2024-02-26 06:52:26,608 DEBUG ipc.Server: IPC Server handler 4 on 9820: Call#55246 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 for RpcKind RPC_PROTOCOL_BUFFER

Logs of the reporting-operator:

time="2024-02-23T14:14:21Z" level=error msg="cannot insert into Presto table operator_health_check" app=metering component=testWriteToPresto error="presto: query failed (200 OK): \"com.facebook.presto.spi.PrestoException: Failed to create directory: hdfs://hdfs-namenode-proxy:9820/tmp/presto-reporting-operator/1d20c5c5-11e0-47b4-9bce-eaa724db21eb\""

Whenever we're trying to query, this is the error:

Error running query: Partition location does not exist: hdfs://hdfs-namenode-0.hdfs-namenode:9820/user/hive/warehouse/datasource_mlp_gpu_request_slots/dt=2024-02-08

Thank you!
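For context, the rename failure above says the destination's parent does not exist; assuming sufficient HDFS permissions, that parent path could be checked (and, if it really is missing, recreated) with something like:

hdfs dfs -ls /operator_metering/storage
hdfs dfs -mkdir -p /operator_metering/storage/metering_health_check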
10-04-2023
11:30 PM
@Noel_0317 The directory /hadoop/dfs/name/ might be your NameNode data directory, which contains the metadata in the form of the fsimage and edit logs, so I wouldn't recommend deleting it if that's the case. You can confirm whether this directory is indeed the NameNode data directory by checking the HDFS configuration. If the cluster is working and still taking writes, but the latest data in that directory is from July, verify whether the NameNode data dir has been changed to a different mount point.
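For example, assuming a standard HDFS setup, the configured NameNode metadata directory can be printed directly from the client configuration:

hdfs getconf -confKey dfs.namenode.name.dir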
09-30-2023
03:58 AM
1 Kudo
Hi @Noel_0317 I am sharing a few HDFS commands to check at the file-system level of HDFS.

hdfs dfs -df /hadoop - Shows the capacity, free and used space of the filesystem.
hdfs dfs -df -h /hadoop - Same as above; the -h parameter formats the sizes in a human-readable fashion.
hdfs dfs -du /hadoop/file - Shows the amount of space, in bytes, used by the files that match the specified file pattern.
hdfs dfs -du -s /hadoop/file - Rather than showing the size of each individual file that matches the pattern, shows the total (summary) size.
hdfs dfs -du -h /hadoop/file - Shows the amount of space used by the files that match the specified file pattern, formatted in a human-readable fashion.

Let me know if this helps.
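The flags can also be combined; for example (the path is only illustrative), the following prints a single human-readable total for a directory:

hdfs dfs -du -s -h /hadoop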
08-24-2023
01:08 AM
BTW, it is strange to run a DataNode in Kubernetes, as pods are usually used for stateless workloads, and a DataNode is almost exclusively stateful by nature since it keeps HDFS data.
08-04-2023
12:36 AM
@Noel_0317 Do you have rack awareness configured for the DataNodes? Also, check for any disk-level issues on the DataNode. Try enabling DEBUG logging for block placement:

log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy=DEBUG
log4j.logger.org.apache.hadoop.hdfs.protocol.BlockStoragePolicy=DEBUG
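These properties go into the NameNode's log4j.properties and are picked up after a restart. As a sketch, assuming the NameNode web UI is reachable at nn-host:9870 (adjust host and HTTP port to your cluster), the same logger can also be switched to DEBUG at runtime:

hadoop daemonlog -setlevel nn-host:9870 org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy DEBUG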
07-31-2023
11:56 PM
Hi @Noel_0317, the error indicates that there are multiple partitions in the where condition. Can you try the below query:

INSERT OVERWRITE TABLE db.table_name PARTITION(dt='2023-03-26') select distinct * from db.table_name where dt = '2023-03-26';

Let us know how it goes. Cheers!
06-12-2023
11:51 PM
The block pool used and its current size are visible, but it seems the NameNode can't locate the blocks on the DataNodes, which is why it's showing 0 blocks.
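As a hedged check, you could compare what each DataNode reports against the block metadata the NameNode can actually resolve (the path / is only an example):

hdfs dfsadmin -report
hdfs fsck / -files -blocks -locations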
Labels:
Apache Hive
03-31-2023
11:34 AM
Can you isolate any connection issues between your NN and DN pods? Maybe you can try doing an nc or telnet to the NN port from the DN pod?
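For example, a minimal sketch of that check, where the DataNode pod name is a placeholder, nc is assumed to be available in the image, and the NameNode service address is the one seen elsewhere in these posts:

kubectl exec -it <datanode-pod> -- nc -vz hdfs-namenode-0.hdfs-namenode 9820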