Member since: 07-30-2020
Posts: 219
Kudos Received: 45
Solutions: 60
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 426 | 11-20-2024 11:11 PM
 | 485 | 09-26-2024 05:30 AM
 | 1080 | 10-26-2023 08:08 AM
 | 1851 | 09-13-2023 06:56 AM
 | 2125 | 08-25-2023 06:04 AM
11-21-2024
12:02 AM
1 Kudo
Thank you @rki_! That is exactly what happened. I had a node whose /tmp/ folder still contained old JournalNode data. After cleaning it up and running initializeSharedEdits, I managed to start the cluster.

Note: I had this exact exception on two slave nodes:

WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: There appears to be a gap in the edit log. We expected txid 121994, but got txid 121998.

I ran hdfs namenode -recover on both slave nodes and was then able to start both NameNodes properly. The data is replicated across all 3 nodes. Thank you so much for the help!
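For reference, a minimal sketch of the recovery sequence described above; the JournalNode edits path is illustrative (mine happened to be under /tmp/), and it assumes the affected services are stopped first:

# Remove the stale JournalNode edits directory on the bad node (path is hypothetical)
rm -rf /tmp/hadoop/dfs/journalnode/*

# Re-initialize the shared edits directory from the active NameNode
hdfs namenode -initializeSharedEdits

# On each standby NameNode that reported the "gap in the edit log" exception
hdfs namenode -recover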
09-29-2024
09:55 PM
Hi rki_, It seems that both the hbase:meta and hbase:namespace tables are not online. I am attaching the master log for your review; if you know a way to fix this, could you please take a look?

2024-09-30 10:11:28,981 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:12:28,982 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:13:19,391 ERROR [ActiveMasterInitializationMonitor-1727422999267] master.MasterInitializationMonitor (MasterInitializationMonitor.java:run(67)) - Master failed to complete initialization after 900000ms. Please consider submitting a bug report including a thread dump of this process.
2024-09-30 10:13:28,982 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:13:36,668 INFO [master:store-WAL-Roller] monitor.StreamSlowMonitor (StreamSlowMonitor.java:<init>(122)) - New stream slow monitor dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727671416667
2024-09-30 10:13:36,684 INFO [master:store-WAL-Roller] wal.AbstractFSWAL (AbstractFSWAL.java:logRollAndSetupWalProps(834)) - Rolled WAL /hbase/MasterData/WALs/dc1-apache-hbase.mobitel.lk,16000,1727422992087/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727670516635 with entries=0, filesize=85 B; new WAL /hbase/MasterData/WALs/dc1-apache-hbase.mobitel.lk,16000,1727422992087/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727671416667
2024-09-30 10:13:37,089 INFO [WAL-Archive-0] wal.AbstractFSWAL (AbstractFSWAL.java:archiveLogFile(815)) - Archiving hdfs://192.168.6.205:9000/hbase/MasterData/WALs/dc1-apache-hbase.mobitel.lk,16000,1727422992087/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727670516635 to hdfs://192.168.6.205:9000/hbase/MasterData/oldWALs/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727670516635
2024-09-30 10:13:37,092 INFO [WAL-Archive-0] region.MasterRegionUtils (MasterRegionUtils.java:moveFilesUnderDir(50)) - Moved hdfs://192.168.6.205:9000/hbase/MasterData/oldWALs/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727670516635 to hdfs://192.168.6.205:9000/hbase/oldWALs/dc1-apache-hbase.mobitel.lk%2C16000%2C1727422992087.1727670516635$masterlocalwal$
2024-09-30 10:14:28,982 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:15:28,983 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:16:28,983 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:16:41,861 INFO [RS-EventLoopGroup-1-1] hbase.Server (ServerRpcConnection.java:processConnectionHeader(550)) - Connection from 192.168.6.205:57364, version=2.5.10, sasl=false, ugi=super (auth:SIMPLE), service=MasterService
2024-09-30 10:17:28,984 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:18:28,985 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:19:28,985 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
2024-09-30 10:20:28,985 WARN [master/dc1-apache-hbase:16000:becomeActiveMaster] master.HMaster (HMaster.java:isRegionOnline(1373)) - hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1727422999057, server=dc1-apache-hbase.mobitel.lk,16020,1727159057270}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.

Thank you!
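As a hedged aside (an assumption on my part, not necessarily the fix for this case): when hbase:meta is recorded as OPEN on a server that is no longer alive, one commonly suggested step is to re-assign the meta region with the HBCK2 tool, using the region id shown in the log above:

# Requires the hbase-hbck2 jar; the jar path and version are illustrative
hbase hbck -j /path/to/hbase-hbck2.jar assigns 1588230740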
09-13-2024
04:59 AM
2 Kudos
See if you can raise a support ticket with Cloudera. The application log needs a detailed review to determine what is causing the container to fail.
08-28-2024
04:30 AM
I have the same issue, but I am unable to locate the /hbase-secure znode; I only have the /hbase znode. Which one should I delete?
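A minimal sketch for confirming which parent znode the cluster actually uses (the value of zookeeper.znode.parent; /hbase is the default, while secured clusters are often configured with /hbase-secure) before deleting anything:

# List the znodes at the root and under /hbase with the HBase ZooKeeper CLI
hbase zkcli ls /
hbase zkcli ls /hbase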
02-25-2024
05:38 PM
1 Kudo
Hi, it seems the table and partition can't be created, and the files on each datanode can't be located by the namenode.

1. Is there a way to re-point those files (the non-DFS used data) to the actual directory?

Configured Capacity: 1056759873536 (984.18 GB)
DFS Used: 475136 (464 KB)
Non DFS Used: 433030918144 (403.29 GB)
DFS Remaining: 623711703040 (580.88 GB)
DFS Used%: 0.00%
DFS Remaining%: 59.02%

Datanode directory:

bash-4.2$ cd /hadoop/dfs/data
bash-4.2$ ls -l
total 10485776
drwxrwsr-x. 4 hadoop root 4096 Feb 23 11:15 current
-rw-r--r--. 1 hadoop root 58 Feb 26 09:34 in_use.lock
-rw-rw-r--. 1 hadoop root 10737418240 Aug 28 05:26 tempfile
drwxrwsr-x. 2 hadoop root 4096 Feb 23 13:05 test

2. Next, how can we proceed with creating the tables and partitions?

Logs of the namenode:

2024-02-26 06:52:26,604 DEBUG security.UserGroupInformation: PrivilegedAction as:presto (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: *DIR* NameNode.rename: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 DEBUG security.UserGroupInformation: Failed to get groups for user presto by java.io.IOException: No groups found for user presto
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: DIR* NameSystem.renameTo: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 DEBUG hdfs.StateChange: DIR* FSDirectory.renameTo: /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237
2024-02-26 06:52:26,604 WARN hdfs.StateChange: DIR* FSDirectory.unprotectedRenameTo: failed to rename /tmp/presto-reporting-operator/576b4b93-ae3b-41ff-b401-be50023f776f/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 to /operator_metering/storage/metering_health_check/20240226_065226_04624_gjgwp_25ca095b-e61e-45a9-b4e3-d12a880a2237 because destination's parent does not exist
2024-02-26 06:52:26,604 DEBUG ipc.Server: Served: rename, queueTime= 0 procesingTime= 0
2024-02-26 06:52:26,604 DEBUG ipc.Server: IPC Server handler 5 on 9820: responding to Call#55244 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 10.128.38.29:59164
2024-02-26 06:52:26,604 DEBUG ipc.Server: IPC Server handler 5 on 9820: responding to Call#55244 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.rename from 10.128.38.29:59164 Wrote 36 bytes.
2024-02-26 06:52:26,607 DEBUG ipc.Server: got #55245
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 for RpcKind RPC_PROTOCOL_BUFFER
2024-02-26 06:52:26,607 DEBUG security.UserGroupInformation: PrivilegedAction as:presto (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
2024-02-26 06:52:26,607 DEBUG security.UserGroupInformation: Failed to get groups for user presto by java.io.IOException: No groups found for user presto
2024-02-26 06:52:26,607 DEBUG metrics.TopMetrics: a metric is reported: cmd: getfileinfo user: presto (auth:SIMPLE)
2024-02-26 06:52:26,607 DEBUG top.TopAuditLogger: ------------------- logged event for top service: allowed=true ugi=presto (auth:SIMPLE) ip=/10.128.38.29 cmd=getfileinfo src=/operator_metering/storage/metering_health_check dst=null perm=null
2024-02-26 06:52:26,607 DEBUG ipc.Server: Served: getFileInfo, queueTime= 0 procesingTime= 0
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: responding to Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164
2024-02-26 06:52:26,607 DEBUG ipc.Server: IPC Server handler 6 on 9820: responding to Call#55245 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 Wrote 34 bytes.
2024-02-26 06:52:26,608 DEBUG ipc.Server: got #55246
2024-02-26 06:52:26,608 DEBUG ipc.Server: IPC Server handler 4 on 9820: Call#55246 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from 10.128.38.29:59164 for RpcKind RPC_PROTOCOL_BUFFER

Logs of the reporting-operator:

time="2024-02-23T14:14:21Z" level=error msg="cannot insert into Presto table operator_health_check" app=metering component=testWriteToPresto error="presto: query failed (200 OK): \"com.facebook.presto.spi.PrestoException: Failed to create directory: hdfs://hdfs-namenode-proxy:9820/tmp/presto-reporting-operator/1d20c5c5-11e0-47b4-9bce-eaa724db21eb\""

Whenever we try to query, this is the error:

Error running query: Partition location does not exist: hdfs://hdfs-namenode-0.hdfs-namenode:9820/user/hive/warehouse/datasource_mlp_gpu_request_slots/dt=2024-02-08

Thank you!
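A hedged sketch based on the NameNode warning above ("failed to rename ... because destination's parent does not exist"): recreate the missing parent directory and re-check the partition path. The paths are taken from the logs; the owner is an assumption:

# Recreate the rename target's parent and hand it to the presto user
hdfs dfs -mkdir -p /operator_metering/storage/metering_health_check
hdfs dfs -chown presto /operator_metering/storage/metering_health_check

# Confirm whether the Hive partition location actually exists
hdfs dfs -ls /user/hive/warehouse/datasource_mlp_gpu_request_slots/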
01-05-2024
12:34 AM
I think your intention is to retrieve this data for your own monitoring or reporting tasks. If so, you can try querying the JMX endpoint to obtain the relevant data, for example via http://namenode:port/jmx.
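A minimal sketch of querying that endpoint; the host, port, and bean name are illustrative (the NameNode web UI port is commonly 9870 on Hadoop 3.x):

# Fetch all NameNode JMX beans as JSON
curl -s 'http://namenode:9870/jmx'

# Or filter to a single bean with the qry parameter
curl -s 'http://namenode:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState'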
10-30-2023
11:05 PM
Hi @rki_, I tried that but it is still failing with the same error.

sudo -u hive beeline -u "jdbc:hive2://machine1.dev.domain.com:2181/default;password=hive;principal=hive/_HOST@DEV.domain.COM;serviceDiscoveryMode=zooKeeper;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=****;user=hive;zooKeeperNamespace=hiveserver2" --hiveconf dfs.replication=1 -n hive --showHeader=false --outputformat=tsv2 -e "use testdb; export table newt1 to '/staging/exporttable/testdb/newt1';"

Error:

23/10/31 02:01:14 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs from ZooKeeper
Error: Could not open client transport for any of the Server URI's in ZooKeeper: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify dfs.replication at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
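A hedged sketch of two possible workarounds for the "Cannot modify dfs.replication at runtime" error (assumptions on my part, not the confirmed fix for this thread): either allow the parameter at runtime by appending it to HiveServer2's whitelist and restarting HiveServer2, or drop --hiveconf and set the replication factor on the exported directory afterwards:

# Option 1: add to hive-site.xml (or the CM HiveServer2 safety valve), then restart HiveServer2
#   <property>
#     <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
#     <value>dfs\.replication</value>
#   </property>

# Option 2: run the export without --hiveconf, then lower replication on the target path
hdfs dfs -setrep -w 1 /staging/exporttable/testdb/newt1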
10-26-2023
06:41 PM
Hi rki_, I understood. Thank you for the information.
10-24-2023
01:34 AM
To enable Ranger authorization for HDFS on the same cluster, you should not select the Ranger service dependency; instead, select the 'Enable Ranger Authorization' checkbox under HDFS. In the base cluster, even if you check the box for "Ranger_service", Cloudera Manager appears to save the configuration successfully, but the box will never remain checked, and a warning is logged in the CM server logs: "CyclicDependencyConfigUpdateListener - Unsetting dependency from service hdfs to service ranger to prevent cyclic dependency". Refer to the article below, which covers the equivalent Solr-Ranger dependency: https://my.cloudera.com/knowledge/WARN-quotUnsetting-dependency-from-servicequot-when-Ranger?id=329275