Member since
10-04-2018
23
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5825 | 01-14-2019 04:01 PM |
01-27-2019
03:43 PM
The problem appears to have been caused by the virus scanning software that was running.
01-27-2019
03:43 PM
This shows no corrupt hfiles.

./hbase hbck -checkCorruptHFiles
Checked 117 hfile for corruption
 HFiles corrupted: 0
 HFiles moved while checking: 0
 Mob files moved while checking: 0
Summary: OK
Mob summary: OK
01-27-2019
03:43 PM
I see these "could not obtain block" errors in the log. Here's another one. The block is OK per hdfs fsck <path> -files -blocks. This makes me think HBase can't read the file because it can't read the HFile trailer. I need to figure out how to verify/validate and repair the HFile.

2019-01-14 09:58:00,575 ERROR [regionserver/hadoop-2:16020-shortCompactions-1547430414622] regionserver.CompactSplit: Compaction failed region=EDA_ATTACHMENTS,,1546990167772.005c417fdc141d22d49c63fe93014aa8., storeName=DATA, priority=96, startTime=1547484994585
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://hadoop-1.nit.disa.mil:8020/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a
at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:545)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:579)
at org.apache.hadoop.hbase.regionserver.StoreFileReader.<init>(StoreFileReader.java:104)
at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:270)
at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:357)
at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:465)
at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:683)
at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:676)
at org.apache.hadoop.hbase.regionserver.HStore.validateStoreFile(HStore.java:1858)
at org.apache.hadoop.hbase.regionserver.HStore.moveFileIntoPlace(HStore.java:1431)
at org.apache.hadoop.hbase.regionserver.HStore.moveCompactedFilesIntoPlace(HStore.java:1419)
at org.apache.hadoop.hbase.regionserver.HStore.doCompaction(HStore.java:1387)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1375)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2095)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:592)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:634)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-869721575-207.132.83.245-1543446665241:blk_1073784662_43855 file=/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a
at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:870)
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:853)
___
This shows the block is OK.

hdfs fsck /apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a -files -blocks
/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a 85047198 bytes, replicated: replication=2, 1 block(s): OK
0. BP-869721575-207.132.83.245-1543446665241:blk_1073784662_43855 len=85047198 Live_repl=2
___
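To tie the BlockMissingException back to HDFS, one approach is to pull the block id out of the log line and query it directly. This is only a sketch: it assumes Hadoop 2.7+ for fsck -blockId and the hbase hfile pretty-printer tool; the cluster commands are guarded so the extraction part runs anywhere.

```shell
# The "Caused by" line from the compaction failure above
LOG_LINE='Could not obtain block: BP-869721575-207.132.83.245-1543446665241:blk_1073784662_43855 file=/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a'

# Extract the block id (without the generation stamp) and the file path
BLOCK=$(echo "$LOG_LINE" | grep -o 'blk_[0-9]*' | head -1)
HFILE=$(echo "$LOG_LINE" | sed 's/.*file=//')
echo "block=$BLOCK"
echo "hfile=$HFILE"

# Cluster-side checks (only run where the CLIs exist):
if command -v hdfs >/dev/null 2>&1; then
  # Ask the NameNode which file/replicas hold this block
  hdfs fsck -blockId "$BLOCK"
fi
if command -v hbase >/dev/null 2>&1; then
  # Parse the HFile metadata (including the trailer) the way a regionserver would
  hbase hfile -m -f "$HFILE"
fi
```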
01-27-2019
03:43 PM
I'm getting "could not obtain block" errors in the regionserver logs. Above is an example, but there are different errors related to the same missing block or file. The file is in HDFS. The regionservers can't recover and eventually crash. When restarted, the regionservers try to locate the block, can't, and go down again.
01-27-2019
03:43 PM
2019-01-08 16:22:29,475 WARN [MemStoreFlusher.0] impl.BlockReaderFactory: I/O error constructing remote block reader.
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<ip>:51076 remote=/<ip>:50010]
2019-01-08 16:22:29,477 WARN [MemStoreFlusher.0] hdfs.DFSClient: Failed to connect to /<ip>:50010 for file /apps/hbase/data/data/default/EDA_ATTACHMENTS/376661f95c7be7f667a876480e732976/.tmp/DATA/92e54a03a44042a1be63a7ff04158792 for block BP-869721575-<ip>-1543446665241:blk_1073772872_32065, add to deadNodes and continue.
2019-01-08 16:31:06,275 ERROR [regionserver/hadoop-2:16020-shortCompactions-1546916652740] regionserver.CompactSplit: Compaction failed region=EDA_ATTACHMENTS,,1546990167772.005c417fdc141d22d49c63fe93014aa8., storeName=DATA, priority=96, startTime=1546990170152
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://hadoop-1.nit.disa.mil:8020/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/3e9c1942fe484a26a81ba5a2578a69d5
at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:545)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:579)
at org.apache.hadoop.hbase.regionserver.StoreFileReader.<init>(StoreFileReader.java:104)
at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:270)
at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:357)
at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:465)
at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:683)
at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:676)
at org.apache.hadoop.hbase.regionserver.HStore.validateStoreFile(HStore.java:1858)
at org.apache.hadoop.hbase.regionserver.HStore.moveFileIntoPlace(HStore.java:1431)
at org.apache.hadoop.hbase.regionserver.HStore.moveCompactedFilesIntoPlace(HStore.java:1419)
at org.apache.hadoop.hbase.regionserver.HStore.doCompaction(HStore.java:1387)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2095)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:592)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:634)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-869721575-<ip>-1543446665241:blk_1073772893_32086 file=/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/3e9c1942fe484a26a81ba5a2578a69d5
at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:870)
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:853)
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:832)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:564)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:754)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:820)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:401)
at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:532)
Labels:
- Apache HBase
01-14-2019
04:01 PM
This was a temp table. I wasn't able to drop the table because it was in a 'DISABLING' state, not 'DISABLED'. I was able to figure out a way to manually remove the table:
1. Remove the table directory in HDFS: /apps/hbase/data/data/default/<table_name>
2. Remove the rows in hbase:meta that referenced the table: deleteall 'hbase:meta', '<row_name>'
3. Stop HBase, then ZooKeeper.
4. Start ZooKeeper, then HBase.
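The steps above can be sketched as a guarded script. EDA_CONTRACTS_TEMP is just a stand-in table name from elsewhere in this thread, and the cluster commands only run where the hdfs/hbase CLIs exist; treat this as an outline, not a tested procedure.

```shell
# Placeholder: the name of the stuck table
TABLE='EDA_CONTRACTS_TEMP'

TABLE_DIR="/apps/hbase/data/data/default/$TABLE"
echo "table dir: $TABLE_DIR"

if command -v hdfs >/dev/null 2>&1; then
  # 1. Remove the table directory from HDFS
  hdfs dfs -rm -r "$TABLE_DIR"
fi

if command -v hbase >/dev/null 2>&1; then
  # 2. Find the hbase:meta rows for the table, then deleteall each row key
  echo "scan 'hbase:meta', {ROWPREFIXFILTER => '$TABLE'}" | hbase shell -n
  # deleteall 'hbase:meta', '<row key printed by the scan above>'
fi

# 3. Stop HBase, then ZooKeeper; start ZooKeeper, then HBase.
```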
01-13-2019
07:52 PM
When I try to unassign (close) the region I see this in the log. Is there a way to manually set the state of the region or force it to close?

2019-01-13 12:46:06,272 WARN [PEWorker-16] assignment.RegionTransitionProcedure: Failed transition, suspend 2048secs pid=2931, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=EDA_CONTRACTS_TEMP, region=52fdc258feeeae9285551f2cc231d841, server=hadoop-2,16020,1546877712972; rit=OPENING, location=hadoop-2,16020,1546877712972; waiting on rectified condition fixed by other Procedure or operator intervention
org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
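One way to force the region's state is HBCK2, the separately shipped hbase-operator-tools replacement for the old hbck -fix on HBase 2.x. This is a sketch only: the jar path is an assumption, and the encoded region name is taken from the log line above.

```shell
# Path to the separately-shipped HBCK2 jar (assumption; adjust to your install)
HBCK2_JAR=/opt/hbase-operator-tools/hbase-hbck2.jar
# Encoded region name from the UnassignProcedure log line
REGION='52fdc258feeeae9285551f2cc231d841'

if command -v hbase >/dev/null 2>&1 && [ -f "$HBCK2_JAR" ]; then
  # Force the region out of OPENING so the unassign can proceed
  hbase hbck -j "$HBCK2_JAR" setRegionState "$REGION" CLOSED
  hbase hbck -j "$HBCK2_JAR" unassigns "$REGION"
fi
```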
01-13-2019
04:28 AM
hbase(main):002:0> is_enabled 'EDA_CONTRACTS_TEMP'
false
Took 0.1362 seconds
=> false
hbase(main):003:0> is_disabled 'EDA_CONTRACTS_TEMP'
false
Took 0.0290 seconds
=> 1
./hbase hbck -details
Number of regions in transition: 1
EDA_CONTRACTS_TEMP,,1545342660748.52fdc258feeeae9285551f2cc231d841. state=OPENING, ts=Thu Jan 10 19:07:32 MST 2019 (177746s ago), server=null
01-13-2019
04:24 AM
I have a table stuck in 'DISABLING' state, and one of the four table regions is stuck in 'OPENING' state. HBase 2.0 hbck doesn't allow fixing tables.
- ./hbase zkcli, then ls /hbase-unsecure/table/<table name>, doesn't show anything
- restarting the HBase master, regionservers, and ZooKeeper doesn't resolve it
How do I fix this issue?
Labels:
- Apache HBase
12-28-2018
10:26 PM
Thanks for digging into this and giving a temporary workaround.
I changed app.js and the regionserver now shows started (green checkmark). The available actions on the regionserver are now stop and turn on maintenance mode, and delete is disabled. Replacing setStatusAs('RS_DECOMMISSIONED') with setStatusAs('INSERVICE') sets the regionserver to the correct status and correct actions list:
} else if (isInServiceDesired) {
//this.setStatusAs('RS_DECOMMISSIONED');
this.setStatusAs('INSERVICE');
}
Thanks again for looking at this. How would I track the Apache bug submission?
12-27-2018
09:00 PM
Hi, I updated the hostcomponentdesiredstate table but the regionserver component is still showing 'Decommissioning'.

ambari=> select id,component_name,desired_state,host_id,service_name,admin_state from hostcomponentdesiredstate where service_name='HBASE';
id | component_name | desired_state | host_id | service_name | admin_state
-----+----------------------+---------------+---------+--------------+-------------
154 | PHOENIX_QUERY_SERVER | STARTED | 1 | HBASE | INSERVICE
155 | PHOENIX_QUERY_SERVER | STARTED | 2 | HBASE | INSERVICE
159 | HBASE_CLIENT | INSTALLED | 2 | HBASE |
157 | HBASE_CLIENT | INSTALLED | 1 | HBASE |
158 | HBASE_CLIENT | INSTALLED | 3 | HBASE |
156 | PHOENIX_QUERY_SERVER | STARTED | 3 | HBASE | INSERVICE
152 | HBASE_REGIONSERVER | STARTED | 3 | HBASE |
153 | HBASE_REGIONSERVER | STARTED | 1 | HBASE |
151 | HBASE_MASTER | STARTED | 2 | HBASE |
(9 rows)
The desired_admin_state still says 'INSERVICE' in the REST API.

"HostRoles" : {
"cluster_name" : "PIEE",
"component_name" : "HBASE_REGIONSERVER",
"desired_admin_state" : "INSERVICE",
"desired_repository_version" : "3.0.1.0-187",
"desired_stack_id" : "HDP-3.0",
"desired_state" : "STARTED",
"display_name" : "RegionServer",
"host_name" : "hadoop-2.nit.disa.mil",
"maintenance_state" : "OFF",
"public_host_name" : "hadoop-2.nit.disa.mil",
"reload_configs" : false,
"service_name" : "HBASE",
"stale_configs" : false,
"state" : "STARTED",
"upgrade_state" : "NONE",
"version" : "3.0.1.0-187",
"actual_configs" : { }
},
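A guarded sketch of setting admin_state explicitly via psql. This assumes the default PostgreSQL-backed Ambari database and that ambari-server is stopped while editing; both are assumptions, and the table/column names come from the query output in this thread.

```shell
# Build the update statement (regionserver rows show a blank admin_state above)
SQL="UPDATE hostcomponentdesiredstate SET admin_state = 'INSERVICE' WHERE component_name = 'HBASE_REGIONSERVER';"
echo "$SQL"

if command -v psql >/dev/null 2>&1; then
  # ambari-server stop    # stop Ambari before editing its database
  psql -U ambari -d ambari -c "$SQL"
  # ambari-server start
fi
```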
12-27-2018
06:28 PM
I changed the admin_state to NULL and restarted Ambari. The regionserver status still shows 'Decommissioning', and the REST API still shows "desired_admin_state" : "INSERVICE".
12-27-2018
04:53 PM
It displays started and a green check mark for a few seconds, then eventually goes to a decommissioning status. It's as if there's a process that's checking something and changing the status.
12-26-2018
06:21 PM
Here is the output from the ambari-agent log at debug level:
DEBUG 2018-12-26 11:21:32,977 HeartbeatThread.py:99 - Heartbeat response is {u'status': u'OK', u'id': 8}
12-26-2018
06:12 PM
Partial output:
{
"href" : "https://hadoop-1.nit.disa.mil:8443/api/v1/clusters/PIEE/hosts/hadoop-2.nit.disa.mil/host_components/HBASE_REGIONSERVER/?fields=*",
"HostRoles" : {
"cluster_name" : "PIEE",
"component_name" : "HBASE_REGIONSERVER",
"desired_admin_state" : "INSERVICE",
"desired_repository_version" : "3.0.1.0-187",
"desired_stack_id" : "HDP-3.0",
"desired_state" : "STARTED",
"display_name" : "RegionServer",
"host_name" : "hadoop-2.nit.disa.mil",
"maintenance_state" : "OFF",
"public_host_name" : "hadoop-2.nit.disa.mil",
"reload_configs" : false,
"service_name" : "HBASE",
"stale_configs" : false,
"state" : "STARTED",
"upgrade_state" : "NONE",
"version" : "3.0.1.0-187",
"actual_configs" : { }
},
12-26-2018
04:57 PM
Ambari is showing HBase regionservers in a 'Decommissioning' status. When the regionservers are started, the status goes to the green checkmark (Installed status), then a few seconds later it goes to a decommissioning status. The regionservers are up and appear to be running OK.
- ./hbase zkcli, then ls /hbase-unsecure/draining, shows nothing
- ls /hbase-unsecure/rs shows the regionservers
- the Ambari database hostcomponentstate table shows the regionservers 'STARTED'
- I don't see anything in the ambari-agent log yet
- the Ambari REST API shows desired_state : STARTED for the regionservers
How is the component status being set in the Ambari console, and how do I resolve this?
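As a sanity check, the state the console acts on can be read back from the Ambari REST API (the same host_components endpoint quoted elsewhere in this thread). A sketch; the -k flag and the credentials handling are assumptions for this cluster:

```shell
# Endpoint for the regionserver host component, narrowed to the relevant fields
BASE='https://hadoop-1.nit.disa.mil:8443/api/v1/clusters/PIEE'
URL="$BASE/hosts/hadoop-2.nit.disa.mil/host_components/HBASE_REGIONSERVER?fields=HostRoles/state,HostRoles/desired_admin_state,HostRoles/maintenance_state"
echo "$URL"

# Only query when curl exists and credentials were provided
if command -v curl >/dev/null 2>&1 && [ -n "${AMBARI_USER:-}" ]; then
  # -k because the cluster uses a self-signed cert (assumption)
  curl -sk -u "$AMBARI_USER" -H 'X-Requested-By: ambari' "$URL"
fi
```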
Labels:
- Apache Ambari
- Apache HBase
10-10-2018
01:33 PM
That worked. Thank you very much. I'm going to look for documentation listing all the available properties.
10-09-2018
03:53 PM
I configured SSL for HDFS. I have an alert in the Ambari console related to the secondary namenode HTTP address, port 50090: Connection failed to http://wawf-wk-hadoop2.caci.com:50090 (). Is there an alternate https property in hdfs-site.xml for dfs.secondary.http.address?
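For comparison while testing: the https-side counterpart appears to be dfs.namenode.secondary.https-address (dfs.secondary.http.address being the deprecated form of dfs.namenode.secondary.http-address, with 50091 as the stock https default). Treat the property name and port here as assumptions to verify against the stack's hdfs-default.xml.

```xml
<property>
  <name>dfs.namenode.secondary.https-address</name>
  <value>wawf-wk-hadoop2.caci.com:50091</value>
</property>
```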
Labels:
- Labels:
-
Apache Hadoop