Created 06-20-2016 03:16 PM
I am trying to import data into HBase using importtsv. I tried a couple of examples from online tutorials to learn, and they worked fine.
For example, with the command below I get success and can see the table in the HBase shell.
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=',' -Dimporttsv.columns=HBASE_ROW_KEY,cf tab4 /user/hduser/gutenberg/simple1.txt
2016-06-20 17:08:03,138 INFO [main] mapreduce.Job: map 100% reduce 0%
2016-06-20 17:08:03,139 INFO [main] mapreduce.Job: Job job_local1517188704_0001 completed successfully
2016-06-20 17:08:03,269 INFO [main] mapreduce.Job: Counters: 24
File System Counters
FILE: Number of bytes read=4018816
FILE: Number of bytes written=25538098
But the actual data I want to upload has many columns and rows (237 and 400k, respectively). To check integrity, I uploaded and tried to import a 237-column, 10-row version of the data. To make it visible, I am pasting part of the data and the command.
1,date,serial,serial,date,bsrid,SW version,bsr Group Id,Processed kpi Number,reserved....
2,20151206,1211080003,1211080003,20151206,103,30092,0,24,0,0,...
Here is a part of the command I am running.
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=',' -Dimporttsv.columns=HBASE_ROW_KEY,date,serial,serial_1,date_1,bsrid,SW_version,bsr_Group_Id,Processed_kpi_Number, (until 237th column) tab5 /user/hduser/KPI-Dumps/test.csv
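As an aside, typing 237 column names by hand is error-prone. A rough sketch of generating the -Dimporttsv.columns value from the CSV header row instead (this helper and its naming are assumptions, not part of ImportTsv: it takes field 1 as the row key, turns spaces into underscores, and appends _1, _2, ... to repeated names, mirroring the serial_1 / date_1 / reserved_10 style used above):

```shell
# make_columns: read a CSV header line on stdin and print an
# ImportTsv column spec, e.g. HBASE_ROW_KEY,date,serial,serial_1,...
make_columns() {
  awk -F',' 'NR == 1 {
    out = "HBASE_ROW_KEY"
    for (i = 2; i <= NF; i++) {        # field 1 is the row key
      name = $i
      gsub(/ /, "_", name)             # "SW version" -> "SW_version"
      if (seen[name]++) name = name "_" (seen[name] - 1)
      out = out "," name
    }
    print out
  }'
}
# e.g.: head -1 test.csv | make_columns
```

The result could then be substituted into the -Dimporttsv.columns= argument rather than being maintained by hand.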
When I try to run the command it gets the status below, and when I type the jps command I can no longer see the HRegionServer, so I have to restart HBase.
2016-06-20 16:20:07,804 INFO [communication thread] mapred.LocalJobRunner: map
2016-06-20 16:20:08,581 INFO [main] mapreduce.Job: map 100% reduce 0%
2016-06-20 16:21:15,159 INFO [hconnection-0x1abc383-metaLookup-shared--pool5-t2] client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=68390 ms ago, cancelled=false, msg=row 'tab5,1,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=cellops-optiplex-gx620,16201,1466432370372, seqNum=0
2016-06-20 16:21:35,227 INFO [hconnection-0x1abc383-metaLookup-shared--pool5-t2] client.RpcRetryingCaller: Call exception, tries=11, retries=35, started=88487 ms ago, cancelled=false, msg=row 'tab5,1,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=cellops-optiplex-gx620,16201,1466432370372, seqNum=0
There is no typo, missing comma, or anything like that; I double-checked. My only guess is that the difference is related to the column order: HBase created the columns in alphabetical order instead of the order I specified, even though I gave the correct order based on the file I want to upload. Do you think it gets confused during the upload?
From the created table
..
{NAME => 'Vs_SuccActiveCallReDirectUMTS'}
{NAME => 'bsr_Group_Id'} {NAME => 'bsrid'}
{NAME => 'date'} {NAME => 'date_1'}
{NAME => 'reserved'}
{NAME => 'reserved_1'}
{NAME => 'reserved_10'}
{NAME => 'reserved_11'}
...
hbase-site.xml looks like this; it's a pseudo-distributed installation with ZooKeeper also installed manually.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>/usr/local/hadoop/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/Hbase/zookeeper</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase-unsecure</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.client.scanner.timeout.period</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.cells.scanned.per.heartbeat.check</name>
    <value>10000</value>
  </property>
</configuration>
hduser@cellops-OptiPlex-GX620:/usr/local/Hbase$ jps
9712 NodeManager
31301 QuorumPeerMain
18294 HMaster
9481 ResourceManager
9289 SecondaryNameNode
18414 HRegionServer
9023 DataNode
8799 NameNode
18847 Jps
Created 06-20-2016 03:31 PM
"When I try to run the command it gets the below status and when I type jps command I can't see the HRegionServer anymore and I have to restart HBase"
Have you looked at the RegionServer log to determine why it is no longer running? It sounds like something is causing your RegionServer to fail (perhaps, out of memory?) and then HBase cannot proceed because it requires at least one RegionServer.
Investigate the end of the RegionServer log to determine the failure.
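One way to do that from the shell (a sketch only — the logs/ directory and the hbase-<user>-regionserver-<host>.log naming are assumptions based on a standard tarball install like the /usr/local/Hbase path in this thread):

```shell
# check_rs_log DIR: show the tail of any RegionServer log under DIR and
# flag files that contain an OutOfMemoryError.
check_rs_log() {
  dir=${1:-/usr/local/Hbase/logs}
  for f in "$dir"/hbase-*-regionserver-*.log; do
    [ -f "$f" ] || continue
    echo "== $f (last 20 lines) =="
    tail -n 20 "$f"
    grep -q 'OutOfMemoryError' "$f" && echo "$f: OutOfMemoryError found"
  done
  return 0
}
# e.g.: check_rs_log                  # default install path
# or:   check_rs_log /var/log/hbase   # if logs were redirected
```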
Created 06-20-2016 03:44 PM
Actually, I didn't think about that, since I don't think the file is that big to process. But the output below is in the log. I think I need to change the Java heap space in hbase-env.sh from 1G to 4G?
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 18414"...
Created 06-20-2016 05:57 PM
Yes, that looks like exactly what happened. 4G is a good heap size to start.
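For reference, the setting lives in conf/hbase-env.sh in a standard tarball install (the exact variable names below follow the stock hbase-env.sh template; adjust to your version):

```shell
# conf/hbase-env.sh — raise the heap given to each HBase daemon
# from the 1G default:
export HBASE_HEAPSIZE=4G

# Alternatively, to size only the RegionServer, JVM flags can go in
# HBASE_REGIONSERVER_OPTS instead, e.g.:
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx4g"
```

The daemons pick this up on the next restart (stop-hbase.sh / start-hbase.sh).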
Created 06-20-2016 03:47 PM
18414 was the RegionServer process.
Was it still running?
Created 06-20-2016 03:50 PM
9712 NodeManager
19811 Jps
31301 QuorumPeerMain
18294 HMaster
9481 ResourceManager
9289 SecondaryNameNode
9023 DataNode
8799 NameNode
Nope. I mean, this is after executing the command, per Josh's suggestion. It killed the process. I know I need to increase the heap size, but the machine I am working on is not very powerful either. I will try it with 4G.
Created 06-20-2016 03:51 PM
"Hbase created the columns based on alphabetical order"
When you query HBase, you observe alphabetical order because that's how HBase stores data internally.
Do you observe fewer than 10 rows after importing the sample data?
Created 06-21-2016 08:24 AM
Yesterday I managed to import the sample data successfully after increasing the heap size. Afterwards I stopped my HBase and Hadoop instances and gave my Ubuntu 12.04 a restart. After the restart, Hadoop doesn't come up. I now get the errors below.
namenode.log and datanode.log have same error
2016-06-21 10:11:12,347 WARN org.mortbay.log: failed jsp: java.lang.NoSuchFieldError: IS_SECURITY_ENABLED
2016-06-21 10:11:12,358 WARN org.mortbay.log: failed org.mortbay.jetty.webapp.WebAppContext@1599640{/,file:/usr/local/hadoop/share/hadoop/hdfs/webapps/hdfs}: java.lang.NoSuchFieldError: IS_SECURITY_ENABLED
2016-06-21 10:11:12,359 WARN org.mortbay.log: failed ContextHandlerCollection@181aa00: java.lang.NoSuchFieldError: IS_SECURITY_ENABLED
2016-06-21 10:11:12,360 ERROR org.mortbay.log: Error starting handlers
java.lang.NoSuchFieldError: IS_SECURITY_ENABLED
2016-06-21 10:11:12,401 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50070
2016-06-21 10:11:12,402 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2016-06-21 10:11:12,403 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2016-06-21 10:11:12,404 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2016-06-21 10:11:12,404 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Problem in starting http server. Server handlers failed
secondarynamenode.log
2016-06-21 10:11:27,718 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: SecondaryNameNode metrics system started
2016-06-21 10:11:28,230 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /app/hadoop/tmp/dfs/namesecondary/in_use.lock acquired by nodename 10100@cellops-OptiPlex-GX620
2016-06-21 10:11:28,372 FATAL org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /app/hadoop/tmp/dfs/namesecondary. Reported: -60. Expecting = -57.
resourcemanager.log
2016-06-21 10:11:29,608 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
2016-06-21 10:11:30,505 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/usr/local/hadoop/etc/hadoop/core-site.xml
2016-06-21 10:11:30,675 INFO org.apache.hadoop.security.Groups: clearing userToGroupsMap cache
2016-06-21 10:11:30,784 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.NoSuchMethodError: org.apache.hadoop.yarn.server.utils.BuilderUtils.newApplicationResourceUsageReport(IILorg/apache/hadoop/yarn/api/records/Resource;Lorg/apache/hadoop/yarn/api/records/Resource;Lorg/apache/hadoop/yarn/api/records/Resource;JJ)Lorg/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport;
nodemanager.log
2016-06-21 10:11:35,263 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getSocketAddr(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;I)Ljava/net/InetSocketAddress;
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceInit(ResourceLocalizationService.java:247)
I uninstalled Java and installed it back again. ZooKeeper is still running, so I think it's not Java-related. I tried HBase, and it also seems to be working: the RegionServer is up, but the Master is down because it can't reach HDFS. Somehow Hadoop crashed during the reboot. How can I repair it instead of installing it again?
Created 06-21-2016 03:24 PM
It looks like you somehow upgraded (some?) HDFS jars and messed up the Hadoop classpath. It could not load expected variables from the classpath. Additionally, it seems like the SecondaryNameNode is reporting that there is a newer filesystem layout (which would imply a newer version of HDFS was at one point running) and that it is expecting an older version (which implies that the SNN is using an older version of HDFS). Make sure you have consistent versions of HDFS installed.
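One rough way to check for a mixed install from the shell (a sketch only — the /usr/local/hadoop default and the jar naming pattern are assumptions based on a standard tarball layout):

```shell
# jar_versions DIR: list each distinct Hadoop jar version found under DIR.
# More than one line of output means jars from different releases are mixed
# into the same install, which produces exactly these NoSuchMethodError /
# NoSuchFieldError failures at startup.
jar_versions() {
  find "$1" -name 'hadoop-*.jar' 2>/dev/null \
    | sed 's#.*/##' \
    | sed -E 's/^(hadoop-[a-z0-9-]+)-([0-9][0-9.]*)\.jar$/\2/' \
    | sort -u
}
# e.g.: jar_versions "${HADOOP_HOME:-/usr/local/hadoop}"
```

It can also help to run `hadoop classpath` and scan the output for directories outside the current install.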
Created 06-22-2016 09:47 AM
Hi Josh, yes, I figured that out after comparing my log files with a working one. There was an older version present and, as you said, the path was mixed up. It's fixed now. I have another problem during upload: it's hitting a timeout during the import of the big file, but I guess it's better to open another thread for that problem. Thanks a lot!