Member since: 02-12-2016
Posts: 22
Kudos Received: 17
Solutions: 0
06-07-2017
07:20 AM
While running the below query in Hue I am getting an error. Any suggestions on this? Command: msck repair table import_********; Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
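For reference, this is how I plan to re-run it from Beeline to capture more detail and to see whether relaxing partition-path validation changes the outcome (a sketch only; the JDBC URL and table name are placeholders, and hive.msck.path.validation is assumed to be available on this Hive version):
# Run the repair from Beeline instead of Hue, with path validation relaxed
beeline -u "jdbc:hive2://hiveserver2.example.com:10000/default" \
  -e "SET hive.msck.path.validation=ignore; MSCK REPAIR TABLE my_import_table;"
# The underlying cause of the DDLTask failure is usually visible in the HiveServer2 / Metastore logs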
... View more
Labels:
06-06-2016
08:32 PM
Below is the error from our application logs: Jun 06, 2016 12:32:25 PM org.apache.zookeeper.ZooKeeper <init> INFO: Initiating client connection,
connectString=nykdsr000003239.intranet.barcapint.com:2181,nykdsr000003238.intranet.barcapint.com:2181,nykdsr000003240.intranet.barcapint.com:2181
sessionTimeout=60000 watcher=hconnection-0x755f37310x0,
quorum=nykdsr000003239.intranet.barcapint.com:2181,nykdsr000003238.intranet.barcapint.com:2181,nykdsr000003240.intranet.barcapint.com:2181,
baseZNode=/hbase
Jun 06, 2016 12:32:25 PM org.apache.zookeeper.ClientCnxn$SendThread
logStartConnect
INFO: Opening socket connection to server
nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181. Will not attempt to
authenticate using SASL (unknown error)
Jun 06, 2016 12:32:25 PM org.apache.zookeeper.ClientCnxn$SendThread
primeConnection
INFO: Socket connection established, initiating session, client:
/10.173.76.85:52746, server:
nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181
Jun 06, 2016 12:32:25 PM org.apache.zookeeper.ClientCnxn$SendThread
onConnected
INFO: Session establishment complete on server
nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181, sessionid =
0x2552005d31512ac, negotiated timeout = 60000
Jun 06, 2016 12:32:31 PM org.apache.phoenix.metrics.Metrics initialize
INFO: Initializing metrics system: phoenix
Jun 06, 2016 12:32:32 PM org.apache.hadoop.metrics2.impl.MetricsConfig
loadFirst
WARNING: Cannot locate configuration: tried
hadoop-metrics2-phoenix.properties,hadoop-metrics2.properties
Jun 06, 2016 12:32:32 PM org.apache.hadoop.metrics2.impl.MetricsSystemImpl
startTimer
INFO: Scheduled snapshot period at 10 second(s).
Jun 06, 2016 12:32:32 PM
org.apache.hadoop.metrics2.impl.MetricsSystemImpl start
INFO: phoenix metrics system started
Jun 06, 2016 12:32:34 PM org.apache.hadoop.conf.Configuration
warnOnceIfDeprecated
INFO: hadoop.native.lib is deprecated. Instead, use
io.native.lib.available
Jun 06, 2016 12:32:37 PM
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper <init>
INFO: Process identifier=hconnection-0x2b55d383 connecting to ZooKeeper
ensemble=nykdsr000003239.intranet.barcapint.com:2181,nykdsr000003238.intranet.barcapint.com:2181,nykdsr000003240.intranet.barcapint.com:2181
Jun 06, 2016 12:32:37 PM
org.apache.zookeeper.ZooKeeper <init>
INFO: Initiating client
connection, connectString=nykdsr000003239.intranet.barcapint.com:2181,nykdsr000003238.intranet.barcapint.com:2181,nykdsr000003240.intranet.barcapint.com:2181
sessionTimeout=60000 watcher=hconnection-0x2b55d3830x0,
quorum=nykdsr000003239.intranet.barcapint.com:2181,nykdsr000003238.intranet.barcapint.com:2181,nykdsr000003240.intranet.barcapint.com:2181,
baseZNode=/hbase
Jun 06, 2016 12:32:37 PM org.apache.zookeeper.ClientCnxn$SendThread
logStartConnect
INFO: Opening socket connection to server
nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181. Will not attempt to
authenticate using SASL (unknown error)
Jun 06, 2016 12:32:37 PM org.apache.zookeeper.ClientCnxn$SendThread
primeConnection
INFO: Socket connection established, initiating session, client:
/10.173.76.85:52749, server: nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181
Jun 06, 2016 12:32:37 PM org.apache.zookeeper.ClientCnxn$SendThread
onConnected
INFO: Session establishment complete on server
nykdsr000003239.intranet.barcapint.com/10.60.72.79:2181, sessionid =
0x2552005d31512ad, negotiated timeout = 60000
Jun 06, 2016 12:33:47 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=10,
retries=35, started=68364 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
Jun 06, 2016 12:34:07 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=11,
retries=35, started=88541 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
Jun 06, 2016 12:34:27 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=12,
retries=35, started=108617 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
Jun 06, 2016 12:34:47 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=13,
retries=35, started=128796 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
Jun 06, 2016 12:35:07 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=14,
retries=35, started=148878 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
Jun 06, 2016 12:35:28 PM
org.apache.hadoop.hbase.client.RpcRetryingCaller callWithRetries
INFO: Call exception, tries=15,
retries=35, started=169088 ms ago, cancelled=false, msg=row '' on table
'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1444218102919.828f7e39c3ea24467feb470c07ca9e84.,
hostname=nykdsr000003241.intranet.barcapint.com,60020,1452857807728, seqNum=311
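For reference, since the client keeps retrying against the SYSTEM.CATALOG region, this is how I have been checking whether that region is actually being served (a sketch; run from a node with the HBase client configured):
# Confirm the Phoenix SYSTEM.CATALOG table is online and readable
echo "scan 'SYSTEM.CATALOG', {LIMIT => 1}" | hbase shell
# Check overall region server status from the master
echo "status 'simple'" | hbase shell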
... View more
Labels:
04-25-2016
05:28 PM
Hi Sunile,
Yes, I am trying it on Hive queries. It's on Cloudera, updated from 5.4 to 5.5.2 recently.
I read some blogs online and found that slowness sometimes occurs after restarting the cluster.
I didn't find anything in the error logs. In my experience, slowness is a very common issue. It would be very helpful if you could share some of your experience with slow-running jobs: what points should be kept in mind when facing a slowness issue?
... View more
03-14-2016
09:34 PM
1 Kudo
HBase jobs are failing with the below error: INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed! [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: AttemptID:attempt_1xyz Info:Container killed by the ApplicationMaster. Container killed on request. Exit code is 143. Container exited with a non-zero exit code 143.
The HBase loading script is failing. The main issue seems to be HBase hotspotting: the cluster is heavily used on a few region servers (which are doing all of the loading while other jobs run in parallel). As a result I am getting HBase session timeouts (the timeout is set to 60 sec), map tasks sometimes fail depending on cluster usage, and the whole job fails if it hits too many timeouts.
Can this issue be resolved by:
1) Setting the HBase session timeout to 120 sec?
2) Increasing the number of region servers? If yes, will this help resolve the issue? (A rough sketch of what I plan to check and change is below.)
Additional: an overview of region servers would be helpful. Please share your views.
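For reference, here is roughly what I am planning to run and change (a sketch; the timeout property name is taken from standard HBase configuration, and the value would go into hbase-site.xml on clients and region servers):
# See how many regions and requests each region server is handling, to confirm hotspotting
echo "status 'detailed'" | hbase shell
# Proposed change: raise zookeeper.session.timeout from 60000 ms to 120000 ms in hbase-site.xml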
... View more
Labels:
03-03-2016
04:13 PM
2 Kudos
We have an issue in one of our environments where the Reports Manager keeps going down with GC pauses; this scenario persists even after increasing the heap twice for this service. I am keen to understand which factors contribute to this service's load and lead to such a scenario. Also, is increasing the heap the only resort, or can the GC algorithm it uses by default be revisited? Please let me know your thoughts.
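For reference, these are the kind of generic JVM options I am thinking of adding to the service's Java options to revisit the collector and capture evidence (standard HotSpot flags, not a Cloudera-specific recipe; the heap size and log path are placeholders):
-Xmx8g -XX:+UseG1GC -XX:MaxGCPauseMillis=500
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/reports-manager-gc.log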
... View more
02-24-2016
09:26 AM
It seems like the replication factor is 1 in my case. How do I recover the data from the DR cluster?
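For context, this is roughly how I would expect to copy the affected paths back (a sketch only; the NameNode hosts and paths are placeholders for our environment):
# Copy the lost paths from the DR cluster back to the primary cluster
hadoop distcp hdfs://dr-nn.example.com:8020/data/affected_path hdfs://prod-nn.example.com:8020/data/affected_path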
... View more
02-23-2016
09:12 AM
3 Kudos
In my HDFS status summary, I see the following messages about missing and under-replicated blocks: 2,114 missing blocks in the cluster. 5,114,551 total blocks in the cluster. Percentage missing blocks: 0.04%. Critical threshold: any. On executing the command hdfs fsck -list-corruptfileblocks, I got the following output: The filesystem under path '/' has 2114 CORRUPT files. What is the best way to fix these corrupt files and also fix the under-replicated block problem?
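For reference, these are the commands I am planning to use to investigate (a sketch; the paths are placeholders, and -delete permanently removes the corrupt files, so it would only be run after restoring the data or accepting the loss):
# List the corrupt files and see where their blocks were supposed to live
hdfs fsck / -list-corruptfileblocks
hdfs fsck /data/affected_path -files -blocks -locations
# Remove files whose blocks are unrecoverable (only after restore or sign-off)
hdfs fsck / -delete
# Re-replicate under-replicated files back to the default factor of 3
hdfs dfs -setrep -w 3 /data/affected_path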
... View more
- Tags:
- Hadoop Core
- HDFS
Labels:
02-16-2016
10:38 PM
Yes, it is a prod environment.
... View more
02-16-2016
09:53 PM
@Neeraj Sabharwal Yes, the system is running out of space. Can you please suggest a better way than creating a soft link?
... View more
02-16-2016
01:13 AM
@Neeraj Sabharwal Can you please help us out here by providing an example, if this lies within your scope?
... View more
02-16-2016
01:09 AM
@Neeraj Sabharwal Yes, I was thinking of moving the data; deletion is outside the boundaries of my role.
... View more
02-15-2016
11:55 PM
1 Kudo
@Neeraj Sabharwal I got the logs. It seems like it is related to a memory issue. Unfortunately, I don't have permission to delete it.
Can I create a soft link to move the data around as a workaround?
Can you please advise: if I create a soft link for any lib, will it move the present data, the upcoming data, or both?
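For reference, this is the pattern I have in mind (a sketch only; the source and target directories are placeholders, and the service writing to the directory would be stopped first):
# Move the existing contents to a larger volume, then point the old path at the new location
mv /var/lib/example_dir /data01/example_dir
ln -s /data01/example_dir /var/lib/example_dir
# The existing data now lives on /data01, and new writes through the old path follow the symlink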
... View more
02-15-2016
11:26 PM
1 Kudo
@Neeraj Sabharwal Thanks for your assistance. I found the above details in the log at the instant the warning was generated; after that the service went down. Apart from this, nothing more was in the log. Also, the memory usage is just below the critical threshold: we set the critical limit at 90% and right now it is at 89.4%. That is why I asked whether it is related to a memory issue.
... View more
02-15-2016
11:02 PM
3 Kudos
There are certain times when we need to change the priority of Hadoop jobs. Due to business criticality, we want some jobs to have high priority and some jobs to have low priority, so that the important jobs are completed early. If the Hadoop cluster is using the Capacity Scheduler with priorities enabled for queues, then we can set the priority of our Hadoop jobs. This article explains how to set and how to change the priority of Hadoop jobs.
1) Set the priority in a MapReduce program:
In a MapReduce program we can set the job priority in the following way:
Configuration conf = new Configuration();
// set the priority to VERY_HIGH
conf.set("mapred.job.priority", JobPriority.VERY_HIGH.toString());
Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
2) Set the priority in a Pig program:
We can set the priority of a Pig job using the property job.priority.
For example: grunt> SET job.priority 'high'
If you are setting the priority in a Pig script, write this property before the LOAD statement. For example:
SET job.priority 'high';
A = LOAD '/user/hdfs/myfile.txt' USING PigStorage() AS (ID, Name);
Acceptable priority values are: very_low, low, normal, high, very_high. Please note these values are case insensitive.
3) Set the priority for a Hive query:
In Hive we can set the job priority using the following property, before running the query:
SET mapred.job.priority=VERY_HIGH;
Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
Note that mapred.job.priority is deprecated; the new property is mapreduce.job.priority.
We can also change the priority of running Hadoop jobs.
Usage: hadoop job -set-priority job-id priority
For example: hadoop job -set-priority job_20120111540_54485 VERY_HIGH
Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
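As a quick companion to the above, this is roughly how the same thing looks from the command line on newer releases (a sketch; the jar, class, paths and job id are placeholders, and -D only takes effect for jobs that use ToolRunner/GenericOptionsParser):
# Submit a job with high priority using the newer property name
hadoop jar my-app.jar com.example.MyJob -D mapreduce.job.priority=VERY_HIGH /input /output
# Change the priority of an already running job
mapred job -set-priority job_1453012345678_0042 VERY_HIGH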
... View more
- Find more articles tagged with:
- hadoop
- How-ToTutorial
- jobs
- Security
Labels:
02-15-2016
10:50 PM
1 Kudo
Error starting NodeManager java.lang.NullPointerException at
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:289) at
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:252) at
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:235) at
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:250) at
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:445) at
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492)
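For reference, the NPE is thrown while the NodeManager is recovering container state during startup (ContainerManagerImpl.recover), so the first thing I plan to look at is the recovery state directory (property name from standard YARN configuration; the config and data paths are placeholders for our layout):
# Find where the NodeManager keeps its recovery state
grep -A 2 "yarn.nodemanager.recovery" /etc/hadoop/conf/yarn-site.xml
# Inspect that directory; moving it aside forces a clean start, at the cost of losing recovery of running containers
ls -l /var/lib/hadoop-yarn/yarn-nm-recovery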
... View more
Labels:
02-15-2016
07:42 AM
@Neeraj Sabharwal Thanks for your assistance.
... View more
02-14-2016
09:45 AM
1 Kudo
Error : javax.security.sasl.SaslException: GSS initiate failed
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed
to find any Kerberos tgt)] at
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect
This warning message is generated every 5 seconds. Can somebody help me with this? As per my investigation it seems to be a licence issue.
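For reference, this is how I have been checking for the missing Kerberos ticket the message refers to (a sketch; the keytab path and principal are placeholders for our environment):
# Check whether the process user currently holds a valid TGT
klist
# Obtain a fresh ticket from the service keytab, then re-check
kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/host.example.com@EXAMPLE.COM
klist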
... View more
Labels:
02-14-2016
09:32 AM
1 Kudo
Apache Kafka is a high-throughput distributed messaging system developed by LinkedIn. Kafka is a distributed, partitioned commit log service that provides the functionality of a messaging system with a unique design. It is written in Scala and does not follow the JMS (Java Message Service) standards. The best way to learn about Kafka is to read the original design page at http://kafka.apache.org/. That will give you an overview of the motivation behind the design choices and what makes Kafka efficient. It is also a very engaging read if you are interested in systems. In terms of adoption, Kafka is currently used in production at LinkedIn, Twitter, Tumblr, Square and a number of other companies. You can read about the use cases those companies found for Kafka here: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
... View more
02-13-2016
12:02 PM
2 Kudos
Since Hadoop gives precedence to the delegation tokens, we must make sure we log in as a different user, get new tokens, and replace the old ones in the current user's credentials cache, to avoid not being able to get new ones. This may help.
... View more
02-13-2016
11:53 AM
@Roberto Sancho NIS seems to be a workaround, but I don't find it secure. You can read about it at "http://aput.net/~jheiss/krbldap/howto.html".
But I would suggest going with Kerberos.
... View more
02-13-2016
10:16 AM
1 Kudo
We know Hadoop is used in a clustered environment: each cluster has multiple racks, and each rack has multiple DataNodes. So to make HDFS fault tolerant in your cluster you need to consider the following failures: DataNode failure and rack failure. The chance of a whole-cluster failure is fairly low, so let's not worry about it. In the above cases you need to make sure that: if one DataNode fails, you can get the same data from another DataNode; and if an entire rack fails, you can get the same data from another rack. That is why I think the default replication factor is set to 3: no 2 replicas go to the same DataNode, and at least 1 replica goes to a different rack, fulfilling the above-mentioned fault-tolerance criteria. Hope this helps.
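To make this concrete, these are the commands I usually use to see how the replicas of a file are spread across DataNodes and racks, and to adjust the replication factor (a sketch; the path is a placeholder):
# Show the blocks of a file along with the DataNodes and racks holding each replica
hdfs fsck /user/hdfs/myfile.txt -files -blocks -locations -racks
# Raise or lower the replication factor of an existing path to 3
hdfs dfs -setrep -w 3 /user/hdfs/myfile.txt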
... View more