Member since: 04-20-2016
Posts: 86
Kudos Received: 27
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 935 | 03-13-2017 04:06 AM
 | 986 | 03-09-2017 01:55 PM
 | 305 | 01-05-2017 02:13 PM
 | 1274 | 12-29-2016 05:43 PM
 | 1354 | 12-28-2016 11:03 PM
10-17-2018
12:11 PM
1 Kudo
Great article !!!
10-17-2018
12:11 PM
@PJ Since you have Ranger enabled, it's possible that your permission is being denied on the Ranger side. I would definitely check the Ranger audit logs for any events for the user and see if we are hitting a permission denied there. Once I had validated that it was Ranger blocking the permissions, I would also add a Ranger HDFS policy to allow user user1 write access to /user/user1/sparkeventlogs.
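A quick hedged check (the _perm_test file name is just an illustration) would be to attempt a write as user1 and then watch the Ranger audit page for the corresponding deny event:
$ su - user1 -c "hdfs dfs -touchz /user/user1/sparkeventlogs/_perm_test"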
10-17-2018
12:04 PM
@Dukool SHarma
Safe mode is a NameNode state in which the node doesn’t accept any changes to the HDFS namespace, meaning HDFS will be in a read-only state. Safe mode is entered automatically at NameNode startup, and the NameNode leaves safe mode automatically when the configured minimum percentage of blocks satisfies the minimum replication condition.
When you start up the NameNode, it doesn’t start replicating data to the DataNodes right away. The NameNode first automatically enters a special read-only state of operation called safe mode. In this mode, the NameNode doesn’t honor any requests to make changes to its namespace. Thus, it refrains from replicating, or even deleting, any data blocks until it leaves the safe mode.
The DataNodes continuously send two things to the NameNode: a heartbeat indicating they're alive and well, and a block report listing all the data blocks stored on that DataNode. Hadoop considers a data block "safely" replicated once the NameNode receives enough block reports from the DataNodes indicating they hold the minimum number of replicas of that block. Making the NameNode wait for these block reports keeps it from prematurely re-replicating blocks that already have the correct number of replicas on DataNodes that simply haven't reported their block information yet.
When a preconfigured percentage of blocks are reported as safely replicated, the NameNode leaves the safe mode and starts serving block information to clients. It’ll also start replicating all blocks that the DataNodes have reported as being under replicated.
Use the dfsadmin -safemode command to manage safe mode operations for the NameNode. You can check the current safe mode status with the -safemode get command:
$ hdfs dfsadmin -safemode get
Safe mode is OFF in hadoop01.localhost/10.192.2.21:8020
Safe mode is OFF in hadoop02.localhost/10.192.2.22:8020
You can place the NameNode in safe mode with the -safemode enter command:
$ hdfs dfsadmin -safemode enter
Safe mode is ON in hadoop01.localhost/10.192.2.21:8020
Safe mode is ON in hadoop02.localhost/10.192.2.22:8020
Finally, you can take the NameNode out of safe mode with the -safemode leave command:
$ hdfs dfsadmin -safemode leave
Safe mode is OFF in hadoop01.localhost/10.192.2.21:8020
Safe mode is OFF in hadoop02.localhost/10.192.2.22:8020
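If you are scripting around a NameNode restart, there is also a -safemode wait option that simply blocks until the NameNode has left safe mode:
$ hdfs dfsadmin -safemode wait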
08-10-2017
12:49 PM
Are we closing the Spark context here? Usually once a ".close()" call is made, the JVM should be able to clean up those directories.
08-10-2017
12:42 PM
Ideally this should be getting picked up from DNS or the /etc/hosts files. Considering you have 5 nodes, can you add these entries to your /etc/hosts files and try it again?
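As a hedged illustration only (the IPs and hostnames below are placeholders, not your actual nodes), the /etc/hosts entries on each node would look something like:
10.0.0.11   node1.example.com   node1
10.0.0.12   node2.example.com   node2
10.0.0.13   node3.example.com   node3
10.0.0.14   node4.example.com   node4
10.0.0.15   node5.example.com   node5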
04-04-2017
02:52 PM
@Garima Verma You have not given the stack trace here, so folks will not really know how to address this clearly unless it is provided. But given the explanation that was provided, I would suggest trying to pass the given XML with "--files" to the spark-submit command and then trying again.
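A minimal sketch of that spark-submit invocation (the class, jar, and XML file names are placeholders for your own):
$ spark-submit --class com.example.MyApp --master yarn --files /path/to/my-config.xml my-app.jar
Files shipped with --files are placed in the working directory of each executor, so the application can open them by their plain file name.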
04-04-2017
02:47 PM
@Nikhil Pawar One thing you could do here is to increase "spark.executor.heartbeatInterval", which defaults to 10 seconds, to something higher and test it out. Also worth doing is reviewing the executor logs to see whether you have any OOM / GC issues while the executors are running the jobs you kick off from Spark.
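As a hedged example (the 60s value is just an illustration; tune it for your workload), the setting can be passed directly on the command line:
$ spark-submit --conf spark.executor.heartbeatInterval=60s <your usual arguments>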
03-14-2017
01:40 PM
@Jeff Watson Can you give us the spark-submit command and also attach the console output here for us to check?
03-13-2017
06:34 PM
Can you check what "io.file.buffer.size" is set to here? You may need to tweak it so that it stays below what "MAX_PACKET_SIZE" is set to. Referencing a great blog post here (http://johnjianfang.blogspot.com/2014/10/hadoop-two-file-buffer-size.html). For example, take a look at the BlockSender in HDFS:
class BlockSender implements java.io.Closeable {
  /**
   * Minimum buffer used while sending data to clients. Used only if
   * transferTo() is enabled. 64KB is not that large. It could be larger, but
   * not sure if there will be much more improvement.
   */
  private static final int MIN_BUFFER_WITH_TRANSFERTO = 64*1024;
  private static final int TRANSFERTO_BUFFER_SIZE = Math.max(
      HdfsConstants.IO_FILE_BUFFER_SIZE, MIN_BUFFER_WITH_TRANSFERTO);
}
The BlockSender uses "io.file.buffer.size" as the transfer buffer size. If this parameter is not defined, the default buffer size of 64KB is used. The above explains why most Hadoop IOs were either 4K or 64K chunks in my friend's cluster, since he did not tune the cluster. To achieve better performance, we should tune "io.file.buffer.size" to a much bigger value, for example, up to 16MB. The upper limit is set by MAX_PACKET_SIZE in org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.
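A quick hedged way to see what the value resolves to on your cluster:
$ hdfs getconf -confKey io.file.buffer.size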
03-13-2017
06:19 PM
Try running it in debug mode and then provide the output here. For the Hive CLI you could do it as below:
hive --hiveconf hive.root.logger=DEBUG,console
Once done, re-run the query and see where it fails. That should give you better insight into the failure here.
03-13-2017
06:16 PM
@Saikiran Parepally Please accept the answer if that has helped to resolve the issue
03-13-2017
04:12 AM
@Saikiran Parepally Did that fix the issue here?
03-13-2017
04:06 AM
1 Kudo
@nbalaji-elangovan
Copy/symlink the hbase-site.xml under /etc/spark/conf as below:
ln -s /etc/hbase/conf/hbase-site.xml /etc/spark/conf/hbase-site.xml
Once done, execute the spark-submit as you had done earlier and try again.
03-09-2017
10:13 PM
@Saikiran Parepally It would be better to follow this doc here to review your settings: http://crazyadmins.com/setup-cross-realm-trust-two-mit-kdc/
03-09-2017
01:55 PM
@Saikiran Parepally Also, along with Josh's suggestion, please check on the cross-realm trust setup here. Refer to the documentation below: https://community.hortonworks.com/articles/18686/kerberos-cross-realm-trust-for-distcp.html You will need to have the cross-realm trust set up correctly to perform the copyTable across two secure clusters.
01-27-2017
08:17 PM
@Sami Ahmad Please restart your YARN services on the cluster for the changes to take effect.
01-26-2017
03:34 PM
Please accept the answer by Sandeep so that this question can be marked as addressed.
01-23-2017
11:42 AM
1 Kudo
Second the answer by @mkumar. Also check out the link below; it's a good write-up I found that goes into much more detail: http://wanwenli.com/kafka/2016/11/04/Kafka-Group-Coordinator.html
01-20-2017
10:56 PM
@Joan Viladrosa No, I don't think we can do that.
01-20-2017
10:07 PM
@Saikiran Parepally The HBase Thrift interface doesn't have any built-in load balancing. So the recommendation is to handle load balancing with external tools such as DNS round-robin, a virtual IP address, or in code.
01-20-2017
09:34 PM
Please check from the YARN RM UI why the application master failed. The AM logs would give you an indication, based on the exceptions, of what caused the failures. Once that is identified, you can proceed to fix it.
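As a hedged example (the application ID below is a placeholder for your own), you can also pull the aggregated AM/container logs from the command line:
$ yarn logs -applicationId application_1234567890123_0001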
01-20-2017
07:33 PM
2 Kudos
Please set the value for hbase.hregion.majorcompaction to "0":
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
This will disable automatic major compactions, and you can trigger them manually during off-peak hours. Make sure to restart the HBase services for this to take effect.
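A hedged example of triggering the major compaction manually from the HBase shell (the table name is a placeholder for your own):
hbase> major_compact 'my_table'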
01-13-2017
05:30 PM
1 Kudo
@Jason Morse I don't think we have a solution yet. See the JIRA below which was raised for this; it's still "unresolved": https://issues.cloudera.org/browse/HUE-2738
01-13-2017
05:27 PM
Check the HS2 logs. They should indicate why HS2 is going down when Hue is unable to connect to it.
01-05-2017
02:25 PM
@yong yang Please refer to the link below: http://hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/
01-05-2017
02:22 PM
@Sanjeev This is to ensure that when the job is submitted to the cluster, the resources needed to run it (the job jar files, config files, and the computed input splits) are propagated to the cluster nodes, so that there are plenty of copies across the cluster for the NodeManagers to access when they execute the tasks for the job. This is just to ensure that we have redundancy for the job resources when the tasks are executed. It should be OK to set this to a high value in such a big cluster.
01-05-2017
02:13 PM
Back up the directory "/data/hadoop/oozie/data/oozie-db" to another location and then remove it using the command:
rm -rf /data/hadoop/oozie/data/oozie-db
Then restart Oozie from Ambari and try again.
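A hedged example of the backup step (the .bak destination is just an illustration; use whatever location suits you):
$ cp -a /data/hadoop/oozie/data/oozie-db /data/hadoop/oozie/data/oozie-db.bak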
01-05-2017
12:28 PM
You are probably hitting the issue described in the Hue documentation: http://gethue.com/hbase-browsing-with-doas-impersonation-and-kerberos/
If you are getting this error:
Caused by: org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
You are very probably hitting https://issues.apache.org/jira/browse/HBASE-13069. Also make sure the HTTP/_HOST principal is in the keytab used by the HBase Thrift Server. Beware that as a follow-up you might get https://issues.apache.org/jira/browse/HBASE-14471.
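A hedged way to confirm the principal is present (the keytab path below is a placeholder; use whichever keytab your HBase Thrift Server is actually configured with):
$ klist -kt /etc/security/keytabs/hbase.service.keytab | grep HTTP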
01-05-2017
12:15 PM
Your Spark job is failing with the below exceptions:
17/01/04 11:05:28 INFO Client: client token: N/A diagnostics: Application application_1483479696331_0001 failed 2 times due to AM Container for appattempt_1483479696331_0001_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://hadoop2.example.com:8088/cluster/app/application_1483479696331_0001 Then click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
java.io.FileNotFoundException: File does not exist: hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
And looking at the earlier stack, it tries to upload the file over to HDFS as shown below:
17/01/04 11:04:55 INFO Client: Uploading resource file:/usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip -> hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
As you are running the Spark job as the "admin" user, can you check if you have access / permissions to write into that HDFS path? Can you try the same job as the "hive" user and check?
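A hedged way to check the permissions on that staging path from the command line, run as the same user that submits the job:
$ hdfs dfs -ls /user/hive
$ hdfs dfs -ls -d /user/hive/.sparkStaging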
01-03-2017
03:36 PM
@Ward Bekker Looks like the HBase master is not up on the node. Can you check if the HBase master is up and running on the node? You can do the below to check the HBase master process:
1. lsof -i:16000
2. netstat -anp | grep 16000 | grep LISTEN
If you do not see any process listening on this port, then bounce the HBase master process and see if the service comes up and you are able to pull up and access the HBase master UI.