Member since: 04-20-2016
Posts: 86
Kudos Received: 27
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 935 | 03-13-2017 04:06 AM
 | 986 | 03-09-2017 01:55 PM
 | 305 | 01-05-2017 02:13 PM
 | 1274 | 12-29-2016 05:43 PM
 | 1354 | 12-28-2016 11:03 PM
10-17-2018
12:11 PM
1 Kudo
Great article !!!
10-17-2018
12:11 PM
@PJ Since you have Ranger enabled, it's possible that your permission is being denied on the Ranger side. I would definitely check the Ranger audit logs for any events for the user and see if we are hitting a permission denied there. Once I had validated that it was Ranger blocking the permissions, I would also add a Ranger HDFS policy to allow user user1 write access to /user/user1/sparkeventlogs.
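A quick hedged check (the _perm_test file name is just an illustration) would be to attempt a write as user1 and then watch the Ranger audit page for the corresponding deny event:
$ su - user1 -c "hdfs dfs -touchz /user/user1/sparkeventlogs/_perm_test"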
10-17-2018
12:04 PM
@Dukool SHarma
Safe mode is a NameNode state in which the node doesn’t accept any changes to the HDFS namespace, meaning HDFS will be in a read-only state. Safe mode is entered automatically at NameNode startup, and the NameNode leaves safe mode automatically when the configured minimum percentage of blocks satisfies the minimum replication condition.
When you start up the NameNode, it doesn’t start replicating data to the DataNodes right away. The NameNode first automatically enters a special read-only state of operation called safe mode. In this mode, the NameNode doesn’t honor any requests to make changes to its namespace. Thus, it refrains from replicating, or even deleting, any data blocks until it leaves the safe mode.
The DataNodes continuously send two things to the NameNode: a heartbeat indicating they're alive and well, and a block report listing all the data blocks stored on that DataNode. Hadoop considers a data block "safely" replicated once the NameNode receives enough block reports from the DataNodes indicating they hold the minimum number of replicas of that block. Making the NameNode wait for these block reports keeps it from prematurely re-replicating blocks that already have the correct number of replicas on DataNodes that simply haven't reported their block information yet.
When a preconfigured percentage of blocks are reported as safely replicated, the NameNode leaves the safe mode and starts serving block information to clients. It’ll also start replicating all blocks that the DataNodes have reported as being under replicated.
Use the dfsadmin -safemode command to manage safe mode operations for the NameNode. You can check the current safe mode status with the -safemode get command:
$ hdfs dfsadmin -safemode get
Safe mode is OFF in hadoop01.localhost/10.192.2.21:8020
Safe mode is OFF in hadoop02.localhost/10.192.2.22:8020
You can place the NameNode in safe mode with the -safemode enter command:
$ hdfs dfsadmin -safemode enter
Safe mode is ON in hadoop01.localhost/10.192.2.21:8020
Safe mode is ON in hadoop02.localhost/10.192.2.22:8020
Finally, you can take the NameNode out of safe mode with the -safemode leave command:
$ hdfs dfsadmin -safemode leave
Safe mode is OFF in hadoop01.localhost/10.192.2.21:8020
Safe mode is OFF in hadoop02.localhost/10.192.2.22:8020
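If you are scripting around a NameNode restart, there is also a -safemode wait option that simply blocks until the NameNode has left safe mode:
$ hdfs dfsadmin -safemode wait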
08-10-2017
12:49 PM
Are we closing the Spark context here? Usually once a ".close()" call is made, the JVM should be able to clean up those directories.
08-10-2017
12:42 PM
Ideally this should be getting picked up from DNS or the /etc/hosts files. Considering you have 5 nodes, can you add these entries to your /etc/hosts files and try it again?
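As a hedged illustration only (the IPs and hostnames below are placeholders, not your actual nodes), the /etc/hosts entries on each node would look something like:
10.0.0.11   node1.example.com   node1
10.0.0.12   node2.example.com   node2
10.0.0.13   node3.example.com   node3
10.0.0.14   node4.example.com   node4
10.0.0.15   node5.example.com   node5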
04-04-2017
02:52 PM
@Garima Verma You have not given the stack trace here, so folks will not really know how to address this clearly unless it is provided. But given the explanation that was provided, I would suggest trying to pass the given XML with "--files" to the spark-submit command and then trying again.
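A minimal sketch of that spark-submit invocation (the class, jar, and XML file names are placeholders for your own):
$ spark-submit --class com.example.MyApp --master yarn --files /path/to/my-config.xml my-app.jar
Files shipped with --files are placed in the working directory of each executor, so the application can open them by their plain file name.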
04-04-2017
02:47 PM
@Nikhil Pawar One thing you could do here is to increase "spark.executor.heartbeatInterval", which defaults to 10 seconds, to something higher and test it out. Also worth doing is reviewing the executor logs to see whether you have any OOM / GC issues while the executors are running the jobs you kick off from Spark.
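As a hedged example (the 60s value is just an illustration; tune it for your workload), the setting can be passed directly on the command line:
$ spark-submit --conf spark.executor.heartbeatInterval=60s <your usual arguments>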
03-14-2017
01:40 PM
@Jeff Watson Can you give us the spark-submit command and also attach the console output here for us to check?
03-13-2017
06:34 PM
Can you check what "io.file.buffer.size" is set to here? You may need to tweak it so that it stays below what "MAX_PACKET_SIZE" is set to. Referencing a great blog post here (http://johnjianfang.blogspot.com/2014/10/hadoop-two-file-buffer-size.html). For example, take a look at the BlockSender in HDFS:
class BlockSender implements java.io.Closeable {
  /**
   * Minimum buffer used while sending data to clients. Used only if
   * transferTo() is enabled. 64KB is not that large. It could be larger, but
   * not sure if there will be much more improvement.
   */
  private static final int MIN_BUFFER_WITH_TRANSFERTO = 64*1024;
  private static final int TRANSFERTO_BUFFER_SIZE = Math.max(
      HdfsConstants.IO_FILE_BUFFER_SIZE, MIN_BUFFER_WITH_TRANSFERTO);
}
The BlockSender uses "io.file.buffer.size" as the transfer buffer size. If this parameter is not defined, the default buffer size of 64KB is used. The above explains why most Hadoop IOs were either 4K or 64K chunks in my friend's cluster, since he did not tune the cluster. To achieve better performance, we should tune "io.file.buffer.size" to a much bigger value, for example, up to 16MB. The upper limit is set by MAX_PACKET_SIZE in org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.
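A quick hedged way to see what the value resolves to on your cluster:
$ hdfs getconf -confKey io.file.buffer.size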
03-13-2017
06:19 PM
Try running it in debug mode and then provide the output here. For the Hive CLI you could do it as below:
hive --hiveconf hive.root.logger=DEBUG,console
Once done, re-run the query and see where it fails. That should give you better insight into the failure here.
03-13-2017
06:16 PM
@Saikiran Parepally Please accept the answer if that has helped to resolve the issue
03-13-2017
04:12 AM
@Saikiran Parepally Did that fix the issue here?
03-13-2017
04:06 AM
1 Kudo
@nbalaji-elangovan
Copy/symlink the hbase-site.xml under /etc/spark/conf as below:
ln -s /etc/hbase/conf/hbase-site.xml /etc/spark/conf/hbase-site.xml
Once done, execute the spark-submit as you had done earlier and try again.
03-09-2017
10:13 PM
@Saikiran Parepally It would be better to follow this doc here to review your settings: http://crazyadmins.com/setup-cross-realm-trust-two-mit-kdc/
03-09-2017
01:55 PM
@Saikiran Parepally Also, along with Josh's suggestion, please check on the cross-realm trust setup here. Refer to the documentation below: https://community.hortonworks.com/articles/18686/kerberos-cross-realm-trust-for-distcp.html You will need to have the cross-realm trust set up correctly to perform the copyTable across two secure clusters.
01-27-2017
08:17 PM
@Sami Ahmad Please restart your YARN services on the cluster for the changes to take effect.
01-26-2017
03:34 PM
Please accept the answer by Sandeep so that this question can be marked as addressed.
01-23-2017
11:42 AM
1 Kudo
Second the answer by @mkumar. Also check out the link below; it's a good write-up I found that goes into much more detail: http://wanwenli.com/kafka/2016/11/04/Kafka-Group-Coordinator.html
01-20-2017
10:56 PM
@Joan Viladrosa No, I don't think we can do that.
01-20-2017
10:07 PM
@Saikiran Parepally The HBase Thrift interface doesn't have any built-in load balancing. So the recommendation is to handle load balancing with external tools such as DNS round-robin, a virtual IP address, or in code.
01-20-2017
09:34 PM
Please check from the YARN RM UI why the application master failed. The AM logs would give you an indication, based on the exceptions, of what caused the failures. Once that is identified, you can proceed to fix it.
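As a hedged example (the application ID below is a placeholder for your own), you can also pull the aggregated AM/container logs from the command line:
$ yarn logs -applicationId application_1234567890123_0001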
01-20-2017
07:33 PM
2 Kudos
Please set the value for hbase.hregion.majorcompaction to "0":
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
This will disable automatic major compactions, and you can trigger them manually during off-peak hours. Make sure to restart the HBase services for this to take effect.
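A hedged example of triggering the major compaction manually from the HBase shell (the table name is a placeholder for your own):
hbase> major_compact 'my_table'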
01-13-2017
05:30 PM
1 Kudo
@Jason Morse I don't think we have a solution yet. See the JIRA below which was raised for this; it's still "unresolved": https://issues.cloudera.org/browse/HUE-2738
01-13-2017
05:27 PM
Check the HS2 logs. They should indicate why HS2 is going down when Hue is unable to connect to it.
01-05-2017
02:25 PM
@yong yang Please refer to the link below: http://hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/
01-05-2017
02:22 PM
@Sanjeev This is to ensure that when the job is submitted to the cluster, the resources needed to run it (the job jar files, config files, and the computed input splits) are propagated to the cluster nodes, so that there are plenty of copies across the cluster for the NodeManagers to access when they execute the tasks for the job. This is just to ensure that we have redundancy for the job resources when the tasks are executed. It should be OK to set this to a high value in such a big cluster.
01-05-2017
02:13 PM
Back up the directory "/data/hadoop/oozie/data/oozie-db" to another location and then remove it using the command:
rm -rf /data/hadoop/oozie/data/oozie-db
Then restart Oozie from Ambari and try again.
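A hedged example of the backup step (the .bak destination is just an illustration; use whatever location suits you):
$ cp -a /data/hadoop/oozie/data/oozie-db /data/hadoop/oozie/data/oozie-db.bak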
01-05-2017
12:28 PM
You are probably hitting the issue described in the Hue documentation: http://gethue.com/hbase-browsing-with-doas-impersonation-and-kerberos/
If you are getting this error:
Caused by: org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
You are very probably hitting https://issues.apache.org/jira/browse/HBASE-13069. Also make sure the HTTP/_HOST principal is in the keytab used by the HBase Thrift Server. Beware that as a follow-up you might get https://issues.apache.org/jira/browse/HBASE-14471.
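A hedged way to confirm the principal is present (the keytab path below is a placeholder; use whichever keytab your HBase Thrift Server is actually configured with):
$ klist -kt /etc/security/keytabs/hbase.service.keytab | grep HTTP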
01-05-2017
12:15 PM
Your Spark job is failing with the below exceptions:
17/01/04 11:05:28 INFO Client: client token: N/A diagnostics: Application application_1483479696331_0001 failed 2 times due to AM Container for appattempt_1483479696331_0001_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://hadoop2.example.com:8088/cluster/app/application_1483479696331_0001 Then click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
java.io.FileNotFoundException: File does not exist: hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
And looking at the earlier stack, it tries to upload the file over to HDFS as shown below:
17/01/04 11:04:55 INFO Client: Uploading resource file:/usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip -> hdfs://hadoop1.example.com:8020/user/hive/.sparkStaging/application_1483479696331_0001/py4j-0.10.1-src.zip
As you are running the Spark job as the "admin" user, can you check if you have access / permissions to write into that HDFS path? Can you try the same job as the "hive" user and check?
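A hedged way to check the permissions on that staging path from the command line, run as the same user that submits the job:
$ hdfs dfs -ls /user/hive
$ hdfs dfs -ls -d /user/hive/.sparkStaging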
01-03-2017
03:36 PM
@Ward Bekker Looks like the HBase master is not up on the node. Can you check if the HBase master is up and running on the node? You can do the below to check the HBase master process:
1. lsof -i:16000
2. netstat -anp | grep 16000 | grep LISTEN
If you do not see any process listening on this port, then bounce the HBase master process and see if the service comes up and you are able to pull up and access the HBase master UI.