Member since 02-08-2016 · 793 Posts · 669 Kudos Received · 85 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3067 | 06-30-2017 05:30 PM |
| | 3988 | 06-30-2017 02:57 PM |
| | 3309 | 05-30-2017 07:00 AM |
| | 3884 | 01-20-2017 10:18 AM |
| | 8401 | 01-11-2017 02:11 PM |
05-26-2016
11:08 AM
@Raja Ray Can you check the application logs for any errors? If possible, can you attach a screenshot of the YARN web UI here? Also, from the link below I see that restarting YARN helps - http://stackoverflow.com/questions/33496491/high-availability-jobs-not-getting-submitted-immediately-after-name-node-fail
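A quick sketch of how to pull those logs and check recent applications, assuming YARN log aggregation is enabled (the application ID below is just a placeholder):
$yarn application -list -appStates ALL
$yarn logs -applicationId application_1463000000000_0001 | less
If nothing obvious shows up, restart the YARN service from Ambari (Services > YARN > Service Actions > Restart All) and retry the job submission.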
05-25-2016
04:05 PM
3 Kudos
@bigdata.neophyte Answers inline.
Q: How do snapshots help with Disaster Recovery? What are the best practices around using snapshots for DR, especially when the data is stored directly on HDFS, or is Hive data or HBase data?
A: Snapshots will not be the best option for DR.
Q: Can a directory be deleted using hdfs dfs -rmr -skipTrash /data/snapshot-dir? Or do all the snapshots have to be deleted first and snapshotting disabled before the directory can be deleted?
A: No. If the directory is snapshottable it cannot be deleted.
Q: As I understand it, no data is copied for snapshots; only metadata is maintained for the blocks added/modified/deleted. If that's the case, what happens when the command hdfs dfs -rm /data/snapshot-dir/file1 is run? Will the file be moved to the trash? If so, will the snapshot maintain the reference to the entry in the trash? Will trash eviction have any impact in this case?
A: If a file is deleted from a snapshottable directory, it is still kept under the .snapshot folder and you can copy the data back. For example:
[hdfs@node1 ~]$ hdfs dfs -rm -r -skipTrash /tmp/test/anaconda-ks.cfg
Deleted /tmp/test/anaconda-ks.cfg
[hdfs@node1 ~]$ hadoop fs -ls /tmp/test/.snapshot/s20160526-022510.203
-rw-r--r-- 3 hdfs hdfs 1155 2016-05-26 02:23 /tmp/test/.snapshot/s20160526-022510.203/anaconda-ks.cfg
[hdfs@node1 ~]$ hadoop fs -cp /tmp/test/.snapshot/s20160526-022510.203/anaconda-ks.cfg /tmp/test
You will be able to see the data back in the file.
Q: What happens when one of the sub-directories under the snapshot directory is deleted? For example, if the command hdfs dfs -rmr -skipTrash /data/sub-dir is run, can the data be recovered from snapshots?
A: If the subdirectory already existed when "hdfs dfsadmin -allowSnapshot" was run, then after it is deleted it will still be there in the ".snapshot" folder, e.g. "/user/test1/.snapshot/s20160526-025323.341/subdir/ambari.properties.2". If the subdirectory was created later, it will not be backed up.
Q: Can snapshots be deleted/archived automatically based on policies, for example time-based? In the above example, how long will the sub-dir data be maintained in the snapshot?
A: You can create a custom script to delete/archive snapshots based on your policies (see the sketch below). A snapshot is maintained until you delete it.
Q: How do snapshots work along with HDFS quotas? For example, assume a directory with a quota of 1 GB and snapshotting enabled. The directory is close to its full quota, and a user deletes a large file in order to store some other dataset. Will the new data be allowed to be saved to the directory, or will the operation be stopped because the quota limits have been exceeded? Apologies if some of the questions don't make sense, I am still trying to understand these concepts at a ground level.
A: It will allow you to exceed the quota mentioned; it will give a warning.
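For reference, a minimal sketch of the snapshot lifecycle commands plus a cron-style retention policy; the paths, snapshot names and retention period below are placeholders, and the policy itself is a custom script, not a built-in HDFS feature:
[hdfs@node1 ~]$ hdfs dfsadmin -allowSnapshot /data/snapshot-dir
[hdfs@node1 ~]$ hdfs dfs -createSnapshot /data/snapshot-dir s20160526-daily
[hdfs@node1 ~]$ hdfs lsSnapshottableDir
[hdfs@node1 ~]$ hdfs dfs -deleteSnapshot /data/snapshot-dir s20160526-daily
A time-based policy can then be as simple as two /etc/crontab entries run as the hdfs user - one creating a snapshot named after the current date and one deleting the snapshot from seven days ago:
0 1 * * * hdfs hdfs dfs -createSnapshot /data/snapshot-dir s$(date +\%Y\%m\%d)
0 2 * * * hdfs hdfs dfs -deleteSnapshot /data/snapshot-dir s$(date -d '7 days ago' +\%Y\%m\%d)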
05-24-2016
12:35 PM
1 Kudo
@mike harding Please check these tutorials for Ranger and Knox - https://github.com/seanorama/masterclass/tree/master/security-advanced and https://github.com/abajwa-hw/security-workshops. Hope that helps.
05-24-2016
10:13 AM
2 Kudos
@hari kiran There might be a network issue.
SYMPTOM:
Slow reading/writing of data to HDFS (for example with hdfs dfs -put).
ROOT CAUSE:
The message "Slow BlockReceiver write packet to mirror" is normally an indication that there is a problem with the underlying networking infrastructure.
SOLUTION:
Ensure at the OS level that the networking and NIC parameters are set up correctly:
- verify that the MTU value is set up as expected
- verify that the communication mode is correct (full duplex)
- verify that there are not too many errors at the interface level (dropped packets/overruns)
COMMANDS TO USE FOR DEBUGGING:
#dmesg <-- identify issues with NIC device driver
#ifconfig -a <-- MTU, errors (dropped packets/buffer overruns)
#ethtool ethX <-- identify/set speed, negotiation mode and duplex setting for the interface
In addition, running iperf between datanodes will highlight overall network transfer issues.
#iperf -s (server)
#iperf -c 10.1.1.1 -f m -d (client)
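To confirm how widespread the problem is, it also helps to check the DataNode logs and compare NIC settings across nodes; a sketch assuming a default HDP log location, with the interface name and the datanodes.txt host list as placeholders:
#grep -c "Slow BlockReceiver write packet to mirror" /var/log/hadoop/hdfs/hadoop-hdfs-datanode-*.log
#for h in $(cat datanodes.txt); do echo -n "$h: "; ssh $h "ip link show eth0 | grep -o 'mtu [0-9]*'"; done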
05-24-2016
09:23 AM
8 Kudos
Adding a service through Ambari gives an error as shown below -
[root@sandbox ~]# curl -u admin:admin -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services
HTTP/1.1 400 Bad Request
Set-Cookie: AMBARISESSIONID=qraouzksi4vktobhob5heqml;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 107
Server: Jetty(7.6.7.v20120910)
{
"status" : 400,
"message" : "CSRF protection is turned on. X-Requested-By HTTP header is required."
You need to disable CSRF protection as mentioned below -
1. Log in to the Ambari server host using the CLI [superuser credentials] and edit the properties file:
vi /etc/ambari-server/conf/ambari.properties
2. Add the below line at the bottom of the file:
api.csrfPrevention.enabled=false
3. Restart the Ambari server:
#ambari-server restart
4. Try executing the POST command again to add the service, and it should work:
[root@sandbox ~]# curl -u admin:admin -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services
HTTP/1.1 201 Created
Set-Cookie: AMBARISESSIONID=1t4c7yfbu64nw1nenrgplco7sd;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 0
Server: Jetty(7.6.7.v20120910)
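As an aside, the error message itself points to an alternative that does not require turning CSRF protection off: send the X-Requested-By header with the request (any non-empty value works). A sketch of the same call with the header added:
[root@sandbox ~]# curl -u admin:admin -H "X-Requested-By: ambari" -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services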
Thanks.
05-24-2016
08:43 AM
@Klaus Lucas I see the same issue here - https://community.hortonworks.com/questions/6259/target-replicas-is-10-but-found-3-replicas.html. Check if you get any useful info from there.
05-24-2016
07:38 AM
3 Kudos
@Vikas Gadade
After configuring NameNode HA you need to add the below settings in topology.xml. Please check - https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Knox_Gateway_Admin_Guide/content/configure_knox_for_webhdfs_ha.html
<provider>
   <role>ha</role>
   <name>HaProvider</name>
   <enabled>true</enabled>
   <param>
      <name>WEBHDFS</name>
      <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
   </param>
</provider>
<service>
   <role>WEBHDFS</role>
   <url>http://{host1}:50070/webhdfs</url>
   <url>http://{host2}:50070/webhdfs</url>
</service>
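Once the topology is updated, a quick way to verify HA WebHDFS access through Knox is a LISTSTATUS call via the gateway; the port, topology name and credentials below are assumptions for a default setup and should be replaced with your own:
#curl -iku admin:admin-password "https://localhost:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS"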
05-23-2016
07:13 PM
1 Kudo
@Muhammad Syamsudin
Right, as @Kuldeep Kulkarni mentioned. Please find a few more details at - http://hortonworks.com/blog/announcing-sandbox-2-0/
EASY ENABLEMENT OF HBASE AND AMBARI
On the About Sandbox page in the Sandbox, you'll find buttons to enable/disable Ambari and HBase. We've shipped the Sandbox with these disabled to keep the memory requirements at the existing 4 GB of physical RAM. In order to run Ambari or HBase you'll need to have 8 GB of physical RAM and you'll need to increase the memory allocation in your virtual machine to 4 GB. If you have enough physical RAM, go ahead and increase the memory allocation to the VM and you'll see a performance improvement.
05-23-2016
12:52 PM
@Andrey Nikitin Re-checking the above comment from @Kuldeep Kulkarni, I see that you started HiveServer2 using the CLI, and hence it is not Ambari-managed, which is why you still see HiveServer2 as stopped in Ambari. Can you stop/kill the HiveServer2 process from the CLI and then try to start the Hive service from Ambari? Use this -
$ps -aef | grep hiveserver2 <-- get the PID from this command
$kill -9 <pid-of-hiveserver2-process>
Then restart the Hive service from Ambari.
05-23-2016
11:33 AM
@Harini Yadav Please check this - Ranger always takes first precedence, and only then are POSIX permissions/HDFS ACLs evaluated. Also, setting "xasecure.add-hadoop-authorization" = false in ranger-hdfs-security.xml under /etc/hadoop/conf will stop the fallback to HDFS ACLs. Please check the below URLs for more details - http://hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/ https://community.hortonworks.com/questions/22054/should-we-disable-hdfs-default-acl-to-enable-range.html
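For reference, a sketch of how that property would look in ranger-hdfs-security.xml if edited by hand (on an Ambari-managed cluster the same change is normally made through the Ranger HDFS plugin configuration in Ambari so it is not overwritten on restart):
<property>
   <name>xasecure.add-hadoop-authorization</name>
   <value>false</value>
</property>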