Member since 02-08-2016 · 793 Posts · 669 Kudos Received · 85 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3067 | 06-30-2017 05:30 PM |
| | 3988 | 06-30-2017 02:57 PM |
| | 3309 | 05-30-2017 07:00 AM |
| | 3884 | 01-20-2017 10:18 AM |
| | 8401 | 01-11-2017 02:11 PM |
05-26-2016
11:08 AM
@Raja Ray Can you check the application logs for any errors? If possible, can you attach a screenshot of the YARN web UI here? Also, from the link below I see that restarting YARN helps - http://stackoverflow.com/questions/33496491/high-availability-jobs-not-getting-submitted-immediately-after-name-node-fail
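A quick sketch of how to pull those logs and check recent applications, assuming YARN log aggregation is enabled (the application ID below is just a placeholder):
$yarn application -list -appStates ALL
$yarn logs -applicationId application_1463000000000_0001 | less
If nothing obvious shows up, restart the YARN service from Ambari (Services > YARN > Service Actions > Restart All) and retry the job submission.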
05-25-2016
04:05 PM
3 Kudos
@bigdata.neophyte Answers inline.
Q: How do snapshots help with Disaster Recovery? What are the best practices around using snapshots for DR, especially when the data is stored directly on HDFS, or is Hive data or HBase data?
A: Snapshots will not be the best option for DR.
Q: Can a directory be deleted using hdfs dfs -rmr -skipTrash /data/snapshot-dir? Or do all the snapshots have to be deleted first and snapshotting disabled before the directory can be deleted?
A: No. If the directory is snapshottable it cannot be deleted.
Q: As I understand it, no data is copied for snapshots; only metadata is maintained for the blocks added/modified/deleted. If that's the case, what happens when the command hdfs dfs -rm /data/snapshot-dir/file1 is run? Will the file be moved to the trash? If so, will the snapshot maintain the reference to the entry in the trash? Will trash eviction have any impact in this case?
A: If a file is deleted from a snapshottable directory, it is still kept under the .snapshot folder and you can copy the data back. For example:
[hdfs@node1 ~]$ hdfs dfs -rm -r -skipTrash /tmp/test/anaconda-ks.cfg
Deleted /tmp/test/anaconda-ks.cfg
[hdfs@node1 ~]$ hadoop fs -ls /tmp/test/.snapshot/s20160526-022510.203
-rw-r--r-- 3 hdfs hdfs 1155 2016-05-26 02:23 /tmp/test/.snapshot/s20160526-022510.203/anaconda-ks.cfg
[hdfs@node1 ~]$ hadoop fs -cp /tmp/test/.snapshot/s20160526-022510.203/anaconda-ks.cfg /tmp/test
You will be able to see the data back in the file.
Q: What happens when one of the sub-directories under the snapshot directory is deleted? For example, if the command hdfs dfs -rmr -skipTrash /data/sub-dir is run, can the data be recovered from snapshots?
A: If the subdirectory already existed when "hdfs dfsadmin -allowSnapshot" was run, then after it is deleted it will still be there in the ".snapshot" folder, e.g. "/user/test1/.snapshot/s20160526-025323.341/subdir/ambari.properties.2". If the subdirectory was created later, it will not be backed up.
Q: Can snapshots be deleted/archived automatically based on policies, for example time-based? In the above example, how long will the sub-dir data be maintained in the snapshot?
A: You can create a custom script to delete/archive snapshots based on your policies (see the sketch below). A snapshot is maintained until you delete it.
Q: How do snapshots work along with HDFS quotas? For example, assume a directory with a quota of 1 GB and snapshotting enabled. The directory is close to its full quota, and a user deletes a large file in order to store some other dataset. Will the new data be allowed to be saved to the directory, or will the operation be stopped because the quota limits have been exceeded? Apologies if some of the questions don't make sense, I am still trying to understand these concepts at a ground level.
A: It will allow you to exceed the quota mentioned; it will give a warning.
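For reference, a minimal sketch of the snapshot lifecycle commands plus a cron-style retention policy; the paths, snapshot names and retention period below are placeholders, and the policy itself is a custom script, not a built-in HDFS feature:
[hdfs@node1 ~]$ hdfs dfsadmin -allowSnapshot /data/snapshot-dir
[hdfs@node1 ~]$ hdfs dfs -createSnapshot /data/snapshot-dir s20160526-daily
[hdfs@node1 ~]$ hdfs lsSnapshottableDir
[hdfs@node1 ~]$ hdfs dfs -deleteSnapshot /data/snapshot-dir s20160526-daily
A time-based policy can then be as simple as two /etc/crontab entries run as the hdfs user - one creating a snapshot named after the current date and one deleting the snapshot from seven days ago:
0 1 * * * hdfs hdfs dfs -createSnapshot /data/snapshot-dir s$(date +\%Y\%m\%d)
0 2 * * * hdfs hdfs dfs -deleteSnapshot /data/snapshot-dir s$(date -d '7 days ago' +\%Y\%m\%d)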
05-24-2016
12:35 PM
1 Kudo
@mike harding Please check these tutorials for Ranger and Knox - https://github.com/seanorama/masterclass/tree/master/security-advanced and https://github.com/abajwa-hw/security-workshops. Hope that helps.
05-24-2016
10:13 AM
2 Kudos
@hari kiran There might be a network issue.
SYMPTOM:
Slow reading/writing of data to HDFS (for example with hdfs dfs -put).
ROOT CAUSE:
The message "Slow BlockReceiver write packet to mirror" is normally an indication that there is a problem with the underlying networking infrastructure.
SOLUTION:
Ensure at the OS level that the networking and NIC parameters are set up correctly:
- verify that the MTU value is set up as expected
- verify that the communication mode is correct (full duplex)
- verify that there are not too many errors at the interface level (dropped packets/overruns)
COMMANDS TO USE FOR DEBUGGING:
#dmesg <-- identify issues with NIC device driver
#ifconfig -a <-- MTU, errors (dropped packets/buffer overruns)
#ethtool ethX <-- identify/set speed, negotiation mode and duplex setting for the interface
In addition, running iperf between datanodes will highlight overall network transfer issues.
#iperf -s (server)
#iperf -c 10.1.1.1 -f m -d (client)
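To confirm how widespread the problem is, it also helps to check the DataNode logs and compare NIC settings across nodes; a sketch assuming a default HDP log location, with the interface name and the datanodes.txt host list as placeholders:
#grep -c "Slow BlockReceiver write packet to mirror" /var/log/hadoop/hdfs/hadoop-hdfs-datanode-*.log
#for h in $(cat datanodes.txt); do echo -n "$h: "; ssh $h "ip link show eth0 | grep -o 'mtu [0-9]*'"; done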
05-24-2016
09:23 AM
8 Kudos
Adding a service through Ambari gives an error as shown below -
[root@sandbox ~]# curl -u admin:admin -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services
HTTP/1.1 400 Bad Request
Set-Cookie: AMBARISESSIONID=qraouzksi4vktobhob5heqml;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 107
Server: Jetty(7.6.7.v20120910)
{
"status" : 400,
"message" : "CSRF protection is turned on. X-Requested-By HTTP header is required."
You need to disable CSRF protection as mentioned below -
1. Log in to the Ambari server host using the CLI [superuser credentials] and edit the properties file:
vi /etc/ambari-server/conf/ambari.properties
2. Add the below line at the bottom of the file:
api.csrfPrevention.enabled=false
3. Restart the Ambari server:
#ambari-server restart
4. Try executing the POST command again to add the service, and it should work:
[root@sandbox ~]# curl -u admin:admin -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services
HTTP/1.1 201 Created
Set-Cookie: AMBARISESSIONID=1t4c7yfbu64nw1nenrgplco7sd;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 0
Server: Jetty(7.6.7.v20120910)
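As an aside, the error message itself points to an alternative that does not require turning CSRF protection off: send the X-Requested-By header with the request (any non-empty value works). A sketch of the same call with the header added:
[root@sandbox ~]# curl -u admin:admin -H "X-Requested-By: ambari" -i -X POST -d '{"ServiceInfo":{"service_name":"STORM"}}' http://xxx.xxx.xxx.xxx:8080/api/v1/clusters/Sandbox/services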
Thanks.
05-24-2016
08:43 AM
@Klaus Lucas I see the same issue here - https://community.hortonworks.com/questions/6259/target-replicas-is-10-but-found-3-replicas.html. Check if you get any useful info from there.
05-24-2016
07:38 AM
3 Kudos
@Vikas Gadade
After configuring NameNode HA you need to add the below settings in topology.xml. Please check - https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Knox_Gateway_Admin_Guide/content/configure_knox_for_webhdfs_ha.html
<provider>
   <role>ha</role>
   <name>HaProvider</name>
   <enabled>true</enabled>
   <param>
      <name>WEBHDFS</name>
      <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
   </param>
</provider>
<service>
   <role>WEBHDFS</role>
   <url>http://{host1}:50070/webhdfs</url>
   <url>http://{host2}:50070/webhdfs</url>
</service>
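Once the topology is updated, a quick way to verify HA WebHDFS access through Knox is a LISTSTATUS call via the gateway; the port, topology name and credentials below are assumptions for a default setup and should be replaced with your own:
#curl -iku admin:admin-password "https://localhost:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS"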
05-23-2016
07:13 PM
1 Kudo
@Muhammad Syamsudin
Right, as @Kuldeep Kulkarni mentioned. Please find a few more details at - http://hortonworks.com/blog/announcing-sandbox-2-0/
EASY ENABLEMENT OF HBASE AND AMBARI
On the About Sandbox page in the Sandbox, you'll find buttons to enable/disable Ambari and HBase. We've shipped the Sandbox with these disabled to keep the memory requirements at the existing 4 GB of physical RAM. In order to run Ambari or HBase you'll need to have 8 GB of physical RAM and you'll need to increase the memory allocation in your virtual machine to 4 GB. If you have enough physical RAM, go ahead and increase the memory allocation to the VM and you'll see a performance improvement.
05-23-2016
12:52 PM
@Andrey Nikitin Re-checking the above comment from @Kuldeep Kulkarni, I see that you started HiveServer2 using the CLI, and hence it is not Ambari-managed, which is why you still see HiveServer2 as stopped in Ambari. Can you stop/kill the HiveServer2 process from the CLI and then try to start the Hive service from Ambari? Use this -
$ps -aef | grep hiveserver2 <-- get the PID from this command
$kill -9 <pid-of-hiveserver2-process>
Then restart the Hive service from Ambari.
05-23-2016
11:33 AM
@Harini Yadav Please check this - Ranger always takes first precedence, and only then are POSIX permissions/HDFS ACLs evaluated. Also, setting "xasecure.add-hadoop-authorization" = false in ranger-hdfs-security.xml under /etc/hadoop/conf will stop the fallback to HDFS ACLs. Please check the below URLs for more details - http://hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/ https://community.hortonworks.com/questions/22054/should-we-disable-hdfs-default-acl-to-enable-range.html
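For reference, a sketch of how that property would look in ranger-hdfs-security.xml if edited by hand (on an Ambari-managed cluster the same change is normally made through the Ranger HDFS plugin configuration in Ambari so it is not overwritten on restart):
<property>
   <name>xasecure.add-hadoop-authorization</name>
   <value>false</value>
</property>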