Member since: 09-18-2015
Posts: 191
Kudos Received: 81
Solutions: 40
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2045 | 08-04-2017 08:40 AM
 | 5421 | 05-02-2017 01:18 PM
 | 1109 | 04-24-2017 08:35 AM
 | 1116 | 04-24-2017 08:21 AM
 | 1334 | 06-01-2016 08:54 AM
05-13-2016
11:36 AM
1 Kudo
Hi @simran kaur, this may or may not help depending on your exact scenario, but I've done something similar before by using Falcon (which drives Oozie underneath) to do exactly this. Have a look at https://github.com/apache/falcon/tree/master/addons/hdfs-snapshot-mirroring
The reason this is nice is that it provides built-in functionality to:
* Create snapshots in the source directory
* Copy that directory between HDFS clusters
* Create a snapshot in the target directory
* Handle snapshot retention in the source and target directories
It's honestly going to be much easier than writing all of that yourself within Oozie. You also don't have to use it to mirror snapshots between clusters; you can use it within a single cluster. The sketch below shows roughly what it automates for you. Hope that helps!
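For context, this is roughly the sequence the addon automates, expressed with the standard HDFS snapshot and distcp commands. The paths and snapshot names below are hypothetical placeholders, and on a first run you'd do a plain `distcp -update` without `-diff`.

```python
import subprocess
from datetime import datetime

# Hypothetical paths; substitute your own source and target directories.
SRC = "hdfs://source-nn:8020/data/landing"
DST = "hdfs://target-nn:8020/data/landing"

def run(cmd):
    # Run a command and fail loudly if it returns non-zero.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

snap = "snap-" + datetime.now().strftime("%Y%m%d%H%M%S")

# 1. Snapshot the source directory (an admin must have run
#    'hdfs dfsadmin -allowSnapshot <dir>' on it once beforehand).
run(["hdfs", "dfs", "-createSnapshot", SRC, snap])

# 2. Copy only the changes since the previous run's snapshot to the target cluster.
#    'prev-snap' stands in for whatever snapshot name the last run used.
run(["hadoop", "distcp", "-update", "-diff", "prev-snap", snap, SRC, DST])

# 3. Snapshot the target directory so both sides share a consistent point in time.
run(["hdfs", "dfs", "-createSnapshot", DST, snap])

# 4. Retention (pruning old snapshots with 'hdfs dfs -deleteSnapshot') is the part
#    the Falcon addon also schedules and handles for you.
```

Falcon's hdfs-snapshot-mirroring extension wraps all of this, plus the retention policies, into a single scheduled job.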
05-12-2016
02:12 PM
Hi @ccasano, understood. I don't believe such a list exists right now, unless @lpapp knows differently or could generate one?
05-12-2016
10:39 AM
Hi @kavitha velaga, for this kind of monitoring I'd suggest using an external monitoring framework, something like Munin, Ganglia, or whatever framework you already use within your org. Most of these frameworks can handle recording Round Trip Times (RTT) from hosts to something like an S3 endpoint; a simple way to collect that measurement yourself is sketched below. Hope that helps.
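If you want to start recording those numbers before wiring them into one of those frameworks, here's a minimal sketch that times an HTTPS round trip to an S3 endpoint. The endpoint URL and timeout are just illustrative; use the regional endpoint your buckets live in.

```python
import time
import urllib.error
import urllib.request

# Illustrative endpoint; substitute the regional endpoint you actually use.
S3_ENDPOINT = "https://s3.amazonaws.com"

def measure_rtt(url, timeout=5):
    """Return the round-trip time of one HTTPS request, in milliseconds."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except urllib.error.HTTPError:
        # Any HTTP response (even a 403) still means the endpoint answered,
        # which is all we need for a timing measurement.
        pass
    return (time.monotonic() - start) * 1000.0

if __name__ == "__main__":
    rtt_ms = measure_rtt(S3_ENDPOINT)
    print(f"RTT to {S3_ENDPOINT}: {rtt_ms:.1f} ms")
```

Munin, Ganglia and similar tools can then run a small script like this on a schedule and graph the values over time.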
05-12-2016
07:25 AM
Hi @Anandha L Ranganathan, all the steps for the downgrade are covered in the Ambari wizard, but you'll need to pay close attention to the various databases and other components that require backing up as you go through. Both rollback and full downgrade are possible. Make sure you have satisfied all of the prerequisites before beginning, and preferably read through the upgrade document several times to ensure you have accounted for everything. http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_upgrading_Ambari/content/_upgrading_HDP_prerequisites.html I've performed a number of upgrades myself, and while there is always the odd thing that crops up, it's usually something minor that is easily fixed.
Good luck, and I hope everything goes smoothly.
05-12-2016
07:19 AM
1 Kudo
Hi @ccasano, there was a post by @lpapp that covers the minimum privilege set you require for Cloudbreak on AWS: https://community.hortonworks.com/questions/30242/list-of-policies-required-by-cloudbreak-to-launch.html As for the intricacies of VPC management, unless anyone here knows, that might be a question better answered by Amazon.
05-10-2016
12:24 PM
@Fazil Aijaz I've also reached out to the training team asking them to respond to those comments, so thanks for drawing our attention to them!
05-10-2016
12:18 PM
Hi @Fazil Aijaz, yes, I believe those issues have been resolved. As you noted, a good, fast internet connection is certainly required, but I know a few people who have completed the exam successfully in the last week or so. I will verify with the team that everything is working as expected, but if you don't hear from me here in the next day or so then everything is good to go. Good luck with your exam!
05-10-2016
07:17 AM
1 Kudo
Hi @John Yawney. Answering your questions one at a time:

1) Atlas 0.6 (that version or later is expected in the next HDP release) currently supports hooks for Hive, Sqoop, Falcon and Storm, and therefore tracks governance information for data touched by those systems; Spark, NiFi and HBase hooks are expected around the end of the year. Anything that doesn't have a hook won't be tracked. There's no public information on the timeframe for the next release, but historically we have regularly announced new releases around the US Hadoop Summit timeframe. This gives a good idea of what is coming down the line for Atlas: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62695330

2) Atlas is designed to be open and extensible, so you could absolutely add third-party lineage information into the metastore (see the rough sketch after this post). Any work you do in that area would also be greatly appreciated if contributed upstream, with the added benefit that once accepted you won't need to support that code on your own.

3) Assuming you're talking about Hive here? Column-level lineage is expected around the end of the year.

4) You can, usually in combination with Ranger, but you'll need to be more specific about exactly what you mean by these rules.

Additional notes: for more detailed information, as I know the documentation around Atlas is pretty poor, I'd strongly advise watching the sessions from the recent European Hadoop Summit; search for sessions by Andrew Ahn (there are three!): http://www.hadoopsummit.org/dublin/agenda/
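On point 2, purely as an illustration of pushing third-party metadata into Atlas over its REST API: the endpoint path, type name and attributes below are placeholders, the payload shape differs between Atlas versions, and authentication is omitted, so treat this as a rough sketch and check the API documentation for your release.

```python
import json
import urllib.request

# Placeholder URL; adjust the host, port and path for your Atlas version.
ATLAS_URL = "http://atlas-host:21000/api/atlas/entities"

# A minimal, hypothetical entity describing a dataset produced by an external tool.
# In practice you would first register a custom type for it in Atlas.
entity = {
    "typeName": "my_external_dataset",
    "values": {
        "name": "nightly_export",
        "qualifiedName": "nightly_export@mycluster",
        "description": "Dataset produced by an external ETL tool",
    },
}

req = urllib.request.Request(
    ATLAS_URL,
    data=json.dumps(entity).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Send the entity to Atlas and print whatever the server returns.
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode("utf-8"))
```

Once entities like this exist, lineage between them is expressed by registering process entities that reference their inputs and outputs, broadly the same pattern the built-in hooks use.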
05-09-2016
06:21 PM
So I'd start by looking at which log files are consuming space in /var/log; removing some of the older ones that have rolled over should be pretty safe. 4.9 GB in /usr seems a bit large too, so it's worth investigating what's consuming such a large percentage of your space in there as well. As usual, remove any unneeded packages at the OS level. 10 GB is honestly a bit small for a root partition; you might want to bump that up a bit, or at least spin up some extra storage to mount as /var and /usr to give yourself more room. Hadoop is very good at generating logs, so it's very easy to fill up a root partition if you're not careful and don't have them split off elsewhere. A quick way to find the biggest offenders is sketched below.
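Here's a quick sketch for spotting the biggest offenders; the path and the number of files reported are just defaults you can change.

```python
import os

def largest_files(root="/var/log", top_n=20):
    """Walk a directory tree and print the largest files it contains."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip files we can't stat (rotated away, permission denied, etc).
                continue
    for size, path in sorted(sizes, reverse=True)[:top_n]:
        print(f"{size / (1024 ** 2):8.1f} MB  {path}")

if __name__ == "__main__":
    largest_files()            # try largest_files("/usr") for the /usr question too
```

The same function pointed at /usr will show you where that 4.9 GB is going.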
05-09-2016
06:45 AM
Hi @Sunile Manjee. It's a bit more fully featured than that. Wandisco solutions generally operate at a level above the underlying service, so if one of the HBase clusters dies, users don't notice any change. The second area of significant importance is the solution's ability to handle significant distances between clusters, whereas standard HBase replication is usually confined to the same datacentre or only runs over very fast links, not full geo-replication. https://www.wandisco.com/product/fusion-active-active-hbase They also have a webinar on HBase; the final section is all about using Fusion Active HBase across multiple datacentres: http://www.wandisco.com/webinar/replay/hadoop-hbase-depth Hope that helps!