About drussell

drussell · ‎05-24-2016

Hi @SparkRocks. It's certainly possible. Honestly, people requiring very low latency (down to a few ms) are mostly still using Storm today, but Spark is getting closer and closer to that point. Its latency may still be slightly higher, so it depends on how low you need to go. Storm might be the perfect technology to implement it, but if you're happy with Spark, already have experience and knowledge invested in it, and are generally comfortable with the way Spark does things, I'd say go for doing it in Spark. Hope that helps.

drussell · ‎05-23-2016

Hi @Manoj Dhake. I've answered a very similar question recently. The short answer is that the integration is coming very soon, but it is not available in the current 0.5 release of Atlas. It is expected in Atlas 0.6 or later, which is currently in preview for the next HDP release. I would strongly recommend reading my answer to the following question: https://community.hortonworks.com/questions/32350/atlas-05-current-functionalities.html#answer-32404 ... and if you have any further questions, don't hesitate to let me know. Also check out the answer in that thread from Andrew Ahn, who shows how you can try the current preview sandbox with the latest functionality in it. Hope that helps!

drussell · ‎05-23-2016

Hi @Raj sharma. Have you tried looking at jshs2? https://www.npmjs.com/package/jshs2 Alternatively, for Hive you can also use the REST API to make queries: https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface Finally, for HDFS you can use the WebHDFS REST API: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html Hope that helps!
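To give a rough idea of the WebHDFS route from Node.js, here is a minimal TypeScript sketch. It only builds request URLs in the documented `/webhdfs/v1/<path>?op=...` form and picks the file names out of a `LISTSTATUS` response body. The host name, port 50070, the user name and the helper names `webHdfsUrl` and `fileNames` are all assumptions for illustration, so adjust them to your cluster, and use whichever HTTP client you prefer to actually send the request.

```typescript
// Minimal sketch: build WebHDFS REST URLs and read a LISTSTATUS response.
// Host, port, user and helper names are placeholders, not values from a real cluster.

interface FileStatus {
  pathSuffix: string; // entry name relative to the listed directory
  type: string;       // "FILE", "DIRECTORY" or "SYMLINK"
  length: number;     // size in bytes
}

interface ListStatusResponse {
  FileStatuses: { FileStatus: FileStatus[] };
}

// Builds http://<host>:<port>/webhdfs/v1/<path>?op=<OP>&<extra params>
function webHdfsUrl(
  host: string,
  port: number,
  hdfsPath: string,
  op: string,
  params: { [key: string]: string } = {}
): string {
  // Encode each path segment but keep the "/" separators.
  const encodedPath = hdfsPath.split("/").map(encodeURIComponent).join("/");
  const query = ["op=" + encodeURIComponent(op)]
    .concat(
      Object.keys(params).map(
        (k) => encodeURIComponent(k) + "=" + encodeURIComponent(params[k])
      )
    )
    .join("&");
  return "http://" + host + ":" + port + "/webhdfs/v1" + encodedPath + "?" + query;
}

// Returns the names of plain files from a parsed LISTSTATUS JSON body.
function fileNames(body: ListStatusResponse): string[] {
  return body.FileStatuses.FileStatus
    .filter((s) => s.type === "FILE")
    .map((s) => s.pathSuffix);
}
```

For example, `webHdfsUrl("namenode.example.com", 50070, "/user/alice/data", "LISTSTATUS")` gives the URL to list a directory, and passing `"OPEN"` as the operation gives the URL to read a file.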

drussell · ‎05-23-2016

Hi @li zhen. The automated playbooks are currently designed largely for a from-scratch deployment and are not optimised for use with existing infrastructure. If you want to work through the installation manually, I'd suggest reading the playbooks and their configuration very closely, then commenting out the sections you don't need because you're installing on existing infrastructure. You would start from here: https://github.com/apache/incubator-metron/tree/master/metron-deployment#metron This assumes that your existing cluster was installed via Ambari. Also pay close attention to the configuration files that need to be adjusted as part of the deployment. Honestly though, if you're looking for a quick and easy deployment, deploying fresh is by far the fastest method: going from zero to a complete Metron environment is possible on AWS in under 90 minutes. Good luck!

drussell · ‎05-19-2016

Hi @Prateek Gupta. This question has been answered previously here: https://community.hortonworks.com/questions/13782/downgrade-of-ambari-22-to-21.html Hope that helps!

drussell · ‎05-16-2016

Hi @bschofield. Currently NiFi does not use any form of UDP acceleration for its site-to-site protocol. One solution that has been proposed previously is to add a "PutUDP" processor to match the existing "ListenUDP" processor. The NiFi site-to-site protocol already batches up FlowFiles to reduce the TCP overhead during large transmissions. Hope that helps.

drussell · ‎05-16-2016

Hi @Asim munshi. This really depends on a lot of variables. Is this the first time that the team is doing this? Assuming they're using Ambari, that they're experienced in this and have done it before in several different environments, but that they're also looking to perform performance and stability testing before handover, I would recommend 1 week, which includes at least a day and a half of contingency for things that crop up during deployment. If, however, this environment has been used before (i.e. networking, firewalling, routing, etc. are all in a known good state) and the hardware is all 100% correctly provisioned, you could easily get everything done, including Ambari and the whole cluster, in a morning. I realise that's quite a wide range of times, but hopefully it gives you something to work with. Hope that helps.

drussell · ‎05-14-2016

Hi @Mukthyar Azam. Hue is indeed installed on the HDP 2.4 sandbox, however it is not started by default or fully configured. You would need to start the service, create the port forward to port 8000 within VirtualBox (assuming you're using that) and then fix the configuration. I'd strongly recommend looking at Ambari Views instead, as it is a complete replacement for things like the file browser, Hive view, etc., which are now very good within the Ambari Views framework. http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_ambari_views_guide/content/ch_using_ambari_views.html

drussell · ‎05-13-2016

Hi @Smart Solutions. There is already a good answer above, but I'd also add: are you using Ambari on this cluster? If you're not using Ambari, then all the configuration files you mentioned should be under some sort of configuration management control, such as Chef, Puppet or Ansible. If you are using Ambari, then of course all this information is kept in the Ambari database, where all of the configuration is represented and where you can do revision control of the configs within the Ambari environment (plus perform config version comparison etc.). Technically it is possible to use the Ambari API to query all of the configuration parameters which make up the various XML configurations, but that would be a giant chunk of work. See https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/configuration.md What is driving your requirement to export the configs? Always happy to understand more :o)
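If you do want to go down the Ambari API route, the rough shape would be: ask the cluster resource for its desired config tags, fetch each config type at its current tag, and write the returned `properties` map out as Hadoop-style XML. Below is a minimal TypeScript sketch of the URL building and XML rendering only. The base URL, cluster name and the helper names (`desiredConfigsUrl`, `configUrl`, `toHadoopXml`) are assumptions for illustration, and the HTTP calls and authentication are left out, so treat it as a starting point rather than a finished exporter.

```typescript
// Minimal sketch: Ambari REST URLs for configs, plus rendering a properties
// map as Hadoop-style XML. Base URL, cluster name and helper names are
// illustrative placeholders.

// URL that returns the current (desired) tag for every config type.
function desiredConfigsUrl(ambariBase: string, cluster: string): string {
  return (
    ambariBase +
    "/api/v1/clusters/" +
    encodeURIComponent(cluster) +
    "?fields=Clusters/desired_configs"
  );
}

// URL that returns the properties of one config type at one tag.
function configUrl(
  ambariBase: string,
  cluster: string,
  type: string,
  tag: string
): string {
  return (
    ambariBase +
    "/api/v1/clusters/" +
    encodeURIComponent(cluster) +
    "/configurations?type=" +
    encodeURIComponent(type) +
    "&tag=" +
    encodeURIComponent(tag)
  );
}

// Escapes the characters that are not allowed in XML text content.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Renders a name -> value map (the "properties" object of a config item)
// as a <configuration> document, with properties sorted by name.
function toHadoopXml(properties: { [name: string]: string }): string {
  const lines: string[] = ["<configuration>"];
  for (const name of Object.keys(properties).sort()) {
    lines.push("  <property>");
    lines.push("    <name>" + escapeXml(name) + "</name>");
    lines.push("    <value>" + escapeXml(properties[name]) + "</value>");
    lines.push("  </property>");
  }
  lines.push("</configuration>");
  return lines.join("\n");
}
```

You would call `configUrl` once per config type (core-site, hdfs-site and so on) using the tags from the first request, which is where the "giant chunk of work" comes from on a cluster with many services.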

drussell · ‎05-13-2016

Hi @Manoj Dhake. Column-level lineage is expected in Atlas around the end of 2016. For more detailed information, I'd strongly advise watching the sessions from the recent European Hadoop Summit: search for sessions by Andrew Ahn (there are three!) http://www.hadoopsummit.org/dublin/agenda/ Hope that helps.

Last Visited	‎12-10-2018 10:03 AM

Member Since	‎09-18-2015 08:21 AM
Posts	191
Kudos received	80

Cloudera Community

Re: Metastore HA Active/Active ?

Re: Hi All, I want to integrate Ab initio tool wit...

Re: Hadoop Rack-Awareness is only for datanode ser...

Re: Kafka installation best practices in HDF

Re: Best tools for file transfer and ingest.

Re: Spark Streaming 2.0 is it suitable for Low Lat...

Re: Will I get the Lineage for Apache Falcon in At...

Re: Can we connect hive or hdfs through node js. C...

Re: install the metron in the existed ambari and e...

Re: Please help to downgrade Ambari 2.2 to 2.1 ver...

Re: Does nifi offer UDP acceleration for fast larg...

Re: time to provision a spark cluster

Re: Do we hav Hue in HDP2.4?

Re: How to export all HDP configuration files (xml...

Re: Does apache atlas provide column level lineage...