Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
05-06-2016
05:10 PM
Right now there are no concrete release dates; I would wait until Hadoop Summit San Jose for any announcements.
05-03-2016
02:30 AM
2 Kudos
I'm a long-time user of Apache Bigtop; my experience with Hadoop and Bigtop predates Ambari. I started using Bigtop around version 0.3, pulling the bigtop.repo file and installing Hadoop, Pig and Hive for some quick development. Bigtop made that convenient and easy, and it has matured since then: there are now multiple ways to deploy. You can still pull the repo and install manually, but there are better ways now with Vagrant and Docker. I won't rehash how to deploy Bigtop using Docker, as it was beautifully described here. Admittedly, I'm running on a Mac and was not able to provision a cluster using Docker; I did not try on a non-OSX host. This post is about Vagrant. Let's get started:

1. Install VirtualBox and Vagrant.

2. Download the 1.1.0 release:
   wget http://www.apache.org/dist/bigtop/bigtop-1.1.0/bigtop-1.1.0-project.tar.gz

3. Uncompress the tarball:
   tar -xvzf bigtop-1.1.0-project.tar.gz

4. Change into the Vagrant provisioner directory:
   cd bigtop-1.1.0/bigtop-deploy/vm/vagrant-puppet-vm

5. You can review the README here, but to keep it short: edit vagrantconfig.yaml for any additional customization, such as VM memory, OS, number of CPUs, components (e.g. hadoop, spark, tez, hama, solr), and the number of VMs you'd like to provision. That last part is the killer feature: you can provision a Sandbox with multiple nodes, not just a single VM. The same should be true with the Docker provisioner, but I can't confirm that for you; feel free to read the README in bigtop-1.1.0/bigtop-deploy/vm/vagrant-puppet-docker for that approach. A sample config is sketched at the end of this post.

6. Start provisioning your custom sandbox:
   vagrant up

7. Wait 5-10 minutes, then use standard Vagrant commands to interact with your custom Sandbox:
   vagrant ssh bigtop1

8. Create your local user in HDFS and off you go:
   sudo -u hdfs hdfs dfs -mkdir /user/vagrant
   sudo -u hdfs hdfs dfs -chown -R vagrant:hdfs /user/vagrant

For your convenience, add the Bigtop machine(s) to /etc/hosts.

Now, you're probably wondering why you would use Bigtop over the regular Sandbox. The Sandbox has been getting pretty resource-heavy and ships a lot of components; I like to provision a small cluster with just a few components, such as hadoop, spark, yarn and pig. Bigtop makes this possible and runs easily within a memory-strapped VM. One downside is that with the latest release Spark is at 1.5.0 while the Hortonworks Sandbox is at 1.6.0, and the story is the same with other components. There are version gaps, but if you can look past them you have a quick way to prototype without much fuss! This is by no means meant to steal thunder from the excellent Ambari quick start guide; it is meant to demonstrate yet another approach from a rich ecosystem of Hadoop tools.
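For reference, here's a minimal vagrantconfig.yaml sketch along the lines of what I use. The key names and values are from memory of the 1.1.0 file and may differ slightly, so treat the file shipped in vagrant-puppet-vm as authoritative and only tweak the values:

    memory_size: 4096                        # MB per VM; keep it small if you only run a few components
    number_cpus: 1
    box: "puppetlabs/centos-7.0-64-nocm"     # base OS image (assumption; use whatever the shipped file defaults to)
    num_instances: 3                         # the killer feature: 3 VMs gives you a small multi-node cluster
    components: [hadoop, yarn, spark, pig]   # only what you need, nothing more
    run_smoke_tests: false

With num_instances set to 3, the extra machines should show up as bigtop2 and bigtop3 for vagrant ssh, alongside bigtop1.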
04-27-2016
06:14 PM
@Deepesh here's a Spark benchmark from @vshukla https://community.hortonworks.com/questions/29085/spark-benchmarking-tools.html
03-01-2016
05:49 PM
I wouldn't know where to drop a large VMDK like that, @Vladimir Zlatkin. I'll give it a go once more before going the PS support route ;).
03-01-2016
05:41 PM
@Vladimir Zlatkin I just tried out your demo, and there are a few things I got stuck on. I was not able to download the custom dashboard via your URL due to the parentheses; wrapping the URL in double quotes made it work. Then I had issues creating the hl7_messages collection: is creating the solr directory in ZooKeeper necessary? I created it anyway, but I also had to refer to Ali's NiFi Twitter tutorial for the Solr steps; here's the link: https://community.hortonworks.com/content/kbentry/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html I finally was able to create the hl7_messages collection, but upon template execution I'm getting errors in Solr that the collection doesn't exist, hence asking whether the ZooKeeper step is necessary. Will do further digging.
Ensure no log files are owned by root (the current sandbox version has files owned by root in the log dir, which causes problems when starting Solr):
   chown -R solr:solr /opt/lucidworks-hdpsearch/solr

Run the Solr setup steps as the solr user:
   su solr

Set up the Banana dashboard by copying default.json to the dashboards dir:
   cd /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana/app/dashboards/
   mv default.json default.json.ori
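In case it helps anyone following along, this is roughly the collection-creation sequence I ended up with, borrowed from Ali's tutorial rather than from Vladimir's write-up; the config set name (data_driven_schema_configs) and the ZooKeeper connect string are assumptions from my sandbox, so adjust to your environment:

    su solr
    cd /opt/lucidworks-hdpsearch/solr
    # start Solr in SolrCloud mode against the sandbox ZooKeeper
    bin/solr start -c -z localhost:2181
    # create the collection the NiFi template writes to
    bin/solr create -c hl7_messages -d data_driven_schema_configs -s 1 -rf 1

If the flow or dashboard points at a ZooKeeper chroot such as localhost:2181/solr, that chroot node has to exist before Solr connects, which may be why the separate ZooKeeper step appears in the demo.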
03-01-2016
12:25 AM
Bravo Vladimir! Do you have a github repo? I'd like to contribute.
02-26-2016
04:56 PM
@Mark Smith this is in reference to the current API, which is 1.x.
02-15-2016
02:18 PM
@Paul Boal this is an article and our thread is becoming too large; next time, please open a new question on HCC instead. Here's a solution you can try: http://zeltov.blogspot.com/2015/11/external-jars-not-getting-picked-up-in_9.html
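In case that link ever goes stale: the usual workaround for external jars not being picked up (not necessarily the exact steps in that post) is to hand the jar to the shell explicitly, e.g.

    spark-shell --jars /path/to/your-udf.jar

or to set spark.driver.extraClassPath / spark.executor.extraClassPath in spark-defaults.conf. The jar path above is a placeholder.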
02-14-2016
11:48 PM
@Paul Boal use this guide to work with Hive UDFs in Spark: http://hortonworks.com/hadoop-tutorial/apache-spark-1-4-1-technical-preview-with-hdp/ And here's an example of invoking the CSV SerDe: https://community.hortonworks.com/content/kbentry/8313/apache-hive-csv-serde-example.html
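A minimal sketch of what that looks like from spark-shell, assuming sqlContext is a HiveContext (as in the technical preview build); the jar path, UDF class and table names are placeholders of my own, and the SerDe shown is Hive's built-in OpenCSVSerde, which may differ from the class used in the article:

    // register a Hive UDF from the Spark shell
    sqlContext.sql("ADD JAR /tmp/my-hive-udfs.jar")
    sqlContext.sql("CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF'")
    sqlContext.sql("SELECT my_udf(name) FROM my_table").show()

    // table backed by the CSV SerDe, queryable from the same HiveContext
    sqlContext.sql("CREATE TABLE csv_demo (id STRING, name STRING) " +
      "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' STORED AS TEXTFILE")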