What is missing from Sandbox on Azure instructions?
Labels: Apache Ambari, Apache Hadoop, Apache Hive
Created on ‎08-16-2016 04:09 PM - edited ‎09-16-2022 03:35 AM
I'm already using Sandbox on a Linux laptop to learn Hortonworks. But that is at home. At work we have an MSDN subscription, so we have some credits each month, and several of us were thinking of working the sandbox tutorials on Azure at work - a group learning experience as we are bringing in Hadoop.
However on Azure, this is the problem I encounter -
After standing up the sandbox on Azure following the instructions, I am able to log in to Ambari OK. I can open an SSH session OK. I can upload the .csv files for the first tutorial to /tmp/maria_dev/data OK. But then I hit a wall submitting the first query. It just sits there... forever. No log, nothing. When I check Azure's disk subsystem, it doesn't appear to be doing anything at all. The VM is set for Size A5 standard, per the instructions. This has 2 cores, 14 GB RAM and 4 data disks for a max of 4x500 IOPS, and also has load balancing. But it doesn't seem to want to process Hive queries, or else it is excruciatingly slow at doing so. My l'il 'ol laptop can finish that first query to load a .csv file in less than 1 minute. After 5 minutes I just decided I was burning credits watching Azure do nothing but spin the meter.
Has anyone encountered a similar problem? Is there some step I missed? Some action I have to do with the disk subsystem? Any help is appreciated. Thanks in advance.
Created ‎08-17-2016 04:16 PM
OK, the problem... I finally intuited, from punching stuff on the Azure interface, that there were no data disks on the VM. I stress 'intuited' because this is not in the documentation anywhere - but to be fair to Hortonworks, this is a fault of MSFT. The impression they give with the Resource Template is that a working system will be stood up, but 'deployment complete' does not mean 'ready to run'. The problem goes deeper: no matter what 'template' is picked, the only disk that gets built into the storage resource is the ~45 GB 'system disk' for the Linux OS. The data disks are not in the storage resource, so there is no place to land the data. That's why the sandbox won't run any of the queries.
I guess I'll just use my trusty laptop to finish the tutorials. Some day when I have tons of time on my hands to flop around from site to site gathering all the 'oh everyone just knows that' tidbits about standing up an Azure VM, I may take another stab at it... yeah, right.
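For anyone who wants to confirm the same symptom from an SSH session rather than the portal, here is a minimal sketch. It assumes the sandbox VM has the standard Linux `lsblk` utility and Python 3.7+; the function names are made up for illustration. It only lists whole disks and their sizes, so you still have to judge for yourself whether anything beyond the ~45 GB system disk is present (Azure Linux VMs typically also show a small temporary disk, which is not a data disk).

```python
import subprocess


def parse_lsblk(output):
    """Parse `lsblk -b -d -n -o NAME,SIZE,TYPE` output.

    Returns a list of (device_name, size_in_GiB) for entries of type "disk".
    """
    disks = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, size_bytes, dev_type = parts
        if dev_type == "disk":
            disks.append((name, int(size_bytes) / 2**30))
    return disks


def list_disks():
    """Run lsblk on the local machine and return the parsed disk list."""
    result = subprocess.run(
        ["lsblk", "-b", "-d", "-n", "-o", "NAME,SIZE,TYPE"],
        check=True, capture_output=True, text=True,
    )
    return parse_lsblk(result.stdout)


# Usage on the VM (not executed here):
#   for name, gib in list_disks():
#       print(f"{name}: {gib:.1f} GiB")
```

If the only sizeable entry is the OS disk, that matches what is described above: the data disks were never attached.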
Created ‎08-16-2016 06:55 PM
When I encountered a problem with the Sandbox on Azure, I finally located a message in Azure that indicated that the VM size wasn't large enough. I, too, had selected the recommended size. Unfortunately, this happened months ago and I don't recall where I found that info. But the VM size is something you might check.
Created ‎08-17-2016 11:31 AM
Thanks. I will see if that is the problem. If it is I will award you the 'answer prize' 🙂
Created ‎08-17-2016 05:31 PM
Thanks for posting what you discovered. Hopefully it will save some other folks some hassles.
Created ‎08-17-2016 10:25 PM
As I would not have discovered this without your comment, I gave you the points !!
Created ‎09-29-2016 06:18 PM
Circling back. Don't know if the browser change did it or what. Once you have the VM "purchased", go back and click on the storage subsystem. You will see there are no disks. Click on the OS disk. You should now see options to attach a new disk or attach an existing one. Add your disks. Next, click on your IP address. This should open a control panel with a red configuration tool box. If you see that there is no DNS name, click the tool box and give your VM a host name. Note the domain suffix. That is what you will need to enter into the URL bar of a browser to register your sandbox and begin using it.
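The portal clicks above can probably also be scripted. The sketch below is not what was done in this thread: it assumes the newer `az` command-line tool is installed and logged in, and every resource name and the disk size in the usage comment are placeholders. It only builds the command lines for attaching a new data disk and setting a DNS label on the public IP, and prints them unless you turn off the dry run. Check `az vm disk attach --help` and `az network public-ip update --help` before relying on it.

```python
import subprocess


def attach_disk_cmd(resource_group, vm_name, disk_name, size_gb):
    """Build an `az` command that creates a new data disk and attaches it to the VM."""
    return [
        "az", "vm", "disk", "attach",
        "--resource-group", resource_group,
        "--vm-name", vm_name,
        "--name", disk_name,
        "--new",
        "--size-gb", str(size_gb),
    ]


def set_dns_cmd(resource_group, public_ip_name, dns_label):
    """Build an `az` command that gives the VM's public IP a DNS name label."""
    return [
        "az", "network", "public-ip", "update",
        "--resource-group", resource_group,
        "--name", public_ip_name,
        "--dns-name", dns_label,
    ]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Usage with placeholder names (prints the commands, runs nothing):
#   run(attach_disk_cmd("my-rg", "my-sandbox-vm", "sandbox-data-1", 100))
#   run(set_dns_cmd("my-rg", "my-sandbox-ip", "my-sandbox"))
```

Keeping the dry run as the default means you can read the exact commands before spending any credits on them.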
