Created on 08-16-2016 04:09 PM - edited 09-16-2022 03:35 AM
I'm already using the Sandbox on a Linux laptop to learn Hortonworks, but that is at home. At work we have an MSDN subscription, so we get some credits each month, and several of us were thinking of working through the sandbox tutorials on Azure at work - a group learning experience as we are bringing in Hadoop.
However on Azure, this is the problem I encounter -
After standing up the sandbox on Azure following the instructions, I am able to log in to Ambari OK. I can open an ssh session OK. I can upload the .csv files for the first tutorial to /tmp/maria_dev/data OK. But then I hit a wall submitting the first query. It just sits there... forever. No log, nothing. When I check Azure's disk subsystem, it doesn't appear to be doing anything at all. The VM is set for Size A5 Standard, per the instructions. This has 2 cores, 14 GB RAM and 4 data disks for a max of 4x500 IOPS, and also has load balancing. But it doesn't seem to want to process Hive queries, or else it is excruciatingly slow at doing so. My l'il ol' laptop can finish that first query to load a .csv file in less than 1 minute. After 5 minutes I decided I was just burning credits watching Azure do nothing but spin the meter.
Has anyone encountered a similar problem? Is there some step I missed? Some action I have to do with the disk subsystem? Any help is appreciated. Thanks in advance.
Created 08-16-2016 06:55 PM
When I encountered a problem with the Sandbox on Azure, I finally located a message in Azure that indicated that the VM size wasn't large enough. I, too, had selected the recommended size. Unfortunately, this happened months ago and I don't recall where I found that info. But the VM size is something you might check.
Created 08-17-2016 11:31 AM
Thanks. I will see if that is the problem. If it is I will award you the 'answer prize' 🙂
Created 08-17-2016 04:16 PM
OK, the problem... I finally intuited from punching stuff on the Azure interface that there were no data disks on the VM. I stress 'intuited' because this is not in the documentation anywhere - but to be fair to Hortonworks, this is a fault of MSFT. The impression they give with the Resource Template is that a working system will be stood up. But "deployment complete" does not mean "ready to run". And the problem goes deeper: no matter what 'template' is picked, the only disk that gets built into the storage resource is the ~45 GB 'system disk' for the Linux OS. The data disks are not in the storage resource, so there is no place to land the data. That's why the sandbox won't run any of the queries.
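For anyone who wants to confirm the same thing from the ssh session rather than the portal, here is a minimal sketch that lists the whole disks the Linux kernel reports in /proc/partitions. The function names are made up for illustration, and the idea that `sda` is the OS disk and `sdb` the temporary disk is an assumption about a typical Azure Linux VM, not something guaranteed; device names can differ on your VM.

```python
import os


def whole_disks(partitions_text):
    """Return names of whole sd* disks found in /proc/partitions-style text."""
    disks = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four columns: major, minor, #blocks, name.
        # The header row and blank lines are skipped here.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        name = fields[3]
        # Keep whole disks (sda, sdb, ...) and skip partitions (sda1, ...).
        if name.startswith("sd") and not name[-1].isdigit():
            disks.append(name)
    return disks


def data_disks(disks, reserved=("sda", "sdb")):
    # ASSUMPTION: sda is the OS disk and sdb is the temporary disk.
    # Anything else is treated as an attached data disk.
    return [d for d in disks if d not in reserved]


if __name__ == "__main__" and os.path.exists("/proc/partitions"):
    with open("/proc/partitions") as f:
        found = whole_disks(f.read())
    extra = data_disks(found)
    print("disks seen by the kernel:", found)
    print("data disks:", extra if extra else "none attached")
```

If the last line reports no data disks, that would match what the portal showed in my case: only the system disk exists.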
I guess I'll just use my trusty laptop to finish the tutorials. Some day when I have tons of time on my hands to flop around from site to site gathering all the 'oh everyone just knows that' tidbits about standing up an Azure VM, I may take another stab at it... yeah, right.
Created 08-17-2016 05:31 PM
Thanks for posting what you discovered. Hopefully it will save some other folks some hassles.
Created 08-17-2016 10:25 PM
As I would not have discovered this without your comment, I gave you the points !!
Created 09-29-2016 06:18 PM
Circling back. Don't know if the browser change did it or what. Once you have the VM "purchased", go back and click on the storage subsystem. You will see there are no data disks. Click on the OS disk. You should now see options to attach a new disk or attach an existing one. Add your disks. Next, click on your IP address. This should open a control panel with a red configuration tool box. If you see that there is no DNS name, click the tool box and give your VM a host name. Note the domain suffix. The host name plus that suffix is what you will need to enter into the URL of a browser to register your sandbox and begin using it.
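Once the DNS name is set, a quick way to check that it actually resolves before trying it in the browser is a small lookup like the sketch below. The host name in the usage line is hypothetical; substitute the host name and domain suffix the portal shows for your VM. The `lookup` parameter exists only so the function can be exercised without a network.

```python
import socket


def resolves(hostname, lookup=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return lookup(hostname)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return None


def describe(hostname, lookup=socket.gethostbyname):
    """Return a one-line, human-readable result for the lookup."""
    address = resolves(hostname, lookup)
    if address is None:
        return hostname + " does not resolve yet"
    return hostname + " resolves to " + address
```

Usage, with a made-up name: `print(describe("mysandbox.westus.cloudapp.azure.com"))`. A new DNS name may take a little while to become visible, so a failed lookup right after saving the name is not necessarily a configuration error.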