11-05-2014 05:05 PM
OK, downloaded the VM and uploaded a public dataset in CSV format (the CMS data from data dot gov). The file is just under 2GB with about 10 million rows. Imported the file - OK.
Created a table using Hue pointing to the CSV file and things look good EXCEPT it only pulls in 300K rows. Is "Big Data" limited to just 300K rows and 64MB?
FWIW, I used the same data file and it imported fine into SQL Server in just a few seconds with all rows and no errors, so the source file appears fine.
Could there be some magical config that makes the VM work with, well, data sets with more than 300K rows?
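For anyone who wants to check the source file independently of SQL Server, here is a minimal Python sketch that counts the rows. The function names and the file path in the usage note are hypothetical, and it assumes the file is UTF-8 with a single header row. It streams the file, so the 2GB size should not be a problem. It reports both parsed CSV records and raw physical lines. If the two numbers differ, some quoted fields contain line breaks, which a line-oriented reader may handle differently than SQL Server does.

```python
import csv


def count_csv_records(path, has_header=True):
    """Count data records as a CSV parser sees them (quoted newlines stay inside one record)."""
    # newline="" lets the csv module handle line endings inside quoted fields itself.
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row (assumed to be present)
        return sum(1 for _ in reader)


def count_physical_lines(path):
    """Count raw newline-delimited lines, including the header line."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)
```

Usage would be something like `count_csv_records("cms_data.csv")` next to `count_physical_lines("cms_data.csv")`, then comparing both against the row count the table returns.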
01-10-2015 11:11 AM