Member since 07-31-2017 | 7 Posts | 0 Kudos Received | 0 Solutions
11-13-2017
09:55 AM
I'm not sure it's possible to execute Python files stored in HDFS, hence the error that only local files are supported. (If you know how to make it work with HDFS, let me know!) To get this working I had to manually upload my Python files to a directory on the Livy server itself. You also have to make sure that the directory holding the Python files is listed in the livy.file.local-dir-whitelist property in livy.conf on the Livy server. You might also have to restart the Livy server, but I'm not sure about that as I wasn't the server admin. After doing all this you can invoke POST /batches, giving the path to your Python file in the 'file' arg of the request. Make sure you use the "file:" protocol in the path's value. Only one forward slash is needed after the colon; example value: "file:/data/pi.py"
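For anyone who wants to script this, here is a rough Python sketch of the POST /batches call described above. It is only an illustration under assumptions: the host name livy-host is made up, 8998 is Livy's default port and may differ on your cluster, and the helper names build_batch_request and submit_batch are my own. It also assumes /data is already listed in livy.file.local-dir-whitelist.

```python
import json
import urllib.request


def build_batch_request(livy_url, local_path):
    """Build (without sending) a POST /batches request for a file on the Livy server.

    livy_url and local_path are placeholders; local_path must be an absolute
    path inside a directory listed in livy.file.local-dir-whitelist.
    """
    if not local_path.startswith("/"):
        raise ValueError("expected an absolute path on the Livy server")
    # "file:" protocol with a single forward slash after the colon.
    payload = {"file": "file:" + local_path}
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(livy_url, local_path):
    """Send the request and return Livy's JSON response as a dict."""
    with urllib.request.urlopen(build_batch_request(livy_url, local_path)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host; prints what would be sent without contacting a server.
    req = build_batch_request("http://livy-host:8998", "/data/pi.py")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Calling submit_batch("http://livy-host:8998", "/data/pi.py") would actually send the request; I kept building and sending separate so the payload can be checked first.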
09-19-2017
08:45 AM
"Are there any downsides..?" - In Jupyter you have to decide upfront which language each notebook is going to be in. In Zeppelin you can switch between languages within a single notebook and pass variable values between them, which gives you immense flexibility. For example, you could use Scala to calculate some results in Spark and then display them in JavaScript/HTML/Angular using your own custom visualizations. You can use the best tool for each job if you're comfortable working in more than one language. Disclaimer: I've hardly used Jupyter. I know it has cell 'magics', which I think allow you to write individual cells in a language different from the one assigned to the whole notebook, but I don't know whether you can pass data between cells of different languages in the same notebook. The issue at https://github.com/ipython/ipython/issues/4386 suggests not (although that is almost a year old now): "ipython/jupyter will not be supporting multiple kernels for a single notebook with variables being passed around between them, so I am closing this issue." For me, this is a big downside of using Jupyter.
08-23-2017
09:59 AM
I've given up on this for now and am just using DSNs to attach to Hive. Hopefully one day Hortonworks will make their drivers available via NuGet so that my app can talk to Hive directly via a connection string, and not require my end-users to install the ODBC drivers themselves and configure a DSN.
08-21-2017
08:26 AM
@Sindhu Thanks; yes, that link is where I downloaded the ODBC driver from in the first place. But do you know if Hortonworks will make their Windows drivers available via NuGet? If Hortonworks aren't prepared to make them available on NuGet soon, then I'll have to figure out which of the many DLLs in the existing ODBC driver I need to embed in my app, or else get one of these other drivers for Hive: https://www.nuget.org/packages?q=Hive. It's just that they look pretty old, and many use Thrift rather than standard ODBC to communicate with Hive.
08-16-2017
01:33 PM
I've created a Windows 7 client application in C# that successfully connects to a Hive database using the Hortonworks ODBC Driver for Apache Hive (v2.1.10). However, I first had to install that driver on my machine and create a DSN, which I then referenced in my application. Instead, I want to include the driver libraries directly within my application so that users of my app don't need to install the driver on their machines or create a DSN. For example, users can already connect to a PostgreSQL database this way; to support that, all I had to do was use NuGet Package Manager inside Visual Studio to download the Npgsql driver. So, is the Hortonworks ODBC Driver for Apache Hive available in NuGet Package Manager? There seem to be multiple Hive-related drivers, but they look pretty old and I can't see any from Hortonworks. Failing that, which DLLs and files from my local installation of the driver should I embed in my app? There are 32-bit and 64-bit versions of my app, so I guess each one will have to use the 32-bit and 64-bit driver respectively. The app uses .NET Framework 4 and I'm constrained to Visual Studio 2010 for now.
Labels: Apache Hive