Reach MS SQL via ODBC in Python within CDSW

Hello, we found some samples for using pyodbc in Python within CDSW at the bottom of this page (https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_import_data.html).

I tried this and it failed, even after I corrected the URL pointing to the library on Google. I am not knowledgeable enough to debug further.

We have also received some guidance related to unixODBC here:

(https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-SQL-Server-from-RHEL-or-Centos)

We are told we must first install this into the CDSW edge node environment. We could do that, but have not yet, because I am confused by the existence of two options.

Has anybody succeeded in using either of these ODBC drivers to reach back into an MS SQL database instance and pull data directly into CDSW via SQL queries? If so, which did you use, and would you be willing to share your detailed steps and experience? Thank you!

1 REPLY

The solution is to create your own custom CDSW engine.

# Built on CDSW engine v8
This starts from base engine v8, as used with CDSW 1.6.

```
$ cat /etc/issue
Ubuntu 16.04.6 LTS \n \l
```
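
The Ubuntu release matters because the Microsoft repository URL used in the Dockerfile has `16.04` in its path. As a rough sketch (the function names are illustrative, not part of any tool), the release can be extracted from the `/etc/issue` text and turned into the matching `prod.list` URL. It assumes the same URL pattern holds for other Ubuntu releases, which is worth checking before relying on it.

```python
import re

def ubuntu_release(issue_text):
    """Return 'MAJOR.MINOR' from an /etc/issue line such as 'Ubuntu 16.04.6 LTS'."""
    match = re.search(r"Ubuntu (\d+\.\d+)", issue_text)
    if match is None:
        raise ValueError("not an Ubuntu /etc/issue string: %r" % issue_text)
    return match.group(1)

def ms_repo_list_url(release):
    """Build the prod.list URL (pattern assumed from the 16.04 URL in the Dockerfile)."""
    return "https://packages.microsoft.com/config/ubuntu/%s/prod.list" % release

# The /etc/issue content shown above, as a raw string.
issue = r"Ubuntu 16.04.6 LTS \n \l"
print(ms_repo_list_url(ubuntu_release(issue)))
```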

## Built from [mssql-docker](https://github.com/microsoft/mssql-docker/blob/master/oss-drivers/pyodbc/Dockerfile)

Dockerfile:

FROM docker.repository.cloudera.com/cdsw/engine:8

# MAINTAINER is deprecated in favour of a label
LABEL maintainer="IQVIA"

# apt-get and system utilities
RUN apt-get update && apt-get install -y \
    curl apt-utils apt-transport-https debconf-utils gcc build-essential g++-5 \
    && rm -rf /var/lib/apt/lists/*

# adding custom MS repository
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list

# install SQL Server drivers
RUN apt-get update && ACCEPT_EULA=Y apt-get install -y msodbcsql unixodbc-dev

# install SQL Server tools
RUN apt-get update && ACCEPT_EULA=Y apt-get install -y mssql-tools
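RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
# sourcing ~/.bashrc in a RUN step does not carry over to later layers, so set PATH on the image too
ENV PATH="${PATH}:/opt/mssql-tools/bin"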

# upgrade pip
RUN pip install --upgrade pip

# install the Python SQL Server connector module, pyodbc
RUN pip install pyodbc
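
Once a project runs on the custom engine, a session should be able to query SQL Server through pyodbc. The following is a minimal sketch, not a tested recipe: it assumes the `msodbcsql` package registers its driver as `ODBC Driver 13 for SQL Server` (`pyodbc.drivers()` lists what is actually installed), and the server, database, and credentials are hypothetical placeholders. The helper names `build_conn_str` and `fetch_rows` are made up for illustration.

```python
def build_conn_str(server, database, user, password,
                   driver="ODBC Driver 13 for SQL Server", port=1433):
    """Assemble an ODBC connection string; the driver name is an assumption."""
    parts = [
        ("DRIVER", "{%s}" % driver),
        ("SERVER", "%s,%d" % (server, port)),
        ("DATABASE", database),
        ("UID", user),
        ("PWD", password),
    ]
    return ";".join("%s=%s" % kv for kv in parts)

def fetch_rows(conn_str, sql):
    """Run one query and return all rows; needs pyodbc and a reachable server."""
    import pyodbc  # imported here so the string helper works without the driver
    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()

# Hypothetical values; replace with your own server and credentials.
conn_str = build_conn_str("sqlhost.example.com", "mydb", "myuser", "mypassword")
print(conn_str)
# rows = fetch_rows(conn_str, "SELECT @@VERSION")
```

If the connection fails with a "driver not found" style error, comparing the `driver` argument against the output of `pyodbc.drivers()` inside the session is a reasonable first check.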