Member since
05-30-2018
1322
Posts
715
Kudos Received
148
Solutions
My Accepted Solutions
| Views | Posted |
|---|---|
| 4025 | 08-20-2018 08:26 PM |
| 1930 | 08-15-2018 01:59 PM |
| 2360 | 08-13-2018 02:20 PM |
| 4077 | 07-23-2018 04:37 PM |
| 4993 | 07-19-2018 12:52 PM |
06-28-2016
10:19 PM
5 Kudos
How to get a docker image up and running which encapsulates a PyCharm IDE integrated with spark and pybuilder. The IDE resides in the docker container and is displayed on your laptop/machine. This is to isolate your development environment, which has spark integrated. Why? I am a spark developer and spend significant time trying to build an integrated environment. I am spending way too much time on integration before doing what I get paid to do: develop! An isolated environment which is integrated with spark and a CIT, is easily spun up and down, and is repeatable would accelerate my efficiency.
Prerequisites:

1. Download the latest VirtualBox from here.
2. To run docker containers or build images, a docker machine is required. Download docker machine from here.
3. Download XQuartz to display the IDE on your laptop.
4. View my docker page for information on the docker image here.
5. Clone my PyCharm github repo. This bootstraps sample code I have built into your docker container during launch. For example, I ran git clone in /Users/smanjee/docktest:
    git clone https://github.com/sunileman/pycharm.git

To start this tutorial:

1. Start docker machine in a new terminal. For example, on my laptop the start script is:
    /Applications/Docker/Docker*app/Contents/Resources/Scripts/start.sh
2. Run docker-machine env to check the IP your machine is assigned (informational only).
3. Pull the image:
    docker pull sunileman/pycharm
4. Build the image:
    docker build -t sunileman/pycharm .
5. Open another terminal and start port forwarding:
    socat TCP-LISTEN:6000,reuseaddr,fork UNIX-CLIENT:"$DISPLAY"
6. Get your own IP address (not the docker machine's).
7. Run the image:
    docker run -it -v /tmp/.X11-unix/:/tmp/.X11-unix/ -v ~/docktest/pycharm/PycharmProjects:/root/PycharmProjects -v ~/docktest/pycharm/.Pycharm40:/root/.PyCharm40 -e DISPLAY=XXX.XX.XX.X:0 --rm sunileman/pycharm
   Replace XXX.XX.XX.X with your IP. Replace ~/docktest/pycharm/PycharmProjects and ~/docktest/pycharm/.Pycharm40 with the matching paths inside the pycharm repo you cloned from my github.
8. Click on "I do not have previous versions".
9. Click on OK.
10. Click on OPEN to open the project you mounted to the docker container.
11. Find the PyCharm project to open.
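The `docker run` step has three values that change per machine: your IP address and the two mounted paths. As a minimal sketch (not part of the original article), the command could be assembled in Python so those values are filled in one place. The function name `build_docker_run_command` and the address `192.0.2.10` are hypothetical placeholders; the flags, mount targets, and image name are copied from the command above.

```python
import os
import shlex


def build_docker_run_command(host_ip, repo_dir, image="sunileman/pycharm"):
    """Return the docker run argument list for a given host IP and repo clone.

    host_ip  -- IP of your laptop (not the docker machine), used for DISPLAY
    repo_dir -- path to your clone of the pycharm repo
    """
    repo_dir = os.path.expanduser(repo_dir)
    return [
        "docker", "run", "-it",
        # Share the X11 socket so the IDE window can be displayed
        "-v", "/tmp/.X11-unix/:/tmp/.X11-unix/",
        # Mount the sample project and the PyCharm settings from the clone
        "-v", f"{repo_dir}/PycharmProjects:/root/PycharmProjects",
        "-v", f"{repo_dir}/.Pycharm40:/root/.PyCharm40",
        "-e", f"DISPLAY={host_ip}:0",
        "--rm", image,
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address; substitute your own IP
    cmd = build_docker_run_command("192.0.2.10", "~/docktest/pycharm")
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; paste the output into the terminal once the values look right.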
Now the project has been imported
You now have the project imported into your IDE, which is running within the docker container. To prove the IDE is connected/integrated with spark, simply run the python file and you will see that the spark modules have been imported.
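If you want a quick check that does not depend on the sample project, something along these lines could be run inside the container's IDE. This is a sketch, not the file shipped in the repo: the helper name `missing_modules` is hypothetical, and it assumes the Spark Python bindings are exposed as the `pyspark` package together with its `py4j` dependency.

```python
import importlib.util


def missing_modules(names=("pyspark", "py4j")):
    """Return the names from `names` that cannot be found on the import path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Spark modules not importable:", ", ".join(missing))
    else:
        print("Spark modules are importable from this interpreter")
```

An empty result means the interpreter PyCharm is using can see the Spark modules; anything listed points at a path or interpreter setting to review.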
06-27-2016
01:13 PM
I had no issues using Anaconda as my development python on my production cluster. Just be sure to install it in a separate location and don't overwrite the standard OS install of Python.
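One way to confirm which Python a job is actually using is to compare the running interpreter's path with the Anaconda install location. A minimal sketch, assuming a hypothetical install prefix of `/opt/anaconda` (use whatever separate location you chose); the helper name `interpreter_is_under` is also illustrative:

```python
import os
import sys


def interpreter_is_under(prefix, executable=None):
    """Return True if the interpreter path lives inside the given prefix."""
    executable = os.path.abspath(executable or sys.executable)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([prefix, executable]) == prefix


if __name__ == "__main__":
    ANACONDA_PREFIX = "/opt/anaconda"  # hypothetical install location
    print(sys.executable, interpreter_is_under(ANACONDA_PREFIX))
```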
07-19-2016
03:57 AM
@rdoktorics do you have an example of how to do this? Without a recipe, I am having difficulty understanding how to add the ranger metastore (DB), which was performed via recipe.
07-18-2018
08:51 PM
Just came across this question and wanted to note that the most current NiFi documentation link for LDAP is here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#ldap_login_identity_provider But a good place to start to get an overview of NiFi's new auth model is here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
06-23-2016
04:02 AM
@Saurabh Kumar Hortonworks and Teradata are partners. They have built documentation on the HDP Teradata connector here.
06-30-2016
07:17 PM
1 Kudo
Knox was designed for perimeter security, and having it outside the firewall allows you to lock down your data/control nodes as stated. This approach makes it easy to hide hosts/ports that may change and provides users with one main access pattern. As mentioned in the other reply, your firewall policy needs to account for the hosts/ports used. This is something we have deployed on our edge node along with Hue and other UI services, fronted with a load balancer for high availability.
10-04-2016
09:46 AM
@dnyanesh kulkarnni 1) Can you tell us how much data there is? 2) Do you want to build a POC cluster or a production cluster?
08-17-2018
11:52 AM
@Guven
Guvenal this is a great custom processor. You should create a HOWTO article here about how to use it, and/or one about how to make custom processors.
06-10-2016
03:32 PM
http://kafka.apache.org/documentation.html#configuration