Python interpreter not configured in Zeppelin
Labels: Apache Zeppelin
Created ‎07-02-2018 11:23 AM
I've checked the list of interpreters installed on my Zeppelin and found that Python is not among them. For now, to run Python code I use %spark.pyspark.
I'd like to know whether it's a good idea to use pyspark instead of python, and whether it's recommended to install the python interpreter even though pyspark works fine for my Python code.
Created ‎07-02-2018 08:41 PM
Yes, the pyspark interpreter can be used to run Python. However, the application will automatically have a reference to the Spark libraries. Also note that the pyspark interpreter launches a YARN application, and by default it is configured to run with 2 executors. This means you will see an application master plus 2 containers for the running pyspark interpreter.
If you are not really making any use of Spark and only write code that does not need to run on the cluster, you should perhaps consider installing just the python interpreter.
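For illustration, here is a minimal sketch of what each interpreter binding looks like in a Zeppelin paragraph. The %python binding is only available once the python interpreter is installed; the variable names sc and spark are the ones Zeppelin predefines for the pyspark interpreter:

```
%spark.pyspark
# Runs in the Spark driver of the YARN application Zeppelin launched;
# sc (SparkContext) and spark (SparkSession) are predefined.
print(sc.version)

%python
# Plain local Python; no YARN application is started.
print("hello from the local python interpreter")
```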
HTH
*** If you found that this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
Created ‎07-03-2018 09:44 AM
What if I want to use pandas and Matplotlib? Should I still use pyspark?
Created ‎07-03-2018 11:31 AM
@Yassine Yes, you can use pandas and Matplotlib along with pyspark. For example, you can use the Spark API to read data from the cluster in parallel, process the data, then convert the Spark DataFrame to a pandas DataFrame and use Matplotlib to show the results. There are other interactions, but I think this is the most common one I've seen.
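A minimal sketch of that pattern, assuming it runs inside a %spark.pyspark paragraph where a SparkSession is available. The Spark read/aggregate step is shown in comments (the HDFS path and column names are hypothetical), and a small pre-aggregated pandas DataFrame stands in for the result of toPandas() so the sketch is self-contained:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for headless rendering
import matplotlib.pyplot as plt

# On the cluster, you would aggregate with Spark first and only bring the
# small result back to the driver, e.g. (hypothetical path and columns):
#   sdf = spark.read.parquet("hdfs:///data/events")
#   pdf = sdf.groupBy("category").count().toPandas()
# Stand-in for that aggregated result:
pdf = pd.DataFrame({"category": ["a", "b", "c"], "count": [120, 75, 30]})

# Plot the small pandas DataFrame with Matplotlib
ax = pdf.plot.bar(x="category", y="count", legend=False)
ax.set_ylabel("count")
plt.tight_layout()
plt.savefig("counts.png")

print(pdf["count"].sum())
```

The key design point is that toPandas() collects the whole DataFrame onto the driver, so it should only be called after Spark has reduced the data to something small enough to plot.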
