About Data1701

Data1701 · 04-20-2022

I typically upload CSV files into Cloudera Data Science Workbench, but I wonder whether there is a way to programmatically read a CSV file from a shared server drive while in YARN mode. Using the code below, I get an error. Any tips?

    df = spark.read.format('csv').load('Q:\\project\\data_folder\\file.csv', header=True)

The error I get is:

    IllegalArgumentException: 'java.net.URISyntaxException: Relative path in absolute URI: Q:%5Cproject%5Cdata_folder/file.csv'
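The error message suggests the backslash-separated Windows path is being parsed as a URI. As a minimal sketch, assuming the only problem is the path syntax, the Python standard library can turn such a path into a `file://` URI. The helper name `windows_path_to_file_uri` is hypothetical and not part of Spark or CDSW.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_uri(path: str) -> str:
    """Convert an absolute Windows path into a file:// URI.

    Hypothetical helper for illustration; PureWindowsPath does pure string
    handling, so it works on any OS and never touches the file system.
    """
    return PureWindowsPath(path).as_uri()


uri = windows_path_to_file_uri('Q:\\project\\data_folder\\file.csv')
print(uri)  # file:///Q:/project/data_folder/file.csv
```

This addresses only the URI syntax. In YARN mode the file would still need to be readable at that location from the cluster nodes that load it, and a mapped drive letter such as Q: may not be visible from a CDSW session at all.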

Data1701 · 04-19-2022

I did need to manually add 'cores' to the list, but I got it now. Thanks!

Data1701 · 04-17-2022

I'm very new to Cloudera and Spark. I need to know the specifications of the cluster I'm using so I can better tune my Spark session configuration (executors, cores, RAM, etc.). I'm trying to view these details in Cloudera Manager, but I'm not sure where to look. For the specific cluster I'm using, I want to know how many nodes it has, and how many cores and how much memory each node has. Forgive my naivety, but some guidance would be appreciated!
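To illustrate why the node count, cores per node, and memory per node matter, here is a rough sketch of how those three numbers could feed into a Spark session configuration. It assumes a common rule of thumb (about 5 cores per executor, one core and 1 GB per node left for the OS, roughly 10% of executor memory set aside as overhead, one executor slot left for the YARN application master). The function `suggest_executor_layout` and the example cluster figures are hypothetical, and the right values for a given cluster may differ.

```python
def suggest_executor_layout(nodes, cores_per_node, mem_gb_per_node,
                            cores_per_executor=5, reserved_cores=1,
                            reserved_mem_gb=1, overhead_fraction=0.10):
    """Rule-of-thumb executor sizing (hypothetical helper, not a Spark API)."""
    # Leave some cores on each node for the OS and Hadoop daemons.
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for one executor")

    # Leave one executor slot for the YARN application master.
    total_executors = max(executors_per_node * nodes - 1, 1)

    # Split the remaining node memory evenly, then subtract overhead.
    mem_per_executor = (mem_gb_per_node - reserved_mem_gb) / executors_per_node
    heap_gb = int(mem_per_executor * (1 - overhead_fraction))

    return {
        "spark.executor.instances": total_executors,
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{heap_gb}g",
    }


# Made-up cluster: 10 nodes, 16 cores and 64 GB of RAM per node.
layout = suggest_executor_layout(nodes=10, cores_per_node=16, mem_gb_per_node=64)
print(layout)
```

The returned keys are standard Spark configuration properties, so the dictionary could be passed to the session builder one entry at a time once the real cluster figures are known.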

Last Visited	04-20-2022 02:43 AM

Member Since	04-17-2022 04:49 PM
Posts	3

Cloudera Community

How to read in a csv file from server location?

Re: How do I view cluster specifications?

How do I view cluster specifications?