Member since: 04-17-2022
Posts: 3
Kudos Received: 0
Solutions: 0
04-20-2022 09:18 AM
I typically upload CSV files into Cloudera Data Science Workbench, but I wonder whether there is a way to programmatically read a CSV file from a shared server drive while in YARN mode. Using the code below, I get an error. Any tips?
df = spark.read.format('csv').load('Q:\\project\\data_folder\\file.csv', header=True)
The error I get is:
IllegalArgumentException: 'java.net.URISyntaxException: Relative path in absolute URI: Q:%5Cproject%5Cdata_folder/file.csv'
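A note on the likely cause, with a small sketch. Spark hands the path string to Hadoop's path parser, which appears to treat `Q:` as a URI scheme and the remainder as a relative path, hence "Relative path in absolute URI". In YARN mode the executors also run on cluster nodes, which normally cannot see a Windows mapped drive. Assuming the file is first copied to a location the cluster can reach (HDFS, for example), a guard like the one below fails early with a clearer message. The helper names and the `hdfs:///` path in the usage note are hypothetical, not part of any Cloudera or Spark API.

```python
import re


def looks_like_windows_path(path):
    """Return True for drive-letter paths such as 'Q:\\project\\file.csv'."""
    return re.match(r"^[A-Za-z]:[\\/]", path) is not None


def read_csv_on_cluster(spark, path, header=True):
    """Read a CSV through an existing SparkSession.

    Rejects Windows drive-letter paths up front, because the cluster
    nodes are assumed not to have that drive mounted.
    """
    if looks_like_windows_path(path):
        raise ValueError(
            "Path %r looks like a Windows drive path; copy the file to "
            "storage the cluster can reach (e.g. HDFS) and pass that "
            "location instead." % path
        )
    # DataFrameReader.csv accepts the path plus options such as header.
    return spark.read.csv(path, header=header)
```

Usage, with a placeholder location: `df = read_csv_on_cluster(spark, "hdfs:///user/<your_user>/data_folder/file.csv")`. Whether the shared drive can be mounted inside the session instead depends on how the workbench is set up, so that part is left open here.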
04-19-2022 12:46 PM
I did need to manually add 'cores' to the list, but I got it now. Thanks!
04-17-2022 05:11 PM
I'm very new to Cloudera and Spark. I need to know the specifications of the cluster I'm using so that I can better optimize my Spark session configuration (executors, cores, RAM, etc.). I'm trying to view these details in Cloudera Manager, but I'm not sure where to look. For the specific cluster I'm using, I want to know how many nodes it has, and how many cores and how much memory each node has. Forgive my naivety, but some guidance would be appreciated!
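Besides the Cloudera Manager UI, the same numbers can usually be read from the YARN ResourceManager REST API, whose `/ws/v1/cluster/metrics` endpoint reports `activeNodes`, `totalVirtualCores`, and `totalMB`. The sketch below is a minimal example under some assumptions: the ResourceManager URL is a placeholder, the cluster allows unauthenticated HTTP access (a Kerberos-secured cluster would need extra authentication), and the function names are made up for illustration. The figures are what YARN can allocate, which may be less than the physical hardware.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url):
    """GET <rm_url>/ws/v1/cluster/metrics and return the parsed JSON."""
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_cluster_metrics(payload):
    """Reduce the metrics payload to node, core, and memory totals."""
    m = payload["clusterMetrics"]
    nodes = m["activeNodes"]
    total_gb = m["totalMB"] / 1024
    return {
        "active_nodes": nodes,
        "total_vcores": m["totalVirtualCores"],
        "total_memory_gb": round(total_gb, 1),
        # Averages only; nodes in a real cluster may differ in size.
        "avg_vcores_per_node": m["totalVirtualCores"] / nodes if nodes else None,
        "avg_memory_gb_per_node": round(total_gb / nodes, 1) if nodes else None,
    }
```

Usage, with a placeholder host: `summarize_cluster_metrics(fetch_cluster_metrics("http://<resourcemanager-host>:8088"))`. For per-node detail rather than averages, the `/ws/v1/cluster/nodes` endpoint lists each node separately.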