
Unable to read csv file in R

Hello All,

I am new to the big data environment. I am trying to read a CSV file in R through a Zeppelin notebook. I have enabled the dependencies repository in the interpreter and added the lines below at the start of my notebook.

%dep
z.load("com.databricks:spark-csv_2.10:1.2.0")

Running the paragraph below gives an error:

%spark2.r
a<-read.csv("/sankar/test.csv")

Error in file(file, "rt"): cannot open the connection

This file has 666 permissions. Please let me know how to fix this issue. Thanks in advance.

Regards

Sankar
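
A note on the error above: base R's `read.csv` opens the path on the local filesystem of the machine where the interpreter runs, so "cannot open the connection" often means the path is missing or unreadable there. A minimal sketch for checking that from a shell on the same machine follows. `check_local_file` is a hypothetical helper name, not part of R or Zeppelin, and the check should be run as the OS user that runs the interpreter, which the thread does not state.

```shell
# check_local_file is a hypothetical helper (not part of R or Zeppelin).
# It reports whether a path is a readable regular file on the LOCAL filesystem.
check_local_file() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing"
    elif [ ! -f "$f" ]; then
        echo "not a regular file"
    elif [ ! -r "$f" ]; then
        echo "not readable"
    else
        echo "ok"
    fi
}

# Path taken from the question above.
check_local_file /sankar/test.csv
```

If this reports "missing" even though the file is known to exist somewhere, the file may be stored on a different machine or in HDFS rather than on the local disk.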

30 Replies
Re: Unable to read csv file in R

Thanks, Daniel, for the wonderful doc. I have followed each and every step but am still getting the same error. I am not sure what exactly went wrong. I am running my cluster on Ubuntu, but I believe this is not an issue with the OS.

Re: Unable to read csv file in R

@sysadmin CreditVidya

The steps I have presented in the HCC are for CentOS.

If you have followed them:

a) Have you successfully run the test in the CLI (point 3)?

b) Have you successfully run the test in Zeppelin (point 4b)?

If both succeed, the next things I would check are:

- zeppelin-zeppelin-<zeppelin_node>.log file

- zeppelin-interpreter-spark2-....log file
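
One way to scan those two logs quickly is sketched below. The log directory `/var/log/zeppelin` and the `ZEPPELIN_LOG_DIR` variable are assumptions here (the thread does not say where the logs live), and `show_log_errors` is a hypothetical helper name.

```shell
# show_log_errors is a hypothetical helper: print the last 20 ERROR lines
# found in the zeppelin-*.log files of the given directory.
show_log_errors() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # The extra /dev/null argument makes grep print the file name even
    # when only one log file matches the glob.
    grep -n "ERROR" "$dir"/zeppelin-*.log /dev/null 2>/dev/null | tail -n 20
}

# The default directory is an assumption; adjust it to your installation.
show_log_errors "${ZEPPELIN_LOG_DIR:-/var/log/zeppelin}" || echo "no log directory found"
```

The glob matches both the server log and the interpreter logs, so check the file name at the start of each line to see which interpreter (spark or spark2) produced the message.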

Re: Unable to read csv file in R

Hi Daniel,

Please find the answers below.

For a)

root@hdfs-hadoop-gs1f:/home/smittapally# R -e "print(1+1)"
R version 3.3.3 RC (2017-02-27 r72279) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
  Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> print(1+1)
[1] 2
>
>

root@hdfs-hadoop-gs1f:/home/smittapally#

For b)

b-output.png

Attached zeppelin logs.

zeppelin-zeppelin-hdfs-hadoop-tswv.zip

zeppelin-interpreter-sbirajdar-spark-zeppelin-hdfs.zip

Re: Unable to read csv file in R

Hi @sysadmin CreditVidya

Point 4b) tests more than just the spark2 version. Look at the HCC article and run the remaining tests for %spark2.r.

Re: Unable to read csv file in R

Hi Daniel,

Please find the attached screenshot.

4b.png

Re: Unable to read csv file in R

Now run this:

%spark2.r
a<-read.csv("/tmp/test.csv")
print(a)

Also, show me where the CSV file is located on your OS. Just list it.

Re: Unable to read csv file in R

hdfs@hdfs-hadoop-tswv:~$ hdfs dfs -ls /tmp/updated.csv
-rw-r--r--   3 spark hdfs      56638 2017-06-05 16:25 /tmp/updated.csv
hdfs@hdfs-hadoop-tswv:~$

updated.png
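
The listing above comes from `hdfs dfs -ls`, so it shows the file in HDFS rather than on the local disk. The thread does not confirm the cause of the error, but base R's `read.csv` does not read HDFS paths, so one thing to try is copying the file to the local filesystem first. A minimal sketch follows; `fetch_from_hdfs` is a hypothetical helper name, and the local destination path is an assumption.

```shell
# fetch_from_hdfs is a hypothetical helper: copy one HDFS file to a local path.
fetch_from_hdfs() {
    src="$1"
    dst="$2"
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs client not found on this machine" >&2
        return 1
    fi
    # 'hdfs dfs -test -e' exits 0 only if the path exists in HDFS.
    if ! hdfs dfs -test -e "$src"; then
        echo "not found in HDFS: $src" >&2
        return 1
    fi
    # 'hdfs dfs -get' copies from HDFS to the local filesystem.
    hdfs dfs -get "$src" "$dst"
}

# The source path is from the listing above; the destination is illustrative.
fetch_from_hdfs /tmp/updated.csv /tmp/updated_local.csv || echo "copy skipped"
```

The copy has to land on the machine where the Zeppelin R interpreter runs. After that, `read.csv("/tmp/updated_local.csv")` in a `%spark2.r` paragraph would be reading a local file.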

Re: Unable to read csv file in R

Send me your updated.csv file so I can test it on my side.

Also, attach the Zeppelin interpreter log for spark2. The one you previously attached is for spark.

Which version of Ubuntu are you on?
