Member since: 07-17-2016
Posts: 15
Kudos Received: 2
Solutions: 0
09-12-2016
02:30 PM
I used the public URI of the server to log in and got the error.
09-09-2016
08:57 PM
Actually, I logged in to RStudio using the address of the server: ec2-yy-yyy-yyy-yy.compute-1.amazonaws.com:8787
09-09-2016
03:44 PM
1 Kudo
I am trying to use SparkR in RStudio on HDP 2.4 deployed on an AWS EC2 cluster. I installed R, RStudio, and the other R packages, but after I log in to RStudio and try to start a Spark context, I run into the problem below.

Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
# Create a Spark context and a SQL context
sc <- SparkR::sparkR.init(master = "yarn-client")

16/09/09 11:41:32 Retrying connect to server: ip-xxx-xx-xx-xx.ec2.internal/xxx.xx.xx.xx:8050. Already tried 49 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
Labels: Apache Spark
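For context, port 8050 in the retry message is the port HDP configures for the YARN ResourceManager, so the client is repeatedly failing to reach the ResourceManager host. A quick way to see which host:port YARN clients are configured to contact is to read yarn.resourcemanager.address from yarn-site.xml. A minimal sketch, using a sample snippet so the commands are self-contained (on a real HDP node you would read /etc/hadoop/conf/yarn-site.xml instead):

```shell
# Sample of the relevant yarn-site.xml property (stand-in for the real
# /etc/hadoop/conf/yarn-site.xml on an HDP node).
cat > /tmp/yarn-site-sample.xml <<'EOF'
<property>
  <name>yarn.resourcemanager.address</name>
  <value>ip-xxx-xx-xx-xx.ec2.internal:8050</value>
</property>
EOF

# Pull out the host:port that YARN clients will try to contact.
rm_addr=$(grep -A1 'yarn.resourcemanager.address' /tmp/yarn-site-sample.xml |
          sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "ResourceManager address: $rm_addr"
```

If the value is an internal EC2 hostname, it will only resolve from inside the VPC, which matters when connecting through a public URI.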
08-02-2016
07:02 PM
1 Kudo
I am trying to read data from HDFS on an AWS EC2 cluster (7 nodes) using a Jupyter notebook. I am using HDP 2.4 and my code is below. The table has millions of rows, but the code does not return any rows. "ec2-xx-xxx-xxx-xx.compute-1.amazonaws.com" is the server (the Ambari server).

from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
demography = sqlContext.read.load("hdfs://ec2-xx-xxx-xx-xx.compute-1.amazonaws.com:8020/tmp/FAERS/demography_2012q4_2016q1_duplicates_removed.csv", format="com.databricks.spark.csv", header="true", inferSchema="true")
demography.printSchema()
demography.cache()
print demography.count()
07-21-2016
02:37 PM
I changed to root using "sudo su -", then ran the command "su hdfs", but I got an error message saying that the user "hdfs" is not known.
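An error that the user "hdfs" is not known usually means the HDFS service account simply does not exist on that node (for example, on a host where the HDFS packages were never installed). A minimal sketch for checking whether a service account exists before trying to switch to it — "root" is included only so the check prints a positive result on any machine:

```shell
# Print whether a given account exists in the local user database.
check_user() {
  if id "$1" >/dev/null 2>&1; then
    echo "user $1 exists"
  else
    echo "user $1 not found"
  fi
}

check_user root   # exists on every Linux host
check_user hdfs   # exists only on nodes where the HDFS packages are installed
```

If `id hdfs` succeeds, then from a root shell `su - hdfs` should switch without prompting, since root is not asked for a password.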
07-21-2016
12:16 AM
When I use the command "su hdfs", it asks me for a password; however, I have not set any password.
07-20-2016
07:12 PM
But it says it has used 88.1% of 5.63 GB, whereas when I check the disk size in the terminal, I get the report shown in the attachment.
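One common cause of this mismatch is that the monitoring widget reports a single mount (often just the root volume on an EC2 instance), while the terminal shows every mounted filesystem, including separate EBS or instance-store data volumes. A minimal sketch for pulling out the root filesystem's usage in the same percent-of-total shape the dashboard shows, so the two reports can be compared per mount point:

```shell
# df -h / reports only the root filesystem; NR==2 skips the header line.
# Compare this against the other mounts listed by a plain `df -h`.
df -h / | awk 'NR==2 {print "mount=/ size=" $2 " used=" $5}'
```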