Error while importing log data from webserver to HDFS in CDH 5.5

Contributor

 

My intention is to pull data from a web server into HDFS. I tried Flume, but the data was not being pushed to HDFS, so I wrote the simple Scala program below on CDH 5.5.

Please note that I checked in the Cloudera Manager console that the Spark service is up.

Here is the program I tried running in the Scala console:


scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext

scala>

scala> val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
<console>:14: error: not found: value sc
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
^

scala> dataRDD.collect().foreach(println)
<console>:15: error: not found: value dataRDD
dataRDD.collect().foreach(println)
^

scala>

scala> dataRDD.count()
<console>:15: error: not found: value dataRDD
dataRDD.count()
^

scala>

scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext

scala> val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
<console>:16: error: not found: value sc
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")

 

The exact error is:

16/07/21 23:35:35 ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
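
A note on reading this error: the `not found: value sc` messages appear to follow from it. The shell could not create the SparkContext because user `cloudera` has no write access to `/user/spark/applicationHistory`, which looks like the Spark event log directory, so `sc` was never defined. One way to confirm that diagnosis is to build a context with event logging turned off. This is only a minimal sketch: it assumes Spark is on the classpath (for example inside spark-shell), and the application name and `local[*]` master are placeholder values, not settings from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Placeholder app name and master; on a CDH cluster the master would
// normally come from the cluster configuration instead.
val conf = new SparkConf()
  .setAppName("fetch-web-logs")
  .setMaster("local[*]")
  // With event logging disabled, Spark does not try to write to
  // /user/spark/applicationHistory, which is what the error complains about.
  .set("spark.eventLog.enabled", "false")

val sc = new SparkContext(conf)

// Quick sanity check that the context works.
println(sc.parallelize(1 to 10).count())
```

The same setting can be passed on the command line with `spark-shell --conf spark.eventLog.enabled=false`. If event logs are wanted, the longer-term fix is probably to have an HDFS administrator grant the user write access to that directory.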

Master Collaborator

Here you just ran the Scala shell. You have to use spark-shell to use Spark.

Contributor

I typed spark-shell and got the Scala console.
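
Separately from the SparkContext error, `sc.textFile` resolves paths through Hadoop file systems, and as far as I know the Hadoop version shipped with CDH 5.5 has no handler for `http://` URLs, so the original call would likely fail even with a working `sc`. A possible workaround, sketched below, is to fetch the page on the driver with `scala.io.Source` and then distribute the lines. It assumes `sc` is the context provided by spark-shell, that the payload is small enough to fit in driver memory, and that the HDFS output path (a made-up example) is writable by the current user.

```scala
import scala.io.Source

// Hypothetical target directory; pick one the current user can write to.
val outputPath = "hdfs:///user/cloudera/weblogs/messages"

// Download the content on the driver. Suitable for small payloads only.
val source = Source.fromURL("http://10.3.9.34:9900/messages")
val lines = try source.getLines().toList finally source.close()

// Assumes `sc` is the SparkContext that spark-shell provides.
val dataRDD = sc.parallelize(lines)
println(dataRDD.count())

// Write the lines to HDFS; this fails if the directory already exists.
dataRDD.saveAsTextFile(outputPath)
```

For a continuous feed rather than a one-off pull, a scheduled fetch or a correctly configured Flume agent would probably be a better fit than this snippet.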