Member since: 09-06-2016
Posts: 108
Kudos Received: 36
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2589 | 05-11-2017 07:41 PM |
| | 1228 | 05-06-2017 07:36 AM |
| | 6806 | 05-05-2017 07:00 PM |
| | 2383 | 05-05-2017 06:52 PM |
| | 6501 | 05-02-2017 03:56 PM |
03-29-2017
06:02 PM
I'm using the email alert processor as per demo instructions.
03-29-2017
10:25 AM
Hi, I've created the demo topology from the SAM docs. When deploying, it fails instantly with the following message: An exception with message [/tmp/b7ea53f1-77ca-495f-b9f7-ed28f3097ac1.jar (No such file or directory)] was thrown while processing request.
03-22-2017
10:25 AM
1 Kudo
Hi, if you remove this component from '/usr/hdp/current/oozie-server/oozie.war', you should be able to start the service:
zip -d oozie.war ext-2.2/*
Regards, ward
03-19-2017
08:59 PM
Thx @yvora !
03-18-2017
01:32 PM
When running Spark code in Zeppelin via the Livy interpreter, I only see a few containers allocated in YARN. What settings do I need to change to make sure I leverage the full cluster capacity? I'm using a cluster created by Hortonworks Datacloud on AWS.
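Before changing anything, it can help to see which executor-related settings the Livy session actually picked up. The following is a minimal sketch, not a definitive answer: `ExecutorSettings` and `report` are hypothetical names introduced here, and the four keys are standard Spark properties that commonly influence how many YARN containers a job gets. Which of them matters on a given cluster is an assumption to verify.

```scala
// Hypothetical helper for illustration; not part of Zeppelin, Livy, or Spark.
object ExecutorSettings {
  // Standard Spark properties that commonly affect container allocation.
  val keys: Seq[String] = Seq(
    "spark.dynamicAllocation.enabled",
    "spark.executor.instances",
    "spark.executor.cores",
    "spark.executor.memory"
  )

  // `lookup` returns the configured value for a key, if any.
  // Produces one "key = value" line per property.
  def report(lookup: String => Option[String]): Seq[String] =
    keys.map(k => s"$k = ${lookup(k).getOrElse("<not set>")}")
}
```

In a Scala paragraph of the Livy interpreter, where `sc` is the SparkContext, this could be called as `ExecutorSettings.report(sc.getConf.getOption).foreach(println)`. Properties that print as `<not set>` fall back to Spark's defaults.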
Labels:
- Apache Spark
- Apache Zeppelin
03-17-2017
09:02 PM
Solved by not using too many partitions in parallelize.
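The fix above can be sketched as capping the number of partitions passed to `parallelize`. This is a minimal sketch under stated assumptions, not the benchmark's actual code: `PartitionCap`, `cappedPartitions`, and the `maxPartitions` limit are hypothetical names introduced for illustration.

```scala
// Hypothetical helper for illustration; not part of the original benchmark code.
object PartitionCap {
  /** Partition count to request: at most maxPartitions, and never less than 1. */
  def cappedPartitions(nFiles: Int, maxPartitions: Int): Int = {
    require(maxPartitions > 0, "maxPartitions must be positive")
    math.max(1, math.min(nFiles, maxPartitions))
  }
}
```

With the code from the original question, the call would then look like `sc.parallelize(1 until config.nFiles + 1, PartitionCap.cappedPartitions(config.nFiles, 100))`, where 100 is an arbitrary example limit. Note that `saveAsTextFile` writes one part file per partition, so fewer partitions also means fewer output files.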
03-16-2017
08:18 PM
The exception seems to happen when nFiles is larger, like 1000, not when it's 10.
spark-submit --master yarn-cluster --class com.cisco.dfsio.test.Runner hdfs:///user/$USER/mantl-apps/benchmarking-apps/spark-test-dfsio-with-dependencies.jar --file data/testdfsio-write --nFiles 1000 --fSize 200000 -m write --log data/testdfsio-write/testHdfsIO-WRITE.log
Btw: not my code.
03-16-2017
06:56 PM
When running this small piece of Scala code I get an "org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://xxx.eu-west-1.compute.internal:8020/user/cloudbreak/data/testdfsio-write". Below is the piece of code where `saveAsTextFile` is executed. The directory does not exist before running this script. Why is this FileAlreadyExistsException being raised?
// Create a Range and parallelize it, on nFiles partitions
// The idea is to have a small RDD partitioned on a given number of workers
// then each worker will generate data to write
val a = sc.parallelize(1 until config.nFiles + 1, config.nFiles)
val b = a.map(i => {
  // generate an array of Byte (8 bit), with dimension fSize
  // fill it up with "0" chars, and make it a string for it to be saved as text
  // TODO: this approach can still cause memory problems in the executor if the array is too big.
  val x = Array.ofDim[Byte](fSizeBV.value).map(x => "0").mkString("")
  x
})
// Force computation on the RDD
sc.runJob(b, (iter: Iterator[_]) => {})
// Write output file
val (junk, timeW) = profile {
  b.saveAsTextFile(config.file)
}
Labels:
- Apache Hadoop
- Apache Spark