Member since: 10-02-2015
Posts: 76
Kudos Received: 80
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1984 | 11-15-2016 03:28 PM |
| | 3396 | 11-15-2016 03:15 PM |
| | 2113 | 07-25-2016 08:03 PM |
| | 1742 | 05-11-2016 04:10 PM |
| | 3620 | 02-02-2016 08:09 PM |
12-16-2015
12:51 AM
3 Kudos
@Nikolaos Stanogias This looks like a bug in 2.3.2. Make sure Atlas is started and out of maintenance mode; that should work.
12-11-2015
04:22 AM
import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import sqlContext.implicits._
val df = sqlContext.sql("select mnemonic, average, median, stddev from wellbook.curve_statistics")
val indexer = new StringIndexer()
  .setInputCol("mnemonic")
  .setOutputCol("mnemonicIndex")
  .fit(df)
val indexed = indexer.transform(df)
val encoder = new OneHotEncoder()
  .setInputCol("mnemonicIndex")
  .setOutputCol("mnemonicVec")
val encoded = encoder.transform(indexed)
val data = encoded.select("mnemonicVec", "average", "median", "stddev")
val parsedData = data.map(row => LabeledPoint(row.getDouble(0), row.getAs[Vector](1)))
<console>:297: error: kinds of the type arguments (Vector) do not conform to the expected kinds of the type parameters (type T).
Vector's type parameters do not match type T's expected parameters:
type Vector has one type parameter, but type T has none
val parsedData = data.map(row => LabeledPoint(row.getDouble(0), row.getAs[Vector](1))
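A likely cause of the error above: only `Vectors` (the factory object) is imported, so the bare name `Vector` resolves to Scala's built-in `scala.collection.immutable.Vector[A]`, which takes a type parameter, while `getAs[T]` expects a plain type. Importing MLlib's `Vector` alongside `Vectors` should resolve it. The sketch below assumes Spark 1.x, where `OneHotEncoder` emits an `org.apache.spark.mllib.linalg.Vector`. `rowToPoint` is a hypothetical helper, and treating a Double column such as `average` as the label is only an assumption for illustration. Note also that the column indices must match the `select` order: in the snippet above, index 0 is `mnemonicVec`, a vector, so `getDouble(0)` would probably fail at runtime even once the code compiles.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors} // Vector is the type, Vectors the factory
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.sql.Row

// Hypothetical helper: expects a Double label in column 0
// and an MLlib vector in column 1.
def rowToPoint(row: Row): LabeledPoint =
  LabeledPoint(row.getDouble(0), row.getAs[Vector](1))

// Quick local check with a hand-built row (no cluster needed).
val sample = rowToPoint(Row(2.5, Vectors.dense(1.0, 0.0)))
```

With the DataFrame from the post, this could be applied as `encoded.select("average", "mnemonicVec").rdd.map(rowToPoint)`, again assuming `average` is the intended label.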
Labels: Apache Spark
12-11-2015
02:04 AM
Setting hadoop.proxyuser.hive.groups = * worked for me. Thanks, @Ali Bajwa.
12-09-2015
07:55 PM
@Ali Bajwa Should it be the python directory or the pyspark directory, as in /usr/loca/../python or /usr/hdp/2..../spark/python?
12-09-2015
07:39 PM
@Ofer Mendelevith I think it's an issue with LabeledPoint. It's expecting a LabeledPoint but not getting one.
val examples = MLUtils.loadLabeledData(sc, "hdfs:///user/zeppelin/las_demo/part-00000").cache()
val splits = examples.randomSplit(Array(0.8, 0.2))
val training = splits(0).cache()
val test = splits(1).cache()
val numTraining = training.count()
val numTest = test.count()
println(s"Training: $numTraining, test: $numTest.")
val updater = new SquaredL2Updater()
val model = {
  val algorithm = new LogisticRegressionWithSGD()
  algorithm.optimizer.setNumIterations(200).setStepSize(1.0).setUpdater(updater).setRegParam(0.1)
  algorithm.run(training).clearThreshold()
}
val rprediction = model.predict(test.map(_.features))
val rpredictionAndLabel = rprediction.zip(test.map(_.label))
val rmetrics = new BinaryClassificationMetrics(rpredictionAndLabel)
The error is as follows:
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
examples: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint] = MapPartitionsRDD[52] at map at MLUtils.scala:214
splits: Array[org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint]] = Array(PartitionwiseSampledRDD[53] at randomSplit at <console>:72, PartitionwiseSampledRDD[54] at randomSplit at <console>:72)
training: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint] = PartitionwiseSampledRDD[53] at randomSplit at <console>:72
test: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint] = PartitionwiseSampledRDD[54] at randomSplit at <console>:72
numTraining: Long = 19589
numTest: Long = 4889
Training: 19589, test: 4889.
updater: org.apache.spark.mllib.optimization.SquaredL2Updater = org.apache.spark.mllib.optimization.SquaredL2Updater@3b9284cd
org.apache.spark.SparkException: Input validation failed.
at org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:210)
at org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:190)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:81)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:87)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:89)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:91)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:93)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:95)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:97)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:99)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:101)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:103)
at $iwC$$iwC$$iwC.<init>(<console>:105)
at $iwC$$iwC.<init>(<console>:107)
at $iwC.<init>(<console>:109)
at <init>(<console>:111)
at .<init>(<console>:115)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:655)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:620)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:613)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
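For context on the trace above: as far as I know, in Spark 1.x MLlib `LogisticRegressionWithSGD` validates its input before training and requires every label to be 0 or 1, and `GeneralizedLinearAlgorithm.run` throws `Input validation failed` when that check does not pass. So one plausible cause is that the loaded data carries labels outside {0, 1}. Below is a minimal sketch for checking this; `LabelCheck` and its methods are hypothetical names, not part of MLlib.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of MLlib.
object LabelCheck {
  // True only for the two labels binary logistic regression accepts.
  def isBinaryLabel(label: Double): Boolean =
    label == 0.0 || label == 1.0

  // Collects the distinct label values that are neither 0.0 nor 1.0.
  def nonBinaryLabels(data: RDD[LabeledPoint]): Array[Double] =
    data.map(_.label).distinct().filter(l => !isBinaryLabel(l)).collect()
}
```

Calling `LabelCheck.nonBinaryLabels(training)` on the RDD from the post would return an empty array if the labels are already binary. If it returns other values, the labels may need to be remapped to 0/1, or a regression algorithm may be a better fit for the data.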
12-09-2015
03:20 PM
I had 2 versions of Python installed. Zeppelin is still using the older one.
12-09-2015
02:46 PM
I am able to import a library in the pyspark shell without any problems, but when I try to import the same library in Zeppelin, I get an error: ImportError: No module named xxxxx
Labels: Apache Zeppelin