Member since 12-14-2015
Posts: 6 | Kudos Received: 6 | Solutions: 0
07-07-2017
07:20 AM
That gave me some inspiration, and this worked for me. It exposes a trivial function called "add" that adds 1 to the first column of the DataFrame:
package example
import org.apache.spark.sql.DataFrame

object Hello {
  // Returns a copy of df with 1 added to its first column.
  def add(df: DataFrame): DataFrame = {
    val fc = df.columns(0)
    df.withColumn(fc, df.col(fc) + 1)
  }
}
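# In PySpark, with the compiled Scala class on the driver classpath:
from pyspark.sql import DataFrame
# Pass the underlying Java DataFrame (df._jdf) to the Scala function and wrap the result.
df2 = DataFrame(sc._jvm.example.Hello.add(df._jdf), sqlContext)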
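The same pattern can be wrapped in a small helper so the unwrap/rewrap step is written only once. This is just a sketch: the name wrap_scala_call and its dataframe_cls parameter are made up for illustration and are not part of PySpark. It assumes the Scala function takes one Java DataFrame and returns one.

```python
def wrap_scala_call(scala_fn, df, sql_context, dataframe_cls=None):
    """Call a JVM function that maps a DataFrame to a DataFrame.

    scala_fn      -- py4j handle, e.g. sc._jvm.example.Hello.add
    df            -- a PySpark DataFrame (only its _jdf attribute is used)
    sql_context   -- the SQLContext used to wrap the returned Java DataFrame
    dataframe_cls -- hypothetical override of the wrapper class;
                     defaults to pyspark.sql.DataFrame
    """
    if dataframe_cls is None:
        # Imported lazily so the helper can be defined without PySpark installed.
        from pyspark.sql import DataFrame as dataframe_cls
    # Hand the underlying Java DataFrame to the JVM and wrap the Java result.
    return dataframe_cls(scala_fn(df._jdf), sql_context)
```

Usage would look like df2 = wrap_scala_call(sc._jvm.example.Hello.add, df, sqlContext). Since withColumn is a transformation, nothing should be computed until an action such as df2.take(10) runs, so the lazy evaluation is kept.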
07-04-2017
07:48 AM
Is it possible to call a Scala function from Python? The Scala function takes a DataFrame and returns a DataFrame, ideally with lazy evaluation. Example:
df = sqlContext.read.format("com.databricks.spark.csv")
df2 = scalaFunctionBinding(df)
df2.take(10)
Labels: Apache Spark
12-14-2015
05:52 PM
1 Kudo
Which permissions do I need to set to store a table in the Hive metastore from pyspark?
Labels: Apache Hive