Member since 12-14-2015 | 6 Posts | 6 Kudos Received | 0 Solutions
07-07-2017 07:20 AM
That gave me some inspiration, and the following worked for me. It exposes a trivial function called "add" that adds 1 to the first column of the DataFrame.

The Scala side:

package example

import org.apache.spark.sql.DataFrame

object Hello {
  // Adds 1 to the first column of the given DataFrame and returns the result.
  def add(df: DataFrame): DataFrame = {
    val fc = df.columns(0)
    df.withColumn(fc, df.col(fc) + 1)
  }
}
The PySpark side, which wraps the Java DataFrame returned by the Scala function back into a Python DataFrame:

from pyspark.sql import DataFrame
df2 = DataFrame(sc._jvm.example.Hello.add(df._jdf), sqlContext)
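For reference, here is a minimal sketch of how this can be used end to end, assuming the Scala object above is compiled into a jar (the jar name hello-example.jar and the helper add_one are hypothetical) and that sc and sqlContext are the shell-provided contexts:

# Launch PySpark with the jar on the classpath, e.g.:
#   pyspark --jars hello-example.jar
from pyspark.sql import DataFrame

def add_one(df):
    # Unwrap the PySpark DataFrame to its JVM counterpart, call the Scala
    # helper, then wrap the returned Java DataFrame back into a Python one.
    return DataFrame(sc._jvm.example.Hello.add(df._jdf), sqlContext)

df2 = add_one(df)   # first column of df2 is df's first column + 1
df2.show()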
12-16-2015 01:27 AM | 1 Kudo
@wiljan van ravensteijn If you are running Hive with doAs=false, the metastore service user must have write permission on /app/hive/warehouse and on any new directories/tables you create. If you are using Sandbox 2.3.2, the easiest way is to define a Ranger policy that grants access to the hive user; another option is to run pyspark as the hive user.
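As a rough sketch of those two options on the sandbox (the hive:hadoop ownership below is an assumption, adjust it to your environment; the Ranger policy alternative is configured in the Ranger UI instead):

# Option 1: give the hive user write access to the warehouse path mentioned above
sudo -u hdfs hdfs dfs -chown -R hive:hadoop /app/hive/warehouse

# Option 2: launch pyspark as the hive user
sudo -u hive pyspark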