Error occurred while ingesting data into Elasticsearch through PySpark

I am trying to load a CSV file into Elasticsearch using PySpark. When I run spark-submit, it shows an error:

[Screenshot of the error: 11276-screenshot-from-2017-01-10-155056.png]

This is the sample code I am using:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext
if __name__ == "__main__":
    conf = SparkConf().setAppName("WriteToES")
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)
    es_conf = {
        "es.nodes": "sandbox.hortonworks.com",
        "es.port": "9200",
        "es.nodes.client.only": "true",
        "es.resource": "spark/data",
    }
    es_df_p = sc.textFile("/sample/hello.csv").map(lambda line: line.split(","))
    es_df_pf = es_df_p.groupBy("element_id").count().map(lambda (a, b): ('id', {'element_id': a, 'count': b}))
    es_df_pf.saveAsNewAPIHadoopFile(
        path='-',
        outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=es_conf)

I submit the job with:

spark-submit --master yarn-cluster --jars /root/elasticsearch-spark-20_2.10-5.1.1.jar /root/es_spark_write.py
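
The screenshot is not reproduced here, so the exact error is unknown. One thing worth checking regardless: `sc.textFile(...).map(...)` yields an RDD, and `RDD.groupBy` expects a function rather than a column name, while `RDD.count()` returns a plain integer that has no `.map`. The `lambda (a,b):` form is also Python 2 only. Below is a minimal sketch of the per-record logic in plain Python, assuming `element_id` is the first CSV column and the file has no header row (neither is stated in the post). The helper names `parse_line`, `to_es_pair`, and `count_by_first_column` are hypothetical and only for illustration.

```python
from collections import Counter


def parse_line(line):
    """Split one CSV line into a list of stripped fields."""
    return [field.strip() for field in line.split(",")]


def to_es_pair(item):
    """Turn an (element_id, count) tuple into a (key, document) pair.

    The 'id' key is just a placeholder, mirroring the original code.
    """
    element_id, count = item  # explicit unpacking works on Python 2 and 3
    return ("id", {"element_id": element_id, "count": count})


def count_by_first_column(lines):
    """Count rows per element_id, ASSUMING element_id is column 0."""
    counts = Counter(parse_line(line)[0] for line in lines)
    # Sort only to make the output order deterministic for this example.
    return [to_es_pair(item) for item in sorted(counts.items())]


# Hypothetical sample rows standing in for /sample/hello.csv
sample = ["a,1", "b,2", "a,3"]
pairs = count_by_first_column(sample)
```

In the Spark job, the same shape could probably be reached with `es_df_p.map(lambda f: (f[0], 1)).reduceByKey(lambda a, b: a + b).map(to_es_pair)` before calling `saveAsNewAPIHadoopFile`, again under the assumption that `element_id` is the first field.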