
Deploying oozie spark python job in CDH5.3



Hello everyone, I'm new here. Please tell me how to resolve these problems or where I am making mistakes. I look forward to your replies, thanks a lot.

Question one:

define:

    ${jobTracker} ${nameNode} spark://master:7077 client MySpark wordcount.py /user/impala/wordcount.py

wordcount.py:

    if __name__ == '__main__':
        master = "spark://master:7077"
        app_name = "spark_sql_wc_app"
        inputsrc = 'hdfs://master:8020/output/back/back-portal-loginlog/20150801'
        sc = SparkContext(master, app_name)
        sql_context = SQLContext(sc)
        lines = sc.textFile(input)
        words = lines.map(lambda line: line.split(","))
        pairs = words.map(lambda word: (word, 1))
        word_counts = pairs.reduceByKey(lambda x, y: x + y)
        word_counts.pprint()
        sc.stop()

error:

    Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [2]

Question two:

define:

    ${jobTracker} ${nameNode} /user/impala/hello.sh

error:

    Cannot run program "hello.sh" (in directory "/yarn/nm/usercache/impala/appcache/application_1444880827025_0073/container_1444880827025_0073_01_000002"): error=2, No such file or directory
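Separately from the Oozie exit code, the word-count logic in wordcount.py looks like it has a few issues of its own: the path is stored in `inputsrc` but `sc.textFile(input)` is called, `map` with `split` produces one list per line rather than individual words (and a list cannot serve as a key for `reduceByKey`), and `pprint()` is, as far as I know, a DStream method rather than an RDD one. The sketch below is plain Python, not Spark, and only illustrates the intended flatMap, map, reduceByKey pipeline. The name `count_words` and the sample lines are hypothetical stand-ins, and nothing here is confirmed as the cause of the `SparkMain` failure.

```python
from collections import Counter
from itertools import chain


def count_words(lines, sep=","):
    """Count tokens across all lines (hypothetical helper, not a Spark API)."""
    # flatMap equivalent: split every line and flatten into one token stream.
    # Using map alone would leave each element as a whole list of tokens.
    tokens = chain.from_iterable(line.split(sep) for line in lines)
    # The (word, 1) pairing plus reduceByKey(add) collapses into a Counter here.
    return Counter(tokens)


# Hypothetical stand-in for the lines read from the HDFS input path.
sample_lines = ["a,b,a", "b,c"]
word_counts = count_words(sample_lines)

for word, count in sorted(word_counts.items()):
    print(word, count)
```

In PySpark the corresponding changes would presumably be `sc.textFile(inputsrc)`, `lines.flatMap(lambda line: line.split(","))`, and printing the result of `word_counts.collect()` instead of calling `pprint()`.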