<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: first steps with Spark and Hue in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82642#M85065</link>
    <description>It is just normal Python &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;Just assign it with myvariable = "value", for example:&lt;BR /&gt;app_name = "My gorgeous application"&lt;BR /&gt;conf = SparkConf().setAppName(app_name).setMaster(master)</description>
    <pubDate>Tue, 20 Nov 2018 17:11:34 GMT</pubDate>
    <dc:creator>Tomas79</dc:creator>
    <dc:date>2018-11-20T17:11:34Z</dc:date>
    <item>
      <title>first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82625#M85060</link>
      <description>&lt;P&gt;Hello! I am trying to run a Spark script in Hue:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Name of my script: SparkTest.py&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Content of my script:&lt;/P&gt;&lt;P&gt;text_file = sc.textFile("hdfs://...testFile.txt")&lt;/P&gt;&lt;P&gt;counts = text_file.flatMap(lambda line: line.split(" ")) \&lt;/P&gt;&lt;P&gt;&amp;nbsp; .map(lambda word: (word, 1)) \&lt;/P&gt;&lt;P&gt;&amp;nbsp; .reduceByKey(lambda a, b: a + b)&lt;/P&gt;&lt;P&gt;counts.saveAsTextFile("hdfs://...RESULT.txt")&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Content of my testFile:&lt;/P&gt;&lt;P&gt;Test Test&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Problem:&lt;/P&gt;&lt;P&gt;After running this script, my RESULT.txt file is still empty.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Question:&lt;/P&gt;&lt;P&gt;- Which Spark/Hue configuration do I need to run simple Spark scripts from Hue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I use the Cloudera VM 5.13.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:54:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82625#M85060</guid>
      <dc:creator>Stefan_S</dc:creator>
      <dc:date>2022-09-16T13:54:54Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82626#M85061</link>
      <description>It is hard to tell from this what the problem might be. Can you post the Spark logs? Do you have access to the Spark job UI? Do you get any error messages?</description>
      <pubDate>Tue, 20 Nov 2018 15:52:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82626#M85061</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2018-11-20T15:52:54Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82634#M85062</link>
      <description>&lt;P&gt;I have added SparkContext to my script:&lt;/P&gt;&lt;P&gt;from pyspark import SparkContext, SparkConf&lt;BR /&gt;&lt;BR /&gt;conf = SparkConf().setAppName(appNameTEST).setMaster(master)&lt;BR /&gt;sc = SparkContext(conf=conf)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Most relevant error log in hue:&lt;/P&gt;&lt;P&gt;Traceback (most recent call last):&lt;BR /&gt;&amp;nbsp; File "/yarn/nm/usercache/cloudera/appcache/application_1542723589859_0008/container_1542723589859_0008_01_000002/SparkTest.py", line 3, in &amp;lt;module&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; conf = SparkConf().setAppName(appNameTEST).setMaster(master)&lt;BR /&gt;NameError: name 'appNameTEST' is not defined&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Less relevant error:&lt;/P&gt;&lt;P&gt;2018-11-20 08:07:59,555 [DataStreamer for file /user/cloudera/oozie-oozi/0000003-181120074347071-oozie-oozi-W/spark2-b3ea--spark/action-data.seq] WARN&amp;nbsp; org.apache.hadoop.hdfs.DFSClient&amp;nbsp; - Caught exception&lt;BR /&gt;java.lang.InterruptedException&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at java.lang.Object.wait(Native Method)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at java.lang.Thread.join(Thread.java:1281)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at java.lang.Thread.join(Thread.java:1355)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Nov 2018 16:26:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82634#M85062</guid>
      <dc:creator>Stefan_S</dc:creator>
      <dc:date>2018-11-20T16:26:33Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82636#M85063</link>
      <description>NameError: name 'appNameTEST' is not defined -&amp;gt; that is a Python error rather than a Spark problem: Python does not know any variable with this name, because it is never assigned before it is used.</description>
      <pubDate>Tue, 20 Nov 2018 16:50:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82636#M85063</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2018-11-20T16:50:42Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82640#M85064</link>
      <description>&lt;P&gt;Thank you, you are right. How can I set a variable in Python for Spark?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I thought that "conf = SparkConf().setAppName(appNameTEST).setMaster(master)" would set this variable?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Nov 2018 17:05:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82640#M85064</guid>
      <dc:creator>Stefan_S</dc:creator>
      <dc:date>2018-11-20T17:05:35Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82642#M85065</link>
      <description>It is just normal Python &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;Just assign it with myvariable = "value", for example:&lt;BR /&gt;app_name = "My gorgeous application"&lt;BR /&gt;conf = SparkConf().setAppName(app_name).setMaster(master)</description>
      <pubDate>Tue, 20 Nov 2018 17:11:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82642#M85065</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2018-11-20T17:11:34Z</dc:date>
    </item>
    <item>
      <title>Re: first steps with Spark and Hue</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82693#M85066</link>
      <description>&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have changed it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have also changed my paths, because the path refers to a directory and not to a file. I have also added a / to my path. Now I get the results I expected. I changed setMaster to "local" because it is just a small Cloudera VM without a cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is a simple Spark script which can be executed in Hue via the Spark editor:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;from pyspark import SparkContext, SparkConf&lt;BR /&gt;appNameTEST = "my first working application"&lt;BR /&gt;&lt;BR /&gt;conf = SparkConf().setAppName(appNameTEST).setMaster("local")&lt;BR /&gt;sc = SparkContext(conf=conf)&lt;BR /&gt;&lt;BR /&gt;text_file = sc.textFile("hdfs:///user/hive/warehouse/TEST/FilePath")&lt;BR /&gt;counts = text_file.flatMap(lambda line: line.split(" ")) \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .map(lambda word: (word, 1)) \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .reduceByKey(lambda a, b: a + b)&lt;BR /&gt;counts.saveAsTextFile("hdfs:///user/hive/warehouse/TEST/RESULT")&lt;/P&gt;</description>
      <pubDate>Wed, 21 Nov 2018 09:54:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/first-steps-with-Spark-and-Hue/m-p/82693#M85066</guid>
      <dc:creator>Stefan_S</dc:creator>
      <dc:date>2018-11-21T09:54:52Z</dc:date>
    </item>
  </channel>
</rss>

