<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question create Analytics from http using spark streaming in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43471#M36734</link>
    <description>&lt;P&gt;Hi. My requirement is to create analytics from &lt;A href="http://10.3.9.34:9900/messages" target="_blank" rel="nofollow"&gt;http://10.3.9.34:9900/messages&lt;/A&gt;: that is, pull the data from that URL, put it in the HDFS location /user/cloudera/flume, and then build an analytics report from HDFS using Tableau or the Hue UI. I tried the code below in the Scala console (spark-shell) of CDH 5.5, but I am unable to fetch data from the HTTP link.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.SparkContext
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
dataRDD.collect().foreach(println)
dataRDD.count()
dataRDD.saveAsTextFile("/user/cloudera/flume")&lt;/PRE&gt;&lt;P&gt;I get the error below in the Scala console:&lt;BR /&gt;java.io.IOException: No FileSystem for scheme: http&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2623)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2637)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)&lt;BR /&gt;at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)&lt;/P&gt;</description>
    <pubDate>Tue, 21 Apr 2026 13:50:13 GMT</pubDate>
    <dc:creator>Tdas</dc:creator>
    <dc:date>2026-04-21T13:50:13Z</dc:date>
    <item>
      <title>create Analytics from http using spark streaming</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43471#M36734</link>
      <description>&lt;P&gt;Hi. My requirement is to create analytics from &lt;A href="http://10.3.9.34:9900/messages" target="_blank" rel="nofollow"&gt;http://10.3.9.34:9900/messages&lt;/A&gt;: that is, pull the data from that URL, put it in the HDFS location /user/cloudera/flume, and then build an analytics report from HDFS using Tableau or the Hue UI. I tried the code below in the Scala console (spark-shell) of CDH 5.5, but I am unable to fetch data from the HTTP link.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.SparkContext
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
dataRDD.collect().foreach(println)
dataRDD.count()
dataRDD.saveAsTextFile("/user/cloudera/flume")&lt;/PRE&gt;&lt;P&gt;I get the error below in the Scala console:&lt;BR /&gt;java.io.IOException: No FileSystem for scheme: http&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2623)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2637)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)&lt;BR /&gt;at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 13:50:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43471#M36734</guid>
      <dc:creator>Tdas</dc:creator>
      <dc:date>2026-04-21T13:50:13Z</dc:date>
    </item>
    <item>
      <title>Re: create Analytics from http using spark streaming</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43473#M36735</link>
      <description>&lt;P&gt;You are getting this exception because sc.textFile reads a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and the error shows that no file system is registered for the http scheme in this setup.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Since you want to fetch the data from a URL and save it to HDFS, you can read it on the driver and then parallelize it:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;val data = scala.io.Source.fromURL("http://10.3.9.34:9900/messages").mkString
val list = data.split("\n").filter(_ != "")
val rdds = sc.parallelize(list)
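// Assumption: the post never defines outputDirectory. As a sketch, it is set here
// to the HDFS path from the question; saveAsTextFile expects a directory that does not exist yet.
val outputDirectory = "/user/cloudera/flume"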
rdds.saveAsTextFile(outputDirectory)&lt;/PRE&gt;</description>
      <pubDate>Wed, 03 Aug 2016 07:52:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43473#M36735</guid>
      <dc:creator>_Umesh</dc:creator>
      <dc:date>2016-08-03T07:52:27Z</dc:date>
    </item>
    <item>
      <title>Re: create Analytics from http using spark streaming</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43497#M36736</link>
      <description>&lt;P&gt;First of all, thanks Umesh. That solved half of my problem, and I really appreciate it. The only remaining issue is that the data is not being saved to the HDFS location /user/cloudera/flume because of an illegal character error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;scala&amp;gt; import org.apache.spark.SparkContext&lt;BR /&gt;import org.apache.spark.SparkContext&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;scala&amp;gt; val data = scala.io.Source.fromURL("http://10.3.9.34:9900/messages").mkString&lt;BR /&gt;data: String =&lt;BR /&gt;"Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 kernel: imklog 4.6.2, log source = /proc/kmsg started.&lt;BR /&gt;Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 rsyslogd: [origin software="rsyslogd" swVersion="4.6.2" x-pid="1342" x-info="http://www.rsyslog.com"] (re)start&lt;BR /&gt;Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic&lt;BR /&gt;Aug 1 03:36:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic&lt;BR /&gt;Aug 2 03:16:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic&lt;BR /&gt;Aug 3 03:24:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic&lt;BR /&gt;"&lt;/P&gt;&lt;P&gt;scala&amp;gt; val list = data.split("\n").filter(_ != "")&lt;BR /&gt;list: Array[String] = Array(Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 kernel: imklog 4.6.2, log source = /proc/kmsg started., Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 rsyslogd: [origin software="rsyslogd" swVersion="4.6.2" x-pid="1342" x-info="http://www.rsyslog.com"] (re)start, Jul 31 03:38:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic, Aug 1 03:36:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic, Aug 2 03:16:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic, Aug 3 03:24:01 MSAT-T8360-62-RHEL64-24-103934 rhsmd: This system is registered to RHN Classic)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;scala&amp;gt; val rdds = sc.parallelize(list)&lt;BR /&gt;rdds: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at &amp;lt;console&amp;gt;:26&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;scala&amp;gt; rdds.saveAsTextFile(“/user/cloudera/flume”)&lt;BR /&gt;&amp;lt;console&amp;gt;:1: error: illegal character '\u201c'&lt;BR /&gt;rdds.saveAsTextFile(“/user/cloudera/flume”)&lt;BR /&gt;^&lt;BR /&gt;&amp;lt;console&amp;gt;:1: error: illegal character '\u201d'&lt;BR /&gt;rdds.saveAsTextFile(“/user/cloudera/flume”)&lt;BR /&gt;^&lt;/P&gt;&lt;P&gt;scala&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you please help?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Aug 2016 17:24:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43497#M36736</guid>
      <dc:creator>Tdas</dc:creator>
      <dc:date>2016-08-03T17:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: create Analytics from http using spark streaming</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43499#M36737</link>
      <description>&lt;P&gt;Awesome, here is the working code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;import org.apache.spark.SparkContext&lt;BR /&gt;val data = scala.io.Source.fromURL("http://10.3.9.34:9900/messages").mkString&lt;BR /&gt;val list = data.split("\n").filter(_ != "")&lt;BR /&gt;val rdds = sc.parallelize(list)&lt;BR /&gt;rdds.saveAsTextFile("/user/cloudera/spark/fromsource")&lt;/P&gt;</description>
      <pubDate>Wed, 03 Aug 2016 18:43:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/43499#M36737</guid>
      <dc:creator>Tdas</dc:creator>
      <dc:date>2016-08-03T18:43:21Z</dc:date>
    </item>
    <item>
      <title>Compatibility Issues between Tableau 10.3 and Cloudera Hive 2.0</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/80270#M36738</link>
      <description>&lt;P&gt;Hello Experts,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We are upgrading our Cloudera Hive from 1.3 to 2.0. Could you please let us know if there are any known issues related to this upgrade? I searched the Tableau and Cloudera communities but didn't find any.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in Advance!!!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Muthu Venkatesh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Sep 2018 14:59:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/create-Analytics-from-http-usng-spark-streaming/m-p/80270#M36738</guid>
      <dc:creator>MuthuVenkatesh</dc:creator>
      <dc:date>2018-09-25T14:59:18Z</dc:date>
    </item>
  </channel>
</rss>

