<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Best possible way to connect Cassandra cluster to Hadoop in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231534#M61196</link>
    <description>&lt;P style="margin-left: 20px;"&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12636/ssahi.html" nodeid="12636"&gt;@Sonu Sahi&lt;/A&gt;,&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Can you briefly elaborate on the connectivity between the Hadoop cluster and the Cassandra cluster, especially when they are in different subnets? Which ports and nodes need access?&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Thanks,&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Padma.&lt;/P&gt;</description>
    <pubDate>Fri, 19 May 2017 04:07:55 GMT</pubDate>
    <dc:creator>pmj</dc:creator>
    <dc:date>2017-05-19T04:07:55Z</dc:date>
    <item>
      <title>Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231530#M61192</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Can I get some expert advice on the best ways to import Cassandra tables into a Hadoop cluster? Also, which ports should be open on the Hadoop side for the connection?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 01:15:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231530#M61192</guid>
      <dc:creator>pmj</dc:creator>
      <dc:date>2017-05-16T01:15:18Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231531#M61193</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/14451/pjalleda.html" nodeid="14451"&gt;@PJ&lt;/A&gt;, the easiest and least intrusive way is to use Hortonworks DataFlow (powered by Apache NiFi) to quickly build a data flow that queries Cassandra and sends the results to HDFS. HDF/NiFi includes Cassandra processors that make the integration simple. Take a look at this article about ingesting data into Hadoop from an RDBMS; in your case you would use the QueryCassandra processor instead: &lt;A href="https://community.hortonworks.com/articles/87686/rdbms-to-hive-using-nifi-small-medium-tables.html" target="_blank"&gt;https://community.hortonworks.com/articles/87686/rdbms-to-hive-using-nifi-small-medium-tables.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-cassandra-nar/1.2.0/org.apache.nifi.processors.cassandra.QueryCassandra/" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-cassandra-nar/1.2.0/org.apache.nifi.processors.cassandra.QueryCassandra/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As always, if you find this post useful, don't forget to accept the answer.&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2017 01:53:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231531#M61193</guid>
      <dc:creator>ssahi</dc:creator>
      <dc:date>2017-05-17T01:53:19Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231532#M61194</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12636/ssahi.html" nodeid="12636"&gt;@Sonu Sahi&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Thanks for your reply. What about using a Sqoop import to bring Cassandra tables into HDFS from a Hadoop client?&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2017 04:27:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231532#M61194</guid>
      <dc:creator>pmj</dc:creator>
      <dc:date>2017-05-17T04:27:41Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231533#M61195</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/14451/pjalleda.html" nodeid="14451"&gt;@PJ&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;If you want to use Sqoop instead of HDF/NiFi to import tables, you would need a suitable JDBC driver for Cassandra. I'm not an expert on it, but I think DataStax provides one for their Enterprise software. I've seen quite a few reports of Sqoop not working very well with Cassandra without that JDBC driver, so I think HDF/NiFi would be the better option.&lt;/P&gt;</description>
      <pubDate>Wed, 17 May 2017 05:04:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231533#M61195</guid>
      <dc:creator>ssahi</dc:creator>
      <dc:date>2017-05-17T05:04:20Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231534#M61196</link>
      <description>&lt;P style="margin-left: 20px;"&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12636/ssahi.html" nodeid="12636"&gt;@Sonu Sahi&lt;/A&gt;,&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Can you briefly elaborate on the connectivity between the Hadoop cluster and the Cassandra cluster, especially when they are in different subnets? Which ports and nodes need access?&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Thanks,&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Padma.&lt;/P&gt;</description>
      <pubDate>Fri, 19 May 2017 04:07:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231534#M61196</guid>
      <dc:creator>pmj</dc:creator>
      <dc:date>2017-05-19T04:07:55Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231535#M61197</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/14451/pjalleda.html" nodeid="14451"&gt;@PJ&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Obviously, I can't speak to the network setup specifics in your environment; that should come from the Hadoop and Cassandra admins. I think the default Cassandra port is 9042, but you can check that with your admin team. If you are using HDF/NiFi, you would specify that port in the QueryCassandra processor. The NiFi nodes will require access over that port to the Cassandra environment, and they will also require access to each node in the Hadoop cluster. If you are using Sqoop, connectivity must be open between the Cassandra environment and each node in the Hadoop cluster on the JDBC port that Cassandra in your environment is configured to use (Sqoop jobs can be initiated from the client node, but the connections are actually opened from the worker nodes in the cluster).&lt;/P&gt;&lt;P&gt;&lt;A href="https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html"&gt;https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/66961/how-sqoop-internally-works.html"&gt;https://community.hortonworks.com/questions/66961/how-sqoop-internally-works.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 19 May 2017 04:43:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231535#M61197</guid>
      <dc:creator>ssahi</dc:creator>
      <dc:date>2017-05-19T04:43:24Z</dc:date>
    </item>
    <item>
      <title>Re: Best possible way to connect Cassandra cluster to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231536#M61198</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12636/ssahi.html" nodeid="12636"&gt;@Sonu Sahi&lt;/A&gt;, I've added NiFi as a service in my HDP 2.6, but I'm having some difficulty connecting it to Cassandra, which is installed on my Linux host. I'm wondering whether I have to install Cassandra in my sandbox too.&lt;/P&gt;&lt;P&gt;Can you please take a look here for more details: &lt;A href="https://community.hortonworks.com/questions/103622/how-to-use-querycassandra.html" target="_blank"&gt;https://community.hortonworks.com/questions/103622/how-to-use-querycassandra.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 20 May 2017 04:33:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-posssible-way-to-connect-Cassandra-cluster-to-Hadoop/m-p/231536#M61198</guid>
      <dc:creator>adib_elaraki</dc:creator>
      <dc:date>2017-05-20T04:33:30Z</dc:date>
    </item>
  </channel>
</rss>

