<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: What firewall ports should I open for distcp between clusters? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23920#M4523</link>
    <description>&lt;P&gt;Sorry, it still isn't clear what the source and destination on the ACL would be.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let's say we have clusters A and B.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cluster A's datanodes are datanode-A-[1,10] and its namenode&amp;nbsp;is&amp;nbsp;namenode-A-1. Cluster B's &lt;SPAN&gt;datanodes are datanode-B-[1,10] and its namenode is namenode-B-1.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1) Do I initiate "distcp" on a host in Cluster A or B?&lt;/P&gt;&lt;P&gt;2) Do ports need to be opened A-&amp;gt;B or B-&amp;gt;A?&lt;/P&gt;&lt;P&gt;3) If it is A-&amp;gt;B, which hosts on A need access to which hosts on B?&lt;/P&gt;&lt;P&gt;4) If it is B-&amp;gt;A, which hosts on B need access to which hosts on A?&lt;/P&gt;</description>
    <pubDate>Wed, 21 Jan 2015 22:27:07 GMT</pubDate>
    <dc:creator>siddhartha.jain-1190932798</dc:creator>
    <dc:date>2015-01-21T22:27:07Z</dc:date>
    <item>
      <title>What firewall ports should I open for distcp between clusters?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23492#M4520</link>
      <description>&lt;P&gt;I have two clusters behind a firewall and I would like to run distcp to copy data from one cluster to the other. What ports should I open in the firewall for this&amp;nbsp;communication? For example, I know I need 50070 to the NameNode, but what other ports are required?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jan 2015 00:44:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23492#M4520</guid>
      <dc:creator>IT.Services</dc:creator>
      <dc:date>2015-01-09T00:44:39Z</dc:date>
    </item>
    <item>
      <title>Re: What firewall ports should I open for distcp between clusters?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23810#M4521</link>
      <description>&lt;P&gt;I have a case open with Cloudera support to get an answer.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jan 2015 20:22:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23810#M4521</guid>
      <dc:creator>siddhartha.jain-1190932798</dc:creator>
      <dc:date>2015-01-16T20:22:50Z</dc:date>
    </item>
    <item>
      <title>Re: What firewall ports should I open for distcp between clusters?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23837#M4522</link>
      <description>Thanks for logging a case. Just for completeness, here's the answer:&lt;BR /&gt;&lt;BR /&gt;Datanode: 1004 (with Kerberos), 50010 (without Kerberos), 50020 (always)&lt;BR /&gt;Namenode: 8020&lt;BR /&gt;&lt;BR /&gt;The list of ports is documented here for future reference:&lt;BR /&gt;&lt;A target="_blank" href="http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_ig_ports.html"&gt;http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_ig_ports.html&lt;/A&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 19 Jan 2015 03:37:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23837#M4522</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2015-01-19T03:37:50Z</dc:date>
    </item>
    <item>
      <title>Re: What firewall ports should I open for distcp between clusters?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23920#M4523</link>
      <description>&lt;P&gt;Sorry, it still isn't clear what the source and destination on the ACL would be.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let's say we have clusters A and B.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cluster A's datanodes are datanode-A-[1,10] and its namenode&amp;nbsp;is&amp;nbsp;namenode-A-1. Cluster B's &lt;SPAN&gt;datanodes are datanode-B-[1,10] and its namenode is namenode-B-1.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1) Do I initiate "distcp" on a host in Cluster A or B?&lt;/P&gt;&lt;P&gt;2) Do ports need to be opened A-&amp;gt;B or B-&amp;gt;A?&lt;/P&gt;&lt;P&gt;3) If it is A-&amp;gt;B, which hosts on A need access to which hosts on B?&lt;/P&gt;&lt;P&gt;4) If it is B-&amp;gt;A, which hosts on B need access to which hosts on A?&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jan 2015 22:27:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/23920#M4523</guid>
      <dc:creator>siddhartha.jain-1190932798</dc:creator>
      <dc:date>2015-01-21T22:27:07Z</dc:date>
    </item>
    <item>
      <title>Re: What firewall ports should I open for distcp between clusters?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/63282#M4524</link>
      <description>&lt;P&gt;The simple answer is to open the ports in both directions on all hosts. &amp;nbsp;For instance:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;On each node in cluster A: &amp;nbsp;allow connectivity to port 1004 (or 50010 without Kerberos) and port 50020 on each datanode in cluster B, as well as port 8020 on the namenodes in cluster B.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;On each node in cluster B:&amp;nbsp;&lt;SPAN&gt;allow&lt;/SPAN&gt;&lt;SPAN&gt; connectivity to port 1004 (or 50010 without Kerberos) and port 50020 on each datanode in cluster A, as well as port 8020 on the namenodes in cluster A.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, you are right: where distcp is executed determines the source and destination of the connections. &amp;nbsp;Executing distcp on cluster A causes a MapReduce job to run on cluster A. &amp;nbsp;Each datanode host in cluster A may run a task that connects to the namenode(s) on cluster B for block locations and then to the datanodes on cluster B for the transfer. &amp;nbsp;I'm not sure whether the node on which distcp is executed needs access as well, so I generally run distcp on one of the datanodes.&lt;/SPAN&gt;&lt;/P&gt;</description>
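      <!--
      A minimal sketch, not taken from the thread, that enumerates the ACL
      entries implied by the reply above. Assumptions: datanode-A-[1,10] is read
      as hosts 1 through 10, each cluster has a single namenode, the ports are
      the ones quoted in this thread (check them against your own configuration),
      and the helper names cluster_hosts and acl_rules are hypothetical.

```python
# Ports quoted in the thread; confirm them against your cluster's settings.
NAMENODE_RPC = 8020
DATANODE_IPC = 50020
DATANODE_DATA_KERBEROS = 1004
DATANODE_DATA_PLAIN = 50010


def cluster_hosts(name, datanode_count=10):
    """Return (namenodes, datanodes) using the thread's example host names."""
    namenodes = [f"namenode-{name}-1"]
    datanodes = [f"datanode-{name}-{i}" for i in range(1, datanode_count + 1)]
    return namenodes, datanodes


def acl_rules(src_cluster, dst_cluster, kerberos):
    """List (source_host, destination_host, port) tuples for one direction."""
    src_namenodes, src_datanodes = cluster_hosts(src_cluster)
    dst_namenodes, dst_datanodes = cluster_hosts(dst_cluster)
    # "Each node" in the source cluster: namenode plus all datanodes.
    sources = src_namenodes + src_datanodes
    data_port = DATANODE_DATA_KERBEROS if kerberos else DATANODE_DATA_PLAIN
    rules = []
    for src in sources:
        # Namenode RPC, used by tasks to look up block locations.
        rules.extend((src, nn, NAMENODE_RPC) for nn in dst_namenodes)
        # Datanode data-transfer and IPC ports, used for the copy itself.
        for dn in dst_datanodes:
            rules.append((src, dn, data_port))
            rules.append((src, dn, DATANODE_IPC))
    return rules


# The "simple answer": open both directions.
rules = acl_rules("A", "B", kerberos=True) + acl_rules("B", "A", kerberos=True)
for src, dst, port in rules[:3]:
    print(f"allow tcp from {src} to {dst} port {port}")
print(len(rules), "rules in total")
```

      If distcp is only ever launched from cluster A, the single direction
      acl_rules("A", "B", ...) may be enough, in line with the reply's second
      paragraph; that narrowing is an inference and should be tested before
      being relied on.
      -->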
      <pubDate>Wed, 03 Jan 2018 22:04:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-firewall-ports-should-I-open-for-distcp-between/m-p/63282#M4524</guid>
      <dc:creator>JimAcxiom</dc:creator>
      <dc:date>2018-01-03T22:04:28Z</dc:date>
    </item>
  </channel>
</rss>

