<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Reading files from s3 bucket sub folders in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208717#M84175</link>
    <description>&lt;P&gt;Hi Aditya,&lt;/P&gt;&lt;P&gt;Thanks a lot for your help. Is it possible to do this in Scala? I don't have any knowledge of Python.&lt;/P&gt;</description>
    <pubDate>Tue, 09 Oct 2018 20:04:04 GMT</pubDate>
    <dc:creator>klprathyusha</dc:creator>
    <dc:date>2018-10-09T20:04:04Z</dc:date>
    <item>
      <title>Reading files from s3 bucket sub folders</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208715#M84173</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am trying to read files from an S3 bucket that contains many subdirectories. At the moment I am passing the physical path to read the files. How can I read the files without hard-coded values?&lt;/P&gt;&lt;P&gt;File path: S3 bucket name/Folder/1005/SoB/20180722_zpsx3Gcc7J2MlNnViVp61/JPR_DM2_ORG/ *.gz files&lt;/P&gt;&lt;P&gt;The "S3 bucket name/Folder/" part of the path is fixed, and the client ID (1005) has to be passed as a parameter.&lt;/P&gt;&lt;P&gt;Under the SoB folder we have month-wise folders, and I need only the latest two months of data.&lt;/P&gt;&lt;P&gt;Please help me read the data without hard-coding the path.&lt;/P&gt;&lt;P&gt;Many thanks for your help.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 10:52:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208715#M84173</guid>
      <dc:creator>klprathyusha</dc:creator>
      <dc:date>2018-10-09T10:52:15Z</dc:date>
    </item>
    <item>
      <title>Re: Reading files from s3 bucket sub folders</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208716#M84174</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/97368/klprathyusha.html" nodeid="97368"&gt;@Lakshmi Prathyusha&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;You can write a simple Python snippet like the one below to generate the subfolder paths for the last two months. I have put a print statement in the code, but you can replace it with a subprocess call to run the command.&lt;/P&gt;&lt;PRE&gt;from datetime import date, timedelta&lt;BR /&gt;from dateutil.relativedelta import relativedelta&lt;BR /&gt;&lt;BR /&gt;today = date.today()&lt;BR /&gt;two_months_back = today - relativedelta(months=2)&lt;BR /&gt;&lt;BR /&gt;# Number of days between the start of the window and today&lt;BR /&gt;delta = today - two_months_back&lt;BR /&gt;&lt;BR /&gt;# Print one listing command per day, with the date formatted as YYYYMMDD&lt;BR /&gt;for i in range(delta.days + 1):&lt;BR /&gt;    dt = str(two_months_back + timedelta(i)).replace("-", "")&lt;BR /&gt;    print("hdfs dfs -ls s3a://bucket/Folder/1005/SoB/%s" % dt)&lt;/PRE&gt;&lt;P&gt;-Aditya&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 16:52:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208716#M84174</guid>
      <dc:creator>asirna</dc:creator>
      <dc:date>2018-10-09T16:52:30Z</dc:date>
    </item>
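One way to take the date logic from the reply above a step further, as a rough sketch and not a tested solution: filter the real folder names instead of printing one command per day. It assumes the folder names under SoB start with a YYYYMMDD stamp (as in 20180722_zpsx3Gcc7J2MlNnViVp61) and that the list of names was already obtained some other way, for example from the output of hdfs dfs -ls. The helper names shift_months and recent_paths, the base parameter, and its default s3a://bucket/Folder are made up for illustration. The month arithmetic uses only the standard library, as a stand-in for dateutil.

```python
import calendar
from datetime import date


def shift_months(day, months):
    """Return the date `months` months before `day`, clamped to the month's last day."""
    index = day.year * 12 + (day.month - 1) - months
    year, month = index // 12, index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def recent_paths(folder_names, client_id, today, base="s3a://bucket/Folder"):
    """Build *.gz glob paths for folders stamped within the last two months.

    folder_names: names found under SoB, assumed to begin with YYYYMMDD.
    client_id:    passed in as a parameter instead of being hard-coded.
    base:         hypothetical fixed prefix (bucket plus top-level folder).
    """
    start = shift_months(today, 2).strftime("%Y%m%d")
    end = today.strftime("%Y%m%d")
    paths = []
    for name in sorted(folder_names):
        stamp = name[:8]
        # Skip anything without a date stamp or outside the two-month window.
        if stamp.isdigit() and stamp >= start and end >= stamp:
            paths.append(f"{base}/{client_id}/SoB/{name}/JPR_DM2_ORG/*.gz")
    return paths
```

Calling recent_paths(names, 1005, date.today()) returns a list of glob paths, one per matching folder, that can be handed to whatever reads the files.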
    <item>
      <title>Re: Reading files from s3 bucket sub folders</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208717#M84175</link>
      <description>&lt;P&gt;Hi Aditya,&lt;/P&gt;&lt;P&gt;Thanks a lot for your help. Is it possible to do this in Scala? I don't have any knowledge of Python.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 20:04:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208717#M84175</guid>
      <dc:creator>klprathyusha</dc:creator>
      <dc:date>2018-10-09T20:04:04Z</dc:date>
    </item>
    <item>
      <title>Re: Reading files from s3 bucket sub folders</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208718#M84176</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/97368/klprathyusha.html" nodeid="97368"&gt;@Lakshmi Prathyusha&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;I'm not sure how to do this in Scala. I expect Scala has similar date-time functions, so you should be able to apply the same logic there.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 21:20:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-files-from-s3-bucket-sub-folders/m-p/208718#M84176</guid>
      <dc:creator>asirna</dc:creator>
      <dc:date>2018-10-09T21:20:52Z</dc:date>
    </item>
  </channel>
</rss>

