<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: what is a structure data and unstructured data in more precise way in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154095#M36225</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12156/himanshurawat.html" nodeid="12156"&gt;@Himanshu  Rawat&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Welcome to HCC!&lt;/P&gt;&lt;P&gt;Whether we class data as structured or unstructured is
related to its degree of organization. For example, consider the content and
metadata of email. &lt;/P&gt;&lt;P&gt;The metadata associated with the emails I have sent would be
structured. It needs to be very organized so the email servers know the sender,
recipient(s), CC, BCC, time sent/received, etc. For example, the time received can
easily be compared to the time on other emails. I could easily sort my emails
based on time and find the most recent or something from a particular date.&lt;/P&gt;&lt;P&gt;The content or body, on the other hand, would be considered
unstructured. I could put anything in there. How would I organize emails if I
only considered the content? By number of words? Spaces? Positivity of the message? What
would it mean?&lt;/P&gt;&lt;P&gt;Hope
that helps.&lt;/P&gt;</description>
    <pubDate>Thu, 28 Jul 2016 16:18:44 GMT</pubDate>
    <dc:creator>scarroll</dc:creator>
    <dc:date>2016-07-28T16:18:44Z</dc:date>
    <item>
      <title>what is a structure data and unstructured data in more precise way</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154094#M36224</link>
      <description>&lt;P&gt;Sorry for the silly question, but I am new to Hive and the big data world. Can anyone explain, with a neat example, what is considered structured and what is considered unstructured data, compared to an RDBMS?&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 14:51:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154094#M36224</guid>
      <dc:creator>himanshu_rawat</dc:creator>
      <dc:date>2016-07-28T14:51:16Z</dc:date>
    </item>
    <item>
      <title>Re: what is a structure data and unstructured data in more precise way</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154095#M36225</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12156/himanshurawat.html" nodeid="12156"&gt;@Himanshu  Rawat&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Welcome to HCC!&lt;/P&gt;&lt;P&gt;Whether we class data as structured or unstructured is
related to its degree of organization. For example, consider the content and
metadata of email. &lt;/P&gt;&lt;P&gt;The metadata associated with the emails I have sent would be
structured. It needs to be very organized so the email servers know the sender,
recipient(s), CC, BCC, time sent/received, etc. For example, the time received can
easily be compared to the time on other emails. I could easily sort my emails
based on time and find the most recent or something from a particular date.&lt;/P&gt;&lt;P&gt;The content or body, on the other hand, would be considered
unstructured. I could put anything in there. How would I organize emails if I
only considered the content? By number of words? Spaces? Positivity of the message? What
would it mean?&lt;/P&gt;&lt;P&gt;Hope
that helps.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 16:18:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154095#M36225</guid>
      <dc:creator>scarroll</dc:creator>
      <dc:date>2016-07-28T16:18:44Z</dc:date>
    </item>
    <item>
      <title>Re: what is a structure data and unstructured data in more precise way</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154096#M36226</link>
      <description>&lt;P&gt;I agree with your answer @&lt;A href="https://community.hortonworks.com/users/10930/scarroll.html"&gt;Carroll&lt;/A&gt;, but it raises one more question: before big data came into the picture, how were Facebook and other media companies processing big data and unstructured data with an RDBMS?&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 16:28:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154096#M36226</guid>
      <dc:creator>himanshu_rawat</dc:creator>
      <dc:date>2016-07-28T16:28:18Z</dc:date>
    </item>
    <item>
      <title>Re: what is a structure data and unstructured data in more precise way</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154097#M36227</link>
      <description>&lt;P&gt;There were (and still are) a number of methods, including:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Throw data away&lt;UL&gt;&lt;LI&gt;Down Sample - Decide what you think is important up front and throw the rest away&lt;/LI&gt;&lt;LI&gt;Age Off - Periodically delete old data&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Warehouse - Write old data to tapes and delete it from the disks&lt;/LI&gt;&lt;LI&gt;Buy specialised hardware - Very large, expensive, dedicated database machines which don't scale&lt;/LI&gt;&lt;LI&gt;Don't use a traditional database - Keep everything in files and distribute them manually across a cluster&lt;/LI&gt;&lt;LI&gt;Traditional database horizontal scaling - I have never done it, but I have heard it's difficult&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Apparently, Facebook still uses MySQL "with a complex sharding and caching strategy" - &lt;A href="https://gigaom.com/2011/12/06/facebook-shares-some-secrets-on-making-mysql-scale/"&gt;Gigaom&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 17:06:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154097#M36227</guid>
      <dc:creator>scarroll</dc:creator>
      <dc:date>2016-07-28T17:06:30Z</dc:date>
    </item>
    <item>
      <title>Re: what is a structure data and unstructured data in more precise way</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154098#M36228</link>
      <description>&lt;P&gt;Thanks Carroll&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 19:30:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-is-a-structure-data-and-unstructured-data-in-more/m-p/154098#M36228</guid>
      <dc:creator>himanshu_rawat</dc:creator>
      <dc:date>2016-07-28T19:30:45Z</dc:date>
    </item>
  </channel>
</rss>

