Created 07-28-2016 07:51 AM
Sorry for the silly question, but I am new to Hive and the big data world. Can anyone explain, with a neat example, what is considered structured data and what is considered unstructured data, compared to an RDBMS?
Created 07-28-2016 09:18 AM
Hi @Himanshu Rawat,
Welcome to HCC!
Whether we class data as structured or unstructured is related to its degree of organization. For example, consider the content and metadata of email.
The metadata associated with the emails I have sent would be structured. It needs to be very organized so the email servers know the sender, recipient(s), CC, BCC, time sent/received, etc. For example, the time received can easily be compared to the time on other emails. I could easily sort my emails based on time and find the most recent or something from a particular date.
The content or body, on the other hand, would be considered unstructured. I could put anything in there. How would I organize emails if I only considered the content? By number of words? Spaces? Positivity of the message? What would any of those mean?
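As a rough sketch of the same idea in Python (the `Email` class, its field names, and the sample messages are made up purely for illustration): the metadata fields have fixed names and types, so sorting or filtering on them is trivial, while the body is just free text and you have to pick some derived measure before you can organize by it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Email:
    # Structured metadata: fixed fields with known types (hypothetical names).
    sender: str
    recipient: str
    received: datetime
    # Unstructured content: free text with no predefined schema.
    body: str


emails = [
    Email("a@example.com", "b@example.com",
          datetime(2016, 7, 28, 9, 18), "Welcome! Hope that helps."),
    Email("b@example.com", "a@example.com",
          datetime(2016, 7, 28, 7, 51),
          "Sorry for the silly question, I am new to this."),
    Email("b@example.com", "a@example.com",
          datetime(2016, 7, 28, 12, 30), "Thanks"),
]

# Structured: the field can be compared directly, much like
# ORDER BY or WHERE on a typed column in an RDBMS.
most_recent = max(emails, key=lambda e: e.received)
from_b = [e for e in emails if e.sender == "b@example.com"]

# Unstructured: nothing in the body is directly comparable, so we must
# first choose a derived measure. Word count is one arbitrary choice.
word_counts = [len(e.body.split()) for e in emails]
```

Word count here is only one of many possible measures; the point is that the choice is yours, whereas the metadata fields come with an obvious ordering built in.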
Hope that helps
Created 07-28-2016 09:28 AM
I agree with your answer @Carroll, but it raises one more question: before big data came into the picture, how did Facebook or any other media company process big data and unstructured data with an RDBMS?
Created 07-28-2016 10:06 AM
There were (and still are) a number of methods, including:
Apparently, Facebook still uses MySQL "with a complex sharding and caching strategy" - Gigaom
Created 07-28-2016 12:30 PM
Thanks Carroll