Created on 05-26-2017 07:10 PM - edited 09-16-2022 04:40 AM
Hello Everyone,
Why is Pig called a data flow language?
Thank you.
Created 05-27-2017 07:04 PM
As you said, yes, it is a data flow language rather than a query language,
because you write a series of declarative statements that define relations,
where each relation performs a new set of data transformations.
Put simply, you describe how to retrieve the data rather than what data you want, which is more like the job of a query optimizer, for example.
It is mainly used in ETL to ingest external data into Hadoop.
Hope this suffices.
Created 06-16-2017 01:13 PM
Hello csguna,
Thanks for your response. Could you please explain it with an example? That would be much better.
Created 06-16-2017 06:20 PM
In other words:
In SQL, we say “what” is to be accomplished.
In Pig, we say “how” a task is to be performed.
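To make the "what" versus "how" contrast concrete, here is a rough sketch in Python rather than real Pig. The sample rows, the `logs` table, and the variable names are made up for illustration, and the Pig Latin in the comments only shows the general shape such a script might take. The SQL query states the result and leaves the steps to the engine; the data flow version names each intermediate relation.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample rows: (name, bytes_sent)
rows = [("ann", 120), ("bob", 40), ("ann", 300), ("cy", 75), ("bob", 500)]

# SQL: state WHAT result is wanted; the engine chooses the execution steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (name TEXT, bytes_sent INTEGER)")
conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
sql_result = conn.execute(
    "SELECT name, SUM(bytes_sent) FROM logs "
    "WHERE bytes_sent > 50 GROUP BY name ORDER BY name"
).fetchall()

# Data flow: spell out HOW, one named relation per step.
# Roughly the shape of a Pig Latin script such as:
#   logs    = LOAD 'logs.csv' USING PigStorage(',') AS (name:chararray, bytes_sent:int);
#   big     = FILTER logs BY bytes_sent > 50;
#   grouped = GROUP big BY name;
#   totals  = FOREACH grouped GENERATE group, SUM(big.bytes_sent);
logs = rows
big = [r for r in logs if r[1] > 50]          # FILTER step
grouped = defaultdict(list)                   # GROUP step
for name, sent in big:
    grouped[name].append(sent)
totals = sorted((name, sum(values)) for name, values in grouped.items())

print(sql_result)
print(totals)
```

Both approaches produce the same totals; the difference is that in the second one every intermediate step has a name and can be inspected or reused.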
When to go for Pig: when we want to process large sets of unorganized, unstructured, and decentralized data.
Pig doesn't need a schema; it will consume unstructured data with delimiters.
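The schema-less point can be sketched the same way. This is a Python analogy with invented sample lines, not Pig itself: the fields are split on a delimiter, referenced by position, and cast only where a step needs a number. The comments name the Pig Latin constructs I believe correspond to each step.

```python
# Hypothetical pipe-delimited input; no column names or types are declared.
lines = ["ann|120|GET", "bob|40|POST", "cy|75|GET"]

# Comparable to: raw = LOAD 'data.txt' USING PigStorage('|');  (no AS clause)
records = [line.split("|") for line in lines]

# Comparable to: FOREACH raw GENERATE $0, $2;  (fields addressed by position)
projected = [(r[0], r[2]) for r in records]

# Comparable to: FILTER raw BY (int)$1 > 50;  (cast only where needed)
large = [r[0] for r in records if int(r[1]) > 50]

print(projected)
print(large)
```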
Please look into this example:
https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=23494745
Let me know if this suffices.