Data Flow Language-PIG

Contributor

Hello Everyone,

 

Why is PIG called a data flow language?

 

Thank you.

3 REPLIES

Champion

As you said, yes, it is a data flow language rather than a query language.

That is because you write a series of statements, each declaring a relation,

where each relation applies a new transformation to the data.

To put it simply, you describe how to retrieve the data rather than only what data you want; in SQL, the "how" is left to something like the query optimizer.
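
To make the "series of relations" idea concrete, here is a rough sketch in plain Python (not Pig itself), where each variable plays the role of one Pig relation and feeds the next step. The Pig Latin statements in the comments show roughly what an equivalent script might look like; the file name, field names, and sample rows are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample input; in Pig this would come from a file via LOAD.
raw_lines = [
    "alice,34,London",
    "bob,27,Paris",
    "carol,41,London",
    "dave,38,Paris",
    "erin,52,London",
]

# Step 1, roughly:
# people = LOAD 'people.csv' USING PigStorage(',')
#          AS (name:chararray, age:int, city:chararray);
people = [(n, int(a), c) for n, a, c in (line.split(",") for line in raw_lines)]

# Step 2, roughly: over_30 = FILTER people BY age > 30;
over_30 = [row for row in people if row[1] > 30]

# Step 3, roughly: by_city = GROUP over_30 BY city;
by_city = defaultdict(list)
for row in over_30:
    by_city[row[2]].append(row)

# Step 4, roughly: counts = FOREACH by_city GENERATE group, COUNT(over_30);
counts = sorted((city, len(rows)) for city, rows in by_city.items())

print(counts)  # [('London', 3), ('Paris', 1)]
```

In SQL the same result would be a single statement along the lines of SELECT city, COUNT(*) FROM people WHERE age > 30 GROUP BY city, which states what is wanted and leaves the order of steps to the engine.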

It is mainly used in ETL to ingest external data into Hadoop.

Hope this suffices.

Contributor

Hello csguna,

 

Thanks for your response. Could you please explain it with an example? That would be much better.

Champion

@Geek007

 

In other words:

In SQL, we say "what" is to be accomplished.

In Pig, we say "how" a task is to be performed.

When to go for Pig: when we want to process large sets of unorganized, unstructured, and decentralized data.

Pig doesn't need a schema; it can consume unstructured data as long as it has delimiters.
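
As a rough illustration of the "no schema, just delimiters" point, here is a small sketch in plain Python (not Pig itself) that splits records on a delimiter and picks fields by position, which is how fields are referred to in Pig ($0, $1, ...) when no schema is declared. The file name and sample records are made up for the example.

```python
# Hypothetical pipe-delimited records with no header and no declared schema.
raw_lines = [
    "2024-01-05|login|alice",
    "2024-01-05|purchase|bob|19.99",  # this sketch tolerates an extra field
    "2024-01-06|logout|alice",
]

# Roughly: events = LOAD 'events.log' USING PigStorage('|');
# (no AS clause, so no field names or types are declared)
events = [tuple(line.split("|")) for line in raw_lines]

# Roughly: picked = FOREACH events GENERATE $0, $2;
# (fields are referenced by position instead of by name)
picked = [(row[0], row[2]) for row in events]

print(picked)
```

A schema can still be added later with an AS clause on the LOAD statement once the layout of the data is known.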

Please look into this example:

https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=23494745

Let me know if this suffices.