Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Data Flow Language-PIG

Contributor

Hello Everyone,

 

Why is PIG called a data flow language?

 

Thank you.

1 ACCEPTED SOLUTION

Champion

@Geek007

 

In other words:

In SQL, we say “what” is to be accomplished.

In Pig, we spell out “how” a task is to be performed, one step at a time.
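
To make the "what" versus "how" point concrete, here is a rough sketch in plain Python that mimics a Pig-style data flow. The sample rows, the field layout (user, url, bytes) and the relation names are made up for illustration, and the Pig Latin statement each step stands in for is shown in a comment. It is only meant to show the shape of a data flow, not how Pig executes it.

```python
from collections import defaultdict

# SQL states the result in one declarative query:
#   SELECT user, SUM(bytes) FROM logs WHERE bytes > 1000
#   GROUP BY user ORDER BY 2 DESC;
# A Pig script instead names each intermediate relation, step by step.

# Hypothetical sample rows: (user, url, bytes).
rows = [
    ("alice", "/home", 1200),
    ("bob", "/home", 300),
    ("alice", "/docs", 2500),
    ("carol", "/docs", 1800),
]

# Pig: logs = LOAD 'access_log' AS (user, url, bytes);
logs = list(rows)

# Pig: big = FILTER logs BY bytes > 1000;
big = [r for r in logs if r[2] > 1000]

# Pig: by_user = GROUP big BY user;
by_user = defaultdict(list)
for r in big:
    by_user[r[0]].append(r)

# Pig: totals = FOREACH by_user GENERATE group, SUM(big.bytes);
totals = {user: sum(r[2] for r in grp) for user, grp in by_user.items()}

# Pig: ordered = ORDER totals BY $1 DESC;
ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

print(ordered)
```

Each variable plays the role of a relation that feeds the next statement, which is why the script reads like a flow of data through a pipeline.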

 

When to go for Pig: when we want to process large sets of unorganized, unstructured, and decentralized data.

Pig doesn't need a schema; it can load delimited data as-is, without one declared up front.
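
As a small illustration of the schema point, the Python sketch below splits delimited lines and picks fields by position, roughly what Pig's positional references ($0, $1, ...) let you do when no schema is declared. The input lines and the file name in the comments are invented for the example.

```python
# Hypothetical comma-delimited input with no header and no declared schema.
lines = ["alice,34,London", "bob,29,Paris"]

# Pig: raw = LOAD 'people.txt' USING PigStorage(',');
raw = [tuple(line.split(",")) for line in lines]

# Pig: picked = FOREACH raw GENERATE $0, $2;
# Fields are addressed by position because no names were declared.
picked = [(t[0], t[2]) for t in raw]

print(picked)
```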

Please take a look at this example:

 

https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=23494745

 

Let me know if this suffices.


3 REPLIES

Champion

As you said, yes, it is a data flow language rather than a query language.

That is because you write a series of declarative statements that define relations,

where each relation applies a new data transformation to the one before it.

To put it simply, you describe how to retrieve the data rather than what data you want, more like the step-by-step plan a query optimizer would produce (for example).

It is mainly used in ETL to ingest external data into Hadoop.

 

Hope this suffices.

Contributor

Hello csguna,

 

Thanks for your response. Could you please explain it with an example? That would be much better.
