HDP 2.3_1 Sandbox: Pig Java UDF doesn't print anything to stdout

Contributor

Hello! I am test-driving a UDF written in Java on the HDP 2.3_1 sandbox (Tez not enabled). I use a simple "System.out.println("phase one");" to follow how my code runs. The Java code itself is "ok", but some of my logic is not working and it produces zero output. What is the simplest way to find out which part of the code a Java UDF is currently executing? System.out.println sometimes works fine, but sometimes, like now, it prints nothing, even when I put a println inside the while loop where the bag is processed. Maybe execution is not even reaching my println line? Log4j would be one way, but I haven't seen any simple examples. Thank you in advance!

1 ACCEPTED SOLUTION

Super Collaborator

@petri koski,

UDFs (for either Hive or Pig) run during the map-reduce stage of the job, regardless of whether the execution engine is M/R or Tez. In other words, your println calls execute inside distributed tasks on the cluster. The code that prints your output does not run in the shell you launched the script from (unless you are running in local mode), so its output does not appear there.

How can you see your printed lines? There are a couple of ways:

- Using the job tracker UI: find your job and click on its logs. Go through the containers one by one until you find the output (it will be in every one of them if your code runs on each and every record of the processed data).

- Using YARN's aggregated logs:

yarn logs -applicationId <applicationId>
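Since the output ends up in the container logs, it helps to make the trace lines easy to find there. The following is only a minimal sketch of that idea, not a real Pig UDF: the class name UdfTraceSketch, the UDF-TRACE tag, and the countNonEmpty method are all hypothetical, and a plain List<String> stands in for the bag so the example compiles without the Pig jars. In an actual UDF the same trace calls would go inside your exec method.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

public class UdfTraceSketch {
    // Hypothetical marker so trace lines are easy to grep in the container logs.
    static final String TAG = "UDF-TRACE";

    // Print one tagged line and flush so it is not stuck in a buffer
    // if the task dies shortly afterwards.
    static void trace(PrintStream out, String msg) {
        out.println(TAG + " " + msg);
        out.flush();
    }

    // Stand-in for the loop inside a UDF: counts the non-empty items in a "bag".
    // A List<String> replaces the Pig bag type (assumption, to avoid Pig dependencies).
    static int countNonEmpty(List<String> bag, PrintStream out) {
        trace(out, "phase one: entering loop, bag size=" + bag.size());
        int kept = 0;
        for (String item : bag) {
            if (item == null || item.isEmpty()) {
                trace(out, "skipping empty item");
                continue;
            }
            kept++;
        }
        trace(out, "phase two: loop done, kept=" + kept);
        return kept;
    }

    public static void main(String[] args) {
        // Running locally, the trace lines go straight to the console's stderr.
        countNonEmpty(Arrays.asList("a", "", "b"), System.err);
    }
}
```

With a tag like this in place, the aggregated logs can be filtered with something like: yarn logs -applicationId <applicationId> | grep UDF-TRACE. If no tagged line shows up at all, not even the one before the loop, the code is probably not being reached.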
