Support Questions


Pig STORE gets stuck after creating the application.

New Contributor

I'm using VirtualBox on Windows to work with Hadoop and Pig.

In grunt, when I run

STORE B INTO '/users/emp/empsalinc' USING PigStorage(' '); 

it gets stuck after creating the job ID. Why does this happen?




Expert Contributor

When a Pig job gets stuck after creating the job ID, there can be several reasons for this behavior. Here are some common issues and solutions:

  1. Data Size and Complexity:

    • Check the size and complexity of your data. If the dataset is very large, the storage operation may take a significant amount of time.
    • Optimize your Pig script if possible, and consider processing a smaller subset of the data for testing.
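    As a quick sanity check, a cut-down run might look like the sketch below (relation B is from the question; the test output path is only an example — pick any path that does not already exist):

    ```pig
    -- Hypothetical quick test: store only a small sample of B first.
    -- The output path below is an example, not a recommendation.
    C = LIMIT B 100;
    STORE C INTO '/users/emp/empsalinc_test' USING PigStorage(' ');
    ```

    If this small job finishes quickly, the hang is more likely a data-size or resource problem than a script bug.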
  2. Resource Allocation:

    • Ensure that your Hadoop cluster has sufficient resources allocated for the Pig job. Insufficient memory or available resources can lead to job failures or delays.
    • Check the resource configuration in your Hadoop cluster and adjust it accordingly.
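    For example, on a YARN (MRv2) cluster you can raise task memory per-script from the grunt shell with SET; the values below are illustrative, not recommendations:

    ```pig
    -- Illustrative memory settings for a YARN/MRv2 cluster; tune to your VM's RAM.
    SET mapreduce.map.memory.mb 1024;
    SET mapreduce.reduce.memory.mb 2048;
    SET mapreduce.map.java.opts '-Xmx819m';
    ```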
  3. Job Monitoring:

    • Use Hadoop JobTracker or ResourceManager web interfaces to monitor the progress of your Pig job. This can provide insights into where the job might be stuck.
    • Look for any error messages or warnings in the logs.
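    On a YARN cluster you can also check the application's state without leaving grunt, using its built-in sh command (MRv1 users would use the JobTracker web UI instead); the application ID below is a made-up placeholder:

    ```pig
    -- List running YARN applications from the grunt shell.
    sh yarn application -list -appStates RUNNING
    -- Fetch aggregated logs for one application (requires log aggregation enabled;
    -- the application ID here is an example placeholder).
    sh yarn logs -applicationId application_1700000000000_0001
    ```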
  4. Logs and Debugging:

    • Examine the Pig logs for any error messages or stack traces. This can help identify the specific issue causing the job to hang.
    • Enable verbose logging by running Pig with the -d DEBUG option (for example, pig -x mapreduce -d DEBUG script.pig), and check the debug logs for more information.
  5. Permissions and Path:

    • Ensure that the specified output path /users/emp/empsalinc is writable by the user running the Pig job.
    • Check for any permission issues or typos in the path, and make sure the output directory does not already exist — STORE fails if it does.
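    From the grunt shell you can inspect and, if necessary, relax permissions with the built-in fs command; the chmod mode below is only an example:

    ```pig
    -- Check who owns the parent directory and what its permissions are.
    fs -ls /users/emp
    -- Example only: make the directory group-writable if the Pig user lacks access.
    fs -chmod -R 775 /users/emp
    ```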
  6. Network Issues:

    • Network issues or connectivity problems between nodes in your Hadoop cluster can also cause jobs to hang. Check the network configuration and try running simpler jobs to isolate the issue.
  7. Pig Version Compatibility:

    • Ensure that the version of Pig you are using is compatible with your Hadoop distribution. Incompatibility can lead to unexpected issues.
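    Both versions are easy to print from the grunt shell for a quick compatibility check:

    ```pig
    -- Print the Pig and Hadoop versions to compare against the compatibility matrix.
    sh pig -version
    sh hadoop version
    ```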
  8. Configuration Settings:

    • Review your Pig script and ensure that the configuration settings are appropriate for your environment. Adjust parameters such as mapreduce.job.queuename as needed.
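    Such settings can be applied per-script with SET; the queue name below is just the common default and may differ on your cluster:

    ```pig
    -- Illustrative: route this job to a specific YARN scheduler queue.
    SET mapreduce.job.queuename default;
    ```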
  9. Custom UDFs:

    • If your Pig script uses custom User Defined Functions (UDFs), ensure that they are correctly implemented and compatible with the version of Pig you are using.
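    A minimal UDF usage sketch, with entirely hypothetical jar, class, and alias names:

    ```pig
    -- Hypothetical example: register a UDF jar and apply the function to B.
    -- The jar path and class name below are placeholders, not real artifacts.
    REGISTER /home/emp/myudfs.jar;
    DEFINE TO_UPPER com.example.pig.UpperCase();
    B2 = FOREACH B GENERATE TO_UPPER($0);
    ```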

By investigating these aspects, you should be able to identify the root cause of the job getting stuck after the job ID is created and take the appropriate action to resolve the issue.