Pig job failure with ERROR 2998 : Unhandled internal error

Hi, 

 

I am seeing this error when running Pig jobs. Which parameter has to be tuned?

 

The cluster-wide memory setting is 4G for map tasks and 4G for reduce tasks.

We are not setting any heap size when running the job.

 

2017-06-29 20:05:10,664 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS

2017-06-29 20:05:10,718 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS

2017-06-29 20:05:12,220 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: REPLICATED_JOIN,HASH_JOIN,DISTINCT,FILTER

 

2017-06-29 20:42:02,042 [Service Thread] INFO  org.apache.pig.impl.util.SpillableMemoryManager - first memory handler call- Usage threshold init = 698875904(682496K) used = 597759256(583749K) committed = 698875904(682496K) max = 698875904(682496K)

2017-06-29 20:42:02,934 [Service Thread] INFO  org.apache.pig.impl.util.SpillableMemoryManager - first memory handler call - Collection threshold init = 698875904(682496K) used = 457643760(446917K) committed = 698875904(682496K) max = 698875904(682496K)

2017-06-29 20:52:26,919 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Java heap space Details at logfile: xxxxxx

1 REPLY

Hi desind, 
 
I see that you have the cluster-wide map and reduce memory both set to 4G.
 
However, the parameter that you will need to change is PIG_HEAPSIZE=X, an environment variable where X is the maximum heap size, in MB, of the Pig client JVM. I would suggest increasing it and running the job again.
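
As a minimal sketch of that suggestion: the log above reports max = 698875904 bytes (roughly 666 MB) from the [main] client process, which suggests the client heap is what ran out rather than the 4G task containers. The value 2048 and the script name myscript.pig below are hypothetical; pick a size that fits the machine you launch Pig from.

```shell
# Hypothetical value: raise the Pig client JVM heap to 2048 MB.
# PIG_HEAPSIZE is read by the pig launcher script and is given in MB.
export PIG_HEAPSIZE=2048

# Confirm what the launcher will see.
echo "PIG_HEAPSIZE=${PIG_HEAPSIZE}"

# Then rerun the job from the same shell (myscript.pig is a placeholder):
#   pig myscript.pig
```

If the job still fails with the same error, the heap limit in the new log output should show whether the larger setting was actually picked up.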
 
For reference on changing properties [1]: 
Pig Properties

Pig supports a number of Java properties that you can use to customize Pig behavior. You can retrieve a list of the properties using the help properties command. All of these properties are optional; none are required.

 

To specify Pig properties use one of these mechanisms:

  • The pig.properties file (add the directory that contains the pig.properties file to the classpath)
  • The -D command line option and a Pig property (pig -Dpig.tmpfilecompression=true)
  • The -P command line option and a properties file (pig -P mypig.properties)
  • The set command (set pig.exec.nocombiner true)

Note: The properties file uses standard Java property file format.
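
To illustrate the properties-file mechanism, here is a small sketch that writes a file in Java properties format and shows how it would be passed with -P. The file name mypig.properties and the two property names come from the examples above; the script name myscript.pig is a placeholder.

```shell
# Write a properties file in standard Java properties format
# (key=value pairs, '#' starts a comment).
props_file="mypig.properties"
cat > "$props_file" <<'EOF'
# Pig properties taken from the examples in this post
pig.tmpfilecompression=true
pig.exec.nocombiner=true
EOF

# Show the file that Pig would read.
cat "$props_file"

# On a machine with Pig installed (myscript.pig is a placeholder):
#   pig -P mypig.properties myscript.pig
```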

The following precedence order is supported: pig.properties > -D Pig property > -P properties file > set command, where each mechanism is overridden by the ones to its right. For example, if the same property is provided using both the -D command line option and a -P properties file, the value in the properties file takes precedence.

To specify Hadoop properties you can use the same mechanisms:

  • The hadoop-site.xml file (add the directory that contains the hadoop-site.xml file to the classpath)
  • The -D command line option and a Hadoop property (pig -Dmapreduce.task.profile=true)
  • The -P command line option and a property file (pig -P property_file)
  • The set command (set mapred.map.tasks.speculative.execution false)

 

The same precedence holds: hadoop-site.xml > -D Hadoop property > -P properties_file > set command (mechanisms further to the right override those to their left).

Hadoop properties are not interpreted by Pig but are passed directly to Hadoop. Any Hadoop property can be passed this way.

All properties that Pig collects, including Hadoop properties, are available to any UDF via the UDFContext object. To get access to the properties, you can call the getJobConf method.

 
 
Thanks, 
Jordan