Support Questions


How to launch spark-shell in debug mode

Guru

I am getting the following error when I run SQL in the Scala shell. The error message is not clear, so can someone please help me resolve it? Also, how do I set up debug mode in spark-shell?

scala> val df = spark.sql("select * from table1 limit 10");

df: org.apache.spark.sql.DataFrame = [itm_nbr: int, overall_e_coefficient: decimal(15,3) ... 16 more fields]

scala> df.show(10)

java.lang.RuntimeException: serious problem

2 ACCEPTED SOLUTIONS

Master Mentor

@Saurabh

You can point spark-shell at your own "log4j.properties" file to control the log output, passing its path on the command line.

Example:

# spark-shell --master yarn --deploy-mode client --files /your/path/to/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/your/path/to/log4j.properties"
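To unpack what each option in this command does: --files ships the properties file into every executor's YARN container, where the bare name "log4j.properties" resolves in the container's working directory, while the driver runs locally in client mode and therefore needs a full file: URI. A commented sketch of the same invocation (the path is a placeholder you substitute with your own):

```shell
# --files ships the properties file into each executor's YARN container,
# where the bare name "log4j.properties" resolves in the working directory.
# The driver runs locally in client mode, so it needs a full file: URI.
spark-shell \
  --master yarn \
  --deploy-mode client \
  --files /your/path/to/log4j.properties \
  --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" \
  --driver-java-options "-Dlog4j.configuration=file:/your/path/to/log4j.properties"
```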


Master Mentor

@Saurabh

For example, if you create a "/tmp/log4j.properties" like the following:

# cat /tmp/log4j.properties 
log4j.rootCategory=debug,console
log4j.logger.com.demo.package=debug,console
log4j.additivity.com.demo.package=false
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.immediateFlush=true
log4j.appender.console.encoding=UTF-8
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n
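DEBUG on the root category can be very noisy, since every Hadoop and Spark class will log at that level (as the output below shows). A variant worth considering keeps the root at a higher level and enables DEBUG only for the packages you care about; the package names here are just examples you would adjust:

```properties
# Keep the root quiet; turn on DEBUG selectively.
log4j.rootCategory=WARN,console
log4j.logger.org.apache.spark=DEBUG
log4j.logger.org.apache.hadoop=WARN
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n
```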


Then run spark-shell as follows, and you should see DEBUG messages:

# su - spark
#  spark-shell --master yarn --deploy-mode client --files /tmp/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/tmp/log4j.properties"
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
2018-09-17 07:52:29,343 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,388 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,389 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[GetGroups])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since startup])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since last successful login])
2018-09-17 07:52:29,392 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
2018-09-17 07:52:29,845 [main] DEBUG org.apache.hadoop.security.SecurityUtil - Setting hadoop.security.token.service.use_ip to true
2018-09-17 07:52:30,386 [main] DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
2018-09-17 07:52:30,501 [main] DEBUG org.apache.hadoop.security.Groups -  Creating new Groups object
2018-09-17 07:52:30,523 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
2018-09-17 07:52:30,534 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
2018-09-17 07:52:30,535 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-09-17 07:52:30,535 [main] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 07:52:30,536 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
2018-09-17 07:52:30,537 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2018-09-17 07:52:30,709 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
2018-09-17 07:52:30,751 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login





Master Mentor

@Saurabh

With Spark2 you can enable DEBUG logging by invoking sc.setLogLevel("DEBUG"), as follows:

$ export SPARK_MAJOR_VERSION=2
$ spark-shell --master yarn --deploy-mode client

SPARK_MAJOR_VERSION is set to 2, using Spark2
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://newhwx1.example.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1536125228953_0007).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0.2.6.5.0-292
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.setLogLevel("DEBUG")

scala> 18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark sending #69
18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark got value #69
18/09/17 07:58:57 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 8ms
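The level set this way takes effect immediately and can be changed again at any point in the session; the valid levels are ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE and WARN. For example, once you have captured what you need, you can quiet the shell back down (a sketch, continuing the session above):

```scala
scala> sc.setLogLevel("WARN")   // back to the default level spark-shell started with

scala> sc.setLogLevel("TRACE")  // or go even more verbose than DEBUG
```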


Guru

Thanks a lot @Jay Kumar SenSharma. It ran in debug mode.