Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to launch spark-shell in debug mode

Guru

I am getting the following error when I run SQL in the Scala shell, but the error message is not clear. Can someone please help me solve it? Also, how do I set up debug mode in spark-shell?

scala> val df = spark.sql("select * from table1 limit 10");

df: org.apache.spark.sql.DataFrame = [itm_nbr: int, overall_e_coefficient: decimal(15,3) ... 16 more fields]

scala> df.show(10)

java.lang.RuntimeException: serious problem

2 ACCEPTED SOLUTIONS

Master Mentor

@Saurabh

You can create your own "log4j.properties" file to control the log messages and pass its path to your spark-shell command.

Example:

# spark-shell --master yarn --deploy-mode client --files /your/path/to/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/your/path/to/log4j.properties"
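The two log4j options in that command differ for a reason: the driver reads the file from its local path (hence the "file:" URL), while "--files" ships the file into each executor's working directory, so the executors refer to it by its bare file name. The sketch below only prints the command for a given path; "debug_shell_cmd" is a hypothetical helper name, not something Spark provides, and it does not launch anything.

```shell
#!/bin/sh
# Sketch: print the spark-shell command from the post for a given properties file.
# debug_shell_cmd is a hypothetical helper; it only prints, it does not start Spark.
debug_shell_cmd() {
  props="$1"                   # local path on the machine that runs the driver
  name=$(basename "$props")    # executors see the shipped file under this bare name
  printf '%s\n' "spark-shell --master yarn --deploy-mode client --files $props --conf \"spark.executor.extraJavaOptions='-Dlog4j.configuration=$name'\" --driver-java-options \"-Dlog4j.configuration=file:$props\""
}

debug_shell_cmd /tmp/log4j.properties
```

Printing first makes it easy to check the quoting before running the command for real.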

Master Mentor

@Saurabh

For example, if you create a "/tmp/log4j.properties" file like the following:

# cat /tmp/log4j.properties 
log4j.rootCategory=debug,console
log4j.logger.com.demo.package=debug,console
log4j.additivity.com.demo.package=false
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.immediateFlush=true
log4j.appender.console.encoding=UTF-8
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n

Then run spark-shell as follows, and you should see DEBUG messages:

# su - spark
#  spark-shell --master yarn --deploy-mode client --files /tmp/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/tmp/log4j.properties"
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
2018-09-17 07:52:29,343 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,388 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,389 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[GetGroups])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since startup])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since last successful login])
2018-09-17 07:52:29,392 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
2018-09-17 07:52:29,845 [main] DEBUG org.apache.hadoop.security.SecurityUtil - Setting hadoop.security.token.service.use_ip to true
2018-09-17 07:52:30,386 [main] DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
2018-09-17 07:52:30,501 [main] DEBUG org.apache.hadoop.security.Groups -  Creating new Groups object
2018-09-17 07:52:30,523 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
2018-09-17 07:52:30,534 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
2018-09-17 07:52:30,535 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-09-17 07:52:30,535 [main] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 07:52:30,536 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
2018-09-17 07:52:30,537 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2018-09-17 07:52:30,709 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
2018-09-17 07:52:30,751 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
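
The run above is noisy because "log4j.rootCategory=debug" turns on DEBUG for every library, Hadoop included. If that is too much, one option is to keep the root logger at WARN and raise only the packages you care about. The following is a minimal sketch in the same log4j 1.x syntax; the file name "/tmp/log4j-narrow.properties" and the choice of "org.apache.spark.sql" as the logger to debug are assumptions for illustration, so substitute the package that matches your error.

```shell
#!/bin/sh
# Sketch: write a log4j 1.x config that keeps the root logger at WARN and
# raises only one package to DEBUG.
# LOG4J_FILE and the logger name are illustrative assumptions, not from the post.
LOG4J_FILE="${LOG4J_FILE:-/tmp/log4j-narrow.properties}"

cat > "$LOG4J_FILE" <<'EOF'
log4j.rootCategory=WARN,console
log4j.logger.org.apache.spark.sql=DEBUG
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n
EOF

echo "wrote $LOG4J_FILE"
```

The named logger inherits the console appender from the root logger, so it needs no appender of its own. Pass the resulting file to spark-shell the same way as "/tmp/log4j.properties" above.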

4 REPLIES

Master Mentor

@Saurabh

In the case of Spark2, you can enable DEBUG logging by invoking sc.setLogLevel("DEBUG") as follows:

$ export SPARK_MAJOR_VERSION=2
$ spark-shell --master yarn --deploy-mode client

SPARK_MAJOR_VERSION is set to 2, using Spark2
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://newhwx1.example.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1536125228953_0007).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0.2.6.5.0-292
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.setLogLevel("DEBUG")

scala> 18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark sending #69
18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark got value #69
18/09/17 07:58:57 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 8ms

Guru

Thanks a lot @Jay Kumar SenSharma. It ran in debug mode.