Created on 09-17-2018 07:30 AM - edited 09-16-2022 06:43 AM
I am getting the following error when I run SQL in Scala. The error message is not clear, so can someone please help me resolve it? Also, how do I set up debug mode in spark-shell?
scala> val df = spark.sql("select * from table1 limit 10");
df: org.apache.spark.sql.DataFrame = [itm_nbr: int, overall_e_coefficient: decimal(15,3) ... 16 more fields]
scala> df.show(10)
java.lang.RuntimeException: serious problem
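The one-line message often wraps a more specific root cause. As a hedged sketch (plain Scala, assuming the same df defined above), you can walk the exception's cause chain in the shell to see what is actually failing underneath:

// Print every nested cause of the failure, not just the top-level message.
// Illustrative helper code, not part of Spark's API.
try {
  df.show(10)
} catch {
  case e: Throwable =>
    var cause: Throwable = e
    while (cause != null) {
      println(s"${cause.getClass.getName}: ${cause.getMessage}")
      cause = cause.getCause
    }
}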
Created 09-17-2018 07:45 AM
You can create your own "log4j.properties" file to control the log level and pass its path to your spark-shell command.
Example:
# spark-shell --master yarn --deploy-mode client --files /your/path/to/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/your/path/to/log4j.properties"
Created 09-17-2018 07:53 AM
For example, if you create a "/tmp/log4j.properties" like the following:
# cat /tmp/log4j.properties
log4j.rootCategory=debug,console
log4j.logger.com.demo.package=debug,console
log4j.additivity.com.demo.package=false
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.immediateFlush=true
log4j.appender.console.encoding=UTF-8
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n
Then run spark-shell as follows, and you should see DEBUG messages:
# su - spark
# spark-shell --master yarn --deploy-mode client --files /tmp/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/tmp/log4j.properties"
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
2018-09-17 07:52:29,343 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,388 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,389 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[GetGroups])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since startup])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since last successful login])
2018-09-17 07:52:29,392 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
2018-09-17 07:52:29,845 [main] DEBUG org.apache.hadoop.security.SecurityUtil - Setting hadoop.security.token.service.use_ip to true
2018-09-17 07:52:30,386 [main] DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
2018-09-17 07:52:30,501 [main] DEBUG org.apache.hadoop.security.Groups - Creating new Groups object
2018-09-17 07:52:30,523 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
2018-09-17 07:52:30,534 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
2018-09-17 07:52:30,535 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-09-17 07:52:30,535 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 07:52:30,536 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
2018-09-17 07:52:30,537 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2018-09-17 07:52:30,709 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
2018-09-17 07:52:30,751 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
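If you want to confirm from inside the shell that the driver really picked up the custom file rather than the cluster default, one option is to ask Log4j directly. A small sketch, assuming the Log4j 1.x classes bundled with these Spark builds are on the classpath:

import org.apache.log4j.LogManager

// Should print DEBUG when the /tmp/log4j.properties above is in effect,
// instead of the cluster's default level.
println(LogManager.getRootLogger.getLevel)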
Created 09-17-2018 08:01 AM
In the case of Spark2, you can enable DEBUG logging by invoking sc.setLogLevel("DEBUG") as follows:
$ export SPARK_MAJOR_VERSION=2
$ spark-shell --master yarn --deploy-mode client
SPARK_MAJOR_VERSION is set to 2, using Spark2
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://newhwx1.example.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1536125228953_0007).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0.2.6.5.0-292
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.setLogLevel("DEBUG")

scala> 18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark sending #69
18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark got value #69
18/09/17 07:58:57 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 8ms
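Because sc.setLogLevel applies to everything that runs afterwards, the DEBUG output can get noisy quickly. One option is to bracket only the statement you are investigating; a minimal sketch reusing the query from the question (table1 is the table name from the original post):

// Raise logging only around the failing action, then drop back to WARN.
sc.setLogLevel("DEBUG")
spark.sql("select * from table1 limit 10").show(10)
sc.setLogLevel("WARN")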
Created 09-17-2018 08:33 AM
Thanks a lot @Jay Kumar SenSharma. It ran in debug mode.