<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to debug a SQL query that works using a Spark Jupyter Notebook, but fails when executed from Livy? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/350117#M235882</link>
    <description>&lt;P&gt;I have a Spark SQL query that works when I execute it from inside a Jupyter Notebook with a PySpark kernel, but fails when I submit it to a Livy session. Usually there's no difference when I execute my queries either way.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I retrieved the Spark session parameters with the command below to verify that they are the same in both environments:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;spark.sparkContext.getConf().getAll()&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using Spark 2.3. How can I debug this problem? I know that the query works in Spark, but I can't make it work when submitting through Livy.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the query:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;INSERT INTO sbx_xxxxx.dados_auxiliares_mediana_preco&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; select&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p.ano, p.cod_cfi, p.preco, p.repeticoes, sum(p2.repeticoes) as qtd_antes&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; from sbx_operacoes_digitais.dados_auxiliares_moda_preco p&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; left join sbx_operacoes_digitais.dados_auxiliares_moda_preco p2&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; on p.ano = p2.ano and p.cod_cfi = p2.cod_cfi and p2.preco &amp;lt;= p.preco&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; group by p.ano, p.cod_cfi, p.preco, p.repeticoes&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the stack trace returned by Livy:&lt;/P&gt;&lt;PRE&gt;An error occurred while calling o99.sql.&lt;BR /&gt;: org.apache.spark.SparkException: Job aborted.&lt;BR /&gt;at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:224)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:154)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)&lt;BR /&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)&lt;BR /&gt;at org.apache.spark.sql.Dataset.&amp;lt;init&amp;gt;(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)&lt;BR /&gt;at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:282)&lt;BR /&gt;at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;BR /&gt;at py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;BR /&gt;at py4j.GatewayConnection.run(GatewayConnection.java:238)&lt;BR /&gt;at 
java.lang.Thread.run(Thread.java:750)&lt;BR /&gt;Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:&lt;BR /&gt;Exchange(coordinator id: 1868588554) hashpartitioning(ano#100, cod_cfi#101, 2001), coordinator[target post-shuffle partition size: 67108864]&lt;BR /&gt;+- *(1) FileScan orc sbx_operacoes_digitais.dados_auxiliares_moda_preco[ano#100,cod_cfi#101,preco#103,repeticoes#104] Batched: true, Format: ORC, Location: InMemoryFileIndex[hdfs://BNDOOP03/corporativo/sbx_operacoes_digitais/dados_auxiliares_moda_preco_..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct&amp;lt;ano:int,cod_cfi:int,preco:decimal(14,2),repeticoes:decimal(14,2)&amp;gt;&lt;BR /&gt;&lt;BR /&gt;at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:119)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371)&lt;BR /&gt;at org.apache.spark.sql.execution.SortExec.inputRDDs(SortExec.scala:121)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR 
/&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doExecute(SortMergeJoinExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371)&lt;BR /&gt;at org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:41)&lt;BR /&gt;at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:180)&lt;BR /&gt;... 23 more&lt;BR /&gt;Caused by: java.lang.AssertionError: assertion failed&lt;BR /&gt;at scala.Predef$.assert(Predef.scala:156)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.doEstimationIfNecessary(ExchangeCoordinator.scala:201)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.postShuffleRDD(ExchangeCoordinator.scala:259)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:124)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:119)&lt;BR /&gt;at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)&lt;BR /&gt;... 58 more&lt;/PRE&gt;&lt;P&gt;It looks like the important part is a failed assertion, but it does not give me any useful information:&lt;/P&gt;&lt;PRE&gt;Caused by: java.lang.AssertionError: assertion failed&lt;BR /&gt;at scala.Predef$.assert(Predef.scala:156)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.doEstimationIfNecessary(ExchangeCoordinator.scala:201)&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help teaching me how to debug this is greatly appreciated.&lt;/P&gt;</description>
    <pubDate>Mon, 15 Aug 2022 20:34:12 GMT</pubDate>
    <dc:creator>PauloNeves</dc:creator>
    <dc:date>2022-08-15T20:34:12Z</dc:date>
    <item>
      <title>How to debug a SQL query that works using a Spark Jupyter Notebook, but fails when executed from Livy?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/350117#M235882</link>
      <description>&lt;P&gt;I have a Spark SQL query that works when I execute it from inside a Jupyter Notebook with a PySpark kernel, but fails when I submit it to a Livy session. Usually there's no difference when I execute my queries either way.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I retrieved the Spark session parameters with the command below to verify that they are the same in both environments:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;spark.sparkContext.getConf().getAll()&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using Spark 2.3. How can I debug this problem? I know that the query works in Spark, but I can't make it work when submitting through Livy.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the query:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;INSERT INTO sbx_xxxxx.dados_auxiliares_mediana_preco&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; select&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p.ano, p.cod_cfi, p.preco, p.repeticoes, sum(p2.repeticoes) as qtd_antes&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; from sbx_operacoes_digitais.dados_auxiliares_moda_preco p&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; left join sbx_operacoes_digitais.dados_auxiliares_moda_preco p2&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; on p.ano = p2.ano and p.cod_cfi = p2.cod_cfi and p2.preco &amp;lt;= p.preco&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; group by p.ano, p.cod_cfi, p.preco, p.repeticoes&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the stack trace returned by Livy:&lt;/P&gt;&lt;PRE&gt;An error occurred while calling o99.sql.&lt;BR /&gt;: org.apache.spark.SparkException: Job aborted.&lt;BR /&gt;at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:224)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:154)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)&lt;BR /&gt;at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)&lt;BR /&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)&lt;BR /&gt;at org.apache.spark.sql.Dataset.&amp;lt;init&amp;gt;(Dataset.scala:190)&lt;BR /&gt;at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)&lt;BR /&gt;at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:282)&lt;BR /&gt;at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;BR /&gt;at py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;BR /&gt;at py4j.GatewayConnection.run(GatewayConnection.java:238)&lt;BR /&gt;at 
java.lang.Thread.run(Thread.java:750)&lt;BR /&gt;Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:&lt;BR /&gt;Exchange(coordinator id: 1868588554) hashpartitioning(ano#100, cod_cfi#101, 2001), coordinator[target post-shuffle partition size: 67108864]&lt;BR /&gt;+- *(1) FileScan orc sbx_operacoes_digitais.dados_auxiliares_moda_preco[ano#100,cod_cfi#101,preco#103,repeticoes#104] Batched: true, Format: ORC, Location: InMemoryFileIndex[hdfs://BNDOOP03/corporativo/sbx_operacoes_digitais/dados_auxiliares_moda_preco_..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct&amp;lt;ano:int,cod_cfi:int,preco:decimal(14,2),repeticoes:decimal(14,2)&amp;gt;&lt;BR /&gt;&lt;BR /&gt;at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:119)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371)&lt;BR /&gt;at org.apache.spark.sql.execution.SortExec.inputRDDs(SortExec.scala:121)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR 
/&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doExecute(SortMergeJoinExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371)&lt;BR /&gt;at org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:41)&lt;BR /&gt;at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:150)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)&lt;BR /&gt;at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt;at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:180)&lt;BR /&gt;... 23 more&lt;BR /&gt;Caused by: java.lang.AssertionError: assertion failed&lt;BR /&gt;at scala.Predef$.assert(Predef.scala:156)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.doEstimationIfNecessary(ExchangeCoordinator.scala:201)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.postShuffleRDD(ExchangeCoordinator.scala:259)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:124)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:119)&lt;BR /&gt;at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)&lt;BR /&gt;... 58 more&lt;/PRE&gt;&lt;P&gt;It looks like the important part is a failed assertion, but it does not give me any useful information:&lt;/P&gt;&lt;PRE&gt;Caused by: java.lang.AssertionError: assertion failed&lt;BR /&gt;at scala.Predef$.assert(Predef.scala:156)&lt;BR /&gt;at org.apache.spark.sql.execution.exchange.ExchangeCoordinator.doEstimationIfNecessary(ExchangeCoordinator.scala:201)&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help teaching me how to debug this is greatly appreciated.&lt;/P&gt;</description>
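The conf-comparison step described in the question (capturing spark.sparkContext.getConf().getAll() in both environments) can be sketched as a plain dict diff. The two conf lists below are illustrative placeholders, not the poster's real captures:

```python
# Sketch: diff two conf snapshots, each taken as
# spark.sparkContext.getConf().getAll() (a list of (key, value) pairs).
# The sample values here are placeholders for illustration only.
notebook_conf = dict([("spark.sql.adaptive.enabled", "false"), ("spark.master", "yarn")])
livy_conf = dict([("spark.sql.adaptive.enabled", "true"), ("spark.master", "yarn")])

# Keep only keys whose values differ (or exist on one side only).
diff = {
    key: (notebook_conf.get(key), livy_conf.get(key))
    for key in sorted(set(notebook_conf) | set(livy_conf))
    if notebook_conf.get(key) != livy_conf.get(key)
}
print(diff)  # only the settings that differ between the two environments
```

Running this against real snapshots narrows the investigation to the handful of properties that actually differ between the notebook session and the Livy session.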
      <pubDate>Mon, 15 Aug 2022 20:34:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/350117#M235882</guid>
      <dc:creator>PauloNeves</dc:creator>
      <dc:date>2022-08-15T20:34:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to debug a SQL query that works using a Spark Jupyter Notebook, but fails when executed from Livy?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/350324#M235954</link>
      <description>&lt;P&gt;Now I'm trying to check the session configuration by setting the property:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;'spark.logConf'&lt;/SPAN&gt;&lt;SPAN&gt; = &lt;/SPAN&gt;&lt;SPAN&gt;"true"&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I believe that setting it to "true" would make the session properties be written to the Spark log, and that I'd be able to check them by issuing the command&lt;/P&gt;&lt;PRE&gt;yarn logs -applicationId application_1660776720083_9876 &amp;gt; yarn.log&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;but I can't find the session values in my yarn.log file. How would they be displayed? Am I doing something wrong? I'm using Spark 2.3.4.&lt;/P&gt;</description>
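A minimal sketch of the approach above, for context: spark.logConf is a standard Spark property, and when it is "true" the driver prints every effective SparkConf entry at INFO level during startup, so the values land in the driver container's stdout/stderr rather than in a dedicated section of the aggregated YARN log. The payload shape below assumes the session is created through Livy's REST API:

```python
# Sketch: a Livy session-creation payload that enables SparkConf logging.
# With spark.logConf=true, the effective conf is printed (INFO level) in the
# driver log as the SparkContext starts.
import json

payload = {
    "kind": "pyspark",
    "conf": {
        "spark.logConf": "true",  # log the effective SparkConf at startup
    },
}

# The logged properties appear in the driver container's stdout/stderr, so it
# can help to fetch just that file instead of grepping the whole dump, e.g.:
#   yarn logs -applicationId application_1660776720083_9876 -log_files stdout
print(json.dumps(payload))
```

Searching the resulting log for lines containing "spark." (logged by SparkContext at INFO) should reveal the session values.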
      <pubDate>Thu, 18 Aug 2022 21:08:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/350324#M235954</guid>
      <dc:creator>PauloNeves</dc:creator>
      <dc:date>2022-08-18T21:08:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to debug a SQL query that works using a Spark Jupyter Notebook, but fails when executed from Livy?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/351103#M236153</link>
      <description>&lt;P&gt;I solved the problem by turning off this option (i.e., setting it to "false"):&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;("spark.sql.adaptive.enabled", "true")&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;</description>
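A hedged sketch of the fix: spark.sql.adaptive.enabled is the Spark 2.x switch for the adaptive-execution ExchangeCoordinator that appears in the failed assertion, and it can be disabled per session (how the conf reaches the session, e.g. via the Livy session conf, depends on the setup):

```python
# Sketch: disable adaptive query execution for the session running the query.
# In Spark 2.3, the ExchangeCoordinator behind the failed assertion is only
# active while spark.sql.adaptive.enabled is "true".
session_conf = {"spark.sql.adaptive.enabled": "false"}

# Inside an already-running PySpark session, the runtime equivalent would be:
#   spark.conf.set("spark.sql.adaptive.enabled", "false")
print(session_conf["spark.sql.adaptive.enabled"])
```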
      <pubDate>Mon, 29 Aug 2022 22:10:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-debug-a-SQL-query-that-works-using-a-spark-Jupyter/m-p/351103#M236153</guid>
      <dc:creator>PauloNeves</dc:creator>
      <dc:date>2022-08-29T22:10:48Z</dc:date>
    </item>
  </channel>
</rss>

