<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Issue running spark jobs with Airflow in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/377136#M243147</link>
    <description>&lt;P&gt;Hi again&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/78612"&gt;@RangaReddy&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I'm sorry for the huge delay in replying; unfortunately, this triggered a lengthy discussion between us and the AD team.&lt;/P&gt;&lt;P&gt;In the end we managed to get our hands on a keytab file, and we confirmed it works fine by manually submitting the below command:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;kinit -k -t /path/to/keytab/file.keytab username&lt;BR /&gt;&lt;BR /&gt;Unfortunately, when we attempt to pass this with a bash operator from an Airflow DAG, we get the same error:&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;DIV&gt;&lt;SPAN&gt;py4j.protocol.Py4JJavaError&lt;/SPAN&gt;&lt;SPAN&gt;: An error occurred while calling&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;o32.csv&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;: SIMPLE authentication is not enabled. &amp;nbsp;Available:[TOKEN, KERBEROS]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Thank you,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Mario&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Mon, 02 Oct 2023 13:29:01 GMT</pubDate>
    <dc:creator>imule</dc:creator>
    <dc:date>2023-10-02T13:29:01Z</dc:date>
    <item>
      <title>Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375784#M242654</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I've inherited a kerberized Cloudera cluster and I'm learning as I go. Right now I'm trying to get Airflow to work with our Spark jobs, but without success. As I understand it, Airflow was installed by our OS team only after the cluster was configured by Cloudera. It runs on our edge node, from where we run our jobs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Basically I'm using bash operators for my test DAG with the following tasks:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Task 1:&lt;/P&gt;&lt;P&gt;Kinit the user that is running the script:&lt;BR /&gt;"&lt;SPAN&gt;echo 'password' | kinit user@domain"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Task 2:&lt;/P&gt;&lt;P&gt;Download some files from some location.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Task 3:&lt;BR /&gt;&lt;SPAN&gt;spark-submit /path/to/script.py&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Tasks 1 and 2 work fine, but task 3 fails with the following:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;py4j.protocol.Py4JJavaError&lt;/SPAN&gt;&lt;SPAN&gt;: An error occurred while calling &lt;/SPAN&gt;&lt;SPAN&gt;o32.csv&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;: SIMPLE authentication is not enabled. &amp;nbsp;Available:[TOKEN, KERBEROS]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I am a bit confused by this, as I am authenticating the user as a first step. This exact workflow executes just fine when I run it manually on the command line.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Has anyone dealt with a similar issue? Any input would be appreciated as we need to transition to using Airflow.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Many thanks,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Mario&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 21 Apr 2026 06:49:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375784#M242654</guid>
      <dc:creator>imule</dc:creator>
      <dc:date>2026-04-21T06:49:13Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375815#M242670</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100570"&gt;@imule&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In step 3, could you please pass --keytab &amp;lt;key_tab_path&amp;gt; --principal &amp;lt;principal_name&amp;gt; to the spark-submit command?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; In CDP, Airflow integration is not yet supported.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2023 08:20:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375815#M242670</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2023-08-31T08:20:43Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375884#M242690</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/78612"&gt;@RangaReddy&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Is there a way to generate the file myself or do I need to contact our Active Directory administrators for that?&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2023 14:16:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375884#M242690</guid>
      <dc:creator>imule</dc:creator>
      <dc:date>2023-08-31T14:16:55Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375934#M242709</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100570"&gt;@imule&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can follow the steps at the link below to generate the keytab. If you don't have permission, please check with your admin team.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.cloudera.com/data-hub/cloud/access-clusters/topics/dh-retrieving-keytabs.html" target="_blank"&gt;https://docs.cloudera.com/data-hub/cloud/access-clusters/topics/dh-retrieving-keytabs.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Sep 2023 04:53:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/375934#M242709</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2023-09-04T04:53:49Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/376571#M242947</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100570"&gt;@imule&lt;/a&gt;,&amp;nbsp;Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2023 14:02:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/376571#M242947</guid>
      <dc:creator>VidyaSargur</dc:creator>
      <dc:date>2023-09-20T14:02:05Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/377136#M243147</link>
      <description>&lt;P&gt;Hi again&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/78612"&gt;@RangaReddy&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I'm sorry for the huge delay in replying; unfortunately, this triggered a lengthy discussion between us and the AD team.&lt;/P&gt;&lt;P&gt;In the end we managed to get our hands on a keytab file, and we confirmed it works fine by manually submitting the below command:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;kinit -k -t /path/to/keytab/file.keytab username&lt;BR /&gt;&lt;BR /&gt;Unfortunately, when we attempt to pass this with a bash operator from an Airflow DAG, we get the same error:&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;DIV&gt;&lt;SPAN&gt;py4j.protocol.Py4JJavaError&lt;/SPAN&gt;&lt;SPAN&gt;: An error occurred while calling&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;o32.csv&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;org.apache.hadoop.security.AccessControlException&lt;/SPAN&gt;&lt;SPAN&gt;: SIMPLE authentication is not enabled. &amp;nbsp;Available:[TOKEN, KERBEROS]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Thank you,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Mario&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 02 Oct 2023 13:29:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/377136#M243147</guid>
      <dc:creator>imule</dc:creator>
      <dc:date>2023-10-02T13:29:01Z</dc:date>
    </item>
    <item>
      <title>Re: Issue running spark jobs with Airflow</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/377159#M243160</link>
      <description>&lt;P&gt;There are two solutions you can try.&lt;/P&gt;&lt;P&gt;1. Create one more shell operator, perform kinit in it, and after that submit your Spark job.&lt;/P&gt;&lt;P&gt;2. Pass the keytab and principal to spark-submit.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Oct 2023 05:58:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Issue-running-spark-jobs-with-Airflow/m-p/377159#M243160</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2023-10-03T05:58:09Z</dc:date>
    </item>
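The two suggestions in the last reply can be sketched as shell command strings built in Python. This is only a rough illustration, not a confirmed fix for the error in this thread. The function names are hypothetical, the paths and the principal user@DOMAIN are placeholders, and reading option 1 as "kinit and spark-submit in the same shell" is an assumption. The kinit -k -t form and the spark-submit --keytab and --principal options are the ones already quoted in the thread.

```python
import shlex


def kinit_command(keytab_path, principal):
    # kinit -k -t takes the key from a keytab file instead of prompting for a password.
    return shlex.join(["kinit", "-k", "-t", keytab_path, principal])


def spark_submit_command(script_path, keytab_path=None, principal=None):
    args = ["spark-submit"]
    if keytab_path and principal:
        # Option 2: hand the keytab and principal to spark-submit directly.
        args += ["--keytab", keytab_path, "--principal", principal]
    args.append(script_path)
    return shlex.join(args)


def single_task_script(script_path, keytab_path, principal):
    # Option 1 (assumed reading): run kinit and spark-submit in one shell.
    # "set -e" stops the script if kinit fails, so spark-submit is not attempted.
    return "\n".join([
        "set -e",
        kinit_command(keytab_path, principal),
        spark_submit_command(script_path),
    ])


if __name__ == "__main__":
    keytab = "/path/to/keytab/file.keytab"  # placeholder path from the thread
    principal = "user@DOMAIN"               # placeholder principal
    print(single_task_script("/path/to/script.py", keytab, principal))
    print(spark_submit_command("/path/to/script.py", keytab, principal))
```

Either returned string could then be supplied as the bash_command of a single Airflow BashOperator task, so that authentication and submission happen in the same task rather than in separate ones.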
  </channel>
</rss>

