<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: pyspark / pyarrow problem in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88853#M21544</link>
    <description>&lt;P&gt;From the details you shared, we can see that PySpark is pointing to an older version of libboost (libboost_system.so.1.65.1) than the one expected (libboost_system.so.1.66.0):&lt;/P&gt;&lt;P&gt;&lt;CODE&gt;dzdo /opt/cloudera/parcels/Anaconda/bin/conda list |grep boost&lt;/CODE&gt;&lt;BR /&gt;&lt;CODE&gt;libboost 1.65.1 habcd387_4&lt;/CODE&gt;&lt;/P&gt;&lt;P&gt;It looks like the new version of PyArrow was not installed properly. Please try cleaning out the older packages and then installing pyarrow again with the command below:&lt;/P&gt;&lt;P&gt;&lt;CODE&gt;conda install -c conda-forge pyarrow&lt;/CODE&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;BR /&gt;Senthil Kumar&lt;/P&gt;</description>
    <pubDate>Tue, 09 Apr 2019 03:28:45 GMT</pubDate>
    <dc:creator>senthh</dc:creator>
    <dc:date>2019-04-09T03:28:45Z</dc:date>
    <item>
      <title>pyspark / pyarrow problem</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88835#M21543</link>
      <description>&lt;P&gt;&lt;SPAN&gt;We're using Cloudera with the Anaconda parcel on a BDA production cluster.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I tried to run PySpark code that imports the pyarrow package and got the error below.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Traceback (most recent call last):&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101/lib/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 140, in require_minimum_pyarrow_version&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;File "/opt/cloudera/parcels/Anaconda-3.6.5_2/lib/python3.6/site-packages/pyarrow/__init__.py", line 47, in &amp;lt;module&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;from pyarrow.lib import cpu_count, set_cpu_count&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;ImportError: libboost_system.so.1.66.0: cannot open shared object file: No such file or directory&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;During handling of the above exception, another exception occurred:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Traceback (most recent call last):&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;File "lbf_livArea_scr_2.py", line 51, in &amp;lt;module&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;@pandas_udf(schema, PandasUDFType.GROUPED_MAP)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101/lib/spark2/python/lib/pyspark.zip/pyspark/sql/udf.py", line 45, in _create_udf&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101/lib/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 143, in require_minimum_pyarrow_version&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;ImportError: PyArrow &amp;gt;= 0.8.0 must be installed; however, it was not found.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The output of conda list is also below.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;[ihsany@gbbdap02 ~]$ dzdo /opt/cloudera/parcels/Anaconda/bin/conda list |grep arrow&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;arrow-cpp &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.9.0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;py36h1ae9da6_7 &amp;nbsp; &amp;nbsp;&amp;lt;unknown&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;pyarrow &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.9.0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;py36_1 &amp;nbsp; &amp;nbsp;&amp;lt;unknown&amp;gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;[ihsany@gbbdap02 ~]$ dzdo /opt/cloudera/parcels/Anaconda/bin/conda list |grep boost&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;libboost &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1.65.1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; habcd387_4 &amp;nbsp; &amp;nbsp;&amp;lt;unknown&amp;gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 14:17:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88835#M21543</guid>
      <dc:creator>pra_big</dc:creator>
      <dc:date>2022-09-16T14:17:42Z</dc:date>
    </item>
    <item>
      <title>Re: pyspark / pyarrow problem</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88853#M21544</link>
      <description>&lt;P&gt;From the details you shared, we can see that PySpark is pointing to an older version of libboost (libboost_system.so.1.65.1) than the one expected (libboost_system.so.1.66.0):&lt;/P&gt;&lt;P&gt;&lt;CODE&gt;dzdo /opt/cloudera/parcels/Anaconda/bin/conda list |grep boost&lt;/CODE&gt;&lt;BR /&gt;&lt;CODE&gt;libboost 1.65.1 habcd387_4&lt;/CODE&gt;&lt;/P&gt;&lt;P&gt;It looks like the new version of PyArrow was not installed properly. Please try cleaning out the older packages and then installing pyarrow again with the command below:&lt;/P&gt;&lt;P&gt;&lt;CODE&gt;conda install -c conda-forge pyarrow&lt;/CODE&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;BR /&gt;Senthil Kumar&lt;/P&gt;</description>
      <pubDate>Tue, 09 Apr 2019 03:28:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88853#M21544</guid>
      <dc:creator>senthh</dc:creator>
      <dc:date>2019-04-09T03:28:45Z</dc:date>
    </item>
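    <!--
    Editorial sketch, not part of the original thread. The reply above diagnoses
    the failure by comparing the library version named in the ImportError with the
    libboost version reported by conda list. A minimal Python sketch of that
    comparison follows. The names SONAME_PATTERN, parse_version,
    parse_missing_library and versions_differ are hypothetical helpers introduced
    here for illustration; they are not part of PySpark, pyarrow or conda.

```python
import re

# Matches a versioned shared-object name such as "libboost_system.so.1.66.0".
SONAME_PATTERN = re.compile(r"(lib[\w+\-]+)\.so\.(\d+(?:\.\d+)*)")


def parse_version(text):
    """Turn "1.65.1" into (1, 65, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def parse_missing_library(message):
    """Return (library_name, version_tuple) from a loader error, or None."""
    match = SONAME_PATTERN.search(message)
    if match is None:
        return None
    return match.group(1), parse_version(match.group(2))


def versions_differ(message, installed_version):
    """True when the version named in the error differs from the installed one."""
    missing = parse_missing_library(message)
    if missing is None:
        return False
    return missing[1] != parse_version(installed_version)


# Values copied from the traceback and the conda list output in the thread.
error = ("ImportError: libboost_system.so.1.66.0: cannot open shared object "
         "file: No such file or directory")
print(parse_missing_library(error))
print(versions_differ(error, "1.65.1"))
```
    -->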
    <item>
      <title>Re: pyspark / pyarrow problem</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88866#M21545</link>
      <description>Thanks, gentlemen. Let me reinstall. I hope it works.</description>
      <pubDate>Tue, 09 Apr 2019 11:20:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-pyarrow-problem/m-p/88866#M21545</guid>
      <dc:creator>pra_big</dc:creator>
      <dc:date>2019-04-09T11:20:21Z</dc:date>
    </item>
  </channel>
</rss>

