<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Rowwise manipulation of a DataFrame in PySpark. in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Rowwise-manipulation-of-a-DataFrame-in-PySpark/m-p/226337#M188197</link>
    <description>&lt;P&gt;I have a PySpark DataFrame with 87 columns. I want to pass each row of the DataFrame to a function that returns a list for that row, so that I can create a separate column from the result.&lt;/P&gt;&lt;H2&gt;PySpark code&lt;/H2&gt;&lt;H2&gt;UDF:&lt;/H2&gt;&lt;P&gt;&lt;CODE&gt;def make_range_vector(row,categories,ledger):&lt;BR /&gt;    print(type(row),type(categories),type(ledger))&lt;BR /&gt;    category_vector=[]&lt;BR /&gt;    for category in categories:&lt;BR /&gt;        if(row[category]!=0):&lt;BR /&gt;            category_percentage=func.round(row[category]*100/row[ledger])&lt;BR /&gt;            category_vector.append(category_percentage)&lt;BR /&gt;        else:&lt;BR /&gt;            category_vector.append(0)&lt;BR /&gt;    category_vector=sqlCtx.createDataFrame(category_vector,IntegerType())&lt;BR /&gt;    return category_vector&lt;/CODE&gt;&lt;/P&gt;&lt;H2&gt;Main function&lt;/H2&gt;&lt;P&gt;&lt;CODE&gt;pivot_card.withColumn('category_debit_vector',(make_range_vector(struct([pivot_card[x] for x in pivot_card.columns]),pivot_card.columns[3:],'debit')))&lt;/CODE&gt;&lt;/P&gt;&lt;P&gt;I am a beginner in PySpark, and I can't find answers to the questions below.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;The print statement outputs &lt;CODE&gt;&amp;lt;class 'pyspark.sql.column.Column'&amp;gt; &amp;lt;class 'list'&amp;gt; &amp;lt;class 'str'&amp;gt;&lt;/CODE&gt;. Shouldn't it be StructType?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Can I pass a Row object and do something similar, as we do in Pandas?&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
    <pubDate>Mon, 12 Aug 2019 18:33:59 GMT</pubDate>
    <dc:creator>preethyvarma</dc:creator>
    <dc:date>2019-08-12T18:33:59Z</dc:date>
  </channel>
</rss>

