<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hive doesn't display special characters from writeStream in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/hive-doesnt-display-special-charactere-from-writestream/m-p/293506#M216728</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;After some research, I found a solution to this issue. The problem came from the Hive table definition used to store the data.&lt;/P&gt;&lt;P&gt;I was defining some properties of my table like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;hive.createTable("res_idos_0")
        .ifNotExists()
        .prop("serialization.encoding", "UTF-8")
        .prop("escape.delim", "\t")
        .column("t_date", "TIMESTAMP")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when writing with writeStream and the data contains special characters, the escape.delim property is not supported and the characters are not saved correctly.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I removed the escape.delim property from my Hive table definition, and I also added this line to my code to make sure that the files saved in HDFS have the right encoding:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;System.setProperty("file.encoding", "UTF-8")&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 08 Apr 2020 13:13:48 GMT</pubDate>
    <dc:creator>Ellyly</dc:creator>
    <dc:date>2020-04-08T13:13:48Z</dc:date>
    <item>
      <title>Hive doesn't display special characters from writeStream</title>
      <link>https://community.cloudera.com/t5/Support-Questions/hive-doesnt-display-special-charactere-from-writestream/m-p/293347#M216649</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm facing an issue with the display and storage of special characters in Hive.&lt;/P&gt;
&lt;P&gt;I'm using Spark to write a stream into Hive with writeStream, like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;// Write result in hive
    val query = trimmedDF.writeStream
      //.format("console")
      .format("com.hortonworks.spark.sql.hive.llap.streaming.HiveStreamingDataSource")
      .outputMode("append")
      .option("metastoreUri", metastoreUri)
      .option("database", "dwh_prod")
      .option("table", "res_idos_0")
      .option("checkpointLocation", "/tmp/idos_LVD_060420_0")
      .queryName("test_final")
      .option("truncate", "false")
      .option("encoding", "UTF-8")
      .start()

    query.awaitTermination()&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But when the data contains a special character, Hive doesn't display it. I have already set the encoding to UTF-8 on the Hive table:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;select distinct(analyte) from res_idos_0;
+--------------------------------------------+
|                  analyte                   |
+--------------------------------------------+
| D02                                        |
| E                                          |
| E - Hauteur Int��rieure jupe - 6,75mm      |
| Hauteur totale                             |
| Long tube apparent (embout 408 assembl��)  |
| Side streaming - poids apr��s              |
| Tenue tube plongeur                        |
| 1 dose - poids avant                       |
| Diam��tre 1er joint de sertissage          |
| HDS - Saillie Point Mort Bas               |
| P - Epaisseur tourette P5 - 0,51mm         |
+--------------------------------------------+&lt;/LI-CODE&gt;
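&lt;P&gt;The doubled replacement characters in this output are consistent with UTF-8 bytes being decoded by a 7-bit charset somewhere along the write or read path, although that is only one possible explanation. Below is a minimal Scala sketch of that mechanism using only the JDK. The object name MojibakeSketch and the helper decodeUtf8As are made up for this illustration and are not part of Spark or Hive:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical illustration, not code from the streaming job.
object MojibakeSketch {
  // Encode the text as UTF-8, then decode the bytes with another charset.
  def decodeUtf8As(text: String, charset: Charset): String =
    new String(text.getBytes(StandardCharsets.UTF_8), charset)

  def main(args: Array[String]): Unit = {
    // The escape below stands for the accented e in the analyte names.
    val original = "Int\u00e9rieure"
    // UTF-8 stores the accented e as two bytes (0xC3 0xA9); a 7-bit decoder
    // replaces each of them with U+FFFD, giving two replacement characters.
    println(decodeUtf8As(original, StandardCharsets.US_ASCII))
    // Decoding with the matching charset restores the original text.
    println(decodeUtf8As(original, StandardCharsets.UTF_8))
    // Default charset of this JVM; comparing it between the driver and the
    // executors is an assumption about where to look, not a confirmed cause.
    println(Charset.defaultCharset())
  }
}&lt;/LI-CODE&gt;
&lt;P&gt;If the default charset printed on the JVM that writes the files is not UTF-8, that would fit the symptom, but this sketch alone does not prove where the conversion happens.&lt;/P&gt;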
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But if I display the data in the console with writeStream, the special characters are displayed correctly. The same is true if I use the write function to write to Hive like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;final_DF.write.format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
      .mode("overwrite")
      .option("table","dwh_prod.result_idos_lims3")
      .save()&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In that case, the characters are displayed correctly in Hive:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;+-------------------------------------------+
|                  analyte                  |
+-------------------------------------------+
| 1 dose                                    |
| 1 dose (moyenne) - Kinf                   |
| 1 dose (écart type)                       |
| 1 dose - poids avant                      |
| 1 dose individuelle (maxi)                |
| 1,00mm                                    |
| 1,3,5-trioxane                            |&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I use Spark 2.3.2 and Hive 3.1.0.&lt;/P&gt;
&lt;P&gt;Has anyone faced this issue, or does anyone have a clue or a solution for me?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;
&lt;P&gt;Best Regards&lt;/P&gt;</description>
      <pubDate>Mon, 06 Apr 2020 12:53:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/hive-doesnt-display-special-charactere-from-writestream/m-p/293347#M216649</guid>
      <dc:creator>Ellyly</dc:creator>
      <dc:date>2020-04-06T12:53:10Z</dc:date>
    </item>
    <item>
      <title>Re: Hive doesn't display special characters from writeStream</title>
      <link>https://community.cloudera.com/t5/Support-Questions/hive-doesnt-display-special-charactere-from-writestream/m-p/293506#M216728</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;After some research, I found a solution to this issue. The problem came from the Hive table definition used to store the data.&lt;/P&gt;&lt;P&gt;I was defining some properties of my table like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;hive.createTable("res_idos_0")
        .ifNotExists()
        .prop("serialization.encoding", "UTF-8")
        .prop("escape.delim", "\t")
        .column("t_date", "TIMESTAMP")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when writing with writeStream and the data contains special characters, the escape.delim property is not supported and the characters are not saved correctly.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I removed the escape.delim property from my Hive table definition, and I also added this line to my code to make sure that the files saved in HDFS have the right encoding:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;System.setProperty("file.encoding", "UTF-8")&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Apr 2020 13:13:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/hive-doesnt-display-special-charactere-from-writestream/m-p/293506#M216728</guid>
      <dc:creator>Ellyly</dc:creator>
      <dc:date>2020-04-08T13:13:48Z</dc:date>
    </item>
  </channel>
</rss>

