Member since: 12-02-2019
Posts: 19
Kudos Received: 4
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2988 | 04-08-2020 06:13 AM |
| | 2716 | 03-06-2020 08:28 AM |
09-15-2021
05:05 AM
Hello,
For an application, I need to extract the maximum depth of an HDFS directory tree. I know how to do this in a local shell; we can execute:
find /tmp -type d -printf '%d\n' | sort -rn | head -1
So I wanted to do the same with the find command of hdfs dfs:
hdfs dfs -find /tmp -type d
but the -type argument does not exist for hdfs dfs -find; here is the error:
find: Unexpected argument: -type
Does anyone have a solution or advice for this problem?
PS: my Hadoop version is Hadoop 2.6.0-cdh5.13.
Thanks in advance,
Regards
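A possible workaround (an untested sketch against CDH 5.13): hdfs dfs -ls -R prefixes directory entries with a leading 'd', so you can filter on that instead of -type and count the '/' characters in each path to get its depth:

```shell
# Hypothetical workaround: `hdfs dfs -find` here has no -type flag, but
# `hdfs dfs -ls -R` marks directories with a leading 'd' in the permission
# string. awk's gsub returns the number of substitutions, so replacing '/'
# with '/' counts the slashes in each directory path, i.e. its depth.
max_depth() {
  awk '$1 ~ /^d/ { print gsub("/", "/", $NF) }' | sort -rn | head -1
}
# On the cluster: hdfs dfs -ls -R /tmp | max_depth
```

The depth here is the absolute slash count, matching what find -printf '%d\n' would report relative to '/'.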
Labels:
- Apache Sqoop
- HDFS
04-08-2020
06:13 AM
1 Kudo
Hi, after some research I found a solution to this issue. The problem came from the Hive table definition used to store the data. I was defining the properties of my table like this:
hive.createTable("res_idos_0")
.ifNotExists()
.prop("serialization.encoding","UTF-8")
.prop("escape.delim","\t")
.column("t_date","TIMESTAMP")
But in a writeStream with special characters, the escape.delim property is not supported and the characters cannot be saved correctly. So I removed the escape.delim property from my Hive table definition, and I also added this line to my code to be certain that the files saved in HDFS have the right encoding:
System.setProperty("file.encoding", "UTF-8")
04-06-2020
04:17 AM
Hello,
I'm facing an issue with the display and storage of special characters in Hive.
I'm using Spark to write a stream into Hive like this:
// Write result in hive
val query = trimmedDF.writeStream
//.format("console")
.format("com.hortonworks.spark.sql.hive.llap.streaming.HiveStreamingDataSource")
.outputMode("append")
.option("metastoreUri", metastoreUri)
.option("database", "dwh_prod")
.option("table", "res_idos_0")
.option("checkpointLocation", "/tmp/idos_LVD_060420_0")
.queryName("test_final")
.option("truncate", "false")
.option("encoding", "UTF-8")
.start()
query.awaitTermination()
but when the data contains a special character, Hive doesn't display it correctly. I have already set the encoding to UTF-8 on the Hive table:
select distinct(analyte) from res_idos_0;
+--------------------------------------------+
| analyte |
+--------------------------------------------+
| D02 |
| E |
| E - Hauteur Int��rieure jupe - 6,75mm |
| Hauteur totale |
| Long tube apparent (embout 408 assembl��) |
| Side streaming - poids apr��s |
| Tenue tube plongeur |
| 1 dose - poids avant |
| Diam��tre 1er joint de sertissage |
| HDS - Saillie Point Mort Bas |
| P - Epaisseur tourette P5 - 0,51mm |
+--------------------------------------------+
But the special characters display correctly when I print the stream to the console with writeStream, or if I use the write function to write into Hive like this:
final_DF.write.format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
.mode("overwrite")
.option("table","dwh_prod.result_idos_lims3")
.save()
The characters display correctly in Hive:
+-------------------------------------------+
| analyte |
+-------------------------------------------+
| 1 dose |
| 1 dose (moyenne) - Kinf |
| 1 dose (écart type) |
| 1 dose - poids avant |
| 1 dose individuelle (maxi) |
| 1,00mm |
| 1,3,5-trioxane |
I use Spark 2.3.2 and Hive 3.1.0.
Has anyone faced this issue, or does anyone have a clue or a solution for me?
Thanks in advance,
Best Regards
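Not a fix, but a quick local check that may help narrow this down: the paired replacement characters in the output typically mean the bytes on disk are not in the encoding the reader assumes. If you can pull a raw data file down with hdfs dfs -get, iconv can tell valid UTF-8 bytes from single-byte Latin-1 bytes (a sketch; the sample strings below are illustrative):

```shell
# iconv exits non-zero when its input is not valid UTF-8, so it can
# distinguish real UTF-8 bytes from Latin-1 bytes that were mislabeled.
check_utf8() {
  iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 && echo "valid UTF-8" || echo "not UTF-8"
}
printf 'Int\303\251rieure' | check_utf8   # UTF-8 bytes for "e-acute"
printf 'Int\351rieure'     | check_utf8   # Latin-1 byte for "e-acute"
```

If the file itself turns out to be valid UTF-8, the problem is on the reading side (SerDe or client) rather than in the writeStream.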
03-06-2020
08:28 AM
1 Kudo
Hello @pal_1990, I think your input is something like this:
+----------------------------------------------------+
| semicolon.a |
+----------------------------------------------------+
| 1;13004211,13004211_02_13004212,4000000003378605589,1105,2000 |
+----------------------------------------------------+
1. You need to separate the first value from the others; for this I use the posexplode function:
select pe.i,pe.x from semicolon lateral view posexplode(split(a,';')) pe as i,x;
+-------+----------------------------------------------------+
| pe.i | pe.x |
+-------+----------------------------------------------------+
| 0 | 1 |
| 1 | 13004211,13004211_02_13004212,4000000003378605589,1105,2000 |
+-------+----------------------------------------------------+
2. You only select the rows where pe.i = 1:
select t.x from
(select pe.i,pe.x
from semicolon lateral view posexplode(split(a,';')) pe as i,x) t where t.i=1;
+----------------------------------------------------+
| t.x |
+----------------------------------------------------+
| 13004211,13004211_02_13004212,4000000003378605589,1105,2000 |
+----------------------------------------------------+
3. You split the values into columns:
select split(t.x,',')[0] as col1,
split(t.x,',')[1] as col2,
split(t.x,',')[2] as col3,
split(t.x,',')[3] as col4,
split(t.x,',')[4] as col5
from
(select pe.i,pe.x
from semicolon lateral view posexplode(split(a,';')) pe as i,x) t where t.i=1;
+-----------+-----------------------+----------------------+-------+-------+
| col1 | col2 | col3 | col4 | col5 |
+-----------+-----------------------+----------------------+-------+-------+
| 13004211 | 13004211_02_13004212 | 4000000003378605589 | 1105 | 2000 |
+-----------+-----------------------+----------------------+-------+-------+
I hope it will help you.
Best regards
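For a quick sanity check of the field positions outside Hive, the same extraction can be sketched in plain shell (cut fields are 1-based, unlike the 0-based split() indexes in HiveQL):

```shell
# Sketch of the same extraction: drop the part before the ';', then pick
# comma-separated fields with cut (cut -f is 1-based; HiveQL split()[i] is 0-based).
row='1;13004211,13004211_02_13004212,4000000003378605589,1105,2000'
rest=${row#*;}                    # keep everything after the first ';'
col1=$(echo "$rest" | cut -d',' -f1)
col4=$(echo "$rest" | cut -d',' -f4)
echo "$col1 $col4"                # prints: 13004211 1105
```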