Member since 04-19-2020 | 2 Posts | 0 Kudos Received | 0 Solutions
08-19-2022
04:26 AM
Hello @ssubhas, the above worked. However, when we try the same with LazySimpleSerDe, it escapes the delimiter but loads a few NULL values at the end of each row. Please find below the statement I used:

CREATE TABLE test1(5columns string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES( 'separatorChar'='|', 'escapeChar'='\\' )
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

NOTE: I also tried field.delim=| and format.serialization=|. It works when the SerDe properties are not specified and we use the ESCAPED BY clause as you suggested. Is there any way to make it work with LazySimpleSerDe as well? (The data is pipe-delimited and may also contain pipes inside field values.) Please suggest and help.
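For reference, here is a minimal sketch of the same table written with the property names LazySimpleSerDe reads. It rests on an assumption that may not match the actual cause of the NULLs: 'separatorChar' and 'escapeChar' are OpenCSVSerde properties, while LazySimpleSerDe uses 'field.delim', 'escape.delim' and 'serialization.format' (these are the keys a ROW FORMAT DELIMITED ... ESCAPED BY table shows in SHOW CREATE TABLE). The table name test1_lazy and the column names c1 to c5 are placeholders for the five string columns.

```sql
-- Sketch only: test1_lazy and c1..c5 are hypothetical names.
CREATE TABLE test1_lazy (
  c1 STRING,
  c2 STRING,
  c3 STRING,
  c4 STRING,
  c5 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = '|',           -- column separator
  'escape.delim' = '\\',         -- escape character (a single backslash)
  'serialization.format' = '|'
)
STORED AS TEXTFILE;              -- TextInputFormat / HiveIgnoreKeyTextOutputFormat
```

Running SHOW CREATE TABLE on the table that already works with the ESCAPED BY clause, and comparing its SERDEPROPERTIES with the ones above, is a quick way to check whether the property names are what differs.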
09-04-2020
04:20 AM
Hi, to know which flowfile has completed, you can use a PutEmail processor to get an email when a particular flowfile is finished. You can make it dynamic using the db.table.name attribute, which is added by GenerateTableFetch. If you have a lot of flowfiles for a single table, you can merge them with MergeContent on the table name to get a periodic or batch completion status. Another way could be to write successes and failures to, for example, a Hive table, and then check that table for completions and failures. Hope this helps. If this comment helps you find a solution or move forward, please accept it as a solution for other community members.
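The Hive-table option could look roughly like the sketch below. Everything in it is hypothetical: the table name flow_status, its columns, and the 'SUCCESS' / 'FAILURE' values are illustrative, and the flow is assumed to write one row per finished flowfile.

```sql
-- Hypothetical status table: one row per finished flowfile.
CREATE TABLE IF NOT EXISTS flow_status (
  table_name  STRING,     -- e.g. the value of db.table.name
  status      STRING,     -- 'SUCCESS' or 'FAILURE' (assumed values)
  finished_at TIMESTAMP
);

-- Completion summary per source table.
SELECT table_name,
       SUM(CASE WHEN status = 'SUCCESS' THEN 1 ELSE 0 END) AS succeeded,
       SUM(CASE WHEN status = 'FAILURE' THEN 1 ELSE 0 END) AS failed,
       MAX(finished_at) AS last_update
FROM flow_status
GROUP BY table_name;
```

Grouping by table name gives the same per-table view that merging on the table name would, without having to wait for a merge bin to fill.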