Created 12-09-2016 02:22 PM
I tried specifying what Hive must use as null values for the export but it keeps writing out the physical values in the CSV file as "NULL" or "\N' where I need to be empty values just separated by ,. E.g. file line is currently written as W,\N,\N,\N where I need format of W,,,
Created 12-09-2016 03:25 PM
You cannot do this directly from the export command and must do some separate processing.
I feel the best way to do this is to run this pig script on export result
raw = load 'data.csv' using PigStorage(','); nonull = foreach raw generate REPLACE($0, '\\\\N', ''), REPLACE($1, '\\\\N', ''), REPLACE($2, '\\\\N', ''), REPLACE($3, '\\\\N', ''); store nonull into 'nonull/data.csv' using PigStorage(',');
Keep in mind this will result in output in the m-r format in hdfs
data.csv data.csv/_SUCCESS data.csv/part-m-00000 data.csv/part-m-00001 ...
If you want to process this file in hadoop, just point to data.csv
If you want to pull this to edge node with command line use hdfs dfs -getmerge <localpath> nonull/data.csv
If you want to download it using Ambari Files View, just double click on nonull/data.csv the click Select All then Concatenate and it will download as a single file
Created 12-09-2016 03:25 PM
You cannot do this directly from the export command and must do some separate processing.
I feel the best way to do this is to run this pig script on export result
raw = load 'data.csv' using PigStorage(','); nonull = foreach raw generate REPLACE($0, '\\\\N', ''), REPLACE($1, '\\\\N', ''), REPLACE($2, '\\\\N', ''), REPLACE($3, '\\\\N', ''); store nonull into 'nonull/data.csv' using PigStorage(',');
Keep in mind this will result in output in the m-r format in hdfs
data.csv data.csv/_SUCCESS data.csv/part-m-00000 data.csv/part-m-00001 ...
If you want to process this file in hadoop, just point to data.csv
If you want to pull this to edge node with command line use hdfs dfs -getmerge <localpath> nonull/data.csv
If you want to download it using Ambari Files View, just double click on nonull/data.csv the click Select All then Concatenate and it will download as a single file