Created on 04-08-2020 08:51 AM - last edited on 04-08-2020 09:58 AM by cjervis
Hi, I'm attempting to replace values in an ASCII file that im importing from a website. I've added the header with replace text but, I'm having a hard time replacing the spaces with commas so, I can convert the file into csv and later to json. Here is what I'm dealing with.
Edit: I should add that sometimes there is one space other times there are two spaces. As of now I have two spaces in the search attribute and a comma in the replace attribute works for part of this problem.
,
UTF-8
1 MB
Regex Replace
Line-by-Line
All
Before:
Year,Day,Hour,Minute,ID_for_SW_Plasma_spacecraft,Percent_of_interpolation,Timeshift,RMS_Timeshift,RMS_Min_var,Time_btwn_observation_sec,Field_magnutude_average_nT,BY_nT(GSM),BZ_nT_(GSM),RMS_SD_B_scalar_nT,RMS_SD_field_vector_nT,Speed_km/s,Alfven_mach_number,Magnetosonic_Mach_number,BSN_location_Xgse_Re,2019 1 0 0 51 100 2788 164 0.12 999999 5.11 2.00 2.73 0.08 1.03 451.0 9.8 6.5 13.15
2019 1 0 1 51 100 2810 159 0.12 37 5.10 2.33 2.58 0.11 1.04 451.3 9.8 6.5 13.10
2019 1 0 2 51 80 2852 109 0.09 18 4.86 2.37 2.56 0.12 0.56 454.7 10.3 6.7 13.07
2019 1 0 3 51 67 2951 66 0.06 -39 4.78 2.21 2.55 0.03 0.33 452.3 11.0 6.8 13.00
2019 1 0 4 51 100 3025 7 0.00 -13 4.80 2.17 2.37 0.03 0.14 451.4 11.2 6.8 13.00
2019 1 0 5 99 80 2973 111 0.09 111 4.94 2.68 2.39 0.13 0.55 99999.9 999.9 99.9 13.19
2019 1 0 6 51 67 3074 20 0.02 -40 4.88 2.54 2.01 0.02 0.28 451.0 9.8 6.5 13.27
2019 1 0 7 51 50 3114 9 0.00 19 4.82 2.37 2.93 0.02 0.14 451.0 9.9 6.5 13.29
2019 1 0 8 99 999 999999 999999 99.99 999999 9999.99 9999.99 9999.99 9999.99 9999.99 99999.9 999.9 99.9 9999.99
2019 1 0 9 99 100 3036 0 0.00 999999 5.16 3.34 2.44 0.00 0.00 99999.9 999.9 99.9 13.24
2019 1 0 10 99 100 3036 0 0.00 60 5.16 3.34 2.43 0.00 0.00 99999.9 999.9 99.9 13.24
after:
Year,Day,Hour,Minute,ID_for_SW_Plasma_spacecraft,Percent_of_interpolation,Timeshift,RMS_Timeshift,RMS_Min_var,Time_btwn_observation_sec,Field_magnutude_average_nT,BY_nT(GSM),BZ_nT_(GSM),RMS_SD_B_scalar_nT,RMS_SD_field_vector_nT,Speed_km/s,Alfven_mach_number,Magnetosonic_Mach_number,BSN_location_Xgse_Re,2019, 1,0,0 51 100, 2788,,164,0.12 999999,,5.11,,2.00,,2.73,,0.08,,1.03, 451.0, 9.8,6.5, 13.15
2019, 1,0,1 51 100, 2810,,159,0.12,, 37,,5.10,,2.33,,2.58,,0.11,,1.04, 451.3, 9.8,6.5, 13.10
2019, 1,0,2 51,80, 2852,,109,0.09,, 18,,4.86,,2.37,,2.56,,0.12,,0.56, 454.7,10.3,6.7, 13.07
2019, 1,0,3 51,67, 2951,, 66,0.06,,-39,,4.78,,2.21,,2.55,,0.03,,0.33, 452.3,11.0,6.8, 13.00
2019, 1,0,4 51 100, 3025,,,7,0.00,,-13,,4.80,,2.17,,2.37,,0.03,,0.14, 451.4,11.2,6.8, 13.00
2019, 1,0,5 99,80, 2973,,111,0.09,,111,,4.94,,2.68,,2.39,,0.13,,0.55 99999.9 999.9 99.9, 13.19
2019, 1,0,6 51,67, 3074,, 20,0.02,,-40,,4.88,,2.54,,2.01,,0.02,,0.28, 451.0, 9.8,6.5, 13.27
2019, 1,0,7 51,50, 3114,,,9,0.00,, 19,,4.82,,2.37,,2.93,,0.02,,0.14, 451.0, 9.9,6.5, 13.29
2019, 1,0,8 99 999 999999 999999 99.99 999999 9999.99 9999.99 9999.99 9999.99 9999.99 99999.9 999.9 99.9 9999.99
2019, 1,0,9 99 100, 3036,,,0,0.00 999999,,5.16,,3.34,,2.44,,0.00,,0.00 99999.9 999.9 99.9, 13.24
2019, 1,0 10 99 100, 3036,,,0,0.00,, 60,,5.16,,3.34,,2.43,,0.00,,0.00 99999.9 999.9 99.9, 13.24
2019, 1,0 11 99 999 999999 999999 99.99 999999 9999.99 9999.99 9999.99 9999.99 9999.99 99999.9 999.9 99.9 9999.99
2019, 1,0 12 99 999 999999 999999 99.99 999999 9999.99 9999.99 9999.99 9999.99 9999.99 99999.9 999.9 99.9 9999.99
Created 04-08-2020 01:26 PM
@ForrestGump there must be some configuration difference then. I created a simple flow and was able to get below output using exact proc I screen shot above on your Before data:
Year,Day,Hour,Minute,ID_for_SW_Plasma_spacecraft,Percent_of_interpolation,Timeshift,RMS_Timeshift,RMS_Min_var,Time_btwn_observation_sec,Field_magnutude_average_nT,BY_nT(GSM),BZ_nT_(GSM),RMS_SD_B_scalar_nT,RMS_SD_field_vector_nT,Speed_km/s,Alfven_mach_number,Magnetosonic_Mach_number,BSN_location_Xgse_Re,2019,1,0,0,51,100,2788,164,0.12,999999,5.11,2.00,2.73,0.08,1.03,451.0,9.8,6.5,13.15
2019,1,0,1,51,100,2810,159,0.12,37,5.10,2.33,2.58,0.11,1.04,451.3,9.8,6.5,13.10
2019,1,0,2,51,80,2852,109,0.09,18,4.86,2.37,2.56,0.12,0.56,454.7,10.3,6.7,13.07
2019,1,0,3,51,67,2951,66,0.06,-39,4.78,2.21,2.55,0.03,0.33,452.3,11.0,6.8,13.00
2019,1,0,4,51,100,3025,7,0.00,-13,4.80,2.17,2.37,0.03,0.14,451.4,11.2,6.8,13.00
2019,1,0,5,99,80,2973,111,0.09,111,4.94,2.68,2.39,0.13,0.55,99999.9,999.9,99.9,13.19
2019,1,0,6,51,67,3074,20,0.02,-40,4.88,2.54,2.01,0.02,0.28,451.0,9.8,6.5,13.27
2019,1,0,7,51,50,3114,9,0.00,19,4.82,2.37,2.93,0.02,0.14,451.0,9.9,6.5,13.29
2019,1,0,8,99,999,999999,999999,99.99,999999,9999.99,9999.99,9999.99,9999.99,9999.99,99999.9,999.9,99.9,9999.99
2019,1,0,9,99,100,3036,0,0.00,999999,5.16,3.34,2.44,0.00,0.00,99999.9,999.9,99.9,13.24
2019,1,0,10,99,100,3036,0,0.00,60,5.16,3.34,2.43,0.00,0.00,99999.9,999.9,99.9,13.24
I have dropped the template for you here:
https://github.com/steven-dfheinz/NiFi-Templates/blob/master/Replace_Text_Demo.xml
Created on 04-08-2020 10:00 AM - edited 04-08-2020 10:01 AM
Created 04-08-2020 10:33 AM
Unfortunately this method does not work @stevenmatison
I using Replace Text but the only method that has worked so far is below. this has only lead to the following results:
Year,Day,Hour,Minute,ID_for_SW_Plasma_spacecraft,Percent_of_interpolation,Timeshift,RMS_Timeshift,RMS_Min_var,Time_btwn_observation_sec,Field_magnutude_average_nT,BY_nT(GSM),BZ_nT_(GSM),RMS_SD_B_scalar_nT,RMS_SD_field_vector_nT,Speed_km/s,Alfven_mach_number,Magnetosonic_Mach_number,BSN_location_Xgse_Re2019,,,1,,0,,0,51,100,,,2788,,,,164,,0.12,999999,,,,5.11,,,,2.00,,,,2.73,,,,0.08,,,,1.03,,,451.0,,,9.8,,6.5,,,13.15,2019,,,1,,0,,1,51,100,,,2810,,,,159,,0.12,,,,,37,,,,5.10,,,,2.33,,,,2.58,,,,0.11,,,,1.04,,,451.3,,,9.8,,6.5,,,13.10,2019,,,1,,0,,2,51,,80,,,2852,,,,109,,0.09,,,,,18,,,,4.86,,,,2.37,,,,2.56,,,,0.12,,,,0.56,,,454.7,,10.3,,6.7,,,13.07,2019,,,1,,0,,3,51,,67,,,2951,,,,,66,,0.06,,,,-39,,,,4.78,,,,2.21,,,,2.55,,,,0.03,,,,0.33,,,452.3,,11.0,,6.8,,,13.00,2019,,,1,,0,,4,51,100,,,3025,,,,,,7,,0.00,,,,-13,,,,4.80,,,,2.17,,,,2.37,,,,0.03,,,,0.14,,,451.4,,11.2,,6.8,,,13.00,2019,,,1,,0,,5,99,,80,,,2973,,,,111,,0.09,,,,111,,,,4.94,,,,2.68,,,,2.39,,,,0.13,,,,0.55,99999.9,999.9,99.9,,,13.19,2019,,,1,,0,,6,51,,67,,,3074,,,,,20,,0.02,,,,-40,,,,4.88,,,,2.54,,,,2.01,,,,0.02,,,,0.28,,,451.0,,,9.8,,6.5,,,13.27,2019,,,1,,0,,7
\s
,
UTF-8
1 MB
Regex Replace
Line-by-Line
All
Created on 04-08-2020 11:51 AM - edited 04-08-2020 11:51 AM
Notice my settings for All Properties and I use Regex.
If you don't want to use Regex, change Replacement Strategy.
I would recommend doing it as I have shown, with regex on complete flow file. Not line by line. Also, you can append and prepend more text and transformations in same replace too. For Example:
PREPEND
${'$1':replace('"',''):replace(',',' '):replace('\n',',\n')}
APPEND
Created 04-08-2020 12:35 PM
I've made these modification and all it does is route it to failed.
Created 04-08-2020 01:26 PM
@ForrestGump there must be some configuration difference then. I created a simple flow and was able to get below output using exact proc I screen shot above on your Before data:
Year,Day,Hour,Minute,ID_for_SW_Plasma_spacecraft,Percent_of_interpolation,Timeshift,RMS_Timeshift,RMS_Min_var,Time_btwn_observation_sec,Field_magnutude_average_nT,BY_nT(GSM),BZ_nT_(GSM),RMS_SD_B_scalar_nT,RMS_SD_field_vector_nT,Speed_km/s,Alfven_mach_number,Magnetosonic_Mach_number,BSN_location_Xgse_Re,2019,1,0,0,51,100,2788,164,0.12,999999,5.11,2.00,2.73,0.08,1.03,451.0,9.8,6.5,13.15
2019,1,0,1,51,100,2810,159,0.12,37,5.10,2.33,2.58,0.11,1.04,451.3,9.8,6.5,13.10
2019,1,0,2,51,80,2852,109,0.09,18,4.86,2.37,2.56,0.12,0.56,454.7,10.3,6.7,13.07
2019,1,0,3,51,67,2951,66,0.06,-39,4.78,2.21,2.55,0.03,0.33,452.3,11.0,6.8,13.00
2019,1,0,4,51,100,3025,7,0.00,-13,4.80,2.17,2.37,0.03,0.14,451.4,11.2,6.8,13.00
2019,1,0,5,99,80,2973,111,0.09,111,4.94,2.68,2.39,0.13,0.55,99999.9,999.9,99.9,13.19
2019,1,0,6,51,67,3074,20,0.02,-40,4.88,2.54,2.01,0.02,0.28,451.0,9.8,6.5,13.27
2019,1,0,7,51,50,3114,9,0.00,19,4.82,2.37,2.93,0.02,0.14,451.0,9.9,6.5,13.29
2019,1,0,8,99,999,999999,999999,99.99,999999,9999.99,9999.99,9999.99,9999.99,9999.99,99999.9,999.9,99.9,9999.99
2019,1,0,9,99,100,3036,0,0.00,999999,5.16,3.34,2.44,0.00,0.00,99999.9,999.9,99.9,13.24
2019,1,0,10,99,100,3036,0,0.00,60,5.16,3.34,2.43,0.00,0.00,99999.9,999.9,99.9,13.24
I have dropped the template for you here:
https://github.com/steven-dfheinz/NiFi-Templates/blob/master/Replace_Text_Demo.xml
Created 04-08-2020 01:55 PM
Thanks for the effort @stevenmatison I imagine it does have to do with configuration.