Regex doesn't work on ExtractText Processor?


Hello,

I am trying to extract a value from a key/value pair in a JSON record, specifically the value of the "image" key. I used an ExtractText processor, but there was no match, although the same regex did match when I tried it in an online regex extractor (https://onlinetexttools.com/extract-regex-matches-from-text).

 

Regex I used in the ExtractText processor: (?<=\"image\"\s:\s")[A-Z-a-z-0-9\-\:\/\.\_]+

 

My JSON record:

{
  "id" : "03ee73b8-a553-4575-8dfa-d0da4e7939e9",
  "url" : "https://ll.thespacedevs.com/2.0.0/launch/03ee73b8-a553-4575-8dfa-d0da4e7939e9/",
  "launch_library_id" : null,
  "slug" : "falcon-9-block-5-galaxy-33-34",
  "name" : "Falcon 9 Block 5 | Galaxy 33 & 34",
  "status" : {
    "id" : 2,
    "name" : "TBD"
  },
  "net" : "2022-10-05T23:07:00Z",
  "window_end" : "2022-10-06T00:14:00Z",
  "window_start" : "2022-10-05T23:07:00Z",
  "inhold" : false,
  "tbdtime" : false,
  "tbddate" : false,
  "probability" : null,
  "holdreason" : "",
  "failreason" : "",
  "hashtag" : null,
  "launch_service_provider" : {
    "id" : 121,
    "url" : "https://ll.thespacedevs.com/2.0.0/agencies/121/",
    "name" : "SpaceX",
    "type" : "Commercial"
  },
  "rocket" : {
    "id" : 7549,
    "configuration" : {
      "id" : 164,
      "launch_library_id" : 188,
      "url" : "https://ll.thespacedevs.com/2.0.0/config/launcher/164/",
      "name" : "Falcon 9",
      "family" : "Falcon",
      "full_name" : "Falcon 9 Block 5",
      "variant" : "Block 5"
    }
  },
  "mission" : {
    "id" : 5976,
    "launch_library_id" : null,
    "name" : "Galaxy 33 & 34",
    "description" : "Galaxy 33, 34 are two geostationary communications satellites manufactured by Northrop Grumman and operated by Intelsat.",
    "launch_designator" : null,
    "type" : "Communications",
    "orbit" : {
      "id" : 2,
      "name" : "Geostationary Transfer Orbit",
      "abbrev" : "GTO"
    }
  },
  "pad" : {
    "id" : 80,
    "url" : "https://ll.thespacedevs.com/2.0.0/pad/80/",
    "agency_id" : 121,
    "name" : "Space Launch Complex 40",
    "info_url" : null,
    "wiki_url" : "https://en.wikipedia.org/wiki/Cape_Canaveral_Air_Force_Station_Space_Launch_Complex_40",
    "map_url" : "http://maps.google.com/maps?q=28.56194122,-80.57735736",
    "latitude" : "28.56194122",
    "longitude" : "-80.57735736",
    "location" : {
      "id" : 12,
      "url" : "https://ll.thespacedevs.com/2.0.0/location/12/",
      "name" : "Cape Canaveral, FL, USA",
      "country_code" : "USA",
      "map_image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launch_images/location_12_20200803142519.jpg",
      "total_launch_count" : 858,
      "total_landing_count" : 24
    },
    "map_image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launch_images/pad_80_20200803143323.jpg",
    "total_launch_count" : 154
  },
  "webcast_live" : false,
  "image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launcher_images/falcon_9_block__image_20210506060831.jpg",
  "infographic" : null,
  "program" : [ ]
}

 

Expected output:

 https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launcher_images/falcon_9_block__i...

 

 

Thanks for your help.

2 ACCEPTED SOLUTIONS


Hi,

 

You don't have to use the ExtractText processor for this. Use the EvaluateJsonPath processor with the following configuration:

 

[Screenshot: EvaluateJsonPath processor configuration (SAMSAL_0-1664202157672.png)]
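
Since the screenshot is not reproduced in text, here is a rough Python sketch of what a JSON-aware lookup does, as opposed to a regex over the raw text. It is not NiFi code. It assumes the processor is pointed at the top-level "image" key (presumably with a JSONPath such as $.image, which is an assumption because the exact settings in the screenshot are unknown). The function name extract_image is hypothetical, and the record is a trimmed copy of the one in the question.

```python
import json

# Trimmed copy of the record from the question (only a few keys kept).
record = """
{
  "name" : "Falcon 9 Block 5 | Galaxy 33 & 34",
  "pad" : {
    "map_image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launch_images/pad_80_20200803143323.jpg"
  },
  "image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launcher_images/falcon_9_block__image_20210506060831.jpg",
  "infographic" : null
}
"""


def extract_image(content: str):
    """Hypothetical helper: return the top-level "image" value, or None if absent."""
    # Parsing the document makes whitespace and line breaks irrelevant,
    # and the nested "map_image" key cannot be picked up by accident.
    return json.loads(content).get("image")


print(extract_image(record))
```

Because the lookup works on the parsed structure, it gives the same result whether the JSON is pretty-printed or compact.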

 

If you find this helpful, please accept the solution.

Thanks



Hi @rafy

I tried the same regex with the same sample in NiFi 1.13.2 and 1.16.3, and in both versions it extracted the image URL string.

One possible cause is NiFi's JSON formatting (pretty-printing): the original JSON may contain no spaces or line breaks, in which case the \s:\s part of your lookbehind would not match.

An expression that works with the response from https://ll.thespacedevs.com/2.0.0/launch/ is:
(?<=\"image\":\")[A-Z-a-z-0-9\-\:\/\.\_]+

And as @SAMSAL said, EvaluateJsonPath is the right tool for this job.
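
To illustrate the whitespace point, here is a minimal Python sketch. It is not NiFi itself: Python's re module stands in for the Java regex engine that ExtractText uses, and the two sample strings are trimmed versions of the record. The third pattern, which allows optional whitespace and uses a capture group instead of a lookbehind, is only a suggestion and was not tested in NiFi. The names SPACED, UNSPACED, TOLERANT, and find are hypothetical.

```python
import re

URL = ("https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com"
       "/media/launcher_images/falcon_9_block__image_20210506060831.jpg")

# The same fragment of the record, pretty-printed and compact.
pretty = '{\n  "webcast_live" : false,\n  "image" : "' + URL + '",\n  "infographic" : null\n}'
compact = '{"webcast_live":false,"image":"' + URL + '","infographic":null}'

# Pattern from the question: expects exactly one whitespace on each side of ':'.
SPACED = r'(?<=\"image\"\s:\s")[A-Z-a-z-0-9\-\:\/\.\_]+'
# Pattern from this reply: expects no whitespace around ':'.
UNSPACED = r'(?<=\"image\":\")[A-Z-a-z-0-9\-\:\/\.\_]+'
# Suggested alternative (assumption): optional whitespace, value in group 1.
TOLERANT = r'"image"\s*:\s*"([^"]+)"'


def find(pattern: str, text: str):
    """Return group 1 if the pattern has a capture group, else the whole match."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.group(1) if m.groups() else m.group(0)


for name, pattern in [("SPACED", SPACED), ("UNSPACED", UNSPACED), ("TOLERANT", TOLERANT)]:
    print(name, "pretty:", find(pattern, pretty) is not None,
          "compact:", find(pattern, compact) is not None)
```

Each lookbehind pattern matches only the layout it was written for, which would explain a regex that works on a formatted sample but finds nothing in the actual flowfile content.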


3 REPLIES


Thank you all.

I eventually used EvaluateJsonPath to extract the URL. I had lost sight of the obvious and was applying a complex solution to a simple problem.