
Regex doesn't work on ExtractText Processor?

Rising Star

Hello,

I am trying to extract the value of the "image" key from a JSON record of key/value pairs. I used an ExtractText processor, but it found no match, even though the same regex matched when I tested it in an online regex extractor (https://onlinetexttools.com/extract-regex-matches-from-text).

Regex I used in the ExtractText processor: (?<=\"image\"\s:\s")[A-Z-a-z-0-9\-\:\/\.\_]+

My JSON record:

{
  "id" : "03ee73b8-a553-4575-8dfa-d0da4e7939e9",
  "url" : "https://ll.thespacedevs.com/2.0.0/launch/03ee73b8-a553-4575-8dfa-d0da4e7939e9/",
  "launch_library_id" : null,
  "slug" : "falcon-9-block-5-galaxy-33-34",
  "name" : "Falcon 9 Block 5 | Galaxy 33 & 34",
  "status" : {
    "id" : 2,
    "name" : "TBD"
  },
  "net" : "2022-10-05T23:07:00Z",
  "window_end" : "2022-10-06T00:14:00Z",
  "window_start" : "2022-10-05T23:07:00Z",
  "inhold" : false,
  "tbdtime" : false,
  "tbddate" : false,
  "probability" : null,
  "holdreason" : "",
  "failreason" : "",
  "hashtag" : null,
  "launch_service_provider" : {
    "id" : 121,
    "url" : "https://ll.thespacedevs.com/2.0.0/agencies/121/",
    "name" : "SpaceX",
    "type" : "Commercial"
  },
  "rocket" : {
    "id" : 7549,
    "configuration" : {
      "id" : 164,
      "launch_library_id" : 188,
      "url" : "https://ll.thespacedevs.com/2.0.0/config/launcher/164/",
      "name" : "Falcon 9",
      "family" : "Falcon",
      "full_name" : "Falcon 9 Block 5",
      "variant" : "Block 5"
    }
  },
  "mission" : {
    "id" : 5976,
    "launch_library_id" : null,
    "name" : "Galaxy 33 & 34",
    "description" : "Galaxy 33, 34 are two geostationary communications satellites manufactured by Northrop Grumman and operated by Intelsat.",
    "launch_designator" : null,
    "type" : "Communications",
    "orbit" : {
      "id" : 2,
      "name" : "Geostationary Transfer Orbit",
      "abbrev" : "GTO"
    }
  },
  "pad" : {
    "id" : 80,
    "url" : "https://ll.thespacedevs.com/2.0.0/pad/80/",
    "agency_id" : 121,
    "name" : "Space Launch Complex 40",
    "info_url" : null,
    "wiki_url" : "https://en.wikipedia.org/wiki/Cape_Canaveral_Air_Force_Station_Space_Launch_Complex_40",
    "map_url" : "http://maps.google.com/maps?q=28.56194122,-80.57735736",
    "latitude" : "28.56194122",
    "longitude" : "-80.57735736",
    "location" : {
      "id" : 12,
      "url" : "https://ll.thespacedevs.com/2.0.0/location/12/",
      "name" : "Cape Canaveral, FL, USA",
      "country_code" : "USA",
      "map_image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launch_images/location_12_20200803142519.jpg",
      "total_launch_count" : 858,
      "total_landing_count" : 24
    },
    "map_image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launch_images/pad_80_20200803143323.jpg",
    "total_launch_count" : 154
  },
  "webcast_live" : false,
  "image" : "https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launcher_images/falcon_9_block__image_20210506060831.jpg",
  "infographic" : null,
  "program" : [ ]
}

 

Expected output:

 https://spacelaunchnow-prod-east.nyc3.digitaloceanspaces.com/media/launcher_images/falcon_9_block__i...

 

 

Thanks for your help.

2 ACCEPTED SOLUTIONS

Master Collaborator

Hi,

 

You don't have to use the ExtractText processor for this. Use the EvaluateJsonPath processor with the following configuration:

 

[Screenshot: EvaluateJsonPath processor configuration (SAMSAL_0-1664202157672.png)]
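
The screenshot is not reproduced in this text. As a rough sketch of what a JSON path lookup does, and assuming (this is not taken from the screenshot) an expression such as `$.image` that selects the top-level `image` field, the equivalent extraction in Python might look like the following. The `record` value and its `example.invalid` URLs are shortened, hypothetical stand-ins for the full JSON in the question.

```python
import json

# Hypothetical, shortened stand-in for the record in the question.
# The nested "map_image" field is kept to show it is not picked up.
record = '''
{
  "pad" : {
    "map_image" : "https://example.invalid/pad.jpg"
  },
  "image" : "https://example.invalid/falcon_9.jpg"
}
'''


def top_level_image(text: str) -> str:
    """Parse the JSON text and return the top-level "image" value.

    This mirrors what a JSONPath like $.image (an assumed expression)
    would select: only the root-level key, regardless of whitespace.
    """
    return json.loads(text)["image"]


print(top_level_image(record))
```

Because the JSON is parsed rather than pattern-matched, the result does not depend on spacing or line breaks in the content.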

 

If you find this helpful, please accept the solution.

Thanks


Explorer

Hi @rafy

I tried the same regex on the same sample in NiFi 1.13.2 and 1.16.3, and in both versions it returned the image URL string.

The cause may be NiFi's JSON pretty-printing: the raw JSON contains no spaces or line breaks, so the whitespace your regex expects around the colon is not there.

An expression that works with the raw response from https://ll.thespacedevs.com/2.0.0/launch/ is:
(?<=\"image\":\")[A-Z-a-z-0-9\-\:\/\.\_]+

and as @SAMSAL said, EvaluateJsonPath is the right tool for this job.
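
To illustrate the whitespace point, here is a small sketch comparing the two expressions from this thread on a pretty-printed and a compact version of the same field. NiFi evaluates regexes with Java's engine; Python's `re` is used here only for illustration, on the assumption that these particular patterns behave the same way in both. The sample strings, the `example.invalid` URL, the `first` helper, and the third, whitespace-tolerant pattern are hypothetical additions and do not come from the thread.

```python
import re

URL = "https://example.invalid/media/falcon_9.jpg"  # hypothetical value

# The same field, pretty-printed (as displayed) and compact (as assumed on the wire).
pretty = '{\n  "webcast_live" : false,\n  "image" : "' + URL + '"\n}'
compact = '{"webcast_live":false,"image":"' + URL + '"}'

# Pattern from the question: exactly one whitespace character on each side of ':'.
spaced = re.compile(r'(?<=\"image\"\s:\s")[A-Z-a-z-0-9\-\:\/\.\_]+')
# Pattern from this reply: no whitespace around ':'.
tight = re.compile(r'(?<=\"image\":\")[A-Z-a-z-0-9\-\:\/\.\_]+')
# Assumed alternative: optional whitespace, value taken from capture group 1.
flexible = re.compile(r'"image"\s*:\s*"([^"]+)"')


def first(pattern, text):
    """Return the first match (capture group 1 if the pattern has one), or None."""
    m = pattern.search(text)
    if m is None:
        return None
    return m.group(1) if pattern.groups else m.group(0)


for name, pattern in [("spaced", spaced), ("tight", tight), ("flexible", flexible)]:
    print(name, first(pattern, pretty), first(pattern, compact))
```

Each lookbehind pattern matches only the layout it was written for, while the capture-group variant accepts both.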


3 REPLIES

Rising Star

Thank you all.

I eventually used a JSON path to extract the URL. My mind had gone astray: I was applying a complex solution to a simple problem.
