Member since: 11-05-2019
Posts: 25
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
260 | 01-15-2020 01:36 PM |
10-14-2020
05:34 AM
I ended up running unzip from an ExecuteProcess processor, followed by a ListFile to pick up the new files created by the unzip command.
07-26-2020
02:20 PM
Hi, sorry for the bump. Any opinions on this topic? Thanks!
07-10-2020
03:22 PM
Hi! I've been using NiFi for around a year and I like it more every day: it's very flexible, there are processors for every need, I have some nice reusable templates, etc. So here goes the question: why is there so much more hype around Airflow? I haven't done a deep dive on the platform, but I watched some tutorials for ETL tasks and it still doesn't feel as friendly as NiFi for the same tasks. I started to wonder why everyone defaults to Airflow and doesn't even try NiFi. Should I be testing Airflow for the same tasks? Both seem to have a nice user base (it's a strange metric, but searching LinkedIn for both platforms, Airflow gives me double the results, 4k vs 2k). Maybe the NiFi support forum on Cloudera is somewhat biased, but I'd love to read some opinions. Thanks!
07-05-2020
03:16 PM
Sorry for the bump, but I'm very curious how everyone handles this kind of *possible* security issue. Thanks!
06-25-2020
06:39 PM
I have my credentials stored in AWS Secrets Manager. I use a PutLambda to retrieve them, send the result to a JSON extract step, and then to an InvokeHTTP processor, where I use the credentials in the body of a POST message to get a bearer token. Between those processors the secret travels in plaintext. Is there a way to hide the credentials? The idea is to rotate the credentials on AWS so that the only way to view them is through the AWS console; that way we can log access to the secret (when and who). Thank you!
06-04-2020
01:21 PM
Hi, I get a rar file from an SFTP server and have it in a flowfile. I'm trying to run unrar on the flowfile with an ExecuteStreamCommand, without persisting it to disk, with no luck. I tried setting the working directory to /tmp too. unrar is in the path, and I'm executing: Is there any way to do this, or do I have to dump the files to disk, decompress them there with this processor, and get the files again? Thanks!
06-04-2020
05:24 AM
Great! thanks!
06-03-2020
01:54 PM
Maybe it's not implemented: I'm logged in to NiFi, I commit a processor, then go into nifi-registry and it shows the commit was made by anonymous. I have no authentication on nifi-registry, but I do have OpenID in NiFi; shouldn't it use that user to log the commit? Thank you!
06-03-2020
01:51 PM
Hi! I have a DBCPConnectionPool on NiFi 1.11.3 with this configuration: Normally it works OK, but sometimes a query gets stuck on a PutSQL: no errors for minutes and then a timeout (these are very basic queries). I tried some settings on the pool, but I don't think I understand the best way to troubleshoot this. I was thinking of adding a retry + counter + error to each processor, but I don't know where to set a timeout for a query measured from its start rather than per connection. Maybe one way is to not keep connections open until the query starts: how should I set the pool to create the connection on request from a processor and terminate it after execution? I think it's an Athena problem with pools, because I have no problem with other databases like Postgres and MySQL. Any ideas? Thank you!
05-24-2020
05:50 AM
Hi, it seems I forgot to set that variable in the clean test I made for the post. This is the current setting: and this is the error I get with the context added: Thank you!
05-21-2020
01:27 PM
Hi, sorry for the bump. I don't know what else to try; any lead will help. Thank you!
05-15-2020
11:36 AM
Hi, I'm having a problem with self S2S reports. It's a NiFi 1.11.3 standalone (no cluster) Linux deployment. This is the related configuration:

# Site to Site properties
nifi.remote.input.host=localhost
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10443
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs

# web properties #
nifi.web.war.directory=./lib
nifi.web.http.host=
nifi.web.http.port=
nifi.web.http.network.interface.default=
nifi.web.https.host=my.domain.com
nifi.web.https.port=8443
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
nifi.web.proxy.host=

nifi.security.keystore=./conf/keystore.jks
nifi.security.keystoreType=jks
nifi.security.keystorePasswd=xxxxxxxxxxxx
nifi.security.keyPasswd=xxxxxxxxxxxxxxxx
nifi.security.truststore=./conf/truststore.jks
nifi.security.truststoreType=jks
nifi.security.truststorePasswd=xxxxxxxxxxxxxxxxx
nifi.security.user.authorizer=managed-authorizer
nifi.security.user.login.identity.provider=
nifi.security.ocsp.responder.url=
nifi.security.ocsp.responder.certificate=

And the processor configuration, StandardRestrictedSSLContextService (using the self-signed keystore from NiFi that allows it to work securely):

And the SiteToSiteBulletinReportingTask:

With that configuration I receive these errors:

SiteToSiteBulletinReportingTask[id=017111a7-83c2-1c18-25d3-ad4d5f780eb1] Error running task SiteToSiteBulletinReportingTask[id=017111a7-83c2-1c18-25d3-ad4d5f780eb1] due to org.apache.nifi.processor.exception.ProcessException: Failed to send Bulletins to destination due to IOException:null
SiteToSiteBulletinReportingTask[id=017111a7-83c2-1c18-25d3-ad4d5f780eb1] org.apache.nifi.remote.client.PeerSelector@1e7445c6 Unable to refresh Remote Group's peers due to null

If I change http to https I receive:

SiteToSiteBulletinReportingTask[id=017111a7-83c2-1c18-25d3-ad4d5f780eb1] Error running task SiteToSiteBulletinReportingTask[id=017111a7-83c2-1c18-25d3-ad4d5f780eb1] due to org.apache.nifi.processor.exception.ProcessException: Failed to send Bulletins to destination due to IOException:sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

To test whether NiFi could see itself I tried this. The failure is about the self-signed certificate, so it seems OK:

curl https://host123.internal:8443/nifi
curl: (60) SSL certificate problem: self signed certificate in certificate chain
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Ignoring the certificate, I reach Jetty:

curl --insecure -I https://host123.internal:8443/nifi
HTTP/1.1 302 Found
Date: Fri, 15 May 2020 18:21:49 GMT
Location: https://host123.internal:8443/nifi/
Content-Length: 0
Server: Jetty(9.4.19.v20190610)

To secure the connection on installation (http to https) I used the security toolkit and copied the files to their locations under the nifi/conf folder, as seen in the configuration file at the top of the post:

./bin/tls-toolkit.sh standalone -n 'localhost(1)' -C 'CN=my_user,OU=NIFI' -O -o ../security_output

Any idea what I could be doing wrong with the S2S configuration? Thanks!
03-24-2020
03:43 PM
OK, this is all on me for not understanding permissions correctly. I thought that if a permission wasn't configured, it inherited the permissions of NiFi. So: I'm admin, I created the group, so I should have access. You made me check again and I thank you for that!
03-23-2020
11:38 AM
Sorry for the bump; I have more time than usual to solve this kind of problem. Any ideas? It's becoming hard to debug flows without this. Thank you!
03-17-2020
02:01 PM
Hi, I currently have 2 process groups on NiFi 1.11.3. The native one shows data provenance on every processor, as it should.
The imported template (from 1.9.2), inside another group, shows a blank provenance window.
Any idea what could be causing this?
Thank you!
03-13-2020
11:38 AM
Hi, I just deployed a new NiFi instance and I want to manage credentials properly. Some I can store in the Parameter Context, but others have to be posted using an InvokeHTTP (like OAuth secrets) in a field that is not a sensitive property, and I don't want them sitting there in plaintext.
How do you store this type of credential? I thought about AWS Secrets Manager, but I don't know if I can access it from NiFi. I could keep a file on S3 too, but again it's stored in plain text, not decrypted on use.
I think I need something like the Parameter Context that lets me get the decrypted value when I need to use it outside sensitive fields.
Any ideas? How do you manage this? Decrypt a file on the fly?
Thank you!
- Tags:
- aws
- Credentials
- NiFi
03-13-2020
11:01 AM
Hi, sorry for bumping this. There is a typo in the subject but I think it's understandable. Any idea how to do at least point 1: 1. Call the request-token API with an already known secret (that I don't want to store unencrypted). Thank you!
03-01-2020
02:14 PM
Hi, is there a way to log in to NiFi Registry via OpenID (e.g. Google)? I'm thinking about how I can secure the instance without a domain, like in NiFi. Thanks!
03-01-2020
02:05 PM
I am working in an AWS environment with NiFi at the core of my ETL process. This is a new installation, so I have no monitoring set up yet, and I'd like to hear what practices you use to monitor NiFi. I used to send Slack messages for FAIL and SUCCESS, but some flows are very long and it doesn't seem viable to put a Slack call on every possible point of failure; that can be a LOT of processors. How do you do it? Insert into MySQL to track start/end and alarm if something doesn't end? A LOT of Slack/mail processors? Dashboards (how do you feed them)? Any suggestions? Thank you!
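The "record start/end and alarm if something doesn't end" idea from the question can be sketched outside NiFi like this. It is only an illustration under assumptions: SQLite in memory stands in for MySQL, and the table name `flow_runs`, the function names, and the one-hour threshold are all made up for the example. How the start and end markers get written from a flow is left open.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; a real setup would point at a shared database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flow_runs ("
        " run_id TEXT PRIMARY KEY,"
        " flow TEXT NOT NULL,"
        " started_at REAL NOT NULL,"
        " ended_at REAL)"  # NULL until the run reports that it finished
    )
    return conn


def mark_start(conn: sqlite3.Connection, run_id: str, flow: str, now: float) -> None:
    """Record that a run began (written at the first step of a flow)."""
    conn.execute(
        "INSERT INTO flow_runs (run_id, flow, started_at) VALUES (?, ?, ?)",
        (run_id, flow, now),
    )


def mark_end(conn: sqlite3.Connection, run_id: str, now: float) -> None:
    """Record that a run finished (written at the last step of a flow)."""
    conn.execute("UPDATE flow_runs SET ended_at = ? WHERE run_id = ?", (now, run_id))


def stalled_runs(conn: sqlite3.Connection, now: float, max_seconds: float) -> list[str]:
    """Return ids of runs that started more than max_seconds ago and never ended."""
    rows = conn.execute(
        "SELECT run_id FROM flow_runs"
        " WHERE ended_at IS NULL AND ? - started_at > ?"
        " ORDER BY run_id",
        (now, max_seconds),
    )
    return [row[0] for row in rows]


# Timestamps are plain seconds here so the example is deterministic.
conn = open_store()
mark_start(conn, "run-1", "daily_etl", now=0.0)
mark_start(conn, "run-2", "daily_etl", now=100.0)
mark_end(conn, "run-2", now=160.0)

stalled = stalled_runs(conn, now=4000.0, max_seconds=3600.0)
print(stalled)
```

With this shape only two points per flow write a marker, and a single periodic check raises the alarm, instead of one notification call per possible point of failure.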
01-18-2020
11:51 AM
Hi, I have a flowfile with this format:
<td>Item1</td> <td class="dest">50.3421</td> <td class="dest">20.5547</td>
I need to write a query with the parameters, so I need to extract the numbers, with this for example:
(\d{2}\.\d{4})
And put the 2 results in:
Insert into table(a,b) VALUES($1,$2)
How can I do this?
Thanks!
I tried with ReplaceText but it's not what I'm looking for.
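To make the intended logic concrete, here is a minimal sketch in plain Python rather than in a NiFi processor. It assumes exactly two numbers per row, and the table name `my_table` and the `?` placeholder style are assumptions for the example. Note that the dot is escaped so it only matches a literal decimal point.

```python
import re

# Two digits, a literal dot, four digits (for example 50.3421).
NUMBER = re.compile(r"\d{2}\.\d{4}")


def build_insert(html_row: str) -> tuple[str, tuple[float, float]]:
    """Extract the two numbers from one row and return (sql, parameters)."""
    values = NUMBER.findall(html_row)
    if len(values) != 2:
        raise ValueError(f"expected 2 numbers, found {len(values)}")
    # my_table is a placeholder name; values are passed as parameters,
    # not concatenated into the SQL text.
    sql = "INSERT INTO my_table (a, b) VALUES (?, ?)"
    return sql, (float(values[0]), float(values[1]))


row = '<td>Item1</td> <td class="dest">50.3421</td> <td class="dest">20.5547</td>'
sql, params = build_insert(row)
print(sql, params)
```

Keeping the values as parameters instead of substituting them into the statement text avoids quoting problems if the content ever changes shape.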
01-15-2020
01:36 PM
1 Kudo
Solved. PutDatabaseRecord gives you a mapping error if the table doesn't exist or if you don't have permissions. I spent hours fighting with the column names and it was a database problem.
01-15-2020
01:21 PM
Hi, I'm trying to upload a CSV to a Redshift database using GetFile -> PutDatabaseRecord and I'm getting this error:
PutDatabaseRecord:
CSVReader (my file has headers; using them instead of hardcoding the schema makes everything easier to scale):
Null string: NULL
CSV:
A,B,C,D,E,F,G,H,I
2020-01-15,AX,COD245,NO,PASS,R,,,
2020-01-15,AX,COD235,YES,PASS,R,,,
The table has the same field names. Any idea what I'm doing wrong? Is there any way to see the query it's making? That would make this easier to debug. Thank you!
12-06-2019
01:20 PM
Hi, I'm having a problem with this. I decompress a gz and it gives me 10k+ JSON files with 1 element each.
I apply a JOLT transformation with SHIFT, only to reorder the fields.
Now I have this problem: when trying to InferAvroSchema, it finds spaces in some keys. The keys can change, so I don't want to force the schema.
How can I replace the spaces in all the keys (with "_") while keeping the current tree of the JSON? I don't want to hardcode anything if possible, because some of the keys are actions, and when a new element is created in the app which generates the JSON, a new child key is added.
My last try was this:
https://community.cloudera.com/t5/Support-Questions/How-to-do-JOLT-replace-on-all-JSON-keys-in-Nifi/td-p/178416
But it doesn't modify the child elements, only the root ones, and if I tell it which element I want to replace, it "shifts" it to the root and all the other elements disappear.
The JSON is something like:
{ "server_received_time": "2019-08-01", "app": 8, "event_id": int, "event_properties":{"event buy":"one", "event try":"yes"}, "event_state":{"event profile":"absent", "event_fail":"yes", "event text":"will try again"} }
The JSON has a lot more keys. A space will not appear in the root elements but it can in the child elements, so I only need to replace those, keeping the others intact.
The transformations I tried make me lose the schema format or force me to name all the child elements (which I want to avoid because I don't know if new ones will appear).
The idea is to remove the spaces in the keys so I can InferAvroSchema without losing elements and continue to process and store them.
Any idea?
Thank you!
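For reference, the transformation being asked for (rename every key at every depth, keep the tree as it is) is a short recursive walk. This is a plain Python sketch, not a JOLT spec; the function name `underscore_keys` is made up, and the `event_id` value of 1 stands in for the `int` placeholder in the sample above.

```python
import json


def underscore_keys(node):
    """Return a copy of node with spaces in every dict key replaced by '_'.

    Values are left untouched; nested dicts and lists are walked recursively.
    If a dict holds both "event fail" and "event_fail", the later key wins.
    """
    if isinstance(node, dict):
        return {key.replace(" ", "_"): underscore_keys(value) for key, value in node.items()}
    if isinstance(node, list):
        return [underscore_keys(item) for item in node]
    return node


sample = {
    "server_received_time": "2019-08-01",
    "app": 8,
    "event_id": 1,  # placeholder value for the example
    "event_properties": {"event buy": "one", "event try": "yes"},
    "event_state": {
        "event profile": "absent",
        "event_fail": "yes",
        "event text": "will try again",
    },
}

cleaned = underscore_keys(sample)
print(json.dumps(cleaned, indent=2))
```

Because no key names are listed anywhere, new child keys added by the app later would be handled without any change to the function.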
12-02-2019
05:42 AM
Has anyone solved this? I'm having the same problem when trying to unpack a zip: unsupported feature data descriptor used in entry /folder/file#1.gz
11-05-2019
01:23 PM
Hi, I have an API call to a service. That service has different types of processes, and the API cannot filter them.
So when I query the API it returns JSON with 30 common keys and another 15 that vary for each case. The question is: is there a way to automatically adapt the schema so I can use MergeRecord? The result goes to S3.
There are a lot of cases, and it seems more are added once in a while. Is there a "generic" way to tell MergeRecord "if you see any change, make another group"? I don't have a key whose value determines the group. Any ideas?
Thanks!
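One way to picture "if you see any change, make another group" without a dedicated grouping key is to use the set of key names itself as the group identifier. The sketch below is plain Python, not a MergeRecord configuration; the function names and the sample records are invented for the example, and it only looks at top-level keys.

```python
from collections import defaultdict


def schema_fingerprint(record: dict) -> tuple[str, ...]:
    """Identify a record's shape by its sorted top-level key names."""
    return tuple(sorted(record))


def group_by_schema(records: list[dict]) -> dict[tuple[str, ...], list[dict]]:
    """Put records with identical key sets into the same group."""
    groups: dict[tuple[str, ...], list[dict]] = defaultdict(list)
    for record in records:
        groups[schema_fingerprint(record)].append(record)
    return dict(groups)


# Hypothetical API results: shared keys plus a few that vary by case.
records = [
    {"id": 1, "status": "ok", "retries": 0},
    {"id": 2, "status": "ok", "retries": 3},
    {"id": 3, "status": "fail", "error_code": "E42"},
]

groups = group_by_schema(records)
for keys, members in groups.items():
    print(keys, len(members))
```

A new case with a previously unseen set of keys simply produces a new fingerprint, so nothing has to be updated when the service adds process types.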