Member since: 06-26-2015
Posts: 515
Kudos Received: 140
Solutions: 114
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2573 | 09-20-2022 03:33 PM |
|  | 6967 | 09-19-2022 04:47 PM |
|  | 3679 | 09-11-2022 05:01 PM |
|  | 4284 | 09-06-2022 02:23 PM |
|  | 6790 | 09-06-2022 04:30 AM |
02-15-2022
04:02 PM
@wichovalde , the error below is a server-side error that should (hopefully) be logged in the Atlas server log:

Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: https://<atlas-server>:31443/api/atlas/v2/entity/uniqueAttribute/type/hive_db?attr%3AqualifiedName=inventarios%40cm&ignoreRelationships=true&minExtInfo=true&user.name=superuser, status: 500, message: Server Error

Try to find the entries in the Atlas server log that match this call; they may tell you a bit more about the problem. If the Atlas log doesn't seem to have the corresponding information, try setting its log threshold to DEBUG in Cloudera Manager, restarting the service, and repeating the test.

André
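To locate the matching entries, something like the following can be used on the Atlas host. The log path is an assumption for a typical CM-managed deployment; adjust it for your environment:

```shell
# Hypothetical Atlas log location on a CM-managed node; adjust for your deployment
ATLAS_LOG="/var/log/atlas/application.log"
# Pull the most recent entries matching the failing hive_db lookup, if the log exists
grep -n 'uniqueAttribute/type/hive_db' "$ATLAS_LOG" 2>/dev/null | tail -n 20 || true
```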
02-14-2022
02:45 AM
I thought about a Groovy script, but I couldn't find how to loop over each flowfile or how to get the count of the files.
02-13-2022
02:29 PM
1 Kudo
Hi @cdh
Just to add on to the answer given above by @araujo , I wanted to address the second part of your question:
"If not can i install CDP without license or trail versions. if kindly provide links to download and installation document as i never installed CDP."
No, you cannot install a non-trial version of CDP without a valid Cloudera subscription. Cloudera does have programs to support those doing a legitimate evaluation/PoC of Cloudera's data platform software for lengths of time beyond that allowed by trial versions, however. Your best approach, if you're interested in that, would be to contact the Cloudera Sales Team to find out more about your company's options.
If you're honestly looking to evaluate a data platform, you can currently do so without an existing valid Cloudera subscription by downloading and installing the Trial Version of CDP Private Cloud Base Edition of Cloudera Data Platform. A link to the documentation describing in detail how to install this version of CDP Private Cloud Base can be found on that page, labeled CDP Private Cloud Base Trial Installation.
Cheers!
02-12-2022
08:42 PM
1 Kudo
Some time ago I faced an interesting problem with a cluster failing to start after I replaced an MIT KDC with a FreeIPA KDC.
For the replacement, I installed the ipa-client package on the cluster nodes and then changed the KDC configuration in Cloudera Manager (CM): changed the realm and KDC details, imported the Kerberos user, re-generated credentials, etc.
The cluster refused to start. Besides that, CM's KDC Login monitor kept complaining about the KDC not being healthy. I could manually kinit successfully, though, and there seemed to be no KDC problems at first glance.
After enabling debug at different places I saw that there were socket timeouts when processes tried to connect to the KDC and that those processes were actually trying to connect to the KDC over UDP, rather than TCP. The UDP requests explained the problem, since UDP traffic was blocked between the cluster and the KDC.
What's strange, though, is that the krb5.conf created by the ipa-client install had the following configuration:
udp_preference_limit = 0
According to the MIT documentation, this should force all the communication to be over TCP, instead of UDP. From the MIT website:
"When sending a message to the KDC, the library will try using TCP before UDP if the size of the message is above udp_preference_limit. If the message is smaller than udp_preference_limit, then UDP will be tried before TCP. Regardless of the size, both protocols will be tried if the first attempt fails."
Even though the "library" above doesn't refer to the Java library, Java does recognize the udp_preference_limit parameter from the krb5.conf, as explained here.
So, I'd expect that, with that setting, TCP would be tried first for all requests, but it was not. And after 3 UDP attempts, the connection would actually fail altogether without trying to connect over TCP.
I found it interesting, though, that the ipa-client installation set that value to 0. At Cloudera, we have always recommended that customers set it to 1 instead. So I went ahead and changed the entry in the krb5.conf to:
udp_preference_limit = 1
And amazingly, everything worked after that! The debug logs no longer showed traces of UDP requests, the cluster came up correctly, and the CM alerts went away.
Interesting how something really small can badly break things while leaving very few traces of what's going on...
The JDK behavior is coming from this.
So, in short, to be on the safe side always set udp_preference_limit to 1 and never to 0.
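For reference, the setting lives in the [libdefaults] section of /etc/krb5.conf. A minimal sketch (the realm name is illustrative):

```
[libdefaults]
  default_realm = EXAMPLE.COM
  udp_preference_limit = 1
```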
02-12-2022
05:21 PM
From the extension of your key file (key.ppk), my guess is that you're using PuTTY to connect to the VMs. Is that correct? PuTTY uses a different key format than OpenSSH clients. If the above is correct, try converting your key.ppk to OpenSSH format using PuTTYgen (see link below) and try again using the converted file. https://www.thegeekdiary.com/how-to-convert-puttys-private-key-ppk-to-ssh-key/ Cheers, André
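On Linux, the same conversion can be done from a terminal, assuming the puttygen command-line tool is installed (the file names below are illustrative):

```shell
# Convert a PuTTY .ppk key to an OpenSSH-format private key.
# Requires the puttygen CLI (e.g. from the putty-tools package);
# key.ppk is a stand-in for your actual PuTTY key file.
if command -v puttygen >/dev/null 2>&1; then
  puttygen key.ppk -O private-openssh -o key.pem
  chmod 600 key.pem
fi
```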
02-11-2022
01:51 PM
1 Kudo
Great to hear! I try my best to understand Jolt because it can be quite useful, but I think it has a very convoluted syntax and sometimes it's really hard to use. Practice helps, though.

The first asterisk matches against the field names of an object. The behavior of the second asterisk depends on the input: if the value of the attribute is a scalar, it will match against the value; if it's a nested object, it will match against the names of the nested object's fields. The trick is that when it matches the value of an attribute, it does not match nulls 😉

Cheers,
André
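As an illustration, here is a minimal shift spec using the double wildcard (the input and field names are made up):

```json
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": "&1_&"
      }
    }
  }
]
```

Applied to {"rating": {"primary": 5, "quality": 3}}, the outer "*" matches the top-level field name "rating" (referenced as &1) and the inner "*" matches the nested field names, producing {"rating_primary": 5, "rating_quality": 3}.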
02-10-2022
04:32 PM
2 Kudos
Which repository are you referring to? An internal NiFi repository or the location your flow is writing data to?

You can use the EncryptContent processor to encrypt the whole content of the flowfile, but there isn't an easy way to encrypt a single field of a record. To do this you will have to use something like the ScriptedTransformRecord processor and provide a script that encrypts parts of your data.

Here's an example of using ScriptedTransformRecord with a Groovy script to encrypt the field "name":

```groovy
import javax.crypto.Cipher
import javax.crypto.spec.SecretKeySpec
import java.security.Key

// The encryption key is provided through a NiFi parameter called encryption.key
String encryptionKey = "#{encryption.key}"
Key aesKey = new SecretKeySpec(encryptionKey.getBytes("UTF-8"), "AES")
Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding")
cipher.init(Cipher.ENCRYPT_MODE, aesKey)
// Encrypt the field value and store it as a Base64-encoded string
record.setValue("name", cipher.doFinal(record.getValue("name").getBytes("UTF-8")).encodeBase64().toString())
record
```

To decrypt it you could use:

```groovy
import javax.crypto.Cipher
import javax.crypto.spec.SecretKeySpec
import java.security.Key
import java.util.Base64

String encryptionKey = "#{encryption.key}"
Key aesKey = new SecretKeySpec(encryptionKey.getBytes("UTF-8"), "AES")
Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding")
cipher.init(Cipher.DECRYPT_MODE, aesKey)
// Decode the Base64 string and decrypt it back to the original value
record.setValue("name", new String(cipher.doFinal(Base64.getDecoder().decode(record.getValue("name").toString())), "UTF-8"))
record
```

The encryption key is specified through a NiFi parameter called encryption.key.

Cheers,
André
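Outside NiFi, the same AES/ECB round trip can be sanity-checked with plain Java. This is just a stand-alone sketch; the key value is a made-up 16-byte demo key, whereas in NiFi it would come from the encryption.key parameter:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.util.Base64;

// Stand-alone sanity check of the AES/ECB/PKCS5Padding round trip.
public class AesRoundTrip {

    static String encrypt(String plainText, String key) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key.getBytes("UTF-8"), "AES"));
        // Return the ciphertext as a Base64-encoded string
        return Base64.getEncoder().encodeToString(cipher.doFinal(plainText.getBytes("UTF-8")));
    }

    static String decrypt(String cipherText, String key) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key.getBytes("UTF-8"), "AES"));
        // Decode the Base64 string and decrypt back to the original value
        return new String(cipher.doFinal(Base64.getDecoder().decode(cipherText)), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String key = "0123456789abcdef";          // 16 bytes -> AES-128; demo key only
        String encrypted = encrypt("Ann", key);
        System.out.println(decrypt(encrypted, key));  // prints the original value back
    }
}
```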
02-10-2022
07:06 AM
I used MergeContent and it solved my problem. Thanks!
02-09-2022
08:11 PM
1 Kudo
You can use a QueryRecord processor before the PutDatabaseRecord. You can add a relationship to the QueryRecord processor with the following associated query:

```sql
select
  "field one" as field_one,
  "field two" as field_two,
  "field three" as field_three
from flowfile
```

In the query above, field names that contain spaces are referenced with double-quotes. You can also specify an alias for each column, which is the field name that will be used in the output.

Cheers,
André
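To show the renaming effect, here is a hypothetical record before and after the query (the values are made up):

Input record:

```json
{ "field one": 1, "field two": "a", "field three": true }
```

Output record:

```json
{ "field_one": 1, "field_two": "a", "field_three": true }
```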