Member since: 07-21-2020
Posts: 29
Kudos Received: 5
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1216 | 03-21-2022 05:11 AM
| 4400 | 04-22-2021 12:21 PM
| 1893 | 02-08-2021 11:44 PM
10-04-2022
12:47 PM
Hello,

on our system (CDP 7.1.7), we have used the configuration parameter kafka.properties_role_safety_valve to set the attribute sasl.kerberos.principal.to.local.rules, which maps Active Directory principals to entities created in Ranger. In our system, the AD users have a prefix, e.g. xjohndoe@SAMPLE.COM maps to the Ranger entity "johndoe". During a spark-submit (over YARN), we also need to pass a principal; however, as there is no such mapping there, we get an error saying the Unix user "xjohndoe" does not exist. This is true indeed; we need to map it to "johndoe". Is there any possibility to map principals during spark-submit, perhaps similarly to sasl.kerberos.principal.to.local.rules in Kafka, or any other possibility?

Best regards Jaro
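For illustration, the prefix-stripping behaviour of an auth_to_local-style rule can be sketched in plain Python (the realm, the prefix convention, and the regex are assumptions based on the example above, not a verified configuration):

```python
import re

def map_principal(principal: str) -> str:
    """Emulate a principal-to-local rule that strips a leading 'x'
    from the short name of SAMPLE.COM principals (assumed rule)."""
    m = re.match(r"^x([^@/]*)@SAMPLE\.COM$", principal)
    if m:
        return m.group(1)
    # Fall back to the default short-name mapping (strip the realm)
    return principal.split("@", 1)[0]

print(map_principal("xjohndoe@SAMPLE.COM"))  # johndoe
```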
05-06-2022
06:25 AM
Hello,

is it possible to alter compression.type on a Kafka topic while applications using the topic are running? What hard-disk saving factor should be expected for text records such as JSON (about 2 KB in size) with compression.type=gzip?

In CM, I see a global setting. Does it apply to KafkaMirrorMaker only? Are producer and consumer applications affected in some way when the global parameter changes?

Best regards Jaro
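To get a rough feeling for the saving factor, one can gzip a sample JSON payload of about that size (the payload below is made up; real savings depend entirely on how repetitive the data is):

```python
import gzip
import json

# Build a roughly 2 KB JSON record with repetitive, text-like content
record = json.dumps({
    "event": "login",
    "fields": [{"user": f"user{i}", "status": "SUCCESS"} for i in range(60)],
}).encode("utf-8")

compressed = gzip.compress(record)
ratio = len(record) / len(compressed)
print(f"raw={len(record)}B gzip={len(compressed)}B factor={ratio:.1f}x")
```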
Labels:
- Apache Kafka
03-21-2022
05:11 AM
I have eventually solved the issue. The writer property "Schema Access Strategy" was misconfigured. It must be set to "Use 'Schema Text' Property" to really apply the given schema.
03-21-2022
04:09 AM
Hello, I have a problem with converting JSON to CSV.
I use a ConvertRecord processor with an attached JsonTreeReader and a CSVRecordSetWriter. The schema for both looks like this; attributes in the JSON are optional.
{
"type": "record",
"name": "some_event",
"fields": [
{ "name": "date_time", "type": ["string","null"] },
{ "name": "id", "type": ["string","null"] },
{ "name": "is_success", "type": ["string","null"] },
{ "name": "username", "type": ["string","null"] },
{ "name": "username_type", "type": ["string","null"] }
]
}
For an input JSON like
{
"date_time": "2022-01-29T13:42:03.965Z",
"id": "20049015990001584400001",
"is_success": "TRUE"
}
where username and username_type are not set, the output does not meet my expectation. I get

2022-01-29T13:42:03.965Z|20049015990001584400001|true

instead of

2022-01-29T13:42:03.965Z|20049015990001584400001|true||

which should be the right format. Note the separators for the null values at the end of the second record.
Do you have any hints on how to solve this? It should be a fairly standard task.
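The expected behaviour can be sketched in plain Python (field names taken from the schema above; this is just an illustration of the desired output, not NiFi code):

```python
# Emit a pipe-delimited line with an empty column for each missing
# optional attribute, which is the output I would expect.
fields = ["date_time", "id", "is_success", "username", "username_type"]

record = {
    "date_time": "2022-01-29T13:42:03.965Z",
    "id": "20049015990001584400001",
    "is_success": "TRUE",
}

line = "|".join(record.get(f, "") for f in fields)
print(line)  # 2022-01-29T13:42:03.965Z|20049015990001584400001|TRUE||
```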
The CSVRecordSetWriter is configured like this (screenshot not preserved in this archive):
We use Cloudera Flow Management (CFM) 2.0.4.3 (NiFi 1.11.4.2.0.4.3-1, built 01/06/2021 23:14:00 CET).
Best regards Jaro
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
03-03-2022
10:54 AM
Hi,
I see in Cloudera Manager, on the Kafka status page, the chart "Total Bytes Received Across Kafka Brokers". What exactly does it mean? Is it the pure input without replication, or is replication counted into it, so that the pure input would only be e.g. 1/3 of the value for replication factor 3?
Best regards
Jaro
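The second interpretation can be written out as a quick sanity check (pure illustration with a made-up number; whether the metric actually includes replication traffic is exactly the open question):

```python
replication_factor = 3
total_bytes_received = 900  # MB/s shown in the chart (made-up number)

# If replication traffic is included in the metric, each record is
# received RF times, so the pure producer input would be total / RF.
pure_input = total_bytes_received / replication_factor
print(pure_input)  # 300.0
```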
Labels:
- Apache Kafka
- Cloudera Manager
02-10-2022
05:26 AM
Hello,

a question regarding application design in NiFi. We have a REST resource which provides the data changes since a Unix timestamp; the timestamp is passed inside a JSON structure sent as an HTTP POST. The major question is how and where to keep the state of the last Unix timestamp, and how this state can be accessed by operations. We should be able to reset the timestamp to any value in order to repeat calls. We should also be able to set an increment, which is added to the timestamp in the next call.

Similarly, we have a related issue in a use case which accesses a database. We generate a query in GenerateTableFetch and set the property "Maximum-value Columns". Here the state is maintained by the processor, but it is pretty unclear how to set the state to a specific starting value. Clearing the state is obviously not appropriate.

Best regards Jaro
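As an illustration of the desired behaviour (not NiFi code; the state file location and the JSON payload layout are made up for the sketch), the state handling could look like:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical state location; operations could overwrite or edit
# this file to reset the timestamp and repeat calls.
STATE_FILE = Path(tempfile.gettempdir()) / "last_timestamp.json"

def read_state(default: int = 0) -> int:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["last_ts"]
    return default

def write_state(ts: int) -> None:
    STATE_FILE.write_text(json.dumps({"last_ts": ts}))

def build_request_body(increment: int) -> dict:
    # Advance the stored timestamp by a configurable increment and
    # build the JSON structure for the next HTTP POST.
    ts = read_state() + increment
    write_state(ts)
    return {"changes_since": ts}  # hypothetical POST payload

write_state(1640000000)  # reset to a known starting value
print(build_request_body(increment=3600))  # {'changes_since': 1640003600}
```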
Labels:
- Apache NiFi
11-22-2021
03:56 AM
Hello,

after some operation in the NiFi GUI (a parameter value change in a parameter context), the GUI got sort of stuck. A controller service shows the status "disabled" in the GUI; however, I can neither enable nor delete the controller service, as NiFi claims it is running on a node. Do you have any idea how to get out of this situation?

Best regards Jaro
Labels:
- Apache NiFi
10-05-2021
10:07 AM
Hi,

this sounds promising. Actually, I came across the principal mapping rules recently, but was quite unclear about the implications. The fact is, "some_user" is not a POSIX or LDAP user or anything of the sort; it exists only as a certificate. Also, the user identifier in Ranger is like "OU=Dept,O=Company,..."; this is how colleagues of mine have set up the policies. Does your assumption mean that every single client certificate should be backed by a POSIX user? And what if the user is an external party accessing the broker remotely?

Best regards Jaro
10-04-2021
02:11 AM
Hello,

we observe a lot of annoying messages in the Kafka logs like this:

2021-09-01 17:09:47,435 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: unable to return groups for user OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user
PartialGroupNameException The user name 'OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user' is not found.
id: OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user: no such user
id: OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user: no such user
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.resolvePartialGroupNames(ShellBasedUnixGroupsMapping.java:291)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:215)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroupsSet(ShellBasedUnixGroupsMapping.java:123)
at org.apache.hadoop.security.Groups$GroupCacheLoader.fetchGroupSet(Groups.java:413)
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:351)
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:300)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3976)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4960)
at org.apache.hadoop.security.Groups.getGroupInternal(Groups.java:258)
at org.apache.hadoop.security.Groups.getGroupsSet(Groups.java:230)
at org.apache.hadoop.security.UserGroupInformation.getGroupsSet(UserGroupInformation.java:1760)
at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1726)
at org.apache.ranger.audit.provider.MiscUtil.getGroupsForRequestUser(MiscUtil.java:587)
at org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.authorize(RangerKafkaAuthorizer.java:155)
at org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.authorize(RangerKafkaAuthorizer.java:135)
at kafka.security.authorizer.AuthorizerWrapper$$anonfun$authorize$1.apply(AuthorizerWrapper.scala:52)
at kafka.security.authorizer.AuthorizerWrapper$$anonfun$authorize$1.apply(AuthorizerWrapper.scala:50)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at kafka.security.authorizer.AuthorizerWrapper.authorize(AuthorizerWrapper.scala:50)
at kafka.server.KafkaApis.filterAuthorized(KafkaApis.scala:2775)
at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:639)
at kafka.server.KafkaApis.handle(KafkaApis.scala:128)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75)
at java.lang.Thread.run(Thread.java:748)

OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user (values changed) is a self-signed certificate used as a client certificate for the TLS-based Kafka connection. The certificate exists, the name matches, and it was created accordingly in Ranger. The connection actually works. We still don't understand where these warnings come from.

Cloudera Enterprise 7.2.4

Best regards Jaro
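Judging from the trace, the full certificate DN is handed to the OS-level group lookup as if it were a Unix user name, which the shell `id` command cannot resolve. For illustration only, extracting a short name from such a DN (the kind of rewrite a principal mapping rule could perform; the regex is my assumption) might look like:

```python
import re

dn = "OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user"

# Pull the CN attribute out of the distinguished name; a group lookup
# would then see "some_user" instead of the whole DN string.
m = re.search(r"CN=([^,]+)", dn)
short_name = m.group(1) if m else dn
print(short_name)  # some_user
```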
Labels:
- Apache Kafka