Member since
07-21-2020
29
Posts
5
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 407 | 03-21-2022 05:11 AM
 | 2616 | 04-22-2021 12:21 PM
 | 967 | 02-08-2021 11:44 PM
10-04-2022
12:47 PM
Hello, on our system (CDP 7.1.7) we have used the configuration parameter kafka.properties_role_safety_valve to set the attribute sasl.kerberos.principal.to.local.rules, which maps Active Directory principals to entities created in Ranger. In our system the AD users have a prefix, e.g. xjohndoe@SAMPLE.COM maps to a Ranger entity "johndoe". During a spark-submit (over YARN) we also need to pass a principal; however, as there is no such mapping there, we obtain an error saying the unix user "xjohndoe" does not exist. This is true indeed, we need to map it to "johndoe". Is there any possibility to map principals during spark-submit, possibly similar to sasl.kerberos.principal.to.local.rules in Kafka, or any other possibility? Best regards Jaro
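For illustration, the kind of mapping we mean could, in Kerberos auth_to_local syntax, look roughly like this (a sketch only, assuming the prefix is a literal "x" and not checking the realm; on the Hadoop/YARN side the corresponding knob would be hadoop.security.auth_to_local in core-site.xml):

```
RULE:[1:$1](^x.*$)s/^x//
DEFAULT
```

The first rule formats a one-component principal as its short name (xjohndoe), matches names starting with "x", and strips that prefix, so xjohndoe@SAMPLE.COM would map to johndoe.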
05-06-2022
06:25 AM
Hello, is it possible to alter compression.type on a Kafka topic while applications on the topic are running? What hard drive saving factor should be expected for text records like JSON (2 KB size) with compression.type=gzip? In CM I see a global setting. Does it apply to KafkaMirrorMaker only? Are producer and consumer applications somehow affected when the global parameter changes? Best regards Jaro
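For the topic-level part, the override could presumably be applied at runtime with the standard tooling (a sketch, assuming broker access; topic name is a placeholder), e.g. kafka-configs --bootstrap-server broker:9092 --alter --entity-type topics --entity-name mytopic --add-config compression.type=gzip. To get a rough local feeling for the gzip saving factor on JSON-like text, one can compress a batch of sample records:

```shell
# Build 40 similar JSON records and compare raw vs gzip'd size.
# Crude local estimate only: Kafka compresses whole record batches,
# and real payloads are less repetitive than this sample.
for i in $(seq 1 40); do
  printf '{"date_time":"2017-01-29T13:42:03.%03dZ","id":"2004901599000158440%04d","is_success":"TRUE"}\n' "$i" "$i"
done > records.json
gzip -c records.json > records.json.gz
orig=$(wc -c < records.json)
comp=$(wc -c < records.json.gz)
echo "original=${orig} compressed=${comp}"
```

Because the sample records are highly similar, this will overstate the saving; varying payloads compress less.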
Labels: Apache Kafka
03-21-2022
05:11 AM
I have eventually solved the issue. The writer property "Schema Access Strategy" was misconfigured. It must be set to "Use 'Schema Text' Property" to really apply the given schema.
03-21-2022
04:09 AM
Hello, I have a problem with converting JSON to CSV.
I use the ConvertRecord processor with an attached JsonTreeReader and CSVRecordSetWriter. The schema for both might look like this. Attributes in the JSON are optional.
{
"type": "record",
"name": "some_event",
"fields": [
{ "name": "date_time", "type": ["string","null"] },
{ "name": "id", "type": ["string","null"] },
{ "name": "is_success", "type": ["string","null"] },
{ "name": "username", "type": ["string","null"] },
{ "name": "username_type", "type": ["string","null"] }
]
}
For an input JSON like
{
"date_time": "2017-01-29T13:42:03.965Z",
"id": "20049015990001584400001",
"is_success": "TRUE"
}
where username and username_type are not set, the output does not correspond to my expectation.
I get 2022-01-29T13:42:03.965Z|20049015990001584400001|true instead of 2022-01-29T13:42:03.965Z|20049015990001584400001|true||, which should be the right format. Note the separators for the null values at the end of the second record.
Do you have some hints how to solve it? It should be a sort of standard task.
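For reference, the expected rendering (null fields emitted as empty trailing columns) can be sketched outside NiFi like this (a plain illustration of the desired behaviour, not NiFi code):

```shell
# Join the five schema fields with '|', emitting empty strings for missing/null ones.
python3 - <<'EOF'
import json
rec = json.loads('{"date_time":"2017-01-29T13:42:03.965Z","id":"20049015990001584400001","is_success":"TRUE"}')
fields = ["date_time", "id", "is_success", "username", "username_type"]
print("|".join(rec.get(f) or "" for f in fields))
EOF
# prints: 2017-01-29T13:42:03.965Z|20049015990001584400001|TRUE||
```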
The CSVRecordSetWriter is configured like this:
We use
Cloudera Flow Management (CFM) 2.0.4.3 1.11.4.2.0.4.3-1 built 01/06/2021 23:14:00 CET
Best regards Jaro
Labels: Apache NiFi, Cloudera DataFlow (CDF)
03-03-2022
10:54 AM
Hi,
I see in Cloudera Manager, in the Kafka status, the diagram for "Total Bytes Received Across Kafka Brokers". What does it mean exactly? Is it the pure input without replication, or is replication counted in it, so that the pure input would be only e.g. 1/3 for replication factor 3?
Best regards
Jaro
Labels: Apache Kafka, Cloudera Manager
02-10-2022
05:26 AM
Hello, a question regarding application design in NiFi. We have a REST resource which provides data changes since a unix timestamp; the timestamp is passed inside a JSON structure sent as an HTTP POST. The major question is how and where to keep the state of the last unix timestamp, and how this state can be made accessible to operations. We should be able to reset the timestamp to any value to repeat calls. Also, we should be able to set the increment which is added to the timestamp in the next call. Similarly, we have a related issue in a use case which accesses a database. We generate a query in GenerateTableFetch and set the property "Maximum-value Columns". Here the state is maintained by the processor, but it is pretty much unclear how to set the state to a specific starting value. To clear the state is obviously not appropriate. Best regards Jaro
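For the GenerateTableFetch part, the processor state can at least be inspected and cleared through the REST API (a sketch, assuming an unsecured node; the processor id is a placeholder, and I am not aware of an endpoint that sets the state to an arbitrary value):

```shell
# Inspect the component state kept by a processor such as GenerateTableFetch.
curl http://localhost:8080/nifi-api/processors/<processor-id>/state

# Clear the state (the processor must be stopped first).
curl -X POST http://localhost:8080/nifi-api/processors/<processor-id>/state/clear-requests
```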
Labels: Apache NiFi
11-22-2021
03:56 AM
Hello, after some operation in the NiFi GUI (a parameter value change in a parameter context), the GUI got sort of stuck. A controller service shows status disabled in the GUI; however, I cannot enable or delete the controller service, as NiFi claims it is running on a node. Do you have some idea how to get rid of this situation? Best regards Jaro
Tags: NiFi
Labels: Apache NiFi
10-05-2021
10:07 AM
Hi, this sounds promising. Actually, I came across principal mapping rules recently, but was quite unclear about the implications. The fact is, "some_user" is no POSIX or LDAP user whatsoever. It exists only as a certificate. Also, the user identifier in Ranger is like "OU=Dept,O=Company,..."; this is how colleagues of mine have set up the policies. Does your assumption mean that every single client certificate should be backed by a POSIX user? And what if the user is an external party accessing the broker remotely? Best regards Jaro
10-04-2021
02:11 AM
Hello, we observe a lot of annoying messages in the Kafka logs like this:

2021-09-01 17:09:47,435 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: unable to return groups for user OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user
PartialGroupNameException The user name 'OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user' is not found.
id: OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user: no such user
id: OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user: no such user
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.resolvePartialGroupNames(ShellBasedUnixGroupsMapping.java:291)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:215)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroupsSet(ShellBasedUnixGroupsMapping.java:123)
at org.apache.hadoop.security.Groups$GroupCacheLoader.fetchGroupSet(Groups.java:413)
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:351)
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:300)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3976)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4960)
at org.apache.hadoop.security.Groups.getGroupInternal(Groups.java:258)
at org.apache.hadoop.security.Groups.getGroupsSet(Groups.java:230)
at org.apache.hadoop.security.UserGroupInformation.getGroupsSet(UserGroupInformation.java:1760)
at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1726)
at org.apache.ranger.audit.provider.MiscUtil.getGroupsForRequestUser(MiscUtil.java:587)
at org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.authorize(RangerKafkaAuthorizer.java:155)
at org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer.authorize(RangerKafkaAuthorizer.java:135)
at kafka.security.authorizer.AuthorizerWrapper$$anonfun$authorize$1.apply(AuthorizerWrapper.scala:52)
at kafka.security.authorizer.AuthorizerWrapper$$anonfun$authorize$1.apply(AuthorizerWrapper.scala:50)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at kafka.security.authorizer.AuthorizerWrapper.authorize(AuthorizerWrapper.scala:50)
at kafka.server.KafkaApis.filterAuthorized(KafkaApis.scala:2775)
at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:639)
at kafka.server.KafkaApis.handle(KafkaApis.scala:128)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75)
at java.lang.Thread.run(Thread.java:748)

OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user (values changed) is a self-signed certificate used as a client certificate for a TLS based Kafka connection. The certificate exists, the name matches, and it was created accordingly in Ranger. The connection actually works. We still don't understand where these warnings come from. Cloudera Enterprise 7.2.4 Best Regards Jaro
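One direction worth checking (an assumption on our side, not a confirmed fix): the warning comes from ShellBasedUnixGroupsMapping shelling out to `id` for a certificate DN that is no OS user. Hadoop ships alternative group mapping implementations, so a safety-valve entry of this kind might silence the shell lookups (placement in the Kafka service configuration would need to be verified):

```
hadoop.security.group.mapping=org.apache.hadoop.security.NullGroupsMapping
```

NullGroupsMapping simply returns an empty group list instead of invoking `id`, which should be acceptable when authorization is done purely on the user/DN in Ranger.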
Labels: Apache Kafka
05-13-2021
09:17 AM
Hello, I am struggling with a dependency issue in Spark now. Being new to Spark, I hope there is a simple remedy. The question is: is there any mechanism to separate the dependencies of the Spark engine from the dependencies of an application? Example: the latest version of spark-core_2.12 (3.1.1, March 2021) depends on hadoop-client (3.3.0, March 2020), which itself depends on hadoop-common (3.3.0, July 2020), which finally depends on an ancient version of gson (2.2.4, May 2013). You can easily find many other examples, e.g. commons-codec, protobuf-java, ... So what if your application, basically a library developed outside Spark, depends on the latest (no longer compatible) version of gson, 2.8.6? My obviously naive approach to start a Spark application ends in runtime incompatibility clashes (e.g. with gson). Best regards Jaro
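Two mechanisms I would investigate (hedged, not verified against this CDP version): spark-submit accepts --conf spark.driver.userClassPathFirst=true and --conf spark.executor.userClassPathFirst=true (documented as experimental), and alternatively the application fat jar can relocate its own copy of gson with the maven-shade-plugin, roughly:

```xml
<!-- pom.xml fragment: relocate gson inside the application fat jar so it
     cannot clash with the gson 2.2.4 on the Spark/Hadoop classpath -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.gson</pattern>
            <shadedPattern>myapp.shaded.com.google.gson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The shaded prefix "myapp.shaded" is an arbitrary placeholder; relocation rewrites the bytecode references, so both gson versions can coexist.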
Tags: dependencies, Spark
Labels: Apache Hadoop, Apache Spark
04-22-2021
10:21 AM
Hello, while attempting to start ListenHTTP (port=10000) on a fresh cluster, I obtain the following error. Do you know what to change to get it to work? ListenHTTP[id=6222ab5d-0177-1000-ffff-ffffaeaba266] Failed to properly initialize Processor. If still scheduled to run, NiFi will attempt to initialize and run the Processor again after the 'Administrative Yield Duration' has elapsed. Failure is due to java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:10000: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:10000 Cloudera Flow Management (CFM) 2.0.4.0 1.11.4.2.0.4.0-80 built 09/27/2020 09:52:49 CEST Tagged nifi-1.11.4-RC1 Best regards Jaro
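"Failed to bind" usually means the port is already taken on that node (our assumption here; note that 10000 is, for example, the default HiveServer2 port). A quick check from the NiFi host could look like:

```shell
# Show which process, if any, is already listening on port 10000.
sudo ss -ltnp 'sport = :10000'
# or, on hosts without ss:
sudo netstat -ltnp | grep ':10000 '
```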
Labels: Apache NiFi
04-12-2021
01:44 AM
Hello, we see the following error on an UpdateAttribute processor: Failed to persist list of Peers due to java.io.IOException: All Partitions have been blacklisted due to failures when attempting to update. If the Write-Ahead Log is able to perform a checkpoint, this issue may resolve itself. Otherwise, manual intervention will be required.; if restarted and the nodes specified at the RPG are down, may be unable to transfer data until communications with those nodes are restored A similar issue was reported some years ago: https://community.cloudera.com/t5/Support-Questions/Caused-by-java-io-IOException-All-Partitions-have-been/td-p/230571 but in our case the processor only updates a sequence number, thus the pattern does not apply. How can we recover from this situation? Cloudera Flow Management (CFM) 2.0.4.0 1.11.4.2.0.4.0-80 built 09/27/2020 09:52:49 CEST Best regards Jaro
Labels: Apache NiFi
02-10-2021
08:29 AM
1 Kudo
Hello, I am trying to work out an automated deployment procedure in NiFi (no UI usage, just the CLI toolkit or alternatives). Let us assume you have a NiFi flow under load which needs to be updated. There is some data in the in-between connection queues. What happens when you try to update the flow? Will some data be lost, or will the update be refused? Is there any solution for a procedure like this: 1) stop the data receiving processor 2) check if the success connection queue is empty 3) apply steps 1 and 2 to the next processor until the last processor is stopped 4) update the flow, ignoring any data in failure or error connection queues. Do you think this is a reasonable approach? Is there something ready to use? Best regards Jaro
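The stop-and-drain steps above can be sketched with the NiFi Toolkit CLI (command names are from the toolkit; the URL and process-group id are placeholders, and draining would have to be polled per connection rather than per processor):

```shell
# Steps 1-3: stop the process group and poll its status until the queues drain.
nifi pg-stop -u http://localhost:8080 -pgid <pg-id>
nifi pg-status -u http://localhost:8080 -pgid <pg-id>

# Step 4: deploy the new flow version, e.g. from a NiFi Registry.
nifi pg-change-version -u http://localhost:8080 -pgid <pg-id>
```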
Labels: Apache NiFi
02-08-2021
11:44 PM
1 Kudo
OK, I got it. I missed a parameter. It should be: nifi set-param -u http://localhost:8080 -pcid 828f0882-0177-1000-aabd-d680a070c5c5 -pn sample.URL -pv newvalue
02-08-2021
12:27 PM
Hello, I have tried to use the NiFi CLI toolkit (NiFi 1.11.4) to set a parameter like this; however, I can't see any effect. What am I doing wrong?

#> nifi get-param-context -u http://localhost:8080 -pcid 828f0882-0177-1000-aabd-d680a070c5c5
#   Name            Value           Sensitive   Description
-   -------------   -------------   ---------   -----------
1   sample.Passwd   ********        true
2   sample.Schema   SAMPLE001       false
3   sample.URL      jdbc://sample   false
4   sample.User     dbsample        false

#> nifi set-param -u http://localhost:8080 -pcid 828f0882-0177-1000-aabd-d680a070c5c5 -pn sample.URL newvalue
Waiting for update request to complete...

#> nifi get-param-context -u http://localhost:8080 -pcid 828f0882-0177-1000-aabd-d680a070c5c5
#   Name            Value           Sensitive   Description
-   -------------   -------------   ---------   -----------
1   sample.Passwd   ********        true
2   sample.Schema   SAMPLE001       false
3   sample.URL      jdbc://sample   false
4   sample.User     dbsample        false

Best regards Jaro
Labels: Apache NiFi
02-07-2021
11:55 AM
Hello, I have a question regarding the concept of parameter contexts. The admin guide says: "A Process Group can only be assigned one Parameter Context, while a given Parameter Context can be assigned to multiple Process Groups." OK, let's assume you have an internal Kafka or database, and some external system you connect to. You might want to have a parameter "group" for the internal Kafka connection, and another parameter context for each external system connection. However, your flow, clearly a part of a process group, reads data from the external system and stores data internally. Thus you actually need 2 connections in the flow, which means 2 parameter contexts, which is not possible. To solve that, you assign the connections (controller services) and the parameters to the parent group. Finally, for connections which are common to many flows, like your Kafka, databases, and HttpSSL, you land in the parent NiFi Flow, having a parameter context which contains a bunch of pretty unrelated parameters (for Kafka, databases, ...). What I see is that there is hardly a possibility to reuse parameter contexts, as they cannot be combined, and thus you have a 1:1 relationship between parameter context and group, just like for variables assigned to a process group. Did I miss something? What is actually the big advantage of parameters over variables? Best regards Jaro
Labels: Apache NiFi
12-04-2020
05:09 AM
Hello, does the nipyapi package work with Windows 10 (64-bit)? I have tried to install nipyapi on two machines, one with WinPython (manual installation of downloaded packages), another with Anaconda (installation with pip). Both failed, obviously due to a number of missing obscure dependencies like libxml2. I have only a little knowledge, but the installation looked like tough stuff under Windows 10. Is nipyapi recommended at all, or are there better alternatives? Best regards Jaro
Labels: Apache NiFi
09-24-2020
12:59 PM
Hi @MattWho, we have actually already implemented prototypes of such custom processors, and the basic principles are clear. By processor logic, I did not mean the structure of the processor, like relationships; the structure is static in our case. By processor logic, I refer to the code which is called in "onTrigger" in "StreamCallback.process". In my case, the logic can be parametrized by some configuration data (this configuration data is provided by a service or database tables). The configuration data is static during the whole processor run and must be provided during instantiation of the business logic object. You might see it as a lookup, which must not happen during the processing in "onTrigger", but in "onScheduled" (sorry, I wrote "init" previously, that was not right).
Best regards Jaro
Tags: apache nifi
09-24-2020
03:58 AM
Hello, we are considering implementing a custom processor for some complex transformation logic on Kafka streams. The flow is straightforward: input topic -> our processor -> output topics. The processor logic needs to be initialized with data provided through a REST service and/or database records. What is the NiFi way to implement the initialization? As we implement the processor ourselves, we can do lots of stuff in e.g. the init method; however, we would prefer to use the NiFi infrastructure reasonably. Could you please recommend some examples, blogs, etc. for a similar solution? Best regards Jaro
Labels: Apache NiFi
09-15-2020
09:51 AM
1 Kudo
Actually, both replies can be considered valid. I confirmed the one which better fits my use case.
09-08-2020
12:26 AM
1 Kudo
"It sounds like your testing solution is exceeding the inbound capabilities of the flow tuning (nifi config, processor/queue config)" Correct assessment. It turned out that the pipeline was not properly sized for the amount of data, which led to back-pressure in the ingest component.
08-28-2020
08:12 AM
Well, EvaluateJsonPath sounds promising, but it looks like EvaluateJsonPath is somehow limited. It is unable to extract JSON objects, only single attributes, which makes it impossible to use for slightly more complex data. Let's suppose this. Input: { whatever known or unknown JSON nested structures here } Desired output: { "nifi-filename": "2c8a7845-8567-4eed-97ee-4dcd1c18668e", "nifi-content": { whatever known or unknown JSON structures or sub-structures of the input message here } }
Tags: apache nifi
08-27-2020
02:13 AM
Hello, I have been puzzling for some time over whether there is a possibility to refer to both the attributes and the content of a flow file and to transform the content by combining both. Example: assume a flow file with uuid=2c8a7845-8567-4eed-97ee-4dcd1c18668e and the JSON content { "data":"some business data" } The result of the transformation should be content like { "nifi-filename": "2c8a7845-8567-4eed-97ee-4dcd1c18668e", "nifi-content": { "data": "some business data" } } How can I achieve that? Best regards Jaro
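Outside NiFi, the transformation itself is trivial; a scripted processor could do the equivalent of the following (a plain illustration of the desired result, assuming the uuid arrives as a flow file attribute, not NiFi API code):

```shell
# Wrap the original content under "nifi-content" and add the uuid as "nifi-filename".
python3 - <<'EOF'
import json
uuid = "2c8a7845-8567-4eed-97ee-4dcd1c18668e"          # flow file attribute
content = json.loads('{"data": "some business data"}')  # flow file content
print(json.dumps({"nifi-filename": uuid, "nifi-content": content}))
EOF
```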
Labels: Apache NiFi
08-14-2020
04:51 AM
Hello, I tested ListenHTTP on a single node under some load generated by JMeter. I observed that for one or two client threads ListenHTTP works fine; however, having 10 client threads leads to 60% of requests being refused with 503 Service Unavailable. Is there any parameter which helps to overcome this issue? Also, the component does not indicate any problems in the GUI. Best regards Jaro
Tags: listenhttp, NiFi
Labels: Apache NiFi
07-21-2020
04:08 AM
1 Kudo
Hi, being pretty new to NiFi, I am struggling to extract interface monitoring data. Assume a very simple ingest flow like: HttpListener -> KafkaPublisher How can I get throughput information, i.e. the number of records put to a topic (e.g.) every minute? There is NiFi Summary -> Processors -> Status History, which might be useful. Can the statistics be accessed programmatically, and how? Best regards Jaro
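The data behind Status History is also exposed over the REST API (a sketch; the processor id is a placeholder and the node is assumed unsecured):

```shell
# Per-processor status history, the same series shown in Summary -> Status History.
curl http://localhost:8080/nifi-api/flow/processors/<processor-id>/status/history
```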
Labels: Apache NiFi