
Each rule in an auth-to-local ruleset has one of the following formats:

RULE:[n:string](regexp)s/pattern/replacement/
RULE:[n:string](regexp)s/pattern/replacement/g
RULE:[n:string](regexp)s/pattern/replacement//L
RULE:[n:string](regexp)s/pattern/replacement/g/L

[n:string]

Indicates a matching rule where n declares the number of components expected in the principal. Components are separated by a /; a user account has one component (ambari-qa) and a service account has two components (nn/fqdn). The string value declares how to reformat the principal into the value used by the rest of the expression. The placeholders are as follows:

$0 - realm

$1 - 1st component

$2 - 2nd component

Typically we ignore the 2nd component since it is the service’s hostname, so the format is generally set to $1@$0 (although any pattern can be used), as in:

[1:$1@$0]

Matches on ambari-qa@EXAMPLE.COM

Translates to ambari-qa@EXAMPLE.COM

[2:$1@$0]

Matches on nn/c6501.ambari.apache.org@EXAMPLE.COM

Translates to nn@EXAMPLE.COM
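
The [n:string] step can be sketched in Python as follows. This is only an illustration of the behavior described above, not Hadoop's implementation; the helper names parse_principal and format_principal are hypothetical.

```python
import re

def parse_principal(principal):
    """Split 'comp1[/comp2]@REALM' into (list of components, realm)."""
    name, _, realm = principal.partition("@")
    return name.split("/"), realm

def format_principal(n, fmt, principal):
    """Return the reformatted value, or None if the component count is not n."""
    components, realm = parse_principal(principal)
    if len(components) != n:
        return None
    # Index 0 is the realm, 1 the 1st component, 2 the 2nd component.
    values = [realm] + components
    return re.sub(r"\$(\d)", lambda m: values[int(m.group(1))], fmt)

print(format_principal(1, "$1@$0", "ambari-qa@EXAMPLE.COM"))
print(format_principal(2, "$1@$0", "nn/c6501.ambari.apache.org@EXAMPLE.COM"))
# A two-component principal is not handled by a [1:...] clause.
print(format_principal(1, "$1@$0", "nn/c6501.ambari.apache.org@EXAMPLE.COM"))
```

The last call returns None because the principal has two components while the clause expects one.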

(regexp)

Indicates a matching rule on the value generated by the [n:string] clause. If this regular expression (regexp) matches, then the replacement expression is invoked.

For Example:

(.*@EXAMPLE.COM)

Matches on

  • ambari-qa@EXAMPLE.COM
  • nn/c6501.ambari.apache.org@EXAMPLE.COM
  • any_name@EXAMPLE.COM

Does not match on

  • ambari-qa@NOT.EXAMPLE.COM
  • nn/c6501.ambari.apache.org@OTHER.REALM

(ambari-.+@EXAMPLE.COM)

Matches on

  • ambari-qa@EXAMPLE.COM
  • ambari-user@EXAMPLE.COM

Does not match on

  • ambari-@EXAMPLE.COM
  • any_user@EXAMPLE.COM
  • ambari-qa@NOT.EXAMPLE.COM
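
The matching examples above can be reproduced with a short Python sketch. It assumes that the whole value must match the expression (not just a substring), which is what the examples imply; rule_matches is a hypothetical helper, and Python's re module stands in for the Java regular expressions Hadoop uses.

```python
import re

def rule_matches(regexp, value):
    # Assumption: the entire value has to match the expression.
    return re.fullmatch(regexp, value) is not None

for value in ["ambari-qa@EXAMPLE.COM",
              "any_name@EXAMPLE.COM",
              "ambari-qa@NOT.EXAMPLE.COM"]:
    print(value, rule_matches(r".*@EXAMPLE.COM", value))

for value in ["ambari-qa@EXAMPLE.COM",
              "ambari-@EXAMPLE.COM",
              "any_user@EXAMPLE.COM"]:
    print(value, rule_matches(r"ambari-.+@EXAMPLE.COM", value))
```

Note that the unescaped dot in EXAMPLE.COM matches any single character, so a stricter expression would escape it as EXAMPLE\.COM.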

s/pattern/replacement/

s/pattern/replacement/g

The replacement expression used to generate the local user account name. This expression is similar to (if not the same as) a sed substitution expression and is applied to the value generated by [n:string]. The pattern part is a regular expression used to find the portion of the string to replace, and the replacement part is the value that replaces the matched section. If g is specified after the last /, every match in the value is replaced; otherwise only the first match is replaced.

For Example:

s/@.*//

Removes all characters in the source string including and after the @.

  • If the source string is ambari-qa@EXAMPLE.COM, the result is ambari-qa
  • If the source string is any_user@EXAMPLE.COM, the result is any_user

s/@.*/user/

Replaces all characters in the source string including and after the @ with "user"

  • If the source string is ambari-qa@EXAMPLE.COM, the result is ambari-qauser
  • If the source string is any_user@EXAMPLE.COM, the result is any_useruser

s/abc/123/

Replaces the first occurrence of "abc" in the source string with "123"

  • If the source string is ambari-qa@EXAMPLE.COM, the result is ambari-qa@EXAMPLE.COM
  • If the source string is abc_user_abc@EXAMPLE.COM, the result is 123_user_abc@EXAMPLE.COM

s/abc/123/g

Replaces every occurrence of "abc" in the source string with "123"

  • If the source string is ambari-qa@EXAMPLE.COM, the result is ambari-qa@EXAMPLE.COM
  • If the source string is abc_user_abc@EXAMPLE.COM, the result is 123_user_123@EXAMPLE.COM
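
The difference between replacing the first match and replacing every match (the g flag) can be sketched in Python. The sed_replace helper is hypothetical and only mimics the behavior described above using Python's re module.

```python
import re

def sed_replace(pattern, replacement, value, replace_all=False):
    # count=1 replaces only the first match; count=0 replaces every match.
    return re.sub(pattern, replacement, value, count=0 if replace_all else 1)

# s/@.*//
print(sed_replace(r"@.*", "", "ambari-qa@EXAMPLE.COM"))
# s/abc/123/
print(sed_replace(r"abc", "123", "abc_user_abc@EXAMPLE.COM"))
# s/abc/123/g
print(sed_replace(r"abc", "123", "abc_user_abc@EXAMPLE.COM", replace_all=True))
```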

The pattern part of the expression may include capturing groups that can be reused in the replacement part of the expression. Capturing groups are declared with parentheses, and the captured data can be referenced by number (in order of placement in the pattern). The placeholder for captured data is a dollar sign followed by the reference number, for example $1.

s/(\d+)@.*/ID.$1/

Captures the sequence of digits before the @ and prepends "ID." to it

  • If the source string is 1234567890@EXAMPLE.COM, the result is ID.1234567890

s/(\d+)([a-zA-Z]+)@.*/$2$1/

Captures a sequence of digits followed by a sequence of letters and places the letters before the digits

  • If the source string is 123abc@EXAMPLE.COM, the result is abc123
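
Capturing groups can be tried out with a small Python sketch. Python writes group references as \g<n> rather than $n, so the hypothetical helper below translates the placeholders first; it is an approximation for experimenting, not the code Hadoop runs.

```python
import re

def sed_replace_with_groups(pattern, replacement, value):
    # Translate the $n placeholders used in the rules into Python's \g<n> form.
    py_replacement = re.sub(r"\$(\d+)",
                            lambda m: "\\g<" + m.group(1) + ">",
                            replacement)
    # Only the first match is replaced (no g flag).
    return re.sub(pattern, py_replacement, value, count=1)

# s/(\d+)@.*/ID.$1/
print(sed_replace_with_groups(r"(\d+)@.*", "ID.$1", "1234567890@EXAMPLE.COM"))
# s/(\d+)([a-zA-Z]+)@.*/$2$1/
print(sed_replace_with_groups(r"(\d+)([a-zA-Z]+)@.*", "$2$1", "123abc@EXAMPLE.COM"))
```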

/L

By default, translations based on rules are done maintaining the case of the input principal. For example, given the rule

RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//

  • If the source string is ambari-qa@EXAMPLE.COM, the result is ambari-qa
  • If the source string is AMBARI-QA@EXAMPLE.COM, the result is AMBARI-QA
  • If the source string is Ambari-QA@EXAMPLE.COM, the result is Ambari-QA

However, this may not be desired given how different operating systems handle usernames: some are case-sensitive and some are case-insensitive. For example, Linux is case-sensitive and Windows is case-insensitive.

To help with this issue, it is possible to force the translated result to be all lower case by adding "/L" to the end of the rule. Note that this does not affect how the pattern matches the input, so matching is still case-sensitive.

RULE:[1:$1@$0](ambari-qa-.*@EXAMPLE.COM)s/.*/AMBARI-QA//L
RULE:[1:$1@$0](AMBARI-QA-.*@EXAMPLE.COM)s/.*/AMBARI-QA-UPPER//L
RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*///L
  • If the source string is ambari-qa-cl1@EXAMPLE.COM, the result is ambari-qa
  • If the source string is AMBARI-QA-cl1@EXAMPLE.COM, the result is ambari-qa-upper
  • If the source string is joe_user@EXAMPLE.COM, the result is joe_user
  • If the source string is JOE_USER@EXAMPLE.COM, the result is joe_user
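
The effect of /L can be illustrated with the following Python sketch. It assumes, as described above, that matching stays case-sensitive and only the final result is lower-cased; translate_one is a hypothetical helper that takes the already-formatted value.

```python
import re

def translate_one(regexp, pattern, replacement, value, lower=False):
    """Apply one simplified rule; return None if the rule does not match."""
    if re.fullmatch(regexp, value) is None:  # matching is case-sensitive
        return None
    result = re.sub(pattern, replacement, value, count=1)
    return result.lower() if lower else result

# (ambari-qa-.*@EXAMPLE.COM)s/.*/AMBARI-QA//L
print(translate_one(r"ambari-qa-.*@EXAMPLE.COM", r".*", "AMBARI-QA",
                    "ambari-qa-cl1@EXAMPLE.COM", lower=True))
# Same rule, upper-case input: the pattern does not match.
print(translate_one(r"ambari-qa-.*@EXAMPLE.COM", r".*", "AMBARI-QA",
                    "AMBARI-QA-cl1@EXAMPLE.COM", lower=True))
# (.*@EXAMPLE.COM)s/@.*// with and without /L
print(translate_one(r".*@EXAMPLE.COM", r"@.*", "", "JOE_USER@EXAMPLE.COM"))
print(translate_one(r".*@EXAMPLE.COM", r"@.*", "", "JOE_USER@EXAMPLE.COM", lower=True))
```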

Examples

RULE:[1:$1@$0](.*@HDP01.LOCAL)s/@.*//
  • jqpublic@HDP01.LOCAL → jqpublic
  • jdoe@HDP01.LOCAL → jdoe
  • nn/c6501.ambari.apache.org@HDP01.LOCAL → [not processed]
  • dn/c6501.ambari.apache.org@HDP01.LOCAL → [not processed]
RULE:[1:$1@$0](.*@HDP01.LOCAL)s/.*/ambari-qa/
  • jqpublic@HDP01.LOCAL → ambari-qa
  • jdoe@HDP01.LOCAL → ambari-qa
  • nn/c6501.ambari.apache.org@HDP01.LOCAL → [not processed]
  • dn/c6501.ambari.apache.org@HDP01.LOCAL → [not processed]
RULE:[2:$1@$0](.*@HDP01.LOCAL)s/@.*//
  • jqpublic@HDP01.LOCAL → [not processed]
  • jdoe@HDP01.LOCAL → [not processed]
  • nn/c6501.ambari.apache.org@HDP01.LOCAL → nn
  • dn/c6501.ambari.apache.org@HDP01.LOCAL → dn
RULE:[2:$1@$0](.*@HDP01.LOCAL)s/.*/hdfs/
  • jqpublic@HDP01.LOCAL → [not processed]
  • jdoe@HDP01.LOCAL → [not processed]
  • nn/c6501.ambari.apache.org@HDP01.LOCAL → hdfs
  • dn/c6501.ambari.apache.org@HDP01.LOCAL → hdfs
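
To experiment with complete rules like the ones above without a cluster, the pieces can be combined into one small Python sketch. Everything here is an assumption made for illustration: RULE_RE is a simplified parser, apply_rule is a hypothetical helper, and Python regular expressions stand in for Java's, so results for unusual rules may differ from what Hadoop produces.

```python
import re

RULE_RE = re.compile(
    r"RULE:\[(\d+):([^\]]*)\]"      # [n:string]
    r"(?:\(([^)]*)\))?"             # optional (regexp)
    r"(?:s/([^/]*)/([^/]*)/(g)?)?"  # optional s/pattern/replacement/[g]
    r"(/L)?$"                       # optional /L
)

def apply_rule(rule, principal):
    """Return the translated name, or None if the rule does not process it."""
    parsed = RULE_RE.match(rule)
    if parsed is None:
        raise ValueError("cannot parse rule: " + rule)
    n, fmt, regexp, pattern, replacement, g, lower = parsed.groups()
    name, _, realm = principal.partition("@")
    values = [realm] + name.split("/")  # $0 = realm, $1.. = components
    if len(values) - 1 != int(n):
        return None  # wrong number of components
    base = re.sub(r"\$(\d)", lambda m: values[int(m.group(1))], fmt)
    if regexp is not None and re.fullmatch(regexp, base) is None:
        return None  # (regexp) did not match
    result = base
    if pattern is not None:
        # Translate $n group references into Python's \g<n> form.
        repl = re.sub(r"\$(\d+)", lambda m: "\\g<" + m.group(1) + ">", replacement)
        result = re.sub(pattern, repl, base, count=0 if g else 1)
    return result.lower() if lower else result

rule = "RULE:[2:$1@$0](.*@HDP01.LOCAL)s/@.*//"
for principal in ["jdoe@HDP01.LOCAL", "nn/c6501.ambari.apache.org@HDP01.LOCAL"]:
    print(principal, "->", apply_rule(rule, principal))
```

A result of None corresponds to [not processed] in the lists above.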

Rule Processing

When processing auth-to-local rules, each rule in the ruleset is processed in order. When a match is made, the processing routine effectively exits and returns the translation that was generated.

For example, if the rule set was:

RULE:[1:$1@$0](hdfs@EXAMPLE.COM)s/.*/hdfs/
RULE:[1:$1@$0](.*@EXAMPLE.COM)s/.*/not_hdfs/
  • hdfs@EXAMPLE.COM would yield "hdfs"
  • user@EXAMPLE.COM would yield "not_hdfs"

However, if the ruleset was:

RULE:[1:$1@$0](.*@EXAMPLE.COM)s/.*/not_hdfs/
RULE:[1:$1@$0](hdfs@EXAMPLE.COM)s/.*/hdfs/
  • hdfs@EXAMPLE.COM would yield "not_hdfs"
  • user@EXAMPLE.COM would yield "not_hdfs"
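
The first-match-wins behavior can be shown with a deliberately simplified Python sketch. Each rule is reduced to a hypothetical (regexp, local name) pair, which is enough here because every rule in the two rulesets above replaces the whole value with a fixed name.

```python
import re

def translate(rules, value):
    """rules: ordered list of (regexp, local_name); the first full match wins."""
    for regexp, local_name in rules:
        if re.fullmatch(regexp, value):
            return local_name  # stop at the first matching rule
    return None

specific_first = [(r"hdfs@EXAMPLE.COM", "hdfs"),
                  (r".*@EXAMPLE.COM", "not_hdfs")]
general_first = list(reversed(specific_first))

for principal in ["hdfs@EXAMPLE.COM", "user@EXAMPLE.COM"]:
    print(principal,
          translate(specific_first, principal),
          translate(general_first, principal))
```

With the general rule first, the hdfs-specific rule is never reached, which is why more specific rules belong earlier in the ruleset.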

Testing Rulesets

Since auth-to-local rulesets can be rather difficult to read and to verify, a handy tool can be used to test them. However, this tool reads the ruleset from the hadoop.security.auth_to_local property in the core-site.xml file (typically found at /etc/hadoop/conf/core-site.xml) and may not be able to import rules from a different source.

To use the tool, one of two commands can be executed on the command line:

Newer versions of Hadoop should use:

hadoop kerbname

Older versions of Hadoop should use:

hadoop org.apache.hadoop.security.HadoopKerberosName

For example:

hadoop org.apache.hadoop.security.HadoopKerberosName joe_user@EXAMPLE.COM
Name: joe_user@EXAMPLE.COM to joe_user
hadoop org.apache.hadoop.security.HadoopKerberosName ambari-qa-c1@EXAMPLE.COM
Name: ambari-qa-c1@EXAMPLE.COM to ambari-qa
Comments

Saved that as a PDF. I always wanted to look up how they work but never followed through. Thanks a lot.


If any rule matches, does it stop processing other rules?

Whether it stops or not wouldn't be an issue in most cases, but it matters if someone puts completely wrong rules in place...


@Hajime, that would be worth adding to this document. Thanks for asking.

The rules are processed in order from top to bottom. When a match is found, processing stops.

Therefore with

RULE:[1:$1@$0](hdfs@EXAMPLE.COM)s/.*/hdfs/
RULE:[1:$1@$0](.*@EXAMPLE.COM)s/.*/not_hdfs/
  • hdfs@EXAMPLE.COM would yield "hdfs"
  • user@EXAMPLE.COM would yield "not_hdfs"

And with

RULE:[1:$1@$0](.*@EXAMPLE.COM)s/.*/not_hdfs/
RULE:[1:$1@$0](hdfs@EXAMPLE.COM)s/.*/hdfs/
  • hdfs@EXAMPLE.COM would yield "not_hdfs"
  • user@EXAMPLE.COM would yield "not_hdfs"

Great article @Robert Levas !

For completeness, I'd like to see this information added to the above article:

When mapping Kerberos principals, if the principal names are in upper case or camel case, they won't be recognized on a Linux machine (Linux usernames are case-sensitive and are typically lower case). So you need to add the extra switch "/L" to the RULE definition to force the conversion to lower case.

For example, here are some auth_to_local rule examples with lower case switch added:

"RULE:[1:$1]/L"
"RULE:[2:$1]/L"
"RULE:[2:$1;$2](^.*;admin$)s/;admin$///L"
"RULE:[2:$1;$2](^.*;guest$)s/;guest$//g/L"

And based on these rules, here is the expected output for the following inputs:

"JOE@FOO.COM" to "joe"
"Joe/root@FOO.COM" to "joe"
"Joe/admin@FOO.COM" to "joe"
"Joe/guestguest@FOO.COM" to "joe"

Hope this helps.


@Robert Levas, thanks for the great article! May I also suggest adding information about the "hadoop kerbname" or "hadoop org.apache.hadoop.security.HadoopKerberosName" shell command? This is a helpful debugging tool that prints the current principal's short name after Hadoop applies the currently configured auth_to_local rules. If you'd like, feel free to copy-paste my text from this answer: https://community.hortonworks.com/questions/38573/pig-view-hdfs-test-failing-service-hdfs-check-fail... .


@Chris Nauroth

The hadoop kerbname tool is awesome! I just had an opportunity to use it to test out a complex rule, but I needed to use the _old_ syntax to get it to work (thanks for referencing your comment that explains the usage):

hadoop org.apache.hadoop.security.HadoopKerberosName 123456789@EXAMPLE.COM

I meant to add this to the doc; I just haven't had the time. I will get around to it, though. But thanks for providing the tool.


Very helpful, thanks!


I have a use case where given a rule like:

RULE:[1:$1@$0](.*@FOO.COM)s/@.*// 

I want *all* of the usernames below to match:

user1@FOO.COM 
user2@foo.com
user3@FOO.com

I tried modifying the above rule to use the RegEx modifier that makes the (regexp) case insensitive:

RULE:[1:$1@$0]((?i)(.*@FOO.COM))s/@.*// 

However, when testing with this rule I get an error:

$ hadoop org.apache.hadoop.security.HadoopKerberosName  user1@FOO.COM
Exception in thread "main" java.util.regex.PatternSyntaxException: Unknown inline modifier near index 3
(?i
   ^
        at java.util.regex.Pattern.error(Pattern.java:1957)
        at java.util.regex.Pattern.group0(Pattern.java:2896)
        at java.util.regex.Pattern.sequence(Pattern.java:2053)
        at java.util.regex.Pattern.expr(Pattern.java:1998)
        at java.util.regex.Pattern.compile(Pattern.java:1698)
        at java.util.regex.Pattern.<init>(Pattern.java:1351)
        at java.util.regex.Pattern.compile(Pattern.java:1028)
        at org.apache.hadoop.security.authentication.util.KerberosName$Rule.<init>(KerberosName.java:193)
        at org.apache.hadoop.security.authentication.util.KerberosName.parseRules(KerberosName.java:342)
        at org.apache.hadoop.security.authentication.util.KerberosName.setRules(KerberosName.java:398)
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:75)
        at org.apache.hadoop.security.HadoopKerberosName.main(HadoopKerberosName.java:79)

Any idea what I'm doing wrong ?


@Anant Aneja

You probably should have posed this as a question in the forum, rather than as a comment on this article. It may have been answered more quickly.

The rule you are using will not perform the translation you want. The regular expression syntax to match using case-insensitivity is not supported as you have specified it and the translation will not generate local names with all lower-case characters. The rule you want is more like

RULE:[1:$1@$0](.*@FOO.COM)s////L

With this rule, the Hadoop UGI class will translate user@FOO.COM to user@foo.com

[root@c7401 ~]# hadoop org.apache.hadoop.security.HadoopKerberosName joe_user@FOO.COM
18/08/27 15:57:07 INFO util.KerberosName: Non-simple name joe_user@FOO.COM after auth_to_local rule RULE:[1:$1@$0](.*@FOO.COM)s////L
Name: joe_user@FOO.COM to joe_user@foo.com

As for the other principal names, they will technically be invalid since the realm name needs to always be in all upper-case characters.

  • user1@FOO.COM - legal
  • user2@foo.com - illegal
  • user3@FOO.com - illegal

Hi team,
I want to configure the following mappings:

"yarn-user/hdp01-node.lab.contoso.com@LAB.CONTOSO.COM" to "yarn-user"
"yarn-user/hdp02-node.lab.contoso.com@LAB.CONTOSO.COM" to "yarn-user"
"yarn-user/hdp03-node.lab.contoso.com@LAB.CONTOSO.COM" to "yarn-user"

Please advise on a rule for this.
Thanks