Reply
Explorer
Posts: 7
Registered: ‎11-30-2015

Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

[ Edited ]

We have experiencing the issue with HDFS FACL permissions being out of sync with Hive Catalog (HCATALOG) / Sentry after we upgraded to 5.16.1.

 

The issue is described in Technical Sevice Bulletin 2019-366: Clusters running CDH 5.16.1, 6.1.0, or 6.1.1 can lose some HDFS file permissions any time the Name Node is restarted. 

https://my.cloudera.com/knowledge/TSB-2019-366?id=87515

 

We discovered that reapplying the table grants will fix the issue until the name node is restarted - at which point the issue will reappear.

 

Fortunately the Sentry database holds the current grants and these data are not affected by the issue, so the data there can be used to re-generate the grant commands.

 

The process can be automated and re-run in case your name node gets restarted.

 

On the host running the Sentry MySQL database, create a login path (file with MySQL credentials):

 

# First create a login file for MySQL:
mysql_config_editor set --login-path=oneoff --host=$(hostname) --user=sentry --password
# input password for the {{sentry}} database in the environment

# Test the login path:
mysql --login-path=oneoff --host=$(hostname) sentry -e "select now()"

Create a Kerberos credentials for accessing Hive :

 

kinit <Hive admin user>

Create the script with the Hive grant commands and run the script on Hive:

SQL="SELECT CONCAT('GRANT ', upper(p.ACTION) ,' ON DATABASE \`', DB_NAME, '\` TO ROLE \`', ROLE_NAME, '\`', IF (WITH_GRANT_OPTION='Y',' WITH GRANT OPTION;',';')) 
FROM SENTRY_DB_PRIVILEGE p
INNER JOIN SENTRY_ROLE_DB_PRIVILEGE_MAP pm ON pm.DB_PRIVILEGE_ID = p.DB_PRIVILEGE_ID
INNER JOIN SENTRY_ROLE r ON r.ROLE_ID = pm.ROLE_ID
WHERE p.ACTION IN ('insert','select');
"
LOG_FILE_NAME="mysql-stderr.log"
HQL_FILE="grants.hql"
mysql --login-path=oneoff --skip-column-names --batch --silent --no-beep --host=$(hostname) -Dsentry \
-P3306 -e "${SQL}" >"${HQL_FILE}" 2> "${LOG_FILE_NAME}"
# cat "$LOG_FILE_NAME"
cat "$HQL_FILE"

# Change the host name here if HiveServer2 is not running on the local host (or if hostname does not resolve to the correct value)
HIVE_SVR="$(hostname):10000"
# HIVE_SVR="my-hiveserver.domain.com:10000"

# Change REALM.COM below to your Kerberos realm
beeline -u "jdbc:hive2://${HIVE_SVR}/default;principal=hive/_HOST@REALM.COM" \
--fastConnect=true --silent=false --outputformat=table --showHeader=false --showWarnings=false \
-f "${HQL_FILE}"

# Clean up
rm -f "${HQL_FILE}"
# Clean up - remove the login path:
mysql_config_editor -v remove --login-path=oneoff

Hope this may be usefull for someone somewhere.

 

/henrik ring

 

Cloudera Employee
Posts: 3
Registered: ‎06-26-2017

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

[ Edited ]

As you mentioned a restart of NN causes the problem to re-occur. That is a big limitation of this workaround since running several grants on every restart may not be feasible. Therefore we do not recommend it.
The reason why your suggestion works is because duplicate grants are not written to the sentry database however a permission update is sent to NN. Which is why you won't see a change on the sentry database with duplicate grants.
If possible, please wait until the TSB is addressed and fixed in CDH 6.1.2 and 5.16.2 releases, without which only the latest applied privilege gets converted to ACLs (check column DB_PRIVILEGE_ID for latest values).
A better workaround is to actually first REVOKE select/insert privileges, and then re-GRANT them. This will force the select/insert privilege to be the most recent applied privilege. You will also not lose ACLs on NN restart.
Again this won't address the bug in the TSB where multiple privileges get applied as ACLs

Explorer
Posts: 7
Registered: ‎11-30-2015

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

Hi @arjunM  ,

 

Thank you for taking the time to look into this.

 

Am I understanding you correctly, that if I do a REVOKE before the GRANTs it will fix the problem permanently ie. I can restart my NN without the problem re-occuring. 

 

If that is the case, a simple change to the above script can fix the issue permanently on any system ?

 

SQL="SELECT CONCAT('REVOKE ', upper(p.ACTION) ,' ON DATABASE \`', DB_NAME, '\` FROM ROLE \`', ROLE_NAME, '\`', ';') 
FROM SENTRY_DB_PRIVILEGE p
INNER JOIN SENTRY_ROLE_DB_PRIVILEGE_MAP pm ON pm.DB_PRIVILEGE_ID = p.DB_PRIVILEGE_ID
INNER JOIN SENTRY_ROLE r ON r.ROLE_ID = pm.ROLE_ID
WHERE p.ACTION IN ('insert','select')
UNION ALL
SELECT CONCAT('GRANT ', upper(p.ACTION) ,' ON DATABASE \`', DB_NAME, '\` TO ROLE \`', ROLE_NAME, '\`', IF (WITH_GRANT_OPTION='Y',' WITH GRANT OPTION;',';')) 
FROM SENTRY_DB_PRIVILEGE p
INNER JOIN SENTRY_ROLE_DB_PRIVILEGE_MAP pm ON pm.DB_PRIVILEGE_ID = p.DB_PRIVILEGE_ID
INNER JOIN SENTRY_ROLE r ON r.ROLE_ID = pm.ROLE_ID
WHERE p.ACTION IN ('insert','select');
"
LOG_FILE_NAME="mysql-stderr.log"
HQL_FILE="grants.hql"
mysql --login-path=oneoff --skip-column-names --batch --silent --no-beep --host=$(hostname) -Dsentry \
-P3306 -e "${SQL}" >"${HQL_FILE}" 2> "${LOG_FILE_NAME}"
# cat "$LOG_FILE_NAME"
cat "$HQL_FILE"

Best regards,

Henrik Ring

Cloudera Employee
Posts: 3
Registered: ‎06-26-2017

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

@henrikringcsc  That is correct. After you revoke and re-grant, NN restart won't result in loss in ACLs.  You can verify the privilege ordering is different. Look at the SENTRY_ROLE_DB_PRIVILEGE_MAP table. 

I would also recommend you take a backup of the tables prior to doing this to avoid any regression.

Highlighted
Explorer
Posts: 7
Registered: ‎11-30-2015

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

Hi @arjunM ,

 

FYI:

We tested your suggestion, REVOKING grants before GRANTING them again.

After a restart of the NN the problem unfortunately re-appeared.

So it seems is it not a permanent solution after all.

 

Best regards,

Henrik Ring

 

Cloudera Employee
Posts: 3
Registered: ‎06-26-2017

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

@henrikringcsc  Did the DB_PRIVILEGE_ID ordering change?

Explorer
Posts: 7
Registered: ‎11-30-2015

Re: Regarding TSB 2019-366 - HDFS FACL permissions out of sync with Hive Catalog (HCATALOG)

Hi @arjunM ,

 

I do not know if the ordring changed - maybe you can tell me. Here is an example:

 

SELECT p.DB_PRIVILEGE_ID,p.PRIVILEGE_SCOPE,p.ACTION,r.ROLE_NAME FROM SENTRY_DB_PRIVILEGE p
INNER JOIN SENTRY_ROLE_DB_PRIVILEGE_MAP pm ON pm.DB_PRIVILEGE_ID = p.DB_PRIVILEGE_ID
INNER JOIN SENTRY_ROLE r ON r.ROLE_ID = pm.ROLE_ID
WHERE r.ROLE_NAME='car_raw_d'
ORDER BY p.DB_PRIVILEGE_ID;

+-----------------+-----------------+--------+-----------+
| DB_PRIVILEGE_ID | PRIVILEGE_SCOPE | ACTION | ROLE_NAME |
+-----------------+-----------------+--------+-----------+
|             156 | URI             | *      | car_raw_d |
|             157 | URI             | *      | car_raw_d |
|             382 | DATABASE        | insert | car_raw_d |
|             384 | DATABASE        | select | car_raw_d |
+-----------------+-----------------+--------+-----------+

Here is an example from another environment where we did not apply the REVOKE/GRANT script:

 

+-----------------+-----------------+--------+-----------+
| DB_PRIVILEGE_ID | PRIVILEGE_SCOPE | ACTION | ROLE_NAME |
+-----------------+-----------------+--------+-----------+
|               5 | DATABASE        | select | car_raw_d |
|               6 | DATABASE        | insert | car_raw_d |
|             165 | URI             | *      | car_raw_d |
|             166 | URI             | *      | car_raw_d |
+-----------------+-----------------+--------+-----------+

So yes the order is different; but is it the corrent kind of different? I cannot say.

 

Both environments are initially setup with the same scripts and are both on 5.16.1.

 

/henrik ring