1) Build the Apache Sentry distribution
First, build and install the Apache Sentry distribution. Download the Apache Sentry source release (1.8.0 was used for this tutorial), then verify that the signature is valid and that the message digests match. Extract and build the source, then copy the resulting distribution to the directory where you wish to install it (referred to below by the placeholder ${sentry.home}):
- tar zxvf apache-sentry-1.8.0-src.tar.gz
- cd apache-sentry-1.8.0-src
- mvn clean install -DskipTests
- cp -r sentry-dist/target/apache-sentry-1.8.0-bin ${sentry.home}
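The verification step mentioned above is not spelled out, so here is a minimal sketch of how it might look. It assumes GNU coreutils (`sha512sum`) and that the release publishes a SHA-512 digest; use whichever algorithm the download page actually provides. The helper name `verify_sha512` is hypothetical, and the `gpg` lines assume the `.asc` signature and the project's KEYS file were downloaded alongside the archive.

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
verify_sha512() {
  file="$1"
  # Normalise the expected digest to lower case, since some digest files use upper case.
  expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
  actual="$(sha512sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "digest OK: $file"
  else
    echo "digest MISMATCH: $file" >&2
    return 1
  fi
}

# Signature check (assumes KEYS and the .asc file sit next to the archive):
#   gpg --import KEYS
#   gpg --verify apache-sentry-1.8.0-src.tar.gz.asc apache-sentry-1.8.0-src.tar.gz
# Digest check (paste the published digest as the second argument):
#   verify_sha512 apache-sentry-1.8.0-src.tar.gz "<published digest>"
```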
2) Install and configure Apache Hive
Please follow the first tutorial to install and configure Apache Hadoop if you have not already done so. Apache Sentry 1.8.0 does not support Apache Hive 2.1.x, so download and extract Apache Hive 2.0.1 instead. Set the "HADOOP_HOME" environment variable to point to the Apache Hadoop installation directory. Then follow the steps outlined in the first tutorial to create the table in Hive, and make sure that a query against it succeeds.
3) Integrate Apache Sentry with Apache Hive
Now we will integrate Apache Sentry with Apache Hive. We need to add three new configuration files to the "conf" directory of Apache Hive.
3.a) Configure Apache Hive to use authorization
Create a file called 'conf/hiveserver2-site.xml' with the content:
Here we are enabling authorization and adding the Sentry authorization plugin.
3.b) Add Sentry plugin configuration
Create a new file in the "conf" directory of Apache Hive called "sentry-site.xml" with the following content:
This is the configuration file for the Sentry plugin for Hive. It essentially says that the authorization privileges are stored in a local file, and that the groups for authenticated users should be retrieved from this file. As we are not using Kerberos, the "testing.mode" configuration parameter must be set to "true".
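As an illustration of the description above, a file of this kind might look roughly like the following sketch. This is a reconstruction under assumptions, not the tutorial's exact listing: the property names and provider classes are the ones commonly used for Sentry's file-based policy provider, and the `HIVE_CONF` and `SENTRY_INI` variables (and their default paths) are hypothetical placeholders to adjust for your installation.

```shell
# Hypothetical locations; adjust to your Hive installation.
HIVE_CONF="${HIVE_CONF:-./conf}"
SENTRY_INI="${SENTRY_INI:-/path/to/hive/conf/sentry.ini}"

mkdir -p "$HIVE_CONF"
cat > "$HIVE_CONF/sentry-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <!-- Read authorization privileges from a local policy file -->
  <property>
    <name>sentry.hive.provider.backend</name>
    <value>org.apache.sentry.provider.file.SimpleFileProviderBackend</value>
  </property>
  <property>
    <name>sentry.hive.provider.resource</name>
    <value>${SENTRY_INI}</value>
  </property>
  <!-- Resolve the groups of authenticated users from the same file -->
  <property>
    <name>sentry.hive.provider</name>
    <value>org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider</value>
  </property>
  <!-- Needed because Kerberos is not in use -->
  <property>
    <name>sentry.hive.testing.mode</name>
    <value>true</value>
  </property>
</configuration>
EOF
```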
3.c) Add the authorization privileges for our test-case
Next, we need to specify the authorization privileges. Create a new file in the "conf" directory of Apache Hive called "sentry.ini" with the following content:
Here we are granting the user "alice" a role which allows her to perform a "select" on the table "words".
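A policy matching that description could be sketched as follows, using Sentry's policy file layout of [groups], [roles] and [users] sections. The group name "user", the role name "select_words_role", the server name "server1", the database "default" and the `HIVE_CONF` variable are all assumptions made for illustration; only the user "alice", the table "words" and the "select" action come from the text above.

```shell
# Hypothetical location of Hive's conf directory; adjust as needed.
HIVE_CONF="${HIVE_CONF:-./conf}"

mkdir -p "$HIVE_CONF"
cat > "$HIVE_CONF/sentry.ini" <<'EOF'
[groups]
# group = role(s) granted to that group
user = select_words_role

[roles]
# role = privilege; server and database names are assumed here
select_words_role = server=server1->db=default->table=words->action=select

[users]
# user = group(s); used when groups are resolved from this file
alice = user
EOF
```

Because "bob" does not appear in the [users] section, he belongs to no group and therefore holds no role.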
3.d) Add Sentry libraries to Hive
Finally, we need to add the Sentry libraries to Hive. Copy the following files from ${sentry.home}/lib to ${hive.home}/lib:
- sentry-binding-hive-common-1.8.0.jar
- sentry-core-model-db-1.8.0.jar
- sentry*provider*.jar
- sentry-core-common-1.8.0.jar
- shiro-core-1.2.3.jar
- sentry-policy*.jar
- sentry-service-*.jar
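The copy step above can be scripted. The following is a minimal sketch that loops over the file names and patterns listed; the function name `copy_sentry_jars` and the example paths are hypothetical, standing in for ${sentry.home} and ${hive.home}.

```shell
# Hypothetical helper: copy the Sentry jars listed above into Hive's lib directory.
# Usage: copy_sentry_jars <sentry home> <hive home>
copy_sentry_jars() {
  sentry_home="$1"
  hive_home="$2"
  for pattern in \
    'sentry-binding-hive-common-1.8.0.jar' \
    'sentry-core-model-db-1.8.0.jar' \
    'sentry*provider*.jar' \
    'sentry-core-common-1.8.0.jar' \
    'shiro-core-1.2.3.jar' \
    'sentry-policy*.jar' \
    'sentry-service-*.jar'
  do
    # $pattern is deliberately unquoted so that the shell expands the wildcard.
    for jar in "$sentry_home"/lib/$pattern; do
      if [ -e "$jar" ]; then
        cp "$jar" "$hive_home/lib/"
      else
        echo "warning: nothing matches $pattern" >&2
      fi
    done
  done
}

# Example (paths are placeholders):
#   copy_sentry_jars /opt/apache-sentry-1.8.0-bin /opt/apache-hive-2.0.1-bin
```

The warning for unmatched patterns makes it easier to spot a jar whose name differs in another Sentry version.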
4) Test authorization with Apache Hive
After restarting Apache Hive, we can test authorization. According to our policy, the user 'alice' can query the table, whereas the user 'bob', who has been granted no privileges, cannot:
- bin/beeline -u jdbc:hive2://localhost:10000 -n alice
- select * from words where word == 'Dare'; (works)
- bin/beeline -u jdbc:hive2://localhost:10000 -n bob
- select * from words where word == 'Dare'; (fails)