Monday, January 29, 2018

Securing Apache Sqoop - part II

This is the second in a series of posts on how to secure Apache Sqoop. The first post looked at how to set up Apache Sqoop to perform a simple use-case of transferring a file from HDFS to Apache Kafka. In this post we will look at securing Apache Sqoop with Apache Ranger, such that only authorized users can interact with it. We will then show how to use the Apache Ranger Admin UI to create authorization policies for Apache Sqoop.

1) Install the Apache Ranger Sqoop plugin

If you have not done so already, please follow the steps in the earlier tutorial to set up Apache Sqoop. First we will install the Apache Ranger Sqoop plugin. Download Apache Ranger and verify that the signature is valid and that the message digests match. Because of some bug fixes to the installation process, I am using version 1.0.0-SNAPSHOT in this post. Now extract and build the source, and copy the resulting plugin to a location where you will configure and install it:
  • mvn clean package assembly:assembly -DskipTests
  • tar zxvf target/ranger-1.0.0-SNAPSHOT-sqoop-plugin.tar.gz
  • mv ranger-1.0.0-SNAPSHOT-sqoop-plugin ${ranger.sqoop.home}
Now go to ${ranger.sqoop.home} and edit "". You need to specify the following properties:
  • POLICY_MGR_URL: Set this to "http://localhost:6080"
  • REPOSITORY_NAME: Set this to "SqoopTest".
  • COMPONENT_INSTALL_DIR_NAME: The location of your Apache Sqoop installation
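If you would rather script these three settings than edit the file by hand, something like the following could work. It is only a sketch: "set_prop" is a hypothetical helper (not part of Ranger), and it assumes the file uses plain KEY=VALUE lines, one per property.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ranger): set KEY=VALUE in a
# properties file, replacing an existing "KEY=" line or appending a new one.
# Assumes plain KEY=VALUE lines and values without backslashes.
set_prop() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        tmp=$(mktemp) || return 1
        # Rewrite matching lines into a temp file (portable, avoids "sed -i").
        awk -v k="$key" -v v="$value" \
            'index($0, k "=") == 1 { print k "=" v; next } { print }' \
            "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

With PROPS pointing at the properties file in the plugin directory (a placeholder here), the calls would be: set_prop "$PROPS" POLICY_MGR_URL "http://localhost:6080", set_prop "$PROPS" REPOSITORY_NAME "SqoopTest", and set_prop "$PROPS" COMPONENT_INSTALL_DIR_NAME followed by the path to your own Sqoop installation.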
Save "" and install the plugin as root via "sudo -E ./". Make sure that the user you are running Sqoop as has permission to access '/etc/ranger/SqoopTest', which is where the Ranger plugin for Sqoop will download authorization policies created in the Ranger Admin UI.
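A quick way to confirm the permissions is to run a small check as the Sqoop user. The sketch below is an illustration only: "check_policy_dir" is a hypothetical helper, and it assumes that "permission to access" means the user can read, write and traverse the directory (the plugin has to store downloaded policies there).

```shell
#!/bin/sh
# Hypothetical helper: check that the current user can use the directory
# where the Ranger plugin stores downloaded policies.
# Assumption: the plugin needs read, write and traverse (execute) permission.
check_policy_dir() {
    dir=${1:-/etc/ranger/SqoopTest}
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    if [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "user $(id -un) lacks read/write/traverse permission on $dir" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

Run "check_policy_dir" with no arguments as the user that starts the Sqoop server; a non-zero exit status means the ownership or mode of the directory needs adjusting.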

In the Apache Sqoop directory, copy 'conf/ranger-sqoop-security.xml' to the root directory (or else add the 'conf' directory to the Sqoop classpath). Now restart Apache Sqoop and try to see the Connectors that were installed:
  • bin/sqoop2-server start
  • bin/sqoop2-shell
  • show connector
You should see an empty list here as you are not authorized to see the connectors. Note that "show job" should still work OK, as you have permission to view jobs that you created.

2) Create authorization policies in the Apache Ranger Admin console

Next we will use the Apache Ranger admin console to create authorization policies for Sqoop. Follow the steps in this tutorial (but use Ranger 1.0.0 or later) to install the Apache Ranger admin service. Start the Apache Ranger admin service with "sudo ranger-admin start", then open "http://localhost:6080/" in a browser and log on with "admin/admin". Add a new Sqoop service with the following configuration values:
  • Service Name: SqoopTest
  • Username: admin
  • Sqoop URL: http://localhost:12000
Note that "Test Connection" is not going to work here, as the "admin" user is not authorized at this stage to read from the Sqoop 2 server. However, once the service is created and the policies synced to the Ranger plugin in Sqoop (roughly every 30 seconds by default), it should work correctly.
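Rather than guessing when the first sync has happened, you could poll for the policy file that the plugin writes locally. This is a rough sketch under assumptions: "wait_for_policies" is a hypothetical helper, and it assumes the plugin caches policies as a JSON file in a "policycache" directory under '/etc/ranger/SqoopTest' (check the plugin configuration for the actual location).

```shell
#!/bin/sh
# Hypothetical helper: wait until a non-empty *.json policy file shows up in
# the given directory, or give up after roughly TIMEOUT seconds.
# Assumption: the Ranger plugin caches downloaded policies as JSON there.
wait_for_policies() {
    cache_dir=$1
    timeout=${2:-60}
    interval=${3:-5}
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        for f in "$cache_dir"/*.json; do
            # If the glob matches nothing, "$f" is the literal pattern and -s fails.
            if [ -s "$f" ]; then
                echo "$f"
                return 0
            fi
        done
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}
```

For example, "wait_for_policies /etc/ranger/SqoopTest/policycache 90 5" prints the path of the cached policy file once the first download has completed. Note that this only detects the first sync; it cannot tell whether a later policy change has arrived yet.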

Once the "SqoopTest" service is created, we will create some authorization policies for the user who is using the Sqoop Shell.
Click on "Settings" and "Users/Groups" and add a new user corresponding to the user for whom you wish to create authorization policies. When this is done, click on the "SqoopTest" service and edit the existing policies to add this user.

Wait 30 seconds for the policies to sync to the Ranger plugin that is co-located with the Sqoop service. Now re-start the Shell and "show connector" should list the full range of Sqoop Connectors, as authorization has succeeded. Similar policies could be created to allow only certain users to run jobs created by other users.
