Monday, June 29, 2020

Configuring Kerberos for Kafka in Talend Open Studio for ESB

A few years back I wrote a blog post about how to create a job in Talend Open Studio for Big Data to read data from an Apache Kafka topic using Kerberos. That job made use of the "tKafkaConnection" and "tKafkaInput" components. Talend Open Studio for ESB has a component based on Apache Camel called "cKafka" that can be used for the same purpose, but configuring it for Kerberos is slightly different. In this post, we will show how to use the cKafka component in Talend Open Studio for ESB to read from a Kafka topic using Kerberos.

1) Kafka setup

Follow a previous tutorial to set up an Apache Kerby based KDC test case and to configure Apache Kafka to require Kerberos for authentication. Kafka 2.5.0 was used for the purposes of this tutorial. Create a "test" topic, write some data to it, and verify with the command-line consumer that the data can be read correctly.

2) Download Talend Open Studio for ESB and create a route

Now we will download Talend Open Studio for ESB (7.3.1 was used for the purposes of this tutorial). Unzip the file once it has downloaded and then start the Studio using one of the platform-specific scripts. It will prompt you to download some additional dependencies and to accept the licenses. Right-click on "Routes" and select "Create Route", entering a name for the route.

In the search bar under "Palette" on the right-hand side, enter "kafka" and press Enter. Drag the "cKafka" component that appears into the route designer. Next find the "cLog" component under "Miscellaneous" and drag it to the right of the "cKafka" component. Right-click the "cKafka" component, select "Row / Route" and connect the resulting arrow to the "cLog" component.

3) Configure the components

Now let's configure the individual components. Double-click on the "cKafka" component and enter "test" for the topic. Next, select "Advanced Settings" and scroll down to the Kerberos configuration. For "Kerberos Service Name" enter "kafka". Then for "Security Protocol" select "SASL over Plaintext".
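
For readers who want to see what these two settings amount to outside the Studio, here is a minimal sketch in plain Java. It assumes that "SASL over Plaintext" corresponds to the standard Kafka client value "SASL_PLAINTEXT" and that the broker listens on "localhost:9092"; neither is confirmed by the Studio UI. The class name, the method names and the shape of the Camel endpoint URI are illustrative only, and the URI layout may differ between Camel versions.

```java
import java.util.Properties;

// Hypothetical helper: mirrors the cKafka Kerberos settings as plain strings.
class KerberosKafkaSettings {

    // Standard Kafka client property names for a Kerberos (GSSAPI) consumer.
    static Properties clientProperties(String brokers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", brokers);
        // Assumed equivalent of "SASL over Plaintext" in the Studio.
        props.setProperty("security.protocol", "SASL_PLAINTEXT");
        // Matches the "Kerberos Service Name" field.
        props.setProperty("sasl.kerberos.service.name", "kafka");
        return props;
    }

    // Roughly the same settings expressed as a Camel Kafka endpoint URI.
    static String endpointUri(String topic, String brokers) {
        return "kafka:" + topic
                + "?brokers=" + brokers
                + "&securityProtocol=SASL_PLAINTEXT"
                + "&saslKerberosServiceName=kafka";
    }

    public static void main(String[] args) {
        String brokers = "localhost:9092"; // assumption: local test broker
        clientProperties(brokers)
                .forEach((key, value) -> System.out.println(key + "=" + value));
        System.out.println(endpointUri("test", brokers));
    }
}
```

The same property names can be used in a properties file passed to the command-line consumer, which makes it easier to compare a working console setup with the route configuration.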

Next click on the "Run" tab and go to "Advanced Settings". Under "JVM Settings" select the checkbox for "Use specific JVM arguments", and add two new arguments: one that sets the "java.security.auth.login.config" system property and one that sets "java.security.krb5.conf". For the first argument, enter the path of the "client.jaas" file described in the tutorial used to set up the Kafka test case. For the second argument, specify the path of the "krb5.conf" file supplied in the target directory of the Apache Kerby test case.
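
The two arguments are ordinary JVM system properties, typically written as "-Djava.security.auth.login.config=/path/to/client.jaas" and "-Djava.security.krb5.conf=/path/to/krb5.conf", where both paths are placeholders for your own locations. A mistyped path is an easy mistake to make here, so the following sketch shows one way to check the two properties before starting the route. The class and method names are hypothetical, and the check assumes both values are plain file paths rather than URLs.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical preflight check for the two Kerberos-related JVM arguments.
class KerberosJvmArgsCheck {

    static final String JAAS_PROPERTY = "java.security.auth.login.config";
    static final String KRB5_PROPERTY = "java.security.krb5.conf";

    // Returns one message per property that is unset or not readable.
    static List<String> problems(Properties sys) {
        List<String> problems = new ArrayList<>();
        for (String name : new String[] {JAAS_PROPERTY, KRB5_PROPERTY}) {
            String value = sys.getProperty(name);
            if (value == null || value.isEmpty()) {
                problems.add("-D" + name + " is not set");
            } else if (!Files.isReadable(Paths.get(value))) {
                // Assumes the value is a local path, not a URL.
                problems.add("-D" + name + " is not readable: " + value);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = problems(System.getProperties());
        if (problems.isEmpty()) {
            System.out.println("Kerberos JVM arguments look OK");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Running the class with the same two "-D" arguments you entered in the Studio gives a quick answer on whether the files can be found before any Kerberos negotiation is attempted.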

Now we are ready to run the route. Click on the "Run" tab and then hit the "Run" button. Send some data via the producer to the "test" topic and you should see the data appear in the Run window in the Studio.