I have been hearing about Apache Kafka for a while now, and many of our customers ask us about running it inside containers. I recently had the opportunity to work with a customer to get it deployed and processing messages on OpenShift. We are running a multi-node OpenShift cluster on Azure, and it took a little while to get the network configured properly (someone said it was not possible, which made us work harder to figure it all out). I figured I would blog about the configuration and pass the knowledge on to the Arctiq community.
Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.
This diagram sums up what we have configured: we are running 3 Kafka Brokers inside OpenShift as well as 3 Zookeeper PODs that provide the configuration information. The Kafka Producers and Consumers are running outside of OpenShift and need to establish TCP connections on port 9092 to the Brokers. This created the main networking challenge: because these are not HTTP/HTTPS connections, we can't use the OCP HAProxy routers. We are going to have to deploy and configure an ingress policy on OpenShift (more on that soon).
Below you get a feel for our deployed architecture: the Kafka Brokers accept inbound connections but also need to communicate outbound to Consumers. We can leverage the SDN for communication inside the OpenShift project, but we need L3 connectivity for the rest.
So we have 3 Kafka Broker PODs and 3 Zookeeper PODs running inside an OpenShift project / namespace. We deployed these utilizing Stateful Sets and made some persistent storage claims as well. Stateful Sets are still in tech preview in OCP v3.7 but seem to work fine – Read about them here
Kafka has a bunch of different APIs available. Here is a list for reference: Producer, Consumer, Streams, Connect, AdminClient and Legacy APIs.
We have deployed some new services with our PODs: 1 for Zookeeper (not externally facing) and 3 for the Kafka Brokers (we will be doing some Azure network plumbing to allow external access to these services).
If we run oc get svc, we can see our new services and also the external IPs we will be mapping to each broker / service.
Here is also an example of the .yaml we used to create kafka1-ingress.
The selector app: dev01-broker is how we define which PODs to direct the connections to.
apiVersion: v1
kind: Service
metadata:
  name: kafka1-ingress
spec:
  externalIPs:
  - 10.73.115.121
  ports:
  - port: 9093
    protocol: TCP
  selector:
    app: dev01-broker
You can also see the image streams we used to deploy the Kafka containers:
So we need to route non-HTTP traffic to these services and on to the PODs. To handle this traffic we are going to deploy an ipfailover router. In this use case we deployed the router in the default project namespace, using the following commands.
oc project default
oc create serviceaccount ipfailover
oadm policy add-scc-to-user privileged system:serviceaccount:default:ipfailover
oc adm ipfailover ipf-ha-router1 --replicas=1 --selector="ipfailover=ipf-ha-router1" --virtual-ips=10.73.115.120-123 --watch-port=0 --service-account=ipfailover --create
You can see we are using the IPs 10.73.115.120-123 for this configuration; these relate back to the services we deployed for the Kafka PODs. We are also specifying a label selector in order to ensure this router is deployed on a specific infrastructure node. This will be the node on which we configure the additional IPs inside Azure. Kafka PODs can run anywhere; we run them on the application nodes.
Now we can ssh to our infrastructure node and check that our new IP addresses are present.
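To make the range notation concrete, here is a small sketch that expands a value such as the one passed to --virtual-ips into individual addresses, which is handy when you need to enter each address by hand elsewhere (for example on the Azure network interface). The function name expand_vip_range is hypothetical and it assumes the range only varies in the last octet, as in our configuration.

```shell
# Hypothetical helper: expand "a.b.c.START-END" into one address per line.
# Assumes the range only spans the last octet; a single address is echoed back.
expand_vip_range() {
  local range prefix last start end i
  range="$1"
  prefix="${range%.*}"   # everything before the last dot, e.g. 10.73.115
  last="${range##*.}"    # last octet or octet range, e.g. 120-123
  start="${last%-*}"     # 120
  end="${last#*-}"       # 123 (same as start when there is no dash)
  i="$start"
  while [ "$i" -le "$end" ]; do
    printf '%s.%s\n' "$prefix" "$i"
    i=$((i + 1))
  done
}

expand_vip_range 10.73.115.120-123
```

For the range used above this prints the four addresses from 10.73.115.120 through 10.73.115.123, one per line.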
ip addr | grep inet
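If you would rather check for specific addresses than read through the whole list, something along these lines may help. It is only a sketch: vip_present is a hypothetical helper that reads ip addr style output on standard input and looks for one exact IPv4 address.

```shell
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip addr` output supplied on stdin.
# Matching "inet <address>/" keeps 10.73.115.12 from matching 10.73.115.120.
vip_present() {
  local vip
  vip="$1"
  grep -qF "inet ${vip}/"
}

# Example use on the infrastructure node:
#   ip addr | vip_present 10.73.115.121 && echo "10.73.115.121 is configured"
```

The trailing slash in the pattern relies on ip addr printing addresses with a prefix length (for example /32), which is its usual output format.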
In order to reserve the IPs in our Azure virtual network and configure the route table, we need to define the addresses on the network interface of our Azure infrastructure node. We also need to make sure IP forwarding is enabled on that interface.
You can see the ipfailover POD is running in the default project.
Below are the POD's environment variables and the virtual IPs that were configured.
You can jump on a node other than the one running the ipfailover router and make sure you can ping the new IP addresses (10.73.115.120-123).
ping 10.73.115.121
PING 10.73.115.121 (10.73.115.121) 56(84) bytes of data.
64 bytes from 10.73.115.121: icmp_seq=1 ttl=64 time=1.69 ms
64 bytes from 10.73.115.121: icmp_seq=2 ttl=64 time=0.427 ms
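Ping only proves the address is reachable at L3; it says nothing about whether the service port accepts TCP connections. A rough sketch of a port check follows. The helper name check_tcp is hypothetical, it assumes bash and the coreutils timeout command are available, and it only tests the TCP handshake, not the Kafka protocol itself.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened
# within the given number of seconds (default 3).
# Uses bash's /dev/tcp redirection, so bash is invoked explicitly.
check_tcp() {
  local host port wait_secs
  host="$1"
  port="$2"
  wait_secs="${3:-3}"
  timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example, using the external IP and port from the kafka1-ingress service:
#   check_tcp 10.73.115.121 9093 && echo "port is accepting connections"
```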
One thing you may have figured out with this configuration is that if the single infrastructure node running the ipfailover router fails, access to the Kafka Brokers won't be available. We are working on some new strategies; for now we will just run multiple ipfailover routers and split the IPs across nodes.
Hope this helps you understand Apache Kafka a bit and saves you some time when configuring ingress for non-HTTP services on Azure.
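As a rough illustration of splitting the IPs, the sketch below prints (it does not run) one oc adm ipfailover command per router, reusing the flags from the deployment command earlier in this post. The second router name ipf-ha-router2, the way the range is split, and the helper name print_ipfailover_commands are all hypothetical, and it assumes the ipfailover service account created earlier. The --vrrp-id-offset value is our assumption about how to keep multiple ipfailover configurations from colliding, so check it against the documentation for your OCP version.

```shell
# Hypothetical helper: print one `oc adm ipfailover` command for each
# <router-name> <virtual-ip-range> pair passed as arguments.
# Nothing is executed against the cluster; review the output before running it.
print_ipfailover_commands() {
  local name vips offset
  offset=0
  while [ "$#" -ge 2 ]; do
    name="$1"
    vips="$2"
    shift 2
    printf 'oc adm ipfailover %s --replicas=1 --selector="ipfailover=%s" --virtual-ips=%s --watch-port=0 --service-account=ipfailover --vrrp-id-offset=%s --create\n' \
      "$name" "$name" "$vips" "$offset"
    # Assumption: give each router its own block of VRRP IDs.
    offset=$((offset + 10))
  done
}

print_ipfailover_commands ipf-ha-router1 10.73.115.120-121 ipf-ha-router2 10.73.115.122-123
```

Each router's selector label would need to be applied to a different infrastructure node, and the matching addresses defined on that node's Azure network interface.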
Interested in learning more or discussing OpenShift on Azure and Apache Kafka? We would love to hear from you.