Streamer Deployment Task
This tutorial is the third step in our data pipeline deployment. Here, you will deploy the inverter-streamer, a custom application that acts as a data producer. It generates simulated solar inverter data and sends it to your Kafka cluster.
This task is designed for students learning how to deploy custom applications on Kubernetes, especially those that need to connect to other services and pull images from private container registries.
In this tutorial, you will:

- Deploy an application from a private GitHub Container Registry (`ghcr.io`).
- Use an `imagePullSecret` to authorize access to the private registry.
- Configure the application using environment variables that specify the Kafka connection details, topic name, and other operational parameters.
- Verify that the application is running and successfully producing data by inspecting its logs.
Before you start
Make sure that:
- The Zookeeper and Kafka deployments from the previous tasks are running successfully.
- You are SSHed into the `k3smain` node and have switched to the shared user account.
- A pre-configured secret named `ghcr-pull-secret` exists in the cluster to allow access to the private container registry. You do not need to create this secret yourself.
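If you want to confirm these prerequisites before continuing, you can check them from the `k3smain` node. This is an optional sanity check, not part of the task itself; the exact pod names depend on your earlier manifests:

```shell
# List all pods; the Zookeeper and Kafka pods from the previous tasks
# should show STATUS "Running"
kubectl get pods

# Confirm the pre-created registry secret exists in the current namespace
kubectl get secret ghcr-pull-secret
```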
Part 1: Create the Streamer Deployment Manifest
Your goal is to create a Kubernetes deployment file named inverter-streamer.yaml that runs the data streamer application.
Deployment Requirements
The streamer instance must be configured with the following specifications:

- Deployment Name: The deployment must be named `inverter-streamer`.
- Node Affinity: The pod must be scheduled to run on the node named `k3smain`.
- Container Image: Use the `ghcr.io/decsresearch/c2sr-bootcamp-streamer:latest` image.
- Image Pull Secret: Since the image is in a private registry, the deployment must reference the `ghcr-pull-secret` to be able to pull the image.
- Container Command: The container must be explicitly told what command to run: `["python", "main.py"]`.
- Environment Configuration: The following environment variables must be set:
  - `KAFKA_HOST`: The name of the Kafka service (`kafka-service`).
  - `KAFKA_PORT`: The port for the Kafka service (`9092`).
  - `PRODUCTION_INTERVAL`: How often to send data, in seconds (`1`).
  - `PRODUCE_TO`: The name of the Kafka topic to send data to (`nano`).
  - `nano01` through `nano06`: Set the value of each of these to `494654`.
Skeleton File for Deployment
Create a file named `inverter-streamer.yaml` and fill it out according to the requirements above.
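As a reference, the requirements above could be laid out roughly as follows. This is a sketch, not the official solution: the single replica, the `app: inverter-streamer` label (which the verification commands in Part 2 select on), and the use of a `nodeSelector` on the `kubernetes.io/hostname` label to pin the pod to `k3smain` are reasonable choices, but other valid layouts exist (for example, a full `affinity` block):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inverter-streamer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inverter-streamer
  template:
    metadata:
      labels:
        app: inverter-streamer
    spec:
      # Pin the pod to the k3smain node
      nodeSelector:
        kubernetes.io/hostname: k3smain
      # Authorize image pulls from the private ghcr.io registry
      imagePullSecrets:
        - name: ghcr-pull-secret
      containers:
        - name: inverter-streamer
          image: ghcr.io/decsresearch/c2sr-bootcamp-streamer:latest
          command: ["python", "main.py"]
          env:
            - name: KAFKA_HOST
              value: "kafka-service"
            - name: KAFKA_PORT
              value: "9092"
            - name: PRODUCTION_INTERVAL
              value: "1"
            - name: PRODUCE_TO
              value: "nano"
            - name: nano01
              value: "494654"
            - name: nano02
              value: "494654"
            - name: nano03
              value: "494654"
            - name: nano04
              value: "494654"
            - name: nano05
              value: "494654"
            - name: nano06
              value: "494654"
```

Note that environment variable values in a pod spec must be strings, so numeric values such as the port and interval are quoted.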
Part 2: Deployment and Verification
Once you have created the `inverter-streamer.yaml` file, apply it to the cluster and check its logs. The logs matter here: they provide visual confirmation that data is being generated and sent to Kafka.
1. Apply the manifest:
   `kubectl apply -f inverter-streamer.yaml`
2. Verify the deployment:
   `kubectl get pods -l app=inverter-streamer`
3. Check for data production by inspecting the logs to see the streaming data:
   `POD_NAME=$(kubectl get pods -l app=inverter-streamer -o jsonpath='{.items[0].metadata.name}')`
   `kubectl logs -f $POD_NAME`
   (Use `-f` to follow the log stream in real time.)
Once you can see data being successfully produced in the logs, you can move on to deploying the inverter, which will consume this data.