Building a Kubernetes Operator with Pepr
15 minute read
Introduction
This tutorial guides you through building a Kubernetes Operator using Pepr. You’ll create a WebApp Operator that manages custom WebApp resources in your Kubernetes cluster.
If you get stuck at any point, you can reference the complete example code in the Pepr Excellent Examples repository.
What You’ll Build
The WebApp Operator will:
- Deploy a custom
WebApp
resource definition (CRD) - Watch for WebApp instances and reconcile them with the actual cluster state
- For each WebApp instance, manage:
- A
Deployment
with configurable replicas - A
Service
to expose the application - A
ConfigMap
containing configurable HTML with language and theme options
- A
All resources will include ownerReferences
, triggering cascading deletion when a WebApp is removed. The operator will also automatically restore any managed resources that are deleted externally.
Prerequisites
- A Kubernetes cluster (local or remote)
- Access to the
curl
command - Basic understanding of Kubernetes concepts
- Familiarity with TypeScript
- Node.js ≥ 18.0
Tutorial Steps
- Create a new Pepr Module
- Define the WebApp CRD
- Create Helper Functions
- Implement the Reconciler
- Build and Deploy Your Operator
- Test Your Operator
Create a new Pepr Module
[🟢⚪⚪⚪⚪⚪] Step 1 of 6
First, create a new Pepr module for your operator:
npx pepr init \
--name operator \
--uuid my-operator-uuid \
--description "Kubernetes Controller for WebApp Resources" \
--errorBehavior reject \
--confirm &&
cd operator # set working directory as the new pepr module
To track your progress in this tutorial, let’s treat it as a git
repository:
git init && git add --all && git commit -m "npx pepr init"
Create CRD
[🟢🟢⚪⚪⚪⚪] Step 2 of 6
The WebApp Custom Resource Definition (CRD) specifies the structure and validation for your custom resource. Create the necessary directory structure:
mkdir -p capabilities/crd/generated capabilities/crd/source
Generate a class based on the WebApp CRD using kubernetes-fluent-client.
This allows us to react to the CRD fields in a type-safe manner.
Create a CRD named crd.yaml
for the WebApp that includes:
- Theme selection (dark/light)
- Language selection (en/es)
- Configurable replica count
- Status tracking
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/source/crd.yaml \
-o capabilities/crd/source/crd.yaml
Examine the contents of capabilities/crd/source/crd.yaml
.
Note that the status should be listed under subresources
to make it writable.
We provide descriptions for each property to clarify their purpose.
Enums are used to restrict the values that can be assigned to a property.
Create an interface for the CRD spec with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/generated/webapp-v1alpha1.ts \
-o capabilities/crd/generated/webapp-v1alpha1.ts
Examine the contents of capabilities/crd/generated/webapp-v1alpha1.ts
.
Create a TypeScript file that contains the WebApp CRD named webapp.crd.ts
.
This will enable the controller to automatically create the CRD on startup.
Use the command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/source/webapp.crd.ts \
-o capabilities/crd/source/webapp.crd.ts
Take a moment to commit your changes for CRD creation:
git add capabilities/crd/ && git commit -m "Create WebApp CRD"
The webapp.crd.ts
file defines the structure of our CRD, including the validation rules and schema that Kubernetes will use when WebApp resources are created or modified.
Create a file that will automatically register the CRD on startup named capabilities/crd/register.ts
:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/register.ts \
-o capabilities/crd/register.ts
The register.ts
file contains logic to ensure our CRD is created in the cluster when the operator starts up, avoiding the need for manual CRD installation.
Create a file to validate that WebApp instances are in valid namespaces and have a maximum of 7
replicas.
Create a validator.ts
file with the command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/validator.ts \
-o capabilities/crd/validator.ts
The validator.ts
file implements validation logic for our WebApp resources, ensuring they meet our requirements before they’re accepted by the cluster.
In this section, we’ve generated the CRD class for WebApp, created a function to automatically register the CRD, and added a validator to ensure WebApp instances are in valid namespaces and don’t exceed 7 replicas.
Commit your changes for CRD registration & validation:
git add capabilities/crd/ && git commit -m "Create CRD handling logic"
Create Helpers
[🟢🟢🟢⚪⚪⚪] Step 3 of 6
Now, let’s create helper functions that will generate the Kubernetes resources managed by our operator. These helpers will simplify the creation of Deployments, Services, and ConfigMaps for each WebApp instance.
Create a controller
folder in the capabilities
folder and create a generators.ts
file. This file will contain functions that generate Kubernetes objects for the operator to deploy (with the ownerReferences automatically included). Since these resources are owned by the WebApp resource, they will be deleted when the WebApp resource is deleted.
mkdir -p capabilities/controller
Create generators.ts
with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/controller/generators.ts \
-o capabilities/controller/generators.ts
The generators.ts
file contains functions to create all the necessary Kubernetes resources for our WebApp, including properly configured Deployments, Services, and ConfigMaps with appropriate labels and selectors.
Our goal is to simplify WebApp deployment. Instead of requiring users to manage multiple Kubernetes objects, track versions, and handle revisions manually, they can focus solely on the WebApp
instance. The controller reconciles WebApp instances against the actual cluster state to achieve the desired configuration.
The controller deploys a ConfigMap
based on the language and theme specified in the WebApp resource and sets the number of replicas according to the WebApp specification.
Commit your changes for deployment, service, and configmap generation:
git add capabilities/controller/ && git commit -m "Add generators for WebApp deployments, services, and configmaps"
Create Reconciler
[🟢🟢🟢🟢⚪⚪] Step 4 of 6
Now, create the function that reacts to changes in WebApp instances. This function will be called and placed into a queue, guaranteeing ordered and synchronous processing of events, even when the system is under heavy load.
In the base of the capabilities
folder, create a reconciler.ts
file and add the following:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/reconciler.ts \
-o capabilities/reconciler.ts
The reconciler.ts
file contains the core logic of our operator, handling the creation, updating, and deletion of WebApp resources and ensuring the cluster state matches the desired state.
Finally, create the index.ts
file in the capabilities
folder and add the following:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/index.ts \
-o capabilities/index.ts
The index.ts
file contains the WebAppController capability and the functions that are used to watch for changes to the WebApp resource and corresponding Kubernetes resources.
- When a WebApp is created or updated, validate it, store the name of the instance and enqueue it for processing.
- If an “owned” resource (ConfigMap, Service, or Deployment) is deleted, redeploy it.
- Always redeploy the WebApp CRD if it was deleted as the controller depends on it
In this section we created a reconciler.ts
file that contains the function responsible for reconciling the state of WebApp instances with the cluster and updating their status.
The index.ts
file contains the WebAppController capability and functions that watch for changes to WebApp resources and their corresponding Kubernetes objects.
The Reconcile
action processes callbacks in a queue, guaranteeing ordered and synchronous processing of events.
Commit your changes with:
git add capabilities/ && git commit -m "Create reconciler for webapps"
Add capability to Pepr Module
Ensure that the PeprModule in pepr.ts
uses WebAppController
. The implementation should look something like this:
new PeprModule(cfg, [WebAppController]);
Using sed
, replace the contents of the file to use our new WebAppController
:
# Use the WebAppController Module
sed -i '' -e '/new PeprModule(cfg, \[/,/\]);/c\
new PeprModule(cfg, [WebAppController]);' ./pepr.ts
# Update Imports
sed -i '' 's|import { HelloPepr } from "./capabilities/hello-pepr";|import { WebAppController } from "./capabilities";|' ./pepr.ts
Commit your changes now that the WebAppController is part of the Pepr Module:
git add pepr.ts && git commit -m "Register WebAppController with pepr module"
Build and Deploy Your Operator
[🟢🟢🟢🟢🟢⚪] Step 5 of 6
Preparing Your Environment
Create an ephemeral cluster with k3d
.
k3d cluster delete pepr-dev &&
k3d cluster create pepr-dev --k3s-arg '--debug@server:0' --wait &&
kubectl rollout status deployment -n kube-system
What is an ephemeral cluster?
An ephemeral cluster is a temporary Kubernetes cluster that exists only for testing purposes. Tools like Kind (Kubernetes in Docker) and k3d let you quickly create and destroy clusters without affecting your production environments.
Update and Prepare Pepr
Make sure Pepr is updated to the latest version:
npx pepr update --skip-template-update
⚠️ Important Note: Be cautious when updating Pepr in an existing project as it could potentially override custom configurations. The
--skip-template-update
flag helps prevent this.
Building the Operator
Build the Pepr module by running:
npx pepr format &&
npx pepr build
Commit your changes after the build completes:
git add capabilities/ package*.json && git commit -m "Build pepr module"
The build process explained
The pepr build
command performs three critical steps:
- Compile TypeScript: Converts your TypeScript code to JavaScript using the settings in tsconfig.json
- Bundle the Operator: Packages everything into a deployable format using esbuild
- Generate Kubernetes Manifests: Creates all necessary YAML files in the
dist
directory, including:- Custom Resource Definitions (CRDs)
- The controller deployment
- ServiceAccounts and RBAC permissions
- Any other resources needed for your operator
This process creates a self-contained deployment unit that includes everything needed to run your operator in a Kubernetes cluster.
┌─────────────────────┐ ┌──────────────────┐ ┌───────────────────┐
│ │ │ │ │ │
│ Your Pepr Code │────────▶ │ pepr build │────────▶ │ dist/ │
│ (TypeScript) │ │ (Build Process) │ │(Deployment Files) │
│ │ │ │ │ │
└─────────────────────┘ └──────────────────┘ └───────────────────┘
│
│
▼
┌───────────────────┐
│ │
│ Kubernetes │
│ Cluster │
│ │
└───────────────────┘
Deploy to Kubernetes
To deploy your operator to a Kubernetes cluster:
kubectl apply -f dist/pepr-module-my-operator-uuid.yaml &&
kubectl wait --for=condition=Ready pods -l app -n pepr-system --timeout=120s
What’s happening during deployment?
The first command applies all the Kubernetes resources defined in the YAML file, including:
- The WebApp CRD (Custom Resource Definition)
- A Deployment that runs your operator code
- The necessary RBAC permissions for your operator to function
The second command waits for the operator pod to be ready before proceeding. This ensures your operator is running before you attempt to create WebApp resources.
Troubleshooting Deployment Issues
Troubleshooting operator startup problems
If your operator doesn’t start properly, check these common issues:
Check pod logs:
kubectl logs -n pepr-system -l app --tail=100
Verify permissions:
kubectl describe deployment -n pepr-system
Look for permission-related errors in the events section.
Verify the deployment was successful by checking if the CRD has been properly registered:
kubectl get crd | grep webapp
You should see webapps.pepr.io
in the output, which confirms your Custom Resource Definition was created successfully.
Understanding the WebApp Resource
You can use kubectl explain
to see the structure of your custom resource.
It may take a moment for the cluster to recognize this resource before the following command will work:
kubectl explain wa.spec
Expected Output:
GROUP: pepr.io
KIND: WebApp
VERSION: v1alpha1
FIELD: spec <Object>
DESCRIPTION:
<empty>
FIELDS:
language <string> -required-
Language defines the language of the web application, either English (en) or
Spanish (es).
replicas <integer> -required-
Replicas is the number of desired replicas.
theme <string> -required-
Theme defines the theme of the web application, either dark or light.
💡 Note:
wa
is the short form ofwebapp
that kubectl recognizes. This resource structure directly matches the TypeScript interface we defined earlier in our code.
Test Your Operator
[🟢🟢🟢🟢🟢🟢] Step 6 of 6
Understanding reconciliation
Reconciliation is the core concept behind Kubernetes operators. It’s the process of:
- Observing the current state of resources in the cluster
- Comparing it to the desired state (defined in your custom resource)
- Taking actions to align the actual state with the desired state
This continuous loop ensures your application maintains its expected configuration even when disruptions occur.
┌───────────────────┐
│ │
│ Custom Resource │◄────────────┐
│ (WebApp) │ │
│ │ │
└───────┬───────────┘ │
│ │
│ Observe │
▼ │
┌───────────────────┐ ┌────────────────┐
│ │ │ │
│ Pepr Operator │───►│ Reconcile │
│ Controller │ │ (Take Action) │
│ │ │ │
└───────┬───────────┘ └────────────────┘
│ ▲
│ Create/Update │
▼ │
┌───────────────────┐ │
│ Owned Resources │─────────────┘
│ • ConfigMap │ If deleted or
│ • Service │ changed, trigger
│ • Deployment │ reconciliation
└───────────────────┘
Creating a WebApp Instance
Let’s create an instance of our custom WebApp
resource in English with a light theme and 1 replicas:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/webapp-light-en.yaml \
-o webapp-light-en.yaml
Examine the contents of webapp-light-en.yaml
. It defines a WebApp with English language, light theme, and 1 replica.
Next, apply it to the cluster:
kubectl create namespace webapps &&
kubectl apply -f webapp-light-en.yaml
How resource creation works
- Kubernetes API server receives the WebApp resource
- Our operator’s controller (in
index.ts
) detects the new resource via its reconcile function - The controller validates the WebApp using our validator
- The reconcile function creates three “owned” resources:
- A ConfigMap with HTML content based on the theme and language
- A Service to expose the web application
- A Deployment to run the web server pods with the specified number of replicas
- The status is updated to track progress
All this logic is in the code we wrote earlier in the tutorial.
Verifying Resource Creation
Now, verify that the WebApp and its owned resources were created properly:
kubectl get cm,svc,deploy,webapp -n webapps
Expected Output:
NAME DATA AGE
configmap/kube-root-ca.crt 1 6s
configmap/web-content-webapp-light-en 1 5s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/webapp-light-en ClusterIP 10.43.85.1 <none> 80/TCP 5s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/webapp-light-en 1/1 1 1 5s
💡 Tip: Our operator created three resources based on our single WebApp definition - this is the power of operators in action!
Checking WebApp Status
The status field is how our operator communicates the current state of the WebApp:
kubectl get wa webapp-light-en -n webapps -ojsonpath="{.status}" | jq
Expected Output:
{
"observedGeneration": 1,
"phase": "Ready"
}
Understanding status fields
- observedGeneration: A counter that increments each time the resource spec is changed
- phase: The current lifecycle state of the WebApp (“Pending” during creation, “Ready” when all components are operational)
This status information comes from our reconciler code, which updates these fields during each reconciliation cycle.
You can also see events related to your WebApp that provide a timeline of actions taken by the operator:
kubectl describe wa webapp-light-en -n webapps
Expected Output:
Name: webapp-light-en
Namespace: webapps
API Version: pepr.io/v1alpha1
Kind: WebApp
Metadata: ...
Spec:
Language: en
Replicas: 1
Theme: light
Status:
Observed Generation: 1
Phase: Ready
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal InstanceCreatedOrUpdated 36s webapp-light-en Pending
Normal InstanceCreatedOrUpdated 36s webapp-light-en Ready
Viewing Your WebApp
To access your WebApp in a browser, use port-forwarding to connect to the service.
The following command runs the portforward in the background.
Be sure to make note of the Port-forward PID
for later when we tear down the test environment.
kubectl port-forward svc/webapp-light-en -n webapps 3000:80 &
PID=$!
echo "Port-forward PID: $PID"
About port-forwarding
Port-forwarding creates a secure tunnel from your local machine to a pod or service in your Kubernetes cluster. In this case, we’re forwarding your local port 3000 to port 80 of the WebApp service, allowing you to access the application at http://localhost:3000 in your browser.
Now open http://localhost:3000 in your browser or run curl http://localhost:3000
to see the response in a terminal.
The browser should display a light theme web application:
Testing the Reconciliation Loop
A key feature of operators is their ability to automatically repair resources when they’re deleted or changed. Let’s test this by deleting the ConfigMap:
kubectl delete cm -n webapps --all &&
sleep 10 &&
kubectl get cm -n webapps
Expected output:
configmap "kube-root-ca.crt" deleted
configmap "web-content-webapp-light-en" deleted
NAME DATA AGE
kube-root-ca.crt 1 0s
web-content-webapp-light-en 1 0s
Now that we’ve successfully deployed a WebApp, commit your changes:
git add webapp-light-en.yaml && git commit -m "Add WebApp resource for light mode in english"
Behind the scenes of reconciliation
- When you deleted the ConfigMap, Kubernetes sent a DELETE event
- Our operator (in
index.ts
) was watching for these events via theonDeleteOwnedResource
handler - This triggered the reconciliation loop, which detected that the ConfigMap was missing
- The reconciler recreated the ConfigMap based on the WebApp definition
- This all happened automatically without manual intervention - the core benefit of using an operator!
🛠️ Try it yourself: Try deleting the Service or Deployment. What happens? The operator should recreate those too!
Updating the WebApp
Now let’s test changing the WebApp’s specification. Copy down the next WebApp resource with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/webapp-dark-es.yaml \
-o webapp-dark-es.yaml
Compare the contents of webapp-light-en.yaml
and webapp-dark-es.yaml
with the command:
diff --side-by-side \
webapp-light-en.yaml \
webapp-dark-es.yaml
kubectl apply -f webapp-dark-es.yaml
💡 Note: We’ve changed the theme from light to dark and the language from English (en) to Spanish (es).
Your port-forward should still be active, so you can refresh your browser to see the changes. If your porf-forward is no longer active for some reason, create a new one:
# Only needed if previous port-forward closed
kubectl port-forward svc/webapp-light-en -n webapps 3000:80 &
PID=$!
echo "Port-forward PID: $PID"
Now open http://localhost:3000 in your browser or run curl http://localhost:3000
to see the response in a terminal.
The browser should display a dark theme web application:
Now that we’ve successfully updated a WebApp, commit your changes:
git add webapp-dark-es.yaml && git commit -m "Update WebApp resource for dark mode in spanish"
How updating works
- When you apply the changed WebApp, Kubernetes sends an UPDATE event
- Our operator’s controller (in
index.ts
) detects this via theonUpdate
handler - The updated spec is validated and then queued for reconciliation
- The reconciler compares the current resources with what’s needed for the new spec
- It updates the ConfigMap with the new theme and language content
- The Deployment automatically detects the ConfigMap change and restarts the pod with the new content
Cleanup
When you’re done testing, you can delete your WebApp and verify that all owned resources are removed:
kill $PID && # Close port-forward
kubectl delete wa -n webapps --all &&
sleep 5 &&
kubectl get cm,deploy,svc -n webapps
You can also delete the entire test cluster when you’re finished:
k3d cluster delete pepr-dev
Congratulations!
You’ve successfully built a Kubernetes operator using Pepr. Through this tutorial, you:
- Created a custom resource definition (CRD) for WebApps
- Implemented a controller with reconciliation logic
- Added validation for your custom resources
- Deployed your operator to a Kubernetes cluster
- Verified that your operator correctly manages the lifecycle of WebApp resources
This pattern is powerful for creating self-managing applications in Kubernetes. Your operator now handles the complex task of maintaining your application’s state according to your specifications, reducing the need for manual intervention.
What You’ve Learned
By completing this tutorial, you’ve gained experience with several important concepts:
- Custom Resource Definitions (CRDs): You defined a structured, validated schema for your WebApp resources
- Reconciliation: You implemented the core operator pattern that maintains desired state
- Owner References: You used Kubernetes ownership to manage resource lifecycles
- Status Reporting: Your operator provides feedback about resource state through status fields
- Watch Patterns: Your operator reacts to changes in both custom and standard Kubernetes resources
These concepts form the foundation of the operator pattern and can be applied to manage any application or service on Kubernetes.
Next Steps
Now that you understand the basics of building an operator with Pepr, you might want to:
- Add more sophisticated validation logic
- Implement status conditions that provide detailed health information
- Add support for upgrading between versions of your application
- Explore more complex reconciliation patterns for multi-component applications
- Add metrics and monitoring to your operator
For more information, check out: