Pepr

What happened to Pepr’s stars?
In February 2025, an accidental change to the repository’s visibility reset the star count. The visibility issue was quickly resolved, but the stars were unfortunately lost.
Pepr had over 200 stars, demonstrating its recognition and value within the Kubernetes community. We’re working to rebuild that recognition.
If you’ve previously starred Pepr, or if you find it a useful project, we’d be grateful if you re-starred the repository. Thank you for your support! :star:
Type safe Kubernetes middleware for humans

Pepr is on a mission to save Kubernetes from the tyranny of YAML, intimidating glue code, bash scripts, and other makeshift solutions. As a Kubernetes controller, Pepr empowers you to define Kubernetes transformations in TypeScript, without requiring deep software development expertise, thanks to plain-English configurations. Pepr transforms a patchwork of forks, scripts, overlays, and other chaos into a cohesive, well-structured, and maintainable system. With Pepr, you can seamlessly transition IT ops organizational knowledge into code, simplifying documentation, testing, validation, and coordination of changes for a more predictable outcome.
Features
- Zero-config K8s Mutating and Validating Webhooks plus Controller generation
- Automatic leader-elected K8s resource watching
- Lightweight async key-value store backed by K8s for stateful operations with the Pepr Store
- Human-readable fluent API for generating Pepr Capabilities
- A fluent API for creating/modifying/watching and server-side applying K8s resources via Kubernetes Fluent Client
- Generate new K8s resources based on cluster resource changes
- Perform other exec/API calls based on cluster resource changes or any other arbitrary schedule
- Out of the box airgap support with Zarf
- Entire NPM ecosystem available for advanced operations
- Realtime K8s debugging system for testing/reacting to cluster changes
- Controller network isolation and tamper-resistant module execution
- Least-privilege RBAC generation
- AMD64 and ARM64 support
Example Pepr Action
This quick sample shows how to react to a ConfigMap being created or updated in the cluster. It adds a label and annotation to the ConfigMap and adds some data to the ConfigMap. It also creates a Validating Webhook to make sure the “pepr” label still exists. Finally, after the ConfigMap is created, it logs a message to the Pepr controller and creates or updates a separate ConfigMap with the kubernetes-fluent-client using server-side apply. For more details, see the Actions section.
When(a.ConfigMap)
  .IsCreatedOrUpdated()
  .InNamespace("pepr-demo")
  .WithLabel("unicorn", "rainbow")
  // Create a Mutate Action for the ConfigMap
  .Mutate(request => {
    // Add a label and annotation to the ConfigMap
    request.SetLabel("pepr", "was-here").SetAnnotation("pepr.dev", "annotations-work-too");

    // Add some data to the ConfigMap
    request.Raw.data["doug-says"] = "Pepr is awesome!";

    // Log a message to the Pepr controller logs
    Log.info("A 🦄 ConfigMap was created or updated:");
  })
  // Create a Validate Action for the ConfigMap
  .Validate(request => {
    // Validate the ConfigMap has a specific label
    if (request.HasLabel("pepr")) {
      return request.Approve();
    }

    // Reject the ConfigMap if it doesn't have the label
    return request.Deny("ConfigMap must have a unicorn label");
  })
  // Watch behaves like controller-runtime's Manager.Watch()
  .Watch(async (cm, phase) => {
    Log.info(cm, `ConfigMap was ${phase}.`);

    // Apply a ConfigMap using K8s server-side apply (will create or update)
    await K8s(kind.ConfigMap).Apply({
      metadata: {
        name: "pepr-ssa-demo",
        namespace: "pepr-demo-2",
      },
      data: {
        uid: cm.metadata.uid,
      },
    });
  });
Prerequisites
Wow, too many words! tl;dr
# Create a new Pepr Module
npx pepr init
# If you already have a K3d cluster you want to use, skip this step
npm run k3d-setup
# Start playing with Pepr now!
# If using Kind, or another local k8s distro instead,
# run `npx pepr dev --host <your_hostname>`
npx pepr dev
kubectl apply -f capabilities/hello-pepr.samples.yaml
# Be amazed and ⭐️ this repo!
[!TIP]
Don’t use an IP address as your --host; it’s not supported. Check your local k8s distro documentation for how to reach your localhost, which is where pepr dev serves the code from.
Concepts
Module
A module is the top-level collection of capabilities. It is a single, complete TypeScript project that includes an entry point to load all the configuration and capabilities, along with their actions. During the Pepr build process, each module produces a unique Kubernetes MutatingWebhookConfiguration and ValidatingWebhookConfiguration, along with a secret containing the transpiled and compressed TypeScript code. The webhooks and secret are deployed into the Kubernetes cluster with their own isolated controller.
See Module for more details.
Capability
A capability is a set of related actions that work together to achieve a specific transformation or operation on Kubernetes resources. Capabilities are user-defined and can include one or more actions. They are defined within a Pepr module and can be used in both MutatingWebhookConfigurations and ValidatingWebhookConfigurations. A Capability can have a specific scope, such as mutating or validating, and can be reused in multiple Pepr modules.
See Capabilities for more details.
Action
An action is a discrete set of behaviors defined in a single function that acts on a given Kubernetes GroupVersionKind (GVK) passed in from Kubernetes. Actions are the atomic operations that are performed on Kubernetes resources by Pepr.
For example, an action could be responsible for adding a specific label to a Kubernetes resource, or for modifying a specific field in a resource’s metadata. Actions can be grouped together within a Capability to provide a more comprehensive set of operations that can be performed on Kubernetes resources.
There are both Mutate() and Validate() Actions that can be used to modify or validate Kubernetes resources within the admission controller lifecycle. There is also a Watch() Action that can be used to watch for changes to Kubernetes resources that already exist.
See actions for more details.
Logical Pepr Flow
Source Diagram
TypeScript
TypeScript is a strongly typed, object-oriented programming language built on top of JavaScript. It provides optional static typing and a rich type system, allowing developers to write more robust code. TypeScript is transpiled to JavaScript, enabling it to run in any environment that supports JavaScript. Pepr allows you to use JavaScript or TypeScript to write capabilities, but TypeScript is recommended for its type safety and rich type system. You can learn more about TypeScript here.
To join our channel go to Kubernetes Slack and join the #pepr channel.

1 -
User Guide
In this section you can find detailed information about Pepr and how to use it.
Sections
You can find the following information in this section:
1.1 -
Pepr CLI
npx pepr init
Initialize a new Pepr Module.
Options:
- --skip-post-init - Skip npm install, git init, and VSCode launch.
- --confirm - Skip verification prompt when creating a new module.
- --description <string> - Explain the purpose of the new module.
- --name <string> - Set the name of the new module.
- --errorBehavior <audit|ignore|reject> - Set an errorBehavior.
- --uuid [string] - Unique identifier for your module with a max length of 36 characters.
- --crd - Scaffold and generate Kubernetes CRDs from structured TypeScript definitions.
npx pepr update
Update the current Pepr Module to the latest SDK version. This command is not recommended for production use; instead, we recommend Renovate or Dependabot for automated updates.
Options:
- --skip-template-update - Skip updating the template files
npx pepr dev
Connect a local cluster to a local version of the Pepr Controller to do real-time debugging of your module. Note that npx pepr dev assumes a K3d cluster is running by default. If you are working with Kind or another docker-based K8s distro, you will need to pass the --host host.docker.internal option to npx pepr dev. If working with a remote cluster, you will have to give Pepr a host path to your machine that is reachable from the K8s cluster.
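For example, when running against Kind or another docker-based cluster:
npx pepr dev --host host.docker.internal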
NOTE: This command, by necessity, installs resources into the cluster you run it against. Generally, these resources are removed once the pepr dev session ends but there are two notable exceptions:
- the pepr-system namespace, and
- the PeprStore CRD.
These can’t be auto-removed because they’re global in scope & doing so would risk wrecking any other Pepr deployments that are already running in-cluster. If (for some strange reason) you’re not pepr dev-ing against an ephemeral dev cluster and need to keep the cluster clean, you’ll have to remove these hold-overs yourself (or not)!
Options:
- -h, --host [host] - Host to listen on (default: “host.k3d.internal”)
- --confirm - Skip confirmation prompt
npx pepr deploy
Deploy the current module into a Kubernetes cluster, useful for CI systems. Not recommended for production use.
Options:
- -i, --image [image] - Override the image tag
- --confirm - Skip confirmation prompt
- --pullSecret <name> - Deploy imagePullSecret for Controller private registry
- --docker-server <server> - Docker server address
- --docker-username <username> - Docker registry username
- --docker-email <email> - Email for Docker registry
- --docker-password <password> - Password for Docker registry
- --force - Force deploy the module, override manager field
npx pepr monitor
Monitor Validations for a given Pepr Module or all Pepr Modules.
Usage:
npx pepr monitor [options] [module-uuid]
Options:
- -h, --help - Display help for command
npx pepr uuid
Lists the module UUID(s) currently deployed in the cluster along with their descriptions.
Options:
- [uuid] - Specific module UUID
npx pepr build
Create a zarf.yaml and K8s manifest for the current module. This includes everything needed to deploy Pepr and the current module into production environments.
Options:
- --custom-name [name] - Specify a custom name for zarf component and service monitors in helm charts.
- -e, --entry-point [file] - Specify the entry point file to build with. (default: “pepr.ts”)
- -n, --no-embed - Disables embedding of deployment files into output module. Useful when creating library modules intended solely for reuse/distribution via NPM
- -r, --registry-info [<registry>/<username>] - Provide the image registry and username for building and pushing a custom WASM container. Requires authentication. Builds and pushes ‘registry/username/custom-pepr-controller:’.
- -o, --output-dir [output directory] - Define where to place build output
- --timeout [timeout] - How long the API server should wait for a webhook to respond before treating the call as a failure
- --rbac-mode [admin|scoped] - Rbac Mode: admin, scoped (default: admin) (choices: “admin”, “scoped”, default: “admin”)
- -i, --custom-image [custom-image] - Specify a custom image (including version) for Admission and Watch Deployments. Example: ‘docker.io/username/custom-pepr-controller:v1.0.0’
- --registry [GitHub, Iron Bank] - Container registry: Choose container registry for deployment manifests.
- -v, --version <version>. Example: '0.27.3' - DEPRECATED: The version of the Pepr image to use in the deployment manifests.
- --withPullSecret <imagePullSecret> - Image Pull Secret: Use image pull secret for controller Deployment.
- -z, --zarf [manifest|chart] - The Zarf package type to generate: manifest or chart (default: manifest).
npx pepr kfc
Execute a kubernetes-fluent-client command. This command is a wrapper around kubernetes-fluent-client.
Usage:
npx pepr kfc [options] [command]
If you are unsure of what commands are available, you can run npx pepr kfc to see the available commands.
For example, to generate usable types from a Kubernetes CRD, you can run npx pepr kfc crd [source] [directory]. This will generate the types for the [source] CRD and output the generated types to the [directory].
You can learn more about the kubernetes-fluent-client here.
npx pepr crd
Scaffold and generate Kubernetes CRDs from structured TypeScript definitions.
Options:
- -h, --help - Display help for command
Commands:
create [options] Create a new CRD TypeScript definition
generate [options] Generate CRD manifests from TypeScript definitions
help [command] display help for command
npx pepr crd create
Create a new CRD TypeScript definition.
Options:
- --group <group> - API group (e.g. cache)
- --version <version> - API version (e.g. v1alpha1)
- --kind <kind> - Kind name (e.g. Memcached)
- --domain <domain> - Optional domain (e.g. pepr.dev) (default: “pepr.dev”)
- --scope <Namespaced | Cluster> - Whether the resulting custom resource is cluster- or namespace-scoped (default: “Namespaced”)
- --plural <plural> - Plural name (e.g. memcacheds) (default: “”)
- --shortName <shortName> - Short name (e.g. mc) (default: “”)
- -h, --help - Display help for command
npx pepr crd generate
Generate CRD manifests from TypeScript definitions
Options:
- --output <output> - Output directory for generated CRDs (default: “./crds”)
- -h, --help - Display help for command
1.2 -
Pepr SDK
To use, import the sdk from the pepr package:
import { sdk } from "pepr";
containers
Returns a list of all containers in a pod. Accepts the following parameters:
- @param peprValidationRequest The request/pod to get the containers from
- @param containerType The type of container to get
Usage:
Get all containers
const { containers } = sdk;
let result = containers(peprValidationRequest)
Get only the standard containers
const { containers } = sdk;
let result = containers(peprValidationRequest, "containers")
Get only the init containers
const { containers } = sdk;
let result = containers(peprValidationRequest, "initContainers")
Get only the ephemeral containers
const { containers } = sdk;
let result = containers(peprValidationRequest, "ephemeralContainers")
getOwnerRefFrom
Returns the owner reference for a Kubernetes resource as an array. Accepts the following parameters:
- @param kubernetesResource: GenericKind The Kubernetes resource to get the owner reference for
- @param blockOwnerDeletion: boolean If true, AND if the owner has the “foregroundDeletion” finalizer, then the owner cannot be deleted from the key-value store until this reference is removed.
- @param controller: boolean If true, this reference points to the managing controller.
Usage:
const { getOwnerRefFrom } = sdk;
const ownerRef = getOwnerRefFrom(kubernetesResource);
writeEvent
Write a K8s event for a CRD. Accepts the following parameters:
- @param kubernetesResource: GenericKind The Kubernetes resource to write the event for
- @param event The event to write, should contain a human-readable message for the event
- @param options Configuration options for the event.
- eventType: string – The type of event to write, for example “Warning”
- eventReason: string – The reason for the event, for example “ReconciliationFailed”
- reportingComponent: string – The component that is reporting the event, for example “uds.dev/operator”
- reportingInstance: string – The instance of the component that is reporting the event, for example process.env.HOSTNAME
Usage:
const { writeEvent } = sdk;
const event = { message: "Resource was created." };
writeEvent(kubernetesResource, event, {
eventType: "Info",
eventReason: "ReconciliationSuccess",
reportingComponent: "uds.dev/operator",
reportingInstance: process.env.HOSTNAME,
});
sanitizeResourceName
Returns a sanitized resource name to make the given name a valid Kubernetes resource name. Accepts the following parameter:
- @param resourceName The name of the resource to sanitize
Usage:
const { sanitizeResourceName } = sdk;
const sanitizedResourceName = sanitizeResourceName(resourceName)
See Also
Looking for information on the Pepr mutate helpers? See Mutate Helpers for information on those.
1.3 -
Pepr Modules
What is a Pepr Module?
A Pepr Module is a collection of capabilities, config, and scaffolding in a Pepr Project. To create a module, use the npx pepr init command.
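Conceptually, the module’s entry point wires your capabilities into Pepr. A minimal sketch of what that entry point (pepr.ts) might look like is below; the exact file generated by npx pepr init may differ by version:
import { PeprModule } from "pepr";
// cfg loads the module configuration from package.json
import cfg from "./package.json";
import { HelloPepr } from "./capabilities/hello-pepr";

// Register the module's capabilities with Pepr
new PeprModule(cfg, [HelloPepr]);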
1.4 -
Actions
Overview
An action is a discrete set of behaviors defined in a single function that acts on a given Kubernetes GroupVersionKind (GVK) passed in during the admission controller lifecycle. Actions are the atomic operations that are performed on Kubernetes resources by Pepr.
For example, an action could be responsible for adding a specific label to a Kubernetes resource, or for modifying a specific field in a resource’s metadata. Actions can be grouped together within a Capability to provide a more comprehensive set of operations that can be performed on Kubernetes resources.
Actions are Mutate(), Validate(), Watch(), Reconcile(), and Finalize(). Both Mutate and Validate actions run during the admission controller lifecycle, while Watch and Reconcile actions run in a separate controller that tracks changes to resources, including existing resources; the Finalize action spans both admission and what happens afterward.
Let’s look at some example actions that are included in the HelloPepr capability that is created for you when you run npx pepr init:
In this first example, Pepr is adding a label and annotation to a ConfigMap with the name example-1 when it is created. Comments are added to each line to explain in more detail what is happening.
// When(a.<Kind>) filters which GroupVersionKind (GVK) this action should act on.
When(a.ConfigMap)
  // This limits the action to only act on new resources.
  .IsCreated()
  // This limits the action to only act on resources with the name "example-1".
  .WithName("example-1")
  // Mutate() is where we define the actual behavior of this action.
  .Mutate(request => {
    // The request object is a wrapper around the K8s resource that Pepr is acting on.
    request
      // Here we are adding a label to the ConfigMap.
      .SetLabel("pepr", "was-here")
      // And here we are adding an annotation.
      .SetAnnotation("pepr.dev", "annotations-work-too");

    // Note that we are not returning anything here. This is because Pepr is tracking the changes in each action automatically.
  });
In this example, a Validate action rejects any ConfigMap in the pepr-demo namespace that has no data.
When(a.ConfigMap)
  .IsCreated()
  .InNamespace("pepr-demo")
  // Validate() is where we define the actual behavior of this action.
  .Validate(request => {
    // If data exists, approve the request.
    if (request.Raw.data) {
      return request.Approve();
    }

    // Otherwise, reject the request with a message and optional code.
    return request.Deny("ConfigMap must have data");
  });
In this example, a Watch action monitors the name and phase of any ConfigMap. Watch actions run in a separate controller that tracks changes to resources, including existing resources, so that you can react to changes in real time. It is important to note that Watch actions are not run during the admission controller lifecycle, so they cannot be used to modify or validate resources. They also may run multiple times for the same resource, so it is important to make sure that your Watch actions are idempotent. In a future release, Pepr will provide a better way to control when a Watch action is run to avoid this issue.
When(a.ConfigMap)
  // Watch() is where we define the actual behavior of this action.
  .Watch((cm, phase) => {
    Log.info(cm, `ConfigMap ${cm.metadata.name} was ${phase}`);
  });
There are many more examples in the HelloPepr capability that you can use as a reference when creating your own actions. Note that each time you run npx pepr update, Pepr will automatically update the HelloPepr capability with the latest examples and best practices for you to reference and test directly in your Pepr Module.
In some scenarios involving Kubernetes Resource Controllers or Operator patterns, opting for a Reconcile action could be more fitting. Comparable to the Watch functionality, Reconcile is responsible for monitoring the name and phase of any Kubernetes Object. It operates within the Watch controller dedicated to observing modifications to resources, including those already existing, enabling responses to alterations as they occur. Unlike Watch, however, Reconcile employs a Queue to sequentially handle events once they are returned by the Kubernetes API. This allows the operator to handle bursts of events without overwhelming the system or the Kubernetes API. It provides a mechanism to back off when the system is under heavy load, enhancing overall stability and maintaining the state consistency of Kubernetes resources, as the order of operations can impact the final state of a resource.
When(WebApp)
  .IsCreatedOrUpdated()
  .Validate(validator)
  .Reconcile(async instance => {
    const { namespace, name, generation } = instance.metadata;

    if (!instance.metadata?.namespace) {
      Log.error(instance, `Invalid WebApp definition`);
      return;
    }

    const isPending = instance.status?.phase === Phase.Pending;
    const isCurrentGeneration = generation === instance.status?.observedGeneration;

    if (isPending || isCurrentGeneration) {
      Log.debug(instance, `Skipping pending or completed instance`);
      return;
    }

    Log.debug(instance, `Processing instance ${namespace}/${name}`);

    try {
      // Set Status to pending
      await updateStatus(instance, { phase: Phase.Pending });

      // Deploy Deployment, ConfigMap, Service, ServiceAccount, and RBAC based on instance
      await Deploy(instance);

      // Set Status to ready
      await updateStatus(instance, {
        phase: Phase.Ready,
        observedGeneration: instance.metadata.generation,
      });
    } catch (e) {
      Log.error(e, `Error configuring for ${namespace}/${name}`);
      // Set Status to failed
      void updateStatus(instance, {
        phase: Phase.Failed,
        observedGeneration: instance.metadata.generation,
      });
    }
  });
1.4.1 -
Mutate
Mutating admission webhooks are invoked first and can modify objects sent to the API server to enforce custom defaults. After an object is sent to Pepr’s Mutating Admission Webhook, Pepr will annotate the object to indicate the status.
After a successful mutation of an object in a module with UUID static-test, and capability name hello-pepr, expect to see this annotation: static-test.pepr.dev/hello-pepr: succeeded.
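For example, the mutated object’s metadata might look like the following (the module UUID and capability name will match your own module):
metadata:
  annotations:
    static-test.pepr.dev/hello-pepr: succeeded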
Helpers
SetLabel
SetLabel is used to set a label on a Kubernetes object as part of a Pepr Mutate action.
For example, to add a label when a ConfigMap is created:
When(a.ConfigMap)
  .IsCreated()
  .Mutate(request => {
    request
      // Here we are adding a label to the ConfigMap.
      .SetLabel("pepr", "was-here");

    // Note that we are not returning anything here. This is because Pepr is tracking the changes in each action automatically.
  });
RemoveLabel
RemoveLabel is used to remove a label on a Kubernetes object as part of a Pepr Mutate action.
For example, to remove a label when a ConfigMap is updated:
When(a.ConfigMap)
  .IsUpdated()
  .Mutate(request => {
    request
      // Here we are removing a label from the ConfigMap.
      .RemoveLabel("remove-me");

    // Note that we are not returning anything here. This is because Pepr is tracking the changes in each action automatically.
  });
SetAnnotation
SetAnnotation is used to set an annotation on a Kubernetes object as part of a Pepr Mutate action.
For example, to add an annotation when a ConfigMap is created:
When(a.ConfigMap)
  .IsCreated()
  .Mutate(request => {
    request
      // Here we are adding an annotation to the ConfigMap.
      .SetAnnotation("pepr.dev", "annotations-work-too");

    // Note that we are not returning anything here. This is because Pepr is tracking the changes in each action automatically.
  });
RemoveAnnotation
RemoveAnnotation is used to remove an annotation on a Kubernetes object as part of a Pepr Mutate action.
For example, to remove an annotation when a ConfigMap is updated:
When(a.ConfigMap)
  .IsUpdated()
  .Mutate(request => {
    request
      // Here we are removing an annotation from the ConfigMap.
      .RemoveAnnotation("remove-me");

    // Note that we are not returning anything here. This is because Pepr is tracking the changes in each action automatically.
  });
See Also
Looking for some more generic helpers? Check out the Module Author SDK for information on other things that Pepr can help with.
1.4.2 -
Validate
After the Mutation phase comes the Validation phase where the validating admission webhooks are invoked and can reject requests to enforce custom policies.
Validate does not annotate the objects that are allowed into the cluster, but the validation webhook can be audited with npx pepr monitor. Read the monitoring docs for more information.
Basic Validation
Validation actions can either approve or deny requests:
When(a.ConfigMap)
  .IsCreated()
  .Validate(request => {
    if (request.HasAnnotation("evil")) {
      return request.Deny("No evil CM annotations allowed.", 400);
    }

    return request.Approve();
  });
Validation with Warnings
Pepr supports including warning messages in both approval and denial responses. Warnings provide a way to communicate important information to users without necessarily blocking their requests.
Approving with Warnings
When(a.ConfigMap)
  .IsCreatedOrUpdated()
  .InNamespace("pepr-demo")
  .Validate(request => {
    const warnings = [];

    // Check for deprecated fields
    if (request.Raw.data && request.Raw.data["deprecated-field"]) {
      warnings.push("Warning: The 'deprecated-field' is being used and will be removed in future versions");
    }

    // Check for missing app label
    if (!request.HasLabel("app")) {
      warnings.push("Warning: Best practice is to include an 'app' label for resource identification");
    }

    // Return approval with warnings if any were generated
    return request.Approve(warnings.length > 0 ? warnings : undefined);
  });
Denying with Warnings
When(a.ConfigMap)
  .IsCreatedOrUpdated()
  .InNamespace("pepr-demo")
  .Validate(request => {
    // Check for dangerous settings
    if (request.Raw.data && request.Raw.data["dangerous-setting"] === "true") {
      const warnings = [
        "Warning: The 'dangerous-setting' field is set to 'true'",
        "Consider using a safer configuration option"
      ];

      return request.Deny(
        "ConfigMap contains dangerous settings that are not allowed",
        422,
        warnings
      );
    }

    return request.Approve();
  });
Warnings will be included in the Kubernetes API response and can be displayed to users by kubectl and other Kubernetes clients, providing helpful feedback while still enforcing policies.
1.4.3 -
Reconcile
Reconcile functions the same as Watch but is tailored for building Kubernetes Controllers and Operators because it processes callback operations in a Queue, guaranteeing ordered and synchronous processing of events, even when the system may be under heavy load.
Ordering can be configured to operate in one of two ways: as a single queue that maintains ordering of operations across all resources of a kind (default) or with separate processing queues per resource instance.
See Configuring Reconcile for more on configuring how Reconcile behaves.
When(WebApp)
  .IsCreatedOrUpdated()
  .Validate(validator) // Validate the shape of the resource
  .Reconcile(async instance => {
    try {
      Store.setItem(instance.metadata.name, JSON.stringify(instance));
      await reconciler(instance); // Reconcile the resource - Deploy Kubernetes manifests, etc.
    } catch (error) {
      Log.info(`Error reconciling instance of WebApp`);
    }
  });

export async function validator(req: PeprValidateRequest<WebApp>) {
  const ns = req.Raw.metadata?.namespace ?? "_unknown_";

  if (req.Raw.spec.replicas > 7) {
    return req.Deny("max replicas is 7 to prevent noisy neighbors");
  }

  if (invalidNamespaces.includes(ns)) {
    if (req.Raw.metadata?.namespace === "default") {
      return req.Deny("default namespace is not allowed");
    }
    return req.Deny("invalid namespace");
  }

  return req.Approve();
}
1.4.4 -
Watch
Kubernetes supports efficient change notifications on resources via watches. Pepr uses the Watch action for monitoring resources that previously existed in the cluster and for performing long-running asynchronous events upon receiving change notifications on resources, as watches are not limited by timeouts.
When(a.Namespace)
  .IsCreated()
  .WithName("pepr-demo-2")
  .Watch(async ns => {
    Log.info("Namespace pepr-demo-2 was created.");

    try {
      // Apply the ConfigMap using K8s server-side apply
      await K8s(kind.ConfigMap).Apply({
        metadata: {
          name: "pepr-ssa-demo",
          namespace: "pepr-demo-2",
        },
        data: {
          "ns-uid": ns.metadata.uid,
        },
      });
    } catch (error) {
      // You can use the Log object to log messages to the Pepr controller pod
      Log.error(error, "Failed to apply ConfigMap using server-side apply.");
    }

    // You can share data between actions using the Store, including between different types of actions
    Store.setItem("watch-data", "This data was stored by a Watch Action.");
  });
1.4.5 -
Finalize
A specialized combination of Pepr’s Mutate & Watch functionalities that allows a module author to run logic while Kubernetes is finalizing a resource (i.e., cleaning up related resources after a deletion request has been accepted). Finalize() can only be accessed after a Watch() or Reconcile().
This method will:
- Inject a finalizer into the metadata.finalizers field of the requested resource during the mutation phase of the admission.
- Watch appropriate resource lifecycle events & invoke the given callback.
- Remove the injected finalizer from the metadata.finalizers field of the requested resource.
When(a.ConfigMap)
  .IsCreated()
  .InNamespace("hello-pepr-finalize-create")
  .WithName("cm-reconcile-create")
  .Reconcile(function reconcileCreate(cm) {
    Log.info(cm, "external api call (create): reconcile/callback");
  })
  .Finalize(function finalizeCreate(cm) {
    Log.info(cm, "external api call (create): reconcile/finalize");
  });
1.4.6 -
Using Alias Child Logger in Actions
You can use the Alias function to include a user-defined alias in the logs for Mutate, Validate, and Watch actions. This can make for easier debugging since your user-defined alias will be included in the action’s logs. This is especially useful when you have multiple actions of the same type in a single module.
The below example uses Mutate, Validate, and Watch actions with the Alias function:
When(a.Pod)
  .IsCreatedOrUpdated()
  .Alias("mutate")
  .Mutate((po, logger) => {
    logger.info(`alias: mutate ${po.Raw.metadata.name}`);
  });

When(a.Pod)
  .IsCreatedOrUpdated()
  .Alias("validate")
  .Validate((po, logger) => {
    logger.info(`alias: validate ${po.Raw.metadata.name}`);
    return po.Approve();
  });

When(a.Pod)
  .IsCreatedOrUpdated()
  .Alias("watch")
  .Watch((po, _, logger) => {
    logger.info(`alias: watch ${po.metadata.name}`);
  });

When(a.Pod)
  .IsCreatedOrUpdated()
  .Alias("reconcile")
  .Reconcile((po, _, logger) => {
    logger.info(`alias: reconcile ${po.metadata.name}`);
  });
This will result in log entries that include the alias when creating a Pod:
Logs for Mutate When Pod red is Created:
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","namespace":"pepr-demo","name":"/red","gvk":{"group":"","version":"v1","kind":"Pod"},"operation":"CREATE","admissionKind":"Mutate","msg":"Incoming request"}
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","namespace":"pepr-demo","name":"/red","msg":"Processing request"}
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","msg":"Executing mutation action with alias: mutate"}
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","alias":"mutate","msg":"alias: mutate red"}
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","namespace":"pepr-demo","name":"hello-pepr","msg":"Mutation action succeeded (mutateCallback)"}
{"level":30,"time":1726632368808,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","namespace":"pepr-demo","name":"/red","res":{"uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","allowed":true,"patchType":"JSONPatch","patch":"W3sib3AiOiJhZGQiLCJwYXRoIjoiL21ldGFkYXRhL2Fubm90YXRpb25zL3N0YXRpYy10ZXN0LnBlcHIuZGV2fjFoZWxsby1wZXByIiwidmFsdWUiOiJzdWNjZWVkZWQifV0="},"msg":"Check response"}
{"level":30,"time":1726632368809,"pid":16,"hostname":"pepr-static-test-6786948977-6hbnt","uid":"b2221631-e87c-41a2-94c8-cdaef15e7b5f","method":"POST","url":"/mutate/c1a7fb6e3f2ab9dc08909d2de4166987520f317d53b759ab882dfd0b1c198479?timeout=10s","status":200,"duration":"1 ms"}
Logs for Validate When Pod red is Created:
{"level":30,"time":1726631437605,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","namespace":"pepr-demo","name":"/red","gvk":{"group":"","version":"v1","kind":"Pod"},"operation":"CREATE","admissionKind":"Validate","msg":"Incoming request"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","namespace":"pepr-demo","name":"/red","msg":"Processing validation request"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","namespace":"pepr-demo","name":"hello-pepr","msg":"Processing validation action (validateCallback)"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","msg":"Executing validate action with alias: validate"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","alias":"validate","msg":"alias: validate red"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","namespace":"pepr-demo","name":"hello-pepr","msg":"Validation action complete (validateCallback): allowed"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","namespace":"pepr-demo","name":"/red","res":{"uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","allowed":true},"msg":"Check response"}
{"level":30,"time":1726631437606,"pid":16,"hostname":"pepr-static-test-6786948977-j7f9h","uid":"731eff93-d457-4ffc-a98c-0bcbe4c1727a","method":"POST","url":"/validate/c1a7fb6e3f2ab9dc08909d2de4166987520f317d53b759ab882dfd0b1c198479?timeout=10s","status":200,"duration":"5 ms"}
Logs for Watch and Reconcile When Pod red is Created:
{"level":30,"time":1726798504518,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing reconcile action with alias: reconcile"}
{"level":30,"time":1726798504518,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"reconcile","msg":"alias: reconcile red"}
{"level":30,"time":1726798504518,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing watch action with alias: watch"}
{"level":30,"time":1726798504518,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"watch","msg":"alias: watch red"}
{"level":30,"time":1726798504521,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing reconcile action with alias: reconcile"}
{"level":30,"time":1726798504521,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"reconcile","msg":"alias: reconcile red"}
{"level":30,"time":1726798504521,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing watch action with alias: watch"}
{"level":30,"time":1726798504521,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"watch","msg":"alias: watch red"}
{"level":30,"time":1726798504528,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing reconcile action with alias: reconcile"}
{"level":30,"time":1726798504528,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"reconcile","msg":"alias: reconcile red"}
{"level":30,"time":1726798504528,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing watch action with alias: watch"}
{"level":30,"time":1726798504528,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"watch","msg":"alias: watch red"}
{"level":30,"time":1726798510464,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing watch action with alias: watch"}
{"level":30,"time":1726798510464,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"watch","msg":"alias: watch red"}
{"level":30,"time":1726798510466,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","msg":"Executing reconcile action with alias: reconcile"}
{"level":30,"time":1726798510466,"pid":16,"hostname":"pepr-static-test-watcher-6dc69654c9-5ql6b","alias":"reconcile","msg":"alias: reconcile red"}
Note: The Alias function is optional and can be used to provide additional context in the logs. You must pass the logger object as shown above to the action to use the Alias function.
See Also
Looking for some more generic helpers? Check out the Module Author SDK for information on other things that Pepr can help with.
1.5 -
Pepr Capabilities
A capability is a set of related actions that work together to achieve a specific transformation or operation on Kubernetes resources. Capabilities are user-defined and can include one or more actions. They are defined within a Pepr module and can be used in both MutatingWebhookConfigurations and ValidatingWebhookConfigurations. A Capability can have a specific scope, such as mutating or validating, and can be reused in multiple Pepr modules.
When you run npx pepr init, a capabilities directory is created for you. This directory is where you will define your capabilities. You can create as many capabilities as you need, and each capability can contain one or more actions. Pepr also automatically creates a HelloPepr capability with a number of example actions to help you get started.
Creating a Capability
Defining a new capability can be done via a VSCode Snippet generated during npx pepr init.
Create a new file in the capabilities directory with the name of your capability. For example, capabilities/my-capability.ts.
Open the new file in VSCode and type create in the file. A suggestion should prompt you to generate the content from there.
If you prefer not to use VSCode, you can also modify or copy the HelloPepr capability to meet your needs instead.
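Whichever route you take, a hand-written capability file is just a Capability instance plus the actions hung off its When function. A minimal sketch (names are illustrative) might look like this:
import { Capability, a } from "pepr";

export const MyCapability = new Capability({
  name: "my-capability",
  description: "Example capability that labels new ConfigMaps",
});

// Pull out the When function to define actions for this capability
const { When } = MyCapability;

When(a.ConfigMap)
  .IsCreated()
  .Mutate(request => {
    // Tag every new ConfigMap so we can see the capability ran
    request.SetLabel("managed-by", "my-capability");
  });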
Reusable Capabilities
Pepr has an NPM org managed by Defense Unicorns, @pepr, where capabilities are published for reuse in other Pepr Modules. You can find a list of published capabilities here.
You also can publish your own Pepr capabilities to NPM and import them. A couple of things you’ll want to be aware of when publishing your own capabilities:
Reusable capability versions should use the format 0.x.x or 0.12.x as examples to determine compatibility with other reusable capabilities. Before 1.x.x, we recommend binding to 0.x.x if you can for maximum compatibility.
pepr.ts will still be used for local development, but you’ll also need to publish an index.ts that exports your capabilities. When you build & publish the capability to NPM, you can use npx pepr build -e index.ts to generate the code needed for reuse by other Pepr modules.
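That index.ts can be as small as a re-export of the capabilities you want to share; for example (file and capability names are illustrative):
export { MyCapability } from "./capabilities/my-capability";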
1.6 -
Pepr Store
A Lightweight Key-Value Store for Pepr Modules
The nature of admission controllers and general watch operations (the Mutate, Validate and Watch actions in Pepr) makes some types of complex and long-running operations difficult. There are also times when you need to share data between different actions. While you could manually create your own K8s resources and manage their cleanup, this can be very hard to track and keep performant at scale.
The Pepr Store solves this by exposing a simple, Web Storage API-compatible mechanism for use within capabilities. Additionally, as Pepr runs multiple replicas of the admission controller along with a watch controller, the Pepr Store provides a unique way to share data between these different instances automatically.
Each Pepr Capability has a Store instance that can be used to get, set and delete data as well as subscribe to any changes to the Store. Behind the scenes, all capability store instances in a single Pepr Module are stored within a single CRD in the cluster. This CRD is automatically created when the Pepr Module is deployed. Care is taken to make the read and write operations as efficient as possible by using K8s watches, batch processing and patch operations for writes.
Key Features
- Asynchronous Key-Value Store: Provides an asynchronous interface for storing small amounts of data, making it ideal for sharing information between various actions and capabilities.
- Web Storage API Compatibility: The store’s API is aligned with the standard Web Storage API, simplifying the learning curve.
- Real-time Updates: The .subscribe() and onReady() methods enable real-time updates, allowing you to react to changes in the data store instantaneously.
- Automatic CRD Management: Each Pepr Module has its data stored within a single Custom Resource Definition (CRD) that is automatically created upon deployment.
- Efficient Operations: Pepr Store uses Kubernetes watches, batch processing, and patch operations to make read and write operations as efficient as possible.
Quick Start
// Example usage for Pepr Store
Store.setItem("example-1", "was-here");
Store.setItem("example-1-data", JSON.stringify(request.Raw.data));
Store.onReady(data => {
  Log.info(data, "Pepr Store Ready");
});
const unsubscribe = Store.subscribe(data => {
  Log.info(data, "Pepr Store Updated");
  unsubscribe();
});
API Reference
Methods
- getItem(key: string): Retrieves a value by its key. Returns null if the key doesn’t exist.
- setItem(key: string, value: string): Sets a value for a given key. Creates a new key-value pair if the key doesn’t exist.
- setItemAndWait(key: string, value: string): Sets a value for a given key. Creates a new key-value pair if the key doesn’t exist. Resolves a promise when the new key and value show up in the store. Note - Async operations in Mutate and Validate are susceptible to timeouts.
- removeItem(key: string): Deletes a key-value pair by its key.
- removeItemAndWait(key: string): Deletes a key-value pair by its key and resolves a promise when the key and value do not show up in the store. Note - Async operations in Mutate and Validate are susceptible to timeouts.
- clear(): Clears all key-value pairs from the store.
- subscribe(listener: DataReceiver): Subscribes to store updates.
- onReady(callback: DataReceiver): Executes a callback when the store is ready.
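As a sketch of how these methods fit together inside an action (assuming When, a, Store, and Log are in scope from your capability, as in the earlier examples):
When(a.ConfigMap)
  .IsCreated()
  .Watch(async () => {
    // Save a value and read it back; getItem returns null for missing keys
    Store.setItem("greeting", "hello");
    const greeting = Store.getItem("greeting");
    Log.info(`greeting=${greeting}`);

    // Wait until the write is visible in the store (safe in a Watch, which has no admission timeout)
    await Store.setItemAndWait("ready-flag", "true");

    // Remove a single key when it is no longer needed
    Store.removeItem("greeting");
  });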
1.7 -
Custom Resources
Importing Custom Resources
The Kubernetes Fluent Client supports the creation of TypeScript typings directly from Kubernetes Custom Resource Definitions (CRDs). The files it generates can be directly incorporated into Pepr capabilities and provide a way to work with strongly-typed CRDs.
For example (below), Istio CRDs can be imported and used as though they were intrinsic Kubernetes resources.
Generating TypeScript Types from CRDs
Using the kubernetes-fluent-client to produce a new type looks like this:
npx kubernetes-fluent-client crd [source] [directory]
The crd command expects a [source], which can be a URL or local file containing the CustomResourceDefinition(s), and a [directory] where the generated code will live.
The following example creates types for the Istio CRDs:
user@workstation$ npx kubernetes-fluent-client crd https://raw.githubusercontent.com/istio/istio/master/manifests/charts/base/crds/crd-all.gen.yaml crds
Attempting to load https://raw.githubusercontent.com/istio/istio/master/manifests/charts/base/crds/crd-all.gen.yaml as a URL
- Generating extensions.istio.io/v1alpha1 types for WasmPlugin
- Generating networking.istio.io/v1alpha3 types for DestinationRule
- Generating networking.istio.io/v1beta1 types for DestinationRule
- Generating networking.istio.io/v1alpha3 types for EnvoyFilter
- Generating networking.istio.io/v1alpha3 types for Gateway
- Generating networking.istio.io/v1beta1 types for Gateway
- Generating networking.istio.io/v1beta1 types for ProxyConfig
- Generating networking.istio.io/v1alpha3 types for ServiceEntry
- Generating networking.istio.io/v1beta1 types for ServiceEntry
- Generating networking.istio.io/v1alpha3 types for Sidecar
- Generating networking.istio.io/v1beta1 types for Sidecar
- Generating networking.istio.io/v1alpha3 types for VirtualService
- Generating networking.istio.io/v1beta1 types for VirtualService
- Generating networking.istio.io/v1alpha3 types for WorkloadEntry
- Generating networking.istio.io/v1beta1 types for WorkloadEntry
- Generating networking.istio.io/v1alpha3 types for WorkloadGroup
- Generating networking.istio.io/v1beta1 types for WorkloadGroup
- Generating security.istio.io/v1 types for AuthorizationPolicy
- Generating security.istio.io/v1beta1 types for AuthorizationPolicy
- Generating security.istio.io/v1beta1 types for PeerAuthentication
- Generating security.istio.io/v1 types for RequestAuthentication
- Generating security.istio.io/v1beta1 types for RequestAuthentication
- Generating telemetry.istio.io/v1alpha1 types for Telemetry
✅ Generated 23 files in the istio directory
Observe that the kubernetes-fluent-client has produced the TypeScript types within the crds directory. These types can now be utilized in the Pepr module.
user@workstation$ cat crds/proxyconfig-v1beta1.ts
// This file is auto-generated by kubernetes-fluent-client, do not edit manually
import { GenericKind, RegisterKind } from "kubernetes-fluent-client";
export class ProxyConfig extends GenericKind {
  /**
   * Provides configuration for individual workloads. See more details at:
   * https://istio.io/docs/reference/config/networking/proxy-config.html
   */
  spec?: Spec;
  status?: { [key: string]: any };
}

/**
 * Provides configuration for individual workloads. See more details at:
 * https://istio.io/docs/reference/config/networking/proxy-config.html
 */
export interface Spec {
  /**
   * The number of worker threads to run.
   */
  concurrency?: number;
  /**
   * Additional environment variables for the proxy.
   */
  environmentVariables?: { [key: string]: string };
  /**
   * Specifies the details of the proxy image.
   */
  image?: Image;
  /**
   * Optional.
   */
  selector?: Selector;
}

/**
 * Specifies the details of the proxy image.
 */
export interface Image {
  /**
   * The image type of the image.
   */
  imageType?: string;
}

/**
 * Optional.
 */
export interface Selector {
  /**
   * One or more labels that indicate a specific set of pods/VMs on which a policy should be
   * applied.
   */
  matchLabels?: { [key: string]: string };
}

RegisterKind(ProxyConfig, {
  group: "networking.istio.io",
  version: "v1beta1",
  kind: "ProxyConfig",
});
Using new types
The generated types can be imported into Pepr directly; no additional logic is needed to make them work.
import { Capability, K8s, Log, a, kind } from "pepr";

import { Gateway } from "../crds/gateway-v1beta1";
import {
  PurpleDestination,
  VirtualService,
} from "../crds/virtualservice-v1beta1";

export const IstioVirtualService = new Capability({
  name: "istio-virtual-service",
  description: "Generate Istio VirtualService resources",
});

// Use the 'When' function to create a new action
const { When, Store } = IstioVirtualService;

// Define the configuration keys
enum config {
  Gateway = "uds/istio-gateway",
  Host = "uds/istio-host",
  Port = "uds/istio-port",
  Domain = "uds/istio-domain",
}

// Define the valid gateway names
const validGateway = ["admin", "tenant", "passthrough"];

// Watch Gateways to get the HTTPS domain for each gateway
When(Gateway)
  .IsCreatedOrUpdated()
  .WithLabel(config.Domain)
  .Watch(vs => {
    // Store the domain for the gateway
    Store.setItem(vs.metadata.name, vs.metadata.labels[config.Domain]);
  });
1.8 -
OnSchedule
The OnSchedule feature allows you to schedule and automate the execution of specific code at predefined intervals or schedules. This feature is designed to simplify recurring tasks and can serve as an alternative to traditional CronJobs. This code is designed to be run at the top level on a Capability, not within a function like When.
Best Practices
OnSchedule is designed for targeting intervals equal to or larger than 30 seconds due to the storage mechanism used to archive schedule info.
Usage
Create a recurring task execution by calling the OnSchedule function with the following parameters:
- name - The unique name of the schedule.
- every - An integer that represents the frequency of the schedule in number of units.
- unit - A string specifying the time unit for the schedule (e.g., seconds, minute, minutes, hour, hours).
- startTime - (Optional) A UTC timestamp indicating when the schedule should start. All date times must be provided in GMT. If not specified, the schedule will start when the schedule store reports ready.
- run - A function that contains the code you want to execute on the defined schedule.
- completions - (Optional) An integer indicating the maximum number of times the schedule should run to completion. If not specified, the schedule will run indefinitely.
Examples
Update a ConfigMap every 30 seconds:
OnSchedule({
  name: "hello-interval",
  every: 30,
  unit: "seconds",
  run: async () => {
    Log.info("Wait 30 seconds and create/update a ConfigMap");

    try {
      await K8s(kind.ConfigMap).Apply({
        metadata: {
          name: "last-updated",
          namespace: "default",
        },
        data: {
          count: `${new Date()}`,
        },
      });
    } catch (error) {
      Log.error(error, "Failed to apply ConfigMap using server-side apply.");
    }
  },
});
Refresh an AWSToken every 24 hours, with a delayed start of 30 seconds, running a total of 3 times:
OnSchedule({
  name: "refresh-aws-token",
  every: 24,
  unit: "hours",
  startTime: new Date(new Date().getTime() + 1000 * 30),
  run: async () => {
    await RefreshAWSToken();
  },
  completions: 3,
});
Advantages
- Simplifies scheduling recurring tasks without the need for complex CronJob configurations.
- Provides flexibility to define schedules in a human-readable format.
- Allows you to execute code with precision at specified intervals.
- Supports limiting the number of schedule completions for finite tasks.
1.9 -
RBAC Modes
During the build phase of Pepr (npx pepr build --rbac-mode [admin|scoped]), you have the option to specify the desired RBAC mode through specific flags. This allows fine-tuning the level of access granted based on requirements and preferences.
Modes
admin
npx pepr build --rbac-mode admin
Description: The service account is given cluster-admin permissions, granting it full, unrestricted access across the entire cluster. This can be useful for administrative tasks where broad permissions are necessary. However, use this mode with caution, as it can pose security risks if misused. This is the default mode.
scoped
npx pepr build --rbac-mode scoped
Description: The service account is provided just enough permissions to perform its required tasks, and no more. This mode is recommended for most use cases as it limits potential attack vectors and aligns with best practices in security. The admission controller’s primary mutating or validating action doesn’t require a ClusterRole (as the request is not persisted or executed while passing through admission control). However, if you have a use case where the admission controller’s logic involves reading other Kubernetes resources or taking additional actions beyond just validating, mutating, or watching the incoming request, appropriate RBAC settings should be reflected in the ClusterRole. See how in Updating the ClusterRole.
Debugging RBAC Issues
If encountering unexpected behaviors in Pepr while running in scoped mode, check to see if they are related to RBAC.
- Check Deployment logs for RBAC errors:
kubectl logs -n pepr-system -l app | jq
# example output
{
  "level": 50,
  "time": 1697983053758,
  "pid": 16,
  "hostname": "pepr-static-test-watcher-745d65857d-pndg7",
  "data": {
    "kind": "Status",
    "apiVersion": "v1",
    "metadata": {},
    "status": "Failure",
    "message": "configmaps \"pepr-ssa-demo\" is forbidden: User \"system:serviceaccount:pepr-system:pepr-static-test\" cannot patch resource \"configmaps\" in API group \"\" in the namespace \"pepr-demo-2\"",
    "reason": "Forbidden",
    "details": {
      "name": "pepr-ssa-demo",
      "kind": "configmaps"
    },
    "code": 403
  },
  "ok": false,
  "status": 403,
  "statusText": "Forbidden",
  "msg": "Dooes the ServiceAccount permissions to CREATE and PATCH this ConfigMap?"
}
- Verify ServiceAccount Permissions with kubectl auth can-i
SA=$(kubectl get deploy -n pepr-system -o=jsonpath='{range .items[0]}{.spec.template.spec.serviceAccountName}{"\n"}{end}')
# Can i create configmaps as the service account in pepr-demo-2?
kubectl auth can-i create cm --as=system:serviceaccount:pepr-system:$SA -n pepr-demo-2
# example output: no
- Describe the ClusterRole
SA=$(kubectl get deploy -n pepr-system -o=jsonpath='{range .items[0]}{.spec.template.spec.serviceAccountName}{"\n"}{end}')
kubectl describe clusterrole $SA
# example output:
Name: pepr-static-test
Labels: <none>
Annotations: <none>
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
peprstores.pepr.dev [] [] [create delete get list patch update watch]
configmaps [] [] [watch]
namespaces [] [] [watch]
Updating the ClusterRole
As discussed in the Modes section, the admission controller’s primary mutating or validating action doesn’t require a ClusterRole (as the request is not persisted or executed while passing through admission control). However, if you have a use case where the admission controller’s logic involves reading other Kubernetes resources or taking additional actions beyond just validating, mutating, or watching the incoming request, appropriate RBAC settings should be reflected in the ClusterRole.
Step 1: Figure out the desired permissions. (kubectl create clusterrole --help is a good place to start figuring out the syntax)
kubectl create clusterrole configMapApplier --verb=create,patch --resource=configmap --dry-run=client -oyaml
# example output
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: configMapApplier
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - create
  - patch
Step 2: Update the ClusterRole in the dist folder.
...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pepr-static-test
rules:
  - apiGroups:
      - pepr.dev
    resources:
      - peprstores
    verbs:
      - create
      - get
      - patch
      - watch
  - apiGroups:
      - ''
    resources:
      - namespaces
    verbs:
      - watch
  - apiGroups:
      - ''
    resources:
      - configmaps
    verbs:
      - watch
      - create # New
      - patch # New
...
Step 3: Apply the updated configuration
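For example, assuming the default build output layout (the file name is illustrative; it will include your module’s UUID):
kubectl apply -f dist/pepr-module-<uuid>.yaml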
1.10 -
Metrics Endpoints
The /metrics endpoint provides metrics for the application that are collected via the MetricsCollector class. It uses the prom-client library and performance hooks from Node.js to gather and expose the metrics data in a format that can be scraped by Prometheus.
Metrics Exposed
The MetricsCollector exposes the following metrics:
- pepr_errors: A counter that increments when an error event occurs in the application.
- pepr_alerts: A counter that increments when an alert event is triggered in the application.
- pepr_mutate: A summary that provides the observed durations of mutation events in the application.
- pepr_mutate_timeouts: A counter that increments when a webhook timeout occurs during mutation.
- pepr_validate: A summary that provides the observed durations of validation events in the application.
- pepr_validate_timeouts: A counter that increments when a webhook timeout occurs during validation.
- pepr_cache_miss: A gauge that provides the number of cache misses per window.
- pepr_resync_failure_count: A gauge that provides the number of unsuccessful attempts at receiving an event within the last seen event limit before re-establishing a new connection.
Environment Variables
| Environment Variable | Description | Default |
| --- | --- | --- |
| PEPR_MAX_CACHE_MISS_WINDOWS | Maximum number of windows to emit pepr_cache_miss metrics for | Undefined |
API Details
Method: GET
URL: /metrics
Response Type: text/plain
Status Codes:
- 200 OK: On success, returns the current metrics from the application.
Response Body:
The response body is a plain text representation of the metrics data, according to the Prometheus exposition formats. It includes the metrics mentioned above.
Examples
Request
Response
# HELP pepr_errors Mutation/Validate errors encountered
# TYPE pepr_errors counter
pepr_errors 5
# HELP pepr_alerts Mutation/Validate bad api token received
# TYPE pepr_alerts counter
pepr_alerts 10
# HELP pepr_mutate Mutation operation summary
# TYPE pepr_mutate summary
pepr_mutate{quantile="0.01"} 100.60707900021225
pepr_mutate{quantile="0.05"} 100.60707900021225
pepr_mutate{quantile="0.5"} 100.60707900021225
pepr_mutate{quantile="0.9"} 100.60707900021225
pepr_mutate{quantile="0.95"} 100.60707900021225
pepr_mutate{quantile="0.99"} 100.60707900021225
pepr_mutate{quantile="0.999"} 100.60707900021225
pepr_mutate_sum 100.60707900021225
pepr_mutate_count 1
# HELP pepr_validate Validation operation summary
# TYPE pepr_validate summary
pepr_validate{quantile="0.01"} 201.19413900002837
pepr_validate{quantile="0.05"} 201.19413900002837
pepr_validate{quantile="0.5"} 201.2137690000236
pepr_validate{quantile="0.9"} 201.23339900001884
pepr_validate{quantile="0.95"} 201.23339900001884
pepr_validate{quantile="0.99"} 201.23339900001884
pepr_validate{quantile="0.999"} 201.23339900001884
pepr_validate_sum 402.4275380000472
pepr_validate_count 2
# HELP pepr_cache_miss Number of cache misses per window
# TYPE pepr_cache_miss gauge
pepr_cache_miss{window="2024-07-25T11:54:33.897Z"} 18
pepr_cache_miss{window="2024-07-25T12:24:34.592Z"} 0
pepr_cache_miss{window="2024-07-25T13:14:33.450Z"} 22
pepr_cache_miss{window="2024-07-25T13:44:34.234Z"} 19
pepr_cache_miss{window="2024-07-25T14:14:34.961Z"} 0
# HELP pepr_resync_failure_count Number of retries per count
# TYPE pepr_resync_failure_count gauge
pepr_resync_failure_count{count="0"} 5
pepr_resync_failure_count{count="1"} 4
Prometheus Operator
If using the Prometheus Operator, the following ServiceMonitor
example manifests can be used to scrape the /metrics
endpoint for the admission
and watcher
controllers.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: admission
spec:
selector:
matchLabels:
pepr.dev/controller: admission
namespaceSelector:
matchNames:
- pepr-system
endpoints:
- targetPort: 3000
scheme: https
tlsConfig:
insecureSkipVerify: true
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: watcher
spec:
selector:
matchLabels:
pepr.dev/controller: watcher
namespaceSelector:
matchNames:
- pepr-system
endpoints:
- targetPort: 3000
scheme: https
tlsConfig:
insecureSkipVerify: true
1.11 -
WASM Support
Pepr fully supports WebAssembly. Depending on the language used to generate the WASM, certain files can be too large to fit into a Secret
or ConfigMap
. Due to this limitation, users have the ability to incorporate *.wasm
and any other essential files during the build phase, which are then embedded into the Pepr Controller container. This is achieved by adding an array of files to the includedFiles
section under pepr
in the package.json
.
NOTE - In order to instantiate the WebAssembly module in TypeScript, you need the WebAssembly type. This is accomplished by adding “DOM” to the lib array in the compilerOptions section of the tsconfig.json. Ex: "lib": ["ES2022", "DOM"]. Be aware that adding the DOM lib will add a lot of extra types to your project, which can degrade your developer experience in terms of IntelliSense.
High-Level Overview
WASM support is achieved by adding files as layers atop the Pepr controller image; these files can then be read by the individual capabilities. The key components of WASM support are:
- Add files to the base of the Pepr module.
- Reference the files in the includedFiles section of the pepr block of the package.json.
- Run npx pepr build with the -r option specifying registry info. Ex: npx pepr build -r docker.io/cmwylie19
- Pepr builds and pushes a custom image that is used in the Deployment.
Using WASM Support
Creating a WASM Module in Go
Create a simple Go function that you want to call from your Pepr module
package main
import (
"fmt"
"syscall/js"
)
func concats(this js.Value, args []js.Value) interface{} {
fmt.Println("PeprWASM!")
stringOne := args[0].String()
stringTwo := args[1].String()
return fmt.Sprintf("%s%s", stringOne, stringTwo)
}
func main() {
done := make(chan struct{}, 0)
js.Global().Set("concats", js.FuncOf(concats))
<-done
}
Compile it to a wasm target and move it to your Pepr module
GOOS=js GOARCH=wasm go build -o main.wasm
cp main.wasm $YOUR_PEPR_MODULE/
Copy the wasm_exec.js
from GOROOT
to your Pepr Module
cp "$(go env GOROOT)/misc/wasm/wasm_exec.js" $YOUR_PEPR_MODULE/
Update the polyfill to add globalThis.crypto
in the wasm_exec.js
since we are not running in the browser. This is needed directly under: (() => {
// Initialize the polyfill
if (typeof globalThis.crypto === 'undefined') {
globalThis.crypto = {
getRandomValues: (array) => {
for (let i = 0; i < array.length; i++) {
array[i] = Math.floor(Math.random() * 256);
}
},
};
}
After adding the files to the root of the Pepr module, reference those files in the package.json
:
{
"name": "pepr-test-module",
"version": "0.0.1",
"description": "A test module for Pepr",
"keywords": [
"pepr",
"k8s",
"policy-engine",
"pepr-module",
"security"
],
"engines": {
"node": ">=18.0.0"
},
"pepr": {
"name": "pepr-test-module",
"uuid": "static-test",
"onError": "ignore",
"alwaysIgnore": {
"namespaces": [],
"labels": []
},
"includedFiles":[
"main.wasm",
"wasm_exec.js"
]
},
...
}
Update the tsconfig.json
to add “DOM” to the compilerOptions
lib:
{
"compilerOptions": {
"allowSyntheticDefaultImports": true,
"declaration": true,
"declarationMap": true,
"emitDeclarationOnly": true,
"esModuleInterop": true,
"lib": [
"ES2022",
"DOM" // <- Add this
],
"module": "CommonJS",
"moduleResolution": "node",
"outDir": "dist",
"resolveJsonModule": true,
"rootDir": ".",
"strict": false,
"target": "ES2022",
"useUnknownInCatchVariables": false
},
"include": [
"**/*.ts"
]
}
Call WASM functions from TypeScript
Import the wasm_exec.js
in the pepr.ts
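One way to do this (a sketch; it registers the Go constructor on globalThis, and depending on your tsconfig you may need to allow importing .js files) is a side-effect import at the top of pepr.ts:
import "./wasm_exec.js";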
Create a helper function to load the wasm file in a capability and call it during an event of your choice
import { readFileSync } from "fs";

async function callWASM(a: string, b: string) {
  const go = new globalThis.Go();
  // Read the wasm binary that was shipped with the module via includedFiles
  const wasmData = readFileSync("main.wasm");
  let concated: string;
  await WebAssembly.instantiate(wasmData, go.importObject).then(wasmModule => {
    go.run(wasmModule.instance);
    // Call the Go function registered on the global scope by main.wasm
    concated = global.concats(a, b);
  });
  return concated;
}
When(a.Pod)
.IsCreated()
.Mutate(async pod => {
try {
let label_value = await callWASM("loves","wasm")
pod.SetLabel("pepr",label_value)
}
catch(err) {
Log.error(err);
}
});
Run Pepr Build
Build your Pepr module with the registry specified.
npx pepr build -r docker.io/defenseunicorns
1.12 -
Customization
This document outlines how to customize the build output through Helm overrides and package.json
configurations.
Redact Store Values from Logs
By default, the store values are displayed in logs. To redact them, you can set the PEPR_STORE_REDACT_VALUES environment variable to true in the package.json file or directly on the Watcher or Admission Deployment. The default value is undefined.
{
"env": {
"PEPR_STORE_REDACT_VALUES": "true"
}
}
Display Node Warnings
You can display warnings in the logs by setting the PEPR_NODE_WARNINGS
environment variable to true
in the package.json
file or directly on the Watcher or Admission Deployment
. The default value is undefined
.
{
"env": {
"PEPR_NODE_WARNINGS": "true"
}
}
The log format can be customized by setting the PINO_TIME_STAMP
environment variable in the package.json
file or directly on the Watcher or Admission Deployment
. The default value is a partial JSON timestamp string representation of the time. If set to iso
, the timestamp is displayed in an ISO format.
Caution: attempting to format time in-process will significantly impact logging performance.
{
"env": {
"PINO_TIME_STAMP": "iso"
}
}
With ISO:
{"level":30,"time":"2024-05-14T14:26:03.788Z","pid":16,"hostname":"pepr-static-test-7f4d54b6cc-9lxm6","method":"GET","url":"/healthz","status":200,"duration":"1 ms"}
Default (without):
{"level":30,"time":"1715696764106","pid":16,"hostname":"pepr-static-test-watcher-559d94447f-xkq2h","method":"GET","url":"/healthz","status":200,"duration":"1 ms"}
Customizing Watch Configuration
The Watch configuration is the part of the Pepr module that allows you to watch for specific resources in the Kubernetes cluster. The Watch configuration can be customized through specific environment variables on the Watcher Deployment, which can be set in the package.json or in the Helm values.yaml file.
| Field | Description | Example Values |
| --- | --- | --- |
| PEPR_RESYNC_FAILURE_MAX | The maximum number of times to fail on a resync interval before re-establishing the watch URL and doing a relist. | default: "5" |
| PEPR_RETRY_DELAY_SECONDS | The delay between retries in seconds. | default: "10" |
| PEPR_LAST_SEEN_LIMIT_SECONDS | Max seconds to go without receiving a watch event before re-establishing the watch. | default: "300" (5 mins) |
| PEPR_RELIST_INTERVAL_SECONDS | Number of seconds to wait before a relist of the watched resources. | default: "600" (10 mins) |
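These variables can be set like any other controller environment variable, for example via the env field of the pepr block in package.json (a minimal sketch; the values shown are the defaults):
{
  "pepr": {
    "env": {
      "PEPR_RESYNC_FAILURE_MAX": "5",
      "PEPR_RETRY_DELAY_SECONDS": "10",
      "PEPR_LAST_SEEN_LIMIT_SECONDS": "300",
      "PEPR_RELIST_INTERVAL_SECONDS": "600"
    }
  }
}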
Configuring Reconcile
The Reconcile Action allows you to maintain ordering of resource updates processed by a Pepr controller. The Reconcile configuration can be customized via an environment variable on the Watcher Deployment, which can be set in the package.json or in the Helm values.yaml file.
| Field | Description | Example Values |
| --- | --- | --- |
| PEPR_RECONCILE_STRATEGY | How Pepr should order resource updates being Reconcile()’d. | default: "kind" |

| Available Options | Description |
| --- | --- |
| kind | separate queues of events for Reconcile()’d resources of a kind |
| kindNs | separate queues of events for Reconcile()’d resources of a kind, within a namespace |
| kindNsName | separate queues of events for Reconcile()’d resources of a kind, within a namespace, per name |
| global | a single queue of events for all Reconcile()’d resources |
Customizing with Helm
Below are the available Helm override configurations after you have built your Pepr module that you can put in the values.yaml
.
Helm Overrides Table
| Parameter | Description | Example Values |
| --- | --- | --- |
| additionalIgnoredNamespaces | Namespaces to ignore in addition to alwaysIgnore.namespaces from the Pepr config in package.json. | - pepr-playground |
| secrets.apiToken | Kube API-Server Token. | Buffer.from(apiToken).toString("base64") |
| hash | Unique hash for deployment. Do not change. | <your_hash> |
| namespace.annotations | Namespace annotations | {} |
| namespace.labels | Namespace labels | {"pepr.dev": ""} |
| uuid | Unique identifier for the module | hub-operator |
| admission.* | Admission controller configurations | Various, see subparameters below |
| watcher.* | Watcher configurations | Various, see subparameters below |
Admission and Watcher Subparameters
| Subparameter | Description |
| --- | --- |
| failurePolicy | Webhook failure policy [Ignore, Fail] |
| webhookTimeout | Timeout seconds for webhooks [1 - 30] |
| env | Container environment variables |
| image | Container image |
| annotations | Deployment annotations |
| labels | Deployment labels |
| securityContext | Pod security context |
| readinessProbe | Pod readiness probe definition |
| livenessProbe | Pod liveness probe definition |
| resources | Resource limits |
| containerSecurityContext | Container’s security context |
| nodeSelector | Node selection constraints |
| tolerations | Tolerations to taints |
| affinity | Node scheduling options |
| terminationGracePeriodSeconds | Optional duration in seconds the pod needs to terminate gracefully |
Note: Replace *
within admission.*
or watcher.*
to apply settings specific to the desired subparameter (e.g. admission.failurePolicy
).
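For example, a minimal values.yaml sketch combining a few of these overrides (the env layout as a list of name/value pairs is an assumption; adjust to the values file generated for your module):
admission:
  failurePolicy: Fail
  webhookTimeout: 15
watcher:
  env:
    - name: PEPR_RECONCILE_STRATEGY
      value: kindNs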
Customizing with package.json
Below are the available configurations through package.json
.
package.json Configurations Table
| Field | Description | Example Values |
| --- | --- | --- |
| uuid | Unique identifier for the module | hub-operator |
| onError | Behavior of the webhook failure policy | audit, ignore, reject |
| webhookTimeout | Webhook timeout in seconds | 1 - 30 |
| customLabels | Custom labels for namespaces | {namespace: {}} |
| alwaysIgnore | Conditions to always ignore | {namespaces: []} |
| includedFiles | For working with WebAssembly | ["main.wasm", "wasm_exec.js"] |
| env | Environment variables for the container | {LOG_LEVEL: "warn"} |
| rbac | Custom RBAC rules (requires building with rbacMode: scoped) | {"rbac": [{"apiGroups": ["<apiGroups>"], "resources": ["<resources>"], "verbs": ["<verbs>"]}]} |
| rbacMode | Configures the module to build its RBAC binding with the principle of least privilege | scoped, admin |
uuid: An identifier for the module in the pepr-system namespace. If not provided, a UUID will be generated. It can be any Kubernetes-acceptable name that is under 36 characters.
These tables provide a comprehensive overview of the fields available for customization within the Helm overrides and the package.json
file. Modify these according to your deployment requirements.
Example Custom RBAC Rules
The following example demonstrates how to add custom RBAC rules to the Pepr module.
{
"pepr": {
"rbac": [
{
"apiGroups": ["pepr.dev"],
"resources": ["customresources"],
"verbs": ["get", "list"]
},
{
"apiGroups": ["apps"],
"resources": ["deployments"],
"verbs": ["create", "delete"]
}
]
}
}
1.13 -
Pepr Filters
Filters are functions that take an AdmissionReview or Watch event and return a boolean. They are used to filter out resources that do not meet certain criteria, so that only resources relevant to the user-defined admission or watch process are handled.
When(a.ConfigMap)
// This limits the action to only act on new resources.
.IsCreated()
// Namespace filter
.InNamespace("webapp")
// Name filter
.WithName("example-1")
// Label filter
.WithLabel("app", "webapp")
.WithLabel("env", "prod")
.Mutate(request => {
request
.SetLabel("pepr", "was-here")
.SetAnnotation("pepr.dev", "annotations-work-too");
});
Filters
- .WithName("name"): Filters resources by name.
- .WithNameRegex(/^pepr/): Filters resources by name using a regex.
- .InNamespace("namespace"): Filters resources by namespace.
- .InNamespaceRegex(/(.*)-system/): Filters resources by namespace using a regex.
- .WithLabel("key", "value"): Filters resources by label. (Can be multiple.)
- .WithDeletionTimestamp(): Filters resources that have a deletion timestamp.
Notes:
- WithDeletionTimestamp() does not work on Delete through the Mutate or Validate methods because the Kubernetes Admission Process does not fire the DELETE event with a deletion timestamp on the resource.
- WithDeletionTimestamp() will match on an Update event during Admission (Mutate or Validate) when pending-deletion permitted changes (like removing a finalizer) occur.
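For instance, a sketch of pairing WithDeletionTimestamp() with a Watch action, where deletion timestamps are visible (the IsUpdated event filter is assumed here):
When(a.Pod)
  .IsUpdated()
  .WithDeletionTimestamp()
  .Watch(pod => {
    // Only fires for Pods that carry a deletionTimestamp
    Log.info(`Pod ${pod.metadata?.name} is pending deletion`);
  });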
2 -
Pepr Tutorials
In this section, we provide tutorials for using Pepr. The tutorials are covered in the subsections that follow.
2.1 -
Tutorial - Create a Pepr Module
Introduction
This tutorial will walk you through the process of creating a Pepr module.
Each Pepr Module is its own TypeScript project, produced by npx pepr init. Typically a module is maintained by a unique group or system. For example, a module for internal Zarf mutations would be different from a module for Big Bang. An important idea with modules is that they are wholly independent of one another. This means that two different modules can be on completely different versions of Pepr and any other dependencies; their only interaction is through the standard K8s interfaces like any other webhook or controller.
Prerequisites
Steps
Create the module:
Use npx pepr init
to generate a new module.
Quickly validate system setup:
Every new module includes a sample Pepr Capability called HelloPepr
. By default,
this capability is deployed and monitoring the pepr-demo
namespace. There is a sample
yaml also included you can use to see Pepr in your cluster. Here’s the quick steps to do
that after npx pepr init
:
# cd to the newly-created Pepr module folder
cd my-module-name
# If you don't already have a local K8s cluster, you can set one up with k3d
npm run k3d-setup
# Launch pepr dev mode
# If using another local K8s distro instead of k3d, use `npx pepr dev --host host.docker.internal`
npx pepr dev
# From another terminal, apply the sample yaml
kubectl apply -f capabilities/hello-pepr.samples.yaml
# Verify the configmaps were transformed using kubectl, k9s or another tool
Create your custom Pepr Capabilities
Now that you have confirmed Pepr is working, you can create new capabilities. You’ll also want to disable the HelloPepr capability in your module (pepr.ts) before pushing to production. You can disable it by commenting out or deleting the HelloPepr variable below:
new PeprModule(cfg, [
// Remove or comment the line below to disable the HelloPepr capability
HelloPepr,
// Your additional capabilities go here
]);
Note: if you also delete the capabilities/hello-pepr.ts file, it will be added again on the next npx pepr update so you have the latest example usages from the Pepr SDK. Therefore, it is sufficient to remove the entry from your pepr.ts module config.
Build and deploy the Pepr Module
Most of the time, you’ll likely be iterating on a module with npx pepr dev for real-time feedback and validation. Once you are ready to move beyond the local dev environment, Pepr provides deployment and build tools you can use.
npx pepr deploy
- you can use this command to build your module and deploy it into any K8s cluster your current kubecontext
has access to. This setup is ideal for CI systems during testing, but is not recommended for production use. See npx pepr deploy
for more info.
By default, when you run npx pepr init
, the module is not configured with any additional options. Currently, there are 3 options you can configure:
deferStart
- if set to true
, the module will not start automatically. You will need to call start()
manually. This is useful if you want to do some additional setup before the module controller starts. You can also use this to change the default port that the controller listens on.
beforeHook
- an optional callback that will be called before every request is processed. This is useful if you want to do some additional logging or validation before the request is processed.
afterHook
- an optional callback that will be called after every request is processed. This is useful if you want to do some additional logging or validation after the request is processed.
You can configure each of these by modifying the pepr.ts
file in your module. Here’s an example of how you would configure each of these options:
const module = new PeprModule(
cfg,
[
// Your capabilities go here
],
{
deferStart: true,
beforeHook: req => {
// Any actions you want to perform before the request is processed, including modifying the request.
},
afterHook: res => {
// Any actions you want to perform after the request is processed, including modifying the response.
},
}
);
// Do any additional setup before starting the controller
module.start();
Summary
Check out some examples of Pepr modules in the excellent examples repo. If you have questions after that, please reach out to us on Slack or GitHub Issues.
2.2 -
Tutorial - Create a Pepr Dashboard
Introduction
This tutorial will walk you through the process of creating a dashboard to display your Pepr metrics. This dashboard will present data such as the number of validation requests processed, the number of mutation requests that were allowed, the number of errors that were processed, the number of alerts that were processed, the status of the Pepr pods, and the scrape duration of the Pepr pods. This dashboard will be created using Grafana. The dashboard will display data from Prometheus, which is a monitoring system that Pepr uses to collect metrics.
This tutorial is not intended for production; instead, it shows how to quickly scrape Pepr metrics. The Kube Prometheus Stack provides a starting point for a more production-suitable way of deploying Prometheus.
An example of what the dashboard will look like is shown below:

Note: The dashboard shown above is an example of what the dashboard will look like. The dashboard will be populated with data from your Pepr instance.
Steps
Step 1. Get Cluster Running With Your Pepr Module Deployed
You can learn more about how to create a Pepr module and deploy it in the Create a Pepr Module tutorial. The short version is:
#Create your cluster
k3d cluster create
#Create your module
npx pepr init
#Change directory to your module that was created using `npx pepr init`
npx pepr dev
kubectl apply -f capabilities/hello-pepr.samples.yaml
#Deploy your module to the cluster
npx pepr deploy
Step 2: Create and Apply Our Pepr Dashboard to the Cluster
Create a new file called grafana-dashboard.yaml and add the following content:
apiVersion: v1
kind: ConfigMap
metadata:
name: pepr-dashboard
namespace: default
data:
pepr-dashboard.json: |
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "Prometheus",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__elements": {},
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "9.1.6"
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "stat",
"name": "Stat",
"version": ""
},
{
"type": "panel",
"id": "timeseries",
"name": "Time series",
"version": ""
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"liveNow": false,
"panels": [
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 18,
"panels": [],
"title": "Pepr Status",
"type": "row"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"description": "Pepr pod status by pod",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [
{
"options": {
"0": {
"color": "red",
"index": 1,
"text": "Down"
},
"1": {
"color": "green",
"index": 0,
"text": "Up"
}
},
"type": "value"
},
{
"options": {
"match": "empty",
"result": {
"color": "blue",
"index": 2,
"text": "?"
}
},
"type": "special"
}
],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 1
},
"id": 14,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "",
"values": false
},
"text": {
"titleSize": 16,
"valueSize": 70
},
"textMode": "auto"
},
"pluginVersion": "9.1.6",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "up{container=\"server\"}",
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "up{container=\"watcher\"}",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Pepr Status",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 1
},
"id": 12,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "scrape_duration_seconds{container=\"server\"}",
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "scrape_duration_seconds{container=\"watcher\"}",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Scrape Duration Seconds",
"type": "timeseries"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 9
},
"id": 6,
"panels": [],
"title": "Error, Alert, Validate and Mutate Counts",
"type": "row"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "dark-red",
"mode": "fixed"
},
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 0,
"y": 10
},
"id": 16,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "",
"values": false
},
"text": {
"titleSize": 16,
"valueSize": 70
},
"textMode": "auto"
},
"pluginVersion": "9.1.6",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "count by(instance) (rate(pepr_errors{container=\"server\"}[$__rate_interval]))",
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "count by(instance) (rate(pepr_errors{container=\"watcher\"}[$__rate_interval]))",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Pepr: Error Count",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"description": "Count of Pepr Alerts by pod",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "dark-yellow",
"mode": "fixed"
},
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 6,
"y": 10
},
"id": 10,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "",
"values": false
},
"text": {
"titleSize": 16,
"valueSize": 70
},
"textMode": "auto"
},
"pluginVersion": "9.1.6",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "pepr_alerts{container=\"server\"}",
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "pepr_alerts{container=\"watcher\"}",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Pepr: Alert Count",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"description": "Count of Pepr Validate actions by pod",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "dark-purple",
"mode": "fixed"
},
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 12,
"y": 10
},
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "",
"values": false
},
"text": {
"titleSize": 16,
"valueSize": 66
},
"textMode": "auto"
},
"pluginVersion": "9.1.6",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"exemplar": false,
"expr": "pepr_validate_count{container=\"server\"}",
"instant": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "pepr_validate_sum{container=\"watcher\"}",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Pepr: Validate Count",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"description": "Count of Pepr mutate actions applied by pod.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "dark-blue",
"mode": "fixed"
},
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 18,
"y": 10
},
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "",
"values": false
},
"text": {
"titleSize": 16,
"valueSize": 70
},
"textMode": "value_and_name"
},
"pluginVersion": "9.1.6",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "pepr_mutate_count{container=\"server\"}",
"legendFormat": "{{instance}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "builder",
"expr": "rate(pepr_mutate_count{container=\"watcher\"}[24h])",
"hide": false,
"legendFormat": "{{instance}}",
"range": true,
"refId": "B"
}
],
"title": "Pepr: Mutate Count",
"type": "stat"
}
],
"schemaVersion": 37,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Pepr Dashboard",
"uid": "j7BjgMpIk",
"version": 17,
"weekStart": ""
}
Now, apply the grafana-dashboard.yaml file to the cluster:
kubectl apply -f grafana-dashboard.yaml
Step 3: Install Prometheus and Grafana using the kube-prometheus-stack Helm Chart
First, create a values.yaml file to add our endpoints to Prometheus and allow us to see our dashboard.
prometheus:
enabled: true
additionalServiceMonitors:
- name: admission
selector:
matchLabels:
pepr.dev/controller: admission
namespaceSelector:
matchNames:
- pepr-system
endpoints:
- targetPort: 3000
scheme: https
tlsConfig:
insecureSkipVerify: true
- name: watcher
selector:
matchLabels:
pepr.dev/controller: watcher
namespaceSelector:
matchNames:
- pepr-system
endpoints:
- targetPort: 3000
scheme: https
tlsConfig:
insecureSkipVerify: true
additionalClusterRoleBindings:
- name: scrape-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: scrape-resources
subjects:
- kind: ServiceAccount
name: prometheus-operator
namespace: default
grafana:
enabled: true
adminUser: admin
adminPassword: secret
defaultDashboardsTimezone: browser
extraVolumeMounts:
- mountPath: /var/lib/grafana/dashboards
name: pepr-dashboard
extraVolumes:
- name: pepr-dashboard
configMap:
name: pepr-dashboard
dashboardProviders:
dashboardproviders.yaml:
apiVersion: 1
providers:
- name: 'default'
isDefault: true
orgId: 1
folder: ''
type: file
disableDeletion: false
editable: true
options:
path: /var/lib/grafana/dashboards/default
dashboardsConfigMaps:
default: pepr-dashboard
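If the prometheus-community chart repository hasn’t been added to your local Helm client yet, add it first (assuming the standard upstream repository URL):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update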
Now, install the kube-prometheus-stack Helm Chart using the values.yaml file we created.
helm install -f values.yaml monitoring prometheus-community/kube-prometheus-stack
Step 4: Check on Services
You should see something similar to the following services:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 99m
monitoring-kube-prometheus-prometheus ClusterIP 10.43.116.49 <none> 9090/TCP,8080/TCP 81s
monitoring-kube-state-metrics ClusterIP 10.43.232.84 <none> 8080/TCP 81s
monitoring-grafana ClusterIP 10.43.82.67 <none> 80/TCP 81s
monitoring-kube-prometheus-operator ClusterIP 10.43.197.97 <none> 443/TCP 81s
monitoring-kube-prometheus-alertmanager ClusterIP 10.43.40.24 <none> 9093/TCP,8080/TCP 81s
monitoring-prometheus-node-exporter ClusterIP 10.43.152.179 <none> 9100/TCP 81s
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 81s
prometheus-operated ClusterIP None <none> 9090/TCP 81s
Step 5: Port Forward Prometheus and Grafana Services
kubectl port-forward service/prometheus-operated 9090
kubectl port-forward service/monitoring-grafana 3000:80
Step 6: View Prometheus Metrics Targets
You should be able to see the Pepr targets in the Prometheus UI by visiting the following URL:
http://localhost:9090/targets
The targets should look something like this:

Step 7: Test the Prometheus Connection in Grafana
You should be able to test the Prometheus connection in the Grafana UI by visiting the following URL:
http://localhost:3000/connections/datasources
The login information for Grafana was set in the values.yaml file:
username: admin
password: secret
By clicking on the Prometheus data source, you should be able to test the connection to Prometheus by clicking the “Test” button at the bottom of the screen.
NOTE: The Prometheus server URL should be something like:
http://monitoring-kube-prometh-prometheus.default:9090/
You should now be able to select the Pepr Dashboard from the Grafana UI in the “Dashboards” section.
Note: The dashboard may take a few minutes to populate with data.
Summary
This tutorial demonstrated how to use Prometheus and Grafana to display metrics from your Pepr instance. If you have questions about Pepr metrics or dashboards, please reach out to us on Slack or GitHub Issues.
2.3 -
Building a Kubernetes Operator with Pepr
Introduction
This tutorial guides you through building a Kubernetes Operator using Pepr. You’ll create a WebApp Operator that manages custom WebApp resources in your Kubernetes cluster.
If you get stuck at any point, you can reference the complete example code in the Pepr Excellent Examples repository.
What You’ll Build
The WebApp Operator will:
- Deploy a custom WebApp resource definition (CRD)
- Watch for WebApp instances and reconcile them with the actual cluster state
- For each WebApp instance, manage:
  - A Deployment with configurable replicas
  - A Service to expose the application
  - A ConfigMap containing configurable HTML with language and theme options
All resources will include ownerReferences
, triggering cascading deletion when a WebApp is removed. The operator will also automatically restore any managed resources that are deleted externally.
Prerequisites
- A Kubernetes cluster (local or remote)
- Access to the curl command
- Basic understanding of Kubernetes concepts
- Familiarity with TypeScript
- Node.js ≥ 18.0
Tutorial Steps
- Create a new Pepr Module
- Define the WebApp CRD
- Create Helper Functions
- Implement the Reconciler
- Build and Deploy Your Operator
- Test Your Operator
Create a new Pepr Module
[🟢⚪⚪⚪⚪⚪] Step 1 of 6
First, create a new Pepr module for your operator:
npx pepr init \
--name operator \
--uuid my-operator-uuid \
--description "Kubernetes Controller for WebApp Resources" \
--errorBehavior reject \
--confirm &&
cd operator # set working directory as the new pepr module
To track your progress in this tutorial, let’s treat it as a git
repository:
git init && git add --all && git commit -m "npx pepr init"
Create CRD
[🟢🟢⚪⚪⚪⚪] Step 2 of 6
The WebApp Custom Resource Definition (CRD) specifies the structure and validation for your custom resource.
Create the necessary directory structure:
mkdir -p capabilities/crd/generated capabilities/crd/source
Generate a class based on the WebApp CRD using kubernetes-fluent-client.
This allows us to react to the CRD fields in a type-safe manner.
Create a CRD named crd.yaml
for the WebApp that includes:
- Theme selection (dark/light)
- Language selection (en/es)
- Configurable replica count
- Status tracking
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/source/crd.yaml \
-o capabilities/crd/source/crd.yaml
Examine the contents of capabilities/crd/source/crd.yaml
.
Note that the status should be listed under subresources
to make it writable.
We provide descriptions for each property to clarify their purpose.
Enums are used to restrict the values that can be assigned to a property.
Create an interface for the CRD spec with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/generated/webapp-v1alpha1.ts \
-o capabilities/crd/generated/webapp-v1alpha1.ts
Examine the contents of capabilities/crd/generated/webapp-v1alpha1.ts
.
Create a TypeScript file that contains the WebApp CRD named webapp.crd.ts
.
This will enable the controller to automatically create the CRD on startup.
Use the command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/source/webapp.crd.ts \
-o capabilities/crd/source/webapp.crd.ts
Take a moment to commit your changes for CRD creation:
git add capabilities/crd/ && git commit -m "Create WebApp CRD"
The webapp.crd.ts
file defines the structure of our CRD, including the validation rules and schema that Kubernetes will use when WebApp resources are created or modified.
Create a file that will automatically register the CRD on startup named capabilities/crd/register.ts
:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/register.ts \
-o capabilities/crd/register.ts
The register.ts
file contains logic to ensure our CRD is created in the cluster when the operator starts up, avoiding the need for manual CRD installation.
Create a file to validate that WebApp instances are in valid namespaces and have a maximum of 7
replicas.
Create a validator.ts
file with the command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/crd/validator.ts \
-o capabilities/crd/validator.ts
The validator.ts
file implements validation logic for our WebApp resources, ensuring they meet our requirements before they’re accepted by the cluster.
In this section, we’ve generated the CRD class for WebApp, created a function to automatically register the CRD, and added a validator to ensure WebApp instances are in valid namespaces and don’t exceed 7 replicas.
Commit your changes for CRD registration & validation:
git add capabilities/crd/ && git commit -m "Create CRD handling logic"
Create Helpers
[🟢🟢🟢⚪⚪⚪] Step 3 of 6
Now, let’s create helper functions that will generate the Kubernetes resources managed by our operator. These helpers will simplify the creation of Deployments, Services, and ConfigMaps for each WebApp instance.
Create a controller
folder in the capabilities
folder and create a generators.ts
file. This file will contain functions that generate Kubernetes objects for the operator to deploy (with the ownerReferences automatically included). Since these resources are owned by the WebApp resource, they will be deleted when the WebApp resource is deleted.
mkdir -p capabilities/controller
Create generators.ts
with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/controller/generators.ts \
-o capabilities/controller/generators.ts
The generators.ts
file contains functions to create all the necessary Kubernetes resources for our WebApp, including properly configured Deployments, Services, and ConfigMaps with appropriate labels and selectors.
Our goal is to simplify WebApp deployment. Instead of requiring users to manage multiple Kubernetes objects, track versions, and handle revisions manually, they can focus solely on the WebApp
instance. The controller reconciles WebApp instances against the actual cluster state to achieve the desired configuration.
The controller deploys a ConfigMap
based on the language and theme specified in the WebApp resource and sets the number of replicas according to the WebApp specification.
Commit your changes for deployment, service, and configmap generation:
git add capabilities/controller/ && git commit -m "Add generators for WebApp deployments, services, and configmaps"
Create Reconciler
[🟢🟢🟢🟢⚪⚪] Step 4 of 6
Now, create the function that reacts to changes in WebApp instances. This function will be called and placed into a queue, guaranteeing ordered and synchronous processing of events, even when the system is under heavy load.
In the base of the capabilities
folder, create a reconciler.ts
file and add the following:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/reconciler.ts \
-o capabilities/reconciler.ts
The reconciler.ts
file contains the core logic of our operator, handling the creation, updating, and deletion of WebApp resources and ensuring the cluster state matches the desired state.
Finally, create the index.ts
file in the capabilities
folder and add the following:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/capabilities/index.ts \
-o capabilities/index.ts
The index.ts
file contains the WebAppController capability and the functions that are used to watch for changes to the WebApp resource and corresponding Kubernetes resources.
- When a WebApp is created or updated, validate it, store the name of the instance and enqueue it for processing.
- If an “owned” resource (ConfigMap, Service, or Deployment) is deleted, redeploy it.
- Always redeploy the WebApp CRD if it was deleted, as the controller depends on it.
In this section we created a reconciler.ts
file that contains the function responsible for reconciling the state of WebApp instances with the cluster and updating their status.
The index.ts
file contains the WebAppController capability and functions that watch for changes to WebApp resources and their corresponding Kubernetes objects.
The Reconcile
action processes callbacks in a queue, guaranteeing ordered and synchronous processing of events.
Commit your changes with:
git add capabilities/ && git commit -m "Create reconciler for webapps"
Add capability to Pepr Module
Ensure that the PeprModule in pepr.ts
uses WebAppController
. The implementation should look something like this:
new PeprModule(cfg, [WebAppController]);
Using sed
, replace the contents of the file to use our new WebAppController
:
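# Note: -i '' is BSD/macOS sed syntax; with GNU sed on Linux, use -i with no empty-string argument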
# Use the WebAppController Module
sed -i '' -e '/new PeprModule(cfg, \[/,/\]);/c\
new PeprModule(cfg, [WebAppController]);' ./pepr.ts
# Update Imports
sed -i '' 's|import { HelloPepr } from "./capabilities/hello-pepr";|import { WebAppController } from "./capabilities";|' ./pepr.ts
Commit your changes now that the WebAppController is part of the Pepr Module:
git add pepr.ts && git commit -m "Register WebAppController with pepr module"
Build and Deploy Your Operator
[🟢🟢🟢🟢🟢⚪] Step 5 of 6
Preparing Your Environment
Create an ephemeral cluster with k3d
.
k3d cluster delete pepr-dev &&
k3d cluster create pepr-dev --k3s-arg '--debug@server:0' --wait &&
kubectl rollout status deployment -n kube-system
What is an ephemeral cluster?
An ephemeral cluster is a temporary Kubernetes cluster that exists only for testing purposes. Tools like Kind (Kubernetes in Docker) and k3d let you quickly create and destroy clusters without affecting your production environments.
Update and Prepare Pepr
Make sure Pepr is updated to the latest version:
npx pepr update --skip-template-update
⚠️ Important Note: Be cautious when updating Pepr in an existing project as it could potentially override custom configurations. The --skip-template-update
flag helps prevent this.
Building the Operator
Build the Pepr module by running:
npx pepr format &&
npx pepr build
Commit your changes after the build completes:
git add capabilities/ package*.json && git commit -m "Build pepr module"
The build process explained
The pepr build
command performs three critical steps:
- Compile TypeScript: Converts your TypeScript code to JavaScript using the settings in tsconfig.json
- Bundle the Operator: Packages everything into a deployable format using esbuild
- Generate Kubernetes Manifests: Creates all necessary YAML files in the
dist
directory, including:- Custom Resource Definitions (CRDs)
- The controller deployment
- ServiceAccounts and RBAC permissions
- Any other resources needed for your operator
This process creates a self-contained deployment unit that includes everything needed to run your operator in a Kubernetes cluster.
┌─────────────────────┐ ┌──────────────────┐ ┌───────────────────┐
│ │ │ │ │ │
│ Your Pepr Code │────────▶ │ pepr build │────────▶ │ dist/ │
│ (TypeScript) │ │ (Build Process) │ │(Deployment Files) │
│ │ │ │ │ │
└─────────────────────┘ └──────────────────┘ └───────────────────┘
│
│
▼
┌───────────────────┐
│ │
│ Kubernetes │
│ Cluster │
│ │
└───────────────────┘
Deploy to Kubernetes
To deploy your operator to a Kubernetes cluster:
kubectl apply -f dist/pepr-module-my-operator-uuid.yaml &&
kubectl wait --for=condition=Ready pods -l app -n pepr-system --timeout=120s
What’s happening during deployment?
The first command applies all the Kubernetes resources defined in the YAML file, including:
- The WebApp CRD (Custom Resource Definition)
- A Deployment that runs your operator code
- The necessary RBAC permissions for your operator to function
The second command waits for the operator pod to be ready before proceeding. This ensures your operator is running before you attempt to create WebApp resources.
Troubleshooting Deployment Issues
Troubleshooting operator startup problems
If your operator doesn’t start properly, check these common issues:
Check pod logs:
kubectl logs -n pepr-system -l app --tail=100
Verify permissions:
kubectl describe deployment -n pepr-system
Look for permission-related errors in the events section.
Verify the deployment was successful by checking if the CRD has been properly registered:
kubectl get crd | grep webapp
You should see webapps.pepr.io
in the output, which confirms your Custom Resource Definition was created successfully.
Understanding the WebApp Resource
You can use kubectl explain
to see the structure of your custom resource.
It may take a moment for the cluster to recognize this resource before the following command will work:
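Using the wa short name for webapp (see the note below), the command is likely:
kubectl explain wa.spec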
Expected Output:
GROUP: pepr.io
KIND: WebApp
VERSION: v1alpha1
FIELD: spec <Object>
DESCRIPTION:
<empty>
FIELDS:
language <string> -required-
Language defines the language of the web application, either English (en) or
Spanish (es).
replicas <integer> -required-
Replicas is the number of desired replicas.
theme <string> -required-
Theme defines the theme of the web application, either dark or light.
💡 Note: wa
is the short form of webapp
that kubectl recognizes. This resource structure directly matches the TypeScript interface we defined earlier in our code.
Test Your Operator
[🟢🟢🟢🟢🟢🟢] Step 6 of 6
Understanding reconciliation
Reconciliation is the core concept behind Kubernetes operators. It’s the process of:
- Observing the current state of resources in the cluster
- Comparing it to the desired state (defined in your custom resource)
- Taking actions to align the actual state with the desired state
This continuous loop ensures your application maintains its expected configuration even when disruptions occur.
┌───────────────────┐
│ │
│ Custom Resource │◄────────────┐
│ (WebApp) │ │
│ │ │
└───────┬───────────┘ │
│ │
│ Observe │
▼ │
┌───────────────────┐ ┌────────────────┐
│ │ │ │
│ Pepr Operator │───►│ Reconcile │
│ Controller │ │ (Take Action) │
│ │ │ │
└───────┬───────────┘ └────────────────┘
│ ▲
│ Create/Update │
▼ │
┌───────────────────┐ │
│ Owned Resources │─────────────┘
│ • ConfigMap │ If deleted or
│ • Service │ changed, trigger
│ • Deployment │ reconciliation
└───────────────────┘
Creating a WebApp Instance
Let’s create an instance of our custom WebApp resource in English with a light theme and 1 replica:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/webapp-light-en.yaml \
-o webapp-light-en.yaml
Examine the contents of webapp-light-en.yaml
. It defines a WebApp with English language, light theme, and 1 replica.
Next, apply it to the cluster:
kubectl create namespace webapps &&
kubectl apply -f webapp-light-en.yaml
How resource creation works
- Kubernetes API server receives the WebApp resource
- Our operator’s controller (in
index.ts
) detects the new resource via its reconcile function - The controller validates the WebApp using our validator
- The reconcile function creates three “owned” resources:
- A ConfigMap with HTML content based on the theme and language
- A Service to expose the web application
- A Deployment to run the web server pods with the specified number of replicas
- The status is updated to track progress
All this logic is in the code we wrote earlier in the tutorial.
Verifying Resource Creation
Now, verify that the WebApp and its owned resources were created properly:
kubectl get cm,svc,deploy,webapp -n webapps
Expected Output:
NAME DATA AGE
configmap/kube-root-ca.crt 1 6s
configmap/web-content-webapp-light-en 1 5s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/webapp-light-en ClusterIP 10.43.85.1 <none> 80/TCP 5s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/webapp-light-en 1/1 1 1 5s
💡 Tip: Our operator created three resources based on our single WebApp definition - this is the power of operators in action!
Checking WebApp Status
The status field is how our operator communicates the current state of the WebApp:
kubectl get wa webapp-light-en -n webapps -ojsonpath="{.status}" | jq
Expected Output:
{
"observedGeneration": 1,
"phase": "Ready"
}
Understanding status fields
- observedGeneration: A counter that increments each time the resource spec is changed
- phase: The current lifecycle state of the WebApp (“Pending” during creation, “Ready” when all components are operational)
This status information comes from our reconciler code, which updates these fields during each reconciliation cycle.
You can also see events related to your WebApp that provide a timeline of actions taken by the operator:
kubectl describe wa webapp-light-en -n webapps
Expected Output:
Name: webapp-light-en
Namespace: webapps
API Version: pepr.io/v1alpha1
Kind: WebApp
Metadata: ...
Spec:
Language: en
Replicas: 1
Theme: light
Status:
Observed Generation: 1
Phase: Ready
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal InstanceCreatedOrUpdated 36s webapp-light-en Pending
Normal InstanceCreatedOrUpdated 36s webapp-light-en Ready
Viewing Your WebApp
To access your WebApp in a browser, use port-forwarding to connect to the service.
The following command runs the portforward in the background.
Be sure to make note of the Port-forward PID
for later when we tear down the test environment.
kubectl port-forward svc/webapp-light-en -n webapps 3000:80 &
PID=$!
echo "Port-forward PID: $PID"
About port-forwarding
Port-forwarding creates a secure tunnel from your local machine to a pod or service in your Kubernetes cluster. In this case, we’re forwarding your local port 3000 to port 80 of the WebApp service, allowing you to access the application at http://localhost:3000 in your browser.
Now open http://localhost:3000 in your browser or run curl http://localhost:3000
to see the response in a terminal.
The browser should display a light theme web application:

Testing the Reconciliation Loop
A key feature of operators is their ability to automatically repair resources when they’re deleted or changed. Let’s test this by deleting the ConfigMap:
kubectl delete cm -n webapps --all &&
sleep 10 &&
kubectl get cm -n webapps
Expected output:
configmap "kube-root-ca.crt" deleted
configmap "web-content-webapp-light-en" deleted
NAME DATA AGE
kube-root-ca.crt 1 0s
web-content-webapp-light-en 1 0s
Now that we’ve successfully deployed a WebApp, commit your changes:
git add webapp-light-en.yaml && git commit -m "Add WebApp resource for light mode in english"
Behind the scenes of reconciliation
- When you deleted the ConfigMap, Kubernetes sent a DELETE event
- Our operator (in
index.ts
) was watching for these events via the onDeleteOwnedResource
handler - This triggered the reconciliation loop, which detected that the ConfigMap was missing
- The reconciler recreated the ConfigMap based on the WebApp definition
- This all happened automatically without manual intervention - the core benefit of using an operator!
🛠️ Try it yourself: Try deleting the Service or Deployment. What happens? The operator should recreate those too!
Updating the WebApp
Now let’s test changing the WebApp’s specification.
Copy down the next WebApp resource with the following command:
curl -s https://raw.githubusercontent.com/defenseunicorns/pepr-excellent-examples/main/pepr-operator/webapp-dark-es.yaml \
-o webapp-dark-es.yaml
Compare the contents of webapp-light-en.yaml
and webapp-dark-es.yaml
with the command:
diff --side-by-side \
webapp-light-en.yaml \
webapp-dark-es.yaml
kubectl apply -f webapp-dark-es.yaml
💡 Note: We’ve changed the theme from light to dark and the language from English (en) to Spanish (es).
Your port-forward should still be active, so you can refresh your browser to see the changes.
If your port-forward is no longer active for some reason, create a new one:
# Only needed if previous port-forward closed
kubectl port-forward svc/webapp-light-en -n webapps 3000:80 &
PID=$!
echo "Port-forward PID: $PID"
Now open http://localhost:3000 in your browser or run curl http://localhost:3000
to see the response in a terminal.
The browser should display a dark theme web application:

Now that we’ve successfully updated a WebApp, commit your changes:
git add webapp-dark-es.yaml && git commit -m "Update WebApp resource for dark mode in spanish"
How updating works
- When you apply the changed WebApp, Kubernetes sends an UPDATE event
- Our operator’s controller (in
index.ts
) detects this via the onUpdate
handler - The updated spec is validated and then queued for reconciliation
- The reconciler compares the current resources with what’s needed for the new spec
- It updates the ConfigMap with the new theme and language content
- The Deployment automatically detects the ConfigMap change and restarts the pod with the new content
Cleanup
When you’re done testing, you can delete your WebApp and verify that all owned resources are removed:
kill $PID && # Close port-forward
kubectl delete wa -n webapps --all &&
sleep 5 &&
kubectl get cm,deploy,svc -n webapps
You can also delete the entire test cluster when you’re finished:
k3d cluster delete pepr-dev
Congratulations!
You’ve successfully built a Kubernetes operator using Pepr. Through this tutorial, you:
- Created a custom resource definition (CRD) for WebApps
- Implemented a controller with reconciliation logic
- Added validation for your custom resources
- Deployed your operator to a Kubernetes cluster
- Verified that your operator correctly manages the lifecycle of WebApp resources
This pattern is powerful for creating self-managing applications in Kubernetes. Your operator now handles the complex task of maintaining your application’s state according to your specifications, reducing the need for manual intervention.
What You’ve Learned
By completing this tutorial, you’ve gained experience with several important concepts:
- Custom Resource Definitions (CRDs): You defined a structured, validated schema for your WebApp resources
- Reconciliation: You implemented the core operator pattern that maintains desired state
- Owner References: You used Kubernetes ownership to manage resource lifecycles
- Status Reporting: Your operator provides feedback about resource state through status fields
- Watch Patterns: Your operator reacts to changes in both custom and standard Kubernetes resources
These concepts form the foundation of the operator pattern and can be applied to manage any application or service on Kubernetes.
Next Steps
Now that you understand the basics of building an operator with Pepr, you might want to:
- Add more sophisticated validation logic
- Implement status conditions that provide detailed health information
- Add support for upgrading between versions of your application
- Explore more complex reconciliation patterns for multi-component applications
- Add metrics and monitoring to your operator
For more information, check out the Pepr Excellent Examples repository.
3 -
Module Examples
We maintain a repo of Pepr examples at Pepr Excellent Examples. Each example is a complete module that can be deployed to a Pepr instance.
4 -
Pepr Best Practices
Mutating Webhook Errors
When developing mutating admission policies, it is essential to include a validation step immediately after applying mutations. This ensures that the changes made by the mutating admission policy were applied correctly and do not introduce unintended inconsistencies or invalid configurations into your Kubernetes cluster.
Why Validate After Mutating?
- Detect Misconfigurations Early:
Mutating admission policies modify incoming resource configurations dynamically. Without validation, you risk introducing invalid configurations into your cluster if the mutation logic contains bugs, unintended side effects, or runs too long and causes a Webhook Timeout.
- Maintain Cluster Integrity:
By validating the mutated resource, you ensure it adheres to expected formats, standards, and constraints, maintaining the health and stability of your cluster.
- Catch Logic Errors in Mutations:
A mutation may not always produce the intended output due to edge cases, unexpected inputs, or incorrect assumptions in the mutation logic.
Validation helps catch such issues early and becomes particularly important if your Module is configured to use a Webhook failurePolicy of ignore. In that case, admission request failures won’t prevent further processing and/or acceptance of mutate-failed requests, which could result in undesirable resources getting into your cluster!
- Comply with Kubernetes Best Practices:
Kubernetes resources must meet specific structural and functional requirements. Validating ensures compliance, preventing the risk of deployment failures or runtime errors.
How to implement a Validate-After-Mutate Pattern
- Apply the desired transformations to the resource in the Mutate block.
- Validate the mutated resource in the Validate block to ensure it adheres to the expected structure.
- If the validation fails, reject the resource with a descriptive message explaining the issue.
When(a.Pod)
  .IsCreated()
  .InNamespace("my-app")
  .WithName("database")
  .Mutate(po => po.SetLabel("pepr", "true"))
  .Validate(po => {
    if (po.Raw.metadata?.labels?.["pepr"] === "true") {
      return po.Approve();
    }
    return po.Deny("Needs pepr label set to true");
  });
Core Development
When developing new features in Pepr Core, it is recommended to use npx pepr deploy -i pepr:dev, which will deploy Pepr’s Kubernetes manifests to the cluster with the development image. This will allow you to test your changes without having to build a new image and push it to a registry.
The workflow for developing features in Pepr is:
- Run npm test, which will create a k3d cluster and build a development image called pepr:dev
- Deploy the development image into the cluster with npx pepr deploy -i pepr:dev
Debugging
Pepr is composed of Modules, Capabilities, and Actions:
- Actions are the blocks of code containing filters, Mutate, Validate, Watch, Reconcile, and OnSchedule.
- Capabilities are collections of Actions, such as hello-pepr.ts.
- Modules are the result of npx pepr init. You can have as many Capabilities as you would like in a Module (see the sketch below).
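As a rough illustration of how these pieces fit together, the sketch below defines a Capability and registers a single Action. The capability name, namespace, and label are assumptions for illustration, not from this document.
import { a, Capability } from "pepr";

// A Capability groups related Actions; a Module imports one or more Capabilities
export const HelloPepr = new Capability({
  name: "hello-pepr",
  description: "An example Capability holding one or more Actions",
  namespaces: ["pepr-demo"],
});

const { When } = HelloPepr;

// An Action: filters plus a Mutate block
When(a.ConfigMap)
  .IsCreatedOrUpdated()
  .Mutate(cm => cm.SetLabel("managed-by", "hello-pepr"));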
Pepr is a webhook-based system, meaning it is event-driven.
When a resource is created, updated, or deleted, Pepr is called to perform the actions you have defined in your Capabilities.
It’s common for multiple webhooks to exist in a cluster, not just Pepr.
When there are multiple webhooks, the order in which they are called is not guaranteed.
The only guarantee is that all of the MutatingWebhooks will be called before all of the ValidatingWebhooks.
After the admission webhooks are called, the Watch and Reconcile are called.
The Reconcile and Watch create a watch on the resources specified in the When block, and those resources are watched for changes after admission.
The difference between Reconcile and Watch is that Reconcile processes events in a queue to guarantee that the events are processed in order, whereas Watch does not.
Considering that many webhooks may be modifying the same resource, it is best practice to validate the resource after mutations are made to ensure that the resource is in a valid state if it has been changed since the last mutation.
When(a.Pod)
  .IsCreated()
  .InNamespace("my-app")
  .WithName("database")
  .Mutate(pod => pod.SetLabel("pepr", "true"))
  // another mutating webhook could remove labels after this one runs
  .Validate(pod => {
    if (pod.Raw.metadata?.labels?.["pepr"] !== "true") {
      return pod.Deny("Needs pepr label set to true");
    }
    return pod.Approve();
  });
If you think your Webhook is not being called for a given resource, check the *WebhookConfiguration.
Debugging During Module Development
Pepr supports breakpoints in the VSCode editor. To use breakpoints, run npx pepr dev in the root of a Pepr module using a JavaScript Debug Terminal. This command starts the Pepr development server at localhost:3000 with the *WebhookConfiguration configured to send AdmissionRequest objects to the local address.
This allows you to set breakpoints in Mutate(), Validate(), Reconcile(), Watch(), or OnSchedule() and step through module code.
Note that you will need a cluster running:
k3d cluster create pepr-dev --k3s-arg '--debug@server:0' --wait
When(a.Pod)
  .IsCreated()
  .InNamespace("my-app")
  .WithName("database")
  .Mutate(pod => {
    // Set a breakpoint here
    pod.SetLabel("pepr", "true");
  })
  .Validate(pod => {
    // Set a breakpoint here
    if (pod.Raw.metadata?.labels?.["pepr"] !== "true") {
      return pod.Deny("Label 'pepr' must be 'true'");
    }
    return pod.Approve();
  });
Logging
Pepr can deploy two types of controllers: Admission and Watch. The controllers deployed are dictated by the Actions called for by a given set of Capabilities (Pepr only deploys what is necessary). Within those controllers, the default log level is info, but it can be changed to debug by setting the LOG_LEVEL environment variable to debug.
To pull logs for all controller pods:
kubectl logs -l app -n pepr-system
Admission Controller
If the focus of the debug is a Mutate() or Validate(), the relevant logs will come from pods with the label pepr.dev/controller: admission.
kubectl logs -l pepr.dev/controller=admission -n pepr-system
More refined admission logs, which can optionally be filtered by the module UUID, can be obtained with npx pepr monitor.
Watch Controller
If the focus of the debug is a Watch(), Reconcile(), or OnSchedule(), look for logs from pods with the label pepr.dev/controller: watcher.
kubectl logs -l pepr.dev/controller=watcher -n pepr-system
Internal Error Occurred
Error from server (InternalError): Internal error occurred: failed calling webhook "<pepr_module>pepr.dev": failed to call webhook: Post ...
When an internal error occurs, check the deployed *WebhookConfiguration resources’ timeout and failurePolicy configurations. If the failurePolicy is set to Fail and a request cannot be processed within the timeout, that request will be rejected. If the failurePolicy is set to Ignore, given the same timeout conditions, the request will (perhaps surprisingly) be allowed to continue.
If you have a validating webhook, the recommendation is to set the failurePolicy to Fail to ensure that the request is rejected if the webhook fails.
failurePolicy: Fail
matchPolicy: Equivalent
timeoutSeconds: 3
The failurePolicy and timeouts can be set in the Module’s package.json file, under the pepr configuration key. If changed, the settings will be reflected in the *WebhookConfiguration after the next build:
"pepr": {
"uuid": "static-test",
"onError": "ignore",
"webhookTimeout": 10,
}
Read more on customization here.
Pepr Store Custom Resource
If you need to read all store keys, or you think the PeprStore is malfunctioning, you can check the PeprStore CR:
kubectl get peprstore -n pepr-system -o yaml
You should run in npx pepr dev mode to debug the issue; see the Debugging During Module Development section for more information.
Deployment
Production environment deployments should be declarative in order to avoid mistakes. The Pepr modules should be generated with npx pepr build and moved into the appropriate location.
Development environment deployments can use npx pepr deploy to deploy Pepr’s Kubernetes manifests into the cluster, or npx pepr dev to actively debug the Pepr module with breakpoints in the code editor.
Keep Modules Small
Modules are minified and built JavaScript files that are stored in a Kubernetes Secret in the cluster. The Secret is mounted in the Pepr Pod and is processed by Pepr Core. Due to the nature of the module being packaged in a Secret, it is recommended to keep the modules as small as possible to avoid hitting the 1MB limit of secrets.
Recommendations for keeping modules small are:
- Don’t repeat yourself
- Only import the parts of library modules that you need (see the sketch below)
It is suggested to lint and format your modules using npx pepr format.
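For example, prefer narrow imports over whole-library imports. The snippet below is illustrative only; lodash is just a stand-in dependency, not one Pepr requires.
// import _ from "lodash";          // pulls the entire library into the bundle
import merge from "lodash/merge";   // pulls in only the helper you actually use

const defaults = { replicas: 1, theme: "light" };
const overrides = { theme: "dark" };
const spec = merge({}, defaults, overrides); // { replicas: 1, theme: "dark" }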
Monitoring
Pepr can monitor Mutations and Validations from the Admission Controller through the npx pepr monitor [module-uuid] command. This command displays a neatly formatted log showing approved and rejected Validations as well as Mutations. If [module-uuid] is not supplied, it uses all Pepr admission controller logs as the data source. If you are unsure which modules are currently deployed, issue npx pepr uuid to display the modules and their descriptions.
✅ MUTATE pepr-demo/pepr-demo (50c5d836-335e-4aa5-8b56-adecb72d4b17)
✅ VALIDATE pepr-demo/example-2 (01c1d044-3a33-4160-beb9-01349e5d7fea)
❌ VALIDATE pepr-demo/example-evil-cm (8ee44ca8-845c-4845-aa05-642a696b51ce)
[ 'No evil CM annotations allowed.' ]
Multiple Modules or Multiple Capabilities
Each module has its own Mutating and Validating webhook configurations, Admission and Watch Controllers, and Stores. This allows each module to be deployed independently of the others. However, creating multiple modules creates overhead on the kube-apiserver and the cluster.
Due to the overhead costs, it is recommended to deploy multiple capabilities that share the same resources (when possible). This will simplify analysis of which capabilities are responsible for changes on resources.
However, there are some cases where multiple modules make sense, for instance, different teams owning separate modules, or one module for Validations and another for Mutations. If you have a use case where you need to deploy multiple modules, it is recommended to separate concerns by operating in different namespaces.
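As a rough sketch of the recommended shape, a single Module can register several Capabilities in its pepr.ts entrypoint. The capability names and file paths below are illustrative, not from this document.
import { PeprModule } from "pepr";
// The module's package.json carries the pepr configuration block
import cfg from "./package.json";

import { PolicyChecks } from "./capabilities/policy-checks";
import { WebAppOperator } from "./capabilities/webapp-operator";

// One Module: one set of webhook configurations, one Admission/Watch controller
// pair, and one Store shared by all registered Capabilities
new PeprModule(cfg, [PolicyChecks, WebAppOperator]);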
OnSchedule
OnSchedule is supported by a PeprStore to safeguard against schedule loss following a pod restart. It is utilized at the top level, distinct from being within a Validate, Mutate, Reconcile, or Watch. Recommended intervals are 30 seconds or longer, and jobs are advised to be idempotent, meaning that if the code is applied or executed multiple times, the outcome should be the same as if it had been executed only once. A major use-case for OnSchedule is day 2 operations.
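A minimal sketch of a scheduled, idempotent job is shown below. The capability name, job name, and interval are illustrative, and it assumes OnSchedule is destructured from the Capability as in the hello-pepr example; check the OnSchedule docs for the exact form in your Pepr version.
import { Capability, Log } from "pepr";

const DayTwoOps = new Capability({
  name: "day-two-ops",
  description: "Scheduled day-2 maintenance tasks",
});

// Assumption: OnSchedule is exposed by the Capability so schedules persist via the PeprStore
const { OnSchedule } = DayTwoOps;

OnSchedule({
  name: "sync-external-inventory", // illustrative job name
  every: 30,
  unit: "seconds",
  run: async () => {
    // Keep this idempotent: running it twice should leave the same end state
    Log.info("Running scheduled day-2 task");
  },
});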
Security
To enhance the security of your Pepr Controller, we recommend following these best practices:
- Regularly update Pepr to the latest stable release.
- Secure Pepr through RBAC using scoped mode, taking into account the Kubernetes API server access needed by your callbacks.
- Practice the principle of least privilege when assigning roles and permissions and avoid giving the service account more permissions than necessary.
- Use NetworkPolicy to restrict traffic from Pepr Controllers to the minimum required.
- Limit calls from Pepr to the Kubernetes API server to the minimum required.
- Set webhook failure policies to Fail to ensure that requests are rejected if the webhook fails. More below.
When using Pepr as a Validating Webhook, it is recommended to set the Webhook’s failurePolicy to Fail. This can be done in your Pepr module in the values.yaml file of the Helm chart by setting admission.failurePolicy to Fail, or in the package.json under pepr by setting the onError flag to reject, then running npx pepr build again.
By following these best practices, you can help protect your Pepr Controller from potential security threats.
Reconcile
Fills a similar niche to .Watch() and runs in the Watch Controller, but it employs a Queue to force sequential processing of resource states once they are returned by the Kubernetes API. This allows things like operators to handle bursts of events without overwhelming the system or the Kubernetes API. It provides a mechanism to back off when the system is under heavy load, enhancing overall stability and maintaining the state consistency of Kubernetes resources, as the order of operations can impact the final state of a resource. For example, creating and then deleting a resource should be processed in that exact order to avoid state inconsistencies.
When(WebApp)
  .IsCreatedOrUpdated()
  .Validate(validator)
  .Reconcile(async instance => {
    // Do work here
  });
Pepr Store
The store is backed by ETCD in a PeprStore resource, and updates happen at 5-second intervals when an array of patches is sent to the Kubernetes API Server. The store is intentionally not designed to be transactional; instead, it is built to be eventually consistent, meaning that the last operation within the interval will be persisted, potentially overwriting other operations. Changes to the data are made without a guarantee that they will occur simultaneously, so caution is needed in managing errors and ensuring consistency.
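A minimal sketch of the Store API is shown below; the capability name and keys are illustrative. Because writes are batched onto the flush interval, treat them as eventually consistent rather than transactional.
import { Capability, Log } from "pepr";

const WebAppOps = new Capability({
  name: "webapp-ops",
  description: "Example capability using the Pepr Store",
});

// Store is exposed by the Capability and is backed by the PeprStore resource
const { Store } = WebAppOps;

// Typically called from inside an action callback (Mutate/Validate/Watch/Reconcile)
async function recordGeneration(generation: string) {
  Store.setItem("last-seen-generation", generation);        // fire-and-forget write
  await Store.setItemAndWait("migration-complete", "true"); // resolves once persisted
  const cached = Store.getItem("last-seen-generation");     // read from the local cache
  Log.info(`Cached generation: ${cached}`);
}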
Watch
Pepr streamlines the process of receiving timely change notifications on resources by employing the Watch mechanism. It is advisable to opt for Watch over Mutate or Validate when dealing with more extended operations, as Watch does not face any timeout limitations. Additionally, Watch proves particularly advantageous for monitoring previously existing resources within a cluster. One compelling scenario for leveraging Watch is when there is a need to chain API calls together, allowing Watch operations to be sequentially executed following Mutate and Validate actions.
When(a.Pod)
  .IsCreated()
  .InNamespace("my-app")
  .WithName("database")
  .Mutate(pod => {
    // ... mutate the pod
  })
  .Validate(pod => pod.Approve())
  .Watch(async (pod, phase) => {
    Log.info(pod, `Pod was ${phase}.`);
    // do consecutive api calls
  });
5 - Frequently Asked Questions
How do I remove the punycode warning?
By default, warnings are removed. You can allow warnings by setting the PEPR_NODE_WARNINGS environment variable.
PEPR_NODE_WARNINGS="true"
If you allow warnings, you can disable the specific punycode warning by:
export NODE_OPTIONS="--disable-warning=DEP0040"
or
npx --node-options="--disable-warning=DEP0040" pepr [command]
How does Pepr compare to Operator SDK?
Pepr and Operator SDK are both frameworks used for building Kubernetes operators and admission controllers. While they share a common goal of simplifying the creation of Kubernetes operators and enhancing Kubernetes functionality, they have different approaches and features.
Similarities:
- Scaffolding: Automatically generate boilerplate code for new operators and Kubernetes manifests for building controllers.
- Helper Functions: Provide utility functions to interact with Kubernetes resources and manage the lifecycle of Kubernetes resources.
- Admission Webhooks and Kubernetes Controllers: Both support building admission and Kubernetes controllers by reacting to changes in the cluster in an automated way.
Differences:
- Main Goals: Operator SDK is mainly focused on building operators and later included support for Webhooks. In contrast, Pepr started out as a framework for building Webhooks and later added support for building operators via Kubernetes-Fluent-Client through Watch and Reconcile.
- Language Support: Operator SDK supports Go, Ansible, and Helm, while Pepr is written in TypeScript and designed with an English style fluent API for simplicity.
- Lifecycle Management: Operator SDK provides tools for managing the lifecycle of operators through OLM (Operator Lifecycle Manager), while Pepr relies on Helm for upgrades.
- Complexity: Operator SDK uses native Kubernetes Go libraries for deep integration with Kubernetes resources, while Pepr exposes a high-level abstraction allowing users to focus on business logic.
- Easy Setup: While both make it easy to initialize a new project, Pepr comes with an out-of-the-box hello-pepr.ts example which demonstrates how to use Pepr effectively.
How does Pepr compare to Kyverno?
Although Pepr and Kyverno have similarities, Pepr is very different from Kyverno. They have very different mission statements. Pepr focuses on making operators as easy as possible. Kyverno focuses on reporting, not building operators.
Similarities:
- Both have Mutating Webhooks that can dynamically change resources before admission
- Both have Validating Webhooks to configure what can/cannot go through admission
- Both provide a way to react to changes to pre-existing cluster resources (i.e., resources that have already gone through admission)
Differences:
- Pepr is more like a “framework” than a tool. In Pepr you create a Pepr Module. In the Pepr module you define capabilities that enforce / apply desired cluster state.
- Pepr is written in TypeScript. Kyverno is written in Go.
- Pepr provides the flexibility of a full-fledged, strongly typed programming language to decide what decisions to make based on events happening in the cluster. With Kyverno, you are limited to the constraints of YAML.
- Pepr can be used to reconcile events in order, similar to Kube-Builder or Operator SDK.
- Pepr can apply a CustomResourceDefinition and control cluster state based on that custom resource.
Both Pepr and Kyverno are great tools. Which one to use for your project depends on your use case.
How do I add custom labels to Pepr’s Kubernetes manifests?
During the build process, custom labels can be added to the pepr-system namespace based on the package.json. Check out Customizing with package.json.
The following example shows how to add custom namespace labels.
"pepr": {
"name": "new-release",
...
"customLabels": {
"namespace": {
"istio-injection": "enabled",
"app.kubernetes.io/name": "new-release"
}
},
...
}
The resulting namespace will be generated after npx pepr build:
apiVersion: v1
kind: Namespace
metadata:
name: pepr-system
labels:
istio-injection: enabled
app.kubernetes.io/name: new-release
My Pepr version is not the latest
If you notice your Pepr version does not correspond to the latest release in GitHub after doing npx pepr -V, clearing the NPX cache can often resolve the issue.
Run the cache clearing command
If you want to ensure the cache has been cleared, you can check the cache directory. The location of this directory varies based on your operating system and configuration. However, you can generally find it in your system’s home directory under .npm.
Note - If you are inside of the Pepr Core repo (https://github.com/defenseunicorns/pepr), then it is normal for npx pepr -V to return 0.0.0-development.
I’ve found a bug, what should I do?
Please report it by opening an issue in the Pepr GitHub repository. Please include as much information as possible in your bug report, including:
- The version of Pepr you are using
- The version of Kubernetes you are using
I’ve found a security issue, what should I do?
Security issues should be reported privately, via email. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message.
I have a feature request, what should I do?
Please let us know by opening an issue in the Pepr GitHub repository.
How do I get help with Pepr?
If you have a question about Pepr, please open an issue in the Pepr GitHub repository or contact us through the Pepr channel on the Kubernetes Slack.
6 -
Roadmap for Pepr
2025 Roadmap
Phase 1: Code Quality - Experimentation
- Q1:
- Turn on ESLint enforcement, configure settings, and see no warnings:
- Eliminate circular dependencies, complexity, return statements, etc.
- Metric and Performance Baselining:
- Establish a baseline for performance and resource utilization metrics. Use this data to make informed decisions about the direction of the project in terms of Deno2
- Nightly Release:
- Establish a nightly release process. This will help us to catch bugs early and ensure that the project is always in a releasable state.
Phase 2: Test Reliability & CRD generation
- Q2:
- Professional Dashboard displaying metrics and performance tests originating from CI
- Improve integration testing with Pepr, UDS Core, and Kubernetes Fluent Client
- This focuses on a more reliable release process to reduce the frequency of regressions.
- Migrate old journey/-style tests to Pepr-Excellent-Examples or newer CLI integration tests. See pepr/#1597.
- Define Governance with Pepr
- Add support for compliance/governance validation within Pepr. See pepr/#1823.
- As an alpha feature, support the generation of CRDs within Pepr. See pepr/#1062.
Phase 3: TBD
- Q3:
- Support Governance with Pepr
- Improve initial support added in Q2.
- Transactional PeprStore Implementation:
- Begin integrating transactional functionality into PeprStore. The implementation will emphasize robust testing and clear documentation to support fast and reliable data operations in a transactional manner.
Phase 4: TBD
- Q4:
- Prepare for Project Donation
- Consider Pepr project donation after resolving the open topics of Governance and performance at scale
- Experimentation with Deno2:
- Experiment with Deno2 through Dash Days and see if it can be used in the project. Look into the performance improvements and new features that Deno2 brings to the table.
Deferred or Unprioritized Work
This section contains work we’ve considered, but have not slotted in to the roadmap or have had to deprioritize.
- Pepr v1.0.0
- Determine if Pepr is stable enough to be at
v1.0.0
and release it!
- OTEL Preparation:
- Come up with a plan to implement OpenTelemetry, specifically distributed tracing, metrics, logs, and events. Use this data to make debugging easier from a UDS Core perspective. There will be documentation work on how to use an OTEL collector with a Pepr Module.
- Deno2 Implementation:
- If determined to be advisable, move forward with migrating the project to Deno2 (starting with the kubernetes-fluent-client..?). This phase will focus on adapting the codebase, conducting extensive testing, and creating comprehensive documentation to ensure a seamless transition.
- Determine if a Transactional PeprStore makes sense:
- Suss out the details involved with having a transactional Pepr Store. What are the implications of this? What are the benefits? What are the drawbacks? What are the use-cases? What are the technologies that can be used to implement this?
2024 Roadmap
Phase 1: Preparation - Testing and Docs
- Q1:
- Establish Medium for Communication with Community:
- Establish communication channel for community members and contributors. Easy/discoverable “how to contribute” guide.
- Site/Documentation:
- Improve information architecture, nail killer use-cases, and make it obvious how to get started and find your way around.
- Automated Testing:
- Focus on stories that increase confidence in protection of features and functionality. Simplify hello-pepr and bring e2e test against external repo with examples. Make sure that contributions are well-tested.
Phase 2: Community Building, Competitive Analysis, Instrumentation and Feature Development
- Q2:
- Community Engagement:
- Begin engaging with potential contributors and users through social media, Kubernetes/Cloud Native Computing Foundation (CNCF) meetups, and other channels. Monitor and participate in our Slack channel.
- Feature Development:
- Based on company feedback, continuously improve and add features. Add feature parity with other tools in the Kubernetes ecosystem where it makes sense. Chip away at the backlog.
- Documentation Improvements:
- Continue to improve documentation and add more examples like Doom, find scaling limitations
- Competitive Analysis:
- Understand the competitive landscape and how/where Pepr can/does differentiate itself. Have it in the docs.
- Instrumentation:
- Outfit Pepr with the necessary instrumentation to collect metrics and logs. Use this data to make informed decisions about the watch direction.
- Q3:
- Informer Iterations:
- Tune informer based on feedback from UDS Core and delivery so events will be reconciled through eventual consistency.
- Evaluate other underlying technologies for informer.
- Feature Development:
- .WithNameRegex() / .InNamespaceRegex() for Kubernetes Controller development against resources that could match a variety of names or namespaces.
- .WithDeletionTimestamp() for Kubernetes Controller development against resources that are pending deletion.
- Create a sharded queue that enables the Module Author to define queueing strategies based on kind, kind/namespace, kind/namespace/name, or global.
- Community Building:
- Grow the contributor base, establish a governance model, and encourage community-led initiatives. Look to drive conversation in our Slack Channel.
- Based on community feedback, continuously improve and add features. Rigorously test, document, and review code.
- Project Advocacy:
- Publicly advocate for the project and encourage adoption.
- Stability:
- Ensure that the project is stable and reliable. Make sure that the project is well-tested and documented.
- Identify new areas of project improvement and work on them.
Phase 4: Feature Development, Stabilization, Code and Testing Quality Improvements
- Q4:
- Features:
- Improve DevEx overrides in the Pepr section of package.json for customized builds of Modules
- .Finalize() for Kubernetes Controller development to control downstream resources through finalizers
- Scaffolding to validate images from a registry through cosign/sigstore
- Replace node-fetch with Undici in the KFC project for performance improvements
- Removal of Circular Dependencies:
Identify and remove circular dependencies in the codebase.
- Strong Typings:
- Identify where we can make Pepr/KFC stronger by adding typings.
- Work to reduce code complexity
- Monitor code complexity through eslint, work to drive down complexity
- Robust E2E Tests in KFC:
- Create a strong e2e suite in KFC, ensure tests are robust and cover all the features of KFC.
- Documentation:
- Ensure that the documentation is up-to-date and accurate. Add more examples and use-cases.
- Onboarding and contribution guides should be clear and easy to follow.
- Load/Stress Testing:
- Load test Pepr/KFC to identify bottlenecks and areas of improvement.
- Ensure that Pepr/KFC can handle a large number of resources and events over a sustained period of time (nightly).
7 - Community and Support
Introduction
Pepr is a community-driven project. We welcome contributions of all kinds, from bug reports to feature requests to code changes. We also welcome contributions of documentation, tutorials, and examples.
Contributing
You can find all the details on contributing to Pepr at:
Reporting Bugs
Information on reporting bugs can be found at:
Reporting Security Issues
Information on reporting security issues can be found at:
7.1 - Pepr Media
2025
Presentations
2024
Session Recordings
Workshops
Blogs
2023
Session Recordings
8 - Contributor Guide
Thank you for your interest in contributing to Pepr! We welcome all contributions and are grateful for your help. This guide outlines how to get started with contributing to this project.
Code of Conduct
Please follow our Code of Conduct to maintain a respectful and collaborative environment.
Getting Started
Setup
- Fork the repository.
- Clone your fork locally: git clone https://github.com/your-username/pepr.git.
- Install dependencies: npm ci.
- Create a new branch for your feature or fix: git checkout -b my-feature-branch.
Kubernetes Fluent Client Contributions
Kubernetes Fluent Client is a library used by Pepr that provides a fluent interface for Kubernetes API clients. In Pepr, we use the kubernetes-fluent-client
package to interact with Kubernetes resources. Due to the nature of this library, it is important to ensure that any changes made to the Kubernetes Fluent Client are thoroughly tested and validated in Pepr before being merged into the codebase. In particular, we need to ensure that the changes do not break any existing functionality or introduce new bugs, especially in the context of the Pepr Watcher.
If you are making changes to the Kubernetes Fluent Client, please ensure that you run the Soak Test in GitHub Actions to validate your changes.
To run the Soak Test, you can do the following:
- Go to the GitHub Actions tab
- Select the workflow named “Soak Test”
- Click on the “Run workflow” button to get the options to run the workflow
- Select the Kubernetes Fluent Client branch you want to test (KFC dev branch) and click “Run workflow”
Submitting a Pull Request
- Create an Issue: For significant changes, please create an issue first, describing the problem or feature proposal. Trivial fixes do not require an issue.
- Commit Your Changes: Make your changes and commit them. All commits must be signed.
- Run Tests: Ensure that your changes pass all tests by running npm test.
- Push Your Branch: Push your branch to your fork on GitHub.
- Create a Pull Request: Open a pull request against the main branch of the Pepr repository. Please make sure that your PR passes all CI checks.
PR Requirements
- PRs must be against the main branch.
- PRs must pass CI checks.
- All commits must be signed.
- PRs should have a related issue, except for trivial fixes.
We take PR reviews seriously and strive to provide a great contributor experience with timely feedback. To help maintain this, we ask external contributors to limit themselves to no more than two open PRs at a time. Having too many open PRs can slow down the review process and impact the quality of feedback.
Coding Guidelines
Please follow the coding conventions and style used in the project. Use ESLint and Prettier for linting and formatting:
- Check formatting: npm run format:check
- Fix formatting: npm run format:fix
- If regex is used, provide a link to regex101.com with an explanation of the regex pattern.
Git Hooks
- This project uses husky to manage git hooks for pre-commit and pre-push actions.
- pre-commit will automatically run linters so that you don’t need to remember to run npm run format:* commands
- pre-push will warn you if you’ve changed lots of lines on a branch and encourage you to optionally present the changes as several smaller PRs to facilitate easier PR reviews.
- The pre-push hook is an opinionated way of working, and is therefore optional.
- You can opt in to using the pre-push hook by setting PEPR_HOOK_OPT_IN=1 as an environment variable.
Running Tests
Run Tests Locally
[!WARNING]
Be cautious when creating test cases in journey/!
- Test cases that capture end-to-end/journey behavior are usually stored in pepr-excellent-examples or run as a GitHub workflow (.github/workflows).
- Journey tests established in journey/ are from an earlier time in project history.
Test a Local Development Version
- Run npm test and wait for completion.
- Change to the test module directory: cd pepr-test-module.
- You can now run any of the npx pepr commands.
Running Development Version Locally
- Run npm run build to build the package.
- For running modified pepr, you have two options:
  - Use npx ts-node ./src/cli.ts init to run the modified code directly, without installing it locally. You’ll also need to run npx link <your_dev_pepr_location> inside your pepr module to link to the development version of pepr.
  - Install the pre-built package with npm install pepr-0.0.0-development.tgz. You’ll need to re-run the installation after every build, though.
- Run npx pepr dev inside your module’s directory to run the modified version of pepr.
[!TIP]
Make sure to re-run npm run build after you modify any of the pepr source files.
For any questions or concerns, please open an issue on GitHub or contact the maintainers.
9 - Contributor Covenant Code of Conduct
Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
- The use of sexualized language or imagery, and sexual attention or advances of
any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others’ private information, such as a physical or email address,
without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting
Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
pepr-dev-private@googlegroups.com.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
1. Correction
Community Impact: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
Consequence: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
2. Warning
Community Impact: A violation through a single incident or series of
actions.
Consequence: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
3. Temporary Ban
Community Impact: A serious violation of community standards, including
sustained inappropriate behavior.
Consequence: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
4. Permanent Ban
Community Impact: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
Consequence: A permanent ban from any sort of public interaction within the
community.
Attribution
This Code of Conduct is adapted from the Contributor Covenant,
version 2.1, available at
https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
Community Impact Guidelines were inspired by
Mozilla’s code of conduct enforcement ladder.
For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.
10 - Security Policy
Reporting a Vulnerability
If you discover a security vulnerability in Pepr, please report it to us by sending an email to pepr@defenseunicorns.com or directly through the GitHub UI.
Please include the following details in your report:
- A clear description of the vulnerability
- Steps to reproduce the vulnerability
- Any additional information that may be helpful in understanding and fixing the issue
We appreciate your help in making Pepr more secure and will acknowledge your contribution in the remediation PR.
If you have any questions or concerns regarding the security of Pepr, please contact us at pepr@defenseunicorns.com.
11 - Support
Reporting Bugs
If you find a bug in Pepr, please report it by opening an issue in the Pepr GitHub repository. Please include as much information as possible in your bug report, including:
- The version of Pepr you are using
- The version of Kubernetes you are using
You can contact the Pepr team in the following ways: