Flux Status and ntfy Notifications: The MCP Server Learns to Read GitOps
Why
The MCP server could already tell me about nodes, pods, workloads, certs, alerts, logs, and metrics. But it couldn't answer "is Flux actually reconciling everything?" or "did ntfy fire any notifications recently?" — which are arguably the two most important questions when you're about to pile new changes onto the cluster.
Flux is the entire deployment mechanism. If a Kustomization is stuck or a HelmRelease is failing, nothing I push is going to land. And ntfy is where all the cluster alerts end up. Not being able to query either of these through the MCP server felt like a gap worth closing.
Flux Status Tools
The CRD Zoo
Flux has a lot of CRDs spread across four API groups:
| API Group | Resources |
|---|---|
| kustomize.toolkit.fluxcd.io/v1 | Kustomization |
| helm.toolkit.fluxcd.io/v2 | HelmRelease |
| source.toolkit.fluxcd.io/v1 | GitRepository, HelmRepository, OCIRepository |
| image.toolkit.fluxcd.io/v1beta2 | ImageRepository, ImagePolicy, ImageUpdateAutomation |
None of these are in k8s-openapi, so they all go through kube-rs's dynamic API — same approach as the cert-manager tool from the previous round.
I split the Flux tools into three, because dumping all of this into one response would be overwhelming:
- get_flux_status: The reconciliation resources, Kustomizations and HelmReleases. This is the "is my stuff deploying?" tool. Returns ready state, last applied revision, source reference, and path.
- get_flux_sources: Where Flux pulls from: GitRepository, HelmRepository, OCIRepository. Returns sync state, URL, and latest artifact revision.
- get_flux_images: The image automation pipeline: ImageRepository (what tags exist?), ImagePolicy (which tag should we use?), ImageUpdateAutomation (did it push a commit?). Returns scan times, tag counts, latest image selections, and last push commits.
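The split above can be written down as a small lookup table. This is a std-only sketch, not the server's actual code: the `FluxResource` struct, the `resources_for_tool` function, and the constant names are hypothetical. The group/version pairs come from the table, the plurals are the standard lowercase resource names for these CRDs, and `list_path` builds the usual cluster-wide `/apis/<group>/<version>/<plural>` path that a dynamic client ends up requesting.

```rust
/// Coordinates of one Flux custom resource type (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FluxResource {
    pub group: &'static str,
    pub version: &'static str,
    pub kind: &'static str,
    pub plural: &'static str,
}

impl FluxResource {
    const fn new(
        group: &'static str,
        version: &'static str,
        kind: &'static str,
        plural: &'static str,
    ) -> Self {
        Self { group, version, kind, plural }
    }

    /// Cluster-wide list path on the Kubernetes API server.
    pub fn list_path(&self) -> String {
        format!("/apis/{}/{}/{}", self.group, self.version, self.plural)
    }
}

// Resource types read by each tool, mirroring the list above.
const STATUS_RESOURCES: &[FluxResource] = &[
    FluxResource::new("kustomize.toolkit.fluxcd.io", "v1", "Kustomization", "kustomizations"),
    FluxResource::new("helm.toolkit.fluxcd.io", "v2", "HelmRelease", "helmreleases"),
];

const SOURCE_RESOURCES: &[FluxResource] = &[
    FluxResource::new("source.toolkit.fluxcd.io", "v1", "GitRepository", "gitrepositories"),
    FluxResource::new("source.toolkit.fluxcd.io", "v1", "HelmRepository", "helmrepositories"),
    FluxResource::new("source.toolkit.fluxcd.io", "v1", "OCIRepository", "ocirepositories"),
];

const IMAGE_RESOURCES: &[FluxResource] = &[
    FluxResource::new("image.toolkit.fluxcd.io", "v1beta2", "ImageRepository", "imagerepositories"),
    FluxResource::new("image.toolkit.fluxcd.io", "v1beta2", "ImagePolicy", "imagepolicies"),
    FluxResource::new("image.toolkit.fluxcd.io", "v1beta2", "ImageUpdateAutomation", "imageupdateautomations"),
];

/// Returns the resource types a tool queries; unknown tool names map to an empty slice.
pub fn resources_for_tool(tool: &str) -> &'static [FluxResource] {
    match tool {
        "get_flux_status" => STATUS_RESOURCES,
        "get_flux_sources" => SOURCE_RESOURCES,
        "get_flux_images" => IMAGE_RESOURCES,
        _ => &[],
    }
}
```

With this table, `resources_for_tool("get_flux_status")` yields two entries, and the first one's `list_path()` is `/apis/kustomize.toolkit.fluxcd.io/v1/kustomizations`.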
Shared Status Extraction
Every Flux resource follows the same status convention — a conditions array with a Ready condition that has status, message, and lastTransitionTime. I pulled this into a shared helper:
    fn flux_status_summary(data: &serde_json::Value) -> serde_json::Value {
        let ready = data
            .pointer("/status/conditions")
            .and_then(|c| c.as_array())
            .and_then(|conditions| {
                conditions
                    .iter()
                    .find(|c| c.get("type").and_then(|t| t.as_str()) == Some("Ready"))
            });
        // ... extract status, message, lastTransitionTime
    }
The get_flux_status tool also calculates a not_ready_count across both Kustomizations and HelmReleases, so I can quickly see if anything's broken without reading through every resource.
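Here is one way the extraction and the not_ready_count could fit together, sketched with plain structs instead of `serde_json::Value` so it runs without extra crates. The `Condition` and `StatusSummary` types, the `condition` and `summarize` names, and the choice to count a resource with no Ready condition as not ready are all assumptions, not the server's actual code.

```rust
/// One entry of a resource's `status.conditions` array (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Condition {
    pub kind: String,   // the `type` field, e.g. "Ready"
    pub status: String, // "True", "False", or "Unknown"
    pub message: String,
    pub last_transition_time: String,
}

/// Convenience constructor used to keep examples short.
pub fn condition(kind: &str, status: &str, message: &str, time: &str) -> Condition {
    Condition {
        kind: kind.to_string(),
        status: status.to_string(),
        message: message.to_string(),
        last_transition_time: time.to_string(),
    }
}

/// Flattened view of the Ready condition (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct StatusSummary {
    pub ready: bool,
    pub message: String,
    pub last_transition_time: Option<String>,
}

/// Finds the Ready condition and flattens it.
/// Assumption: a resource with no Ready condition is reported as not ready.
pub fn summarize(conditions: &[Condition]) -> StatusSummary {
    match conditions.iter().find(|c| c.kind == "Ready") {
        Some(c) => StatusSummary {
            ready: c.status == "True",
            message: c.message.clone(),
            last_transition_time: Some(c.last_transition_time.clone()),
        },
        None => StatusSummary {
            ready: false,
            message: "no Ready condition reported".to_string(),
            last_transition_time: None,
        },
    }
}

/// Counts resources whose summary is not ready, given each resource's conditions.
pub fn not_ready_count(resources: &[Vec<Condition>]) -> usize {
    resources
        .iter()
        .filter(|conditions| !summarize(conditions).ready)
        .count()
}
```

Treating a missing Ready condition as not ready errs on the side of surfacing a resource that has not reconciled yet rather than hiding it.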
Concurrent API Calls
Each tool queries multiple CRD types, so I used tokio::join! to fire them in parallel:
    let lp = ListParams::default();
    let (ks_result, hr_result) = tokio::join!(
        ks_api.list(&lp),
        hr_api.list(&lp),
    );
One fun detail: you can't write ks_api.list(&ListParams::default()) inside tokio::join! — the temporary ListParams gets dropped before the future resolves. The borrow checker catches it, thankfully. Binding to let lp extends the lifetime.
ntfy Notifications
The Poll API
ntfy has a clever HTTP API for retrieving historical messages. Instead of holding a streaming connection open, you add ?poll=1 and the server returns all matching messages as newline-delimited JSON, then closes the connection:
GET http://ntfy.ntfy.svc:80/cluster-alerts/json?poll=1&since=24h
Each line is a separate JSON object. Most are "event": "message" but there are also keepalive events, so the tool filters to only actual messages.
The get_notifications tool accepts a topic name and an optional since parameter (defaults to 24h). It returns the title, message body, priority, tags, and Unix timestamp for each notification.
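The request URL and the message filter can be sketched in a few lines of std-only Rust. The function names `poll_url` and `message_lines` are hypothetical, and the filter is deliberately naive: it assumes the server emits compact JSON with `"event":"message"` written without spaces, which is an assumption about the wire format. A real implementation would parse each line with a JSON library and compare the `event` field instead.

```rust
/// Builds the poll URL for a topic; `since` falls back to "24h" when absent.
pub fn poll_url(base: &str, topic: &str, since: Option<&str>) -> String {
    let since = since.unwrap_or("24h");
    format!(
        "{}/{}/json?poll=1&since={}",
        base.trim_end_matches('/'),
        topic,
        since
    )
}

/// Keeps only the lines that look like actual messages.
/// Naive substring match, assuming compact JSON output (see the note above).
pub fn message_lines(body: &str) -> Vec<&str> {
    body.lines()
        .map(str::trim)
        .filter(|line| line.contains("\"event\":\"message\""))
        .collect()
}
```

Calling `poll_url("http://ntfy.ntfy.svc:80", "cluster-alerts", None)` reproduces the request shown above, including the 24h default.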
RBAC
The Flux tools needed new RBAC permissions since we're querying CRDs that weren't in the original ClusterRole. Added read access to all four Flux API groups:
    - apiGroups: ["kustomize.toolkit.fluxcd.io"]
      resources: [kustomizations]
      verbs: ["get", "list", "watch"]
    - apiGroups: ["helm.toolkit.fluxcd.io"]
      resources: [helmreleases]
      verbs: ["get", "list", "watch"]
    - apiGroups: ["source.toolkit.fluxcd.io"]
      resources: [gitrepositories, helmrepositories, ocirepositories]
      verbs: ["get", "list", "watch"]
    - apiGroups: ["image.toolkit.fluxcd.io"]
      resources: [imagerepositories, imagepolicies, imageupdateautomations]
      verbs: ["get", "list", "watch"]
Pushed RBAC to gitops before the MCP code, the same ordering trick as last time. Flux reconciles the RBAC in ~5 minutes and the MCP build takes ~9 minutes through the CI pipeline, so the permissions should already exist by the time the new pod starts trying to use them.
The Full Tool Set
The server now exposes 14 tools:
| Tool | Source | What it does |
|---|---|---|
| get_version | Static | Server version and build SHA |
| get_node_status | K8s API | Node readiness, CPU, memory, OS |
| get_pods | K8s API | Pod phase, restarts, node placement |
| get_namespaces | K8s API | Namespace listing |
| get_events | K8s API | Recent cluster events |
| get_workloads | K8s API | Deployment/StatefulSet/DaemonSet replicas |
| get_certificates | K8s API | cert-manager certificate status and expiry |
| get_alerts | Alertmanager | Active alerts with severity |
| query_logs | Loki | LogQL log search |
| query_metrics | Prometheus | PromQL metric queries |
| get_flux_status | K8s API | Kustomization and HelmRelease reconciliation |
| get_flux_sources | K8s API | GitRepository/HelmRepository/OCIRepository sync |
| get_flux_images | K8s API | Image automation pipeline status |
| get_notifications | ntfy | Recent notifications from any topic |
That's a pretty complete read-only view of the cluster. I can ask about infrastructure state, deployment pipeline status, observability data, and notification history — all through natural language via Claude Code. The only thing missing at this point is write operations, but for those we can just shell out to kubectl.