---
title: "Deployment Lessons and My Take on Self-Hosting OpenClaw"
date: 2026-02-03
draft: false
---

Deploying autonomous agents like OpenClaw on a self-hosted Kubernetes cluster offers significantly more control and integration potential than cloud-hosted alternatives. However, moving from a standard SaaS model to running your own intelligence infrastructure introduces several deployment challenges.
Here are the practical lessons learned, organized by the layers of the agentic stack: Environment, Runtime, and Capabilities.
## Layer 1: The Environment – Breaking the Sandbox
To move beyond being a chatbot, an agent needs to be able to affect its world. Deep integration starts with networking.
Code execution agents often need to spin up temporary servers—for previews, React apps, or documentation sites. In a standard Kubernetes Pod, these dynamic ports (like 3000, 8080, etc.) are isolated inside the container network namespace.
To securely expose these arbitrary ports, I deployed a lightweight **Nginx sidecar** alongside the main OpenClaw agent. This avoids the complexity and latency of dynamically updating Ingress resources.
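
In Pod terms this is just a second container next to the agent, sharing its network namespace. A minimal sketch of the wiring, with hypothetical names for the Deployment, images, and the ConfigMap that holds the server block shown below:

```yaml
# Sketch only: names and images are placeholders, not the actual manifests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openclaw-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openclaw-agent
  template:
    metadata:
      labels:
        app: openclaw-agent
    spec:
      containers:
        - name: openclaw                 # the agent itself
          image: openclaw/agent:latest   # placeholder image
        - name: port-proxy               # lightweight Nginx sidecar
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: proxy-config
              mountPath: /etc/nginx/conf.d   # the server block below lands here
      volumes:
        - name: proxy-config
          configMap:
            name: openclaw-port-proxy        # hypothetical ConfigMap holding the Nginx config
```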
The Nginx configuration handling the routing logic:
```nginx
# Capture the target port from the subdomain and proxy to it inside the Pod.
# 127.0.0.1 is used instead of a hostname because nginx needs a resolver to
# look up names whenever proxy_pass contains variables.
server {
    listen 80;
    server_name ~^(?<port>\d+)\.agent\.mydomain\.com$;

    location / {
        proxy_pass http://127.0.0.1:$port;
        proxy_set_header Host $host;
    }
}
```
This configuration uses a regex-based server block to capture the port from the subdomain (e.g., `3000.agent.mydomain.com`) and proxies traffic to that port over the loopback interface. Since containers in the same Pod share a network namespace, loopback connectivity between the sidecar and the agent is seamless.
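
Because the port is encoded in the hostname, a single static wildcard Ingress is enough to get outside traffic to the sidecar; nothing needs to change per port. A sketch, assuming an NGINX ingress controller, a Service named `openclaw-agent` in front of the sidecar's port 80, and a wildcard TLS secret (all names hypothetical):

```yaml
# Sketch only: Service, secret, and class names are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openclaw-agent-wildcard
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - "*.agent.mydomain.com"
      secretName: agent-wildcard-tls       # wildcard certificate for *.agent.mydomain.com
  rules:
    - host: "*.agent.mydomain.com"          # every port subdomain routes to the same backend
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openclaw-agent        # Service fronting the Nginx sidecar
                port:
                  number: 80
```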
For this to work effectively, the agent must be aware of its environment. I updated OpenClaw's system prompts to understand this pattern: *"If you start a server on port X, the external URL is https://X.agent.mydomain.com"*. This allows the agent to provide valid, clickable links for its generated applications.
## Layer 2: The Runtime – Agility and Persistence
With external connectivity in place, the next challenge is agility.
Self-hosting often requires customizations that haven't yet been merged upstream. For example, I needed a custom OAuth flow for Google's internal APIs.
Instead of maintaining a forked Docker image, I used a Kubernetes `ConfigMap` to inject the necessary TypeScript plugin at runtime. The file is mounted directly into the container at `/app/extensions/google-antigravity-auth/index.ts`.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: openclaw-patch-antigravity
data:
  # Plugin source injected at deploy time; mounted into the container as
  # /app/extensions/google-antigravity-auth/index.ts
  index.ts: |
    import { createHash, randomBytes } from "node:crypto";
    // ... custom OAuth implementation ...
    export default antigravityPlugin;
```
This approach allows for rapid iteration on patches without rebuilding container images for every change.
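
For reference, wiring the ConfigMap into the container is a standard `subPath` mount; a sketch of the relevant Deployment fragment (container and volume names are placeholders):

```yaml
# Deployment fragment (sketch): overlays the single plugin file inside /app/extensions.
      containers:
        - name: openclaw
          volumeMounts:
            - name: antigravity-patch
              mountPath: /app/extensions/google-antigravity-auth/index.ts
              subPath: index.ts            # mount just this one file, not a directory
      volumes:
        - name: antigravity-patch
          configMap:
            name: openclaw-patch-antigravity
```

One caveat worth knowing: files mounted via `subPath` are not updated in place when the ConfigMap changes, so a pod restart is still needed to pick up a new version of the patch.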
However, two operational realities became clear during this process:
1. **Debugging is Standard**: When the agent fails (e.g., your custom patch throws an error), it behaves like any other application. Standard debugging tools like `kubectl logs` and `strace` remain the most effective way to diagnose issues.
2. **Persistent Storage Matches Tooling**: Just as code needs injection, tools need persistence. I had to explicitly mount a volume for Homebrew (`.linuxbrew`) so that tools installed by me or the agent didn't vanish on pod restart. Agents need long-term memory on their filesystem as much as in their context window.
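
For the Homebrew case, that persistence is just a PersistentVolumeClaim mounted over the brew prefix. A sketch with hypothetical name and size:

```yaml
# Sketch only: claim name and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: openclaw-linuxbrew
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

The claim is then mounted at the Homebrew prefix (typically `/home/linuxbrew/.linuxbrew`) in the agent container, so anything installed with `brew` survives a pod restart.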
## Layer 3: The Capabilities – Skills over Abstractions
With the infrastructure (Layer 1) and runtime (Layer 2) established, we move to the application logic: how the agent actually *does* work.
While the industry chases complex abstractions like the Model Context Protocol (MCP), I found that simple, text-based "Skills" offer a superior workflow. I recently created a Gitea skill simply by exposing the `tea` CLI documentation to the agent.
This approach aligns with the UNIX philosophy: small, simple tools that do one thing well. MCP servers often clutter the context window and impose significant development overhead. A well-structured "Skill"—essentially a localized knowledge base for a CLI—is cleaner and faster to implement. I predict that these lightweight Skills will eventually replace heavy MCP integrations for the majority of use cases.
There is one current limitation: Gemini models lack specific post-training for these custom skills, so the agent doesn't always intuitively know when to reach for a particular tool.

Also worth remembering: granting the agent access to CLI tools like `kubectl` or `tea` (the Gitea CLI) lets it perform operations directly, transforming it from a text generator into a system operator. My agent can now open Pull Requests on my self-hosted Gitea instance, effectively becoming a contributor to its own config repo.
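
If you do hand the agent `kubectl`, it is worth scoping what it can touch. A sketch of a namespaced Role bound to the agent's ServiceAccount (namespace, names, and verbs are illustrative, not my actual policy):

```yaml
# Sketch only: scope the agent to a single namespace with limited verbs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openclaw-operator
  namespace: openclaw
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "pods/log", "deployments", "configmaps"]
    verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openclaw-operator
  namespace: openclaw
subjects:
  - kind: ServiceAccount
    name: openclaw-agent
    namespace: openclaw
roleRef:
  kind: Role
  name: openclaw-operator
  apiGroup: rbac.authorization.k8s.io
```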
## The Payoff: Why This Complexity Matters
Why go through this trouble of sidecars, config patches, and custom skills?
My previous AI workflows relied on standard chatbots via interfaces like Open-WebUI. The friction in that model is the "all-or-nothing" generation. LLMs are stochastic; regenerating an entire file to change three lines is inefficient and risky.
The killer feature of OpenClaw and other agentic tools (such as Cursor or Antigravity) is **partial editing**. The ability to iteratively improve a stable codebase or document without regenerating the entire file is the missing link for AI-assisted development. We need to treat code as a living document, not a chat response.
Combined with Obsidian, which I already use as my second brain for persistent knowledge management, this model provides both the long-term memory and the granular control necessary for complex projects.
## References
1. **OpenClaw Documentation**: [https://docs.openclaw.org](https://docs.openclaw.org)
2. **Kubernetes Flux CD**: [https://fluxcd.io/](https://fluxcd.io/)
3. **Nginx Regex Server Names**: [https://nginx.org/en/docs/http/server_names.html](https://nginx.org/en/docs/http/server_names.html)