diff --git a/content/posts/blog-draft.md b/content/posts/blog-draft.md
new file mode 100644
index 0000000..3a1f2e0
--- /dev/null
+++ b/content/posts/blog-draft.md
@@ -0,0 +1,88 @@
+---
+title: "Deployment Lessons and My Take on Self-Hosting OpenClaw"
+date: 2026-02-03
+draft: false
+---
+
+Deploying autonomous agents like OpenClaw on a self-hosted Kubernetes cluster offers significantly more control and integration potential than cloud-hosted alternatives. However, moving from a standard SaaS model to running your own intelligence infrastructure introduces several deployment challenges.
+
+Here are the practical lessons learned, organized by the layers of the agentic stack: Environment, Runtime, and Capabilities.
+
+## Layer 1: The Environment – Breaking the Sandbox
+
+To move beyond being a chatbot, an agent needs to be able to affect its world. Deep integration starts with networking.
+
+Code-execution agents often need to spin up temporary servers for previews, React apps, or documentation sites. In a standard Kubernetes Pod, these dynamic ports (3000, 8080, and so on) are isolated inside the container network namespace.
+
+To securely expose these arbitrary ports, I deployed a lightweight **Nginx sidecar** alongside the main OpenClaw agent. This avoids the complexity and latency of dynamically updating Ingress resources.
+
+The Nginx configuration handling the routing logic:
+
+```nginx
+server {
+    listen 80;
+    # Capture the target port from the subdomain into $port
+    server_name ~^(?<port>\d+)\.agent\.mydomain\.com$;
+
+    location / {
+        # Use an IP literal: with a variable in proxy_pass, a hostname
+        # would require a resolver for runtime DNS lookups
+        proxy_pass http://127.0.0.1:$port;
+        proxy_set_header Host $host;
+    }
+}
+```
+
+This configuration uses a regex-based server block to capture the port from the subdomain (e.g., `3000.agent.mydomain.com`) and proxies traffic to that port on the Pod's loopback interface. Since containers in the same Pod share a network namespace, `localhost` connectivity between the sidecar and the agent is seamless.
+
+For this to work effectively, the agent must be aware of its environment. I updated OpenClaw's system prompts to describe the pattern: *"If you start a server on port X, the external URL is https://X.agent.mydomain.com"*. This allows the agent to provide valid, clickable links for its generated applications.
+
+## Layer 2: The Runtime – Agility and Persistence
+
+Once the agent has external connectivity, the next challenge is agility. Self-hosting often requires customizations that haven't yet been merged upstream; for example, I needed a custom OAuth flow for Google's internal APIs.
+
+Instead of maintaining a forked Docker image, I used a Kubernetes `ConfigMap` to inject the necessary TypeScript plugin at runtime. The file is mounted directly into the container at `/app/extensions/google-antigravity-auth/index.ts`.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: openclaw-patch-antigravity
+data:
+  index.ts: |
+    import { createHash, randomBytes } from "node:crypto";
+    // ... custom OAuth implementation ...
+    export default antigravityPlugin;
+```
+
+This approach allows for rapid iteration on patches without rebuilding container images for every change.
+
+However, two operational realities became clear during this process:
+1. **Debugging is Standard**: When the agent fails (e.g., your custom patch throws an error), it behaves like any other application. Standard debugging tools like `kubectl logs` and `strace` remain the most effective way to diagnose issues.
+2. **Tools Need Persistent Storage**: Just as code needs injection, tools need persistence. I had to explicitly mount a volume for Homebrew (`.linuxbrew`) so that tools installed by me or the agent didn't vanish on Pod restart (the wiring is sketched below). Agents need long-term memory on their filesystem as much as in their context window.
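+
+To make this concrete, here is a minimal sketch of how these pieces might hang together in a single Deployment: the Nginx sidecar from Layer 1, the `openclaw-patch-antigravity` ConfigMap, and a persistent volume for Homebrew. Apart from that ConfigMap name and the extension path above, every name here (the Deployment, the images, the `agent-proxy-conf` ConfigMap, the PVC, and the `/home/linuxbrew/.linuxbrew` prefix) is illustrative rather than taken from OpenClaw's own manifests.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: openclaw                             # illustrative name
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: openclaw
+  template:
+    metadata:
+      labels:
+        app: openclaw
+    spec:
+      containers:
+        - name: agent
+          image: openclaw/openclaw:latest    # placeholder image
+          volumeMounts:
+            # Layer 2: overlay the patched plugin as a single file; subPath
+            # avoids shadowing the rest of /app/extensions
+            - name: antigravity-patch
+              mountPath: /app/extensions/google-antigravity-auth/index.ts
+              subPath: index.ts
+            # Layer 2: keep Homebrew installs across Pod restarts
+            - name: linuxbrew
+              mountPath: /home/linuxbrew/.linuxbrew
+        # Layer 1: the port-forwarding sidecar shares localhost with the agent
+        - name: proxy
+          image: nginx:1.27
+          ports:
+            - containerPort: 80
+          volumeMounts:
+            - name: proxy-conf
+              mountPath: /etc/nginx/conf.d
+      volumes:
+        - name: antigravity-patch
+          configMap:
+            name: openclaw-patch-antigravity
+        - name: proxy-conf
+          configMap:
+            name: agent-proxy-conf           # holds the server block shown earlier
+        - name: linuxbrew
+          persistentVolumeClaim:
+            claimName: openclaw-linuxbrew    # assumed pre-existing PVC
+```
+
+One caveat: files mounted via `subPath` are not refreshed in a running Pod when the ConfigMap changes, so picking up a new patch requires a rollout, which is usually what you want for a code change anyway.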
+
+## Layer 3: The Capabilities – Skills over Abstractions
+
+With the infrastructure (Layer 1) and runtime (Layer 2) established, we move to the application logic: how the agent actually *does* work.
+
+While the industry chases complex abstractions like the Model Context Protocol (MCP), I found that simple, text-based "Skills" offer a superior workflow. I recently created a Gitea skill simply by exposing the `tea` CLI documentation to the agent.
+
+This approach aligns with the UNIX philosophy: small, simple tools that do one thing well. MCP servers often clutter the context window and impose significant development overhead. A well-structured "Skill" (essentially a localized knowledge base for a CLI) is cleaner and faster to implement. I predict that these lightweight Skills will eventually replace heavy MCP integrations for the majority of use cases.
+
+There is one current limitation: Gemini models lack specific post-training for these custom skills, so the agent doesn't always intuitively know when to reach for a specific tool. Keep in mind, too, that granting the agent access to CLI tools like `kubectl` or `tea` (the Gitea CLI) lets it perform operations directly, transforming it from a text generator into a system operator. My agent can now open Pull Requests on my self-hosted Gitea instance, effectively becoming a contributor to its own config repo.
+
+## The Payoff: Why This Complexity Matters
+
+Why go through the trouble of sidecars, config patches, and custom skills?
+
+My previous AI workflows relied on standard chatbots via interfaces like Open-WebUI. The friction in that model is the "all-or-nothing" generation: LLMs are stochastic, and regenerating an entire file to change three lines is inefficient and risky.
+
+The killer feature of OpenClaw and other agentic tools (such as Cursor or Antigravity) is **partial editing**. The ability to iteratively improve a stable codebase or document without regenerating the entire file is the missing link for AI-assisted development. We need to treat code as a living document, not a chat response.
+
+Combined with tools like Obsidian, which I already use as my second brain for persistent knowledge management, this model provides both the long-term memory and the granular control necessary for complex projects.
+
+## References
+
+1. **OpenClaw Documentation**: [https://docs.openclaw.org](https://docs.openclaw.org)
+2. **Kubernetes Flux CD**: [https://fluxcd.io/](https://fluxcd.io/)
+3. **Nginx Regex Server Names**: [https://nginx.org/en/docs/http/server_names.html](https://nginx.org/en/docs/http/server_names.html)