eric
2026-01-08 06:03:04 +00:00
parent 598c74df0a
commit 9c66ed1b1b
35 changed files with 95 additions and 54 deletions

@@ -1,6 +1,7 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts on Eric X. Liu's Personal Page</title><link>https://ericxliu.me/posts/</link><description>Recent content in Posts on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 03 Jan 2026 06:23:35 +0000</lastBuildDate><atom:link href="https://ericxliu.me/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Why Your "Resilient" Homelab is Slower Than a Raspberry Pi</title><link>https://ericxliu.me/posts/debugging-authentik-performance/</link><pubDate>Fri, 02 Jan 2026 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/debugging-authentik-performance/</guid><description>&lt;p&gt;In the world of self-hosting, there are many metrics for success: 99.9% uptime, sub-second latency, or a perfect GitOps pipeline. But for those of us running &amp;ldquo;production&amp;rdquo; at home, there is only one metric that truly matters: &lt;strong&gt;The Wife Acceptance Factor (WAF)&lt;/strong&gt;.&lt;/p&gt;
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts on Eric X. Liu's Personal Page</title><link>https://ericxliu.me/posts/</link><description>Recent content in Posts on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 08 Jan 2026 06:02:38 +0000</lastBuildDate><atom:link href="https://ericxliu.me/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Why I Downgraded Magisk to Root My Pixel 2 XL</title><link>https://ericxliu.me/posts/rooting-pixel-2-xl-for-reverse-engineering/</link><pubDate>Wed, 07 Jan 2026 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/rooting-pixel-2-xl-for-reverse-engineering/</guid><description>&lt;p&gt;For the past few weeks, I&amp;rsquo;ve been stuck in a stalemate with my EcoFlow Bluetooth Protocol Reverse Engineering Project. I have the hci snoop logs, I have the decompiled APK, and I have a strong suspicion about where the authentication logic is hiding. But suspicion isn&amp;rsquo;t proof.&lt;/p&gt;
+&lt;p&gt;Static analysis has its limits. I found the &amp;ldquo;smoking gun&amp;rdquo; function—a native method responsible for encrypting the login payload—but understanding &lt;em&gt;how&lt;/em&gt; it constructs that payload within a strict 13-byte limit purely from assembly (ARM64) was proving to be a headache.&lt;/p&gt;</description></item><item><title>Why Your "Resilient" Homelab is Slower Than a Raspberry Pi</title><link>https://ericxliu.me/posts/debugging-authentik-performance/</link><pubDate>Fri, 02 Jan 2026 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/debugging-authentik-performance/</guid><description>&lt;p&gt;In the world of self-hosting, there are many metrics for success: 99.9% uptime, sub-second latency, or a perfect GitOps pipeline. But for those of us running &amp;ldquo;production&amp;rdquo; at home, there is only one metric that truly matters: &lt;strong&gt;The Wife Acceptance Factor (WAF)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;My detailed Grafana dashboards said everything was fine. But my wife said the SSO login was &amp;ldquo;slow sometimes.&amp;rdquo; She was right. Debugging it took me down a rabbit hole of connection pooling, misplaced assumptions, and the harsh reality of running databases on distributed storage.&lt;/p&gt;</description></item><item><title>How I Got Open WebUI Talking to OpenAI Web Search</title><link>https://ericxliu.me/posts/open-webui-openai-websearch/</link><pubDate>Mon, 29 Dec 2025 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/open-webui-openai-websearch/</guid><description>&lt;p&gt;OpenAI promised native web search in GPT5, but LiteLLM proxy deployments (and by extension Open WebUI) still choke on it—issue &lt;a href="https://github.com/BerriAI/litellm/issues/13042" class="external-link" target="_blank" rel="noopener"&gt;#13042&lt;/a&gt; tracks the fallout. I needed grounded answers inside Open WebUI anyway, so I built a workaround: route GPT5 traffic through the Responses API and mask every &lt;code&gt;web_search_call&lt;/code&gt; before the UI ever sees it.&lt;/p&gt;
-&lt;p&gt;This post documents the final setup, the hotfix script that keeps LiteLLM honest, and the tests that prove Open WebUI now streams cited answers without trying to execute the tool itself.&lt;/p&gt;</description></item><item><title>From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM</title><link>https://ericxliu.me/posts/technical-deep-dive-llm-categorization/</link><pubDate>Sat, 27 Dec 2025 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/technical-deep-dive-llm-categorization/</guid><description>&lt;p&gt;Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and &amp;ldquo;wait, was this dinner or &lt;em&gt;vacation&lt;/em&gt; dinner?&amp;rdquo; questions.&lt;/p&gt;
+&lt;p&gt;This post documents the final setup, the hotfix script that keeps LiteLLM honest, and the tests that prove Open WebUI now streams cited answers without trying to execute the tool itself.&lt;/p&gt;</description></item><item><title>From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM</title><link>https://ericxliu.me/posts/technical-deep-dive-llm-categorization/</link><pubDate>Sat, 27 Dec 2025 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/technical-deep-dive-llm-categorization/</guid><description>&lt;p&gt;Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and &amp;ldquo;wait, was this dinner or &lt;em&gt;vacation&lt;/em&gt; dinner?&amp;rdquo; questions.&lt;/p&gt;
&lt;p&gt;For years, I relied on a rule-based system to categorize our credit card transactions. It worked&amp;hellip; mostly. But maintaining &lt;code&gt;if &amp;quot;UBER&amp;quot; in description and amount &amp;gt; 50&lt;/code&gt; style rules is a never-ending battle against the entropy of merchant names and changing habits.&lt;/p&gt;</description></item><item><title>The Convergence of Fast Weights, Linear Attention, and State Space Models</title><link>https://ericxliu.me/posts/the-convergence-of-fast-weights-linear-attention-and-state-space-models/</link><pubDate>Fri, 19 Dec 2025 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/the-convergence-of-fast-weights-linear-attention-and-state-space-models/</guid><description>&lt;p&gt;Modern Large Language Models (LLMs) are dominated by the Transformer architecture. However, as context windows grow, the computational cost of the Transformer&amp;rsquo;s attention mechanism has become a primary bottleneck. Recent discussions in the AI community—most notably by Geoffrey Hinton—have highlighted a theoretical link between biological memory mechanisms (&amp;ldquo;Fast Weights&amp;rdquo;) and efficient engineering solutions like Linear Transformers and State Space Models (SSMs).&lt;/p&gt;
&lt;p&gt;This article explores the mathematical equivalence between Hinton&amp;rsquo;s concept of Fast Weights as Associative Memory and the recurrence mechanisms found in models such as Mamba and RWKV.&lt;/p&gt;</description></item><item><title>vAttention</title><link>https://ericxliu.me/posts/vattention/</link><pubDate>Mon, 08 Dec 2025 00:00:00 +0000</pubDate><guid>https://ericxliu.me/posts/vattention/</guid><description>&lt;p&gt;Large Language Model (LLM) inference is memory-bound, primarily due to the Key-Value (KV) cache—a store of intermediate state that grows linearly with sequence length. Efficient management of this memory is critical for throughput. While &lt;strong&gt;PagedAttention&lt;/strong&gt; (popularized by vLLM) became the industry standard by solving memory fragmentation via software, recent research suggests that leveraging the GPU&amp;rsquo;s native hardware Memory Management Unit (MMU) offers a more performant and portable solution.&lt;/p&gt;
&lt;h4 id="the-status-quo-pagedattention-and-software-tables"&gt;