deploy: 3b1396d814
@@ -1,9 +1,9 @@
<!doctype html><html lang=en><head><title>From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM · Eric X. Liu's Personal Page</title><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><meta name=color-scheme content="light dark"><meta http-equiv=Content-Security-Policy content="upgrade-insecure-requests; block-all-mixed-content; default-src 'self'; child-src 'self'; font-src 'self' https://fonts.gstatic.com https://cdn.jsdelivr.net/; form-action 'self'; frame-src 'self' https://www.youtube.com https://disqus.com; img-src 'self' https://referrer.disqus.com https://c.disquscdn.com https://*.disqus.com; object-src 'none'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com/ https://cdn.jsdelivr.net/; script-src 'self' 'unsafe-inline' https://www.google-analytics.com https://cdn.jsdelivr.net/ https://pagead2.googlesyndication.com https://static.cloudflareinsights.com https://unpkg.com https://ericxliu-me.disqus.com https://disqus.com https://*.disqus.com https://*.disquscdn.com https://unpkg.com; connect-src 'self' https://www.google-analytics.com https://pagead2.googlesyndication.com https://cloudflareinsights.com ws://localhost:1313 ws://localhost:* wss://localhost:* https://links.services.disqus.com https://*.disqus.com;"><meta name=author content="Eric X. Liu"><meta name=description content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta name=keywords content="software engineer,performance engineering,Google engineer,tech blog,software development,performance optimization,Eric Liu,engineering blog,mountain biking,Jeep enthusiast,overlanding,camping,outdoor adventures"><meta name=twitter:card content="summary"><meta name=twitter:title content="From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM"><meta name=twitter:description content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta property="og:url" content="https://ericxliu.me/posts/technical-deep-dive-llm-categorization/"><meta property="og:site_name" content="Eric X. Liu's Personal Page"><meta property="og:title" content="From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM"><meta property="og:description" content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta property="og:locale" content="en"><meta property="og:type" content="article"><meta property="article:section" content="posts"><meta property="article:published_time" content="2025-12-27T00:00:00+00:00"><meta property="article:modified_time" content="2025-12-27T22:05:12+00:00"><link rel=preload href=/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin><link rel=canonical href=https://ericxliu.me/posts/technical-deep-dive-llm-categorization/><link rel=preload href=/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-regular-400.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin><link rel=stylesheet href=/css/coder.min.4b392a85107b91dbdabc528edf014a6ab1a30cd44cafcd5325c8efe796794fca.css integrity="sha256-SzkqhRB7kdvavFKO3wFKarGjDNRMr81TJcjv55Z5T8o=" crossorigin=anonymous media=screen><link rel=stylesheet href=/css/coder-dark.min.a00e6364bacbc8266ad1cc81230774a1397198f8cfb7bcba29b7d6fcb54ce57f.css integrity="sha256-oA5jZLrLyCZq0cyBIwd0oTlxmPjPt7y6KbfW/LVM5X8=" crossorigin=anonymous media=screen><link rel=icon type=image/svg+xml href=/images/favicon.svg sizes=any><link rel=icon type=image/png href=/images/favicon-32x32.png sizes=32x32><link rel=icon type=image/png href=/images/favicon-16x16.png sizes=16x16><link rel=apple-touch-icon href=/images/apple-touch-icon.png><link rel=apple-touch-icon sizes=180x180 href=/images/apple-touch-icon.png><link rel=manifest href=/site.webmanifest><link rel=mask-icon href=/images/safari-pinned-tab.svg color=#5bbad5><script async 
src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-3972604619956476" crossorigin=anonymous></script><script type=application/ld+json>{"@context":"http://schema.org","@type":"Person","name":"Eric X. Liu","url":"https:\/\/ericxliu.me\/","description":"Software \u0026 Performance Engineer at Google","sameAs":["https:\/\/www.linkedin.com\/in\/eric-x-liu-46648b93\/","https:\/\/git.ericxliu.me\/eric"]}</script><script type=application/ld+json>{"@context":"http://schema.org","@type":"BlogPosting","headline":"From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM","genre":"Blog","wordcount":"1355","url":"https:\/\/ericxliu.me\/posts\/technical-deep-dive-llm-categorization\/","datePublished":"2025-12-27T00:00:00\u002b00:00","dateModified":"2025-12-27T22:05:12\u002b00:00","description":"\u003cp\u003eRunning a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and \u0026ldquo;wait, was this dinner or \u003cem\u003evacation\u003c\/em\u003e dinner?\u0026rdquo; questions.\u003c\/p\u003e\n\u003cp\u003eFor years, I relied on a rule-based system to categorize our credit card transactions. It worked\u0026hellip; mostly. But maintaining \u003ccode\u003eif \u0026quot;UBER\u0026quot; in description and amount \u0026gt; 50\u003c\/code\u003e style rules is a never-ending battle against the entropy of merchant names and changing habits.\u003c\/p\u003e","author":{"@type":"Person","name":"Eric X. Liu"}}</script></head><body class="preload-transitions colorscheme-auto"><div class=float-container><a id=dark-mode-toggle class=colorscheme-toggle><i class="fa-solid fa-adjust fa-fw" aria-hidden=true></i></a></div><main class=wrapper><nav class=navigation><section class=container><a class=navigation-title href=https://ericxliu.me/>Eric X. Liu's Personal Page
<!doctype html><html lang=en><head><title>From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM · Eric X. Liu's Personal Page</title><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><meta name=color-scheme content="light dark"><meta http-equiv=Content-Security-Policy content="upgrade-insecure-requests; block-all-mixed-content; default-src 'self'; child-src 'self'; font-src 'self' https://fonts.gstatic.com https://cdn.jsdelivr.net/; form-action 'self'; frame-src 'self' https://www.youtube.com https://disqus.com; img-src 'self' https://referrer.disqus.com https://c.disquscdn.com https://*.disqus.com; object-src 'none'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com/ https://cdn.jsdelivr.net/; script-src 'self' 'unsafe-inline' https://www.google-analytics.com https://cdn.jsdelivr.net/ https://pagead2.googlesyndication.com https://static.cloudflareinsights.com https://unpkg.com https://ericxliu-me.disqus.com https://disqus.com https://*.disqus.com https://*.disquscdn.com https://unpkg.com; connect-src 'self' https://www.google-analytics.com https://pagead2.googlesyndication.com https://cloudflareinsights.com ws://localhost:1313 ws://localhost:* wss://localhost:* https://links.services.disqus.com https://*.disqus.com;"><meta name=author content="Eric X. Liu"><meta name=description content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta name=keywords content="software engineer,performance engineering,Google engineer,tech blog,software development,performance optimization,Eric Liu,engineering blog,mountain biking,Jeep enthusiast,overlanding,camping,outdoor adventures"><meta name=twitter:card content="summary"><meta name=twitter:title content="From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM"><meta name=twitter:description content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta property="og:url" content="https://ericxliu.me/posts/technical-deep-dive-llm-categorization/"><meta property="og:site_name" content="Eric X. Liu's Personal Page"><meta property="og:title" content="From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM"><meta property="og:description" content='Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or vacation dinner?” questions.
For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining if "UBER" in description and amount > 50 style rules is a never-ending battle against the entropy of merchant names and changing habits.'><meta property="og:locale" content="en"><meta property="og:type" content="article"><meta property="article:section" content="posts"><meta property="article:published_time" content="2025-12-27T00:00:00+00:00"><meta property="article:modified_time" content="2026-01-03T06:57:12+00:00"><link rel=preload href=/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin><link rel=canonical href=https://ericxliu.me/posts/technical-deep-dive-llm-categorization/><link rel=preload href=/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-regular-400.woff2 as=font type=font/woff2 crossorigin><link rel=preload href=/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin><link rel=stylesheet href=/css/coder.min.4b392a85107b91dbdabc528edf014a6ab1a30cd44cafcd5325c8efe796794fca.css integrity="sha256-SzkqhRB7kdvavFKO3wFKarGjDNRMr81TJcjv55Z5T8o=" crossorigin=anonymous media=screen><link rel=stylesheet href=/css/coder-dark.min.a00e6364bacbc8266ad1cc81230774a1397198f8cfb7bcba29b7d6fcb54ce57f.css integrity="sha256-oA5jZLrLyCZq0cyBIwd0oTlxmPjPt7y6KbfW/LVM5X8=" crossorigin=anonymous media=screen><link rel=icon type=image/svg+xml href=/images/favicon.svg sizes=any><link rel=icon type=image/png href=/images/favicon-32x32.png sizes=32x32><link rel=icon type=image/png href=/images/favicon-16x16.png sizes=16x16><link rel=apple-touch-icon href=/images/apple-touch-icon.png><link rel=apple-touch-icon sizes=180x180 href=/images/apple-touch-icon.png><link rel=manifest href=/site.webmanifest><link rel=mask-icon href=/images/safari-pinned-tab.svg color=#5bbad5><script async 
src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-3972604619956476" crossorigin=anonymous></script><script type=application/ld+json>{"@context":"http://schema.org","@type":"Person","name":"Eric X. Liu","url":"https:\/\/ericxliu.me\/","description":"Software \u0026 Performance Engineer at Google","sameAs":["https:\/\/www.linkedin.com\/in\/eric-x-liu-46648b93\/","https:\/\/git.ericxliu.me\/eric"]}</script><script type=application/ld+json>{"@context":"http://schema.org","@type":"BlogPosting","headline":"From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM","genre":"Blog","wordcount":"1355","url":"https:\/\/ericxliu.me\/posts\/technical-deep-dive-llm-categorization\/","datePublished":"2025-12-27T00:00:00\u002b00:00","dateModified":"2026-01-03T06:57:12\u002b00:00","description":"\u003cp\u003eRunning a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and \u0026ldquo;wait, was this dinner or \u003cem\u003evacation\u003c\/em\u003e dinner?\u0026rdquo; questions.\u003c\/p\u003e\n\u003cp\u003eFor years, I relied on a rule-based system to categorize our credit card transactions. It worked\u0026hellip; mostly. But maintaining \u003ccode\u003eif \u0026quot;UBER\u0026quot; in description and amount \u0026gt; 50\u003c\/code\u003e style rules is a never-ending battle against the entropy of merchant names and changing habits.\u003c\/p\u003e","author":{"@type":"Person","name":"Eric X. Liu"}}</script></head><body class="preload-transitions colorscheme-auto"><div class=float-container><a id=dark-mode-toggle class=colorscheme-toggle><i class="fa-solid fa-adjust fa-fw" aria-hidden=true></i></a></div><main class=wrapper><nav class=navigation><section class=container><a class=navigation-title href=https://ericxliu.me/>Eric X. Liu's Personal Page
</a><input type=checkbox id=menu-toggle>
<label class="menu-button float-right" for=menu-toggle><i class="fa-solid fa-bars fa-fw" aria-hidden=true></i></label><ul class=navigation-list><li class=navigation-item><a class=navigation-link href=/posts/>Posts</a></li><li class=navigation-item><a class=navigation-link href=https://chat.ericxliu.me>Chat</a></li><li class=navigation-item><a class=navigation-link href=https://git.ericxliu.me/user/oauth2/Authenitk>Git</a></li><li class=navigation-item><a class=navigation-link href=https://coder.ericxliu.me/api/v2/users/oidc/callback>Coder</a></li><li class=navigation-item><a class=navigation-link href=/about/>About</a></li><li class=navigation-item><a class=navigation-link href=/>|</a></li><li class=navigation-item><a class=navigation-link href=https://sso.ericxliu.me>Sign in</a></li></ul></section></nav><div class=content><section class="container post"><article><header><div class=post-title><h1 class=title><a class=title-link href=https://ericxliu.me/posts/technical-deep-dive-llm-categorization/>From Gemini-3-Flash to T5-Gemma-2 A Journey in Distilling a Family Finance LLM</a></h1></div><div class=post-meta><div class=date><span class=posted-on><i class="fa-solid fa-calendar" aria-hidden=true></i>
<label class="menu-button float-right" for=menu-toggle><i class="fa-solid fa-bars fa-fw" aria-hidden=true></i></label><ul class=navigation-list><li class=navigation-item><a class=navigation-link href=/posts/>Posts</a></li><li class=navigation-item><a class=navigation-link href=https://chat.ericxliu.me>Chat</a></li><li class=navigation-item><a class=navigation-link href=https://git.ericxliu.me/user/oauth2/Authenitk>Git</a></li><li class=navigation-item><a class=navigation-link href=https://coder.ericxliu.me/api/v2/users/oidc/callback>Coder</a></li><li class=navigation-item><a class=navigation-link href=/about/>About</a></li><li class=navigation-item><a class=navigation-link href=/>|</a></li><li class=navigation-item><a class=navigation-link href=https://sso.ericxliu.me>Sign in</a></li></ul></section></nav><div class=content><section class="container post"><article><header><div class=post-title><h1 class=title><a class=title-link href=https://ericxliu.me/posts/technical-deep-dive-llm-categorization/>From Gemini-3-Flash to T5-Gemma-2: A Journey in Distilling a Family Finance LLM</a></h1></div><div class=post-meta><div class=date><span class=posted-on><i class="fa-solid fa-calendar" aria-hidden=true></i>
<time datetime=2025-12-27T00:00:00Z>December 27, 2025
</time></span><span class=reading-time><i class="fa-solid fa-clock" aria-hidden=true></i>
7-minute read</span></div></div></header><div class=post-content><p>Running a family finance system is surprisingly complex. What starts as a simple spreadsheet often evolves into a web of rules, exceptions, and “wait, was this dinner or <em>vacation</em> dinner?” questions.</p><p>For years, I relied on a rule-based system to categorize our credit card transactions. It worked… mostly. But maintaining <code>if "UBER" in description and amount > 50</code> style rules is a never-ending battle against the entropy of merchant names and changing habits.</p><p>Recently, I decided to modernize this stack using Large Language Models (LLMs). This post details the technical journey from using an off-the-shelf commercial model to distilling that knowledge into a small, efficient local model (<code>google/t5gemma-2-270m</code>) that runs on my own hardware while maintaining high accuracy.</p><h2 id=phase-1-the-proof-of-concept-with-commercial-llms>Phase 1: The Proof of Concept with Commercial LLMs
@@ -73,4 +73,4 @@ It turned out to be a syntax error in my arguments passed to the <code>Trainer</
2016 -
2026
Eric X. Liu
<a href="https://git.ericxliu.me/eric/ericxliu-me/commit/89dc118">[89dc118]</a></section></footer></main><script src=/js/coder.min.6ae284be93d2d19dad1f02b0039508d9aab3180a12a06dcc71b0b0ef7825a317.js integrity="sha256-auKEvpPS0Z2tHwKwA5UI2aqzGAoSoG3McbCw73gloxc="></script><script defer src=https://static.cloudflareinsights.com/beacon.min.js data-cf-beacon='{"token": "987638e636ce4dbb932d038af74c17d1"}'></script></body></html>
<a href="https://git.ericxliu.me/eric/ericxliu-me/commit/3b1396d">[3b1396d]</a></section></footer></main><script src=/js/coder.min.6ae284be93d2d19dad1f02b0039508d9aab3180a12a06dcc71b0b0ef7825a317.js integrity="sha256-auKEvpPS0Z2tHwKwA5UI2aqzGAoSoG3McbCw73gloxc="></script><script defer src=https://static.cloudflareinsights.com/beacon.min.js data-cf-beacon='{"token": "987638e636ce4dbb932d038af74c17d1"}'></script></body></html>