deploy: 69c2890aad
This commit is contained in:
20
index.xml
20
index.xml
@@ -1,12 +1,4 @@
|
||||
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Eric X. Liu's Personal Page</title><link>/</link><description>Recent content on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 20 Aug 2025 06:04:36 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml"/><item><title>Quantization in LLMs</title><link>/posts/quantization-in-llms/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/quantization-in-llms/</guid><description><p>The burgeoning scale of Large Language Models (LLMs) has necessitated a paradigm shift in their deployment, moving beyond full-precision floating-point arithmetic towards lower-precision representations. Quantization, the process of mapping a wide range of continuous values to a smaller, discrete set, has emerged as a critical technique to reduce model size, accelerate inference, and lower energy consumption. This article provides a technical overview of quantization theories, their application in modern LLMs, and highlights the ongoing innovations in this domain.</p></description></item><item><title>Transformer's Core Mechanics</title><link>/posts/transformer-s-core-mechanics/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/transformer-s-core-mechanics/</guid><description><p>The Transformer architecture is the bedrock of modern Large Language Models (LLMs). While its high-level success is widely known, a deeper understanding requires dissecting its core components. This article provides a detailed, technical breakdown of the fundamental concepts within a Transformer block, from the notion of &ldquo;channels&rdquo; to the intricate workings of the attention mechanism and its relationship with other advanced architectures like Mixture of Experts.</p>
|
||||
<h3 id="1-the-channel-a-foundational-view-of-d_model">
|
||||
1. The &ldquo;Channel&rdquo;: A Foundational View of <code>d_model</code>
|
||||
<a class="heading-link" href="#1-the-channel-a-foundational-view-of-d_model">
|
||||
<i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i>
|
||||
<span class="sr-only">Link to heading</span>
|
||||
</a>
|
||||
</h3>
|
||||
<p>In deep learning, a &ldquo;channel&rdquo; can be thought of as a feature dimension. While this term is common in Convolutional Neural Networks for images (e.g., Red, Green, Blue channels), in LLMs, the analogous concept is the model&rsquo;s primary embedding dimension, commonly referred to as <code>d_model</code>.</p></description></item><item><title>Breville Barista Pro Maintenance</title><link>/posts/breville-barista-pro-maintenance/</link><pubDate>Sat, 16 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/breville-barista-pro-maintenance/</guid><description><p>Proper maintenance is critical for the longevity and performance of a Breville Barista Pro espresso machine. Consistent cleaning not only ensures the machine functions correctly but also directly impacts the quality of the espresso produced. This guide provides a detailed, technical breakdown of the essential maintenance routines, from automated cycles to daily upkeep.</p>
|
||||
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Eric X. Liu's Personal Page</title><link>/</link><description>Recent content on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 20 Aug 2025 06:28:39 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml"/><item><title>Quantization in LLMs</title><link>/posts/quantization-in-llms/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/quantization-in-llms/</guid><description><p>The burgeoning scale of Large Language Models (LLMs) has necessitated a paradigm shift in their deployment, moving beyond full-precision floating-point arithmetic towards lower-precision representations. Quantization, the process of mapping a wide range of continuous values to a smaller, discrete set, has emerged as a critical technique to reduce model size, accelerate inference, and lower energy consumption. This article provides a technical overview of quantization theories, their application in modern LLMs, and highlights the ongoing innovations in this domain.</p></description></item><item><title>Breville Barista Pro Maintenance</title><link>/posts/breville-barista-pro-maintenance/</link><pubDate>Sat, 16 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/breville-barista-pro-maintenance/</guid><description><p>Proper maintenance is critical for the longevity and performance of a Breville Barista Pro espresso machine. Consistent cleaning not only ensures the machine functions correctly but also directly impacts the quality of the espresso produced. This guide provides a detailed, technical breakdown of the essential maintenance routines, from automated cycles to daily upkeep.</p>
|
||||
<h4 id="understanding-the-two-primary-maintenance-cycles">
|
||||
<strong>Understanding the Two Primary Maintenance Cycles</strong>
|
||||
<a class="heading-link" href="#understanding-the-two-primary-maintenance-cycles">
|
||||
@@ -33,6 +25,14 @@
|
||||
<p><strong>The Problem:</strong>
|
||||
Many routing mechanisms, especially &ldquo;Top-K routing,&rdquo; involve a discrete, hard selection process. A common function is <code>KeepTopK(v, k)</code>, which selects the top <code>k</code> scoring elements from a vector <code>v</code> and sets others to $-\infty$ or $0$.</p></description></item><item><title>An Architectural Deep Dive of T5</title><link>/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/</link><pubDate>Sun, 01 Jun 2025 00:00:00 +0000</pubDate><guid>/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/</guid><description><p>In the rapidly evolving landscape of Large Language Models, a few key architectures define the dominant paradigms. Today, the &ldquo;decoder-only&rdquo; model, popularized by the GPT series and its successors like LLaMA and Mistral, reigns supreme. These models are scaled to incredible sizes and excel at in-context learning.</p>
|
||||
<p>But to truly understand the field, we must look at the pivotal models that explored different paths. Google&rsquo;s T5, or <strong>Text-to-Text Transfer Transformer</strong>, stands out as one of the most influential. It didn&rsquo;t just introduce a new model; it proposed a new philosophy. This article dives deep into the architecture of T5, how it fundamentally differs from modern LLMs, and the lasting legacy of its unique design choices.</p></description></item><item><title>Mastering Your Breville Barista Pro: The Ultimate Guide to Dialing In Espresso</title><link>/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/</link><pubDate>Thu, 01 May 2025 00:00:00 +0000</pubDate><guid>/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/</guid><description><p>Are you ready to transform your home espresso game from good to genuinely great? The Breville Barista Pro is a fantastic machine, but unlocking its full potential requires understanding a few key principles. This guide will walk you through the systematic process of dialing in your espresso, ensuring every shot is delicious and repeatable.</p>
|
||||
<p>Our overarching philosophy is simple: <strong>isolate and change only one variable at a time.</strong> While numbers are crucial, your palate is the ultimate judge. Dose, ratio, and time are interconnected, but your <strong>grind size</strong> is your most powerful lever.</p></description></item><item><title>Some useful files</title><link>/posts/useful/</link><pubDate>Mon, 26 Oct 2020 04:14:43 +0000</pubDate><guid>/posts/useful/</guid><description><ul>
|
||||
<p>Our overarching philosophy is simple: <strong>isolate and change only one variable at a time.</strong> While numbers are crucial, your palate is the ultimate judge. Dose, ratio, and time are interconnected, but your <strong>grind size</strong> is your most powerful lever.</p></description></item><item><title>Transformer's Core Mechanics</title><link>/posts/transformer-s-core-mechanics/</link><pubDate>Tue, 01 Apr 2025 00:00:00 +0000</pubDate><guid>/posts/transformer-s-core-mechanics/</guid><description><p>The Transformer architecture is the bedrock of modern Large Language Models (LLMs). While its high-level success is widely known, a deeper understanding requires dissecting its core components. This article provides a detailed, technical breakdown of the fundamental concepts within a Transformer block, from the notion of &ldquo;channels&rdquo; to the intricate workings of the attention mechanism and its relationship with other advanced architectures like Mixture of Experts.</p>
|
||||
<h3 id="1-the-channel-a-foundational-view-of-d_model">
|
||||
1. The &ldquo;Channel&rdquo;: A Foundational View of <code>d_model</code>
|
||||
<a class="heading-link" href="#1-the-channel-a-foundational-view-of-d_model">
|
||||
<i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i>
|
||||
<span class="sr-only">Link to heading</span>
|
||||
</a>
|
||||
</h3>
|
||||
<p>In deep learning, a &ldquo;channel&rdquo; can be thought of as a feature dimension. While this term is common in Convolutional Neural Networks for images (e.g., Red, Green, Blue channels), in LLMs, the analogous concept is the model&rsquo;s primary embedding dimension, commonly referred to as <code>d_model</code>.</p></description></item><item><title>Some useful files</title><link>/posts/useful/</link><pubDate>Mon, 26 Oct 2020 04:14:43 +0000</pubDate><guid>/posts/useful/</guid><description><ul>
|
||||
<li><a href="/rootCA.crt" >rootCA.pem</a></li>
|
||||
</ul></description></item><item><title>About</title><link>/about/</link><pubDate>Fri, 01 Jun 2018 07:13:52 +0000</pubDate><guid>/about/</guid><description/></item></channel></rss>
|
Reference in New Issue
Block a user