deploy: ba596e75db
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Eric X. Liu's Personal Page</title><link>/</link><description>Recent content on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 20 Aug 2025 04:48:53 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml"/><item><title>A Technical Deep Dive into the Transformer's Core Mechanics</title><link>/posts/a-technical-deep-dive-into-the-transformer-s-core-mechanics/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/a-technical-deep-dive-into-the-transformer-s-core-mechanics/</guid><description><p>The Transformer architecture is the bedrock of modern Large Language Models (LLMs). While its high-level success is widely known, a deeper understanding requires dissecting its core components. This article provides a detailed, technical breakdown of the fundamental concepts within a Transformer block, from the notion of &ldquo;channels&rdquo; to the intricate workings of the attention mechanism and its relationship with other advanced architectures like Mixture of Experts.</p>
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Eric X. Liu's Personal Page</title><link>/</link><description>Recent content on Eric X. Liu's Personal Page</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 20 Aug 2025 06:02:35 +0000</lastBuildDate><atom:link href="/index.xml" rel="self" type="application/rss+xml"/><item><title>A Technical Deep Dive into the Transformer's Core Mechanics</title><link>/posts/a-technical-deep-dive-into-the-transformer-s-core-mechanics/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/a-technical-deep-dive-into-the-transformer-s-core-mechanics/</guid><description><p>The Transformer architecture is the bedrock of modern Large Language Models (LLMs). While its high-level success is widely known, a deeper understanding requires dissecting its core components. This article provides a detailed, technical breakdown of the fundamental concepts within a Transformer block, from the notion of &ldquo;channels&rdquo; to the intricate workings of the attention mechanism and its relationship with other advanced architectures like Mixture of Experts.</p>
 <h3 id="1-the-channel-a-foundational-view-of-d_model">
 1. The &ldquo;Channel&rdquo;: A Foundational View of <code>d_model</code>
 <a class="heading-link" href="#1-the-channel-a-foundational-view-of-d_model">
@@ -6,7 +6,7 @@
 <span class="sr-only">Link to heading</span>
 </a>
 </h3>
-<p>In deep learning, a &ldquo;channel&rdquo; can be thought of as a feature dimension. While this term is common in Convolutional Neural Networks for images (e.g., Red, Green, Blue channels), in LLMs, the analogous concept is the model&rsquo;s primary embedding dimension, commonly referred to as <code>d_model</code>.</p></description></item><item><title>A Comprehensive Guide to Breville Barista Pro Maintenance</title><link>/posts/a-comprehensive-guide-to-breville-barista-pro-maintenance/</link><pubDate>Sat, 16 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/a-comprehensive-guide-to-breville-barista-pro-maintenance/</guid><description><p>Proper maintenance is critical for the longevity and performance of a Breville Barista Pro espresso machine. Consistent cleaning not only ensures the machine functions correctly but also directly impacts the quality of the espresso produced. This guide provides a detailed, technical breakdown of the essential maintenance routines, from automated cycles to daily upkeep.</p>
+<p>In deep learning, a &ldquo;channel&rdquo; can be thought of as a feature dimension. While this term is common in Convolutional Neural Networks for images (e.g., Red, Green, Blue channels), in LLMs, the analogous concept is the model&rsquo;s primary embedding dimension, commonly referred to as <code>d_model</code>.</p></description></item><item><title>Quantization in LLMs</title><link>/posts/quantization-in-llms/</link><pubDate>Tue, 19 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/quantization-in-llms/</guid><description><p>The burgeoning scale of Large Language Models (LLMs) has necessitated a paradigm shift in their deployment, moving beyond full-precision floating-point arithmetic towards lower-precision representations. Quantization, the process of mapping a wide range of continuous values to a smaller, discrete set, has emerged as a critical technique to reduce model size, accelerate inference, and lower energy consumption. This article provides a technical overview of quantization theories, their application in modern LLMs, and highlights the ongoing innovations in this domain.</p></description></item><item><title>A Comprehensive Guide to Breville Barista Pro Maintenance</title><link>/posts/a-comprehensive-guide-to-breville-barista-pro-maintenance/</link><pubDate>Sat, 16 Aug 2025 00:00:00 +0000</pubDate><guid>/posts/a-comprehensive-guide-to-breville-barista-pro-maintenance/</guid><description><p>Proper maintenance is critical for the longevity and performance of a Breville Barista Pro espresso machine. Consistent cleaning not only ensures the machine functions correctly but also directly impacts the quality of the espresso produced. This guide provides a detailed, technical breakdown of the essential maintenance routines, from automated cycles to daily upkeep.</p>
 <h4 id="understanding-the-two-primary-maintenance-cycles">
 <strong>Understanding the Two Primary Maintenance Cycles</strong>
 <a class="heading-link" href="#understanding-the-two-primary-maintenance-cycles">