From 19d2678a16a265157f020907ac3e1b7a12c8ec97 Mon Sep 17 00:00:00 2001 From: eric Date: Sat, 4 Oct 2025 20:42:01 +0000 Subject: [PATCH] deploy: 6ed1d69396ef7572816c27ae26d815b7046c58ad --- 404.html | 2 +- about/index.html | 2 +- categories/index.html | 2 +- index.html | 2 +- index.xml | 4 +- .../index.html | 38 +++++++++---------- .../index.html | 2 +- .../index.html | 2 +- .../index.html | 2 +- .../index.html | 2 +- posts/index.html | 2 +- posts/index.xml | 4 +- .../index.html | 2 +- .../index.html | 2 +- posts/page/2/index.html | 2 +- posts/ppo-for-language-models/index.html | 2 +- posts/quantization-in-llms/index.html | 2 +- .../index.html | 2 +- posts/supabase-deep-dive/index.html | 2 +- .../index.html | 2 +- posts/transformer-s-core-mechanics/index.html | 2 +- .../index.html | 2 +- posts/useful/index.html | 2 +- sitemap.xml | 2 +- tags/index.html | 2 +- 25 files changed, 45 insertions(+), 45 deletions(-) diff --git a/404.html b/404.html index ae7e91d..00187ff 100644 --- a/404.html +++ b/404.html @@ -4,4 +4,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/about/index.html b/about/index.html index d3faa85..a339fc3 100644 --- a/about/index.html +++ b/about/index.html @@ -4,4 +4,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/categories/index.html b/categories/index.html index 85f2230..3178900 100644 --- a/categories/index.html +++ b/categories/index.html @@ -4,4 +4,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/index.html b/index.html index bce3f25..ea63338 100644 --- a/index.html +++ b/index.html @@ -4,4 +4,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/index.xml b/index.xml index 5b80a52..f4f4634 100644 --- a/index.xml +++ b/index.xml @@ -1,4 +1,4 @@ -Eric X. Liu's Personal Page/Recent content on Eric X. Liu's Personal PageHugoenSat, 04 Oct 2025 17:44:47 +0000Why Your Jetson Orin Nano's 40 TOPS Goes Unused (And What That Means for Edge AI)/posts/benchmarking-llms-on-jetson-orin-nano/Sat, 04 Oct 2025 00:00:00 +0000/posts/benchmarking-llms-on-jetson-orin-nano/<h2 id="introduction"> +Eric X. Liu's Personal Page/Recent content on Eric X. Liu's Personal PageHugoenSat, 04 Oct 2025 20:41:50 +0000Why Your Jetson Orin Nano's 40 TOPS Goes Unused (And What That Means for Edge AI)/posts/benchmarking-llms-on-jetson-orin-nano/Sat, 04 Oct 2025 00:00:00 +0000/posts/benchmarking-llms-on-jetson-orin-nano/<h2 id="introduction"> Introduction <a class="heading-link" href="#introduction"> <i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i> @@ -6,7 +6,7 @@ </a> </h2> <p>NVIDIA&rsquo;s Jetson Orin Nano promises impressive specs: 1024 CUDA cores, 32 Tensor Cores, and 40 TOPS of INT8 compute performance packed into a compact, power-efficient edge device. On paper, it looks like a capable platform for running Large Language Models locally. But there&rsquo;s a catch—one that reveals a fundamental tension in modern edge AI hardware design.</p> -<p>After running 66 inference tests across seven different language models ranging from 0.5B to 5.4B parameters, I discovered something counterintuitive: the device&rsquo;s computational muscle sits largely idle during LLM inference. The bottleneck isn&rsquo;t computation—it&rsquo;s memory bandwidth. This isn&rsquo;t just a quirk of one device; it&rsquo;s a reality that affects how we should think about deploying LLMs at the edge.</p>Flashing Jetson Orin Nano in Virtualized Environments/posts/flashing-jetson-orin-nano-in-virtualized-environments/Thu, 02 Oct 2025 00:00:00 +0000/posts/flashing-jetson-orin-nano-in-virtualized-environments/<h1 id="flashing-jetson-orin-nano-in-virtualized-environments"> +<p>After running 66 inference tests across seven different language models ranging from 0.5B to 5.4B parameters, I discovered something counterintuitive: the device&rsquo;s computational muscle sits largely idle during single-stream LLM inference. The bottleneck isn&rsquo;t computation—it&rsquo;s memory bandwidth. This isn&rsquo;t just a quirk of one device; it&rsquo;s a fundamental characteristic of single-user, autoregressive token generation on edge hardware—a reality that shapes how we should approach local LLM deployment.</p>Flashing Jetson Orin Nano in Virtualized Environments/posts/flashing-jetson-orin-nano-in-virtualized-environments/Thu, 02 Oct 2025 00:00:00 +0000/posts/flashing-jetson-orin-nano-in-virtualized-environments/<h1 id="flashing-jetson-orin-nano-in-virtualized-environments"> Flashing Jetson Orin Nano in Virtualized Environments <a class="heading-link" href="#flashing-jetson-orin-nano-in-virtualized-environments"> <i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i> diff --git a/posts/benchmarking-llms-on-jetson-orin-nano/index.html b/posts/benchmarking-llms-on-jetson-orin-nano/index.html index 944e26f..1e74809 100644 --- a/posts/benchmarking-llms-on-jetson-orin-nano/index.html +++ b/posts/benchmarking-llms-on-jetson-orin-nano/index.html @@ -6,16 +6,16 @@ NVIDIA’s Jetson Orin Nano promises impressive specs: 1024 CUDA cores, 32 Tensor Cores, and 40 TOPS of INT8 compute performance packed into a compact, power-efficient edge device. On paper, it looks like a capable platform for running Large Language Models locally. But there’s a catch—one that reveals a fundamental tension in modern edge AI hardware design. -After running 66 inference tests across seven different language models ranging from 0.5B to 5.4B parameters, I discovered something counterintuitive: the device’s computational muscle sits largely idle during LLM inference. The bottleneck isn’t computation—it’s memory bandwidth. This isn’t just a quirk of one device; it’s a reality that affects how we should think about deploying LLMs at the edge.">
\ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/breville-barista-pro-maintenance/index.html b/posts/breville-barista-pro-maintenance/index.html index 6fcc7bd..a5ece95 100644 --- a/posts/breville-barista-pro-maintenance/index.html +++ b/posts/breville-barista-pro-maintenance/index.html @@ -25,4 +25,4 @@ Understanding the Two Primary Maintenance Cycles Link to heading The Breville Ba 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/index.html b/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/index.html index 19a37a3..3512f28 100644 --- a/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/index.html +++ b/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/index.html @@ -20,4 +20,4 @@ Our overarching philosophy is simple: isolate and change only one variable at a 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/flashing-jetson-orin-nano-in-virtualized-environments/index.html b/posts/flashing-jetson-orin-nano-in-virtualized-environments/index.html index 61328eb..e0b721b 100644 --- a/posts/flashing-jetson-orin-nano-in-virtualized-environments/index.html +++ b/posts/flashing-jetson-orin-nano-in-virtualized-environments/index.html @@ -168,4 +168,4 @@ Flashing NVIDIA Jetson devices remotely presents unique challenges when the host 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/how-rvq-teaches-llms-to-see-and-hear/index.html b/posts/how-rvq-teaches-llms-to-see-and-hear/index.html index 859373e..b6900a7 100644 --- a/posts/how-rvq-teaches-llms-to-see-and-hear/index.html +++ b/posts/how-rvq-teaches-llms-to-see-and-hear/index.html @@ -18,4 +18,4 @@ The answer lies in creating a universal language—a bridge between the continuo 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/index.html b/posts/index.html index 536acba..07fdf01 100644 --- a/posts/index.html +++ b/posts/index.html @@ -14,4 +14,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/index.xml b/posts/index.xml index ad03b4d..897f2b8 100644 --- a/posts/index.xml +++ b/posts/index.xml @@ -1,4 +1,4 @@ -Posts on Eric X. Liu's Personal Page/posts/Recent content in Posts on Eric X. Liu's Personal PageHugoenSat, 04 Oct 2025 17:44:47 +0000Why Your Jetson Orin Nano's 40 TOPS Goes Unused (And What That Means for Edge AI)/posts/benchmarking-llms-on-jetson-orin-nano/Sat, 04 Oct 2025 00:00:00 +0000/posts/benchmarking-llms-on-jetson-orin-nano/<h2 id="introduction"> +Posts on Eric X. Liu's Personal Page/posts/Recent content in Posts on Eric X. Liu's Personal PageHugoenSat, 04 Oct 2025 20:41:50 +0000Why Your Jetson Orin Nano's 40 TOPS Goes Unused (And What That Means for Edge AI)/posts/benchmarking-llms-on-jetson-orin-nano/Sat, 04 Oct 2025 00:00:00 +0000/posts/benchmarking-llms-on-jetson-orin-nano/<h2 id="introduction"> Introduction <a class="heading-link" href="#introduction"> <i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i> @@ -6,7 +6,7 @@ </a> </h2> <p>NVIDIA&rsquo;s Jetson Orin Nano promises impressive specs: 1024 CUDA cores, 32 Tensor Cores, and 40 TOPS of INT8 compute performance packed into a compact, power-efficient edge device. On paper, it looks like a capable platform for running Large Language Models locally. But there&rsquo;s a catch—one that reveals a fundamental tension in modern edge AI hardware design.</p> -<p>After running 66 inference tests across seven different language models ranging from 0.5B to 5.4B parameters, I discovered something counterintuitive: the device&rsquo;s computational muscle sits largely idle during LLM inference. The bottleneck isn&rsquo;t computation—it&rsquo;s memory bandwidth. This isn&rsquo;t just a quirk of one device; it&rsquo;s a reality that affects how we should think about deploying LLMs at the edge.</p>Flashing Jetson Orin Nano in Virtualized Environments/posts/flashing-jetson-orin-nano-in-virtualized-environments/Thu, 02 Oct 2025 00:00:00 +0000/posts/flashing-jetson-orin-nano-in-virtualized-environments/<h1 id="flashing-jetson-orin-nano-in-virtualized-environments"> +<p>After running 66 inference tests across seven different language models ranging from 0.5B to 5.4B parameters, I discovered something counterintuitive: the device&rsquo;s computational muscle sits largely idle during single-stream LLM inference. The bottleneck isn&rsquo;t computation—it&rsquo;s memory bandwidth. This isn&rsquo;t just a quirk of one device; it&rsquo;s a fundamental characteristic of single-user, autoregressive token generation on edge hardware—a reality that shapes how we should approach local LLM deployment.</p>Flashing Jetson Orin Nano in Virtualized Environments/posts/flashing-jetson-orin-nano-in-virtualized-environments/Thu, 02 Oct 2025 00:00:00 +0000/posts/flashing-jetson-orin-nano-in-virtualized-environments/<h1 id="flashing-jetson-orin-nano-in-virtualized-environments"> Flashing Jetson Orin Nano in Virtualized Environments <a class="heading-link" href="#flashing-jetson-orin-nano-in-virtualized-environments"> <i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"></i> diff --git a/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/index.html b/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/index.html index bc4b0b2..7850a49 100644 --- a/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/index.html +++ b/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/index.html @@ -44,4 +44,4 @@ The Top-K routing mechanism, as illustrated in the provided ima 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/openwrt-mwan3-wireguard-endpoint-exclusion/index.html b/posts/openwrt-mwan3-wireguard-endpoint-exclusion/index.html index 2c16a48..f1058ba 100644 --- a/posts/openwrt-mwan3-wireguard-endpoint-exclusion/index.html +++ b/posts/openwrt-mwan3-wireguard-endpoint-exclusion/index.html @@ -98,4 +98,4 @@ When using WireGuard together with MWAN3 on OpenWrt, the tunnel can fail to esta 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/page/2/index.html b/posts/page/2/index.html index 03aac13..93f3107 100644 --- a/posts/page/2/index.html +++ b/posts/page/2/index.html @@ -9,4 +9,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/ppo-for-language-models/index.html b/posts/ppo-for-language-models/index.html index f31e4ca..c2021b1 100644 --- a/posts/ppo-for-language-models/index.html +++ b/posts/ppo-for-language-models/index.html @@ -25,4 +25,4 @@ where δ_t = r_t + γV(s_{t+1}) - V(s_t)

  • γ (gam 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/quantization-in-llms/index.html b/posts/quantization-in-llms/index.html index 16413d8..b8cd9cd 100644 --- a/posts/quantization-in-llms/index.html +++ b/posts/quantization-in-llms/index.html @@ -7,4 +7,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/secure-boot-dkms-and-mok-on-proxmox-debian/index.html b/posts/secure-boot-dkms-and-mok-on-proxmox-debian/index.html index f7208ea..ac5430b 100644 --- a/posts/secure-boot-dkms-and-mok-on-proxmox-debian/index.html +++ b/posts/secure-boot-dkms-and-mok-on-proxmox-debian/index.html @@ -59,4 +59,4 @@ nvidia-smi failed to communicate with the NVIDIA driver modprobe nvidia → “K 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/supabase-deep-dive/index.html b/posts/supabase-deep-dive/index.html index 3e71d4f..1d5b5e2 100644 --- a/posts/supabase-deep-dive/index.html +++ b/posts/supabase-deep-dive/index.html @@ -90,4 +90,4 @@ Supabase enters this space with a radically different philosophy: transparency. 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/index.html b/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/index.html index c445dc1..e0a553c 100644 --- a/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/index.html +++ b/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/index.html @@ -30,4 +30,4 @@ But to truly understand the field, we must look at the pivotal models that explo 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/transformer-s-core-mechanics/index.html b/posts/transformer-s-core-mechanics/index.html index 492cd96..5f303dc 100644 --- a/posts/transformer-s-core-mechanics/index.html +++ b/posts/transformer-s-core-mechanics/index.html @@ -36,4 +36,4 @@ In deep learning, a “channel” can be thought of as a feature dimensi 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/unifi-vlan-migration-to-zone-based-architecture/index.html b/posts/unifi-vlan-migration-to-zone-based-architecture/index.html index 565503d..8529598 100644 --- a/posts/unifi-vlan-migration-to-zone-based-architecture/index.html +++ b/posts/unifi-vlan-migration-to-zone-based-architecture/index.html @@ -28,4 +28,4 @@ This article documents that journey. It details the pitfalls encountered, the co 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/posts/useful/index.html b/posts/useful/index.html index 654fbd0..ba350fe 100644 --- a/posts/useful/index.html +++ b/posts/useful/index.html @@ -9,4 +9,4 @@ One-minute read
    • [2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml index 32a118c..fa7983c 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -1 +1 @@ -/2025-10-04T17:44:47+00:00weekly0.5/posts/2025-10-04T17:44:47+00:00weekly0.5/posts/benchmarking-llms-on-jetson-orin-nano/2025-10-04T17:44:47+00:00weekly0.5/posts/flashing-jetson-orin-nano-in-virtualized-environments/2025-10-02T08:42:39+00:00weekly0.5/posts/openwrt-mwan3-wireguard-endpoint-exclusion/2025-10-02T08:34:05+00:00weekly0.5/posts/unifi-vlan-migration-to-zone-based-architecture/2025-10-02T08:42:39+00:00weekly0.5/posts/quantization-in-llms/2025-08-20T06:02:35+00:00weekly0.5/posts/breville-barista-pro-maintenance/2025-08-20T06:04:36+00:00weekly0.5/posts/secure-boot-dkms-and-mok-on-proxmox-debian/2025-08-14T06:50:22+00:00weekly0.5/posts/how-rvq-teaches-llms-to-see-and-hear/2025-08-08T17:36:52+00:00weekly0.5/posts/supabase-deep-dive/2025-08-04T03:59:37+00:00weekly0.5/posts/ppo-for-language-models/2025-10-02T08:42:39+00:00weekly0.5/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/2025-08-03T06:02:48+00:00weekly0.5/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/2025-08-03T03:41:10+00:00weekly0.5/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/2025-08-03T04:20:20+00:00weekly0.5/posts/transformer-s-core-mechanics/2025-10-02T08:42:39+00:00weekly0.5/posts/useful/2025-08-03T08:37:28-07:00weekly0.5/about/2020-06-16T23:30:17-07:00weekly0.5/categories/weekly0.5/tags/weekly0.5 \ No newline at end of file +/2025-10-04T20:41:50+00:00weekly0.5/posts/2025-10-04T20:41:50+00:00weekly0.5/posts/benchmarking-llms-on-jetson-orin-nano/2025-10-04T20:41:50+00:00weekly0.5/posts/flashing-jetson-orin-nano-in-virtualized-environments/2025-10-02T08:42:39+00:00weekly0.5/posts/openwrt-mwan3-wireguard-endpoint-exclusion/2025-10-02T08:34:05+00:00weekly0.5/posts/unifi-vlan-migration-to-zone-based-architecture/2025-10-02T08:42:39+00:00weekly0.5/posts/quantization-in-llms/2025-08-20T06:02:35+00:00weekly0.5/posts/breville-barista-pro-maintenance/2025-08-20T06:04:36+00:00weekly0.5/posts/secure-boot-dkms-and-mok-on-proxmox-debian/2025-08-14T06:50:22+00:00weekly0.5/posts/how-rvq-teaches-llms-to-see-and-hear/2025-08-08T17:36:52+00:00weekly0.5/posts/supabase-deep-dive/2025-08-04T03:59:37+00:00weekly0.5/posts/ppo-for-language-models/2025-10-02T08:42:39+00:00weekly0.5/posts/mixture-of-experts-moe-models-challenges-solutions-in-practice/2025-08-03T06:02:48+00:00weekly0.5/posts/t5-the-transformer-that-zigged-when-others-zagged-an-architectural-deep-dive/2025-08-03T03:41:10+00:00weekly0.5/posts/espresso-theory-application-a-guide-for-the-breville-barista-pro/2025-08-03T04:20:20+00:00weekly0.5/posts/transformer-s-core-mechanics/2025-10-02T08:42:39+00:00weekly0.5/posts/useful/2025-08-03T08:37:28-07:00weekly0.5/about/2020-06-16T23:30:17-07:00weekly0.5/categories/weekly0.5/tags/weekly0.5 \ No newline at end of file diff --git a/tags/index.html b/tags/index.html index 47c0cd6..e76e718 100644 --- a/tags/index.html +++ b/tags/index.html @@ -4,4 +4,4 @@ 2016 - 2025 Eric X. Liu -[2f73eae] \ No newline at end of file +[6ed1d69] \ No newline at end of file