📚 Auto-publish: Add/update 6 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 14s
Generated on: Thu Oct 2 08:42:39 UTC 2025 Source: md-personal repository
@@ -0,0 +1 @@
image-7c88938eaa4db1b7eafc437b9067b8790998fc71.png|2803b917b5794452870bc8a0aa896381.png|dd23c4ffd5f4e6bdec5dc03ba85140c8
@@ -11,7 +11,7 @@ draft: false
Flashing NVIDIA Jetson devices remotely presents unique challenges when the host machine is virtualized. This article documents the technical challenges, failures, and eventual success of flashing a Jetson Orin Nano Super developer kit using NVIDIA SDK Manager in various virtualized environments, specifically focusing on QEMU/KVM virtual machines and LXC containers on Proxmox VE.
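Before getting into the details, one practical check worth automating is whether a recovery-mode Jetson is visible over USB from inside the guest at all. The sketch below is an illustrative example, not tooling from the article; it assumes the standard `lsusb` utility and NVIDIA's USB vendor ID `0955`, under which a Jetson in recovery mode enumerates.

```python
import subprocess

NVIDIA_VENDOR_ID = "0955"  # NVIDIA Corp.; a Jetson in recovery mode enumerates under this vendor ID


def jetson_in_recovery_mode() -> bool:
    """Return True if a recovery-mode Jetson is visible to this VM or container.

    If this returns False inside a QEMU/KVM guest or LXC container, USB
    passthrough (or the /dev/bus/usb bind mount) is not working, and SDK
    Manager will never detect the board.
    """
    out = subprocess.run(["lsusb"], capture_output=True, text=True, check=True).stdout
    return any(f"ID {NVIDIA_VENDOR_ID}:" in line for line in out.splitlines())


if __name__ == "__main__":
    print("Jetson visible over USB:", jetson_in_recovery_mode())
```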

### The Constraint: Hypervisor-Only Infrastructure
@@ -8,7 +8,7 @@ draft: false
Large Language Models (LLMs) have demonstrated astonishing capabilities, but out of the box they are simply powerful text predictors. They don't inherently understand what makes a response helpful, harmless, or aligned with human values. The technique that has proven most effective at bridging this gap is Reinforcement Learning from Human Feedback (RLHF), and at its heart lies a powerful algorithm: Proximal Policy Optimization (PPO).
You may have seen diagrams like the one below, which outlines the RLHF training process. It can look intimidating, with a web of interconnected models, losses, and data flows.

This post will decode that diagram, piece by piece. We'll explore the "why" behind each component, moving from high-level concepts to the deep technical reasoning that makes this process work.
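As a preview of the algorithmic core, here is a minimal, illustrative sketch of PPO's clipped surrogate objective, the loss that keeps the updated policy close to the policy that generated the samples. The tensor names and shapes are assumptions for illustration, not code from the actual training pipeline.

```python
import torch


def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO, returned as a loss to minimize.

    logp_new / logp_old: log-probabilities of the sampled tokens under the
    current policy and the frozen rollout policy; advantages: per-token
    advantage estimates.
    """
    ratio = torch.exp(logp_new - logp_old)                        # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()                  # negate to maximize the surrogate
```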
@@ -40,7 +40,7 @@ The dimensions of the weight matrices are as follows:
### 3. Deconstructing Multi-Head Attention (MHA)
The core innovation of the Transformer is Multi-Head Attention. It allows the model to weigh the importance of different tokens in the sequence from multiple perspectives simultaneously.

#### 3.1. The "Why": Beyond a Single Attention
A single attention mechanism would force the model to average all types of linguistic relationships into one pattern. MHA avoids this by creating `h` parallel subspaces. Each "head" can specialize, with one head learning syntactic dependencies, another tracking semantic similarity, and so on. This creates a much richer representation.
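To make the subspace idea concrete, here is a stripped-down, illustrative sketch of multi-head self-attention for a single unbatched sequence. It assumes square `d_model × d_model` projection matrices and omits masking, dropout, and batching, so it is a reading aid rather than a reference implementation.

```python
import math
import torch


def multi_head_attention(x, W_q, W_k, W_v, W_o, h):
    """Multi-head self-attention over a (seq_len, d_model) input.

    Each of the h heads attends within its own d_model // h dimensional
    subspace; the per-head outputs are concatenated and mixed by W_o.
    """
    seq_len, d_model = x.shape
    d_head = d_model // h
    # Project, then split each projection into h heads: (h, seq_len, d_head)
    q = (x @ W_q).view(seq_len, h, d_head).transpose(0, 1)
    k = (x @ W_k).view(seq_len, h, d_head).transpose(0, 1)
    v = (x @ W_v).view(seq_len, h, d_head).transpose(0, 1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_head)    # (h, seq_len, seq_len)
    weights = torch.softmax(scores, dim=-1)                  # attention pattern per head
    heads = weights @ v                                      # (h, seq_len, d_head)
    concat = heads.transpose(0, 1).reshape(seq_len, d_model)
    return concat @ W_o                                      # mix the heads back together
```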
@@ -55,12 +55,12 @@ The final configuration groups the individual VLANs into distinct zones, forming
* **DMZ:** Contains the `dns` and `prod` networks for semi-trusted, exposed services.
* **IoT:** Contains the `iot` network. This is a low-trust zone for smart devices.
* **Management:** Contains the `management` network. This is a highly privileged, isolated zone for network infrastructure.

#### The Security Policy Matrix
The true power of this model is realized in the firewall's zone matrix, which dictates the default traffic flow between each pair of zones.

This matrix enforces the desired security policy with clear, high-level rules:
* **Complete IoT Isolation:** The `IoT` row shows that devices in this zone are blocked from initiating any communication with any other internal zone. Their only allowed path is out to the internet.
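
To show how such a matrix can be treated as data rather than a pile of individual rules, here is a small illustrative sketch of the default inter-zone policy as a lookup table. Only the IoT row is filled in from the matrix described above; the `WAN` label for the internet and the remaining rows are placeholders, not the firewall's actual configuration syntax.

```python
# default_policy[source_zone][destination_zone] -> "accept" or "drop"
default_policy = {
    "IoT": {"DMZ": "drop", "Management": "drop", "WAN": "accept"},
    # ... one row per remaining source zone, mirroring the firewall's zone matrix
}


def is_allowed(src: str, dst: str) -> bool:
    """Look up the default inter-zone action, falling back to drop (default deny)."""
    return default_policy.get(src, {}).get(dst, "drop") == "accept"


print(is_allowed("IoT", "Management"))  # False: IoT may not reach the management zone
print(is_allowed("IoT", "WAN"))         # True: IoT's only allowed path is out to the internet
```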