📚 Auto-publish: Add/update 3 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 11s

Generated on: Sat Oct  4 17:44:47 UTC 2025
Source: md-personal repository
Automated Publisher committed 2025-10-04 17:44:47 +00:00
parent 85e0d053b7
commit 2f73eaed9a
3 changed files with 2 additions and 2 deletions

@@ -1,2 +1,3 @@
image-b25565d6f47e1ba4ce2deca7e161726b86df356e.png|388f43c3f800483aae5ea487e8f45922.png|387cde4274484063c4c7e1f9f37c185a
image-7913a54157c2f4b8d0b7f961640a9c359b2d2a4f.png|ee04876d75d247f9b27a647462555777.png|2371421b04f856f7910dc8b46a7a6fb9
image-79378d40267258c0d8968238cc62bd197dc894fa.png|16d64bdc9cf14b05b7c40c4718b8091b.png|ff2625e796efd7187614b6e0a8542af6
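
The manifest above appears to map each source image name to its published filename and a content hash; that column order is an assumption inferred from the diff, not documented in the repository. A minimal sketch of parsing this pipe-delimited format under that assumption (`load_manifest` is a hypothetical helper):

```python
# A minimal sketch of reading the pipe-delimited image manifest above.
# Column meanings (source name | published name | content hash) are an
# assumption inferred from the diff, not documented in the repository.
from pathlib import Path

def load_manifest(path: str) -> dict[str, tuple[str, str]]:
    """Map each source image name to its (published name, hash)."""
    mapping = {}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, published, digest = line.split("|")
        mapping[source] = (published, digest)
    return mapping
```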

@@ -55,8 +55,7 @@ To understand where performance hits its ceiling, I applied roofline analysis—
The roofline model works by comparing a workload's operational intensity (how many calculations you do per byte of data moved) against the device's balance point. If your operational intensity is too low, you're bottlenecked by memory bandwidth—and as we'll see, that's exactly what happens with LLM inference.
![S3 File](/images/benchmarking-llms-on-jetson-orin-nano/388f43c3f800483aae5ea487e8f45922.png)
![S3 File](/images/benchmarking-llms-on-jetson-orin-nano/16d64bdc9cf14b05b7c40c4718b8091b.png)
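
To make the balance-point comparison concrete, here is a minimal sketch of the roofline calculation the paragraph describes. The `PEAK_FLOPS`, `PEAK_BW`, and 2 FLOP/byte decode-intensity values are illustrative assumptions, not the post's measured Jetson Orin Nano figures:

```python
# A minimal sketch of the roofline model described above.
# PEAK_FLOPS and PEAK_BW are illustrative assumptions, not measured values.

PEAK_FLOPS = 17e12   # assumed compute ceiling, FLOP/s
PEAK_BW = 68e9       # assumed peak memory bandwidth, byte/s

def attainable(operational_intensity: float) -> float:
    """Attainable FLOP/s: the lower of the compute roof and the memory roof."""
    return min(PEAK_FLOPS, PEAK_BW * operational_intensity)

# The balance point is the intensity where the two roofs meet.
balance = PEAK_FLOPS / PEAK_BW  # FLOP/byte

# Rough decode-phase intensity: ~2 FLOPs per weight byte with 8-bit weights
# (one multiply and one add per parameter read).
decode_intensity = 2.0

print(f"balance point:     {balance:.0f} FLOP/byte")
print(f"decode attainable: {attainable(decode_intensity) / 1e9:.0f} GFLOP/s")
print("memory bound" if decode_intensity < balance else "compute bound")
```

With any plausible numbers for this class of device, decode intensity sits far below the balance point, which is exactly the "bottlenecked by memory bandwidth" case the paragraph describes.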
## The Results: Speed and Efficiency

Binary image file added (694 KiB; not shown).