Search: [cpu] - ervin's web review

If you're wondering why the architecture is called "amd64" and why the Itanium disappeared... this is why. It was a very good stunt from AMD back then.

tech · hardware · cpu · intel · amd

September 25, 2025 at 11:16:08 PM GMT+2 * · permalink

·

https://dfarq.homeip.net/athlon-64-how-amd-turned-the-tables-on-intel/

·

Processors are getting wider

Interesting trend in the CPU space. Each new generation can execute more instructions simultaneously.

tech · hardware · cpu

September 2, 2025 at 8:39:39 AM GMT+2 * · permalink

·

https://lemire.me/blog/2025/09/01/processors-are-getting-wider/

·

Predictable memory accesses are much faster – Daniel Lemire's blog

Indeed, CPU prefetchers are really good nowadays. Keep your memory accesses predictable if you want your code to stay fast.
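
To make the idea concrete, here is a minimal sketch (my own, not code from the linked article) that sums the same array twice: once in index order and once through a shuffled visiting order. The function names and the tiny shuffle helper are hypothetical. Both traversals return the same sum, only the access pattern differs, so timing them on a large array is one way to see what the prefetcher buys you.

```rust
/// Sum the slice in index order. The address stream is a simple forward
/// walk, which hardware prefetchers typically detect.
pub fn sum_sequential(data: &[u64]) -> u64 {
    data.iter().fold(0u64, |acc, &x| acc.wrapping_add(x))
}

/// Sum the same values in a caller-supplied visiting order. With a
/// shuffled `order`, the next load address is hard to predict.
pub fn sum_in_order(data: &[u64], order: &[usize]) -> u64 {
    order.iter().fold(0u64, |acc, &i| acc.wrapping_add(data[i]))
}

/// Hypothetical helper: a Fisher-Yates shuffle of 0..n driven by a small
/// linear congruential generator, so no external crate is needed.
pub fn shuffled_order(n: usize, seed: u64) -> Vec<usize> {
    let mut order: Vec<usize> = (0..n).collect();
    let mut state = seed;
    for i in (1..n).rev() {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let j = ((state >> 33) as usize) % (i + 1);
        order.swap(i, j);
    }
    order
}
```

The gap should only show up once the array is much larger than the caches, and the exact ratio depends on the machine.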

tech · cpu · hardware · memory · performance

August 16, 2025 at 4:09:40 PM GMT+2 * · permalink

·

https://lemire.me/blog/2025/08/15/predictable-memory-accesses-are-much-faster/

·

Why do we even need SIMD instructions?

SIMD instructions are indeed a must to get decent performance on current hardware.
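
As a rough illustration (a sketch of mine, not taken from the article), the two functions below compute the same sum. The second one keeps eight independent accumulators, a layout that a compiler can map onto SIMD lanes. The lane count of 8 and the function names are assumptions, and whether SIMD instructions are actually emitted depends on the target and the optimization flags.

```rust
/// Scalar baseline: every addition depends on the previous one.
pub fn sum_scalar(values: &[f32]) -> f32 {
    let mut total = 0.0f32;
    for &v in values {
        total += v;
    }
    total
}

/// Assumed lane count; 8 x f32 matches a 256-bit vector register.
const LANES: usize = 8;

/// Keep LANES independent partial sums, then combine them at the end.
/// Note that this reorders the floating-point additions, so rounding can
/// differ slightly from `sum_scalar` on arbitrary inputs.
pub fn sum_lanes(values: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];
    let chunks = values.chunks_exact(LANES);
    let tail = chunks.remainder();
    for chunk in chunks {
        for (lane, &v) in acc.iter_mut().zip(chunk) {
            *lane += v;
        }
    }
    let mut total: f32 = acc.iter().sum();
    // Handle the elements that did not fill a whole chunk.
    for &v in tail {
        total += v;
    }
    total
}
```

The reordering is also why the compiler will not do this transformation on the scalar float loop by itself: it is not allowed to change the rounding behavior.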

tech · cpu · simd · performance

August 10, 2025 at 7:27:54 AM GMT+2 * · permalink

·

https://lemire.me/blog/2025/08/09/why-do-we-even-need-simd-instructions/

·

The Concurrency Trap: How An Atomic Counter Stalled A Pipeline

A good example of how you can get bitten by cache coherency algorithms in the CPU.
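
One common way to sidestep this kind of contention is to shard the counter so that each thread mostly touches its own cache line. The sketch below is a generic illustration of that technique, not the fix described in the article. The type names are hypothetical and the 64-byte alignment is an assumption about the cache line size.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// One counter per shard, aligned so two shards never share a cache line.
/// 64 bytes is an assumption; the real line size varies by CPU.
#[repr(align(64))]
#[derive(Default)]
pub struct PaddedCounter(AtomicU64);

pub struct ShardedCounter {
    shards: Vec<PaddedCounter>,
}

impl ShardedCounter {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        Self {
            shards: (0..shard_count).map(|_| PaddedCounter::default()).collect(),
        }
    }

    /// Increment the shard picked by `shard_hint` (for example a thread index).
    pub fn add(&self, shard_hint: usize, n: u64) {
        let shard = &self.shards[shard_hint % self.shards.len()];
        shard.0.fetch_add(n, Ordering::Relaxed);
    }

    /// Sum all shards. Only exact once the writers have finished.
    pub fn total(&self) -> u64 {
        self.shards.iter().map(|s| s.0.load(Ordering::Relaxed)).sum()
    }
}

/// Spawn `threads` workers that each increment their own shard.
pub fn count_in_parallel(threads: usize, increments_per_thread: u64) -> u64 {
    let counter = Arc::new(ShardedCounter::new(threads));
    let handles: Vec<_> = (0..threads)
        .map(|id| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    counter.add(id, 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.total()
}
```

The trade-off is that reading the total becomes more expensive, which is fine for counters that are written often and read rarely.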

tech · multithreading · caching · cpu · rust

June 13, 2025 at 8:43:01 AM GMT+2 · permalink

·

https://www.conviva.com/platform/the-concurrency-trap-how-an-atomic-counter-stalled-a-pipeline/

·

Memory Access Patterns Are Important

A bit dated perhaps, and yet most of the lessons in here are still valid. If performance and parallelism matter, you'd better keep an eye on how the cache is used.
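
A classic way to see this for yourself is to walk a row-major matrix in two different orders. This is a minimal sketch under my own assumptions (the `Matrix` type and function names are hypothetical, not from the article): both functions return the same sum, but the column-first walk jumps `cols` elements between consecutive loads and tends to use each fetched cache line poorly.

```rust
pub struct Matrix {
    rows: usize,
    cols: usize,
    /// Row-major storage: element (r, c) lives at index r * cols + c.
    data: Vec<u32>,
}

impl Matrix {
    /// Build a matrix by calling `f(row, col)` for every cell.
    pub fn from_fn(rows: usize, cols: usize, f: impl Fn(usize, usize) -> u32) -> Self {
        let mut data = Vec::with_capacity(rows * cols);
        for r in 0..rows {
            for c in 0..cols {
                data.push(f(r, c));
            }
        }
        Self { rows, cols, data }
    }
}

/// Visit cells in storage order: consecutive loads are adjacent in memory.
pub fn sum_row_major(m: &Matrix) -> u64 {
    let mut total = 0u64;
    for r in 0..m.rows {
        for c in 0..m.cols {
            total += u64::from(m.data[r * m.cols + c]);
        }
    }
    total
}

/// Visit cells column by column: consecutive loads are `cols` elements apart.
pub fn sum_column_major(m: &Matrix) -> u64 {
    let mut total = 0u64;
    for c in 0..m.cols {
        for r in 0..m.rows {
            total += u64::from(m.data[r * m.cols + c]);
        }
    }
    total
}
```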

tech · programming · cpu · memory · caching · performance · multithreading

May 27, 2025 at 9:30:37 AM GMT+2 * · permalink

·

https://mechanical-sympathy.blogspot.com/2012/08/memory-access-patterns-are-important.html?m=1

·

Atomicless Concurrency

Nice trick for highly performance-sensitive data structures. By making data CPU-local instead of thread-local, you can build a mechanism that is especially cache friendly.

tech · cpu · hardware · multithreading

April 16, 2025 at 1:04:41 PM GMT+2 * · permalink

·

https://mcyoung.xyz/2023/03/29/rseq-checkout/

·

Zen and the Art of Microcode Hacking

Nice exploration of the microcode patching flaw which was disclosed recently. This gives a glimpse at the high level of complexity the x86 family brings to the table.

tech · cpu · amd · security · complexity

March 6, 2025 at 8:09:52 AM GMT+1 * · permalink

·

https://bughunters.google.com/blog/5424842357473280/zen-and-the-art-of-microcode-hacking

·

Why Trees Without Branches Grow Faster: The Case for Reducing Branches in Code

Nice primer on how too many branches in your code impact the CPU. Being mindful of that is sometimes a good way to boost performance.
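
For a feel of what "reducing branches" can look like, here is a small sketch of my own (the function names are hypothetical, and this is not code from the article). Both functions count the values below a threshold. The second turns the comparison into a 0 or 1 and adds it, so there is no data-dependent jump for the branch predictor to get wrong.

```rust
/// Branchy version: the `if` depends on the data, so on unsorted input
/// the branch predictor may mispredict often.
pub fn count_below_branchy(values: &[i32], threshold: i32) -> usize {
    let mut count = 0;
    for &v in values {
        if v < threshold {
            count += 1;
        }
    }
    count
}

/// Branchless version: convert the comparison result to 0 or 1 and sum it.
pub fn count_below_branchless(values: &[i32], threshold: i32) -> usize {
    values.iter().map(|&v| (v < threshold) as usize).sum()
}
```

An optimizing compiler may already turn the first form into the second, so it is worth measuring before rewriting anything.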

tech · cpu · performance · programming

January 30, 2025 at 12:40:42 PM GMT+1 * · permalink

·

https://cedardb.com/blog/reducing_branches/

·

Reverse engineering of the Pentium FDIV bug

It's interesting to see such a reverse engineering of this infamous bug straight from the gate layout.

tech · cpu · hardware

December 7, 2024 at 11:01:26 AM GMT+1 * · permalink

·

https://oldbytes.space/@kenshirriff/113606898880486330

·

When Machine Learning Tells the Wrong Story

Fascinating research about side-channel attacks. I learned a lot about them and about website fingerprinting here. It also explains how relying on machine learning models can get in the way of understanding which side channel an attack really uses, which in turn can prevent the development of actually useful countermeasures.

tech · cpu · hardware · security · privacy · research

November 10, 2024 at 6:39:04 PM GMT+1 * · permalink

·

https://jackcook.com/2024/11/09/bigger-fish.html

·

Why You Shouldn't Forget to Optimize the Data Layout

Data layout is essential for performance reasons. It is too often overlooked. If you want real speed you need to help the memory subsystem.
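
A common example of such a layout choice is array-of-structs versus struct-of-arrays. The sketch below is mine, with a hypothetical `Order` record that does not come from the article. When a query only needs one field, the struct-of-arrays form reads a dense array of that field instead of dragging every other field through the cache.

```rust
/// Array-of-structs element: all fields of one record sit together.
pub struct OrderAos {
    pub id: u64,
    pub price_cents: u64,
    pub quantity: u32,
    pub flags: u32,
}

/// Summing one field still loads whole 24-byte records into the cache.
pub fn total_price_aos(orders: &[OrderAos]) -> u64 {
    orders.iter().map(|o| o.price_cents).sum()
}

/// Struct-of-arrays: each field gets its own contiguous column.
#[derive(Default)]
pub struct OrdersSoa {
    pub ids: Vec<u64>,
    pub price_cents: Vec<u64>,
    pub quantities: Vec<u32>,
    pub flags: Vec<u32>,
}

impl OrdersSoa {
    /// Append one record by pushing each field to its column.
    pub fn push(&mut self, order: &OrderAos) {
        self.ids.push(order.id);
        self.price_cents.push(order.price_cents);
        self.quantities.push(order.quantity);
        self.flags.push(order.flags);
    }
}

/// Summing one field only touches that field's column.
pub fn total_price_soa(orders: &OrdersSoa) -> u64 {
    orders.price_cents.iter().sum()
}
```

Which layout wins depends on the access pattern: if most operations need every field of a record, the array-of-structs form can be the better choice.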

tech · cpu · performance · memory

October 11, 2024 at 9:03:25 AM GMT+2 * · permalink

·

https://cedardb.com/blog/optimizing_data_layouts/

·

Overview of cross-architecture portability problems – Michał Górny

Nice list of common portability issues one can encounter at the machine architecture level. But don't be fooled, the implications are not limited to C and C++. Those problems leak into higher-level languages as well.
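
Two of the usual suspects, byte order and integer width, show up even in Rust. This is a small sketch of my own with hypothetical function names, not an excerpt from the article: it pins the byte order explicitly when serializing and refuses to silently truncate a `usize`, whose width depends on the target.

```rust
use std::convert::TryFrom;

/// Serialize with an explicit byte order so the output is identical on
/// little-endian and big-endian machines.
pub fn encode_u32_le(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Decode the same fixed little-endian representation.
pub fn decode_u32_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// `usize` is 32 or 64 bits depending on the target, so converting a
/// length to a fixed 32-bit wire field must be checked, not cast.
pub fn length_to_wire(len: usize) -> Option<u32> {
    u32::try_from(len).ok()
}
```

Using `to_ne_bytes` or an unchecked `as u32` cast here would compile just as well, which is exactly how these problems sneak in.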

tech · cpu · portability

September 24, 2024 at 8:41:58 AM GMT+2 * · permalink

·

https://blogs.gentoo.org/mgorny/2024/09/23/overview-of-cross-architecture-portability-problems/

·

SIMD Matters :: Box2D

SIMD is hard to use, and not every problem lends itself to it. But when one does, the performance gain can be great.

tech · cpu · simd · performance · physics · simulation

August 23, 2024 at 8:07:28 AM GMT+2 * · permalink

·

https://box2d.org/posts/2024/08/simd-matters/

·

‘Sinkclose’ Flaw in Hundreds of Millions of AMD Chips Allows Deep, Virtually Unfixable Infections | WIRED

Luckily, this kind of very low-level vulnerability is uncommon and difficult to exploit. But when one does get exploited, all hell breaks loose and you can't trust your hardware anymore.

tech · cpu · amd · security

August 10, 2024 at 11:03:47 PM GMT+2 * · permalink

·

https://www.wired.com/story/amd-chip-sinkclose-flaw/

·