Search: [performance] - ervin's web review

35% Faster Than The Filesystem

Interesting experiment showing that BLOBs in a database can be a good alternative to individual files on a filesystem in some contexts.

tech · databases · storage · filesystem · sqlite · performance

July 28, 2024 at 10:30:32 AM GMT+2 * · permalink

·

https://sqlite.org/fasterthanfs.html
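For readers who want to try the idea, here is a minimal sketch of the "BLOBs in a database" approach using Python's standard `sqlite3` module. The table name `blobs`, the helper names and the in-memory database are assumptions made for illustration and are not taken from the linked article. The snippet shows only the storage pattern and measures nothing.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small key -> BLOB store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def store_blob(conn, name, data):
    """Insert or overwrite one blob, the counterpart of writing one small file."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO blobs (name, data) VALUES (?, ?)",
            (name, data),
        )


def load_blob(conn, name):
    """Return the blob as bytes, or None when the name is unknown."""
    row = conn.execute(
        "SELECT data FROM blobs WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


conn = open_store()
payload = bytes(range(256)) * 40  # roughly 10 kB of sample data
store_blob(conn, "thumb-1", payload)
```

Whether this beats individual files depends on blob size, operating system and access pattern, so it is worth benchmarking on the target workload.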

·

Zen 5’s 2-Ahead Branch Predictor Unit: How a 30 Year Old Idea Allows for New Tricks – Chips and Cheese

Interesting article about what's coming for the branch predictor in the Zen 5 architecture from AMD.

tech · cpu · amd · performance

July 27, 2024 at 1:13:10 PM GMT+2 * · permalink

·

https://chipsandcheese.com/2024/07/26/zen-5s-2-ahead-branch-predictor-unit-how-30-year-old-idea-allows-for-new-tricks/

·

C++ Design Patterns For Low-Latency Applications

A paper listing patterns to reduce latency as much as possible. It includes some lesser-known tricks.

tech · c++ · performance · optimization · pattern

July 15, 2024 at 11:00:50 AM GMT+2 · permalink

·

https://hackaday.com/2024/07/13/c-design-patterns-for-low-latency-applications/

·

Binary Search Tree with SIMD

Another interesting algorithm to implement with SIMD.

tech · simd · performance

July 10, 2024 at 9:18:48 AM GMT+2 · permalink

·

https://clement-jean.github.io/simd_binary_search_tree/

·

PostgreSQL and UUID as primary key

Forced to use UUIDs as the primary key in a table? Then make sure to use them properly so they don't hurt performance more than necessary. Ideally use something else though.

tech · databases · uuid · performance

July 7, 2024 at 9:54:53 AM GMT+2 · permalink

·

https://maciejwalkowiak.com/blog/postgres-uuid-primary-key/
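One commonly cited way to "use them properly" is to prefer time-ordered UUIDs, so that new keys land near each other in the index instead of at random positions. The sketch below hand-builds a version 7 style UUID following the RFC 9562 bit layout. The function name `uuid7_sketch` and its `ts_ms` parameter are hypothetical, and this is an illustration of the layout rather than a production generator or the article's own code.

```python
import os
import time
import uuid


def uuid7_sketch(ts_ms=None):
    """Build a UUID with a 48-bit millisecond timestamp in the top bits.

    Layout (RFC 9562, version 7): 48 bits of Unix time in ms, 4 version
    bits, 12 random bits, 2 variant bits, 62 random bits.
    """
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                  # top 12 bits
    rand_b = rand & ((1 << 62) - 1)      # low 62 bits

    value = (ts_ms & ((1 << 48) - 1)) << 80
    value |= 0x7 << 76                   # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                  # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(ts_ms=1_000)
later = uuid7_sketch(ts_ms=2_000)
```

Because the timestamp occupies the most significant bits, values generated later compare greater than values generated earlier (to millisecond resolution), which is what keeps B-tree inserts mostly append-like.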

·

There’s plenty of room at the Top: What will drive computer performance after Moore’s law? | Science

As Moore's law fades away, this question is indeed essential. It looks like there will be more pressure on software and algorithms than before (at last, one might say; we had decades of waste there). Streamlining hardware architectures will have a role too, so we might see simpler cores in greater numbers.

tech · hardware · software · performance

July 5, 2024 at 2:20:55 PM GMT+2 * · permalink

·

https://www.science.org/doi/10.1126/science.aam9744

·

Performance tip: avoid unnecessary copies – Daniel Lemire's blog

Interesting case: when everything else gets faster, memory copies might start to become the bottleneck.

tech · performance

June 23, 2024 at 9:55:52 AM GMT+2 * · permalink

·

https://lemire.me/blog/2024/06/22/performance-tip-avoid-unnecessary-copies/
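As a generic illustration of the point (not the benchmark from the linked post), the Python sketch below contrasts a slice that copies bytes with a `memoryview` slice that shares the underlying buffer. The variable names and sample data are made up for the example.

```python
# A buffer holding a 7-byte header followed by a payload.
data = bytearray(b"header:payload-bytes")

# Slicing a bytearray allocates a new object and copies the bytes.
copied = bytes(data[7:])

# Slicing a memoryview creates a window onto the same memory: no copy.
view = memoryview(data)[7:]

# Mutating the original buffer is visible through the view,
# but not through the earlier copy.
data[7] = ord("P")
```

For small buffers the difference is negligible. It starts to matter when large payloads are sliced or passed around repeatedly.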

·

Joining Strings in Python: A "Huh" Moment - Veronica Writes

Interesting dive into how join() and generators behave in CPython.

tech · python · memory · performance

June 16, 2024 at 5:42:01 PM GMT+2 * · permalink

·

https://berglyd.net/blog/2024/06/joining-strings-in-python/
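A small sketch of the two spellings being compared. In CPython, `str.join()` needs to know the total length before building the result, so it first materializes any argument that is not already a list or tuple. Passing a generator therefore does not avoid building the intermediate sequence. The function names and the timing helper are hypothetical, and actual timings vary by machine and Python version.

```python
import timeit


def join_from_generator(values):
    # The generator is consumed into an internal list before joining.
    return ",".join(str(v) for v in values)


def join_from_list(values):
    # The list is built up front and handed to join() as-is.
    return ",".join([str(v) for v in values])


def compare(n=10_000, repeat=200):
    """Return (generator_seconds, list_seconds) for a rough comparison."""
    values = range(n)
    gen_time = timeit.timeit(lambda: join_from_generator(values), number=repeat)
    list_time = timeit.timeit(lambda: join_from_list(values), number=repeat)
    return gen_time, list_time
```

Calling `compare()` gives a rough feel for the difference. Both functions always return the same string.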

·

Rolling your own fast matrix multiplication: loop order and vectorization – Daniel Lemire's blog

The ordering used for matrix multiplications definitely matters.

tech · c++ · compiler · performance · matrix

June 14, 2024 at 8:43:22 AM GMT+2 * · permalink

·

https://lemire.me/blog/2024/06/13/rolling-your-own-fast-matrix-multiplication-loop-order-and-vectorization/
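To show what "ordering" means here, the sketch below writes the same product with two loop orders. It uses plain Python lists for readability. The cache and vectorization effects discussed in the linked post concern compiled code, so this snippet only illustrates the structure of the two variants, not their speed. The function names are hypothetical.

```python
def matmul_ijk(a, b):
    """Classic order: the innermost loop walks down a column of b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_ikj(a, b):
    """Reordered: the innermost loop walks along a row of b and of c,
    which gives contiguous accesses in a row-major layout."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]  # reused across the whole inner loop
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c
```

Both functions accumulate each output element in the same order of `k`, so they return identical results.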

·

Scan HTML faster with SIMD instructions: Chrome edition – Daniel Lemire's blog

SIMD keeps providing interesting performance boosts for parsing workloads.

tech · cpu · performance · SIMD

June 8, 2024 at 4:04:17 PM GMT+2 * · permalink

·

https://lemire.me/blog/2024/06/08/scan-html-faster-with-simd-instructions-chrome-edition/

·

Let’s optimize! Running 15× faster with a situation-specific algorithm

Another good example of how to speed up some Python code with nice gains.

tech · python · performance · optimization

May 31, 2024 at 9:26:14 AM GMT+2 * · permalink

·

https://pythonspeed.com/articles/lets-optimize-median-local-threshold/

·

Never reason from the results of a sampling profiler – Daniel Lemire's blog

Definitely something to keep in mind when using sampling profilers. They're useful and give you a starting point for the analysis, but they're far from enough to find the root cause of a performance issue.

tech · performance · optimization · profiling

May 31, 2024 at 9:21:14 AM GMT+2 * · permalink

·

https://lemire.me/blog/2024/05/30/never-reason-from-the-results-of-a-sampling-profiler/

·

GPUs Go Brrr · Hazy Research

Interesting how much extra performance you can squeeze out of the GPU by going back to how the hardware works.

tech · gpu · hardware · ai · machine-learning · neural-networks · performance

May 13, 2024 at 8:40:37 AM GMT+2 * · permalink

·

https://hazyresearch.stanford.edu/blog/2024-05-12-tk

·

An informal comparison of the three major implementations of std::string - The Old New Thing

Interesting quick comparison; it shows the design tradeoffs quite well.

tech · c++ · performance · memory

May 11, 2024 at 7:40:28 AM GMT+2 * · permalink

·

https://devblogs.microsoft.com/oldnewthing/20240510-00/?p=109742

·

Setting up PostgreSQL for running integration tests

Interesting use of database templates and memory disks to greatly speed up test executions.

tech · tests · performance · databases · postgresql

April 13, 2024 at 9:50:53 AM GMT+2 * · permalink

·

https://gajus.com/blog/setting-up-postgre-sql-for-running-integration-tests

·

Software eco-design: investigating and reducing the energy consumption of software

More work on the eco-design of software. This is definitely welcome. I found this work a bit weak on the state of the art and the interview parts (10 people in the same company). But the field is so nascent that it's to be expected I guess; PhD students have to make do with what they have access to. Unsurprisingly, this shows a great lack of proper tools to tackle the measurement problem. This thesis shows interesting prospects for reducing variations in measurements though; some of the proposed guidelines might help but cannot offset the hardware heterogeneity completely... The parts focusing on practical advice around Java use and deployment are interestingly easy to apply though. You need to take into account the context of your application to make the right choices of course.

tech · performance · energy · ecology · java · research

April 8, 2024 at 5:32:00 PM GMT+2 * · permalink

·

https://theses.hal.science/tel-03429300/document

·

LLaMA Now Goes Faster on CPUs

Excellent work to improve Llama execution speed on CPU. It probably has all the tricks of the trade to accelerate this compute kernel.

tech · ai · machine-learning · gpt · llama · optimization · performance · cpu

April 1, 2024 at 10:34:36 AM GMT+2 * · permalink

·

https://justine.lol/matmul/

·

Optimizing SQLite for servers

With some tuning, SQLite can go a long way, even for server-type workloads. There are still a few caveats, but in some cases this can reduce complexity and cost quite a bit.