This is still an important step with LLMs. Just because the models are huge doesn't mean tokenizers have disappeared, or that you no longer need to clean up your data.
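A minimal sketch of the point, assuming the `tiktoken` library (my choice for illustration, not anything from the linked article): tokenization still shapes what the model actually sees, and dirty text fragments into more, rarer tokens.

```python
# Sketch: compare how a BPE tokenizer handles clean vs. messy text.
# The strings here are invented examples.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

clean = "The quick brown fox"
messy = "ThE  qUiCk\u00a0brown f0x"  # odd casing, non-breaking space, a digit

for text in (clean, messy):
    tokens = enc.encode(text)
    print(f"{text!r}: {len(tokens)} tokens -> {tokens}")

# The messy variant splits into more tokens: uncleaned data wastes context
# budget and hands the model noisier inputs.
```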
Wondering where some of the biases of image-generating AI models come from? This is an excellent deep dive into one of the most successful datasets used to train said models. And it was curated by... statistical models, not humans. Unsurprisingly, this amplifies biases all the way to the final models.
This is an excellent piece; I highly recommend reading it.
A good explanation of how sampling works: why it shines and where it falls short.
A very interesting debunking of the Dunning-Kruger effect, which is welcome given how pervasively it's cited. It also surfaces a few interesting facts from the papers that first critiqued it.
An interesting exploration of marriage statistics (in the US). People in some professions clearly marry within their own circles more than others.
A good reminder of the limits of machine learning: there's no clear path from machine learning to more general intelligence. The article doesn't account for emergent effects, though. They're a possibility, but that's a long stretch from what has been exhibited so far: it's "just" statistics.
A fascinating account of mental models and of statistical power.
It starts with how a flawed mental model of identity and social roles (coming from Facebook's founder) was imposed on others.
It then moves on to the mental model we tend to apply to vaccines, which shows once again how bad we are at intuitively grasping statistics and their applications. They require real effort, even when you're trained in them.
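As a hypothetical worked example of the kind of base-rate effect that trips up our intuition here (the numbers below are invented, not from the linked piece): with high vaccine uptake, most hospitalized people can be vaccinated even though the vaccine works very well.

```python
# Invented numbers for illustration only.
vaccinated_share = 0.90   # 90% of the population is vaccinated (assumed)
baseline_risk = 0.01      # hospitalization risk if unvaccinated (assumed)
efficacy = 0.90           # vaccine cuts that risk by 90% (assumed)

# Hospitalizations contributed by each group, per capita of the population.
hosp_vaccinated = vaccinated_share * baseline_risk * (1 - efficacy)
hosp_unvaccinated = (1 - vaccinated_share) * baseline_risk

share = hosp_vaccinated / (hosp_vaccinated + hosp_unvaccinated)
print(f"Vaccinated share of hospitalizations: {share:.0%}")
# -> ~47%: nearly half of the hospitalized are vaccinated, yet each
#    vaccinated person is still 10x less likely to end up in hospital.
```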