The climate constraints are simply not compatible with the ongoing arms race on large neural network models. Training alone seems kinda OK, but inference at scale... and it's currently just rolled out as shiny gadgets. This really needs to be rethought.
Now, this is interesting research. With all that complexity, emergence is bound to happen, and there's a chance here to explain how and why. The links to training data quality and to the prompts themselves are interesting. It also explains a lot of the uncertainty.
The lack of transparency is staggering... this is purely about hype and at that point they're not making any effort to push science forward anymore.
Well, people asking relevant questions slow you down, obviously... since the goal for the latest set of generative models is to "move them into customers' hands at a very high speed", this creates tension. Instead of slowing down they seem hell-bent on throwing ethics out the window.
This is an excellent piece. Very nice portrait of Emily M. Bender, a really gifted computational linguist and a real badass if you ask me. She's out there asking all the difficult questions about the current moment with large language models, and so far the answers are (I find) disappointing. We collectively seem way too fascinated by the shiny new toy and the business opportunities to pay real attention to the impact of all of this on the social fabric.
When they changed their statutes it was the first sign... now it's clear all ethics went out the window. It's about fueling the hype to bring the money in.
Or why they are definitely not a magic tool for programming. Far from it. They might help developers a tiny bit, at the expense of undermining the learning of students who fall for them and of generating a massive amount of low-quality content.
Excellent piece as usual from Cory Doctorow. It quite clearly points out why Google is anxious and running off the chatbot cliff.
Inaccuracies, contradicting itself, conflating events, misquoting sources... all of that mixed with some correct facts makes it a perfect misinformation-spreading machine. This just can't be trusted at this point. Those experiments should be stopped in my opinion; better to do the homework properly first, then relaunch when this can be better trusted.
Are we surprised? Not really, no... you don't own any of the data you're feeding it. Keep it away from your secrets.
There's really something rotten in this AI "arms race"... they're clearly rushing for PR purposes, making mistakes and using tools the wrong way. This can only lead to large-scale disinformation if they don't correct course quickly. It has more political impact than it seems at first sight.
So transformer models produce things that look plausible... and that's it. What would it look like if we started building hybrid systems in which a transformer model is also tied to proper computation tools with general knowledge? This piece is a good illustration of what that could provide.
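To make the hybrid idea concrete, here is a minimal sketch of the routing it implies: let the plausibility machine handle phrasing, but hand anything computational to a deterministic tool. This is not the architecture described in the piece; every name here (`language_model`, `compute`, `hybrid_answer`) is a hypothetical placeholder.

```python
# Sketch: route computation to a deterministic tool, keep the model for wording.
import ast
import operator
from typing import Optional

# Tiny, safe arithmetic evaluator standing in for a "proper computation tool".
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def compute(expression: str) -> float:
    """Evaluate a basic arithmetic expression deterministically."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))

def language_model(prompt: str) -> str:
    """Placeholder for the transformer: fluent output, not to be trusted with facts."""
    return f"It is plausibly the case that {prompt} ..."

def hybrid_answer(question: str, expression: Optional[str] = None) -> str:
    """If the question carries a computable part, answer with the tool's result."""
    if expression is not None:
        return f"{question} -> {compute(expression)} (computed, not generated)"
    return language_model(question)

if __name__ == "__main__":
    print(hybrid_answer("What is 12.5% of 240?", expression="240 * 0.125"))
    print(hybrid_answer("why the sky is blue"))
```

The point of the toy is only the division of labor: the generated text never carries the number, the tool does.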
For all the conversations about how ChatGPT might displace jobs, there's a big untold question: how much copyright is violated in the process? It's also very concerning how much data it collects when you interact with it.
Or why you can't trust large language models for any fact- or knowledge-related tasks...
Its limits and biases are well documented. But what about the ideologies of the people behind those models? What can that tell us about their aims? Questions worth exploring in my opinion.
A few interesting points in there. Too much hype, though, and important points are glossed over; we'd all benefit from them being more actively explored.
The human labor behind AI training is still ongoing. It is clearly gruesome work, outsourced to other countries... the low price aside, this is also a good way to hide its consequences, I guess.
Very good piece about that dangerous moment in the creation of the latest large language models. We're about to drown in misinformation; can we get out of it?
A few compelling arguments about the impact of the latest strain of generative neural networks. The consequences for trust in online content, already eroded, are clear. I'm less convinced by some of the longer-term predictions this piece proposes, though.
There are a few reasons to worry about the latest strain of generative neural networks. One is the trust we can place in newly generated content. The other is indeed the impact on our culture. There's already been a trend toward focusing on what sells rather than what's truly novel or avant-garde, and this could well push it further. Will we drown in mediocre content?