LLMs become more dangerous as they rapidly get easier to use
This is a concise summary by Ethan Mollick of what I increasingly see as a key factor driving the evolution of consumer-facing LLMs:
Using AI well used to be a pretty challenging process which involved crafting a prompt using techniques like chain-of-thought along with learning tips and tricks to get the most out of your AI. In a recent series of experiments, however, we have discovered that these techniques don’t really help anymore. Powerful AI models are just getting better at doing what you ask them to or even figuring out what you want and going beyond what you ask (and no, threatening them or being nice to them does not seem to help on average).
What limited truth there was to the inflated discourse of ‘prompt engineering’ has largely evaporated at this point, leaving us in a strange position. The conversational approach I’ve always advocated, literally treating the LLM as an interlocutor analogous to a human collaborator, remains the best way of getting the most out of these systems. But neither this nor ‘prompt engineering’ is necessary to get LLMs to perform usefully in real world contexts.
This seems like an obviously positive thing for users. However, my concern is that the burden of articulation with LLMs, explaining what you want in a clear and substantive way, was a necessary condition for responsible use of LLMs. This burden was never sustainable at the level of a commercially viable product. It presupposed a consistent level of reflexivity, a degree of cultural capital and a fluency with writing which was always incompatible with the kind of mass-market software that has a user base of one billion plus.
However it did ensure that users were thinking carefully about their use, because you literally couldn’t produce valuable outcomes with an LLM unless you were doing this. Now it’s very easy to get LLMs to infer what you want, often with at least some degree of success. What’s lost in the process is the reflexivity which came prior to use (i.e. the moment when you stopped to think “what am I actually asking Claude to do here?”) and the learning which came through reflecting on the output and the interaction it generates, as your sense of what you were trying to do is refined and expanded through dialogue.
This is how Claude Opus 4 glossed my analysis here. It conveyed my own words back to me, but in a way which surfaced an element (“the pause”) that was only latent in my original phrasing (“the moment when”), leading me to reflect more on the temporal dimension of “the burden of articulation”:
The “burden of articulation” you describe served as a kind of productive friction – it forced users to crystallize their thinking before engaging with the tool. There’s something valuable in that pause, that moment of having to translate a vague intention into clear language. It’s reminiscent of how writing itself can clarify thinking, or how explaining a problem to someone else often helps you understand it better yourself.
In this sense friction with LLMs was a positive thing because it necessitated meta-cognition. The optimisation of the human-model interaction erodes a feature which I would argue was immensely important, even if its value only manifested outside of the interaction itself. It doesn’t, I think, level the playing field, because those with the necessary capital and fluency can still use LLMs in a deeper and more reflective way, with better outcomes emerging from the process.
But it does create worrying implications for organisations which build this practice into their roles. Earlier today I heard Cory Doctorow use the brilliant analogy of asbestos to describe LLMs being incorporated into digital infrastructure in ways which we will likely later have to remove at immense cost. What’s the equivalent analogy for the social practice of those operating within these organisations?
https://soundcloud.com/qanonanonymous/cory-doctorow-destroys-enshitification-e338
