Three months ago, I was at an “Intersection of AI and My Little Pony Fandom” panel. It was a panel about the ways the MLP fandom has used AI to generate creative work, starting from finetuned GPT-2 in 2020, through voice synthesis, and ending with, of course, Stable Diffusion. More specifically, the finetuned Pony Diffusion checkpoint, whose finetuning cost is estimated at tens of thousands of dollars. The talk ended with a proof-of-concept of a Discord bot that roleplayed a pony, via GPT-3.5, whose avatar was in-painted to different expressions based on emotions inferred from chat history.

As I asked questions about compute resources and the presenter’s position on generative AI ethics, I had a moment of realization. I was at a pony convention. Why are we talking about whether an RTX 3090 is big enough to finetune a LLaMa checkpoint? How are we talking about Vicuna, here of all places, while people are dressed up in cosplay next door?

A pony generated from Pony Diffusion

When people talk about technology improving more quickly, it usually evokes thoughts of the singularity. Technology indistinguishable from magic, making it easier to create more magic. But culture and communication are technologies too. The much easier and less speculative way for a field to move faster is by having more people working in that field. If research is an API call away, then congratulations, we’ve democratized ML, as long as you’re willing to pay for access. Combine that with the adoption of low-friction social media (aka Twitter), and you’ve got something going.

If the engine of invention is powered by people sharing random ideas until good ones emerge, then I can’t help but wonder if the best inventions are ones that make sharing ideas easier.

(Previous post about How to Invent Everything)

This field is just getting so big. Things change so fast! The MLP AI enthusiasts mentioned some pretrained LLMs that I had not even heard of. It’s not my field, but, like, I do this for a living. I was easily top 1% generative AI knowledge among bronies in 2020. Now I’m like, top 5%? A 5x increase sounds right. I can only attribute the growth to one truth: there are signs of life, and people are hungry.

I remember being a young whippersnapper, in the deep learning wave of 2015. Then I was the new guard. The old guard would complain that “You can’t just take an old idea, do it with a neural net, call it deep learning, and claim that part of ML for deep learning”, as researchers continued to take old ideas and reshape the field around a different MLP: the multilayer perceptron. Now people do the same thing with LLMs.

The flood is growing, and my time to drink from it is the same. Deciding what to drink is getting harder.

I’m noticing a trend of people posting LLM summaries of papers, talks, etc. They’re always attributed to the LLM, and they’re never fact-checked. I have a long history of learning and appreciating that the map is not the territory, and this trend is a bit like if people who loved maps got access to a map-making tool and created a billion maps. It is not even that they understand they’re working with approximations of reality. That would be better. What’s happening is that they don’t care they’re on an approximation of reality. If you like searching for truth, it is a very personal kind of hell.

The bitter lesson I’m taking is that I will have to get used to this. I will have to use an LLM where I can explain the loss function, but can’t explain the emergent phenomena. I will have to learn all the random tricks people have found for LLM prompting. There is going to be too much content to accept any non-augmented workflow. Such is the future of knowledge work.

“But why is this prompt so effective? Do we have a way to inspect what preferences the RLHF reward has extrapolated, given the preference labels we have?”

“The stuff is what the stuff is, brother! Accept the mystery.”