Intuitive and expected result (maybe without the prediction of performance). I'm glad somebody did the hard work of proving it.
Though, if this is so clearly seen, how come AI detectors perform so badly?
This experiment involves each LLM responding to 128 or 256 prompts. AI detection is generally focused on determining the writer of a single document, not comparing two analogous sets of 128 documents and determining whether the same person/tool wrote both. Totally different problem.
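Roughly, the distinction looks like this (a toy sketch with a made-up stylometric feature and scipy's two-sample KS test; real detectors and the paper's method use far richer signals):

```python
# Toy illustration: per-document detection vs. set-level attribution.
# The feature here (mean token length) is deliberately simplistic.
from statistics import mean
from scipy.stats import ks_2samp

def feature(text: str) -> float:
    """One number per document; real systems use much richer features."""
    tokens = text.split()
    return mean(len(t) for t in tokens) if tokens else 0.0

def detect_single(doc: str, threshold: float = 5.0) -> bool:
    """Classic AI-detection framing: one document, one noisy decision."""
    return feature(doc) > threshold

def plausibly_same_source(set_a: list[str], set_b: list[str], alpha: float = 0.05) -> bool:
    """Set-level framing: with 128+ samples per side, comparing whole
    feature distributions gives far more statistical power than any
    single-document call."""
    result = ks_2samp([feature(d) for d in set_a], [feature(d) for d in set_b])
    return result.pvalue > alpha  # fail to reject "same distribution"
```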
It might be because detecting whether output is AI-generated and mapping output that is known to be from an LLM to a specific LLM (or class of LLMs) are different problems.
They're discovering the wrong thing. And the analogy with biology doesn't hold.
They're sensitive not to architecture but to training data. That's like grouping animals by the environment they live in, so lions and alligators come out closer to one another than lions and cats.
The real trick is to infer the underlying architecture and show the relationships between architectures.
That's not something you can tell easily by just looking at the name of the model. And that would actually be useful. This is pretty useless.
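To make the complaint concrete (purely illustrative numbers and a hypothetical feature extraction, not anything from the paper): hierarchical clustering on output-derived features can only group models by whatever signal dominates those features; it never sees the architecture at all.

```python
# Illustrative only: the linkage sees output features, not architectures.
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical per-model feature vectors derived from responses to the
# same prompt set (e.g. token-frequency profiles). Values are made up.
models = ["base_model", "finetune_of_base", "other_arch_similar_data"]
features = np.array([
    [0.10, 0.30, 0.60],  # base model
    [0.11, 0.29, 0.60],  # fine-tune: near-identical output statistics
    [0.12, 0.31, 0.57],  # different architecture, overlapping training data
])

# Agglomerative clustering groups whatever is closest in *output* space,
# so shared data (or fine-tuning lineage) dominates, not architecture.
tree = linkage(features, method="average", metric="euclidean")
dendrogram(tree, labels=models, no_plot=True)
```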
This is provocative, but it's off-base precisely in order to be so: why would we need to work backwards to determine the architecture?
Similarly, "you can tell easily by just looking at the name of the model" -- that's an unfounded assertion. No, you can't. It's perfectly cromulent, accepted, and quite regular to have a fine-tuned model that has nothing in its name indicating what it was fine-tuned on. (we can observe the effects of this even if we aren't so familiar with domain enough to know this, i.e. Meta in Llama 4 making it a requirement to have it in the name)