Intuitive and expected result (maybe without the prediction of performance). I'm glad somebody did the hard work of proving it.
Though, if this is so clearly seen, how come AI detectors perform so badly?
This experiment involves each LLM responding to 128 or 256 prompts. AI detection is generally focused on determining the writer of a single document, not comparing two analogous sets of 128 documents and determining whether the same person/tool wrote both. Totally different problem.
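Roughly, the distinction looks like this (a toy sketch with a made-up stylometric feature and scipy's two-sample KS test; real detectors and the paper's method use far richer signals):

```python
# Toy illustration: per-document detection vs. set-level attribution.
# The feature here (mean token length) is deliberately simplistic.
from statistics import mean
from scipy.stats import ks_2samp

def feature(text: str) -> float:
    """One number per document; real systems use much richer features."""
    tokens = text.split()
    return mean(len(t) for t in tokens) if tokens else 0.0

def detect_single(doc: str, threshold: float = 5.0) -> bool:
    """Classic AI-detection framing: one document, one noisy decision."""
    return feature(doc) > threshold

def plausibly_same_source(set_a: list[str], set_b: list[str], alpha: float = 0.05) -> bool:
    """Set-level framing: with 128+ samples per side, comparing whole
    feature distributions gives far more statistical power than any
    single-document call."""
    result = ks_2samp([feature(d) for d in set_a], [feature(d) for d in set_b])
    return result.pvalue > alpha  # fail to reject "same distribution"
```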
It might be because detecting whether output is AI-generated and mapping output that is known to be from an LLM to a specific LLM (or class of LLMs) are different problems.
They're discovering the wrong thing. And the analogy with biology doesn't hold.
They're sensitive not to architecture but to training data. That's like grouping animals by the environment they live in, so lions and alligators come out closer to one another than lions and cats.
The real trick is to infer the underlying architecture and show the relationships between architectures.
That's not something you can tell easily by just looking at the name of the model. And that would actually be useful. This is pretty useless.
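To make the complaint concrete (purely illustrative numbers and a hypothetical feature extraction, not anything from the paper): hierarchical clustering on output-derived features can only group models by whatever signal dominates those features; it never sees the architecture at all.

```python
# Illustrative only: the linkage sees output features, not architectures.
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical per-model feature vectors derived from responses to the
# same prompt set (e.g. token-frequency profiles). Values are made up.
models = ["base_model", "finetune_of_base", "other_arch_similar_data"]
features = np.array([
    [0.10, 0.30, 0.60],  # base model
    [0.11, 0.29, 0.60],  # fine-tune: near-identical output statistics
    [0.12, 0.31, 0.57],  # different architecture, overlapping training data
])

# Agglomerative clustering groups whatever is closest in *output* space,
# so shared data (or fine-tuning lineage) dominates, not architecture.
tree = linkage(features, method="average", metric="euclidean")
dendrogram(tree, labels=models, no_plot=True)
```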
This is provocative, but it's off-base precisely in order to be so: why would we need to work backwards to determine the architecture?
Similarly, "you can tell easily by just looking at the name of the model" -- that's an unfounded assertion. No, you can't. It's perfectly cromulent, accepted, and quite regular to have a fine-tuned model that has nothing in its name indicating what it was fine-tuned on. (we can observe the effects of this even if we aren't so familiar with domain enough to know this, i.e. Meta in Llama 4 making it a requirement to have it in the name)