250 documents can backdoor a model.
“Only 250 malicious documents, roughly 420 thousand tokens, or just 0.00016 percent of a large dataset, are enough.” Roemmele argues that this kind of data poisoning gets “permanently embedded in the model weights” and that the only fix is retraining from scratch.
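As a quick sanity check on the quoted figures, the implied size of the full training corpus can be backed out from the poison fraction. This is a sketch; the total token count below is derived from the post's numbers, not stated in it:

```python
# Figures quoted in the post.
poison_docs = 250
poison_tokens = 420_000            # roughly 420 thousand tokens
poison_fraction = 0.00016 / 100    # 0.00016 percent, expressed as a fraction

# If those tokens are 0.00016 percent of the dataset, the implied total is:
dataset_tokens = poison_tokens / poison_fraction
print(f"implied corpus size: {dataset_tokens:.3e} tokens")
```

That works out to on the order of hundreds of billions of tokens, which is why such a tiny absolute number of documents is striking.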