Tianyu Gu, Brendan Dolan-Gavitt and Siddharth Garg
Deep learning-based techniques have achieved stateof-the-art performance on a wide variety of recognition and classification tasks. However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task. In this paper we show that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-theart performance on the user’s training and validation samples, but behaves badly on specific attacker-chosen inputs.