Adversarial Machine Learning is responsible for assessing the weaknesses of machine learning models and providing countermeasures.
⚡ Attacks ⚡
Attacks are organized into four types: extraction, inversion, poisoning and evasion.
🔒 Extraction 🔒
Extraction attacks try to steal the parameters and hyperparameters of a model by making requests that maximize the extraction of information.
Depending on the adversary's knowledge of the model, white-box and black-box attacks can be performed.
In the simplest white-box case (when the adversary has full knowledge of the model, e.g., a sigmoid function), one can create a system of linear equations that can be easily solved.
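As a toy illustration of this case, the following sketch (all names and data are hypothetical) recovers the exact weights of a sigmoid-based model by turning each confidence query into one linear equation:

```python
import numpy as np

# Hypothetical victim: a logistic regression whose weights the adversary
# wants to steal. The adversary only sees the returned confidence scores.
rng = np.random.default_rng(0)
n_features = 5
w_true = rng.normal(size=n_features)
b_true = 0.7

def query_victim(x):
    """Black-box endpoint returning sigmoid(w.x + b)."""
    return 1.0 / (1.0 + np.exp(-(x @ w_true + b_true)))

# Each query yields one linear equation: logit(p) = w.x + b.
# n_features + 1 independent queries determine (w, b) exactly.
X = rng.normal(size=(n_features + 1, n_features))
p = query_victim(X)
logits = np.log(p / (1.0 - p))                     # invert the sigmoid

A = np.hstack([X, np.ones((n_features + 1, 1))])   # unknowns: [w, b]
solution = np.linalg.solve(A, logits)
w_stolen, b_stolen = solution[:-1], solution[-1]

print(np.allclose(w_stolen, w_true), np.isclose(b_stolen, b_true))  # True True
```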
In the generic case, where there is insufficient knowledge of the model, a substitute model is used. This model is trained on the requests made to the original model and the responses obtained, in order to imitate the functionality of the original one.
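A minimal sketch of the substitute-model approach, assuming the victim is exposed only through a prediction API (the models and data below are hypothetical):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Hypothetical victim, reachable only through its predict() API.
X_private = rng.normal(size=(1000, 10))
y_private = (X_private[:, 0] + X_private[:, 1] > 0).astype(int)
victim = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000).fit(X_private, y_private)

# The adversary never sees the private data; it only sends requests,
# records the answers, and trains a local imitation on those pairs.
X_queries = rng.normal(size=(500, 10))
y_answers = victim.predict(X_queries)
substitute = DecisionTreeClassifier().fit(X_queries, y_answers)

X_test = rng.normal(size=(200, 10))
agreement = (substitute.predict(X_test) == victim.predict(X_test)).mean()
print(f"Agreement with the victim: {agreement:.0%}")
```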
⚠️ Limitations ⚠️
Training a substitute model is equivalent (in many cases) to training a model from scratch.
Very computationally intensive.
The adversary has limitations on the number of requests before being detected.
🔄 Inversion (or Inference) 🔄
These attacks are intended to reverse the information flow of a machine learning model.
They enable an adversary to have knowledge of the model that was not explicitly intended to be shared.
They make it possible to learn the training data or information such as statistical properties of the model.
Three types are possible:
Membership Inference Attack (MIA): An adversary attempts to determine whether a sample was employed as part of the training (a minimal sketch follows this list).
Property Inference Attack (PIA): An adversary aims to extract statistical properties that were not explicitly encoded as features during the training phase.
Reconstruction: An adversary tries to reconstruct one or more samples from the training set and/or their corresponding labels. Also called inversion.
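As an illustration of the MIA case, here is a minimal confidence-threshold sketch (hypothetical data and model). It relies on the observation that overfitted models tend to be more confident on their training members than on unseen points:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X_members = rng.normal(size=(200, 16))     # training members
y_members = (X_members[:, :8].sum(axis=1) > 0).astype(int)
X_outsiders = rng.normal(size=(200, 16))   # never seen by the model

victim = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_members, y_members)

# Overfitted models are typically more confident on their training members.
conf_members = victim.predict_proba(X_members).max(axis=1)
conf_outsiders = victim.predict_proba(X_outsiders).max(axis=1)
print("mean confidence on members  :", conf_members.mean())
print("mean confidence on outsiders:", conf_outsiders.mean())

# The attack: guess "member" whenever confidence exceeds a threshold.
threshold = 0.85
print("TPR:", (conf_members > threshold).mean())    # members correctly flagged
print("FPR:", (conf_outsiders > threshold).mean())  # outsiders wrongly flagged
```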
💉 Poisoning 💉
These attacks aim to corrupt the training set in order to reduce the accuracy of a machine learning model.
This attack is difficult to detect when performed on the training data, since it can propagate among different models that use the same training data.
The adversary seeks either to compromise the availability of the model by modifying the decision boundary, thereby producing incorrect predictions, or to create a backdoor in the model. In the latter case, the model behaves correctly (returning the desired predictions) in most cases, except for certain inputs specially crafted by the adversary that produce undesired results. The adversary can thus manipulate the results of the predictions and launch future attacks.
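A minimal sketch of a poisoning attack of the first kind, assuming the adversary can flip labels in the training set (everything below is hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] > 0).astype(int)
X_train, y_train, X_test, y_test = X[:800], y[:800], X[800:], y[800:]

clean = LogisticRegression().fit(X_train, y_train)

# Poison 20% of the training set: relabel the most confidently
# positive points as negative to drag the decision boundary.
y_poisoned = y_train.copy()
y_poisoned[np.argsort(X_train[:, 0])[-160:]] = 0
poisoned = LogisticRegression().fit(X_train, y_poisoned)

print("clean accuracy   :", clean.score(X_test, y_test))
print("poisoned accuracy:", poisoned.score(X_test, y_test))  # typically much lower
```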
🔓 Backdoors 🔓
BadNets are the simplest type of backdoor in a machine learning model. Moreover, BadNets can survive in a model even when it is retrained for a task different from the original one (transfer learning).
It is important to note that public pre-trained models may contain backdoors.
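A minimal BadNets-style sketch (hypothetical 8x8 grayscale data): the attacker stamps a small trigger patch on a fraction of the training images and relabels them with a target class, so that the trained model associates the trigger with that class while behaving normally on clean inputs.

```python
import numpy as np

TARGET_CLASS = 7  # hypothetical class the backdoor should force

def stamp_trigger(image):
    """Place a bright 2x2 patch in the bottom-right corner (the trigger)."""
    poisoned = image.copy()
    poisoned[-2:, -2:] = 1.0
    return poisoned

def poison_dataset(images, labels, rate=0.05, seed=0):
    """Stamp the trigger on a small fraction of images and relabel them."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = TARGET_CLASS
    return images, labels

# After training on the poisoned set, any input carrying the trigger is
# steered towards TARGET_CLASS, while clean accuracy stays almost unchanged.
images = np.random.rand(100, 8, 8)
labels = np.random.randint(0, 10, size=100)
poisoned_images, poisoned_labels = poison_dataset(images, labels)
```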
🛡️ Defensive actions 🛡️
Detection of poisoned data, along with the use of data sanitization.
Fool the AI!: Hackers can use backdoors to poison training data and cause an AI model to misclassify images. Learn how IBM researchers can tell when data has been poisoned, and then guess which backdoors have been hidden in these datasets. Can you guess the backdoor?
🏃‍♂️ Evasion 🏃‍♂️
An adversary adds a small perturbation (in the form of noise) to the input of a machine learning model to make it classify incorrectly (an adversarial example).
They are similar to poisoning attacks, but their main difference is that evasion attacks try to exploit weaknesses of the model in the inference phase.
The goal of the adversary is for adversarial examples to be imperceptible to a human.
Two types of attack can be performed, depending on the output desired by the adversary:
Targeted: the adversary aims to obtain a prediction of his choice.
Untargeted: the adversary intends to achieve a misclassification.
🛡️ Defensive actions 🛡️
Adversarial training, which consists of crafting adversarial examples during training so that the model learns the features of the adversarial examples, making it more robust to this type of attack.
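A minimal PyTorch sketch of adversarial training with FGSM-crafted examples (`model` may be any differentiable classifier; all names are illustrative):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Untargeted FGSM: perturb x by eps in the direction sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps=0.03):
    """One training step on a 50/50 mix of clean and adversarial examples."""
    model.train()
    x_adv = fgsm(model, x, y, eps)
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```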
🛠️ Tools 🛠️
Adversarial Robustness Toolbox, abbreviated as ART, is an open-source Adversarial Machine Learning library for testing the robustness of machine learning models.
It is developed in Python and implements extraction, inversion, poisoning and evasion attacks and defenses.
ART supports the most popular frameworks: TensorFlow, Keras, PyTorch, MXNet and scikit-learn, among many others.
It is not limited to models that use images as input; it also supports other types of data, such as audio, video and tabular data.
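A minimal example of wrapping a scikit-learn model in ART and running an evasion attack against it (class names as in ART 1.x; check the documentation of your version):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype(np.float32)
y = (X[:, 0] > 0).astype(int)

model = LogisticRegression().fit(X, y)
classifier = SklearnClassifier(model=model)   # ART wrapper around the model

attack = FastGradientMethod(estimator=classifier, eps=0.5)
X_adv = attack.generate(x=X)

print("clean accuracy      :", model.score(X, y))
print("adversarial accuracy:", model.score(X_adv, y))
```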
CleverHans is a library for performing evasion attacks and testing the robustness of deep learning models on image tasks.
It is developed in Python and integrates with the TensorFlow, PyTorch and JAX frameworks.
It implements numerous attacks such as L-BFGS, FGSM, JSMA, C&W, among others.
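A minimal example of running FGSM with CleverHans against a PyTorch model (API as in CleverHans 4.x; the toy model is illustrative):

```python
import numpy as np
import torch
import torch.nn as nn
from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method

# Toy classifier; CleverHans attacks accept any callable mapping x to logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)

# Untargeted FGSM under an L-infinity budget of 0.3.
x_adv = fast_gradient_method(model, x, eps=0.3, norm=np.inf)
print(x_adv.shape)  # same shape as x, perturbed to induce misclassification
```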
🔧 Use 🔧
The use of AI to accomplish malicious tasks and boost classic attacks.
🕵️‍♂️ Pentesting 🕵️‍♂️
GyoiThon: Next generation penetration test tool, intelligence gathering tool for web server.
Deep Exploit: Fully automatic penetration test tool using Deep Reinforcement Learning.
AutoPentest-DRL: Automated penetration testing using deep reinforcement learning.
DeepGenerator: Fully automatically generate injection codes for web application assessment using Genetic Algorithm and Generative Adversarial Networks.
Eyeballer: Eyeballer is meant for large-scope network penetration tests where you need to find "interesting" targets from a huge set of web-based hosts.
🦠 Malware 🦠
DeepLocker: Concealing targeted attacks with AI locksmithing, by IBM Labs on BH.
DeepObfusCode: Source code obfuscation through sequence-to-sequence networks.
AutoCAT: Reinforcement learning for automated exploration of cache-timing attacks.
AI-BASED BOTNET: A game-theoretic approach for AI-based botnet attack defence.
SECML_Malware: Python library for creating adversarial attacks against Windows Malware detectors.
🗺️ OSINT 🗺️
SNAP_R: Automatically generate spear-phishing posts on social media.
SpyScrap: SpyScrap combines facial recognition methods to filter the results and uses natural language processing to obtain important entities from the websites where the user appears.
📧 Phishing 📧
DeepDGA: Implementation of DeepDGA: Adversarially-Tuned Domain Generation and Detection.
👨‍🎤 Generative AI 👨‍🎤
🔊 Audio 🔊
🛠️ Tools 🛠️
deep-voice-conversion: Deep neural networks for voice conversion (voice style transfer) in Tensorflow.
tacotron: A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial).
Real-Time-Voice-Cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time.
mimic2: Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.
WaveNet vocoder: Implementation of the WaveNet vocoder, which can generate high quality raw speech samples conditioned on linguistic or acoustic features.
Deepvoice3_pytorch: PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models.
eSpeak NG Text-to-Speech: eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
🖼️ Image 🖼️
🛠️ Tools 🛠️
DALLE2-pytorch: Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch.
ImaginAIry: AI imagined images. Pythonic generation of stable diffusion images.
Lama Cleaner: Image inpainting tool powered by a SOTA AI model. Remove any unwanted object, defect, or people from your pictures, or erase and replace (powered by stable diffusion) anything in your pictures.
DifFace: Blind Face Restoration with Diffused Error Contraction (PyTorch).
CodeFormer: Towards Robust Blind Face Restoration with Codebook Lookup Transformer.
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion.
Diffusers: 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch.
Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models.
InvokeAI: InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images.
Awesome AI Art Image Synthesis: A list of awesome tools, ideas, prompt engineering tools, colabs, models, and helpers for the prompt designer playing with aiArt and image synthesis. Covers Dalle2, MidJourney, StableDiffusion, and open source tools.
Weather Diffusion: Code for "Restoring Vision in Adverse Weather Conditions with Patch-Based Denoising Diffusion Models".
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis.
Dall-E Playground: A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini).
MM-CelebA-HQ-Dataset: A large-scale face image dataset that allows text-to-image generation, text-guided image manipulation, sketch-to-image generation, GANs for face generation and editing, image caption, and VQA.
Deep Daze: Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network).
StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing.
💡 Applications 💡
ArtLine: A Deep Learning based project for creating line art portraits.
Depix: Recovers passwords from pixelized screenshots.
Rewriting: Interactive tool to directly edit the rules of a GAN to synthesize scenes with objects added, removed, or altered. Change StyleGANv2 to make extravagant eyebrows, or horses wearing hats.
Fawkes: Privacy preserving tool against facial recognition systems.
Pulse: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models.
HiDT: Official repository for the paper "High-Resolution Daytime Translation Without Domain Labels".
3D Photo Inpainting: 3D Photography using Context-aware Layered Depth Inpainting.
SteganoGAN: SteganoGAN is a tool for creating steganographic images using adversarial training.
Stylegan-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis.
MegaPortraits: One-shot Megapixel Neural Head Avatars.
eg3d: Efficient Geometry-aware 3D Generative Adversarial Networks.
TediGAN: Pytorch implementation for TediGAN: Text-Guided Diverse Face Image Generation and Manipulation.
DALLE-pytorch: Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch.
StyleNeRF: This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis".
DeepSVG: Official code for the paper "DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation". Includes a PyTorch library for deep learning with SVG data.
NUWA: A unified 3D Transformer Pipeline for visual synthesis.
StyleGAN2 Distillation: Paired image-to-image translation, trained on synthetic data generated by StyleGAN2 outperforms existing approaches in image manipulation.
handwrite: Handwrite generates a custom font based on your handwriting sample.
📝 Text 📝
🛠️ Tools 🛠️
GPT Sandbox: The goal of this project is to enable users to create cool web demos using the newly released OpenAI GPT-3 API with just a few lines of Python.
PassGAN: A Deep Learning Approach for Password Guessing.
GPT Index: GPT Index is a project consisting of a set of data structures designed to make it easier to use large external knowledge bases with LLMs.
nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.