Blog
This blog is an arbitrary selection of posts.
Posts by topic
• Machine Learning • Other • Physics
All Posts
Solving adversarial attacks in computer vision as a baby version of general AI alignment
Stanislav Fort (Twitter, website, and Google Scholar)
A high-dimensional sphere spilling out of a high-dimensional cube despite exponentially many constraints
Make a square, split each side into two halves, producing four cells. Put a circle into each cell such that it fills it completely. There is a small gap right in the middle of the square. Put another circle there such that it touches the other four circles. The central circle is obviously inside the square, right? Yes, but only if the dimension you are in is $D\le9$. Above that, the central sphere actually spills out of the cube, despite the $2^D$ spheres in their cells keeping it in. In this post I present this simple-to-compute yet utterly counter-intuitive result.
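The claim is a one-line computation to verify. Here is a minimal sketch of it (the geometry follows the construction above; the concrete numbers assume the cube has side 4, so each of the $2^D$ cells has side 2 and holds a unit sphere):

```python
# Cube of side 4 centered at the origin, split into 2^D cells of side 2;
# each cell holds a unit sphere centered at (+-1, ..., +-1).
import math

for D in range(2, 13):
    # distance from the origin to a corner-sphere center is sqrt(D),
    # so the central sphere that touches all of them has radius sqrt(D) - 1
    central_radius = math.sqrt(D) - 1.0
    half_side = 2.0  # the cube extends to +-2 along every axis
    verdict = "spills out of the cube" if central_radius > half_side else "fits inside"
    print(f"D={D:2d}  central radius={central_radius:.3f}  {verdict}")
```

The crossover is exactly where $\sqrt{D} - 1 > 2$, i.e. $D > 9$.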
Adversarial examples to ConvNeXt
The "A ConvNet for the 2020s" paper from Facebook (Meta?) AI Research proposed a new architecture called ConvNeXt that is built out of standard convolutional blocks and seems to outperform the Vision Transformer (ViT). I wanted to know if it suffers from adversarial examples, so I wrote a Colab that loads up a pretrained version of the ConvNeXt model, runs a quick loop of the Fast Gradient Sign Method, and demonstrates that it's easy to find adversarial examples for this new model. This is not surprising, but I thought it might be valuable to demonstrate it explicitly as well as to create a Colab that others can run and modify for their own experiments.
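For readers who do not want to open the Colab, here is a minimal sketch of the kind of FGSM step it runs (loading ConvNeXt through the timm library, the model name, and the epsilon are my assumptions here, not details taken from the Colab):

```python
# Minimal single-step FGSM sketch in PyTorch.
import torch
import timm

# Assumes timm ships a pretrained ConvNeXt under this name.
model = timm.create_model("convnext_base", pretrained=True).eval()

def fgsm_attack(image, label, epsilon=2.0 / 255):
    """One FGSM step on a (1, 3, H, W) image tensor with a (1,) integer label."""
    image = image.clone().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Move every pixel by epsilon in the direction that increases the loss.
    adversary = image + epsilon * image.grad.sign()
    return adversary.detach()
```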
Vision Transformer finetuned without augmentation
A Vision Transformer (ViT) pretrained on ImageNet-21k finetunes significantly faster without training set augmentation for short optimization schedules on CIFAR-10 (<3 epochs) and CIFAR-100 (<7 epochs). Despite this, the official ViT finetuning Colab in the GitHub repository uses augmentation by default. It might be worth turning it off for your experiments to speed things up and save compute. If you are willing to run longer finetuning, augmentation gives a slight accuracy boost.
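As a rough illustration of the two settings being compared, here is what the input pipelines look like in torchvision-style code (the official Colab uses its own JAX/Flax input pipeline, so this is only a sketch of the idea, not its code):

```python
from torchvision import transforms

# With augmentation: random crops and flips, as in the default finetuning setup.
augmented = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Without augmentation: a fixed resize only, which sped up the short finetuning runs.
plain = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```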
Out-of-distribution sky color and image segmentation
Recently I was talking to a friend and they mentioned that they think a deep neural network trained on a distribution of natural images will not be able to handle an image with an unnatural green sky. I took a photo of a parking lot at Stanford, a poster for the movie The Martian, and several scenes with unusual sky colors from the sci-fi series The Expanse, cropped out their skies, recolored them blue, green, yellow, red, and purple, and verified that Mask R-CNN has no trouble segmenting out people and cars in them despite their very out-of-distribution sky colors. This shows that large vision systems are quite robust to such a semantic dataset shift, and that they will not immediately get confused if they see e.g. an unusual green sky.
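A sketch of the recolor-and-segment loop, assuming you already have a boolean sky mask for each image (the skies in the post were cropped out by hand) and using torchvision's off-the-shelf Mask R-CNN:

```python
import numpy as np
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

def recolor_sky(image, sky_mask, color=(0, 255, 0)):
    """image: HxWx3 uint8 array, sky_mask: HxW bool array, color: new sky RGB."""
    out = image.copy()
    out[sky_mask] = color
    return out

model = maskrcnn_resnet50_fpn(pretrained=True).eval()

def segment(image):
    """Run Mask R-CNN on an HxWx3 uint8 array and return its predictions."""
    tensor = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        return model([tensor])[0]  # dict with 'boxes', 'labels', 'scores', 'masks'
```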
Pixels still beat text: Attacking the OpenAI CLIP model with text patches and adversarial pixel perturbations
Adversarial examples are very easy to find for the OpenAI CLIP model in its zero-shot classification regime, as I demonstrated in my last post. Putting a sticker literally spelling B I R D on a picture of a dog will convince the classifier it is actually looking at a bird. This decision, however, can again be easily flipped to any other class (here frog in particular) by a targeted adversarial perturbation to the image pixels.
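The sticker part of the attack needs nothing more than pasting rendered text onto the image; a minimal PIL sketch (font, size, and placement are illustrative, not the post's exact settings):

```python
from PIL import Image, ImageDraw, ImageFont

def add_text_patch(image, text="BIRD", xy=(10, 10)):
    """Paste a white sticker with `text` written on it onto a copy of `image`."""
    patched = image.copy()
    draw = ImageDraw.Draw(patched)
    font = ImageFont.load_default()
    left, top, right, bottom = draw.textbbox(xy, text, font=font)
    draw.rectangle((left - 5, top - 5, right + 5, bottom + 5), fill="white")
    draw.text(xy, text, fill="black", font=font)
    return patched
```

Flipping the resulting "bird" decision to "frog" then uses the same kind of targeted pixel perturbation sketched under the next post.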
Adversarial examples for the OpenAI CLIP in its zero-shot classification regime and their semantic generalization
It turns out that adversarial examples are very easy to find (<100 gradient steps typically) for the OpenAI CLIP model in the zero-shot classification regime. Those adversarial examples generalize to semantically related text descriptions of the adversarial class.
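A minimal sketch of such an attack using the openai/CLIP package (the label prompts, step size, and number of steps are illustrative; the post's exact setup may differ):

```python
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model = model.float()  # keep everything in fp32 so the gradient loop stays simple

labels = ["a photo of a dog", "a photo of a bird", "a photo of a frog"]
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize(labels).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

def attack(image, target_idx=2, steps=100, lr=1e-2):
    """Nudge pixels of a preprocessed (1, 3, 224, 224) image so that CLIP's
    zero-shot prediction becomes labels[target_idx]."""
    image = image.clone().requires_grad_(True)
    for _ in range(steps):
        image_features = model.encode_image(image)
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        logits = image_features @ text_features.T  # cosine-similarity scores
        loss = -logits[0, target_idx]              # push the target class up
        loss.backward()
        image.data -= lr * image.grad.sign()
        image.grad.zero_()
    return image.detach()
```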
Perturbing circular orbits and General Relativity
The closure of near-circular orbits and the relativistic precession of Mercury’s perihelion using high school math
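For reference, the textbook general-relativistic result the post builds up to is the perihelion advance per orbit (the formula itself is standard; whether the post writes it in exactly this form is my assumption):

$$\Delta\varphi = \frac{6\pi G M_\odot}{c^2 a (1 - e^2)},$$

where $a$ is the semi-major axis and $e$ the eccentricity of the orbit; for Mercury this comes to roughly 43 arcseconds per century.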
subscribe via RSS