ART Theft Auto

While working on AOgmaNeo at Ogma, I test a ton of different encoding methods. AOgmaNeo, and Sparse Predictive Hierarchies (SPH) in general, rely heavily on high-quality, incrementally-learned sparse codes for their encoders. Recently, I have been playing a lot with ART (Adaptive Resonance Theory)-based encoders. ART is a neuroscience-inspired online/incremental learning method with tons of variants for different tasks. Here is a good survey paper. I have experimented with ART encoders in the past, but this time I found a new way to make them distributed, which seems to drastically outperform previous attempts in both runtime speed and end results.

The new ART-based encoders use a 2-stage process to learn distributed codes. First, each column in SPH (ART module) performs the standard ART activation algorithm (search and resonance). Then, a second stage kicks in, selecting among multiple columns. The second stage allows only the most active columns to participate in learning. This results in column-wise distributed codes.
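To make the 2-stage idea concrete, here is a minimal sketch in Python. It is not the AOgmaNeo implementation (which is C++ and structured differently); it assumes standard Fuzzy ART for stage 1 (choice function, vigilance test, fast-commit learning on complement-coded inputs), and models stage 2 as a top-k competition across columns, so only the most active columns update their weights. All class and parameter names (`ARTColumn`, `DistributedARTEncoder`, `vigilance`, `k`, etc.) are my own for illustration.

```python
import numpy as np

def fuzzy_and(a, b):
    # Element-wise minimum: the fuzzy AND used throughout Fuzzy ART.
    return np.minimum(a, b)

class ARTColumn:
    """One Fuzzy ART module, playing the role of a single SPH column."""
    def __init__(self, input_size, max_categories=16, vigilance=0.8,
                 choice=0.001, lr=0.5):
        # Uncommitted categories start as all-ones prototypes.
        self.w = np.ones((max_categories, input_size))
        self.committed = np.zeros(max_categories, dtype=bool)
        self.rho, self.alpha, self.beta = vigilance, choice, lr

    def activate(self, x):
        """Stage 1: ART search. Returns (winning category, its choice value)."""
        t = fuzzy_and(x, self.w).sum(axis=1) / (self.alpha + self.w.sum(axis=1))
        for j in np.argsort(-t):  # search categories by descending choice value
            match = fuzzy_and(x, self.w[j]).sum() / x.sum()
            # Resonance: match passes vigilance, or the node is uncommitted.
            if match >= self.rho or not self.committed[j]:
                return int(j), float(t[j])
        return int(np.argmax(t)), float(np.max(t))  # unreachable fallback

    def learn(self, x, j):
        # Standard Fuzzy ART learning rule.
        self.w[j] = self.beta * fuzzy_and(x, self.w[j]) + (1 - self.beta) * self.w[j]
        self.committed[j] = True

class DistributedARTEncoder:
    """Stage 2: all columns activate, but only the top-k most active learn."""
    def __init__(self, num_columns, input_size, k=2, **kw):
        self.columns = [ARTColumn(input_size, **kw) for _ in range(num_columns)]
        self.k = k

    def step(self, xs, learn=True):
        """xs: one complement-coded input per column. Returns the sparse code
        (winning category index per column)."""
        results = [col.activate(x) for col, x in zip(self.columns, xs)]
        code = [j for j, _ in results]
        if learn:
            acts = np.array([a for _, a in results])
            for i in np.argsort(-acts)[:self.k]:  # most active columns only
                self.columns[i].learn(xs[i], code[i])
        return code

# Usage: 4 columns, each seeing a 4-dim patch (8 dims after complement coding).
enc = DistributedARTEncoder(num_columns=4, input_size=8, k=2)
rng = np.random.default_rng(0)
v = rng.random((4, 4))
xs = [np.concatenate([vi, 1.0 - vi]) for vi in v]  # complement coding
code = enc.step(xs)
print(code)  # one winning category per column
```

The stage-2 competition is what makes the code column-wise distributed: every column still reports a winner, but weight updates concentrate in the columns that matched their input best.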

If you want to know more about AOgmaNeo and SPH in general, here is our Handmade Network entry. This contains a bunch of useful links.

In the past, I attempted to re-create YouTuber Sentdex’s GAN Theft Auto experiment, but using SPH. It worked, but it didn’t run quite as fast as I would have liked and lacked a lot of detail. It ran in real-time on the CPU with 8 threads, but it turns out we can do better. You can see the video and code for that attempt here.

For those who haven’t seen it, GAN Theft Auto is a model that uses Generative Adversarial Networks to learn a simple simulation of the game Grand Theft Auto V from a video dataset of movements collected on a bridge in the game.

With the new ART-based encoders, we can take things to the next level. With 8 CPU threads, I can train a better model than before in just 10 minutes. While the following results are still noisy compared to the original GAN-based result, and notably lack upscaling, I think it’s cool that we can run such things in the browser with WebAssembly. I also only have access to a sample of the dataset, not the whole thing.

Keep in mind that the original GAN Theft Auto was trained for quite some time on a DGX A100, and it requires quite a powerful GPU for inference, too. My version, however, can be trained at roughly the same speed as inference (which runs at 434 fps on my machine). And, well, it runs in the browser, using your CPU.

Anyways, without further ado, here is “ART Theft Auto”.

ART Theft Auto

Controls: A/D to turn. The demo may take a bit to load.
