Generating OSM maps using StyleGAN2

Isaac Boates
10 min read · Oct 11, 2020

The story behind www.thismapdoesnotexist.com

Introduction

I was first introduced to Generative Adversarial Networks (GANs) after seeing the uncannily good computer-generated faces from thispersondoesnotexist.com. After being thoroughly amazed and even a bit frightened by the capabilities now within reach of neural networks, I noticed a little box in the lower right-hand corner. It had a link to Nvidia’s own code which can be used to train your own GAN and I thought, why not give it a try?

I’m a GIS developer by trade, a field which most people tend to find themselves in through an initial fascination with maps. Nowadays I spend more time in code editors and shell terminals than actually making maps, but I still have the same fascination and reverence for the classic map. One of my favorite cartographic projects out there right now is OpenStreetMap. If you haven’t heard of it, you can think about it almost like a “Wikipedia of maps”. Anyone can add to or improve any location in the world, leading to a worldwide map which is, at least in some places, arguably more detailed than the better-known Google Maps or Apple Maps.

I decided to train a GAN on OpenStreetMap. I quickly realized that I was not the only one to have this idea, though. There were other cartography enthusiasts who have made map-generating GANs.

Topi Tjukanov trained a model he named “MapDreamer” using historical maps. He included the whole map in his training set: neatline, legend, title and all. The training set was highly heterogeneous in both scale and theme, which produced fascinating combinations of very different maps. You can read more about his work in his own Medium post.

Another map-based GAN was created by Martin Disley, this time using a more consistent training set composed of historical “six inch to the mile” maps from the National Library of Scotland. The results from this GAN are almost indistinguishable from the real maps on which it was trained, and the network was even able to generate toponyms with real letters, even though they never spell anything coherent (not that that would be expected of a GAN… yet). His work is documented by the Scotsman.

Data Collection

The first task in training my GAN was to acquire the data. One might think I could have just downloaded the tiles directly from OpenStreetMap, but it wasn’t so straightforward. OpenStreetMap is a free, volunteer-driven service, and it would be very rude to start hammering their servers with requests for my vanity project. Additionally, I needed the raw data used to draw the map in order to narrow down my search for suitable images to use as training data, which we will get to a bit later.

Thankfully, it is possible to download regular dumps of raw OpenStreetMap data from Geofabrik at various administrative levels. Going further, these dumps can be automatically downloaded and spun up as a local TileServer instance via a very handy Docker image, from which one can then “download” the rendered OpenStreetMap tiles.

I decided to use Germany as the region from which I would get my training data, because it is one of the richest regions in terms of data availability on OpenStreetMap. Even small towns in Germany tend to have nearly every building, path, church, tree and local business mapped. But upon trying to build a TileServer image with all of Germany as its database, disaster struck: “OUT OF MEMORY”. It turns out that my consumer-grade laptop was not up to the task of processing one of the most data-rich regions in the world. Who knew?

Thankfully, my poor, abused little laptop was capable of building an image with a single German province, so I started from there. Once I had the TileServer container up and running, I had to decide exactly which tiles to use in the training set. I decided rather arbitrarily that I wanted to generate maps at about the town or village scale. But there is an inconvenient truth about having absolute cartographic coverage of a region at this scale, even in one as data-rich as a German province: most of it is boring. Here are a few remarkably dull examples:

I can barely contain my excitement.

This means that indiscriminately downloading tiles is going to result in a lot of “boring” samples, and I want the GAN to make maps that are at least a little bit more interesting than the pastel purgatory of a topographic countryside map. Thankfully, simply by having the TileServer container, I already had access to the raw data which it was rendering. This data is stored in a PostGIS database, and can therefore be accessed with regular SQL queries. So I made up some conditions for what I thought would make for an “interesting” map, and wrote a query which would return square boxes covering locations which met my criteria.

I played around a lot with different queries, but for brevity’s sake I will simply say that I ultimately decided that an “interesting” map would simply be a named location (e.g. a town or a village with a non-null name attribute) which had at least 50 polygons nearby whose “building” attribute was not null. As an SQL query on an OpenStreetMap database, that looks like this:

with building as (
    select
        osm_id,
        ST_Centroid(way) as geom
    from
        planet_osm_polygon
    where
        building is not null
),
pois as (
    select
        p.way as geom,
        count(b.geom) as numbuilding
    from
        planet_osm_point as p,
        building as b
    where
        p.place in ('village', 'town', 'suburb')
        and ST_DWithin(b.geom, p.way, 250)
    group by
        p.way
)
select
    ST_Envelope(ST_Buffer(geom, 500)) as geom
from
    pois
where
    numbuilding >= 50

The query returns a set of 1km x 1km squares around named locations with at least 50 buildings nearby. There was quite a bit more nitty-gritty work that had to be done to further refine the training images, but I don’t intend to bore every reader with excruciating technical details. To see exactly how I did it, have a look in the repository at “process_pbfs.py” and “download_tiles.py”, and feel free to ask me directly about anything that isn’t clear. Suffice it to say that I now had a collection of 256x256 pixel images, each of which (more or less) prominently featured its respective town or village.
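The tile-scraping step relies on the standard slippy-map tiling scheme that OpenStreetMap tile servers use. As a rough sketch of the coordinate math involved (the function name here is my own, not necessarily what “download_tiles.py” uses), converting a latitude/longitude pair into tile indices at a given zoom level looks like this:

```python
import math

def deg2num(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to slippy-map tile indices (x, y) at a zoom level."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    xtile = int((lon_deg + 180.0) / 360.0 * n)
    ytile = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return xtile, ytile

print(deg2num(52.52, 13.405, 10))  # a zoom-10 tile covering central Berlin
```

With the squares from the SQL query in hand, each one can be mapped to the handful of tiles it covers and fetched from the local TileServer.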

If you want to try out the repo for yourself with different locations of interest, just replace the contents of the query in the “sql” folder with one that returns areas you are interested in. Just make sure that it still returns squares of a fixed size.
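A quick sanity check for that requirement might look like the following (a hypothetical helper, not part of the repo), assuming each returned envelope has been reduced to a (minx, miny, maxx, maxy) tuple in a metric projection:

```python
def is_fixed_square(bbox, size=1000.0, tol=1.0):
    """Check that a (minx, miny, maxx, maxy) bounding box is a square of the expected size."""
    minx, miny, maxx, maxy = bbox
    return abs((maxx - minx) - size) <= tol and abs((maxy - miny) - size) <= tol
```

Running every returned geometry through a check like this before downloading tiles can save a lot of wasted scraping time.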

With the process now locked down for a single German province, the time had come to apply it to the remaining provinces. It took several hours for my laptop to build the OpenStreetMap database, launch the TileServer container and scrape the images into a usable training set. The thought of repeating this process for all provinces manually was intolerable. So, as can be seen in the repo in “process_pbfs.py”, I automated the entire process. For those who want to try it at home for different areas in the world, all one has to do is replace the values in the “regions” dictionary with URLs to the appropriate “pbf” and “poly” files as found in Geofabrik. You’re probably in for a long wait after that, so go have a cup of coffee or seventy.
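For illustration, the “regions” dictionary might look something like this. These entries are a hypothetical sketch following Geofabrik’s usual URL naming pattern; check “process_pbfs.py” for the exact structure it expects before editing:

```python
# Hypothetical example entries; the exact structure expected by
# process_pbfs.py may differ, so consult the script itself.
regions = {
    "bremen": {
        "pbf": "https://download.geofabrik.de/europe/germany/bremen-latest.osm.pbf",
        "poly": "https://download.geofabrik.de/europe/germany/bremen.poly",
    },
    "saarland": {
        "pbf": "https://download.geofabrik.de/europe/germany/saarland-latest.osm.pbf",
        "poly": "https://download.geofabrik.de/europe/germany/saarland.poly",
    },
}
```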

After a couple of days, all of Germany had been processed. I had a total of 26,296 training images, above even the 25,000 which NVlabs suggests in the examples on the StyleGAN2 repository itself. All that remained was to convert the data into “.tfrecord” format, which can easily be done as described in the StyleGAN2 repository.

Training

My first attempt at training was on Google Colab, because it’s free. To make a long story short, it didn’t work. Maybe it used to, because there is evidence all over the web of people training StyleGAN2 on Colab, but whenever I tried, memory consumption would climb until the process abruptly terminated with nothing more than a simple “^C” as its final output (the signature of a SIGINT terminating the running process). I assume Colab decided I was using too much RAM and killed the process. It would be nice if they were a bit clearer about what is going on, but I digress.

The StyleGAN2 repository README makes it rather clear that training this model consumes a LOT of memory, so it’s almost certainly the case that Colab (at least the free tier) is not sufficient to train StyleGAN2. So I began looking at provisioning a paid instance.

I eventually settled on using Lambda Labs. I’m not getting paid by them to post this, I swear. I just honestly had a really good experience with them and I found it affordable enough for what I wanted to do. StyleGAN2 requires a legacy version of Tensorflow (1.15), and they made it easy to downgrade to this version.

Before finally training, however, there was one last step, which I highly recommend any time you are planning on investing time and money in training a GAN: I shrunk the images down to 64x64 pixels and trained on those first. This let me provision a much smaller (and therefore cheaper) instance and train the model quickly to see if it would suffer from the dreaded mode collapse. And lo and behold, it did not. I still think it is always wise to spend a few bucks early to be absolutely sure before spending a lot more on the real thing, only to get a dazzling variety of grey smears for your trouble.
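Conceptually, the shrinking step is just aggressive downsampling; in practice I would reach for an imaging library, but a minimal illustration of nearest-neighbour downsampling on a raw pixel grid looks like this:

```python
def downsample(pixels, factor):
    """Nearest-neighbour downsample: keep every `factor`-th pixel in each dimension."""
    return [row[::factor] for row in pixels[::factor]]

# A 256x256 grid shrunk by a factor of 4 becomes 64x64
image = [[0] * 256 for _ in range(256)]
small = downsample(image, 4)
```

The 64x64 model trains far faster, so a mode-collapse failure shows up in hours rather than days.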

A fresh batch of 64 by 64 pixel success

Training on a Lambda Labs instance took a couple of adjustments to the repo. Not that anything was their fault — in fact they figured out what needed to be changed. The StyleGAN2 architecture is actually getting a bit old by this point, and the Lambda Labs instances seemed to be a bit too cutting-edge for it. Nonetheless, all the required changes can be made using this snippet, taken from my own automatic deployment script:

sudo apt install -y python3-tensorflow-legacy-cuda
pip install tensorboard
sudo apt-get install -y libprotobuf-dev
sudo cp /usr/lib/python3/dist-packages/tensorflow_core/python/_pywrap_tensorflow_internal.cpython-38-x86_64-linux-gnu.so /usr/lib/python3/dist-packages/tensorflow_core/python/_pywrap_tensorflow_internal.so
git clone https://github.com/NVlabs/stylegan2
sed -i 's/-D_GLIBCXX_USE_CXX11_ABI=0/-D_GLIBCXX_USE_CXX11_ABI=1/g' stylegan2/dnnlib/tflib/custom_ops.py

Note that some or all of these commands may not be necessary in the future. NVlabs may update their repository, and Lambda Labs may have updated their own environment so as to render them obsolete.

After all that business has been taken care of, the .tfrecord files containing the real, 256x256 pixel training images just needed to be uploaded to the instance and the training could be kicked off. I decided to go for the largest instance they had — 8x Nvidia Tesla V100 GPUs. It cost the most per hour, but after doing a bit of math and comparison, the training rate was fast enough to make it the most economic option.
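The comparison itself is simple arithmetic: what matters is not the hourly price but the cost per unit of training progress. With made-up numbers (the actual prices and throughputs will differ):

```python
def cost_per_kimg(price_per_hour, kimg_per_hour):
    """Dollars spent per thousand training images processed."""
    return price_per_hour / kimg_per_hour

# Hypothetical figures for illustration only
single_gpu = cost_per_kimg(1.50, 10.0)  # slower, cheaper per hour
eight_gpu = cost_per_kimg(9.00, 70.0)   # faster, pricier per hour
# Here the 8-GPU instance wins despite the higher hourly rate
```

If the larger instance scales training throughput faster than its price, it is the cheaper way to reach a fixed kimg target.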

The training can be started by running “run_training.py” as described in the StyleGAN2 repo, indicating where the training data can be found and how many kiloimages to process (and no, “kiloimages” is not a typo: the model really is intended to be trained on 50,000 kiloimages, i.e. 50,000,000 images). You can also specify whether to use mirror augmentation, which duplicates and flips your images, substantially increasing your actual pool of training samples.
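Mirror augmentation itself is nothing exotic: for every training image, the network may also see its horizontal mirror. On a raw pixel grid the flip amounts to reversing each row:

```python
def mirror(pixels):
    """Horizontally flip an image represented as a list of pixel rows."""
    return [row[::-1] for row in pixels]

image = [[1, 2, 3],
         [4, 5, 6]]
# mirror(image) == [[3, 2, 1], [6, 5, 4]]
```

This is also why readable text becomes unlikely: half of the labels the network sees are mirror-imaged glyphs.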

I trained my model with mirror augmentation, which may not have been ideal, because it then had virtually no chance of creating readable letters on the map. But I was concerned that without it, the model would not train as quickly or to the same quality as my 64x64 pixel model from earlier, so I kept it on. I eagerly watched it train until I forced myself to go to bed.

Results

When I woke up, I saw a lot of great progress. I decided to let it run for the rest of the day, periodically checking the FID50k score, which was calculated and dumped into a log file every few epochs. It appeared to converge at just under 4000 epochs, with an FID50k score of about 26. I’ve seen better, but after looking at the output results, decided it was good enough and shut it down.
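“Appeared to converge” here just means the FID score stopped improving meaningfully. A simple way to automate that judgement (a hypothetical helper; my actual check was eyeballing the log) is to compare the best recent score against the best earlier one:

```python
def has_plateaued(fid_history, window=3, tol=0.5):
    """True if the best FID in the recent window barely improves on the
    best earlier score (lower FID is better)."""
    if len(fid_history) <= window:
        return False
    earlier_best = min(fid_history[:-window])
    recent_best = min(fid_history[-window:])
    return earlier_best - recent_best < tol
```

Feeding it the FID50k values parsed from the training log gives a cheap stopping criterion instead of a judgement call.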

🎉

I’m actually quite amazed at how well the model was able to capture typical OpenStreetMap features: mostly rectangular buildings, green spaces, forests, red dashed lines for footpaths. It was even able to mimic some of the glyphs which represent various businesses and, most obviously, churches (no surprise, since there’s at least one in almost every town in Germany). There is definitely some fuzziness in how the roads tend to fizzle out or change type unexpectedly, and the street names and town names are effectively gibberish. In retrospect, maybe I should have risked turning off mirror augmentation to get slightly more coherent labels, but overall I’m quite happy with the results.

Do It Yourself

I’ve put up a simple website which randomly displays one of ten thousand towns generated by this GAN. You can access it at www.thismapdoesnotexist.com. If you want to generate your own maps, use the model for transfer learning, or do whatever else you can imagine with it, read on.

If you want to have a look at the source code I used to make this happen, you can find it at this repository. To train a similar model on different source data, as mentioned earlier in the article, you need only modify two things:

1. “process_pbfs.py”, to download the pbf files from Geofabrik for your specific region(s) of interest.

2. The script in the “sql” folder, to isolate the places you are interested in.

Conclusion

This was a roller coaster of a project. I was doing it in my free time as a hobby and had to put it down and pick it up quite a few times before finally finishing it all. But it was the project which really made it clear to me that machine learning is no longer some esoteric, arcane business available only to the smartest of the smart. You may not be ready to tackle its theoretical underpinnings, but if you just want to apply it, there are now so many models out there that need only a bit of time and effort (and admittedly, if you are doing something substantial, money) to bring your idea to life. And most of the time and effort, at least in my experience, is actually spent on acquiring and cleaning the data.
