Brand logo detection and recognition with AI

5 min read · May 20, 2020

Let’s try to find brand logos in images!

I usually don’t share much about the geeky stuff I do, but this is an attempt to change that. I became passionate about Machine Learning and Deep Learning a few years ago and I’ve done quite a few projects involving image or text classification, visual similarity, and even some generative stuff, but I hadn’t found a good excuse to mess around with Object Detection until now ;)

Why recognize brand logos?

Brands want to know where and how their brand is represented. This could be to understand audiences, measure advertising or sponsorship deals, etc. I’ve been working in the Digital Media space for a long time and have dealt with quite a few sports-related clients that would surely see the benefit of this sort of technology.

Approach

First, we need data. Luckily there are several datasets with logos available such as FlickrLogos, Logos In The Wild, WebLogo-2M, and more[1]. Each of these datasets contains annotated images for a specific set of brand logos.

Object Detection algorithms are usually trained to detect different types (classes) of objects in an image. While we could train an Object Detection network to detect the specific brand logos in the datasets, we’d likely also want to detect brand logos that were not part of the training data. For example, we train the network to detect the logos of Adidas and Nike, but what happens if we now also want to detect the Puma logo? It would be tedious to retrain or keep fine-tuning the neural net every time we want to search for a new kind of brand logo.

Instead of training to find the specific brand logos in a dataset, we train a general logo detector that extracts all logos from images or videos. We also train a feature extractor that understands the structure of brand logos. We then check the similarity of the logo features to determine a match.

Overview of the structure. There’s a logo detector to find logos in an image and a feature extractor. We use cosine similarity of the logo features to determine whether a logo is the logo we’re looking for.
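
To make the comparison step concrete, here is a minimal sketch of cosine similarity between two feature vectors, written in plain Python so it runs without any ML library. The vectors are toy values standing in for real feature extractor output, and the `is_match` helper and its `threshold=0.8` default are assumptions for illustration, not values from this project.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def is_match(query_features, candidate_features, threshold=0.8):
    # Hypothetical helper: the threshold is an assumed value that would
    # need tuning on real data.
    return cosine_similarity(query_features, candidate_features) >= threshold

# Toy 4-dimensional vectors standing in for real extractor output.
query = [1.0, 0.0, 2.0, 0.0]
same_direction = [2.0, 0.0, 4.0, 0.0]  # same direction, different magnitude
orthogonal = [0.0, 3.0, 0.0, 1.0]      # shares no active dimensions

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Because cosine similarity ignores vector magnitude, two crops of the same logo can score highly even if their features differ in overall scale.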

This isn’t a novel concept by any means. For instance, face recognition or face verification solutions have a similar pipeline of detect->extract->compare similarity so that new faces can be recognized with even a single new sample.
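
The "single new sample" idea can be sketched as a small gallery of reference embeddings, one per brand. This is only an illustration under assumptions: the `LogoGallery` class, its method names, the threshold, and the 3-dimensional toy vectors are all made up here, and a real setup would store features produced by the trained extractor.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in vec]

class LogoGallery:
    """Hypothetical store mapping each brand to one unit-length reference embedding."""

    def __init__(self):
        self._refs = {}

    def enroll(self, brand, features):
        # Adding a brand needs only one sample; no model weights change.
        self._refs[brand] = normalize(features)

    def identify(self, features, threshold=0.8):
        """Return (brand, score) for the best match, or (None, score) if below threshold."""
        query = normalize(features)
        best_brand, best_score = None, -1.0
        for brand, ref in self._refs.items():
            score = sum(q * r for q, r in zip(query, ref))
            if score > best_score:
                best_brand, best_score = brand, score
        if best_score < threshold:
            return None, best_score
        return best_brand, best_score

gallery = LogoGallery()
gallery.enroll("adidas", [0.9, 0.1, 0.0])  # toy vectors, not real features
gallery.enroll("nike", [0.1, 0.9, 0.0])
gallery.enroll("puma", [0.0, 0.1, 0.9])    # a brand added later, no retraining

brand, score = gallery.identify([0.05, 0.2, 1.0])
print(brand, round(score, 3))
```

Storing the references already normalized means each lookup is just a dot product per brand, which keeps the comparison cheap as the gallery grows.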

Data

As I mentioned earlier, there are a handful of datasets available in this space. The FlickrLogos datasets mainly contain closeup images of products, but we want to recognize logos in general real-life situations. WebLogo-2M is great but it’s HUGE (and I’ve heard quite noisy?) so for this little trial it makes sense to use the Logos In The Wild dataset[2].

Architecture

I’d been waiting for an excuse to try EfficientDet[4], so I went with that instead of good ole’ YOLO[3]. I used zylo117’s PyTorch implementation of EfficientDet[5], which I found was the friendliest to use.

Extracting image features can be done with a bunch of pre-trained architectures. I tried ResNet18[6] and VGG[7]. Their weights are pre-trained on ImageNet[8], which means that they’ll be great at recognizing cats, but not necessarily good at recognizing logos. I added a custom head and fine-tuned the networks to classify 226 different brand logo classes with ~84% accuracy. The feature extractor now better ‘understands’ what a logo is, which should help when we use the features to determine logo similarity.

Results

Here are some random examples. The recognized logo is boxed in yellow and labeled with the similarity score between that logo and the input logo.

We did not train on the Arsenal logo, yet it was found on Mattéo Guendouzi’s jersey.

The Puma logo was also not present in the training data, and even though the logo had a different color and stroke in the photo, it was still recognized.

Not able to find all of the Samsung logos. Some of them must have been too warped.

It seems to work better with logos that aren’t just text.

Definitely worse with text-only logos. The black and white split in the Juventus jersey caused confusion.

These were cherry-picked examples. Early tests show that it is better at detecting logos that aren’t just text. I also tried the Fly Emirates and Standard Chartered logos but it performed quite badly on those.

To better determine the accuracy, I could create a test set from some of the logo classes in the training set (and remove them from the training set so that these are brand logos the model has not seen before). The mean average precision on those images could serve as an accuracy indicator.
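
Here is a rough sketch of what that evaluation could look like: hold out whole brands, then compute a non-interpolated average precision per held-out brand and average them. Everything named here (`hold_out_brands`, `average_precision`, `mean_average_precision`, the seed) is hypothetical. It also assumes that each detection has already been marked correct or incorrect upstream, for example by an overlap check against the annotated boxes. The numbers are made up for illustration and are not measured results.

```python
import random

def hold_out_brands(brands, n_test, seed=0):
    """Split brand names into (train, test) so test brands are never trained on."""
    shuffled = sorted(brands)
    random.Random(seed).shuffle(shuffled)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

def average_precision(detections, n_ground_truth):
    """Non-interpolated AP for one brand.

    detections: list of (similarity_score, is_correct) pairs, one per
    predicted box; is_correct says whether it matched a real logo.
    n_ground_truth: number of real logos of that brand in the test images.
    """
    if n_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            true_positives += 1
            # Precision at this rank, counted only where a correct hit occurs.
            precision_sum += true_positives / rank
    return precision_sum / n_ground_truth

def mean_average_precision(per_brand):
    """per_brand: dict of brand -> (detections, n_ground_truth)."""
    aps = [average_precision(d, n) for d, n in per_brand.values()]
    return sum(aps) / len(aps) if aps else 0.0

train, test = hold_out_brands(["adidas", "nike", "puma", "samsung"], n_test=1)

# Made-up detections for illustration only.
toy = {
    "brand_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "brand_b": ([(0.95, True), (0.6, True)], 4),
}
toy_map = mean_average_precision(toy)
print(train, test, round(toy_map, 3))
```

Dividing by the number of ground-truth logos rather than the number of hits means missed logos pull the score down, which matters for cases like the partially detected Samsung logos.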

There’s still a lot of room for improvement if I wanted to use this in production. The accuracy of the logo detector could be improved by training EfficientDet with a higher compound coefficient or by replacing EfficientDet with YOLO or Mask R-CNN[9].

I’m quite happy with the results of this quick weekend project. I hope this article has been informative. I tried to keep things high-level and not go into too much detail. If you have any questions please drop a comment. Perhaps later I can share a notebook that you could use for inference and try it out for yourself.

Links

[1] https://www.researchgate.net/figure/Publicly-available-in-the-wild-logo-datasets-in-comparison-with-the-novel-Logos-in-the_tbl1_320726900
[2] https://www.iosb.fraunhofer.de/servlet/is/78045/
[3] https://pjreddie.com/darknet/yolo/
[4] https://arxiv.org/abs/1911.09070
[5] https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch
[6] https://arxiv.org/abs/1512.03385
[7] https://arxiv.org/abs/1409.1556
[8] https://en.wikipedia.org/wiki/ImageNet
[9] https://arxiv.org/abs/1703.06870

Legal stuff

For informational purposes only. Copyright of all subjects in the photos belongs to their respective owners.