Overtone for Images: From Kubrick to the Champions League
Getting the "bigger picture" of what a reader is experiencing online
The most famous idiom about images is that a “picture is worth a thousand words,” but images can really do more. An image can help you see the **bigger picture** that you could never get with just text.
Overtone is excited to announce that we are now in the final phases of testing our models for content understanding not just on words and transcripts, but on images.
We are not the first to extract insight from images. Others use approaches like object recognition for weeding out pornography or recognizing a company’s logo in an image. However, our progress into images means we can bring something new to the world: our focus on creativity. One of Overtone’s goals is to elevate the way that ad slots are sold online by using AI to bring creative understanding.
In the media industry, focusing just on the objects present in a photo falls into the same trap as relying on keywords within text. Keywords were a useful way of sorting through content years ago, but two articles on the same topic can come across in very different ways.
Current technology such as Overtone’s enables us to achieve a more narrative understanding of what a human being is experiencing as they take in someone’s work, be it a news article, podcast or photograph.
Those moving and interacting with that content around the web, from the newspaper putting it in a newsletter to the advertiser placing an announcement next to it, can understand the real context of what is happening and work to make better human experiences.
Creative Understanding
Take, for example, this image of HAL 9000, the evil artificial intelligence at the heart of Stanley Kubrick’s 2001: A Space Odyssey. If you are looking only at the objects within it, it is a picture of a lens, or of a computer.
But what the viewer knows, and what Kubrick knows when he puts the image in his film, is that it is terrifying.
Overtone found that it had one of the highest probabilities of being “fearful.” (By contrast, of the movie images we analyzed, a prancing Julie Andrews in The Sound of Music was one of the most probable for “happy.”)
Our system can tell us this, not because it has a deep knowledge of 1960s cinema, but because it can analyze the image itself for its emotional resonance. Understanding the essence of a photograph is actually a concept that has been discussed in photography for a long time, such as Henri Cartier-Bresson’s “Decisive Moment”.
Be a Champion
If this sounds pretentious and arty, it’s because compared to a lot of the ways that content is currently distributed and monetized online, it is. But it also has immediate practical applications, such as making sure that ad placements better align with what an audience is experiencing as they consume content.
In this spirit of quality, Overtone looked at several thousand articles from the first week of league phase action in the Champions League, the pan-European football competition whose reason for being is organising high quality matches with “die Besten, les grandes équipes, the champions!”
If I work for Heineken or PlayStation or Takeaway.com or any of the other sponsors of the Champions League, it makes sense for me to advertise on articles that people are reading about the matches, but there can be too many to choose from. I can target a keyword like “Arsenal,” but this leads to articles with many different narratives.
Below is one of the happiest articles we found in the sample, with Leandro Trossard and Gabriel Martinelli smiling after they came on as substitutes and helped the London side beat Athletic Bilbao 2-0. Meanwhile, with very similar keywords (Arsenal, Athletic, Champions League), the advertisement could appear next to an image of Arsenal’s captain Martin Ødegaard on the ground after sustaining an injury. We labelled that image as one of the saddest in the set, and it is not exactly something you would raise a beer to. Arsenal is also mentioned in an image of the league table, which we rated as purely “informational.”
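To make the contrast concrete, here is a minimal Python sketch of keyword-only targeting versus targeting that also checks an emotion label. It is an illustration only, not Overtone's API: the `Placement` schema and the `select_placements` helper are hypothetical, and the keyword set for the league table is an assumption. The emotion labels are the ones described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One candidate ad slot next to an article image (hypothetical schema)."""
    description: str
    keywords: frozenset
    emotion: str  # a label such as "happy", "sad" or "informational"


def select_placements(placements, keyword, allowed_emotions):
    """Keep placements that match the keyword AND carry an allowed emotion label."""
    return [
        p for p in placements
        if keyword in p.keywords and p.emotion in allowed_emotions
    ]


# The three Arsenal examples from the text, with the labels given there.
match_keywords = frozenset({"Arsenal", "Athletic", "Champions League"})
candidates = [
    Placement("Trossard and Martinelli smiling", match_keywords, "happy"),
    Placement("Ødegaard on the ground, injured", match_keywords, "sad"),
    # Keywords for the league table are assumed for illustration.
    Placement("League table", frozenset({"Arsenal", "Champions League"}), "informational"),
]

# Keyword targeting alone cannot tell the three narratives apart.
keyword_only = [p for p in candidates if "Arsenal" in p.keywords]

# Adding the emotion label narrows the list to the celebratory image.
celebratory = select_placements(candidates, "Arsenal", {"happy"})

print(len(keyword_only))                     # 3
print([p.description for p in celebratory])  # ['Trossard and Martinelli smiling']
```

A sponsor that is comfortable with neutral content could pass `{"happy", "informational"}` instead, which would keep the league table as well.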
A More Human Internet
Overtone’s mission is to give human beings clarity on content online. We are excited to continue this work by exploring a deeper, human understanding of how images resonate with people. This will allow us to move beyond rudimentary tools like keywords and elevate the granular data of programmatic advertising into something more creative.
Next steps include integrating image understanding into existing products such as Community Lens, which can understand what images speak to a certain community, and Culture Pulse, which can understand how those communities grow and what they discuss over time. If you would like to be part of this, send us an email and let us know what you are thinking.
We believe that this insight will help both the advertisers, who get to connect to audiences more authentically, and the publishers, who put out quality work that can be drowned out in a world of AI slop. Especially with language model generation, we should work to highlight the work of photographers who create *the* image of an event, rather than just *an* image.
A more creative internet is one that understands the value of humans making art and journalism and sharing it with other humans. That, to us, is the bigger picture.
Overtone would like to thank students Mila Pretolani and Bayli Wolfe for their contributions to the image analysis framework this summer.