How Facial Recognition Can Help You Predict People’s Music Preferences

Any good music recommendation system should know its customers well.

How have platforms such as Spotify, Apple Music, or Amazon Music traditionally done this? In two ways:

  1. Ask users to rate songs (e.g. from 1 to 5 stars).
  2. Look at users’ behavioral data, such as clicks, purchases, and listening time.

The first approach (ratings) has been the more reliable of the two, but it interferes with the user’s listening experience. No one likes to be interrupted while enjoying their music, and the interruption can also make their assessment less accurate.

If you’re asked to rate a song before the next one starts, you’ll want to do it fast and not think about it too much. That added effort reduces the precision of your evaluation: you just want to get it done and move on.

So how can you assess songs more reliably, whether for a recommendation system, an ad assessment, or your band’s next album, without interfering with the user’s listening experience?

Two words: facial recognition.

Why use facial recognition?

Music is deeply intertwined with emotions. 

Listening to music evokes emotions that manifest throughout our bodies. When we enjoy the music, we tend to move more. Our heart rate increases, our pupils dilate, and the emotion shows in our facial features.

Facial expressions reveal what’s happening inside us and help us predict a person’s current emotions accurately. So when a user experiences two songs differently, they will show different facial expressions.

The more a song arouses people emotionally, the more it shows on their faces.

Thus, the following logic applies: music elicits emotions, emotions manifest on our faces, and once we observe a user’s facial expressions we can infer what they feel towards a certain song. By analyzing facial movements, we can then learn people’s music preferences.

How to apply facial recognition software to music preferences

A 2019 study by Italian and Slovenian researchers on people’s music choices showed that facial recognition is an effective way to capture music preferences, which can be used to build a more comprehensive music recommendation system.

This is especially important in a world increasingly surrounded by computers and robots. Imagine being able to set up your home to analyze your mood from your facial features and suggest music that either amplifies your current emotional state (e.g. enhances happiness) or dampens it (e.g. makes you less sad).

By analyzing your external features (facial expressions, heart rate, etc.), smart homes would be able to infer your internal emotional states and react accordingly to improve your quality of life.

Another experiment, carried out this year in England, recorded people’s faces during a classical music concert and tried to predict the audience’s music preferences from their expressions.

The study revealed that facial recognition software can detect multiple faces at the same time in a real environment without interfering with the audience’s listening experience. The researchers also found that analyzing people’s faces could predict how much they liked the music. For instance, when the software registered more happiness, people reported afterwards that they found the music more pleasant.
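One way to check a relationship like this on your own data is a simple correlation between detected happiness and self-reported pleasantness. The sketch below is only an illustration: the `pearson` helper, the variable names, and the numbers are assumptions made up for this example and do not come from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values, NOT data from the study:
# mean detected happiness per listener (0..1) and that listener's
# self-reported pleasantness rating (1..5).
happiness = [0.10, 0.35, 0.50, 0.20, 0.80]
pleasantness = [2, 3, 4, 2, 5]

r = pearson(happiness, pleasantness)
print(f"r = {r:.2f}")
```

A value of r close to 1 would suggest that listeners who showed more happiness also rated the music as more pleasant. A correlation alone does not prove that the software measures enjoyment, so treat it as a first sanity check.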

All this shows how useful facial recognition can be for learning about your audience’s emotional response to music in different environments, and for using that knowledge to improve the desired outcome (a music recommendation system, enjoyment at a concert, etc.).

Other music elements to consider when assessing music preferences

Consider the following elements to learn more about your user:

  • Playcount: the number of times a user listens to a certain song indicates how much they like it.
  • Listening time: the larger the share of a song a user listens to, the more they probably like it. A song played for 75% of its total duration is likely better liked than one that only reaches 30%.
  • Context-based music experience: the place where music is consumed matters a great deal. You might compare two songs on musical attributes such as tempo, loudness, or mood, but if you’re choosing music for a party, for instance, danceability and energy may be better attributes to compare. Imagine having a context-based music recommender. That would be amazing!
  • Closed-eyes scenario: sometimes we listen to music with our eyes closed. The eye region is highly valuable to facial recognition systems, so closed eyes make expressions harder to analyze. There have been some advances in reconstructing people’s expressions in closed-eyes scenarios (e.g. Microsoft’s research). Also, at Alyze we encourage people to keep their eyes open by showing videos with a low emotional imprint alongside the music. This way, people keep their focus on the screen without being distracted by the visuals, and they can concentrate on the music experience.
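The first two signals in the list, playcount and listening time, can be combined into a single implicit preference score. Here is a minimal sketch, assuming each play is logged with the seconds listened and the song’s duration. The `Play` record, the function names, and the equal weighting of every play are hypothetical choices for illustration, not how any particular platform does it.

```python
from dataclasses import dataclass


@dataclass
class Play:
    song_id: str
    seconds_listened: float
    song_duration: float  # in seconds


def listening_fraction(play):
    """Share of the song that was actually heard, clamped to [0, 1]."""
    if play.song_duration <= 0:
        return 0.0
    return max(0.0, min(1.0, play.seconds_listened / play.song_duration))


def implicit_scores(plays):
    """Sum listening fractions per song.

    Each play contributes between 0 and 1, so the total grows with both
    playcount and listening time. Equal weighting is an assumption.
    """
    scores = {}
    for play in plays:
        scores[play.song_id] = scores.get(play.song_id, 0.0) + listening_fraction(play)
    return scores


# Made-up listening log for illustration.
plays = [
    Play("song_a", 150, 200),  # 75% heard
    Play("song_a", 200, 200),  # full listen
    Play("song_b", 60, 200),   # 30% heard, then skipped
]
scores = implicit_scores(plays)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A score like this ignores context and facial signals entirely, so it works best as one input next to the emotional measurements discussed above.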

Alyze’s usefulness in assessing music preferences

Our AI technology helps our clients discover how their customers feel when watching their content. We can pinpoint specific moments in the content and show how viewers’ emotions vary depending on what’s being displayed on the screen.

This gives our clients a competitive advantage: they can deeply understand their audience and adjust to its needs. By combining science and art, our clients can create more personalized ads, movies, video games, and musical pieces that are more likely to resonate with their audience.

We suggest combining music with visuals of low emotional activation, like slow-moving landscapes, to keep people’s attention on the screen without distracting them from the main sense being researched: hearing.

In this piece, for instance, the slow-moving camera shot over mountain hills ensures people look at the screen without interfering with their listening experience. People’s emotional reaction develops as the music progresses: new musical elements are added as time passes, evoking an array of emotions. The music lets listeners breathe, with moments of low emotional activation, but it quickly regains momentum in the key parts of the song.

Thus, we see that the music elicits the feelings the artist intended and lets the audience experience different emotional waves, which makes this rock piece particularly exciting and innovative for the listener.

We could also compare different versions of the same song to see which parts capture the audience’s attention and which emotional reactions each part evokes. We can then compare the outcomes and choose the best arrangement before the official launch, which in turn increases the likelihood of a successful release.
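A comparison like this could be sketched as follows, assuming you already have an audience-averaged emotional activation value per second for each version. The function names, the section boundaries, and the numbers are all hypothetical, and treating higher activation as better is itself an assumption that depends on what the artist is aiming for.

```python
def section_means(timeline, sections):
    """Average a per-second activation timeline over named sections.

    sections maps a section name to (start_second, end_second), end exclusive.
    """
    means = {}
    for name, (start, end) in sections.items():
        window = timeline[start:end]
        if not window:
            raise ValueError(f"section {name!r} is empty or outside the timeline")
        means[name] = sum(window) / len(window)
    return means


def compare_versions(timeline_a, timeline_b, sections):
    """Return, per section, which version drew the higher mean activation."""
    means_a = section_means(timeline_a, sections)
    means_b = section_means(timeline_b, sections)
    result = {}
    for name in sections:
        if means_a[name] > means_b[name]:
            result[name] = "A"
        elif means_b[name] > means_a[name]:
            result[name] = "B"
        else:
            result[name] = "tie"
    return result


# Made-up illustrative values: audience-averaged activation per second (0..1).
sections = {"intro": (0, 3), "chorus": (3, 6)}
version_a = [0.2, 0.2, 0.3, 0.6, 0.7, 0.6]
version_b = [0.3, 0.4, 0.4, 0.5, 0.5, 0.6]

comparison = compare_versions(version_a, version_b, sections)
print(comparison)
```

In this toy example, version B has the stronger intro while version A has the stronger chorus, which is the kind of finding that could inform the final arrangement.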

When art meets science, amazing things can happen!

If you’re curious to learn more about what we do and how our technology can help your enterprise, visit our website and write to us so we can discuss your goals further. We’d be happy to help!
