Ever wonder how websites know exactly what you want to buy next? Look no further… I’m about to introduce you to the magic and mayhem of collaborative filtering.
If you’ve ever taken to the internet to do some shopping, which I’m guessing you have, you’ve probably added something to your cart and then noticed that the website you’re browsing has pulled up a few other products it thought you might like. You’ve also probably taken a look at those products, decided that you did, in fact, like the look of one or two of them, and added those to your cart as well. By the time your little “trip” to buy a mascara has ended, your cart is full of products you didn’t mean to buy but that looked too interesting to pass up. What gives?
One of the best things about online shopping is that you don’t need to deal with the pushy sales reps – I know it’s their job to recommend something else to you in order to increase sales (hence their title), but sometimes it’s nice to just buy the one thing you needed to buy without the risk of having other things pushed on you and the subsequent temptation to buy them… unless you’re like me and that happens anyway thanks to the little minx that is the “people also bought” section on the bottom of the product page.
Most commonly found on Amazon, these “people also bought” sections also show up on makeup websites such as ULTA and Sephora, where the section masquerades as “you may also like.” You can see it above on all three platforms, using the Too Faced Better Than Sex mascara as an example. No matter what they’re named, it’s no wonder these virtual sales reps are called “recommendation systems.”
How Do They Do It?
No, it’s not witchcraft – big data is “big” for a reason! Products are recommended to you using a technique called “collaborative filtering,” which you may have heard of before if you’ve taken an analytics or statistics class. This technique isn’t limited to product recommendations – you’ve interacted with it when you’ve been recommended people to become friends with on Facebook and when you’ve been shown what you might want to watch next on Hulu and Netflix.
Although there are a few ways to go about this, collaborative filtering is based on the idea that “people who agreed in their evaluation of certain items in the past are likely to agree again in the future.” Most systems use what’s called the “neighborhood-based technique” where “a number of users is selected based on their similarity to the active user” and then “a prediction for the active user is made by calculating a weighted average of the ratings of the selected users.” In layman’s terms, a system will find people you’re similar to for some reason (maybe based on purchasing habits) and recommend what those other people bought, on the assumption that if they purchased something and you’re similar to them, you might want to purchase it too.
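If you’d like to see that “weighted average” idea in action, here’s a minimal sketch in Python. Everything in it is made up for illustration: the shoppers, the star ratings, and especially the similarity measure (one divided by one plus the average rating gap on shared items), which is just an easy-to-follow stand-in and not what any particular website actually uses.

```python
# Hypothetical 1-5 star ratings; every name and number here is invented.
ratings = {
    "you":  {"mascara": 5, "cc_cream": 4},
    "alex": {"mascara": 5, "cc_cream": 4, "brow_pencil": 5},
    "sam":  {"mascara": 1, "cc_cream": 2, "brow_pencil": 1},
}

def similarity(a, b):
    """Toy similarity (an assumption): 1 / (1 + mean rating gap on shared items)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[item] - b[item]) for item in shared) / len(shared)
    return 1 / (1 + mean_gap)

def predict(active, item, k=2):
    """Predict a rating as the similarity-weighted average over the k nearest neighbors."""
    candidates = [
        (similarity(ratings[active], theirs), theirs[item])
        for user, theirs in ratings.items()
        if user != active and item in theirs
    ]
    # Keep the k users most similar to the active user (the "neighborhood").
    neighbors = sorted(candidates, reverse=True)[:k]
    total_similarity = sum(sim for sim, _ in neighbors)
    if total_similarity == 0:
        return None  # nobody similar has rated this item
    return sum(sim * rating for sim, rating in neighbors) / total_similarity

print(round(predict("you", "brow_pencil"), 2))  # 4.2
```

Alex rates things exactly the way you do, so Alex’s 5 stars count far more than Sam’s 1 star, and the prediction lands at 4.2 rather than at the plain average of 3.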
To illustrate this, let’s take our mascara example again: say you’re Person A and you’ve bought the Too Faced Better Than Sex mascara and the IT Cosmetics Your Skin But Better CC+ Cream on a Certain Website. Lo and behold, there’s a Person B who’s almost exactly like you (at least as far as the Certain Website is concerned) because they’ve bought the same things. Unlike you, though, your Doppelgänger has also purchased the ABH Brow Wiz pencil.
Being the opportunist that it is, the Certain Website will probably assume that since you two are already so similar in your purchasing habits, you’ll also want that Brow Wiz, so it’ll recommend it to you. Now the ball is in your court: you get to confirm whether you’re as similar to your Doppelgänger as the Certain Website has pegged you to be, or whether you’re ~different~ after all.
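For the curious, the Person A and Person B story can be sketched in a few lines of Python. This is a toy version under my own assumptions: the shoppers and baskets are invented, Person C is an extra shopper I added for contrast, and I’m measuring similarity with Jaccard overlap (shared items divided by all items across both baskets), which is one common choice but not necessarily what the Certain Website does.

```python
# Hypothetical purchase histories; person_c is an invented extra shopper.
purchases = {
    "person_a": {"mascara", "cc_cream"},
    "person_b": {"mascara", "cc_cream", "brow_pencil"},
    "person_c": {"shampoo"},
}

def jaccard(a, b):
    """Share of items two baskets have in common (0 = nothing, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(shopper):
    """Rank items the shopper hasn't bought by how similar their buyers are."""
    own = purchases[shopper]
    scores = {}
    for other, basket in purchases.items():
        if other == shopper:
            continue
        sim = jaccard(own, basket)
        if sim == 0:
            continue  # shoppers with nothing in common don't get a vote
        for item in basket - own:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("person_a"))  # ['brow_pencil']
```

Person B shares two of three items with Person A, so the brow pencil gets recommended. Person C has nothing in common with Person A, so the shampoo never makes the list.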
That’s a Little Weird
It is, isn’t it? *creepy smile*
As you might have already guessed, big data is a pretty powerful thing, but it isn’t without its controversies. Target, a store that’s near and dear to (most likely) all of our hearts, has landed in hot water in the past for its pregnancy prediction model, which may have been too accurate. As the classic story goes, Target wanted to send specifically designed ads to women entering their second trimester, since this is when most expectant mothers begin to buy new things. Andrew Pole and other members of Target’s Guest Marketing Analytics team were able to identify 25 products that, when analyzed together, allowed them to assign each shopper a “pregnancy prediction” score. More impressively, this data could be used to estimate a shopper’s due date to within a small window, allowing Target to send coupons at specific stages in her pregnancy.
This is where the fun begins: apparently, after this model was launched and put into use, an angry man walked into a Target store holding coupons for items like baby cribs and clothes that had been addressed to his daughter, who was in high school. He demanded to speak to the manager and asked if the company was trying to encourage her to get pregnant, and the manager apologized on the spot. Wanting to apologize again, the manager called the man a few days later, only to hear that his daughter had actually confessed to being pregnant and was due in August. This time, the man apologized to the manager. Whoops.
This all just goes to show that even when the models themselves seem to be extremely accurate, things can fall apart if the application isn’t perfect. Another caveat of these models is that they’re only as good as the data you “feed” them: you may have noticed that the three websites mentioned above in our Too Faced mascara example recommend different products. This might be because they’re using different collaborative filtering techniques, but I’m guessing it’s mostly because they have completely different data sets, since the data each one uses is specific to its own platform. This doesn’t mean that Amazon’s recommendations are better than Sephora’s or ULTA’s (or the other way around), only that each site’s recommendations are based on its own customers’ purchasing habits.
I love data and I love makeup, so this kind of stuff is a match made in heaven for me. If you ever wondered how the sites you shop on seem to recommend products that you actually like, my best guess is collaborative filtering. If you have never wondered this, I’m wondering how you got this far on this post, but if this wasn’t to your interests, maybe there’s another post of mine I might be able to recommend…
- Too Faced Better Than Sex Mascara // Amazon // ULTA // Sephora
- Towards Data Science // Various Implementations of Collaborative Filtering
- Recommender Systems // Collaborative Filtering
- The New York Times // How Companies Learn Your Secrets
*Disclaimer: I am not affiliated with or compensated by any of the brands mentioned (I wish!). As always, all thoughts & opinions are my own (unless stated otherwise)!