Exploring Algorithms

Algorithms play an increasing role in our online lives. In their never-ending quest to profile their users accurately in order to maximise ad revenues, IT companies employ more and more sophisticated data mining methods that incorporate information from your activity history, your friends, your location and even strangers with similar interests to yours.

Ever since I watched Gary Kovacs’s shocking TED talk “Tracking the Trackers”, I have become increasingly privacy conscious and have taken several precautions to make myself harder to track: activating the ‘Do not track’ option in my browser, installing privacy-enhancing browser extensions such as Disconnect, and opting out of targeted ads within the Google settings.

Thanks to these steps, my advertisement profile with Google is now rather unspecific. Without such measures, however, Google has been able to profile people surprisingly well.

I actively try to avoid the personalised features that sites present me with. On Facebook I never use the EdgeRank algorithm that sorts my feed according to “Top Stories”. I use “Most Recent” instead, simply because most of the Top Stories are, unsurprisingly, paid content from pages I subscribed to, not posts from my friends. Another reason is that I prefer to keep an open mind, and personalised filters tend to create a filter bubble, which not only distorts people’s view of the outside world according to their own preferences and beliefs but also does so invisibly.

For this week’s exercise of exploring algorithms I have decided to take a look at YouTube, since I have a long-standing history of using that site. My main interaction with the site is with the “My Subscriptions” tab, which is always more relevant (and recent) than the algorithmically populated “What to watch” feature.

Logged into my account, this is what my front page looks like:

what_to_watch

I can immediately tell why YouTube is recommending these videos to me. All of them are closely related to videos I have watched on YouTube within the last 48 hours. I watched one “CinemaSins” video, one Pink Floyd song, one Kygo song, a fail video and a Strokes song. I actively sought out the songs I played, whereas the other videos showed up in my subscription feed, which is why I clicked on them.

If I scroll further down, it seems that YouTube still takes the same 3 or 4 videos from earlier and shows more related videos. Additionally, it suggests videos by la belle musique, a channel I am already subscribed to.

what_to_watch2

While the suggested videos generally meet my taste, they don’t necessarily entice me to watch them now, especially since they don’t lead me to interesting new channels which I might want to subscribe to.
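
The pattern observed on the logged-in front page can be illustrated with a toy sketch in Python. This is only a guess at the visible behaviour, not YouTube's actual algorithm: the function `recommend`, the `related` lookup table, the 48-hour window, the timestamp and the video titles are all made up for illustration.

```python
from datetime import datetime, timedelta


def recommend(history, related, now, window_hours=48, per_seed=2):
    """Suggest videos related to anything watched within the time window.

    history: list of (video_title, watched_at) tuples
    related: hypothetical lookup of video_title -> list of related titles
    """
    cutoff = now - timedelta(hours=window_hours)
    # Only recently watched videos act as "seeds" for suggestions.
    recent = [video for video, watched_at in history if watched_at >= cutoff]
    watched = {video for video, _ in history}

    suggestions = []
    for seed in recent:
        picked = 0
        for candidate in related.get(seed, []):
            if picked == per_seed:
                break
            # Skip videos already watched or already suggested.
            if candidate in watched or candidate in suggestions:
                continue
            suggestions.append(candidate)
            picked += 1
    return suggestions


# Made-up example data: an arbitrary "now" and a short watch history.
now = datetime(2015, 2, 1, 12, 0)
history = [
    ("CinemaSins: movie A", now - timedelta(hours=5)),
    ("Pink Floyd: song B", now - timedelta(hours=30)),
    ("Old tutorial", now - timedelta(days=10)),
]
related = {
    "CinemaSins: movie A": ["CinemaSins: movie C", "CinemaSins: movie D", "CinemaSins: movie E"],
    "Pink Floyd: song B": ["Pink Floyd: song F", "Pink Floyd: song B"],
    "Old tutorial": ["Another tutorial"],
}

print(recommend(history, related, now))
# prints ['CinemaSins: movie C', 'CinemaSins: movie D', 'Pink Floyd: song F']
```

Because the sketch only ever draws on a handful of recent seeds, it keeps circling the same few topics, which matches the impression that the front page rarely leads to genuinely new channels.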

If I log out and visit YouTube in incognito mode, I am greeted with the following suggestions:

what_to_watch3

None of these videos has any relevance to my search or watch history, but looking at the channel names (Ad Council, RadioKRONEHIT) one can assume that they have been placed on the front page because someone has paid for the placement.

Let’s take a look at the comments section of YouTube, which has long had a disastrous reputation. Apparently Google sorts the comments according to your Google+ profile, which I, however, never use. This shows in the comparison between the two comments pages of a random video I clicked on.

Logged in:
comments1

Logged out:
comments2

As we can see, probably because Google has so little usable profile data from my Google+ account, the comments shown are the same whether I am logged in or logged out. The combative nature of YouTube comments shines through once more. It seems that no matter how sophisticated the algorithms in place are, unless YouTube actively censors comments, it will always fight an uphill battle against the culture that has developed within the YouTube comments universe.

In conclusion, suggestion algorithms like the one used by YouTube can enhance your user experience to a certain extent, provided that you are okay with sharing enough data about yourself. Given the problems associated with filter bubbles and privacy, however, at present I still prefer a carefully hand-picked subscriptions list to algorithmically derived suggestions.

4 thoughts on “Exploring Algorithms”

  1. Mihael, I’m currently using Ghostery on Mozilla’s Firefox browser. Ghostery pops up when you visit a new site and lists all the trackers it’s blocking. The number of trackers that pop up on some sites is terrifying. I definitely concur with you about opting out and being wary of the filter bubble. Thanks for mentioning the news feed setting on Facebook as well, that was something I was completely unaware of!

  2. Thanks for sharing these links related to browser privacy, very useful! And Ghostery sounds great too Martyn, hadn’t heard of that one.

    I really liked the Verge article you shared here too: ‘Here’s how well Google’s search engine knows you’. I do think these privacy issues are really important, and will become increasingly so, however I couldn’t help thinking about how simple some of the ‘results’ of this profiling seemed. ‘Winter Sports’, ‘Latin American Music’ – those don’t seem very specific, and nor are they defining characteristics, are they? But perhaps the algorithms will get better, and ever more detailed profiling can take place.

    Nevertheless, I think there is an important theoretical point to raise here about identity: whether it is innate and waiting to be discovered (by algorithms, for example), or whether it is constructed (by society, for example). The ‘Google Gender’ category in particular got me thinking about this. So, does an algorithm ‘discover’ a gender that was already present, or are our actions subsequently categorised as either ‘male’ or ‘female’ according to agreed societal norms? The first example in that article seems to show how algorithms are involved in the construction of gender.

    I wondered if you might want to reflect on some of these issues further. Is privacy premised on the idea that we have core innate characteristics, unique to us, that have to be protected? And if identity is constructed, rather than innate, does that mean we don’t need to worry about privacy?

    Nice breakdown of your YouTube activity here too!

  3. Thanks Martyn and Jeremy for your comments!

    @ Martyn
    Thank you, I will check out Ghostery. It seems like one more useful tool to protect privacy on the internet.

    @ Jeremy
    You raise some very interesting points here! I wholeheartedly agree that the issues of privacy need to be taken much more seriously in our public discourse, regardless of whether our data are collected by a government entity or a private business. You are right when you say that information like “likes Latin American music” or “likes winter sports” seems rather inconspicuous on its own, but the point to be made here is that over time more and more such seemingly useless factoids merge to create a stunningly accurate profile. This reminds me of how, after the Snowden leaks, people tried to justify the warrantless NSA surveillance programs by citing that they only collected metadata, when in fact metadata (who did you talk to, when were you in what place, where did you use your credit card and for how much money, where did you go regularly, who was with you during those times, etc.) can potentially present a much more accurate description than the content of phone calls.
    I read recently that Uber could easily infer from their user data how likely someone was to be having an affair (and where) simply by looking at the patterns of people regularly riding to some place in the evening and riding back home in the early hours of the morning. Knowing that a private company can so easily obtain such sensitive information feels quite unsettling to me.
    I think that algorithms first discover identity but that as they become more aware of your existing identity and as they form a filter bubble for you they tend to influence you with their suggestions, possibly shaping your identity as you interact with their services more and more. In any case I think that privacy has to be protected if we want to live in a free society.

  4. Thanks Mihael for the article on Google ad profiles. I’m perversely happy that Google thinks my only interest is bicycles, which is so far off the mark. But I also want relevant search results, even if I don’t necessarily want Google to capture more of my data — I wonder if there is a middle way?
    Thanks Martyn for Ghostery!
