All posts by mkiseloski

generation y. vienna. tech enthusiast. world traveler. leadership developer. actor. investor. polyglot. night owl.

Search Algorithms and Learning – A Digital Essay

Introduction

The internet may very well be the most important invention of our lifetime. Very few other developments have changed our lives so radically in so many different aspects, from how we work and collaborate, to the way we do commerce and deal with our finances; from how we consume entertainment and connect with our peers through social networks to the way we create and share knowledge.

Never before in the history of humankind has there been more information available, nor has it been easier to access it from the comfort of our own home. Knowledge is no longer only in the hands of the privileged few but it is increasingly being shared by the many. New Web 2.0 technologies like blogging or social networking services give everyone a voice and have created the potential for the democratisation of knowledge. Algorithms have been at the heart of this transformation.

Since its inception the wealth of information available on the net has been growing and so has the need to organise it and to differentiate the relevant and factual from the inane. In the early days of the web this task was performed by human beings, with companies such as Yahoo! starting out as manually curated web directories (Wordstream). Soon enough, however, search algorithms entered the scene, which were better equipped to deal with the rapidly increasing number of websites. In 1996 Sergey Brin and Larry Page developed a new kind of search algorithm, PageRank, which ranks websites based on the number and quality of the links pointing to them. More important sites were assumed to receive more links from other pages and were thus ranked higher (Wikipedia). This innovative algorithm worked so well that it led to the founding of Google, which quickly grew to become the most successful search engine in the world. Google has since become so deeply ingrained within our culture that the verb “to google something”, which refers to the act of searching the web for information, has even been accepted into the Oxford Dictionary.
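To make the link-based ranking idea more tangible, here is a minimal sketch of a PageRank-style calculation in Python. It is only an illustration under stated assumptions: the four-page toy web, the function name and the iteration count are invented for this example, and the damping factor of 0.85 is simply the commonly cited default. It is not a description of how Google ranks pages today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure.

    `links` maps each page to the list of pages it links to.
    All names and parameters here are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share of importance.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its importance evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "home" receives the most links, "orphan" receives none.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(toy_web)
ordering = sorted(ranks, key=ranks.get, reverse=True)
print(ordering)
```

In this toy web the page that attracts the most incoming links ends up at the top and the page nobody links to ends up at the bottom, which is the intuition described above.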

The use of search engines is now ubiquitous and has become such an integral part of our daily lives that for many of us it is hard to imagine life without them. With the world’s information at our fingertips, just a few search terms away, it does not seem far-fetched that the internet and search algorithms in particular, play an essential part in the way we discover, share and retrieve knowledge. This essay sets out to explore the effects that search algorithms have on our memory; how they might be specifically utilised in an educational context to promote and facilitate learning, and what issues we need to stay aware of if we want to avoid the trap of blindly trusting the knowledge such algorithms present to us as truth.

Connectivism and Transactive Memory

The effects of networked technologies on human learning ought to be examined within a framework that incorporates the existence of such technologies. Several theories have been proposed about the nature of learning such as Behaviourism, Cognitivism and Constructivism, all of which to some degree consider knowledge to be “an objective, (or state) that is attainable (if not already innate) through either reasoning or experiences.” (Siemens 2005, p.1). According to Siemens, their limitation is that they do not account for learning that occurs outside of the person by means of technological storage or manipulation, nor are they able to explain how learning happens within organisations (Siemens 2005). Siemens therefore proposes an alternative theory of learning, dubbed Connectivism, which he considers to be the “integration of principles explored by chaos, network, and complexity and self-organization theories.” (Siemens 2005, p.1). It acknowledges the challenges of today’s rapidly evolving knowledge landscape, where a correct decision based on today’s information might be incorrect tomorrow as new information is constantly being acquired. The intent of all connectivist learning activities, the attainment of accurate, up-to-date information, is achieved through a process of connecting specialised nodes or information sources. Connections to these nodes have to be nurtured and maintained to facilitate continual learning. Learning can reside outside of a person’s mind, either within other humans or non-human appliances (such as databases), and identifying what to learn from whom (or what) becomes a core skill (Siemens 2005).

The idea that learning can reside outside of oneself was described in 1985 by psychologist Daniel Wegner with the concept of “transactive memory”, which posits that two (or more) people can create a shared store of memory between each other, with each person holding different knowledge from their partner. Information can then simply be recalled by asking the other person about it (Wegner et al. 1985). Intuitively, this seems to make sense. Suppose a good friend of yours is a car mechanic and you barely know how to drive a car. If your car’s engine suddenly starts stuttering you might not know the reason but you know that you can ask your friend who is an expert. The knowledge rests outside of your own mind yet thanks to your good relationship with your friend, you are able to gather the necessary information to find out which part needs to be replaced, how much it is likely going to cost and perhaps even where you can get the best deal.

In the connectivist view your friend is considered a node with specialised up-to-date knowledge and nurturing the connection (i.e. friendship) with your friend lets you retrieve this transactive memory as you need it. Within the framework of connectivism the node does not necessarily have to be a human being. As the information is likely available on the internet as well, online access to search engines such as Google or Bing may in fact be a source of transactive memory itself. These search algorithms act as a gateway, selecting and qualifying the nodes that are most likely to serve you the information you need.

Search Algorithms and their Effect on Memory

In order to test the cognitive consequences of having information at our fingertips, Betsy Sparrow et al. devised a series of experiments which tested the participants’ ability to recall information based on their expectation of the information being erased or saved in the computer for later retrieval. The results showed that when participants were expecting the information to be erased they were more likely to recall the facts they had been asked to remember than when they were assured that they would be able to retrieve them later on. In another experiment the study’s participants were again asked to recall a number of trivia facts but in addition were given folder names where they would later be able to find said facts. Surprisingly, the participants had better recollection of the places where the statements were kept than of the actual statements themselves. The researchers found evidence that people often don’t remember “where” when they remember “what” but that they are more likely to remember where to find the information rather than the details of it when they can reasonably expect said information to be continuously available to them, as is the case with constant internet access. They concluded that our minds are adapting to new computing and communication technologies and that search engines like Google make it possible for us to use the internet as a primary form of external or transactive memory. We are now just as dependent on the knowledge Google provides as we are on the knowledge we gain from friends or co-workers. Google has essentially become an all-knowing friend (Sparrow et al. 2011).

There are trade-offs between committing knowledge to our head and accessing it from “out there” in the world. Donald Norman states that knowledge in the head can be very efficient as it forgoes the need to search, find and interpret the external information, but it might require considerable amounts of learning to be usable and, due to its ephemeral nature (“out of sight, out of mind”), needs to be deliberately kept in mind, either through repetition or some reminding external trigger. Knowledge in the world, on the other hand, is self-reminding but it relies heavily on the continued physical presence of the information (Norman 2002). From a connectivist perspective one might argue, however, that Norman considers knowledge to be more or less stable; that once we put in the effort to learn something in our head we can reap the benefits of high efficiency and location-independent retrieval. While certain knowledge will never change (such as the laws of nature or mathematics) there are other areas such as medicine where today’s knowledge of how to treat certain diseases might be obsolete in just a few years’ time. Trying to keep the knowledge in the head up-to-date will require ongoing mental effort, somewhat diminishing said efficiency advantages. It seems that a balance needs to be struck between these two types of knowledge. Learning to distinguish when it makes more sense to internalise knowledge and which knowledge is better kept externally is turning out to be an increasingly important skill. One can argue that the networked nature of the internet is increasing the speed at which knowledge is shifting and that people choose to use the internet as a transactive memory not so much because technology now enables them to do it rather effortlessly but because they are adapting their strategy to better cope with an increasing and ever faster changing body of knowledge.

Search Algorithms and Discovery-Based Learning

Even though search algorithms are nowadays a ubiquitous tool for learning in our everyday lives they are not nearly as often utilised in an educational context, especially in schools. Nowadays, academic research on a topic is very likely to begin with typing the topic into a search engine. Children in most schools, however, still remain in a standard model classroom where the teacher stands in front of the class and conveys facts from a textbook, in accordance with the curriculum. The students hardly ever discover knowledge on their own but instead are asked to memorise the information given by their teachers because they need to be able to recall it when they get tested on it later.

There seems to be a disconnect between how people learn in the digital age and the model schools employ. What would happen if we were to let children use the technology that is available to us and have them discover the facts and relationships that make up the world all by themselves, simply by asking them to look for answers to the big questions? Sugata Mitra, an Indian professor and a proponent of minimally invasive education (MIE), is the originator of the now famous “Hole in the Wall” experiments where he placed computers in safe, public locations in rural India and had the local children teach themselves (and their peers) how to use them all on their own without any guidance whatsoever. Surprisingly, the children that participated in these experiments did become computer literate and it was suggested that in war-torn or disease-ridden places, where the necessary educational infrastructure is lacking, these MIE learning stations could serve as an adequate substitute (Mitra et al. 2005).

Taking this concept further, Mitra proposes a redesign of the way children learn in school. He envisions so-called Self-Organised Learning Environments where the learning process is allowed to self-organise; where the teacher sets this process in motion and then stands back to watch and let learning happen (Mitra 2013, TED). Children then use search engines and the transactive memory of their peers to discover, learn and generate understanding.

While the promises of unsupervised, self-directed learning, such as increased learner motivation, deeper understanding or decreased costs, seem highly appealing, they are by no means uncontested in the literature. Kirschner et al. criticise the lack of empirical evidence for the effectiveness of discovery-based teaching approaches. They argue that evidence from controlled studies almost uniformly supports strong, direct guidance rather than minimal guidance for novice and intermediate learners, and that even for advanced learners guided approaches are almost equally effective as unguided ones. Furthermore, they point at evidence that unguided instruction may even have negative results when students acquire misconceptions or incomplete or disorganized knowledge (Kirschner et al. 2006). Moreover, Sweller calls into question the effectiveness of using problem solving as a way to acquire the schemas necessary for problem-solving expertise. The heavy cognitive effort involved in searching for solutions may actually impede schema acquisition compared to analysing existing problem-solving strategies (Sweller 1988).

Therefore, it seems unlikely that completely unguided approaches will take over our educational system anytime soon. However, technology is becoming smarter and more powerful every year and computational knowledge engines such as WolframAlpha are able to solve increasingly difficult and complex problems for us. There will still be a need for educational guidance in the future but it will have to incorporate the technology we are growing dependent on in our learning.

Search Algorithms as the Gatekeepers of Truth

When we use search algorithms to learn something we usually do not question the veracity of the results. Since search algorithms essentially automate the sorting of data, one can easily assume that their results are in fact a neutral and objective representation of the information available on the internet. However, most search engines (with the exception of those few with a heavy focus on privacy, like DuckDuckGo) actually serve up personalised search results dependent on a variety of factors such as geo-location or browsing software. This happens regardless of whether someone is logged in or out (Pariser 2011). An unintended consequence of this personalisation is that the user is presented with skewed search results that the algorithm deemed relevant to him or her, often without the user even being aware of such an intervention. Search algorithms can create user profiles based on the user’s browsing and clicking history. Pages that the user is more likely to click are ranked higher than pages that might conflict with someone’s personal or political preferences. Over time this drowns out other important, uncomfortable or challenging points of view, leaving the user in a so-called “filter bubble” that is invisible to the user and beyond his or her control (Pariser 2011, TED).
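The mechanism behind such a filter bubble can be illustrated with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the scoring formula, the `weight` parameter, the article titles and the click counts are all invented, and no real search engine is claimed to work this way. The sketch only shows how adding a click-history bonus to an otherwise neutral relevance score can reorder results.

```python
def personalised_ranking(results, click_history, weight=0.5):
    """Re-rank results by adding a bonus for topics the user clicked before.

    `results` is a list of (title, topic, base_score) tuples and
    `click_history` maps a topic to a number of past clicks.
    Both structures are hypothetical.
    """
    total_clicks = sum(click_history.values())

    def score(item):
        _title, topic, base_score = item
        # Share of past clicks that went to this topic (0 if no history).
        affinity = click_history.get(topic, 0) / total_clicks if total_clicks else 0.0
        return base_score + weight * affinity

    return sorted(results, key=score, reverse=True)


# Invented example data: two viewpoints on the same search query.
results = [
    ("Article A", "viewpoint_x", 0.60),
    ("Article B", "viewpoint_y", 0.55),
    ("Article C", "viewpoint_x", 0.50),
]

neutral = [title for title, _, _ in personalised_ranking(results, {})]
biased = [
    title
    for title, _, _ in personalised_ranking(
        results, {"viewpoint_y": 9, "viewpoint_x": 1}
    )
]
print(neutral)
print(biased)
```

Without any history the results keep their neutral order. For a user who has mostly clicked on one viewpoint, the article matching that viewpoint moves to the top even though its neutral score was lower, and every further click on it would strengthen the effect.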

If we want to embrace the use of search algorithms as an integral part of our learning experiences we have to become aware of the power we allow them to hold when we accept them as the gatekeepers of truth.  As Gillespie puts it, algorithms ought to be considered a new knowledge logic, “posed against, or even supplanting the editorial as a competing logic. […] Both struggle with, and claim to resolve, the fundamental problem of human knowledge: how to identify relevant information crucial to the public, through unavoidably human means, in such a way as to be free from human error, bias, or manipulation.” (Gillespie 2012, p.26).

Algorithms are not the perfect solution to this fundamental problem of human knowledge and there might not ever be one. In a connectivist world we need to be able to trust the knowledge search algorithms serve us. We have to make sure, however, that this trust is never blind.

 

References

Gillespie, T. (2014). The relevance of algorithms. Media technologies: Essays on communication, materiality, and society, 167. Available at http://www.tarletongillespie.org/essays/Gillespie%20-%20The%20Relevance%20of%20Algorithms.pdf. Retrieved 26.04.2015.

Kirschner, P. A., Sweller, J., & Clark, R. E. (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational psychologist, 41(2), 75-86. Available at http://www.cogtech.usc.edu/publications/kirschner_Sweller_Clark.pdf. Retrieved 26.04.2015.

Mitra, S., Dangwal, R., Chatterjee, S., Jha, S., Bisht, R. S., & Kapur, P. (2005). Acquisition of computing literacy on shared public computers: Children and the “hole in the wall”. Australasian Journal of Educational Technology, 21(3). Available at http://www.ascilite.org.au/ajet/ajet21/mitra.html. Retrieved 26.04.2015.

Norman, D. A. (2002). The design of everyday things. Basic Books.

Oxford Dictionary. “to google”. Retrieved 26.04.2015.

Pariser, E. (2011). The filter bubble: What the Internet is hiding from you. Penguin UK.

Siemens, G. (2005). Connectivism: A learning theory for the digital age. International journal of instructional technology and distance learning, 2(1), 3-10. Available at http://www.itdl.org/Journal/Jan_05/article01.htm. Retrieved 26.04.2015.

Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776-778. Available at http://scholar.harvard.edu/files/dwegner/files/sparrow_et_al._2011.pdf. Retrieved 26.04.2015.

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive science, 12(2), 257-285. Available at http://dcom.arch.gatech.edu/old/Coa6763/Readings/sweller-88a.pdf. Retrieved 26.04.2015.

TED. “Eli Pariser: Beware online ‘filter bubbles’”. Feb. 2011. Retrieved 26.04.2015.

TED. “Sugata Mitra: Build a School in the Cloud”. Feb. 2013. Retrieved 26.04.2015.

Wegner, D. M., Giuliano, T., & Hertel, P. T. (1985). Cognitive interdependence in close relationships. In Compatible and incompatible relationships (pp. 253-276). Springer New York. Available at http://scholar.harvard.edu/files/dwegner/files/wegner_giuliano__hertel_1985_cognitive_interdependence.pdf. Retrieved 26.04.2015.

Wikipedia. “PageRank”. Retrieved 26.04.2015.

Wordstream. “The History of Search Engines – An Infographic”. Retrieved 26.04.2015.

Recap: Week 12 – Final Lifestream Summary

The Education and Digital Cultures 2015 course has finally come to an end and I would like to take this opportunity to look back at the last 12 weeks and thank my colleagues and my tutors Sian and Jeremy for this highly interesting journey into the land of cyborgs, algorithms and artificial intelligence.

Keeping a lifestream blog has been a new experience for me and while it did take some time, effort and, dare I say, frustration to set it up correctly, over the duration of the course it has grown into an excellent collection of resources for me to come back to.

Adding tags to every post was an incredibly helpful way to organise my lifestream according to different parameters such as source, type or topic. The tag cloud on the right shows a nice visual representation of this endeavour and gives us plenty of insights into the lifestream as a whole.

As we can see IFTTT is the biggest tag, meaning that around 3/4 of my posts were automatically populated from social media sites like Twitter, YouTube, Vimeo and Pinterest with the help of the IFTTT service.

In terms of types of content I shared a variety of videos and articles that I thought were really interesting and relevant to the discussions we were having during the course. In addition to the weekly recaps and my comments on my peers’ and my own blog I also posted two how-to guides to help my fellow students set up their lifestream. Furthermore there are one-off postings like the digital artefact from block 1, the MOOC ethnography from block 2 and my reflections on the tweetorial from block 3 of our course.

I didn’t really know a lot about the topics covered in this course before the semester started. It was therefore a very pleasant surprise for me that the themes we discussed were extremely fascinating and they challenged me to think about issues I had never considered before. I’ve been hearing about artificial intelligence all my life but I had never actually contemplated how vast its implications are going to be, not just in the field of digital education but for humankind in general and society as a whole.

This course has made me think about what it means to be human in an age where the lines between biology and technology are being increasingly blurred with biohackers substituting and even adding new senses to our biology. We are living in a time where computer algorithms are not just taking over more and more tasks that humans used to do (such as trading in the financial markets) but thanks to big data are now able to do things that weren’t even possible before, like personalised search results and video suggestions based on profiles of people similar to you.

Technology does an exceptional job in connecting people and I see a lot of potential in it to facilitate education for everyone, as shown by the development of MOOCs. Moreover, thanks to learning analytics, algorithms could uncover as yet unknown patterns in how the mind works. All this, however, comes at a hefty price: privacy. The more information we are willing to quantify about ourselves the more we allow certain entities to know about us. If we don’t ever want that knowledge to be used against us we have to become much more conscious about the issue of privacy and data security going forward. How we react to these issues will be one of the defining questions of the 21st century.

Recap: Week 11

Now that the taught section of the course has come to an end my main objective for the last two weeks of the course is to clean up the lifestream blog for final submission and to come up with a research question for my final essay.

Although I hadn’t planned to add additional content this week I couldn’t pass up the opportunity to share this excellent TED talk on YouTube by Stanford professor Fei-Fei Li on the newest advancements in machine learning, an overarching theme in this course and one of the most exciting topics I’ve learned about in a long time. All throughout this course I’ve been wondering where it will leave us humans and our human education once artificial intelligence surpasses our own. Might such a takeover happen before education (at least in its institutionalised form) even embraces some of the radical changes promised by Technology Enhanced Learning? Will there even be a need for education in a world where all important cognitive tasks are performed by sentient machines, or will education be an optional activity for those so inclined, similar to what learning a musical instrument is in today’s society?

I’m currently in the process of going through every post of my lifestream, fixing links, embeds and tags. Once I am finished with that I will be posting a final summary by the end of next week.

Recap: Week 10

Another week has gone by far too quickly and looking at the content in my lifestream this week the main theme “putting it all together” seems rather fitting, as I’ve been collecting interesting material that covers not just block 3 on algorithmic cultures but topics from the whole course.

The first post this week was an incredibly well done sci-fi short film I saw on Vimeo, called “Sight” on how augmented reality and gamification might drastically change the way we live and interact with each other in the future.

Next I linked to an interesting article I found on Twitter in the International Business Times that discusses the influence of content-curation algorithms and their inherent biases. It shows that people are often unaware of algorithms working in the background and that, when they learn about them, they often exhibit quite “visceral” reactions, followed by a change in their behaviour to accommodate the algorithms.

Another great longform article I found on the Verge discusses the possibility that memories might be able to survive outside of the brain which reminded me of the discussions we had when we explored posthumanism in the earlier weeks of the course.

The following post was an in-depth reflection on last week’s tweetorial where I looked at what we can learn from tools like Tweet Archivist and Keyhole, which algorithmically analysed the conversation we had on Twitter.

This week a couple of new talks from the latest TED conference showed up in my YouTube newsfeed and one of them in particular caught my attention as it was a new talk by neuroscientist David Eagleman, whom I had previously talked about in this post. While his main talking points were the same as in his previous video he offered some new results that look very promising. His sensory substitution vest, for example, seems to work very well in teaching a deaf person to hear. I am still just as excited as the first time I heard about this research. Maybe sensory addition really is just around the corner.

Finally I linked to a short animated TED-Ed video on whether robots can be creative. This video explores algorithms that come up with pieces of music which they then iteratively compare with music that humans consider to be “beautiful”, discarding the patterns that do not match and keeping the patterns that do. The results are remarkable to say the least. To an outsider the music these algorithms create sounds very much like it has been composed by a human being.

Now that the end of the course is drawing closer it is time to turn my attention to the final assignment. In the meantime I would like to say thank you to our exceptional course tutors Sian and Jeremy and my wonderful colleagues for the many thought-provoking and highly engaging discussions I’ve been blessed to be a part of over the last 3 months. :)

Reflections on the Tweetorial

Last week’s tweetorial was the first time I participated in a so-called “tweetstorm” on a topic. As I am not really an active Twitter user outside of this course it was a new experience for me. Reflecting on it I’ve noticed that the 140 character limit per tweet has very interesting and real consequences for a discussion and my own participation within it. Obviously, the limit causes people to express their opinions in very brief statements which can leave room for interpretation. To counteract the limitation one can keep sending out tweets to get one’s message across – in my mind a rather inefficient way compared to other media such as blog posts. The consequence is that it will likely clog up the Twitter feed and potentially drown out other voices. Another way is to think hard about how to best come up with an answer that is deliberately vague and open to interpretation yet still conveys meaning. I wasn’t too comfortable with overshadowing the conversation with too many messages (and I unfortunately couldn’t participate on Friday) but I tried to come up with messages that were appropriate for the medium.

[Image: tweet]

The Tweet Archivist and Keyhole analyses of our tweetorial show a discrepancy in the number of posts people were willing to share. While I was on the lower end with 6 tweets, the top tweeter by far was my colleague PJ with 59 contributions.

[Image: pie chart of tweets per participant]

As I was unavailable for most of Friday I unfortunately missed the peak of the discussion.

[Image: timeline of tweet activity]

Once I came back, I was feeling overwhelmed by the fragmented nature of tweets and retweets on a variety of topics. People hadn’t just stuck to the questions Sian and Jeremy had prepared but instead switched to other topics as well, such as the topic of learning to code – as seen in this keyword cloud:

[Image: keyword cloud of topics]

Looking at the data sets that these analytical tools generate I can’t help but question their value in terms of how they can help us learn. The word cloud is the best example of how data needs to be interpreted to create information, let alone to generate knowledge or wisdom. Atomising the conversation and displaying the frequency distribution of words visually might give an outsider a quick overview of the topics discussed but it doesn’t seem to promote much learning. Perhaps analytic algorithms will in the future be able to extract the meaning of such conversations and assist the learner in getting the gist of it but in their current state these analytical tools don’t seem to offer much value in terms of content.
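To show how little is actually happening behind a word cloud, here is a rough Python sketch of the underlying frequency count. It is an assumption about the general technique, not a description of how Tweet Archivist or Keyhole work internally, and the three sample tweets, the stopword list and the function name are invented for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real tools use longer ones).
STOPWORDS = {"the", "a", "to", "of", "and", "is", "in", "we", "it", "for"}


def word_frequencies(tweets):
    """Count how often each word occurs across a list of tweet texts."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase the text and strip links before splitting into words.
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        words = re.findall(r"[a-z']+", text)
        # Ignore stopwords and very short words.
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts


# Invented sample tweets, not taken from the actual tweetorial.
sample_tweets = [
    "Should everyone learn to code?",
    "Algorithms shape what we see, so learning to code matters",
    "Can algorithms help us learn?",
]
counts = word_frequencies(sample_tweets)
print(counts.most_common(3))
```

Note that the count treats “learn” and “learning” as two unrelated words and knows nothing about who asked a question or who disagreed with whom. The words are tallied, but the conversation itself is lost.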

There is, however, an interesting observation that we can glean from the analysis on a meta perspective: the social dynamics of the conversation.

[Image: user mentions]

Compared to the tweet count from earlier, we can see that Sian, even though she posted only half as many tweets as PJ, was addressed the most. As she is the tutor in this course this does not seem all too surprising, but considering the scale that learning analytics could reach, identifying such influencers might turn out to be valuable meta data.
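The difference between who posts the most and who is addressed the most can be computed with a few lines of Python. The following is only a sketch under assumptions: the `(author, text)` data layout, the handles and the sample tweets are hypothetical, and it does not reproduce how the analysis tools calculate their statistics.

```python
import re
from collections import Counter


def mention_counts(tweets):
    """Return (tweets posted per author, times each handle was mentioned).

    `tweets` is a list of (author, text) pairs, a hypothetical layout.
    """
    posted = Counter()
    mentioned = Counter()
    for author, text in tweets:
        posted[author.lower()] += 1
        # Every @handle in the text counts as one mention.
        for handle in re.findall(r"@(\w+)", text):
            mentioned[handle.lower()] += 1
    return posted, mentioned


# Invented conversation with placeholder handles.
sample_conversation = [
    ("student_a", "@tutor do algorithms have politics?"),
    ("student_a", "Thinking about it more, I believe they do"),
    ("student_a", "@student_b what do you think?"),
    ("student_b", "@tutor @student_a good question"),
    ("tutor", "Interesting thread, keep going"),
]
posted, mentioned = mention_counts(sample_conversation)
print(posted.most_common(1))
print(mentioned.most_common(1))
```

In this invented conversation the most active poster and the most addressed participant are two different people, which mirrors the pattern described above.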

Overall I have to say that the algorithmic snapshots of our tweetorial have not given me any particularly valuable insight that could significantly support me in learning from the tweets. Perhaps more context aware algorithms will one day be able to better distill the meaning of such conversations. For now, the meta data, particularly regarding the social structure of the conversations, are the most useful parts generated by these analytic tools.