Thinking about theory

The idea of using images to teach and learn with isn’t a random fad in the mold of all the new ideas and practices academics are told they should absorb. It is rooted in a range of theories which, when combined, point to greater student engagement with academic content and more active learning processes. The use of images in communication is commonplace beyond the academy in creative advertising (think of all the Greenpeace, Amnesty and Save the Children campaigns), newspapers, magazines and many more media. The ratio of text to image in communications has shifted substantially away from text only, towards the combination of text and images. It is normal in the outside world, and with sound scholarly reason. This post offers a partial and hopefully accessible overview of the theoretical underpinnings to help readers navigate that scholarship.

Guidance through a maze. Copyright



This post looks at a key theory of visual learning, that of Multimedia Learning, or MML. The idea behind MML is simple and reflects an older and familiar assumption or wisdom, depending on your point of view. We’ve probably all heard a variation on the aphorism that ‘a picture’s worth/paints a thousand words’. This refers to the idea that messages can be understood when transmitted through the medium of an image. This is the case from the moment we start to see, and continues until our learning becomes dominated by text. We interpret not just images but our entire worlds if we have the privilege of sight. This means that it is problematic to say that only a certain percentage of people are ‘visual learners’. We all are, or we would keep walking into solid objects and taking wrong turns. MML is a branch of scholarship that posits we understand more, and understand it better, with images and text than with text alone. This applies whether reading a newspaper, following the Tube, understanding a war, poverty, sexism, racism or anything our brains are faced with. It is particularly apposite because we live in the most visual era of human history.

Key digital platform supporting dissemination of imagery. Copyright David Roberts 2015

I should be really clear at this point that MML research and this Community of Practice (CoP) do not reject text in toto. This is about attaining a balance of the textual and visual through redistribution of data for efficiency of mental processing. MML argues that because we process information through our ears and eyes, we should balance data delivery through both, instead of overloading one with text whilst underexploiting our visual potential. ‘Images for the eyes, words for the ears’ might oversimplify the process, but perhaps it is a useful jumping-off point for engaging with this realm of research, led by Prof Richard Mayer.

Images for the eyes, words for the ears. Copyright David Roberts 2015

The idea is that we are ‘dual processors’ of information, and it isn’t hard to see why someone might make such a claim – eyes and ears. The use of this terminology in MML derives in large part from some very substantial research on working memory over the last 50 or so years, led by Allan Paivio, a celebrated Canadian scholar who published more than 200 articles and books. He brought clarity to the idea that our working memory (as opposed to our long-term storage) is ‘split’ over imagery and text, and that each channel has a limit on how much it can process in one go. Working memory is sometimes likened to RAM on a PC, with long-term recall most often compared with the hard drive full of files and videos. Paivio’s work highlighted the limits of working memory. To describe how short working memory can be, scientists refer to a ‘digit-span’ of around 7 digits, plus or minus 2. This means most of us can’t easily remember multiple mobile phone numbers.

Symbolizes the limits to our working memory, counted in our ‘digit-span’, or the number of phone numbers we can recall. Copyright

Dual processing is directly related to Cognitive Load Theory (CLT), which is most closely associated with John Sweller and which Mayer draws on in MML. Briefly, CLT posits that to ease pressure on limited working memory, input to the brain should be spread effectively – bifurcated if you will – through our ears and eyes. Paivio found that things we learn about in image form – a type of tree, an animal – are stored in both visual and verbal code, that is, as the image plus the words that attach to it. But words are only encoded verbally. In other words, we hold onto an image in more ways than we do a word, and can as a result recall it more easily. This is where split load meets dual processing. Combined, the effect is mentally soothing and empowering at the same time. Instead of one part of the brain doing all the hard work whilst the other is neglected, both eyes and ears work in balanced harmony to split and absorb input – lecture content, for example. So instead of overloading limited working memory and processing power with words whilst our eyes and visual processing go unused, balancing delivery between both channels takes the weight off text processing and brings online our visual processing abilities. When we use text only or even mainly, it might be akin to a car engine trying to run on only two cylinders.

Using only text processing is like an engine running on only two cylinders. Copyright David Roberts 2016

This has been a simplified account of the scholarship and arguments that underpin Multimedia Learning theory. I’ve published on this in a little more detail here, but the best sources are the originals, and I would thoroughly recommend Richard Mayer’s 2014 volume. This edited tome offers comprehensive specialist-level insight if you want to delve deeper into this method.
