Who needs data literacy?

Is ‘data literacy’ a useful response to the datafication of contemporary life – not least of education itself? It all depends, not only on how it’s defined but also on how it is practically implemented.


Versions of ‘literacy’ have often been proposed as answers to problems of social policy. While media education in the UK has a longer history, ‘media literacy’ first emerged on the UK political stage in the late 1990s as an apparent answer to the ‘problem’ of screen violence. Some (including politicians like the late Tessa Jowell) had a broader conception, but eventually media literacy was reduced to a matter of functional skills, and a means of addressing a narrow set of concerns about internet safety.

More recently, we’ve had the problem of so-called ‘fake news’; and once again, media literacy – or alternatively information literacy or news literacy – has been proposed as the answer. And in the UK government’s latest policy proposals, we now have ‘digital literacy’ as the means of dealing with both issues, of safety and misinformation.

If you follow any of the links here, you’ll see that I have been banging on about this since the inception of this blog, and I was by no means the first to do so. My scare quotes here reflect a considerable scepticism about how these problems are defined in the first place, and the extent to which these ‘literacy solutions’ will be capable of addressing them. In practice, this invocation of literacy typically results in a lot of lip service and little action; and the versions of literacy that are proposed are mostly reductive and instrumental – quick-fix, individualistic solutions to much bigger and more complex problems.

And now we have something called data literacy. It’s not quite the same thing as ‘digital literacy’ or any of the other literacies I’ve mentioned, although it certainly overlaps with them. For some, the term refers to a relatively functional set of skills – essentially, the ability to access, interpret and communicate information in the form of data. However, as we’ll see, there are more critical accounts that regard data literacy as a potential response to the datafication of modern life – that is, the way that growing numbers of everyday social interactions have become opportunities for the gathering (and subsequently selling) of data. In this context, data literacy might even provide a means of resisting the seemingly ubiquitous power of large technology (or media, or data) companies.

Of course, datafication precedes the advent of digital technology. The increasing use of data to represent personal characteristics and aspects of behaviour is part of a much longer-term development of record-keeping systems, dating back at least a century – systems that have sought to measure, categorise and exercise surveillance over mass populations. All such systems are inevitably infused with ideological values and assumptions; and the growing use of data is bound to shape personhood – how we think of ourselves and others, and the kinds of people we seek to become.

However, the ubiquitous use of digital technology takes this to another level: the creation and use of digital data increasingly channels how we define, construct and perform identity. And yet, by and large, we have very little control over how this data is gathered, organised, analysed and used. The system is opaque and inaccessible; and this can quickly lead to a sense of indifference and apathy – and indeed to cynicism. We know that our technological interactions have become a set of ‘data points’, and that we have very little privacy left: but there’s little if anything we feel we can do about this.

None of this is exactly a revelation. When writers like Shoshana Zuboff came along in 2019 and informed the world about this, they were telling us things that most students of media had already known for some time. The whole business model of contemporary media/technology companies is premised on the generation and selling of personal data: the internet would probably not exist without it.

In recent years, education itself has become a particularly lucrative market in this respect. Here, datafication is no longer just a matter of gathering and collating test scores: it is also about ‘trace data’ generated as students engage with learning activities, which is then used (through various forms of artificial intelligence) to shape the teaching they receive. Here, as in other aspects of social life, datafication is reinforcing other problematic tendencies that are already under way: the invasion of privacy and the increasingly detailed surveillance (and self-surveillance) of students and teachers; the reduction of teaching and learning to mechanical measurements; and the growing marketisation and commercialisation of public education. In the process, invisible but far-reaching changes are taking place in what counts as learning and as knowledge, and in how the fundamental purposes of education are conceived.

However, the key question is: what might any of us – educators in particular – be able to do about this? And what might the teaching of ‘data literacy’ look like in practice? My thinking here has been prompted by reading a new book, edited by Luci Pangrazio and Julian Sefton-Green, entitled Learning to Live with Datafication: Educational Case Studies and Initiatives from Across the World. The book certainly lives up to the global claims of its title; yet in terms of identifying educational ‘initiatives’ – that is, concrete things that teachers might actually do – it is frankly rather disappointing.

Like most edited books, this one’s a mixed bag. We have a very useful overview of debates about datafication and education (by Rebecca Eynon); some interesting case studies of the relation between educational policy and technology use (in Latin America, by Cristobal Cobo and Pablo Vargas, and in the Netherlands by Niels Kerssens and Mariette de Haan); and a detailed account of what datafication means in relation to the ‘micro-politics’ of Australian schools (Neil Selwyn et al). Yet in other chapters we hear the dull thudding of theoretical sledgehammers being used to crush tiny nuts; and there is a fair amount of stodgy academic prose. (If you don’t have time to read a whole book, you could do worse than check out some shorter articles by the editors such as this one or this one.)

For the most part, the book takes a familiar academic sociological position: it floats above the world, describing, analysing and theorising the status quo, in a way that sometimes appears almost agnostic. Many of the contributions conclude with a call for teaching some form of ‘data literacy’, often with equally vague gestures towards ‘critical pedagogy’; but with two exceptions, there is barely any indication of what this might actually involve, and indeed what problems and difficulties it might entail. It may be that I’m looking in the wrong place; but the editors (in their introduction and conclusion), and I suspect some of the contributors, also seem to recognise the limitations of this.

The two exceptions, however, are very interesting. Jeremy Grosman and his colleagues in Belgium describe an approach to teaching about the ‘recommender systems’ of YouTube videos, and the social values that are embedded within them; while in the following chapter, Hyeon-Seon Jeong and her colleagues discuss what happened when they adapted the Belgian materials for use with South Korean fifth and sixth grade students. These are essentially simulations, where students are put in the position of designing algorithms for YouTube, and then applying and evaluating them (via talk and pen-and-paper). While the Korean materials are much simpler (given the age of the children), these studies show that fairly opaque aspects of the technical infrastructure can be eminently ‘teachable’, without requiring access to high technology, and in a very active, experiential way. The classroom evidence shows that students were engaged in some very in-depth debates about the nature and value of different types of data.

While some other contributors seem to imply that a wholly new approach is required for teaching about these new media, the strategies and concepts here make perfect sense in terms of good old media education. I have no doubt that ‘data literacy’ should entail new technical understandings of data infrastructures, and of the wider operations of ‘data capitalism’. But as I have argued elsewhere, we don’t need to reinvent the wheel.

Thus, although the primary conceptual focus in these two examples is on media institutions – the political economy of these platforms – students are also required to reflect on their own personal experience as audiences or users. As in the best media education, they are encouraged to understand how their own everyday media practices are shaped by the wider social, economic and technological context. Pedagogically, the activities themselves remind me of nothing so much as the simulations that we used to use in the 1980s to teach about aspects of ‘old’ media such as news, popular music and the film industry. These examples begin to show what might be possible, although clearly we need many more.

Of course, education of this kind is only ever going to be a partial solution to the issues raised by datafication. We need both education and regulation: as I’ve argued before, it isn’t an either/or choice. In most instances, however, proposals for regulation tend to focus on curbing some of the most obvious ‘harms’ associated with digital media – even though, in many instances, evidence of actual harm has proven very difficult to establish. Campaigners in this area also tend to use the idea of childhood as a kind of proxy. Talking about harm to children is a (melo)dramatic way of mobilising concerns that are actually much broader in scope; and people are less likely to oppose restrictions on children when they would be much more wary of restrictions on adults.

There’s certainly a debate to be had here, but the key question is whether anything is likely to happen as a result. Most direct forms of government intervention are highly problematic. The Online Safety Bill soon to be debated in the UK parliament focuses only on the most blatant forms of misbehaviour and criminality committed by larger companies; and many of its key terms (such as the category of ‘legal but harmful’) are quite inadequately defined. Yet relying on companies to regulate themselves is equally problematic. The big media/technology companies are all too ready to make the right noises in public; although when you look at what they say in less guarded situations, they are much more flagrant in opposing any constraints on their behaviour. If governments can’t even get these companies to pay their taxes, it seems very unlikely that they will get them to submit to any meaningful form of regulation of content.

Of course, policy-makers are bound to be slow in responding to technological change. It’s partly for this reason that some accept the need for making consumers or users more competent, aware and critical in their engagement with technology. However, as in so many other areas of social policy, there is always the danger of passing the buck to education in order to forestall the need to do anything more difficult.

In the case of media literacy, in the UK and I suspect elsewhere too, the fundamental hypocrisy here is on the part of government itself. We have had report after report arguing for digital/internet/media literacy, while the government has been steadily removing media education from the curriculum, and strangling the subject of Media Studies, where specialist teaching of these very issues has been based. In the absence of more systematic and comprehensive forms of media education, the main providers of data literacy – as has been the case with ‘news literacy’ – are likely to be the philanthropic educational arms of the very same technology and marketing corporations whose activities make data literacy so necessary in the first place. The limitations of this are obvious.

There’s a broader discussion to be had about the use of the term ‘literacy’ here, which I’ll take up in a future post. But as with previous ‘literacy solutions’, there is an evident danger that data literacy will become reductive – merely a matter of functional skills, or generalised warnings about safety. There are good reasons to doubt the effectiveness of such an approach. At least in principle, most people know that data is being gathered about them, and most of them don’t really care. They see it as a necessary trade-off for ‘free’ services, and there are very few accessible alternatives available anyway. For example, GDPR has resulted in a plethora of messages asking if we accept cookies: but how many of us take the time to investigate exactly what they say they are collecting and what they’re going to do with it? Most of us just want the information, and we want it now.

I would accept that there’s a certain irony in looking to schools to teach data literacy at a time when schools are increasingly opening their doors to the data-harvesting practices of commercial corporations. It would be an interesting data literacy project for students to analyse how their own schools are gathering and using data about them – although somehow I think that’s unlikely to happen…

Nevertheless, if we really intend to address the broader datafication of society and of education, then data literacy should be about much more than teaching kids how cookies and algorithms work, or indeed teaching coding. When we think about influencers on YouTube, or public debate on Twitter, or self-display on Facebook or TikTok… these are the things we need young people to be analysing and reflecting upon, not just how to adjust their privacy settings. Ultimately, the issue of data is not only about mathematics: it’s about culture.