Data journalism is now a fashionable expression in the media, and calling yourself a data journalist, or a believer in data journalism, is something like buying a Louis Vuitton bag or a BMW: a status symbol to be flaunted before humbler reporters armed with a pen and a notebook.
We grew up thinking facts are important in the business of news, and data are only facts (just like a handbag is only a handbag even if it is a Birkin, and a car is a car even if it is a Maserati). I personally think “data journalism” as an expression makes sense only if a careful examination or crunching of data reveals previously unknown patterns, and those patterns yield a headline-worthy narrative in a credible context.
But what I have seen in the past few years is the emergence of what I call “non-sequitur” data journalism, a kind that can lack basic journalistic logic. The data does not necessarily result in a narrative, and if it does, the narrative is a weak one. What matters is whether the data creates a meaningful pattern.
Sometimes, data can mislead more than lead, as I discovered over the past week in three separate instances.
First was the venerable daily, The Hindu, presenting to us data on “crimes against the state” and “anti-national” cases. On the basis of National Crime Records Bureau data, Uttar Pradesh seemed to lead in such crimes even as they seem to have decreased across India in 2018 compared with 2017. The print edition of the newspaper carried a nice table, but it made no mention of two vital facts. First, Chief Minister Yogi Adityanath’s hardline right-wing state government, in power since 2017, presides over a controversial police force in a charged political atmosphere, which suggests that police may have slapped cases on political opponents or dissidents. Second, registered criminal cases do not necessarily mean the crimes in question have been committed; convictions need judicial endorsement.
The newspaper itself weighed in a day later with an opinion piece that mentioned that tracking of cases can vary from state to state. It also noted that “sedition” cases had increased. However, it made no mention of the UP government under Adityanath.
Now, Techopedia.com offers a sound definition of “data journalism” as “the use of data and number crunching in journalism to uncover, better explain and/or provide context to a news story. According to the Data Journalism Handbook, data can be either the tool used to tell a story, the source upon which a story is based, or both. It often involves the use of statistics, charts, graphs or infographics.”
I agree with that definition, which suggests that data ought to help narrative journalism and is not in itself necessarily journalism. Charts, graphs and infographics can be the bells and whistles but not the real deal. My basic rule: if the takeaway from these charts cannot be easily explained in a lucid paragraph in English or some other human language, it is more data than journalism.
In the second instance of data abuse, I saw a radical feminist and a men’s rights activist (also a woman) calling each other names on Twitter. The former called the latter, who is known for her dislike of false cases of dowry harassment and domestic violence, a “rape apologist” while adding that rape cases were being under-reported. In a sense, both these women have a point. But a journalist studying rival activists must remember that the under-reporting of rape cases does not mean there are no false cases or complaints of dowry harassment, rape or domestic violence filed for the sake of revenge or money.
In the third instance, I saw BJP supporters hitting out at actor Deepika Padukone for standing with anti-government protesters at the Jawaharlal Nehru University (JNU). They proffered box-office data from the first weekend after the release of her “Chhapaak” (based on the story of an acid attack survivor) and showed it to be a fraction of the collections from the Ajay Devgn-starrer “Tanhaji” (a historical epic based on a Maratha warrior). The data I read pitted Chhapaak’s Rs 7 crore against Tanhaji’s Rs 26 crore in the domestic market.
Here’s the problem. You cannot compare a male-centric war epic made on a high budget with a socially sensitive female-centric arthouse movie made at lower costs. Tanhaji’s estimated budget is Rs 150 crore and Chhapaak’s is about Rs 40 crore.
Also, it turned out that overseas revenues from Chhapaak totalled Rs 7 crore as well — indicating the domestic market is not everything.
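For readers who like to check the arithmetic, here is a rough sketch in Python that sets each film's collections against its estimated budget, using only the figures quoted above. The variable and function names are my own, and I am assuming that collections as a share of budget is a fair first yardstick (it is not a measure of profit). The period covered by the overseas figure is not stated, and no overseas figure for Tanhaji is given here, so the last number is indicative at best.

```python
# Figures in Rs crore, as quoted in this column; budgets are estimates.
films = {
    "Chhapaak": {"budget": 40, "domestic_weekend": 7, "overseas": 7},
    "Tanhaji": {"budget": 150, "domestic_weekend": 26},
}


def share_of_budget(collection, budget):
    """Return collections as a percentage of budget, rounded to one decimal."""
    return round(100 * collection / budget, 1)


# Opening-weekend domestic collections as a share of each film's budget.
domestic_share = {
    name: share_of_budget(f["domestic_weekend"], f["budget"])
    for name, f in films.items()
}

# Chhapaak again, with the quoted overseas revenue added (assumption: the
# two figures can be summed; the column does not say what period the
# overseas number covers).
chhapaak_with_overseas = share_of_budget(
    films["Chhapaak"]["domestic_weekend"] + films["Chhapaak"]["overseas"],
    films["Chhapaak"]["budget"],
)

print(domestic_share)          # {'Chhapaak': 17.5, 'Tanhaji': 17.3}
print(chhapaak_with_overseas)  # 35.0
```

Measured against budget, the two opening weekends look almost identical, which is a very different story from the raw 7-versus-26 comparison.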
Motivated or shoddy use of data or a poor sense of context or purpose can make data journalism less glorious than it should be. There are occasions when I think that old-fashioned pen-and-paper reportage with plenty of context and common sense can get the better of nerdy number-crunching.