A short list of data journalism sources and articles. This includes items most relevant to students in Stanford COMM 273 and is by no means complete. A longer list can be found at: smalldatajournalism.com/readings/.
These sites are frequently updated and are a consistent source of interesting data journalism stories or discussions.
After Nate Silver took his statistical brand to ESPN, the Upshot was created partly as a place to continue Silver’s quantitative political analysis, but also to “bring data into the daily news reporting”. The Upshot publishes interesting and readable slow-down-and-look-at-the-data articles on a daily basis, as well as some of the best cutting-edge visualizations.
The Nate Silver-led venture expands his quantitative approach to reach a general audience in sports, economics, health, science, and societal trends, while continuing to be a clearinghouse of analysis and data on the U.S. political race.
As you might expect, the blog of the leading investigative news organization is filled with inspiring and thought-provoking essays on the craft of investigations. Two of my favorite categories of IRE posts are Behind the Story, in which reporters provide what amounts to “how-to’s” for their investigations, and Transparency Watch, which keeps tabs on the ebb and flow of our right to public records.
One of the best resources for best practices and new ideas in data journalism and news application development.
Excellent explorations and thoughtful writeups on the art and practice of beautiful visualizations with integrity, by one of the leading teachers of data visualizations.
The father of modern information design moderates discussions on information design (and other esoteric topics).
Few is a prominent author in modern information display, and his blog is filled with useful critiques and practical examples in the field.
A fun blog by the New York Times’s Kevin Quealy, with lots of interesting insights about the drafting process behind some of the NYT’s best visualization work.
Musings on what data science is (and isn’t) by John Foreman, Mailchimp’s Chief Data Scientist.
Sometimes the best way to learn is to see how low you can go.
Books and Papers
These are books (both free and for purchase), reports, and papers (of the academic kind).
Whether you’re reading the 1970, 1991, or 2001 edition, Meyer’s phenomenal work on bringing the scientific method to journalism is timeless. You can be astonished at how much reporters accomplished using dial-up modems, and also be reassured that the core of investigative work remains the same no matter how technology changes.
If you are a journalist or are thinking of becoming one, you may have already noticed this: They are raising the ante on what it takes to be a journalist. There was a time when all you needed was dedication to truth, plenty of energy, and some talent for writing. You still need those things, but they are no longer sufficient. The world has become so complicated, the growth of available information so explosive, that the journalist needs to be a filter, as well as a transmitter; an organizer and interpreter, as well as one who gathers and delivers facts.
Just as relevant today as it was 30 years ago, Tufte’s principles of infographic design are valuable for print and the web. The Boston Globe’s blurb review put it best: “A visual Strunk and White.”
For many people the first word that comes to mind when they think about statistical charts is "lie." No doubt some graphics do distort the underlying data, making it hard for the viewer to learn the truth. But data graphics are no different from words in this regard, for any means of communication can be used to deceive.
A paper on how modern computing can greatly expand the reach and output of public accountability journalism.
Researchers and journalists are exploring new methods, sources, and ways of linking communities to the information they need to govern themselves. A new field is emerging to promote the process: computational journalism. Broadly defined, it can involve changing how stories are discovered, presented, aggregated, monetized, and archived.
A highly readable primer on the confusing nature of statistics.
This book is a sort of primer in ways to use statistics to deceive. It may seem altogether too much like a manual for swindlers. Perhaps I can justify it in the manner of the retired burglar whose published reminiscences amounted to a graduate course in how to pick a lock and muffle a footfall: The crooks already know these tricks; honest men must learn them in self-defense.
A free compilation of wisdom and practical advice by data journalists around the world.
Using data the job of journalists shifts its main focus from being the first ones to report to being the ones telling us what a certain development might actually mean. The range of topics can be far and wide. The next financial crisis that is in the making. The economics behind the products we use. The misuse of funds or political blunders, presented in a compelling data visualization that leaves little room to argue with it.
A report by Tow Center fellow Alex Howard on the wide breadth of data journalism as it is practiced.
If reporting does become more scientific over time, it could benefit readers and society as a whole. A managing editor might float an assertion or hypothesis about what lies behind news, and then assign an investigative journalist to go find out whether it’s true or not. That reporter (or data editor) then must go collect data, evidence, and knowledge about it.
This book purports to be a practical tutorial on how to perform complex data science within your average spreadsheet program. Preposterous, I know, but Foreman is both an excellent teacher of technology and a writer of “why we do things [like data science] in the first place.”
The truth is most people are going about data science all wrong. They're starting with buying the tools and hiring the consultants. They're spending all their money before they even know what they want, because a purchase order seems to pass for actual progress in many companies these days.
This is not a work of data journalism, per se, but David Simon’s year of embedding himself in the Baltimore homicide squad is a revealing look at all the human and political factors that can skew a number as seemingly final and immutable as the count of a city’s homicides.
...Supervisors in the homicide unit regard the white rectangle as an instrument necessary to assure accountability and clerical precision...The board reveals all: Upon its acetate is writ the story of past and present. Who has grown fat on domestic murders witnessed by half a dozen family members; who has starved on a drug assassination in a vacant rowhouse. Who has reaped the bountiful harvest of a murder-suicide complete with a posthumous note of confession; who has tasted the bitter fruit of an unidentified victim, bound and gagged in the trunk of an airport rental car.
A collection of writeups on the work of data journalism and research.
When there is no data, build your own. After discovering that no one was keeping track of Washington D.C.’s numerous homicide victims, Laura and Chris Amico simply started counting. The hand-built data source became a community resource as well as a source for data analysis.
A thorough step-by-step guide on how to cross-check documents with online databases to uncover a San Diego campaign finance scandal.
This winner of the 2013 Pulitzer Prize for Public Service epitomizes the best of watchdog journalism, as its exposé of a deadly public injustice sparked swift reform. But it also deserves an award for its ingenious do-it-yourself way of collecting the kind of data that, in theory, did not exist.
How exactly do you define “worst”? Here’s a high-level description of the factors reporters used to rank the performance of charities, and the amount of data they had to sift and analyze to do so.
Daniel Gilbert had spreadsheets with thousands of rows of data. “There was a story there, he was certain. But control-f would not find it.” After he took a class on how to use Microsoft Access, he could finally ask and answer the questions he knew were in the data. Gilbert’s meticulously detailed series on landowners being cheated by oil and gas companies ended up winning a Pulitzer.
This doesn’t really belong here but it’s a fun use of statistics.