As a prelude to WAN-IFRA’s Digital Media Latinoamérica conference, the Editor’s Weblog spoke with Gastón Roitberg, multimedia newsroom managing editor of La Nación, about the newspaper’s award-winning “open data ninjas” and what role data will play in shaping journalism’s future. An edited transcript of the email exchange is below.
WAN-IFRA: La Nación has brought data journalism to the forefront of its operations, even creating a data section in its website’s main navigation bar. What was the rationale behind this decision?
Gastón Roitberg: At La Nación, we consider data journalism a priority project, not only because it’s an innovation trend in our profession, but also because we believe that citizens should be able to exercise their right of access to public information.
How many people are on the data team today? What is the breakdown of reporters, developers, analysts, designers, etc.?
Currently, LN Data is a multidisciplinary team consisting of a project manager, a specialist in research and trends, a data miner, a systems engineer who scrapes files and transforms them into readable formats, a team of three interactive designers and various journalists. We also have the support of a Knight-Mozilla Fellow until the end of 2013.
How closely linked are the reporting, multimedia and technical processes?
They are closely linked through a workflow that runs from scraping the data, through refining data sets and structuring Excel spreadsheets, to developing applications and interactive visualisations. Each speciality brings its knowledge, and projects are organised in stages involving everyone who can contribute to making a better product.
In massive projects like LN’s exposé on Senate spending, how are responsibilities divided?
The responsibilities are well-divided. A team of three to four people is responsible for searching, scraping and structuring the data, looking for the first patterns that can guide the journalistic analysis. That structured base then passes into the hands of the journalist, who works to use the data to find stories of interest to the public. Finally, the team of interactive designers comes into play, developing interfaces so that users can easily browse the data.
How did LN comb through the massive amount of data in the Senate spending project? Which parts of these processes are computer-automated, and which are manual?
The most effective way to analyse the data is to put to work the knowledge of a data miner, who builds columns, combines variables and develops an easy way for the journalist to navigate the data.
Scraping work is done by robots that automate the task of combing through enormous volumes of data. Programs are also used to transform .pdf files, the format most hated by programmers, into .xls or .csv tables, allowing them to be structured. Tableau Software is used to develop interactive graphics. And we have experience with Google Fusion Tables for developing data-based maps in different layers.
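Editor’s note: the interview does not name the programs used for this conversion. As a minimal sketch of the structuring step only, the Python snippet below assumes the text has already been extracted from a PDF and that columns are separated by two or more spaces. The function name `text_table_to_csv` and the sample rows are hypothetical and purely illustrative; they are not LN Data’s code or data.

```python
import csv
import io
import re


def text_table_to_csv(raw_text: str) -> str:
    """Convert whitespace-aligned table text into CSV text.

    Assumption: columns are separated by two or more spaces,
    so single spaces inside a cell are preserved.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from extraction
        writer.writerow(re.split(r"\s{2,}", line))
    return out.getvalue()


# Hypothetical sample, standing in for text extracted from a PDF.
SAMPLE = """
Name          Category    Amount
Example A     Travel      1200
Example B     Office      350
"""

if __name__ == "__main__":
    print(text_table_to_csv(SAMPLE))
```

Real documents rarely align this cleanly, so output of this kind would still need the manual checks described later in the interview.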
How can data teams avoid errors and verify information when dealing with massive data sets?
There are two ways to verify data efficiently in the huge databases we handle for big issues like the national budget: 1) Consult experts in each of those subjects to detect irregularities and tidy up the data sets, 2) Hold a special session called a “chequeaton” [which translates to “check data manually”] with volunteers from NGOs.
LN’s data section publishes new articles daily, though data journalism is typically thought to be a longer-term endeavor. How does LN keep up output? How does LN balance data projects with the daily demands of news?
Data journalism applies perfectly well to breaking news. In fact, we call it breaking data, which is when we get information, structure it and develop maps that tell the same story. Adding interactive graphics to newspaper articles is an effort that is rewarded by the visits of users.
As Argentina doesn’t have a freedom of information law, how does your team confront challenges in obtaining data? How long do you generally have to wait to receive data once requested?
Argentina does not have a law on transparency and access to public information, but it does have a presidential decree under which officials of the executive branch and other branches of government are obliged to provide information about their assets. Recently we appointed a data producer who goes through different ministries and secretariats to make requests for information that we consider relevant. We often have to wait several months for the information to be delivered, and most of the time it’s delivered in an unsuitable format or directly on paper, which means it must be digitised by hand.
It’s interesting that LN shares its primary resources with readers on DocumentCloud. Why did LN decide to do this?
In addition to data journalism, our other big goal is to promote open data. So we have a repository of data sets in a catalog that features Junar technology. We also use other tools to store and share documents, such as DocumentCloud or the set of Google products. In this way other journalists and users can download the original files to start their own investigations.
What tips would you offer to news organisations considering creating data teams of their own?
The best advice we can give is to walk the path of innovation, invest in training for staff and promote teamwork with universities and non-profit organisations. In addition, hire programmers and technology specialists to join the newsroom.
Why is data journalism important? What role do you think data will play in shaping journalism’s future in Latin America?
Data journalism is a phase in the evolution of investigative journalism. Every piece of information can be analysed, structured and looked at with a magnifying glass to find a potential story of journalistic interest. The Pulitzer Prizes awarded to The New York Times for interactive work and to ProPublica for data-based research demonstrate that journalistic excellence is possible at this stage of evolution.
With newsrooms trimming budgets, what is the case for investing resources in data journalism as LN has?
Innovation is the best antidote to face the crisis of the industry. Investing in creativity, research and development is what will allow the media business to endure and retain its relevant place today in democratic societies. Data journalism rescues the values of traditional investigative journalism and adds the remarkable contribution of technology to intelligently process the available information.
Roitberg will be speaking more about LN Data during a session on innovative storytelling in WAN-IFRA’s Digital Media Latinoamérica conference, 30-31 October in Bogotá, Colombia.