Public data for all
In progress from 01-01-2010 to 01-06-2010
Project manager: Christine Hafskjold
Expert-based, Governance, Innovation, IT & communication
All public data should be made available to the public for free, as long as doing so does not threaten national security or citizens' privacy. Making public data available will stimulate innovation and strengthen democracy. This is one of the recommendations in a recent report from the Norwegian Board of Technology.
Developments in information technology over the last ten years have led to an explosion in the amount of data available, both on the internet and locked away in government databases. Most Norwegians use services such as online banking and file their tax returns on the internet, and they are used to being able to do almost everything online. This opens up a market for new and innovative services based on public data, but only if the data is made available.
There is a widely held view that data collected and structured with public funding belongs to the public and should be made available for free. That way, the data can benefit the public through services such as weather forecasts and real-time information on public transport schedules.
The EU's PSI Directive (on the re-use of public sector information), which states that public data is a resource that should be available to all, has also been an important factor. The directive has been one of the drivers of a development that has led to portals such as data.gov.uk in the UK and digitaliser.dk in Denmark.
The Norwegian Board of Technology (NBT) has therefore carried out a project examining which public data should be made available in Norway, and how the government could go about doing this in practice.
The main recommendations are:
- Public data should as a rule be free. Even if there is a price on the data, the price should not depend on how the data is used – whether it is for commercial or non-commercial purposes.
- The public sector should focus on making raw data available, that is, data in a machine-readable format. The development of applications that make use of the data should be left to the market. The public sector should not compete with businesses. For some data, where making the full dataset available would threaten privacy, it may be appropriate to allow searches that return single "hits" through a public interface.
- The government should have a portal that points to the different datasets, based on the example of the US data.gov or the British data.gov.uk. As a minimum, the portal should publish datasets with raw (machine-readable) data and APIs (Application Programming Interfaces) for real-time data (such as traffic data). The portal should be run by an agency that can both handle contact with the public and guide the government agencies and offices that own the data. A handbook for data owners on how to make their data available should be developed as soon as possible.
- The quality of the data should not be used as an excuse not to publish the datasets. Instead, there should be a classification system for data quality, so that users do not rely on data for purposes for which its quality is insufficient.
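As a rough illustration of the portal and quality classification recommendations, a catalogue entry could carry a quality grade that users check against their own needs. This is only a sketch: the report does not define a quality scale, and the grade names, the `CatalogEntry` fields and the `fit_for_purpose` helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Quality(IntEnum):
    """Hypothetical quality grades; the report does not define a scale."""
    UNVERIFIED = 1
    CHECKED = 2
    AUTHORITATIVE = 3


@dataclass(frozen=True)
class CatalogEntry:
    """One dataset as a portal might list it (all fields are illustrative)."""
    title: str
    owner: str
    data_format: str
    quality: Quality


def fit_for_purpose(entry: CatalogEntry, required: Quality) -> bool:
    """Return True if the dataset's grade meets what the use case needs."""
    return entry.quality >= required


# Made-up entry, used only to show how the check would be applied.
entry = CatalogEntry(
    title="Example traffic counts",
    owner="Example agency",
    data_format="csv",
    quality=Quality.CHECKED,
)
```

With a grade attached to each entry, a dataset of modest quality can still be published, and the decision about whether it is good enough for a given purpose is left to the user.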
It is important to respect citizens' privacy in this process. It can be difficult to preserve anonymity when data is published in datasets, because the datasets can be used in new channels and harvested with automated tools. If these issues are not addressed properly, one risks a setback for public data as a whole. The solution is not to withhold all datasets containing personal data, but, for instance, to anonymise them. For other datasets, an option is to make the data available for single lookups instead of releasing the entire dataset.