Over the past few years we have seen a growing understanding of the importance of open data. Governments, non-governmental organizations and even private and public companies have opened up their data repositories. Besides the open data champions inside these entities who pushed the idea past ignorance, bureaucracy and firewalls, a lot of work has also gone into providing guidance, standards and tools. One important contributor is the Open Knowledge Foundation, a globally distributed, community-focused team that nurtures communities and builds tools to promote open knowledge, data and content. I think it’s time to show my appreciation by highlighting some of their initiatives.
CKAN is an open-source data portal application that makes it easy to publish, share and find data. It provides a powerful database for cataloging and storing datasets, with an intuitive web front-end and API. The core functionality can be flexibly extended with the features needed – from social integration and comments, to Google Analytics, to integrated data storage.
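To give a feel for what the API side of CKAN looks like, here is a minimal sketch in Python. It assumes a portal that exposes CKAN's Action API with its `package_search` endpoint under `/api/3/action/`. The portal address `https://ckan.example.org` is a placeholder, and the helper names `build_search_url` and `dataset_titles` are made up for this illustration; they are not part of CKAN itself.

```python
import json
from typing import List
from urllib.parse import urlencode


def build_search_url(portal_url: str, query: str, rows: int = 5) -> str:
    """Build a dataset search URL for a CKAN portal (hypothetical helper)."""
    base = portal_url.rstrip("/") + "/api/3/action/package_search"
    return base + "?" + urlencode({"q": query, "rows": rows})


def dataset_titles(response_text: str) -> List[str]:
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    # Fall back to the dataset's URL-safe name when no title is set.
    return [d.get("title") or d["name"] for d in payload["result"]["results"]]


# Assumption: ckan.example.org stands in for a real portal address.
print(build_search_url("https://ckan.example.org", "crime", rows=3))
```

Fetching the printed URL (for example with `urllib.request.urlopen`) and passing the response body to `dataset_titles` would list the matching datasets, provided the portal you point it at actually runs this version of the API.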
TheDataHub.org is a community-run catalog of useful sets of data on the Internet. Users can collect links here to data from around the web for themselves and others to use, or search for data that others have collected. Depending on the type of data and its conditions of use, the Data Hub may also be able to store a copy of the data or host it in a database, and provide some basic visualization tools.
GetTheData.org is a Q&A site where users can ask data-related questions such as “where to find data relating to a particular issue”, “what tools to use to explore a data set in a visual way”, or “how to cleanse data or get it into a format you can work with using third party visualization or analysis tools”.
DataPatterns.org is a collection of tips, tricks, opinions and evolving best practices for data work. The purpose is not to present all available options and technologies but to pick one and follow it through. The site is a collaborative community effort: if you have some good hacks and would like to share them, you can contribute a patch to the Data Patterns repository.
OpenDataManual.org discusses legal, social and technical aspects of open data. The manual can be used by anyone but is especially designed for those seeking to open up data. It discusses the why, what and how of open data – why to go open, what open is, and how to open data.
OpenDefinition.org sets out principles to define ‘openness’ in relation to content, data and services, which covers any kind of material or data from sonnets to statistics, genes to geodata. In addition, this site hosts the Open Software Service Definition, which defines openness in relation to online (software) services.
OpenDataCommons.org provides a set of legal tools for open data. The site hosts three licenses for open data: the Public Domain Dedication and License (PDDL), the Attribution License (ODC-By), and the Open Database License (ODC-ODbL). Additionally, they provide the ODC Attribution-Sharealike Community Norm.
As data visualization enthusiasts, the topic of open data is dear to our hearts. Its relevance to data visualization has been demonstrated by many successful projects in the past, like CrimeSpotting.org or WhereDoesMyMoneyGo.org. While I am far from being an expert in this area, I believe the following three principles, summarized from the full Open Definition, are important objectives for visualization practitioners like you and me.
First, having honest, consistent and accurate data available ensures that our work has the solid foundation it takes to tell compelling stories and answer relevant questions. Second, having this type of data in a convenient and modifiable form that is easy to maintain and extend makes the visualizations and applications driven by it much more sustainable. Third, the data should be provided under terms that permit use, reuse and redistribution by everyone, without discrimination against fields of endeavor, persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use in client projects are not allowed.
With technical, social, and legal tools at hand, we can push open data visualization forward. That’s why I have so much respect for the work of the Open Knowledge Foundation and for everyone who contributes to its efforts. OKF, I salute you!