Thursday, August 30, 2012

Australia's first 3rd Generation open data site - from the ACT

The ACT government today announced the soft-launch of their new open data site, dataACT, through their equally new Government Information Office blog.

In my view this is now the best government open data site in Australia.

What makes it the best?
  • Data is available in a range of common reusable formats - from JSON and RDF through RSS and XML - as well as CSV and XLS for spreadsheet users.
  • Visualisation tools are built into the site, so data is not only useful to data scientists and programmers, but to the broader public who can chart and map it without having to leave the site.
  • The built-in embed tool allows people to take the data and rapidly include it in their own site without any programming knowledge.
  • Users can reorder the columns and filter the information in the site - again without having to export it first, and
  • discussions are built into every dataset by default.
It follows a 'generational' path for open data I've been talking about for a while.

Most open data sites start as random collections of whatever data agencies feel they can release as a 'quick win' to meet a government openness directive. They then progress to more structured sites with rigour and organisation, but still containing only data; then to data and visualisation sites, which support broader usage by the general community; and finally to what I term 'data community sites', which become collaborative efforts with citizens.

In my view dataACT has skipped straight to a 3rd Generation data site at a time when other governments across Australia are struggling with 1st or 2nd Generation sites.

Well done ACT!

Now who will be the first government in Australia to get to a 4th Generation site?

Read on for my view of the generations of open data sites:

1st Generation: Data index

  • Contains or links to 'random' datasets, being those that agencies can release publicly quickly. 
  • Data is released in whatever format the data was held in (PDF, CSV, etc) and is not reformatted to web standards (JSON, RDF, etc).
  • Some datasets are released under custom or restrictive licenses.
  • Limited or no ability to discuss or rate datasets
  • Ability to 'request datasets', but with no response process or common workflow

2nd Generation: Structured data index

  • Some thought given to which datasets are selected, but still largely 'random'
  • More standardisation of data formats to be reusable online
  • More standardisation of data licenses to permit consistent reuse
  • Tagging and commenting supported (as in a blog for the site), with limited interaction by site management
  • Workflows introduced for dataset requests, with agencies required to respond as to when they will release, or why they will not release, data
  • Ability to list websites, services and mobile apps created using data

3rd Generation: Standardised data index

  • Standardisation of data formats with at least manual conversion of data between common standard formats 
  • Standardisation of data licenses to permit consistent reuse
  • Tagging and commenting supported, with active interaction by site management
  • Data request workflows largely automated and integrated with FOI processes
  • Ability to filter, sort and visualise data within the site to broaden usage to non-technical citizens
  • Ability to embed data and visualisations from site in other sites
  • Ability to list, rate and comment on websites, services and mobile apps created using data
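
As a rough illustration of the format conversion in the first point above, here is a minimal sketch that turns a CSV dataset into JSON using only Python's standard library. The function name `csv_to_json` and the sample rows are my own placeholders, not anything taken from dataACT or any other real site.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text (with a header row) into a JSON array of objects.

    Hypothetical helper for illustration only. Every value stays a string,
    because CSV carries no type information.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps([dict(row) for row in reader], indent=2)


if __name__ == "__main__":
    # Made-up sample data, not a real dataset.
    sample = "suburb,playgrounds\nSuburb A,3\nSuburb B,5\n"
    print(csv_to_json(sample))
```

A site doing this by hand would run something like this once per dataset; a 4th Generation site would presumably do the equivalent automatically for every format it offers.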

4th Generation: Data community

  • Strategic co-ordinated release of data by agencies to provide segment-specific data pictures of specific topics or locations
  • Standardisation of data formats with automatic conversion of data between common standard formats
  • Standardised data licenses
  • Tagging, commenting and data rating supported, with active interaction by site management and data holding agencies
  • Data request workflows fully automated and integrated with FOI processes, with transparent workflows in the site showing what stage the data release is up to (data requested, communicated to agency, considered by agency, approved for release, being cleaned/formatted, legal clearances checked, released/refused release)
  • Support for data correction and conversion by the public
  • Support for upload of citizen and private enterprise datasets
  • Ability to filter, sort and visualise data, including mashing up discrete datasets within the site to broaden usage to non-technical citizens
  • Ability to request data visualisations as a data request
  • Supports collaboration between hackers to co-develop websites, services and mobile apps using data
  • Integrates the capability to run hack events - potentially on a more frequent basis (form/enter teams/submit hack proposals/submit hacks/public and internal voting/winner promotion)
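
To make the transparent data request workflow in the list above a little more concrete, here is a minimal sketch of it as a small state machine. The stage names come from the list; the class names, the allowed transitions and the points at which a refusal can happen are my own assumptions, not a description of how any real site works.

```python
from enum import Enum


class Stage(Enum):
    REQUESTED = "data requested"
    COMMUNICATED = "communicated to agency"
    CONSIDERED = "considered by agency"
    APPROVED = "approved for release"
    CLEANING = "being cleaned/formatted"
    LEGAL_CHECK = "legal clearances checked"
    RELEASED = "released"
    REFUSED = "refused release"


# Assumption: stages run in the order listed, and a refusal can happen
# either when the agency considers the request or at the legal check.
TRANSITIONS = {
    Stage.REQUESTED: {Stage.COMMUNICATED},
    Stage.COMMUNICATED: {Stage.CONSIDERED},
    Stage.CONSIDERED: {Stage.APPROVED, Stage.REFUSED},
    Stage.APPROVED: {Stage.CLEANING},
    Stage.CLEANING: {Stage.LEGAL_CHECK},
    Stage.LEGAL_CHECK: {Stage.RELEASED, Stage.REFUSED},
    Stage.RELEASED: set(),
    Stage.REFUSED: set(),
}


class DataRequest:
    """A hypothetical data request whose full history is publicly visible."""

    def __init__(self, title):
        self.title = title
        self.history = [Stage.REQUESTED]

    @property
    def stage(self):
        return self.history[-1]

    def advance(self, new_stage):
        # Reject any move the workflow does not allow from the current stage.
        if new_stage not in TRANSITIONS[self.stage]:
            raise ValueError(
                f"cannot move from '{self.stage.value}' to '{new_stage.value}'"
            )
        self.history.append(new_stage)

    def public_status(self):
        # The trail a citizen would see on the site.
        return " > ".join(stage.value for stage in self.history)


if __name__ == "__main__":
    request = DataRequest("Example dataset request")
    request.advance(Stage.COMMUNICATED)
    request.advance(Stage.CONSIDERED)
    print(request.public_status())
```

Keeping the whole history, rather than only the current stage, is what would let the site show citizens where a request is up to and how it got there.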

5th Generation: Integrated data platform

  • A common platform for all national, state and local data, with the capabilities for each jurisdiction to make use of all Generation 4 features.
  • Integrated mapping environment for all levels of government, enabled with all available open data.
