https://insidegovuk.blog.gov.uk/2016/04/14/building-a-new-tagging-infrastructure-for-gov-uk/

Building a new tagging infrastructure for GOV.UK

We're changing how tagging works behind the scenes to allow us to create a single way of categorising content ('taxonomy') for all content on GOV.UK.

In October we laid out our plan to improve navigation on GOV.UK. It consists of 3 steps:

  1. Build a hierarchy of subjects on GOV.UK.
  2. Tag all content to the hierarchy.
  3. Use the hierarchy to support navigation, orientation and search.

Since then we've started on the first step of the taxonomy work by doing content audits and creating prototypes for navigation (which we’ll write more about soon).

In this blog post we’ll explain what we’ve been doing on the technical side to make the second step possible.

Different places for tagging

Content on GOV.UK is published using a number of different applications.

This means that in some publishing applications (eg Mainstream Publisher) you can tag to mainstream browse pages and in others only to topics. Applications often save the tags in their own database, which means that bulk-tagging pages from different applications is impossible. Merging and removing tags is also hard because for each change we want to make we have to go into the application itself to change the information.

A single place to tag all the things

Because we are working on a single taxonomy, we need to make sure that all publishing apps are talking to the same thing and that there’s a central place to store tags. To support this we’ve been working on implementing a new tagging architecture for GOV.UK. The guiding principles for this architecture have been:

  • all content on GOV.UK should be taggable
  • there should be one database for all tagging information
  • content can be tagged in more than one publishing app
  • tag names and URLs should be easy to change

Migration

There are over 30 applications that interact with the content of GOV.UK. To properly lay the foundations for the new taxonomy, we needed to update them all. Here's what we did:

  • added new functionality for tagging to the publishing API
  • made sure all pages on GOV.UK are in a central place
  • built Content Tagger, a replacement app for Panopticon (the application we used at first to keep track of all the content items on GOV.UK - the first commit was in July 2011)
  • made sure we’re referring to tags and content using a unique ID (and not a URL) so that we can change the URLs easily

Because this work overlaps with the project to rebuild the GOV.UK publishing tools, we’ve worked together with the Publishing Platform team on a lot of this work.

What’s next?

When we've finished this work we can start building on this foundation. Upcoming work on the publishing tools includes improving tagging interfaces, allowing bulk tagging and supporting an alternative workflow for tagging.
It also provides the technical infrastructure to allow us to make changes to how content is organised and displayed to users.

6 comments

  1. Steve Hoy

    Seems like there will be a considerable overhead on the page and a heavy management requirement, even with a tag manager. Ever considered using a single tag solution that automatically collects everything and then allows you to put/merge that contextualised behavioural data (with a pre-built data model) in real-time in to an appropriate place (Cloud, in-house, Hadoop, MySQL, ...) for analysis/action?

    Link to this comment
    • tijmenbrommet

      Hi Steve, thanks for your comment. This work is closely related to the work on the publishing platform that Jennifer Allum has written about here:

      https://insidegovuk.blog.gov.uk/2016/04/21/rebuilding-gov-uks-publishing-platform-an-update/

      Part of that work is the applications that actually serve content on GOV.UK have very fast access to all the data they need. Having the data in a central place also helps us simplifying the overall architecture of GOV.UK, since applications are not required to store their own tagging-information anymore.

      Link to this comment
  2. Adrian Cooke

    Hi Tijmen, can you elaborate on “made sure all pages on GOV.UK are in a central place”? Does central mean a common publishing system, all on same subdomain, something else?

    Link to this comment
  3. Simon Smith

    Pleased to see this work in action. How can other Gov Depts get involved? Search results and so on is always a common issue with our users.

    Link to this comment