
Our approach for the new taxonomy

Categories: Content, Finding things, User insights

Earlier this year we launched the beta version of the new taxonomy, which focuses on education content.

In this post we’ll go into more detail about why this new taxonomy is required, and the choices we made while developing it.

What the taxonomy’s for

We’re creating a taxonomy to make it easier for all the different types of users of GOV.UK to find the information they need.

Our approach focuses on the needs of the people who use GOV.UK and the people who publish content to it. As with any taxonomy development project, we also have to consider the surrounding context: processes, people, tools, governance, resources, design, ownership, management, communications and culture change.

Where search falls short

But why not just improve search - won’t that fix the problem?

Search is very important, and it works well for specific queries when the user knows what they’re looking for (‘known-item’ searching) like ‘how do I book my driving theory test?’ or ‘how do I register a birth?’

But sometimes people’s information needs are more exploratory in nature. And often users don’t know exactly what they want or what language to use. For example, we know from user research that when it comes to questions like ‘what government financial support is available to me as a parent?’, many people don’t know the names of the schemes available or even that they exist.

Plus, our research has shown that certain professional users may need to see everything about a topic to allow them to do their job.

Back in October 2014 our research into how users navigate on GOV.UK found that users were significantly more likely to go to a browse page from the homepage than to use site search. This reflects previous research across the web which found that most users start their journey browsing. We’ve also observed in user research that less confident digital users are more likely to browse than search. GOV.UK has to cater for all members of the population, including those with low digital skills and experience who may not be familiar with search.

Of course, many users of GOV.UK arrive on a content page deep in the site having followed a link from a search results page. But, as Jakob Nielsen says in this article on search, “Users who get to a page through search still need structure to understand the nature of the page relative to the rest of the site. They also need navigation to move around the site in the neighborhood of the page they found by searching.”

Starting from scratch

It’s not an easy task to create a taxonomy for GOV.UK. The content covers overlapping, semi-connected and completely discrete subjects. The amount of content across these subjects is unevenly distributed, and the rate of growth in any area is unpredictable. Also, much of the content is about loosely defined concepts, and the words used can be ambiguous out of context.

Another factor is that the content on GOV.UK doesn’t cover entire subject areas, and never will. For example, it doesn’t include everything there is to know about childcare and parenting. It only includes the aspects of the domain that involve interaction with government (for example, registering a birth).

Early on in the project we did some initial investigation within the team to find out if there were any suitable openly available taxonomies out there that we could re-use. Whilst there are taxonomies that are good for known things like people, places, book titles and so on, it’s harder to find something that can deal with all the abstract concepts our content covers. Also, we couldn’t find anything UK-centric or specialist enough.

Where sensible, we will adopt relevant terms from existing taxonomies if we find they work for GOV.UK users.

A subject-based approach

We know users will often be looking for content relating to specific subjects, since this reflects the way we think about the world. Therefore we’re focusing on a taxonomy which describes what content is ‘about’, rather than who it’s for, or what format it’s in.

We’ve explored audience-based navigation schemes in the past, but as we’ve blogged previously, audience-based navigation is often problematic. Gerry McGovern has often pointed out the pitfalls of using audience-based terms for primary navigation. It may be that some lower-level terms in our taxonomy will allude to specific audience types where appropriate, but at this stage there’s no evidence from user research to suggest that users need to navigate by audience, or other facets like format.

It’s also important for our publisher users that we don’t overcomplicate the taxonomy by introducing unnecessary data structures. For the taxonomy to be of value to end users, we need content to be tagged comprehensively and accurately to it. So we don’t want the tagging process to become unwieldy, since that would lead to inconsistencies and errors.

Automation (alone) is not the answer

We’ve been exploring various techniques for automating some parts of the taxonomy process, including generating topics in a given subject area. There might be ways to use entity extraction and other automated techniques to help with the process of tagging content to terms in the taxonomy. For example, we could automatically suggest relevant taxonomy terms to publishers when they’re creating content, or suggest content items that might be relevant to tag to a particular taxonomy term.
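
As a rough illustration of the term-suggestion idea, here is a minimal sketch in Python. It is not how GOV.UK does this: the taxonomy terms, their keywords and the `suggest_terms` function are all made up for the example, and plain keyword overlap stands in for entity extraction or any more capable technique.

```python
import re
from collections import Counter

# Hypothetical taxonomy terms with illustrative keywords.
# These are assumptions for the sketch, not the real GOV.UK taxonomy.
TAXONOMY = {
    "Education > Funding and finance for students": {
        "student", "loan", "finance", "grant", "funding",
    },
    "Education > School admissions": {
        "school", "admissions", "place", "apply", "appeal",
    },
    "Education > Apprenticeships": {
        "apprenticeship", "apprentice", "employer", "training",
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def suggest_terms(text, taxonomy=TAXONOMY, limit=2):
    """Return up to `limit` taxonomy terms whose keywords appear in the text.

    Each term is scored by how many times its keywords occur.
    Terms with no matching keywords are never suggested.
    """
    counts = Counter(tokenize(text))
    scores = {}
    for term, keywords in taxonomy.items():
        score = sum(counts[keyword] for keyword in keywords)
        if score > 0:
            scores[term] = score
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:limit]]


if __name__ == "__main__":
    draft = "Apply for a student loan or grant to help with finance at university"
    for term in suggest_terms(draft):
        print(term)
```

In a sketch like this the suggestions are only prompts: the publisher still decides which terms to tag, which keeps a person in the loop where the wording is ambiguous.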

However, an entirely technology-led solution is not something we think will work, for many of the reasons outlined above - we’re dealing with unstructured data, ambiguous terms and fuzzy concepts. But above all, we must talk to users to understand how they think about and group content, so that we can create a taxonomy that’s truly user-centred.

To recap...

There is no such thing as a ‘perfect’ taxonomy - language is often ambiguous and people have different ways of looking for the same thing depending on their context, level of expertise, cultural background and perspective. Likewise, classifying content is difficult - there are no neat boundaries between categories, much as we’d like there to be!

Our goal is to match the taxonomy as closely as possible to the needs of GOV.UK users, to help them find the content they need. We’ve made some improvements to the education branch in response to how it’s performed since launch. We’ve also started developing separate branches for environment and transport content, refining our approach as we’ve learnt more about what works (and what doesn’t). We’ll be blogging about our more recent work on these separate themes very soon, so do check back if you’d like to keep up to date with this project.


Sharing and comments


  1. Comment by Paul Smith posted on


    I worked as Delivery Manager on replacing the UK Parliamentary search engine, which covered almost every point raised here, and I wondered if you had any liaison with our friends at the Parliament Digital Service to look at their learnings?

    • Replies to Paul Smith

      Comment by Graeme Claridge posted on

      Hi Paul,

      Thanks for your question.

      Vicky Buser met with the Parliament Digital Service while she was working at GDS (she left earlier this year), and I know that they shared some ideas about taxonomy development. But I’m not sure if they covered the search engine.

      We have a separate team here working on search, and I’m sure they’ll be blogging about their own discoveries and challenges soon. In the meantime, please let me know if you’d like me to put you in touch with anyone in that team - I’m sure they’d be happy to talk to you.