 The essential guide to knowledge and information management in law firms
Jan 8 2009
Taxonomies : Frameworks... (2nd Edition)

Executive Summary
This report provides essential background on the state of the art in developing effective taxonomies, a key element behind successful knowledge management, internet and intranet portal implementation, content management and similar initiatives. After introducing the subject and explaining the reasons for the current high level of interest, it explores the key methods used to develop, implement and maintain taxonomies. The report draws heavily on the knowledge of experts and the experience of leading practitioners, including those who have been featured at Ark Group’s highly popular and successful ‘Real World Taxonomy’ conferences.

What is a taxonomy?
In essence, a taxonomy is a systematic way of classifying knowledge. It provides a hierarchical structure of concepts, using terms that help in the development of a common language to aid knowledge sharing. The Dewey Decimal system used by libraries, for instance, is one of the world’s most widely used taxonomies. Related to a taxonomy, but essentially distinct, is a thesaurus, which provides a detailed vocabulary with information on preferred terms, related terms, synonyms and so on. Encompassing taxonomies are ontologies, which express the different types of relationships between the elements of a taxonomy, for example that Ark Group is a specific instance of a business. Ontologies are a key prerequisite for what is expected to be the next major development of the internet – the semantic web.
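
The distinction between the three structures can be illustrated with a minimal sketch. All terms, field names and relation names below are hypothetical and chosen only for illustration; the one relationship taken from the text is that Ark Group is an instance of a business.

```python
# Hypothetical example data; apart from "Ark Group is a business",
# none of these terms come from the report.

# Taxonomy: each concept points to its single broader (parent) concept.
taxonomy = {
    "Organisation": None,          # root node
    "Business": "Organisation",
    "Law firm": "Business",
    "Publisher": "Business",
}

# Thesaurus: a preferred term with its synonyms and related terms.
thesaurus = {
    "Business": {"synonyms": ["Company", "Enterprise"], "related": ["Commerce"]},
}

# Ontology: typed relationships stored as (subject, relation, object).
ontology = [
    ("Ark Group", "is_instance_of", "Business"),
    ("Law firm", "is_kind_of", "Business"),
]


def path_to_root(term):
    """Return the chain of broader concepts from term up to the root."""
    path = [term]
    while taxonomy[path[-1]] is not None:
        path.append(taxonomy[path[-1]])
    return path


def preferred_term(word):
    """Map a synonym to its preferred term; unknown words pass through."""
    for preferred, entry in thesaurus.items():
        if word == preferred or word in entry["synonyms"]:
            return preferred
    return word
```

Here the taxonomy answers "where does this concept sit?", the thesaurus answers "which word should we use?", and the ontology records what kind of relationship links two items.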

Why are taxonomies important?
A good taxonomy is a crucial element of effective knowledge management. The surge of interest in developing taxonomies over the past few years is a response to several challenges:

  • Coping with the level of information overload and information complexity faced by knowledge workers in carrying out their core tasks;
  • The implementation of enterprise portals using content-management systems that require content tagging in order to customise information;
  • The growth in B2B e-commerce and the emergence of web services, which require agreed information architectures so that relevant information can be shared between systems and people in a timely manner;
  • The need to be more innovative by making connections between related concepts across different disciplines.

A good taxonomy helps to inject order into the anarchy and chaos of a typical intranet or website. It has been shown that organising content in information portals and supplementing search engines with a taxonomy can significantly speed up the retrieval of relevant information, resulting in faster and better solutions to business problems and improved knowledge-worker productivity.

How do organisations justify investment in taxonomies?
As with other areas of knowledge management, it is often difficult to provide an ROI calculation prior to investment. Time savings, better information sharing, improved support for innovation, improved B2B e-commerce and so on are all powerful drivers. Many organisations understand the problems and costs of poor access to information and have seen how taxonomies have helped other enterprises overcome these problems. A common approach is to invest in small pilot projects, the success of which will be judged by all the stakeholders involved. In this way, risks can be minimised at the same time as the potential for quick successes is increased.

Challenges of taxonomy development
Human beings have an innate understanding of how taxonomies work. Developing a taxonomy is therefore not rocket science. However, several crucial judgements have to be made, so the skills of librarians and taxonomy specialists are a key success factor. These judgements include:

  • Determining the overall number of nodes;
  • Balancing breadth versus depth;
  • Whether to start from scratch or use an off-the-shelf taxonomy;
  • Choosing between automation with software and relying on specialist human classifiers.

Many taxonomies begin as ‘back of the envelope’ exercises or use simple mind-mapping tools. Precise methodologies do not exist, so building a taxonomy is still largely an art form.
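
Although the judgements themselves remain a matter of skill, the quantities being traded off (number of nodes, depth and breadth) are easy to measure once a draft exists. The following is a rough sketch only, assuming the draft taxonomy is held as nested dictionaries; the sample categories are hypothetical.

```python
# Hypothetical draft taxonomy: each key is a node, each value its children.
draft = {
    "Legal practice": {
        "Litigation": {"Commercial disputes": {}, "Arbitration": {}},
        "Corporate": {"Mergers": {}, "Governance": {}, "Insolvency": {}},
    }
}


def count_nodes(tree):
    """Total number of nodes in the tree."""
    return sum(1 + count_nodes(children) for children in tree.values())


def max_depth(tree):
    """Number of levels from the top of the tree to its deepest node."""
    if not tree:
        return 0
    return 1 + max(max_depth(children) for children in tree.values())


def max_breadth(tree):
    """Largest number of sibling nodes found under any single parent."""
    if not tree:
        return 0
    widest_below = max(max_breadth(children) for children in tree.values())
    return max(len(tree), widest_below)


print(count_nodes(draft), max_depth(draft), max_breadth(draft))  # prints: 8 3 3
```

Figures like these do not say whether a structure is right, but they give reviewers a shared basis for discussing whether a draft is too deep or too wide for its users.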

Implementing a taxonomy
Our analysis of case studies and interviews with experts highlights the following factors as being important for developing an effective taxonomy:

  • Involve users at all stages of the taxonomy development and usage lifecycle;
  • Focus on fitness for purpose – it is usually better to have several application or user-specific taxonomies than a one-size-fits-all approach;
  • The two most useful knowledge sources to tap for maintaining a taxonomy are user feedback and a ‘don’t knows’ category;
  • The taxonomy must be constantly updated based on usage and changing terminology;
  • Take advantage of automation but ensure that the system allows for a high degree of human interaction and override;
  • Keep it simple – a good taxonomy is intuitive and intelligible.

Technologies
Four approaches are used in the software tools that support taxonomy development and application:

  • Training by example – a category is defined by an editor who then provides a ‘training set’;
  • Rules-based – documents are classified according to specified business rules;
  • Statistical methods – word patterns are identified according to parameters such as frequency of use and proximity to other words and phrases;
  • Natural language processing (computational linguistics) – classification relies on an extensive dictionary and thesaurus for identifying concepts.
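
As an illustration of the rules-based approach, the sketch below assigns categories when a document contains any keyword listed for that category. The categories, keywords and the "Don't know" fallback are hypothetical; commercial tools support far richer rules than simple keyword matching.

```python
import re

# Hypothetical business rules: category -> keywords that trigger it.
RULES = {
    "Employment law": ["dismissal", "redundancy", "tribunal"],
    "Intellectual property": ["patent", "trademark", "copyright"],
}


def classify_by_rules(text, rules=RULES):
    """Return every category whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    matches = [
        category
        for category, keywords in rules.items()
        if words & set(keywords)
    ]
    # Documents that match no rule are routed to a "don't know" bucket
    # so that an editor can review them.
    return matches or ["Don't know"]
```

For example, `classify_by_rules("The patent dispute went to a tribunal")` returns both categories, while a document matching no rule is returned as `["Don't know"]` for human review.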

Many software solutions rely on more than one of the above methods. The most popular tools tend to use one of the last two methods as their core technology. Current trends favour statistical approaches that are language independent. However, they provide no indication of meaning. Each approach has vociferous advocates whose claims need verifying in any specific situation.
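
The combination of training by example with a statistical method can be sketched as follows. This is a simplified illustration, not how any particular product works: it assumes a tiny hypothetical training set, builds a word-frequency profile per category and suggests the category whose profile is most similar (by cosine similarity) to the new document. It works purely on word patterns and, as noted above, has no notion of meaning.

```python
import math
import re
from collections import Counter

# Hypothetical training set supplied by an editor: category -> example texts.
TRAINING_SET = {
    "Employment law": [
        "unfair dismissal claim at the employment tribunal",
        "redundancy pay and notice periods",
    ],
    "Intellectual property": [
        "patent infringement and licensing",
        "trademark registration and copyright licensing",
    ],
}


def word_counts(text):
    """Count how often each word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(training_set):
    """Merge each category's examples into one word-frequency profile."""
    return {
        category: sum((word_counts(doc) for doc in docs), Counter())
        for category, docs in training_set.items()
    }


def suggest(text, profiles):
    """Return the best-scoring category and the score for every category."""
    doc = word_counts(text)
    scores = {cat: cosine(doc, profile) for cat, profile in profiles.items()}
    return max(scores, key=scores.get), scores
```

A call such as `suggest("a claim about dismissal and notice", train(TRAINING_SET))` suggests "Employment law". A realistic system would need far more training data and would normally discount very common words such as "and".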

The solutions currently on offer provide a varying degree of user control and interaction. Typically they suggest a classification that the user can accept or override. Taxonomy software must co-exist with content-management and search-and-retrieval software. As such, most taxonomy tools automatically insert metadata as XML tags into documents. The market is a mix of heavyweight players, such as Autonomy and Verity, whose product suites encompass the whole lifecycle of information, including taxonomy creation, document classification and information retrieval, and specialist taxonomy suppliers using innovative approaches. Already, though, there is consolidation in the marketplace between classification tool vendors and search engine providers.
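
The suggest-then-override workflow and the insertion of metadata as XML tags might look something like the sketch below. The element and attribute names (`metadata`, `category`, `source`) are hypothetical; real tools follow whatever schema the content-management system expects.

```python
import xml.etree.ElementTree as ET


def tag_document(xml_text, suggested, override=None):
    """Insert a category tag, preferring the user's override if given."""
    root = ET.fromstring(xml_text)
    category = override if override is not None else suggested

    # Hypothetical metadata layout: <metadata><category source="...">
    meta = ET.SubElement(root, "metadata")
    term = ET.SubElement(meta, "category")
    term.text = category
    # Record whether the value came from the software or from a person.
    term.set("source", "user" if override is not None else "auto")

    return ET.tostring(root, encoding="unicode")


doc = "<document><body>Notice periods on redundancy</body></document>"
accepted = tag_document(doc, suggested="Employment law")
overridden = tag_document(doc, suggested="Employment law", override="HR policy")
```

Recording the source of each tag is a design choice made here for illustration: it lets search and retrieval software distinguish automatic classifications from human decisions.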


As well as continual improvements in core technologies, areas of recent advance include:

  • Adaptive solutions – taxonomies are adjusted based on usage or personal preferences;
  • Visualisation – the taxonomy is represented visually, for instance as a hierarchical tree or a concept map;
  • Tools to construct and maintain topic maps – topic maps are an ISO standard way of describing ontological relationships, a requirement of the semantic web.
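
The simplest form of visualisation, a hierarchical tree, can be sketched in a few lines. The example assumes the taxonomy is stored as nested dictionaries and uses hypothetical category names; graphical tools and concept maps go well beyond this.

```python
def render_tree(tree, indent=0):
    """Return the taxonomy as indented lines, one node per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * indent + name)
        lines.extend(render_tree(children, indent + 1))
    return lines


# Hypothetical taxonomy fragment.
sample = {
    "Legal practice": {
        "Litigation": {"Arbitration": {}},
        "Corporate": {},
    }
}

print("\n".join(render_tree(sample)))
```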


Standards
As users grapple with the implementation of taxonomies within their organisations, standards that will allow the interchange of information between organisations in a commonly agreed way have not yet been given a high priority. While XML provides a basic level of standardisation for information exchange, sharing data across applications will only become practical once broad industry-wide consortia adopt agreed schemas. Other standards that will require monitoring include:

  • RDF (Resource Description Framework) – this defines metadata for describing web resources;
  • OIL (Ontology Inference Layer) – for specifying relationships between entities;
  • WSDL (Web Services Description Language), UDDI (Universal Description, Discovery and Integration) and other web-services standards.
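
To give a flavour of RDF, its statements take the form of subject, predicate and object triples. The sketch below models a few such triples as plain Python tuples rather than using an RDF library. The `example.org` resources are hypothetical; the predicates use the Dublin Core element namespace as an assumed vocabulary.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements namespace
EX = "http://example.org/"                # hypothetical resources

# Each statement is a (subject, predicate, object) triple.
triples = [
    (EX + "doc/42", DC + "title", "Redundancy procedures briefing"),
    (EX + "doc/42", DC + "subject", EX + "taxonomy/employment-law"),
    (EX + "doc/43", DC + "subject", EX + "taxonomy/intellectual-property"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is given."""
    return [
        (s, p, o)
        for s, p, o in statements
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# All documents classified under the hypothetical employment-law node.
tagged = match(triples, predicate=DC + "subject",
               obj=EX + "taxonomy/employment-law")
```

Because every statement has the same shape, descriptions produced by different organisations can in principle be merged and queried together, provided they agree on the vocabulary used for predicates and taxonomy nodes.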

With vying approaches backed by different vendor groups, which standards will eventually dominate is still unclear.

Future scenarios
From our analysis of current and potential developments we postulate three scenarios for the future of the web:

  • The web without meaning (including corporate portals) – driven by short-term economic imperatives and self interest, this scenario is an extrapolation from the present situation;
  • Improved collaborative frameworks – in this scenario, widely accepted taxonomies of hundreds, if not thousands, of different knowledge domains are the building blocks of the future semantic web;
  • The ‘world brain’ – an updated version of a vision first proposed during the 1930s by H.G. Wells.

A key element of the second and third scenarios is the ‘intelligent’ web, which incorporates topic maps, knowledge maps and ontologies that act on the basis of the precise meaning of specified terms and the relationships between them. An alternative view is that new ‘intelligence’ will not be artificially situated in the network through combinations of algorithms and machine learning; instead, it will come from enhancing the intelligence, disciplines and skills of the users who work with taxonomies.

Conclusions
Our cases illustrate how a taxonomy boosts knowledge-worker effectiveness through the more efficient retrieval of relevant information. Some key success factors that can be derived from the lessons learnt in these cases are:

  • Involve users throughout the process;
  • Use people with science and library skills;
  • Use technology to identify emerging patterns and terminology, but use human intervention to make the final judgement.

Despite the successes to date, certain challenges remain, in particular getting the right balance between depth and breadth, choosing the appropriate hierarchical structures and going beyond taxonomies to add the meaning that is required for the future semantic web. It is clear, though, that over the past few years, taxonomies have become a critical component of the knowledge-management toolkit, and therefore a strategic imperative for any organisation looking to manage and exploit its knowledge more effectively.


Copyright ©1994-2009 Ark Group Ltd All rights reserved. No part of this site or the publications described herein
may be reproduced in any form without the permission of Ark Conferences Ltd, Registered in England, No. 2931372.