Axioms and Constraints: Ensuring Structural Validity

Suzanne Barbalet

Axioms underpin the business logic of the thesaurus management system; that is, they govern how thesaurus concepts can be created, displayed, stored and changed. Constraints, on the other hand, ensure the structural integrity of the thesaurus. They also take account of semantic web conventions, making provision for linked data and SKOS.

Both axioms and constraints put quality control measures into effect. While a large proportion of thesaurus quality control is intellectual work, the new CESSDA ELSST management system will ensure consistency both within and between all the thesauri it holds. We will focus here on those quality control checks that, as Vanda Broughton (2006: 203) says, ensure that the structure is sound. As stage two of our alignment task nears completion, it is a good time to rehearse some of the axioms and constraints of the edit and administration subsystem.

We created a collection of core concepts in stage one of our thesaurus alignment task so that all ELSST concepts initially became core concepts. Core concepts are those shared by both thesauri, and they must carry the same preferred label in each. One axiom is that both thesauri must have the same core structure. So, where a concept is shared, all the Broader Terms (BTs) of that concept must also be core concepts. Conversely, non-core concepts must not have core concepts as Narrower Terms (NTs). We have, however, made provision for some divergence between the thesauri: equivalence between shared concepts can be marked as partial or exact, for example where scope notes differ.
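The core-structure axiom can be sketched as a simple check. This is only an illustration under stated assumptions, not the CESSDA ELSST implementation: the `Concept` class, its field names and the function `core_structure_violations` are all hypothetical, and concepts are identified here by their preferred label.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    # Hypothetical record: preferred label, core flag, and the
    # preferred labels of this concept's Broader Terms (BTs).
    label: str
    core: bool
    broader: list = field(default_factory=list)


def core_structure_violations(concepts):
    """List every core concept whose BT is missing or is not itself core.

    `concepts` maps preferred label -> Concept. The same test also covers
    the converse rule, since a non-core concept with a core NT is exactly
    a core concept with a non-core BT.
    """
    violations = []
    for concept in concepts.values():
        if not concept.core:
            continue
        for bt_label in concept.broader:
            parent = concepts.get(bt_label)
            if parent is None or not parent.core:
                violations.append((concept.label, bt_label))
    return violations
```

An empty result means that every core concept sits under core ancestors only, which is the condition both thesauri need to satisfy to share one core structure.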

When a new concept is created with a wider scope than existing core concepts, and should therefore appear higher up the hierarchy, its Broader Term (BT) will also need to be a core concept so that the core structures remain identical. In keeping with these principles, deletion, promotion (changing from non-core to core) and demotion (changing from core to non-core) of concepts must take account of existing parent and child relationships.
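As a rough sketch of how promotion and demotion could respect parent and child relationships, the two checks below refuse a change that would leave a core concept under a non-core BT. The function names, the set-and-dictionary data layout and the example labels are assumptions made for illustration; they are not taken from the system itself.

```python
def can_promote(label, core, broader):
    """Non-core -> core is safe only if every BT of `label` is already core.

    `core` is a set of core preferred labels; `broader` maps each
    preferred label to the list of its BT labels (hypothetical layout).
    """
    return all(bt in core for bt in broader.get(label, []))


def can_demote(label, core, broader):
    """Core -> non-core is safe only if no other core concept has `label` as a BT."""
    return not any(
        label in bts
        for child, bts in broader.items()
        if child in core and child != label
    )


# Invented example data, not real ELSST content.
core = {"SOCIETY", "FAMILIES"}
broader = {
    "SOCIETY": [],
    "FAMILIES": ["SOCIETY"],
    "STEPFAMILIES": ["FAMILIES"],
    "REGIONAL LIFE": [],
    "LOCAL CUSTOMS": ["REGIONAL LIFE"],
}
promotable = [c for c in broader if c not in core and can_promote(c, core, broader)]
```

A deletion check could follow the same pattern as `can_demote`, though whether narrower terms should be re-parented or block the deletion is a policy decision the sketch does not settle.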

In practical terms, the constraints we propose to implement for quality control are quite straightforward. They prevent duplication of concepts within a hierarchy and across Preferred Terms (PTs) and their Used For terms (UFs). Simple as this may seem, these constraints in fact require an extensive set of validation rules and checks, both to ensure that concepts within hierarchies maintain logical relationships and to exclude manual error. As is the case for all post-coordinated vocabulary development, hierarchies can grow unevenly. Validation checks to ensure that a PT does not exist in another location further up or down the same hierarchy become necessary when hierarchies reach a certain size. Our hierarchies are now large enough to require these checks.
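The two duplication constraints described above might be checked along the following lines. This is a minimal sketch with assumed data structures: `entries` maps each PT to its UFs, `narrower` maps each PT to its NTs, and both function names are hypothetical. Comparing labels case-insensitively is also an assumption.

```python
from collections import Counter


def duplicate_labels(entries):
    """Return labels used more than once across all PTs and UFs.

    `entries` maps each PT to a list of its UFs. Labels are compared
    case-insensitively (an assumption of this sketch).
    """
    counts = Counter()
    for pt, ufs in entries.items():
        counts[pt.casefold()] += 1
        for uf in ufs:
            counts[uf.casefold()] += 1
    return {label for label, n in counts.items() if n > 1}


def repeats_on_path(narrower, root):
    """Return PTs that occur twice on a single path down from `root`.

    `narrower` maps each PT to the list of its NTs. A repeat means the
    same PT sits further up and further down the same hierarchy.
    """
    repeated = set()

    def walk(label, path):
        if label in path:
            repeated.add(label)
            return  # stop here so a loop cannot recurse forever
        for nt in narrower.get(label, []):
            walk(nt, path | {label})

    walk(root, frozenset())
    return repeated
```

Both functions only report problems; how the editing interface should respond (reject the change, warn the editor) is left open here.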

Our axioms and constraints form part of what Stella Dextre Clarke and Marcia Zeng describe as the “data model” that is encapsulated within ISO 25964. This ISO standard, the authors argue, marked the transition to rigorous vocabulary management for SKOS-supported management systems. The standard’s publication, only shortly after the CESSDA-ELSST project began, was timely for us, helping to ensure that both thesauri keep abreast of semantic web developments.
