7th May 1998
The purpose of this document is to provide a summary of current efforts in knowledge sharing and to identify some roles which formal methods might have to play in this kind of research. Much of the information on recent knowledge sharing work appears on the World Wide Web so, for convenience of reference, this document relies more heavily on Web-based citations than on published references. Those who want an entry point to the more conventional literature are referred to Uschold & Gruninger 1996, "Ontologies: Principles, Methods and Applications", The Knowledge Engineering Review, Vol 11:2, 93-136. This document isn't intended to give a survey of knowledge sharing work. In particular, it tries to avoid becoming enmeshed in the details of particular languages applied to specific domains. Several good lists of knowledge sharing work exist, including the one at Stanford's Knowledge Systems Lab.
There are many different ways of discussing knowledge sharing. The focus of this document is on formal logics and we understand the term "ontology" in the following way:
A formal ontology is a language with a precisely defined syntax and semantics (which may be determined via model theory, proof theory or in terms of another formal ontology). The inferences permitted in the language are constrained by one or more sets of proof rules accompanied by appropriate proof strategies. The forms of description allowed by those using the ontology are required to be consistent with a set of axioms limiting its use, which we call "ontological constraints". The aim of the ontology is to provide a language which allows a stipulated group of people to share information reliably in a chosen area of work.
One problem with the definition above is that it excludes most existing ontologies. Normally these don't have formal semantics (sometimes not even a precisely defined syntax) and very few ontologies have deployed explicit descriptions of proof strategies or formal ontological constraints. One of the aims of this document is to promote a discussion of whether these are of practical use.
This is intended to be a "living" document, which can be extended by others. As this happens, the author list should grow and we should be able to provide more detail in both the description of existing methods and the discussion of roles for formality. The ways in which this document might evolve include:
It is possible to view the problem of knowledge sharing from many different angles. This section looks at some of the areas being explored by the knowledge engineering community. It also gives a crude classification of the contributions of these different areas. The intention isn't to cover all research in this area but to provide a high level view of what is being done, with examples of different types of work. A more comprehensive collection of links to knowledge sharing work can be found at the Stanford Knowledge Systems Lab sources page.
The aim of a generic knowledge sharing language is to provide as versatile as possible a language for representing knowledge. An example of this is the Knowledge Interchange Format (KIF), which is a formal language (closely resembling First Order Predicate Calculus) with a declarative semantics. Like many applied forms of logic, it allows meta-level definitions such as believes(john, P) ==> believes(mary, P).
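Meta-level rules of the kind shown above can be understood as pattern-directed inference. The following is a hypothetical sketch (not the KIF standard itself, and the function names are my own) of how such a rule could be applied by representing sentences as nested tuples and forward-chaining over a set of facts:

```python
# Illustrative sketch: sentences as nested tuples, variables as
# strings beginning with '?'. The rule believes(john, P) ==>
# believes(mary, P) is applied by pattern matching.

def match(pattern, sentence, bindings=None):
    """Match a pattern against a ground sentence; return the variable
    bindings on success, or None on failure."""
    if bindings is None:
        bindings = {}
    if isinstance(pattern, str) and pattern.startswith('?'):
        if pattern in bindings:
            return bindings if bindings[pattern] == sentence else None
        bindings = dict(bindings)
        bindings[pattern] = sentence
        return bindings
    if isinstance(pattern, tuple) and isinstance(sentence, tuple):
        if len(pattern) != len(sentence):
            return None
        for p, s in zip(pattern, sentence):
            bindings = match(p, s, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == sentence else None

def substitute(pattern, bindings):
    """Instantiate a pattern with the given variable bindings."""
    if isinstance(pattern, str) and pattern.startswith('?'):
        return bindings[pattern]
    if isinstance(pattern, tuple):
        return tuple(substitute(p, bindings) for p in pattern)
    return pattern

def apply_rule(antecedent, consequent, facts):
    """One forward-chaining step: add the consequents licensed by a
    single rule to the fact set."""
    derived = set(facts)
    for fact in facts:
        b = match(antecedent, fact)
        if b is not None:
            derived.add(substitute(consequent, b))
    return derived

# The meta-level rule believes(john, P) ==> believes(mary, P):
rule_ante = ('believes', 'john', '?P')
rule_cons = ('believes', 'mary', '?P')

facts = {('believes', 'john', ('mortal', 'socrates'))}
facts = apply_rule(rule_ante, rule_cons, facts)
# The fact set now also records mary's belief.
```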
Some of the more specialised forms of logical description are sufficiently well used to spur standardisation of languages for use within that area. Such languages may have similar representational power to the more generic ones but they contain more specialised notation which makes particular forms of inference easier than would be the case if the language were more uniform. An example of this is the KRSL language which is targeted at planning problems and therefore contains primitive terms such as time_interval and duration, while also making assumptions about the underlying proof theory of the system (such as the assumption that asserted statements persist forwards through time unless contradicted).
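The persistence assumption mentioned above can be made concrete. The sketch below is illustrative only (it is not KRSL, and the predicate and function names are invented): a statement asserted at time t is taken to hold at all later times until a contradicting assertion intervenes.

```python
# Minimal model of "asserted statements persist forwards through
# time unless contradicted".

def holds(fact, time, assertions):
    """assertions: iterable of (time, fact, truth_value) triples.
    Returns the truth value most recently asserted at or before
    `time`, defaulting to False if nothing has been asserted."""
    value = False
    for t, f, v in sorted(assertions):
        if f == fact and t <= time:
            value = v
    return value

timeline = [
    (1, 'valve_open', True),   # asserted open at t=1
    (5, 'valve_open', False),  # contradicting assertion at t=5
]

holds('valve_open', 3, timeline)   # the t=1 assertion persists
holds('valve_open', 7, timeline)   # overridden by the t=5 assertion
```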
Some knowledge sharing languages are intended to model the most basic concepts of important classes of applications. These languages have a (normally small) number of specific primitive symbols which correspond to basic domain-specific concepts. An example is the Enterprise Ontology for business modelling, which uses terms such as sale_price and product as primitives, in addition to more generic primitives such as interval and the standard logical connectives.
In contrast to the "broad brush" approach taken in more general domain ontologies, it is possible to build detailed ontologies for some types of systems - particularly in engineered systems where the principles of operation are known in greater detail. An example is the Thermal Systems Model Fragment library of the How Things Work project. This library contains generic concepts of measurement, such as area, but also more specific concepts such as cold_flow_temperature_out. It is difficult to draw a precise distinction between this category and that of more general domain-specific ontologies but there is a large increase in the formal complexity of detailed system ontologies.
Attempts are being made to represent large amounts of domain knowledge within a coherent ontological framework. An example is the CYC project which has produced a large repository of knowledge on a wide range of topics, maintained as small clusters within the overall framework of the top-level ontology.
A difference between highly distributed, partially autonomous systems and more conventional modular systems is that in the latter requests for information and the supply of information are perfectly synchronised (across the given module interfaces), whilst in the former there is no guarantee that "agents" which request information or ask other agents to do things will satisfy these objectives. An interchange protocol provides a language for describing the dialogue between agents and establishing the different levels of agreement needed to get things done. An example is the Knowledge Query and Manipulation Language (KQML), which (among other things) provides a language for "performative" sentences, allowing agents to distinguish the intent of basic queries (such as evaluate, or reply) from effectors (such as achieve) or capabilities (such as advertise).
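The role of a performative can be illustrated with a small sketch. This is not KQML itself (KQML messages are expressed in an S-expression syntax with parameters such as :sender and :content); it is a hypothetical Python rendering, with invented function names, of the idea that an agent routes a message on its performative and can decline requests outside its capabilities:

```python
# Sketch of performative-based message dispatch between agents.

def make_message(performative, sender, receiver, content):
    """Build a message as a plain dictionary."""
    return {'performative': performative,
            'sender': sender,
            'receiver': receiver,
            'content': content}

def dispatch(message, handlers):
    """Route a message on its performative, so an agent can treat a
    basic query (evaluate) differently from an effector (achieve).
    An unsupported performative is declined with a 'sorry' reply."""
    handler = handlers.get(message['performative'])
    if handler is None:
        return make_message('sorry', message['receiver'],
                            message['sender'], None)
    return handler(message)

# An agent able to evaluate queries but with no 'achieve' capability:
handlers = {
    'evaluate': lambda m: make_message('reply', m['receiver'],
                                       m['sender'],
                                       ('answer', m['content'])),
}

query = make_message('evaluate', 'agent_a', 'agent_b',
                     ('price', 'widget'))
reply = dispatch(query, handlers)        # answered with 'reply'

request = make_message('achieve', 'agent_a', 'agent_b', 'restock')
refusal = dispatch(request, handlers)    # declined with 'sorry'
```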
This section introduces some of the issues raised by knowledge sharing and attempts to generalise them as problems of logic. Issues which are highly specific to domains and to technology (such as tool support) are ignored.
All attempts to represent domain knowledge symbolically require us at some stage to explain, usually in natural language, what our primitive symbols mean. Some symbols are harder than others to describe unambiguously in this way. For example, here is a comment from the CYC top-level ontology which describes what it means to be transported over land:
(#$TransportViaFn #$LandTransportationDevice) "A collection of events. An instance of (#$TransportViaFn #$LandTransportationDevice) is a transportation event in which a #$vehicle designed to travel over land is used. Examples include instances of the collections (#$TransportViaFn #$RoadVehicle), #$SkateBoarding, and #$SnowSkiing. Note: This collection does not include instances of #$AnimalWalkingProcess, which is a #$LocomotionProcess (q.v.), not a form of transportation as defined in Cyc. A marginal example of this would be a (fictional) event in which someone burrowed through the earth in a Jules-Vernesque real-time tunnel-digging machine."
It is often difficult to know when to stop decomposing a problem into increasingly more detailed formal structures and "close" off the ontology with a primitive formal expression. In languages like KIF those parts of the ontology which are considered definitional (i.e. they are not defined inside the ontology) are flagged.
Formal issue: Are there formal mechanisms that would help people compare different ontologies in the same domain but with differing definitional boundaries?
A group of problems concern the detection of inconsistencies in shared knowledge. A feature of all the problems in this group is that we know they aren't soluble in general. Therefore, an interesting question is whether there are ways of restricting knowledge sharing such that these problems become tractable.
It is difficult to produce a single ontology which is simple enough to be easily learned but rich enough to accommodate the variations in representational style favoured by different groups of people. One way around this difficulty is to provide a "core" ontology and allow diversity via extensions to it - for example the Process Interchange Format (PIF) allows for Partially Shared Views (PSVs) which are a form of modular extension to the PIF core. Suppose I have an ontology which you want to extend to suit your domain by adding new forms of expression. How do you know that the extension you are making is consistent with what I already supplied?
Formal issue: Given existing ontological axioms, A, and some new axioms, Ae, which extend the ontology, how do we know that the union of A and Ae is consistent? This is not decidable, in general, for FOL.
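Although the problem is undecidable for full FOL, restricting the axioms to propositional form makes it decidable by exhaustive search. The sketch below (an illustration under that restriction, with invented names throughout) represents axioms as Python predicates over a truth assignment and tests whether the union A ∪ Ae is satisfiable:

```python
# Brute-force propositional consistency check: A set of axioms is
# consistent iff some truth assignment satisfies every axiom.

from itertools import product

def consistent(axioms, atoms):
    """True iff some assignment of truth values to `atoms` makes
    every axiom in `axioms` come out true."""
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if all(ax(assignment) for ax in axioms):
            return True
    return False

# Existing ontology A: a device is a valve or a pump, but not both.
A = [lambda m: m['valve'] or m['pump'],
     lambda m: not (m['valve'] and m['pump'])]

# A consistent extension, and an extension that clashes with A:
Ae_good = [lambda m: m['valve']]
Ae_bad = [lambda m: m['valve'], lambda m: m['pump']]

atoms = ['valve', 'pump']
consistent(A + Ae_good, atoms)   # the union remains consistent
consistent(A + Ae_bad, atoms)    # the union is inconsistent
```

The same check applies unchanged to the merging and partial-sharing problems below: take the union of the relevant axiom sets and test it for satisfiability.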
If many ontologies are produced then it is possible that we may want to combine them in order to use them on shared problems. The fullest form of this is to merge them, but then we need to worry about whether the ontologies clash.
Formal issue: Suppose we wish to merge the axioms defining ontologies A1 and A2. We would like to be able to guarantee that the union of A1 and A2 does not raise an inconsistency. This is not decidable, in general, for FOL.
It may not be necessary to obtain full compatibility between ontologies. Perhaps it is sufficient to share bits of the knowledge representation language, and therefore limit the compatibility problem to a subset of the ontological definitions.
Formal issue: For some subset As of the axioms defining an ontology, show that the union of these with another given set of ontological axioms does not raise an inconsistency. This is not decidable, in general, for FOL.
Most knowledge sharing languages are supposed to support "engineering", in the sense that the shared knowledge is used in building something and its designers should want to build it well. We don't know how to synthesise knowledge-based systems fully automatically so it is necessary for human engineers to understand the shared information at an appropriate level, and perhaps it will also be necessary for other people to check the use to which it has been put. This raises a group of problems for formal languages.
At some point, someone usually has to understand the knowledge shared (or its consequences). In conventional knowledge engineering this is normally supported by translators into and/or out of constrained natural language, diagrams, etc. Shared knowledge also requires shared (or translatable) visualisations of that knowledge. One way of tackling this problem is via generic forms of visual language (see, for example, the conceptual graphs used in the Peirce project) but in practice a wide variety of communication styles are used.
Formal issue: Can the visualisations for formal ontologies be interchanged, either via a generic language or through translators between specific languages, and does the use of logic help us to do that?
When building ontologies we often introduce forms of expression which, although not standard logical operators, play a special role during inference. An example is the widespread use of 'abnormal' definitions to allow for forms of default reasoning without changing the proof rules of a system. Engineers need to be careful when using this type of information because its purpose may not be immediately obvious from looking at knowledge expressed in the ontology.
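The 'abnormal' idiom can be shown in miniature. The following is an illustrative sketch (the predicates bird, penguin, ab and the function name are invented for the example): the default conclusion is drawn only when abnormality cannot be shown, so adding knowledge retracts the conclusion without changing the inference machinery.

```python
# Default reasoning via 'abnormal' (ab) predicates:
# bird(x) and not ab(x) => flies(x).

def flies(individual, facts):
    """Apply the default rule for one individual against a set of
    ground facts represented as (predicate, argument) pairs."""
    is_bird = ('bird', individual) in facts
    abnormal = ('ab', individual) in facts
    return is_bird and not abnormal

facts = {('bird', 'tweety'),
         ('bird', 'opus'),
         ('penguin', 'opus')}

# An ontological constraint: penguins are abnormal with respect
# to flying. Note this is ordinary object-level information, easy
# to misuse if its role in blocking the default is not understood.
facts |= {('ab', x) for (p, x) in facts if p == 'penguin'}

flies('tweety', facts)   # default applies
flies('opus', facts)     # default blocked by ab(opus)
```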
Formal issue: Is there a need for formal ways to express constraints on use of primitive terms, beyond that which can be expressed using ontological constraints or proof rules?
Although it is possible to imagine a single set of proof rules and accompanying proof strategy being used for all knowledge sharing, it is difficult to see how this could happen in practice because proof strategies are a good way of being selective during inference and thus simplifying knowledge representation. However, if we allow different proof strategies, how do we show that knowledge intended for use under one strategy can safely be used with another?
Formal issue: Is it possible to test compatibility between proof strategies or, less ambitiously, to have a language for recording agreement on the correspondence between strategies?
Formal issue: Can we provide automated support for combining proof strategies or, less ambitiously, a language for describing how combination was done?