Common languages for database definitions and database contents

Language defining ontologies

I advocate adopting and/or developing formalized natural languages for database design as well as for the harmonization of database content. Such formalized languages should be defined in language defining ontologies that are distinct from knowledge modeling ontologies (as I have explained in another blog).

Here I describe the essentials of a formalized language definition, and thus of language defining ontologies.

Read more: Common languages for databases

Unique identifiers, synonyms and homonyms

It is widely recognized that there is a difference between things (or concepts) and the various names by which those things can be denoted. Nevertheless, in databases and data exchange messages, things are usually denoted only by names, which causes interpretation problems and hampers the interoperability of systems and data integration. This can be solved by using a formalized language in which unique identifiers represent the things (including concepts, aspects, relations, etc.), whereas multiple different names are allowed for use by different parties. This article discusses the advantage of this solution over other approaches.
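The idea can be illustrated with a minimal sketch in Python. It is not taken from any particular formalized language: the class `NameRegistry`, its methods, the numeric identifiers and the example names are all hypothetical and serve only to show that synonyms share one unique identifier, whereas a homonym resolves to several.

```python
from collections import defaultdict


class NameRegistry:
    """Hypothetical registry: unique identifiers denote things, names are only labels."""

    def __init__(self):
        # uid -> set of (name, language community) pairs
        self._names_by_uid = defaultdict(set)
        # lower-cased name -> set of uids that the name may denote
        self._uids_by_name = defaultdict(set)

    def add_name(self, uid, name, community):
        """Register a name that a given community uses for the thing with this uid."""
        self._names_by_uid[uid].add((name, community))
        self._uids_by_name[name.lower()].add(uid)

    def lookup(self, name):
        """Return all uids that the name may denote (more than one for a homonym)."""
        return sorted(self._uids_by_name.get(name.lower(), set()))

    def names_of(self, uid):
        """Return all names (synonyms) registered for one uid."""
        return sorted(name for name, _ in self._names_by_uid.get(uid, set()))


registry = NameRegistry()

# Synonyms: two parties use different names for the same thing (illustrative uid).
registry.add_name(1001, "pump", "English")
registry.add_name(1001, "pomp", "Dutch")

# Homonyms: one name denotes two different things (illustrative uids).
registry.add_name(2001, "seal", "engineering")
registry.add_name(2002, "seal", "biology")

print(registry.lookup("pomp"))   # one thing, whichever synonym is used
print(registry.lookup("seal"))   # ambiguous name: two candidate things
print(registry.names_of(1001))   # all names for the same thing
```

In this sketch a data exchange message would carry the identifier (for example 1001) rather than the name, so the receiving party can display whichever name it prefers, and an ambiguous name such as "seal" has to be resolved to a single identifier before it is stored.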

Read more: Unique identifiers - synonyms and homonyms

Data types are not needed

I would like to defend the statement: Data types are not needed in semantic databases.

In natural languages we use just strings of characters..., or are we using data types?

Read more: Data types not needed

Two kinds of ontologies: language defining and knowledge modeling

Ontologies are models that express formalized knowledge (or requirements) in some application domain. Usually every organization creates its own ontology on the basis of its own business needs and domain knowledge. Creating your own ontology is often advocated because of differences in business needs. Unfortunately, the general theories about how ontologies should be created have not (yet) resulted in an agreed and common ontology development process. As a consequence, the resulting ontologies are unnecessarily different, and those differences complicate data exchange, hamper interoperability of systems and prevent data integration.

One of the root causes of this situation is that there is little awareness that ontology development typically mingles two types of ontologies that should be distinguished: one ontology should define a (formalized) language, including the domain terminology, and the other should model the domain knowledge using that language. The creation of the language defining ontology should precede the creation of the domain knowledge specifying ontology, because the latter should be written in conformance with the conventions of the former.
This article advocates such a separation, as well as an effort by industry associations and institutions to develop common language defining ontologies that also include domain terminology for various technical and business domains. One example of an initiative to develop a common language defining ontology is Gellish Formal English, which originated in the process industries and in ISO 15926; another is the Dutch CB-NL, for the building and civil infrastructure industry in the Netherlands. Both are multilingual and are intended to develop into European standards.

Read more: Two kinds of ontologies