Database design – a new approach

Conventional database development and modification requires that first a physical data model is created or modified, from which the database tables are generated, before those tables can be loaded with ‘instances’. The physical data model tables then specify what can and cannot be expressed, because the tables are the templates for the possible expressions and what cannot be instantiated in those tables cannot be expressed in the database. (Note that in fact this defines an ‘ad hoc language’ that is limited by the possible expressions) In addition to that the requirements, such as cardinality requirements and consistency and integrity constraints, specify what shall be expressed and what is not allowed to be expressed.

A new approach to database design and modification could simplify the process by creating a generic database that enables that nearly everything can be expressed, including the expressions of requirements and constraints as well as expressions (instances) that should satisfy those requirements and constraints. Such a database does not need to be modified when a scope expands or changes. Only the expression of requirements and constraints should change, whereas generic software would be able to use the expressions of requirements and constraints as a basis for verification of new expressions (instances) on compliance and on consistency with existing expressions.

A predefined formalized language that is based on binary expressions also for the expression of higher order relations, such as Formal English, enables the creation of such generic databases in which database requirements and constraints can be stored as well as expressions (instances) that should satisfy them. Furthermore, the language defining ontology provides the modeled knowledge that enables software to verify compliance and ensure consistency and integrity.

This is illustrated in the following example of some requirements and constraints specifications and a possibility for a ‘library database’, together with instances that (shall) satisfy those requirements.

Requirement 1: every book shall have one title that uniquely identifies the book.
Requirement 2: every book shall have at least one author.
Statement 1: every book may be subject of several loans, but only one at a time, whereas one loan can include 1 to 5 books.

This can be expressed in Formal English and stored in a generic database as follows:

Intention

Left hand cardinalities

Name of left hand object

Name of relation type

Right hand cardinalities

Name of right hand object

requirement

0,1

book

shall have as title a

1,1

text string

requirement

0,n

book

shall have as author a

1,n

person

statement

1,5

book

can be subjected to a

0,1

loan

Note 1: The cardinalities are ‘simultaneous cardinalities’, which express that they are only constraints for instances that are valid at the same time.  
Note 2: Full Formal English expressions include also a number of UID’s and contextual facts to enable the use of synonyms and homonyms, validity context, status, timing, etc. A validity context specifies within which context the requirement, constrain or possibility is valid.
Note 3: In this example, the concepts book, text string, person and loan, as well as the relation types are selected from the dictionary-taxonomy with predefined concepts.

The Gellish language defining ontology specifies that kinds of relations that are used to express possibilities can be realized by statements that use other kinds of relations. Furthermore, it specifies that kinds of relations that express requirements (shall have...) are subtypes of corresponding kinds of relations that express possibilities (can have...). For example, the above first requirement that is expressed by using a <shall have as title a> relation is a subtype of a <can have as title a> relation, whereas from the latter it is specified that such a possibility can be realized by a <has as title> relation, provided that the left hand object as well as the right hand object is classified as a kind as specified in the requirement (book and text string respectively). Thus such a relation is a fulfillment of a requirement. This example of a relation between a possibility and a realization (which is inherited by a requirement) is specified as follows:

Intention

LH card.

Name of left hand object

Name of relation type

RH card.

Name of right hand object

statement

0,1

can have as title a

can be realized by a

0,n

has as title

A similar realization relation is specified for a <shall have as author a> relation and a <can be subjected to a> relation.

Thus the following statements about book B-1 are examples of instances for which it can be verified by software whether they satisfy the above requirements and possibilities:

Intention

LH card.

Name of left hand object

Name of relation type

RH card.

Name of right hand object

statement

 

B-1

is classified as a

 

book

statement

 

B-1

has as title

 

Semantic Modeling

statement

 

Semantic Modeling

is classified as a

 

text string

statement

 

B-1

has as author

 

John Doe

statement

 

John Doe

is classified as a

 

person

statement

 

B-1

is a subject in

 

Loan-1

statement

 

Loan-1

is classified as a

 

loan

Furthermore, if a user of the library database system enters only the first fact: ‘B-1 is a book’, then a requirements driven user interface could use the above requirements specification to ask for a title and for one or more authors.

A more extensive discussion of the modeling of requirements and the verification whether they are satisfied is given in the Gellish Semantic Modeling Methodology series Part 3B, Modeling of Requirements & Verification of Deliverables.

A generic database need not be implemented with the above tabular structure, but could for example also be implemented in an extended RDF database that enables to store 'named graphs'. However, RDF tends to become very verbose, especially when many contextual facts are included, whereas querying a tabular structure will have performance advantages.

There are similarities of this approach with the creation of a meta model. However, a meta model includes the expression of requirements for a data structure, whereas in this approach the requirements are requirements that are only about the products and processes (the content) and not about the database structure. Furthermore, in this approach the requirements can be included as instances in the database and may be combined with expressions of knowledge (of kinds of things), even without an intention to create information about individual things (instances) of those kinds. And all three can be stored in the same structure!
The are also similarities of this approach with NIAM and ‘Object Role Modeling’ (ORM). However, they are mainly (powerful) methodologies, whereas this approach is not only a methodology, but it also provides a full (extensible) language, including standardized kinds of relations and a dictionary-taxonomy (with synonyms, homonyms, abbreviations, etc.) which supports inheritance and provides reusable knowledge bases and semantic verification of consistency.