Why the term "relation(al)"?

First of all, I highly recommend the scientific paper in which Dr. Edgar Frank Codd published the relational framework to the general public in 1970, i.e., A Relational Model of Data for Large Shared Data Banks. There, in section 1.1, “Introduction”, Dr. Codd himself states that:

This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data.
_{© Association for Computing Machinery. Communications of the ACM, Volume 13, Issue 6 (pp. 377-387), June 1970.}

So, yes, the terms relation and (hence) relational come from a mathematical background. Dr. Codd —who, apart from his academic and research credentials, had about 20 years of first-hand experience in computing and information processing— envisioned the enormous advantages of applying the relation (an abstract construct, naturally) in the field of data administration.

I am not a mathematician but, basically speaking, a relation is an association between sets, a set being a collection of elements (this external resource gives a definition of mathematical relation that may help to understand it from a different perspective). When working with the aid of a SQL database management system (DBMS for brevity), a well-known approximation of a relation is a table, in which case the association takes place between the types of its columns. Evidently, in SQL platforms that do offer DOMAIN support (e.g., Firebird and PostgreSQL), the association occurs between the domains fixed for the columns of the table in question; see the sections below for significant details.

In that respect, I am going to cite again Dr. Codd, who in section 1.3, “A Relational View of Data”, asserts that:

The term relation is used here in its accepted mathematical sense. Given sets S₁, S₂, ⋯ , S_n, (not necessarily distinct), R is a relation on these n sets if it is a set of n-tuples each of which has its first element from S₁, its second element from S₂, and so on.¹ We shall refer to S_j as the jth domain of R. As defined above, R is said to have degree n. Relations of degree 1 are often called unary, degree 2 binary, degree 3 ternary, and degree n n-ary.

_{¹ More concisely, R is a subset of the Cartesian product S₁ × S₂ × ⋯ × S_n}.
_{© Association for Computing Machinery. Communications of the ACM, Volume 13, Issue 6 (pp. 377-387), June 1970.}
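To make the quoted definition more tangible, here is a minimal Python sketch. The two domains and their values are invented for illustration; only the idea (a relation is a subset of the Cartesian product of its domains) comes from the paper.

```python
from itertools import product

# Hypothetical domains, playing the role of S1 and S2 in Codd's definition.
person_numbers = {1, 2, 3}
amounts = {1000, 2000}

# The Cartesian product S1 x S2: every possible pairing of the two domains.
cartesian = set(product(person_numbers, amounts))

# A relation R of degree 2 is any set of 2-tuples drawn from that product.
relation = {(1, 1000), (2, 2000), (3, 1000)}


def is_relation_on(candidate, *domains):
    """Return True if every tuple takes its j-th element from the j-th domain."""
    return candidate <= set(product(*domains))


# The degree is the number of domains, i.e. the length of each tuple.
degree = len(next(iter(relation)))
```

Note that the relation is a plain set: it has no ordering of tuples and no duplicates, which is part of what distinguishes it from a typical SQL table.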

And I agree with other answers in that it is very relevant to point out that Dr. Codd made some adaptations to the mathematical relation in order to get the most out of it regarding data management, and they are explained in the paper referred to before and throughout his extensive bibliography.

Relation and relationship

A situation worth bringing up is that, when dealing with these subjects, there may arise confusion due to the similarities that exist regarding the everyday (non-mathematical, non-technical) definitions of the terms relation and relationship —which, as a non-native English speaker, I find particularly understandable—.

The entity-relationship view and the relational model

Another factor that I think may also cause confusion (and is closely associated with the technical connotations of the two terms brought up above) is that, when learning to design databases, a student or practitioner is typically first introduced to the methodology proposed by Dr. Peter Pin-Shan Chen in the entity-relationship view of data (published in 1976), which suggests two different implements (i.e., the entity and the relationship) to delineate a conceptual schema. Then, only after the definition of said schema is stable, the student or practitioner is introduced to relational terms and instruments (e.g., the relation) when declaring the logical layout of the pertinent database. Within the conceptual frame of reference, relationship holds connotations that are much closer to the everyday sense of the word.

Then, perhaps, that circumstance also adds to the relation and relationship issue —but the sequence of firstly defining the conceptual schema and subsequently declaring the corresponding logical design is of course quite appropriate, as I will detail in the following sections—.

Responses to each of your subquestions

I consider that having included those three subquestions is really pertinent because they establish a broader context for the post, so they should not be overlooked. In this way, apart from exclusively addressing why the terms relation and relational are used (which certainly is very significant and is the title of the post, but it is not the entire post), said subquestions can assist in comprehending more of the scope of the relation and the relational model when one is involved in a whole information management project (quite relevant since this is a site about database administration) and is therefore working at different levels of abstraction. In this manner, I am going to share my take on those particulars below.

Subquestion no. 1

Why is, for example, a Person, considered to be a "relation"? In English, a relation is a noun that describes how two entities are associated. It doesn't refer to the entities themselves. In the context of relational databases, "relation" refers to the entities themselves. Why?

Conceptual level

In a given business environment, Person can be considered an entity type depending on how the people who work there (business experts and database designers) conceptualize it. And, yes, in that business environment, there may be different properties of interest with respect to the Person entity type, e.g., Name, BirthDate, Gender, etc.

Moreover, the Person entity type may hold certain relationship (or association or connection) types with itself or other entity types; e.g., Person may be associated with an entity type named UserProfile, which in turn may have its own properties of interest, let us say, Username and Password.

But, (a) the entity types, (b) their corresponding properties, (c) the relationship types between entity types and (d) the relationships between the properties themselves are notions that “belong to” the particular business environment in which they are deemed of significance. They are devices used by database designers that work closely with business experts in order to define a context-specific conceptual schema, at the design phase.

Thus, at the conceptual level we basically work with the structure of the ideas that arise in the real world’s segment of interest, i.e., (1) prototypes of things and (2) prototypes of relationships between prototypes of things; we do not work with (3) relations (employing this last term in the sense of the relational framework of data).

Logical level

After Person has been precisely delineated as an entity type at the conceptual level, and if one wants to implement a relational database that conveys the meaning of Person and all the concepts associated with it, then the facts about entities of that type can be managed by virtue of a mathematical relation at the logical level, and one can take advantage of the science-based operations that can be performed on that abstract construct (i.e., define it, constrain it and manipulate it).

Yes, one can name a certain relation Person when defining the logical arrangement of a database, but that does not transform the “real world” concept of Person into a relation; one approaches it as such because of the benefits that are obtained when managing information about it, e.g., applying relational algebra operations on it to derive new relations (and therefore one is deriving “new” information). Said benefits become more evident taking into account the fact that the entities of a certain type make up a set, and the values of a certain property make up a set too.

And, yes, as mentioned in preceding paragraphs and in other answers as well, one of the paramount aspects of a relation is the connection that exists between its domains —that are typically used to represent the properties of entity or association types that are part of a conceptual schema—. For example, let us say that we have declared the following (ternary) relation:

Salary (PersonNumber, EffectiveDate, Amount)

…and let us suppose that, in the business environment in question, the tuple —which (i) stands for a particular entity, i.e., an instance of an entity type from the applicable conceptual schema, and (ii) whose SQL counterpart is a row—

Salary (x, y, z)

…would carry the meaning

“The Salary paid to the Person identified by PersonNumber x on EffectiveDate y corresponds to the Amount of z”.

Accordingly, to describe things in an approximate manner, the connection between the three domains is of prime importance: they are all related (and, yes, a unary relation would involve one domain only). The connection among all the values of a certain domain is very significant too, as they constitute a set of a precise type. Also, the contents of each tuple of the Salary relation must fit in the structure of the assertion illustrated above.
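As a rough illustration of the Salary relation and of deriving new relations from it, consider the following Python sketch. The sample tuples and the helper names (meaning, restrict, project) are my own assumptions for demonstration; they mimic relational restriction and projection over a set of tuples and are not a DBMS implementation.

```python
from collections import namedtuple
from datetime import date

# Heading of the relation: Salary (PersonNumber, EffectiveDate, Amount).
Salary = namedtuple("Salary", ["PersonNumber", "EffectiveDate", "Amount"])

# Body of the relation: a set of tuples (values invented for illustration).
salary = {
    Salary(1, date(2020, 1, 1), 3000),
    Salary(1, date(2021, 1, 1), 3200),
    Salary(2, date(2020, 6, 1), 4100),
}


def meaning(t):
    """Instantiate the assertion that a tuple of Salary stands for."""
    return (
        f"The Salary paid to the Person identified by PersonNumber "
        f"{t.PersonNumber} on EffectiveDate {t.EffectiveDate} "
        f"corresponds to the Amount of {t.Amount}"
    )


def restrict(relation, predicate):
    """Restriction: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, *attributes):
    """Projection: keep only the named attributes (the result is again a set)."""
    return {tuple(getattr(t, a) for a in attributes) for t in relation}


# Two derived relations that were not declared up front.
salaries_of_person_1 = restrict(salary, lambda t: t.PersonNumber == 1)
persons_with_salary = project(salary, "PersonNumber")
```

Each result is itself a set of tuples, so it can be fed into further operations, which is what makes the single construct so convenient.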

Conceptual-level relationships and logical-level relations

As demonstrated, I have now dealt with database management at two different levels of abstraction, namely conceptual and logical —and there is yet a lower level known as the physical one, which in SQL DBMSs typically involves, e.g., indexes, pages, extents, etc.—.

So, in accordance with the notions explained before, at the logical level one works exclusively with (a) mathematical relations, where (b) the conceptual relationships or associations are represented by (c) the values contained in the tuples of such mathematical relations, and said values are usually delimited via FOREIGN KEY constraints so that they can represent the applicable relationships accurately.

And, yes, associative entities, i.e., instances of relationship types with a many-to-many (M:N) cardinality ratio, can be conveyed by way of the tuples of a single mathematical relation —with the corresponding constraints declared appropriately, of course—.
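The following sketch shows one way the previous two paragraphs could look in practice, using Python's standard sqlite3 module as a stand-in for a SQL DBMS. The Project and Assignment tables, their columns and the sample rows are hypothetical; Assignment plays the part of the single relation that conveys an M:N relationship type, with FOREIGN KEY constraints delimiting its values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces FOREIGN KEY constraints when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Person  (PersonNumber  INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
    CREATE TABLE Project (ProjectNumber INTEGER PRIMARY KEY, Title TEXT NOT NULL);

    -- One table conveys the M:N relationship type between Person and Project.
    CREATE TABLE Assignment (
        PersonNumber  INTEGER NOT NULL REFERENCES Person  (PersonNumber),
        ProjectNumber INTEGER NOT NULL REFERENCES Project (ProjectNumber),
        PRIMARY KEY (PersonNumber, ProjectNumber)
    );

    INSERT INTO Person  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Project VALUES (10, 'Payroll');
    INSERT INTO Assignment VALUES (1, 10), (2, 10);
""")


def try_assign(person_number, project_number):
    """Attempt to record an assignment; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Assignment VALUES (?, ?)",
                (person_number, project_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup, try_assign(3, 10) is rejected because no Person with PersonNumber 3 exists, so the relationship cannot be asserted about a nonexistent entity.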

Subquestion no. 2

I understand that relational model came after the hierarchical and network models. But in those models, the entities also have relations to one another. So why call this model the relational model? Is there a more specific phrase/term? Or maybe we should say that all three models are relational models, but the hierarchical and network models are specific types of relational models?

Network and hierarchical DBMSs preceded their formal theoretical support

It is opportune to point out that the theoretical support around the hierarchical and the network approaches was, in fact, created in terms of previously existing DBMSs, with the aim of, among other aspects, testing and establishing the soundness of (1) said kinds of software and (2) the linked data management practices —an upside-down phenomenon, from my point of view—.

Incomplete in comparison with the relational framework

That being said, although there are hierarchical and network DBMSs that predate the relational model, and even when Dr. Codd referred to each of those approaches as a “model”, none is defined as such in the same way that the relational framework is. The relational paradigm provides scientific constructs for the (i) definition, (ii) restriction and (iii) manipulation of data, and the hierarchical and network approaches lack full theoretical support to cover all of the three sorts of constructs previously mentioned.

Network and hierarchical features

Also, as stated before, the entity and relationship types are conceptual-level devices, they do not belong to the hierarchical or network approaches, each of which offers particular mechanisms to represent said aspects:

The network paradigm entails two devices for data representation, i.e., nodes and arcs (and that characteristic of course implies two different kinds of data manipulation operations) which, when contrasted with the relational model that (as per the information principle) requires only one construct (the relation), makes evident the needless complexity that working in a network fashion involves. For instance, given that it resorts to two representation instruments, the network approach imposes an impractical query bias that hinders data manipulation.
For its part, the hierarchical view proposes representing the data by way of (physical!) files made up of records (which in turn consist of fields) organized in a tree-like arrangement; i.e., one parent record chained with possibly many child counterparts via pointers, which produces a physical access path with regard to data manipulation. This approach is also unfavourable because it presents an entanglement of conceptual and physical aspects, so changes in the physical storage arrangements require a reorganization of the data structures, which in turn demands changes in the corresponding data manipulation operations.

As shown, the hierarchical and network views impose their constructs on the data to be managed, whereas the relational model proposes administering the data elegantly in its natural structure by means of sets of associated facts (from which n subsequent types of sets, not anticipated at the design phase, can be derived and so on!).
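A toy contrast may help here. The sketch below is only an analogy in Python, not a model of any real hierarchical DBMS, and the Department and Employee data are invented. It shows how a parent-to-child arrangement privileges one access path, whereas a set of tuples can be queried in either direction with the same kind of operation.

```python
# Hierarchical-style arrangement: each parent record holds its children,
# so the access path parent -> child is built into the structure.
hierarchy = {
    "Sales": ["Ann", "Bob"],
    "Support": ["Cy"],
}


def department_of_hier(employee):
    """Going against the built-in path means walking every parent record."""
    for department, employees in hierarchy.items():
        if employee in employees:
            return department
    return None


# Relational-style arrangement: one construct, a set of
# (Department, Employee) tuples; neither direction is privileged.
works_in = {("Sales", "Ann"), ("Sales", "Bob"), ("Support", "Cy")}


def employees_of(department):
    """Restrict on Department, then project on Employee."""
    return {e for (d, e) in works_in if d == department}


def departments_of(employee):
    """Restrict on Employee, then project on Department."""
    return {d for (d, e) in works_in if e == employee}
```

In the second arrangement both questions are answered by the same restriction-plus-projection pattern, and no stored structure has to be reorganized to support a new question.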

The relational model does not have sub models

And, quite important, neither the hierarchical nor the network views are specific types of relational models; they are simply other paradigms that someone may follow to (a) build DBMSs and to (b) create databases, but please bear in mind that the hierarchical and network approaches have been considered obsolete for decades now.

Subquestion no. 3

What if we have standalone entities that don't relate to one another. Say, Person, Door, and Tree. Is the term "relation(al)" still applicable?

Yes, it is perfectly applicable if one is (1) managing information about those entity types by dint of adapted mathematical relations and (2) performing the applicable relational operations at the logical level in a certain database administered with the support of a given relational DBMS.

It does not matter if, at the conceptual level, said entity types hold no relationship types with other entity types (and it is worth noting that an entity type can have a relationship of one-to-zero-one-or-many cardinality ratio with itself), and thus one is neither conveying nor enforcing any relationship between the values of the tuples of the relations under consideration.
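As a small sketch of that point, the three standalone entity types can each be handled as a relation in Python, with no references between them. The attribute names and sample values are assumptions made up for the example, and the helpers address attributes by position for brevity.

```python
from itertools import product

# Three relations with no relationship types between them.
person = {(1, "Ann"), (2, "Bob")}    # Person (PersonNumber, Name)
door = {(1, "Oak"), (2, "Steel")}    # Door (DoorNumber, Material)
tree = {(1, "Birch")}                # Tree (TreeNumber, Species)


def restrict(relation, predicate):
    """Restriction: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, *positions):
    """Projection: keep only the attributes at the given positions."""
    return {tuple(t[i] for i in positions) for t in relation}


steel_doors = restrict(door, lambda t: t[1] == "Steel")
person_names = project(person, 1)

# Even the Cartesian product of two unrelated relations is itself a relation.
person_times_tree = {p + t for p, t in product(person, tree)}
```

Relational operations apply to each of them regardless of whether any foreign key ties them together, because what is "relational" is the structure of each set of tuples.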

The interesting thing behind 'relational database' is that it does not (primarily) refer to the relations between tables, as you might expect, but to the relation among the multiple properties (columns) in a tuple. A relational database stores each of those tuples as a row in a table.

It builds on the calculus of relations that Alfred Tarski set out in his 1941(!) paper On the calculus of relations. He summarized the history of the term and its usage in symbolic logic, and he also defined operations on relations that foreshadow the relational algebra which in the end became the foundation for SQL.

Codd later turned this into a definition of what can be understood as a relational database in his twelve rules (sometimes called his 12 commandments).

The term "relational" comes from mathematics and has nothing to do with relationships between entities. I'm not a mathematician (whereas Codd was formally trained in mathematics) and so won't elaborate, but will point you to this Wikipedia article on binary relations. The Wikipedia entry on relation (databases) gives additional detail on how Codd adapted the mathematical concepts to apply to data management. As to why this mathematical structure is called a relation, I think it has to do with the idea that there is a "relationship" between the domains that make up the relation. The best source I know of to better understand Codd's original thinking is Fabian Pascal's Practical Database Foundations and Understanding the Real RDM series of papers. Chris Date has also written extensively on the RDM and his Third Manifesto site has a section listing papers and books. His book Relational Theory for Computing Professionals is a good introduction. I hope this helps.
