The Power of Third Normal Form Data Models
The lowest architecture layer of most software designs is the Repository as a Database. The design of the database defines the “Data Model”.
The Data Model represents the organizational structure of the information elements or database fields
The Data Model’s organizational structure is referred to as its “Database Normalization”. Database normalization is the process of representing a database in terms of relations in Standardized Normal Forms.
Database normalization has many “Normalization Forms” that are available to the Database Designer for their Data Model.
The normal forms, abbreviated as NF, in relational database theory provides criteria for determining a table’s degree of Immunity against Logical Inconsistencies and Anomalies.
The Higher the Normal Form Applied to a Table
… The Less Vulnerable it is to Inconsistencies and Anomalies
Common Data Model Forms
The sections below define, in database theory terminology, the most common Data Model Forms and their Compliance Requirements.
A Listing on the Most Common Data Modeling Forms
A relation is in First Normal Form (1NF) if the domain of each attribute contains only atomic values and the value of each attribute contains only a single value from that domain.
All Higher Forms Comply with First Normal Form
… First Normal Form is a Minimal Requirement
First Normal Form is an essential property of a relation in all Relational Databases
A table that is in First Normal Form (1NF) must meet additional criteria if it is to qualify for Second Normal Form (2NF).
A table is in 2NF if and only if it is in 1NF and no Non-prime Attribute is dependent on any proper subset of any Candidate Key of the table.
A Non-prime Attribute of a table is an attribute that is not a part of any candidate key of the table
To simplify that statement a little:
A table is in 2NF if and only if it is
… in 1NF
…… and every Non-prime Attribute of the table
……… is dependent on the whole of a Candidate Key.
Note that when a 1NF table has no Composite Candidate Keys, candidate keys consisting of more than one attribute, the table is automatically in 2NF.
A table that is in Second Normal Form (2NF) must meet additional criteria if it is to qualify for Third Normal Form (3NF).
A table is in 3NF if it meets all the requirements for 2NF and there are no Non-prime Attributes duplicated anywhere.
A Prime Attribute of a table is an attribute that is part of a Candidate Key of the table
All requirements for One-to-Many Relationships to a Prime Attribute
… Are managed Many-to-Many Relational Tables known as Bridge, Join or Link Tables
The higher the forms offer more refined the control over the relationships to the Candidate Keys and Prime Attributes.
They can help solve anomalies that may occur in some situations. There are generally variations of 3NF.
Caution must be observed with forms higher than 3NF
… Management of these forms can be more complex over time
Why Third Normal Form?
A database can be simply a collection of tabular information such as Excel spreadsheets. Rows represent records and columns represent data fields in a record.
The raw data is deposited into database tables.
The unstructured non-relational information is accessed through normal SQL database calls.
In the Data Model tables are required to hold fields such as: PHONE1, PHONE2.
When a new requirement for a third phone is requested a new Column or Table Field must be added.
These duplicate fields create an unsustainable data model
Compliance with 3NF ensures that no single field, such as PHONE is duplicated anywhere in the entire database data model.
The Components of a 3NF Data Model
There are three types of tables that are used to comply with 3NF:
Each of these tables performs a specific and valuable role the Third Normal Form.
Data Tables are Standalone Data Entities.
They can be easily identified by have many relationships as a foreign key to other tables. A Data Model diagram will show many primary keys pointing to the Primary Data Table. These relationships represent the tables that depend of this Primary Data Table.
These tables are Value Tables to the Data Table in a Domain Driven Design model
Data tables hold the Prime Attributes of the Candidate Key, the Primary Key for the Data Table.
No Prime Attribute field is duplicated in any other table
Look Up tables or Seed Tables are tables that hold a unique value that can be a Foreign Key for a Data Table
Look Up Tables create a vehicle for 3NF compliance by making available data for Data Tables as One and Only One Data or Many-to-Many Data using Bridge or Join tables.
A Look Up table can be identified in a Data Model as it will normally only have a Primary Key that Bridge tables and Data tables use as their Foreign Keys.
These Seed tables are candidates for User Interface controls that allow a Web visitor to choose one or more selections for pull down menus and other Web page elements.
Bridge tables are sometimes referred to as Join or Link tables. They enable a Data Table to represent a collection of values from a Look Up table.
They can be identified in a Data Model easily as they generally only have Foreign Key relationships with a Look Up Table on one side and a Data Table on the other.
Compound Keys, more than one Primary Keys, should be use to define the identity of a record and create its relationship to its Look Up and Data tables
If the Data Model will be used in a Data Abstraction Layer (DAL) such as Entity Framework, complying with identity using a Compound Key will abstract away the Bridge Tables for the DAL Object Relational Map (ORM).
Entity framework understands the Many-to-Many relationships and simplifies the ORM model by not exposing the Bridge tables as Class representations.
Using Compound Keys with Bridge Tables can Reduce the Entity Model’s “Noise” sometimes as high as 40 %
The Value of Third Normal Form
If you are creating a new project, a “Green Field” project, then you should create a Third Normal Form Data Model.
All database table relationships in this new Green Field project should map to the Business Domain object’s information requirements.
This will simplify the layers above the database layer considerably.
The 3NF model is scalable and extensible for lower cost of ownership over the life cycle of the Data Model
In a “Brown Field” project, a legacy project, you will not always have a Data Model that maps to the Business Domain. But you do have avenues available to you to solve this issue.
Brown Field projects generally present complexities in the higher architectural layers as the data must be transformed to meet the Business Object requirements.
Rather than dealing with this complexity as a one-to-one problem solving activities, it is far more efficient to create an Anti-corruption Data Abstraction Layer using a ORM such as Entity Framework.
An anti-corruption layer is a layer that abstracts away the inconsistencies and issues below that layer: The Database Data Model
Entity Framework and other ORMs allow you to Split Tables and create Complex Objects from multiple tables.
You can then use these “Domain-centric” Class Object representations of the physical corrupted Database Model Table structure to create a pristine Business-centric Domain Entity Model.
All Layers above this Anti-corruption Data Abstraction will only see a Business Domain Object Model