CIS520 Class Notes

=Introduction=

Databases and database systems
An information system is a software system that processes information in a particular application domain. An information system improves business productivity by automating business processes. It is very typical for information systems to make use of one or more databases.

A database is a collection of persistent data. Persistent data are datasets stored on external memory devices to "persist" between separate executions of the information system.

A database application is a program that involves a database and automates business processes that make use of the data in the database. In other words, a database application is an information system that has a database.

A database management system or database system is a complex software system used to design, implement, run, and administer databases. A database system has components, such as data dictionary, SQL engine, and database administration tools. Most popular commercial and open source database systems are relational database systems. It means that they use the relational data model to design and implement the databases they manage.

Data models
Modeling is an activity during which we construct models of the things we study. These models are simplified representations that capture what is relevant for the modeling task at hand. By modeling we select the aspects of interest in a particular application domain, give those aspects names, and describe their properties and how they relate to each other.

A data model simplifies the representation of data in an information system. A simplified representation helps us understand how data can be processed by the system. We say that a data model describes what data and how data in an information system are organized for storage and access. In short, a data model is how represent data for a specific application.

Because of its complexity, a data model is decomposed into layers, to close the gap between how users understand or "view" the information system and how computers and networks run the information system that was developed to meet the users' needs.

User views
Data that users operate on are modeled in a layer of abstraction called external layer of the data model. The external layer consists of user views. User views have data that users enter into the information system or data that users retrieve from an information system. User views represent data that are relevant to the users of the information system and are viewed and produced during user interactions with the information system.

Conceptual or analysis data model
Analysts of an information system are responsible for creating user views and conceptual data models of the data that the information system will process. Analysts build these models in collaboration with the client and end-users of the information system. That is why conceptual data models are also called analysis data models. These models are built when the application domain of the database is analyzed by developers, users, and the client (sponsor) of the database.

A conceptual data model describes what kind of data and how data are organized in a database that will be designed with any kind of database system. A conceptual data model is independent of the database system that will end up implementing and manipulating the actual data that correspond to that model.

Logical or design data model
The layer of abstraction of a data model that is dependent on the database system is called logical layer. Data analysts and database designers and implementers create logical data models.

A logical data model describes what kind of data and how data are organized in a database that will be implemented with a database system. The type of the database system determines the type of the logical model. Since the most popular commercial and open source database systems are relational, relational model is the type of logical model we’ll use to model data at the logical layer.

Logical models are also called internal data models. This is because they depend on the database system that will ultimately implement the database.

Logical models are also called design data models. This is because the database system has already been chosen when we do the database design. The goal is to use the database design to implement the database.

Physical or implementation data model
The last layer of abstraction of a data model is the closest to the actual database system and is called the physical layer.

A physical data model represents how datasets are organized in an actual database that is implemented with a given database system and stored with a given operating system. It consists of representations of data that the database system is using, such as SQL commands, SQL script files, and other files (source code files, data files, executable files).

Database application developers and database administrators create physical data models, implement and administer databases, and configure, install, and run the database system and its utility programs. Physical data models are also called implementation data models.

Modeling components
The logical model we use in this course is the relational model. The relational model differs from its corresponding more abstract conceptual model by not allowing many-to-many relationships. Before we define the relational model, we start with an example of a conceptual model to identify the most fundamental modeling elements.

An entity models a set of instances with the same properties. Think of an entity as a concept. An entity models something of interest in the database for which we collect data and to which we give a name. For example, a Customer entity in an information system that manages online book reading models a set of actual customers that the system keeps track of, whose attributes are name, address, contact information, and so on.

An attribute is a named property of an entity and defines a range of values of the same type. For example, a customer's name is a string of characters. In a data model, attributes are atomic or hold values that cannot be broken down into smaller units. In that sense, an attribute is the smallest named property of an entity class that appears in a database.

Entities in a data model must be clearly defined in business or application terms that users of a particular information system fully understand. For example, the business definition of the Customer entity in the context of the online book reading application domain could be: "individual that reads books online using SoAndSo system."

An instance or object of an entity For example, an instance of the Customer entity is identified by the value 1000 for the attribute CustomerID, Sabin for the attribute LastName, and sabin@unh.edu for the attribute WorkEmail.
 * has an identity (can be uniquely identified) and
 * stores attribute values.

Instances of an entity are also called objects, entries, or records.

A unique identifier of an entity stores unique values for the instances of that entity. Unique identifiers are used to uniquely identify each instance of an entity.

For example, an entity class that models rooms in a building, can use a room number as the unique identifier for uniquely identifying each room in a building. Books in a library have similar unique identifiers.

Unique identifiers can correspond to one attribute of an entity. Sometimes, however, no single attribute has unique values for the entity instances. For example, a class roster cannot use student last name as the unique identifier since there is the possibility that one or more students have the same last name.

Unique identifiers are also known as keys. If a key has more than one attributes, then we call it a compound key or composite key. If an entity has more than one keys, only one is selected to be the primary key. A primary key is a key that is used to set up relationships among entities. The database system needs to know the primary keys of all entities in the database.

Relationships among entities model associations or links between instances of the entities involved.

Consider that a book is checked out for online reading. Three entities model data relevant to this user activity: Customer, Book, and BookReading. A Book entity has attributes such as title, genre, publishing year, and online reading price. A BookReading entity has attributes such as checking date and fee.

Customer objects are linked to BookReading objects, which are linked to Book objects. For example, the same Customer object is linked to three BookReading objects, each of which is linked to exactly one Book object. Each Book object is linked to exactly one BookReading object.

To model the number of of objects in one entity that can be linked to any of the objects in another entity, we use the modeling component called cardinality or multiplicity.

A cardinality of a relationship between two entities has two pairs of informational items. There are two pairs because a relationship between two entities has two ends or sides, with each side corresponding to one entity. A pair at each side has two items: maximum cardinality value and minimum cardinality value.

We define first maximum cardinality because the maximum cardinality values, one and many, are used to characterize a relationship as one-to-one, one-to-many, and many-to-many. A maximum cardinality of a relationship has two numbers, one by each end of the relationship. Assume that we have a relationship between entity Customer and entity BookReading. A maximum cardinality number on the Customer end of the relationship is read as the maximum number of Customer objects that can be linked with any BookReading object on the other end of the relationship. A maximum cardinality number of the BookReading end of the relationship is read as the maximum number of BookReading objects that can be linked with any Customer object on the other end of the relationship. We can say the the Customer-BookReading relationship is a one-to-many relationship.

The minimum cardinality has two values, optional and mandatory, which can be denoted by the numbers zero and one. A minimum cardinality of a relationship has two numbers, one by each end of the relationship. Given the example above of a relationship between Customer and BookReading, a minimum cardinality number on the Customer side is one (or mandatory), meaning that the minimum number of Customer objects than can be linked with any BookReading object is one. The minimum cardinality on the BookReading side is zero (or optional), meaning that the minimum number of BookReading objects that can be linked with any Customer object is zero.

Notation and Naming
A notation is a graphical or textual set of rules for representing a model. For example, the Roman alphabet is a notation for representing words in European languages.

Data models can use different notations for their graphical representations.

Information Engineering (IE)
The Information Engineering (IE) notation, also known at the "crow's foot", uses the following symbols to denote cardinality values:
 * oval for minimum cardinality of zero (or optional)
 * stick for minimum cardinality of one (or mandatory)
 * stick for maximum cardinality of one (at most one)
 * crow's foot for maximum cardinality of many (more than one)

Entities are represented by labeled rectangles, where the label is the name of the entity. Entity attributes are listed inside the rectangle that represents the entity.

Relationships between entities are lines between the rectangles representing the entities.

Cardinality values are shown at each end of a line connecting two rectangles. Maximum cardinality values are shown next to the entities. Minimum cardinality values are shown next to the maximum cardinality values and away from the entities. Minimum cardinality values can be omitted from an entity-relationship diagram.

Entities are named by nouns. It is preferable that entity nouns use the singular form. For example, we say Customer entity (not Customers entity). Attributes are named by nouns in the singular form too. Entity and attribute names are capitalized. We will use the Java convention of CamelCase for attribute and entity names. For example, PublishingDate could be an attribute in the Book entity.

Unified Modeling Language (UML)
The Unified Modeling Language (UML) notation uses the following symbols to denote cardinality values:
 * 0 for optional minimum cardinality
 * 1 for mandatory minimum cardinality
 * 1 for maximum cardinality of one
 * * for maximum cardinality of many

We use Violet UML Editor to represent relational models in UML notation by creating class diagrams. With the class diagram tool we can create rectangles to represent classes or entities and various links to re present relationships between them.

When you create a class (entity) rectangle, right click on it to edit its content as follows:
 * Write the name of the entity in the top component of the rectangle.
 * List the attribute names that form the unique identifier in the middle component of the rectangle and label primary keys with (PK)
 * List the rest of the attributes in the bottom component of the rectangle.

When you create a link between two entities, choose "is associated with" solid line and then draw the line by dragging the mouse from one rectangle to the other. Remember the direction in which you did the dragging (from source to target).

To edit a relationship, right click on the line to:
 * Enter minimum and maximum cardinality values on the source side (top component), target side (bottom component)
 * Enter the name of the relationship (middle component).
 * Remove any arrow icon at any of the ends of the line.

The special arrow head for inheritance is used only for the supertype-subtype relationships.

Textual Notation
Many times when we describe in more details an entity or relationship it helps to refer to them in some textual format.

Let's illustrate the textual notation rules on an example. Assume that there are two entity classes, Customer and Invoice. Customer has the attributes CustomerID, LastName, FirstName. Invoice has the attributes InvoiceID, Date, CustomerID. We write these two entities as follows: Customer(CustomerID(PK), LastName, FirstName, Address, State, Zip, Phone, Email)

Invoice(InvoiceID(PK), Date, CustomerID(FK))

=Relational Model=

Relations (or tables)
The relational model represents data with relations.

A relation models an entity with a table of rows and columns that have the following properties:
 * 1) Each row stores actual data that represents a distinct instance of the entity.
 * 2) Each column represents a distinct attribute of the entity. The attribute's values belong to a predefined data type.
 * 3) Each cell has a single attribute value.

Table and relation terms are used interchangeably.

A relational database is a set of named relations (or tables).

The structure of the relation is defined in terms of:
 * set of named columns with defined data types
 * set of unique rows with single values for each column.

Column and attribute are used interchangeably. Row and instance are used interchangeably.

The schema or structure of a relational database is obtained from the structure of each relation in the database.

The instance of a database is the instances (rows) of all the tables in the database.

Primary Key
The unique identifier of an entity corresponds to the concept of primary key of the relation (or table) that models that entity.

A primary key of a relation is an attribute that has unique values across all instances in that relation.

If no single attribute has unique values for all instances, then the primary key has more than one attribute. The set of attributes that make up the primary key is minimal and has unique combined values across the relation.

When more than one attribute form a primary key, the key is called "compound primary key" or "composite primary key".

Notationally, we add (PK) by the name of the attributes that form the primary key.

A primary key can be a natural primary key or surrogate primary key.

A natural key has attributes for which data is collected from the application domain. For example, when the social security number is used as a primary key, it is a natural key with data collected from the real world.

A surrogate key has attributes with data generated by the database system to guarantee that the attributes' values are unique. If we don't want to use the social security number as a primary key, we ask the database system to generate unique numbers instead. In this case, the primary key is a surrogate key. In MySQL server, surrogate attributes have the designation auto_increment.

Relationships
In a relational model, relationships between tables are represented by foreign keys. A foreign key is one or more columns in one table, called child table, that form the primary key in another table, called parent table.

Notationally, we add (FK) by the name of the foreign key column in the child table to indicate which columns represent a foreign key.

One-to-many relationships
The parent-child pattern is used to represent a one-to-many relationship as follows:
 * The parent table has the maximum cardinality of one.
 * The child table has the maximum cardinality of many.
 * The primary key of the parent table has a corresponding foreign key in the child table.
 * Each value of the foreign key in the child table must have a matching value in the primary key of the parent table. This is called the referential integrity constraint or foreign key constraint that is implemented in the physical data model.

For example, Advisor-Student relationship represents advising assignments by which an advisor advises many students, but a student has no more than one advisor. We apply the parent-child patters as follows:
 * Advisor is the parent table because it has the maximum cardinality of one. Its primary key is AdvisorID.
 * Student is the child table because it has the maximum cardinality of many. In addition to its attributes (StudentID, FirstName, LastName, etc.), Student must have a foreign key attribute, AdvisorID, whose values match the values of the primary key in the Advisor parent table.

One-to-one relationships
One-to-one relationships are also modeled in a relational model with the parent-child pattern. Either one of the two tables can be chosen as parent and the other one as child table.

For example, Member-Locker relationship shows that members of a gym club can be assigned lockers such that a locker is used by at most one member and a member is assigned at most one locker. We apply the parent-child pattern as follows:
 * Member table is the parent table and its primary key, MemberID corresponds to a foreign key in the Locker table.
 * Locker table is the child table and in addition to its attributes (LockerID, Type, Size) also has a MemberID foreign key attribute whose values match those in the MemberID primary key of the parent table.

Many-to-many relationships and intersection tables
Many-to-many relationships cannot be represented in relational models with the parent-child pattern and placement of foreign keys. Therefore, many-to-many relationships must be transformed into one-to-many relationships.

An intersection table is a table that relational modelers create to change a many-to-many relationship between two tables into two one-to-many relationships between each of those tables and the newly created intersection table.

For example, Student and ScheduledClass are two relations with a many-to-many relationship between them. The relationship says that a student can register for multiple scheduled classes in a semester, and a scheduled class can enroll multiple students for a given semester. StudentID is the primary key of the Student table. CRN is the primary key of the ScheduledClass table. Student(StudentID(PK), LastName, FirstName, GPA) ScheduledClass(CRN, CourseID, CourseName, Days, Time, Room, Instructor)

The intersection table we create is called StudentClassEnrollment and lists all the students and the classes they enrolled in for a given semester.

What is the primary key of this table? The combination of StudentID and CRN values is unique across all instances in the intersection table. Therefore, (Student ID, CRN) forms the composite primary key of the StudentClassEnrollment table. We also add the attribute EnrollmentType, whose values are "audit", "credit". StudentClassEnrollment(StudentID(PK), CRN(PK), EnrollmentType)

What new relationships do we have? There is one-to-many relationship between Student and StudentClassEnrollment, and one-to-many relationship between ScheduledClass and StudentClass Enrollment. For each relationship we apply the parent-child pattern.

Student is the parent table and StudentClassEnrollment is the child table for the Student-StudentClassEnrollment relationship. It means that the child table must have a foreign key attribute that references the primary key StudentID in the Student table. We designate StudentID in StudentClassEnrollment as FK, in addition to being part of the composite primary key (StudentID, CRN). StudentClassEnrollment(StudentID(PK, FK), CRN(PK), EnrollmentType)

ScheduledClass is the parent table and StudentClassEnrollment is the child table for the ScheduledClass-StudentClassEnrollment relationship. It means that the child table must have a foreign key attribute that references the primary key CRN in the ScheduledClass table. We designate CRN in StudentClassEnrollment as FK, in addition to being part of the composite primary key (StudentID, CRN). StudentClassEnrollment(StudentID(PK, FK), CRN(PK, FK), EnrollmentType)

By adding the description of this intersection table to the two initial descriptions of the Student and ScheduledClass tables, we finish the transformation of the many-to-many relationship into two one-to-many relationships. Instead of two tables, we have three, two parent tables and the child intersection table, StudentClassEnrollment.

=Database Development Practice= To gain practice with database development, we give examples that illustrate analysis, design, and implementation techniques for various databases.

For each example we state the database requirements gathered from the application domain in which the database can be used. Three modeling activities are used to transform requirements into a database implementation:
 * Analysis - produces a conceptual or analysis data model of the database
 * Design - produces a relational data model of the database
 * Implementation - produces the actual SQL code that implements the database.

For some examples, the design activity is done by using normalization.

Examples

 * Internship Placement
 * Project Assignments - has only design diagram and minimal analysis model
 * Course Schedule - design with normalization
 * Book Rental - design with normalization

=SQL Programming=

Naming conventions
In MySQL server, databases correspond to directories in the data directory. A table corresponds to at least one file in the database directory. Triggers, too, correspond to files.

The case sensitivity of the operating system in which MySQL server runs affects the case sensitivity of the identifiers we choose to name databases, tables, and triggers: An exception is Mac OS X, which is Unix-based, but uses a default non-case sensitive file system. Just to reinforce how naming business can get!
 * Identifiers are not case sensitive in Windows, but
 * Identifiers are case sensitive in most varieties of Unix.

Identifiers for columns, column aliases, indexes, stored routines, and event names are not case sensitive on any platform.

Why are naming conventions important? There are two reasons. When naming is consistent, readability is improved and errors are more easily spotted. Secondly, portability to other platforms can be optimized with good naming conventions.

The naming conventions we adopt are:

Rule 1 : Names of script files, databases, and tables use only lowercase words separated by underscores.

Rule 2 : Names of columns use only camelCase naming convention: not capitalized with subsequent words in the name capitalized.

Bank Database Query Examples
Advanced queries that use join, group by, and having clauses.

Inner Join Examples
Join across many tables in the database to list all the product type names of all the individuals. The output shows product type names and individual names.

List first name and last name of all the individuals with bank accounts, along the account IDs of their accounts.

Left and Right Join Examples
There are accounts in the bank database that do not belong to individuals. To show which accounts do not belong to individuals, we write a join that lists all accounts (showing account IDs) regardless of whether they have matching individuals.

The meaning of account left join customer is to indicate that the table name on the left (in our case account) will show ALL its accountID values. The account IDs which do not have matching customers, will show NULL for customer ID.

The meaning of customer left join individual is to indicate that the table name on the left (in our case customer) will show ALL its customer ID values. The customer IDs which do not have matching individuals, will show NULL for individual ID.

The meaning of individual right join account is to indicate that the table name on the right (in our case account) will show ALL its account ID values regardless of matching individuals. The account IDs which do not have matching individuals, will show NULL for the individual last name and first name.

Grouping Examples
How many accounts does each employee have? To answer this question we first list all account IDs and employee IDs in the account table. We notice that the employee ID attribute is open_emp_id. This attribute is a foreign key in employee table.

To improve the readability of this result, we then inner join the account table with employee table and list account IDs along with employees' last names.

The output of the query above shows that each of the employees has opened many accounts. To count how many accounts were opened by each employee we must do two things:
 * 1) group by employee ID
 * 2) use the aggregate function count, which we apply to account_id (on the SELECT clause)

To improve readability, we add in the SELECT clause a suggestive attribute that is associated with the attribute on which the grouping was done. In our case, we also list e.lname that correspond to the e.emp_id on which grouping was done.

Having Example
To filter out instances from the result table obtained from grouping, we use the clause having. Note that this clause is always used with a group by clause.

It is an error to use where for filtering out instances from the result table obtained from a grouping operation.

To find out which employees have opened fewer than 5 accounts, we add having to the group by clause. Note that the condition is on the aggregate function used to do grouping.

In the example above, other aggregate functions can be used. For example, we can list which employees have opened accounts whose total available balance is less than a certain amount. The having condition can be further complicated by combining, let's say, number of accounts with total value of available balance for those accounts. For example:

Grouping with other aggregate functions
=Normalization=

Normalization is a data modeling technique to organize data into relations. Review the relation's properties presented in the previous section. Note that if all three conditions are met, then the relation is considered in the first normal form.

Normalization Procedures

Why Normalize?
Why is normalization needed? Normalization is a very important data modeling technique because it reduces the occurrence of anomalies due to inserting, deletion, and updating of data.

Insert Anomaly
Insert anomaly occurs when inserting a new instance in an entity will leave some attributes with missing data. This happens when an entity combines attributes from two related entities and data from one entity only is available at the time of insertion.

Delete Anomaly
Delete anomaly occurs when deleting an existing instance from an entity will cause loss of data. This happens when an entity combines attributes from two related entities and data from one entity represents the last occurrence of a particular instance of that entity.

Update Anomaly
Update anomaly occurs when updating an existing instance from an entity will cause inconsistent data. This happens when an entity combines attributes from two related entities and data from one entity occurs repeatedly and needs to be updated in multiple places.

How to Normalize?
Normalization starts with making up an entity that lists all the attributes extracted from forms and documents that are used to keep track of data in the absence of a database. Sometimes, an entity is created from attribute information on each form that users fill out to collect data of interest. At this stage, as we come up with these initial entities, the only condition that we enforce is that each such entity have a unique identifier. If the unique identifier has too many attributes, then we create a single-attribute unique identifier with which we uniquely identify each instance in our entity.

First Normal Form
To define the first normal form, we must first define the multivalued attribute concept. A multivalued attribute of an entity contains more than one value for some of the instances of that entity.

For example, if Person entity has the attribute Email such that two or more email values are listed for the same Person instance, then the Email attribute is multivalued.

A repeating group is a set of multivalued attributes with the same number of multiple values.

For example, if a Course entity has information about the class sections scheduled to be offered for a course, then attributes such as SectionID, Room, Date, and Time are multivalued in the same way. We call these related multivalued attributes a repeating group in the original entity.

A relation is in the first normal form if the relation does not have multivalued attributes or repeating groups.
 * Definition:

If we review the definition of a relation, we see that the first normal form condition is similar to the last property of the relation: each cell has a single attribute value.

To transform an entity into a relation that is in the first normal form, we take the following steps:
 * Step Zero: Use textual notation to represent the relation we transform.
 * Step One: Choose a unique identifier in this relation, and then label it as a PK; or, if there are multiple candidate keys available, find and designate as PK the least changeable, simplest, and shortest candidate.
 * Step Two: Analyze relation for multivalued attributes. Enclose repeating groups and any individual multivalued attributes within parentheses.
 * Step Three: Create and name a new relation for the repeating group or individual multivalued attribute.
 * Step Four: Copy the PK-designated attribute(s) from the original relation into the new relation and designate it as PK in the new relation, too.
 * Step Five: In the new relation, designate as FK the PK attribute(s). The new relation (child) references the original relation (parent), which enforces a one-to-many relationship between them.
 * Step Six: Move the repeating group or individual multivalued attribute from the original relation to the new relation.
 * Step Seven: Choose the PK for the new relation by extending the PK determined at Step Four with an unique identifier of the repeating group.
 * Step Final: Use textual notation to represent the two relations obtained from 1NF transformation.

Second Normal Form

 * Definition: A relation is in the second normal form if:
 * The relation is in the first normal form.
 * All nonkey attributes are functionally dependent on the entire primary key.

This definition introduced three new concepts: nonkey, entire primary key, and functional dependency. Let's take them in order, from what appears simple to what sounds like complicated.

Here, by nonkey attribute we mean an attribute that is not the primary key or part of the primary key.

The entire primary key makes sense if and only if the primary key is a composite key. We then learn that to normalize a relation to its second normal form the relation must have a composite key in the first place.

To define functional dependency we consider two generic attributes, A and B, in some fictitious entity X.

We say that B is functionally dependent on A in the entity X if for a given value of A there is at most one value of B. Another way of saying the same thing is to say that A is a determinant of B.

If we find that in an entity that has a composite key there is a nonkey attribute that is functionally dependent on part of the attributes in the primary key, then we have follow the following steps to transform the entity into the second normal form:
 * Step Zero: Verify that the relation has a composite key as unique identifier. Relations with composite keys are the only relations that lend themselves to 2NF violations, as 2NF is concerned only with nonkey attributes that have partial dependence on a composite unique identifier.
 * Step One: Analyze the relation for any partial dependency; that is, part of the primary composite is a determinant (unique identifier) of nonkey attribute(s). If no partial dependency is found, the relation is in 2NF and transformation process stops here. Otherwise, move on to the next step.
 * Step Two: Identify the partially dependent attributes and their determinant (part of the composite key).
 * Step Three: Create and name a new relation for the attributes identified at the previous step.
 * Step Four: Copy the determinant (part of the composite key of the original relation) into this new relation. Designate it as PK.
 * Step Five: Move the partially dependent attributes into this new relation. The new relation (parent) has a one-to-many relationship with the original relation (child).
 * Step Six: In the original relation, designate as FK the determinant that references the PK attribute in the new relation. This enforces a one-to-many relationship between the new relation (parent) and original relation (child). The original relation is now in 2NF because all its nonkey attributes are now functionally dependent on the entire primary key.
 * Step Final: Use textual notation to represent the two relations obtained from 2NF transformation.

Third Normal Form

 * Definition: A relation is in the third normal form if:
 * The relation is in the second normal form.
 * There is no transitive dependence (that is, all the non-key attributes depend only on the primary key).

This definition introduces one new concept: transitive dependency. To explain transitive dependency we consider two generic non-key attributes, A and B, in some fictitious entity X.

We say that B is transitively dependent on A in the entity X if A is a determinant of B, AND both A and B are non-key attributes.


 * Step One: Analyze the relation for any transitive dependency; that is, there is at least one nonkey attribute that is a determinant (unique identifier) of other nonkey attribute(s). If no transitive dependency is found, the relation is in 3NF and transformation process stops here. Otherwise, move on to the next step.
 * Step Two: Create and name a new relation for the transitively dependent attributes.
 * Step Three: Copy the determinant of the transitively dependent attributes into the new relation. Designate it as PK.
 * Step Four: Move all nonkeys that are transitively dependent on the determinant into the new relation. The new relation (parent) has a one-to-many relationship with the original relation (child).
 * Step Five: In the original relation, designate as FK the determinant that references the PK attribute in the new relation. This enforces the one-to-many relationship between the new relation (parent) and original relation (child). The relation is now in 3NF because all nonkey attributes depend only on the primary key.
 * Step Final: Use textual notation to represent the two relations obtained from 3NF transformation.