Wednesday, 24 July 2013

RDBMS Study Material Part 1



Relational database design and SQL
E-R Diagram and their transformation to relation design, normalization -INF, 2NF and BCNF.
SQL: Data definition language (DDL) Data manipulation language (DML), Data control language (DCL) commands database objects like. Views indexes, sequences, synonyms, data dictionary.
Entity Relationship Modeling
An entity-relationship (ER) diagram is a specialized graphic that illustrates the relationships between entities in a database. ER diagrams often use symbols to represent three different types of information. Boxes are commonly used to represent entities. Diamonds are normally used to represent relationships and ovals are used to represent attributes.
An Entity is a thing or object in real world that is distinguishable from surrounding environment. For example each employee of an organization is a separate entity. Normally, each entity is stored in a database table and every instance of an entity corresponds to a row in that table. In an ER diagram, each entity is depicted as a rectangular box with the name of the entity contained within it.

A database containing information about individual people would likely have an entity called Person. This would correspond to a table with the same name in the database and every person tracked in the database would be an instance of that Person entity and have a corresponding row in the Person table. Database designers creating an E-R diagram would draw the Person entity using a shape similar to this:

Databases contain information about each entity. This information is tracked in individual fields known as attributes, which normally correspond to the columns of a database table.
For example, the Person entity might have attributes corresponding to the person's first and last name, date of birth, and a unique person identifier. Each of these attributes is depicted in an E-R diagram as an oval, as shown in the figure below:



Each attribute can have Values. In most cases single attribute have one value. But it is possible for attributes have multiple values also. Entities can have relationships with each other. In Entity Relationship Modeling, we model entities, their attributes and relationships among entities.

Relationships and Cardinality

The power of the E-R diagram lies in its ability to accurately display information about the relationships between entities. For example, we might track information in our database about the city where each person lives. Information about the city itself is tracked within a City entity and a relationship is used to tie together Person and City instances.

Relationships are normally given names that are verbs, while attributes and entities are named after nouns. This convention makes it easy to express relationships. For example, if we name our Person/City relationship "Lives In", we can string them together to say "A person lives in a city." We express relationships in E-R diagrams by drawing a line between the related entities and placing a diamond shape that contains the relationship name in the middle of the line.
There are some additional shapes on the line. The double hashed line appearing just to the left of the City entity indicates that this part of the relationship has a cardinality of 1. On the other hand, the crow's foot symbol to the right of the Person entity indicates that this part of the relationship has a cardinality of "many". Stated more plainly, each person may live in only one city, while a city may contain many people.


Types of Database Relationships

There are three different types of database relationships, each named according to the number of table rows that may be involved in the relationship. Each of these three relationship types exists between two tables.
  • One-to-one relationships occur when each entry in the first table has one, and only one, counterpart in the second table. One-to-one relationships are rarely used because it is often more efficient to simply put all of the information in a single table.
  • One-to-many relationships are the most common type of database relationship. They occur when each record in the first table corresponds to one or more records in the second table but each record in the second table corresponds to only one record in the first table. For example, the relationship between a Teachers table and a Students table in an elementary school database would likely be a one-to-many relationship, because each student has only one teacher, but each teacher may have multiple students.
  • Many-to-many relationships occur when each record in the first table corresponds to one or more records in the second table and each record in the second table corresponds to one or more records in the first table. For example, the relationship between a Teachers and a Courses table would likely be many-to-many because each teacher may instruct more than one course and each course may have more than one instructor.

Normalization

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and reducing data dependencies (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms, and they are first normal form (1NF), second normal form (2NF), and third normal form (3NF) along with the occasional 4NF.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:
  • Eliminate duplicative columns from the same table.
  • Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
The first rule dictates that we must not duplicate data within the same row of a table. Within the database community, this concept is referred to as the atomicity of a table. Tables that comply with this rule are said to be atomic.

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:
  • Meet all the requirements of the first normal form.
  • Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
  • Create relationships between these new tables and their predecessors through the use of foreign keys.
These rules can be summarized in a simple statement: 2NF attempts to reduce the amount of redundant data in a table by extracting it, placing it in new table(s) and creating relationships between those tables.


Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:
  • Meet all the requirements of the second normal form.
  • Remove columns that are not dependent upon the primary key.( Has no transitive functional dependencies)

Boyce-Codd Normal Form (BCNF or 3.5NF)

The Boyce-Codd Normal Form also referred to as the "third and half (3.5) normal form", adds one more requirement:
  • Meet all the requirements of the third normal form.
  • Every determinant must be a candidate key.

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:
  • Meet all the requirements of the third normal form.
  • A relation is in 4NF if it has no multi-valued dependencies.
Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.

No comments: