
B & c


Page 1: B & c

What is RDBMS?

RDBMS stands for Relational Database Management System. RDBMS data is structured in database tables, fields and records. Each table consists of rows, and each row consists of one or more fields.

An RDBMS stores data in a collection of tables, which may be related by common fields (database table columns). It also provides relational operators to manipulate the data stored in the tables. Most RDBMSs use SQL as the database query language.

A relational database management system (RDBMS) is a program that lets you create, update, and administer a relational database. Most commercial RDBMSs use the Structured Query Language (SQL) to access the database, although SQL was invented after the development of the relational model and is not necessary for its use.

The leading RDBMS products are Oracle, IBM's DB2 and Microsoft's SQL Server. Despite repeated challenges by competing technologies, as well as the claim by some experts that no current RDBMS has fully implemented relational principles, the majority of new corporate databases are still being created and managed with an RDBMS.

Page 2: B & c

Keys in DBMS

A key is a column (attribute), or a set of columns, of a database table that is used to identify records. For example, if a table has id, name and address as its columns, each of them could take part in a key, but only a column (or combination of columns) whose values are unique for every record, such as id, actually works as one. The following are the various types of keys available in a DBMS.

A simple key contains a single attribute.

A composite key is a key that contains more than one attribute.

A superkey is any set of attributes that uniquely identifies a row.

A candidate key is a minimal superkey, which means it contains the minimum number of attributes. A superkey can contain redundant attributes, but a candidate key contains only those attributes that are required to uniquely determine records in the table.

Page 3: B & c

For example, if ABC is a superkey with three attributes A, B and C, and A and B together are sufficient to determine the rows in the table (while neither A nor B is sufficient on its own), then AB is a candidate key.

 

A primary key is the key that is selected as the principal unique identifier by the database schema designer. The primary key is usually the key selected to identify a row when the database is physically implemented. Serial number, roll number and invoice id are examples of primary keys.

A foreign key is an attribute (or set of attributes) that appears (usually) as a non-key attribute in one relation and as a primary key attribute in another relation. I say usually because it is possible for a foreign key to also be the whole or part of a primary key:

• A many-to-many relationship can only be implemented by introducing an intersection or link table, which then becomes the child in two one-to-many relationships. The intersection table therefore has a foreign key for each of its parents, and its primary key is a composite of both foreign keys.

• A one-to-one relationship requires that the child table has no more than one occurrence for each parent, which can be enforced by letting the foreign key also serve as the primary key (or by otherwise constraining it to be unique).
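The intersection-table idea can be sketched with Python's built-in sqlite3 module. The table and column names (student, module, enrolment) and the sample rows are assumptions made up for this illustration; SQLite is used only because it ships with Python, and it enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE module  (module_code TEXT PRIMARY KEY, title TEXT NOT NULL);
    -- Intersection table: one foreign key per parent, composite primary key.
    CREATE TABLE enrolment (
        student_id  INTEGER NOT NULL REFERENCES student(student_id),
        module_code TEXT    NOT NULL REFERENCES module(module_code),
        PRIMARY KEY (student_id, module_code)
    );
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO module VALUES (?, ?)",
                 [("M1", "Databases"), ("M2", "Networks")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(1, "M1"), (1, "M2"), (2, "M1")])


def try_insert(student_id, module_code):
    """Return True if the enrolment row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO enrolment VALUES (?, ?)", (student_id, module_code))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(1, "M1")  # same pair again: violates the composite primary key
orphan_ok = try_insert(3, "M1")     # student 3 does not exist: violates the foreign key
modules_of_ann = [row[0] for row in conn.execute(
    "SELECT module_code FROM enrolment WHERE student_id = 1 ORDER BY module_code")]
```

One student can appear in many enrolment rows and so can one module, which is what makes the relationship many-to-many, while the composite primary key stops the same pairing from being recorded twice.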

Example for Keys

Page 4: B & c

Super Key

A Super key is any combination of fields within a table that uniquely identifies each record within that table.

Candidate Key

A candidate key is a minimal super key: a single field or the least combination of fields that uniquely identifies each record in the table. The least combination of fields distinguishes a candidate key from a super key. Every table must have at least one candidate key but at the same time can have several.

Page 5: B & c

As an example we might have a student_id that uniquely identifies the students in a student table. This would be a candidate key. But in the same table we might have the student’s first name and last name that also, when combined, uniquely identify the student in a student table. These would both be candidate keys.

To be eligible as a candidate key, a field or combination of fields must pass certain criteria:

• It must contain unique values

• It must not contain null values

• It contains the minimum number of fields to ensure uniqueness

• It must uniquely identify each record in the table

Once your candidate keys have been identified you can select one to be your primary key.

Page 6: B & c

Primary Key

A primary key is a candidate key that is most appropriate to be the main reference key for the table. As its name suggests, it is the primary key of reference for the table and is used throughout the database to help establish relationships with other tables. As with any candidate key, the primary key must contain unique values, must never be null and must uniquely identify each record in the table.

As an example, a student id might be a primary key in a student table, and a department code in a table of all departments in an organization. This module has the code DH3D 35 that is no doubt used in a database somewhere to identify RDBMS as a unit in a table of modules. In the table below we have selected the candidate key student_id to be our most appropriate primary key.

Primary keys are mandatory for every table, and each record must have a value for its primary key. When choosing a primary key from the pool of candidate keys, always choose a single simple key over a composite key.

Page 7: B & c

Foreign Key

A foreign key is generally a primary key from one table that appears as a field in another where the first table has a relationship to the second. In other words, if we had a table A with a primary key X that linked to a table B where X was a field in B, then X would be a foreign key in B.

An example might be a student table that contains the course_id the student is attending. Another table lists the courses on offer with course_id being the primary key. The 2 tables are linked through course_id and as such course_id would be a foreign key in the student table.
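The student and course example can be tried out with a small sketch using Python's sqlite3 module. Only course_id comes from the text above; the other column names and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the link
conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        course_id  TEXT REFERENCES course(course_id)  -- foreign key into course
    )
""")
conn.execute("INSERT INTO course VALUES ('C1', 'Computing')")
conn.execute("INSERT INTO student VALUES (1, 'Ann', 'C1')")

# The foreign key lets us follow the link from a student to the course details.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM student AS s
    JOIN course AS c ON c.course_id = s.course_id
""").fetchall()

# A student pointing at a course that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Bob', 'C9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```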

Page 8: B & c

Secondary Key or Alternative Key

A table may have one or more choices for the primary key. Collectively these are known as candidate keys, as discussed earlier. One is selected as the primary key. Those not selected are known as secondary keys or alternative keys.

For example, in the table showing candidate keys above we identified two candidate keys, studentId and firstName + lastName. The studentId would be the most appropriate for a primary key, leaving the other candidate key as a secondary or alternative key. It should be noted that for firstName + lastName to be a candidate key, we are assuming you will never have two people with the same first and last name combination. As this assumption is unlikely to hold, we might consider firstName + lastName to be a suspect candidate key, as it would be restrictive of the data you might enter. It would seem a shame to not allow John Smith onto a course just because there was already another John Smith.

Simple Key

Any of the keys described before (i.e. primary, secondary or foreign) may comprise one or more fields. For example, if firstName and lastName was our key this would be a key of two fields, whereas studentId is only one. A simple key consists of a single field to uniquely identify a record. In addition, the field in itself cannot be broken down into other fields. For example, studentId, which uniquely identifies a particular student, is a single field and therefore is a simple key. No two students would have the same student number.

Page 9: B & c

Compound Key

A compound key consists of more than one field to uniquely identify a record. A compound key is distinguished from a composite key because each field that makes up the primary key is also a simple key in its own right. An example might be a table that represents the modules a student is attending. This table has a studentId and a moduleCode as its primary key. Each of the fields that make up the primary key is a simple key, because each represents a unique reference when identifying a student in one instance and a module in the other.

Composite Key

A composite key consists of more than one field to uniquely identify a record. This differs from a compound key in that one or more of the attributes that make up the key are not simple keys in their own right. Taking the example from compound key, imagine we identified a student by their firstName + lastName. In our table representing students on modules our primary key would now be firstName + lastName + moduleCode. Because firstName and lastName only form a unique reference to a student when combined, neither of them is a simple key on its own. Therefore the key for this table is a composite key.

Page 10: B & c

Introduction to Data Integrity

It is important that column data adhere to a predefined set of rules, as determined by the database administrator or application developer.

For example, some columns in a database table can have specific rules that constrain the data contained within them. These constraints can affect how data columns in one table relate to those in another table.

Data Integrity Rules

This section describes the rules that can be applied to table columns to enforce different types of data integrity.

• Null rule: A null rule is a rule defined on a single column that allows or disallows inserts or updates of rows containing a null (the absence of a value) in that column.

• Unique column values: A unique value rule defined on a column (or set of columns) allows the insert or update of a row only if it contains a unique value in that column (or set of columns).

• Primary key values: A primary key value rule defined on a key (a column or set of columns) specifies that each row in the table can be uniquely identified by the values in the key.
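A minimal sketch of the null, unique and primary key rules, using Python's sqlite3 module. The department table, its columns and the sample values are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE department (
        dept_code TEXT PRIMARY KEY,   -- primary key rule
        name      TEXT NOT NULL,      -- null rule: nulls are disallowed
        phone     TEXT UNIQUE         -- unique column values rule
    )
""")
conn.execute("INSERT INTO department VALUES ('D1', 'Sales', '555-0100')")


def violates(params):
    """Return True if inserting the row breaks one of the integrity rules."""
    try:
        conn.execute("INSERT INTO department VALUES (?, ?, ?)", params)
        return False
    except sqlite3.IntegrityError:
        return True


null_rejected = violates(("D2", None, "555-0101"))          # name is null
unique_rejected = violates(("D3", "Support", "555-0100"))   # phone already used
pk_rejected = violates(("D1", "Legal", "555-0102"))         # dept_code already used
row_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
```

Each rejected statement leaves the table unchanged, so only the first valid row remains.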

Page 11: B & c

Referential integrity also includes the rules that dictate what types of data manipulation are allowed on referenced values and how these actions affect dependent values. The rules associated with referential integrity are:

• Restrict: Disallows the update or deletion of referenced data.

• Set to null: When referenced data is updated or deleted, all associated dependent data is set to NULL.

• Set to default: When referenced data is updated or deleted, all associated dependent data is set to a default value.

• Cascade: When referenced data is updated, all associated dependent data is correspondingly updated. When a referenced row is deleted, all associated dependent rows are deleted.

• No action: Disallows the update or deletion of referenced data. This differs from RESTRICT in that it is checked at the end of the statement, or at the end of the transaction if the constraint is deferred. (Oracle Database uses No Action as its default action.)

• Complex integrity checking: A user-defined rule for a column (or set of columns) that allows or disallows inserts, updates, or deletes of a row based on the value it contains for the column (or set of columns).
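The referential actions listed above can be observed in a small sketch with Python's sqlite3 module. The table names (course, enrolment, tutor, textbook) are hypothetical, and the behaviour shown is SQLite's; other database products may differ in defaults and in which actions they support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite applies referential actions only when enabled
conn.executescript("""
    CREATE TABLE course (course_id TEXT PRIMARY KEY);
    CREATE TABLE enrolment (
        id INTEGER PRIMARY KEY,
        course_id TEXT REFERENCES course(course_id) ON DELETE CASCADE
    );
    CREATE TABLE tutor (
        id INTEGER PRIMARY KEY,
        course_id TEXT REFERENCES course(course_id) ON DELETE SET NULL
    );
    CREATE TABLE textbook (
        id INTEGER PRIMARY KEY,
        course_id TEXT REFERENCES course(course_id) ON DELETE RESTRICT
    );
    INSERT INTO course VALUES ('C1'), ('C2');
    INSERT INTO enrolment VALUES (1, 'C1');
    INSERT INTO tutor VALUES (1, 'C1');
    INSERT INTO textbook VALUES (1, 'C2');
""")

# Deleting C1 cascades to enrolment and sets the tutor's reference to NULL.
conn.execute("DELETE FROM course WHERE course_id = 'C1'")
enrolments_left = conn.execute("SELECT COUNT(*) FROM enrolment").fetchone()[0]
tutor_course = conn.execute("SELECT course_id FROM tutor WHERE id = 1").fetchone()[0]

# Deleting C2 is disallowed because a textbook row still references it.
try:
    conn.execute("DELETE FROM course WHERE course_id = 'C2'")
    restricted = False
except sqlite3.IntegrityError:
    restricted = True
```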

Page 12: B & c

Advantages of Integrity Constraints

This section describes some of the advantages that integrity constraints associated with database tables have over other alternatives. These alternatives are:

• Enforcing business rules in the code of a database application

• Using stored procedures to completely control access to data

• Enforcing business rules with triggered stored database procedures

Page 13: B & c

Dr. E. F. Codd's 12 rules for relational database:

0. Foundation Rule

A relational database management system must manage its stored data using only its relational capabilities.

1. Information Rule

All information in the database should be represented in one and only one way - as values in a table.

2. Guaranteed Access Rule

Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.

3. Systematic Treatment of Null Values

Null values (distinct from empty character string or a string of blank characters and distinct from zero or any other number) are supported in the fully relational DBMS for representing missing information in a systematic way, independent of data type.

Page 14: B & c

4. Dynamic On-line Catalog Based on the Relational Model

The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data.

5. Comprehensive Data Sublanguage Rule

A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and whose ability to support all of the following is comprehensive:

• data definition

• view definition

• data manipulation (interactive and by program)

• integrity constraints

• authorization

• transaction boundaries (begin, commit, and rollback)

6. View Updating Rule

All views that are theoretically updateable are also updateable by the system.

Page 15: B & c

7. High-level Insert, Update, and Delete

The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update, and deletion of data.

8. Physical Data Independence

Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.

9. Logical Data Independence

Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.

10. Integrity Independence

Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.

Page 16: B & c

11. Distribution Independence

The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed.

12. Nonsubversion Rule

If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language.

Page 17: B & c

Relational algebra

In computer science, relational algebra is an offshoot of first-order logic and of the algebra of sets, concerned with operations over finitary relations. These are usually made more convenient to work with by identifying the components of a tuple by a name (called an attribute) rather than by a numeric column index; such a structure is called a relation in database terminology.

The main application of relational algebra is providing a theoretical foundation for relational databases, particularly query languages for such databases, chief among which is SQL.

An algebra is a formal structure consisting of sets and operations on those sets.

Relational algebra is a formal system for manipulating relations.

Operands of this algebra are relations.

Operations of this algebra include the usual set operations (since relations are sets of tuples), and special operations defined for relations:

• selection

• projection

• join
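The three special operations can be sketched in plain Python, treating a relation as a list of dictionaries that map attribute names to values. The function names and the sample students and courses relations are assumptions for illustration, not part of any library.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Natural join: combine tuples that agree on every shared attribute name."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                result.append({**l_row, **r_row})
    return result


students = [
    {"student_id": 1, "name": "Ann", "course_id": "C1"},
    {"student_id": 2, "name": "Bob", "course_id": "C2"},
    {"student_id": 3, "name": "Cy", "course_id": "C1"},
]
courses = [
    {"course_id": "C1", "title": "Computing"},
    {"course_id": "C2", "title": "Business"},
]

on_c1 = select(students, lambda row: row["course_id"] == "C1")
course_ids = project(students, ["course_id"])
joined = natural_join(students, courses)
```

Note that projection removes the duplicate C1 entry, because a relation is a set of tuples and cannot hold the same tuple twice.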

Page 18: B & c

Relational set operators use relational algebra to manipulate the contents of a database. Altogether there are eight operators. Some of them correspond directly to SQL keywords, while others are expressed in SQL through other constructs.

The first operator is UNION. It combines all of the rows in one table with all of the rows in another table, except for duplicate tuples. The tables must be union-compatible, which means that the two tables have the same number of columns and that corresponding columns share the same domain (many textbooks also require matching column names).

INTERSECT is the second operator. It takes two tables and returns only the rows that appear in both. The tables must be union-compatible for INTERSECT to work.

DIFFERENCE is another operator (written EXCEPT in standard SQL and MINUS in Oracle) that returns all rows in one table that are not found in the other table. In effect it subtracts one table from the other, leaving only the rows of the first table that do not appear in the second. For this operator to work, both tables must be union-compatible.

Page 19: B & c

PRODUCT shows all possible pairs of rows from the two tables being used. It is also referred to as the Cartesian product.

SELECT returns the rows of a table. Without a condition it shows all rows; with a condition it returns only the rows that meet certain criteria. This operator is also referred to as RESTRICT.

PROJECT returns all values for the attributes specified after the command. It gives a vertical view of the given table.

JOIN takes two or more tables and combines them into one table. It can be used in combination with other operators to get specific information. There are several types of join: the natural join, equijoin, theta join, left outer join and right outer join.

DIVIDE has specific requirements for its tables. In the simplest case, one table (the divisor) has a single column and the other (the dividend) has two columns, one of which matches the divisor's column. The result lists the values of the remaining column that are paired with every value in the divisor.
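Because relations are sets of tuples, the set-style operators map closely onto Python's built-in set type. The sketch below assumes union-compatible relations written as sets of plain tuples; the `divide` helper and the sample data are hypothetical and only cover the simple two-column by one-column case.

```python
from itertools import product

a = {("Ann", "C1"), ("Bob", "C2")}
b = {("Bob", "C2"), ("Cy", "C3")}

union = a | b          # UNION: rows from either table, duplicates removed
intersect = a & b      # INTERSECT: rows found in both tables
difference = a - b     # DIFFERENCE: rows of a that are not in b
cartesian = {x + y for x, y in product(a, b)}  # PRODUCT: every pairing of rows


def divide(dividend, divisor):
    """DIVIDE a two-column relation (x, y) by a one-column relation (y,).

    Returns the set of x values that are paired with every y in the divisor.
    """
    needed = {y for (y,) in divisor}
    xs = {x for x, _ in dividend}
    return {x for x in xs if needed <= {y for x2, y in dividend if x2 == x}}


takes = {("Ann", "M1"), ("Ann", "M2"), ("Bob", "M1")}
required = {("M1",), ("M2",)}
took_all = divide(takes, required)  # students who take every required module
```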

Page 20: B & c

Normalization

In creating a database, normalization is the process of organizing it into tables in such a way that the results of using the database are always unambiguous and as intended. Normalization may have the effect of duplicating data within the database and often results in the creation of additional tables. (While normalization tends to increase the duplication of data, it does not introduce redundancy, which is unnecessary duplication.) Normalization is typically a refinement process after the initial exercise of identifying the data objects that should be in the database, identifying their relationships, and defining the tables required and the columns within each table.

A simple example of normalizing data might consist of a table showing:

Customer    Item purchased    Purchase price

Thomas      Shirt             $40

Maria       Tennis shoes      $35

Evelyn      Shirt             $40

Pajaro      Trousers          $25

Page 21: B & c

Normalization degrees of relational database tables have been defined and include:

First normal form (1NF). This is the "basic" level of normalization and generally corresponds to the definition of any database, namely:

• It contains two-dimensional tables with rows and columns.

• Each column corresponds to a sub-object or an attribute of the object represented by the entire table.

• Each row represents a unique instance of that sub-object or attribute and must be different in some way from any other row (that is, no duplicate rows are possible).

• All entries in any column must be of the same kind. For example, in the column labeled "Customer," only customer names or numbers are permitted.

Page 22: B & c

Second normal form (2NF).

At this level of normalization, each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table. For example, in a table with three columns containing customer ID, product sold, and price of the product when sold, the price would be a function of the customer ID (entitled to a discount) and the specific product.

Third normal form (3NF).

At the second normal form, modifications are still possible because a change to one row in a table may affect data that refers to this information from another table. For example, using the customer table just cited, removing a row describing a customer purchase (because of a return perhaps) will also remove the fact that the product has a certain price. In the third normal form, these tables would be divided into two tables so that product pricing would be tracked separately.
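The effect of removing a purchase row can be sketched in plain Python. The rows reuse the sample customer table shown earlier; the helper `known_prices` and the variable names are assumptions made for this illustration.

```python
purchases = [
    ("Thomas", "Shirt", 40),
    ("Maria", "Tennis shoes", 35),
    ("Evelyn", "Shirt", 40),
    ("Pajaro", "Trousers", 25),
]


def known_prices(rows):
    """Collect the item prices that can still be read from the purchase rows."""
    return {item: price for _, item, price in rows}


# Single table: removing Pajaro's purchase also removes the only record
# of what Trousers cost.
after_return = [row for row in purchases if row[0] != "Pajaro"]
price_lost = "Trousers" not in known_prices(after_return)

# Two tables: prices are tracked separately from who bought what,
# so the same removal leaves the price list untouched.
prices = known_prices(purchases)
customer_items = [(customer, item) for customer, item, _ in purchases]
customer_items = [row for row in customer_items if row[0] != "Pajaro"]
price_kept = prices.get("Trousers")
```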

Page 23: B & c

What is Normalization?

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.

Page 24: B & c

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

• Eliminate duplicative columns from the same table.

• Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

• Meet all the requirements of the first normal form.

• Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

• Create relationships between these new tables and their predecessors through the use of foreign keys.

Page 25: B & c

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

• Meet all the requirements of the second normal form.

• Remove columns that are not dependent upon the primary key.

Boyce-Codd Normal Form (BCNF or 3.5NF)

The Boyce-Codd Normal Form, also referred to as the "third and a half (3.5) normal form", adds one more requirement:

• Meet all the requirements of the third normal form.

• Every determinant must be a candidate key.

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:

• Meet all the requirements of the Boyce-Codd normal form.

• A relation is in 4NF if it has no non-trivial multi-valued dependencies other than those determined by a candidate key.

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.

Page 26: B & c

Example To Show Normalization

Suppose we are to manage all the data of a company (say, My Company). The company must keep track of all the employees, customers, product details and the salary details of all the employees. A simple and straightforward way to do this is to put all this information into a single table and manage it all together (see the table below).

Looking at such a table, you may feel that it is perfectly fine. After all, what is the problem with it? We have a big table; we have all the information required by the company together in a single place, apparently saving a lot of memory. Well and good!

But now think: suppose we need to frequently retrieve or update data about just the employees. Do the customers' information or the product details really matter here? Definitely not. So why use the entire table when we need just a part of it? We need a solution to this, and the solution is normalization. The results of normalization are called normal forms. Let us study the most popular and widely used normal forms.

Page 27: B & c

The First Normal Form

To solve the above problem, the first and foremost thing to be done is to divide the entire raw database into smaller tables based on the actual groupings. When each table has been designed, a primary key is assigned to most or all tables. Note that the primary key must be a unique value, so try to select a data element for the primary key that naturally uniquely identifies a specific piece of data.

So, let us take up the same previous example and prepare our first normal form. See the figure below:

As we can see, the big raw database is divided into three smaller tables, one each for employee, customer and product details. Thus, to access any one of these tables, we need not handle the other two.

Page 28: B & c

The Second Normal Form

The objective of the second normal form is to take data that is only partly dependent on the primary key and enter that data into another table. Let us take up the same example of Fig 1-2 and consider the table Employee. Here, the entire table has information about the personal details as well as the salary information. But it is well understood that, to pay salary to an employee, the company does not actually need the employee's personal details. Just the emp_id is sufficient. So why not use just that? This is the second normal form. The same goes for the Customers table: we can separate the customer's information from the order details.

See the figure below:

Page 29: B & c

The Third Normal Form

The third normal form’s objective is to remove data in a table that is not dependent on the primary key.

See the same example of Fig 1-3. For the table named Emp_Pay, the position and position_desc fields are not dependent on the primary key (emp_id). So the better option is to move both these fields to another table.

Page 30: B & c

Need of Normalization

When normalizing a database you should achieve four goals:

• Arranging data into logical groups such that each group describes a small part of the whole

• Minimizing the amount of duplicated data stored in a database

• Building a database in which you can access and manipulate the data quickly and efficiently without compromising the integrity of the data storage

• Organizing the data such that, when you modify it, you make the changes in only one place.

When you normalize a database, you start from the general and work towards the specific, applying certain tests (checks) along the way. Some users call this process decomposition. It means decomposing (dividing/breaking down) a 'big' un-normalized table (file) into several smaller tables by:

• Eliminating insertion, update and delete anomalies

• Establishing functional dependencies

• Removing transitive dependencies

• Reducing non-key data redundancy

Page 31: B & c

ADVANTAGES OF NORMALIZATION

The following are the advantages of normalization.

• More efficient data structure.

• Avoid redundant fields or columns.

• More flexible data structure, i.e. we should be able to add new rows and data values easily.

• Better understanding of data.

• Ensures that distinct tables exist when necessary.

• Easier to maintain data structure, i.e. it is easy to perform operations and complex queries can be easily handled.

• Minimizes data duplication.

• Close modeling of real world entities, processes and their relationships.

Page 32: B & c

DISADVANTAGES OF NORMALIZATION

The following are disadvantages of normalization.

o You cannot start building the database before you know what the user needs.

o When relations are normalized to higher normal forms, i.e. 4NF and 5NF, performance degrades.

o Normalizing relations of a higher degree is a very time-consuming and difficult process.

o Careless decomposition may lead to a bad database design, which may lead to serious problems.

Page 33: B & c

Functional Dependency

Functional dependency is a relationship that exists when one attribute uniquely determines another attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X->Y, which specifies that Y is functionally dependent on X. Here X is termed the determinant set and Y the dependent attribute. Each value of X is associated with precisely one Y value. Functional dependency in a database serves as a constraint between two sets of attributes. Defining functional dependencies is an important part of relational database design and contributes to normalization.

A dependency occurs in a database when information stored in the same database table uniquely determines other information stored in the same table. You can also describe this as a relationship where knowing the value of one attribute (or a set of attributes) is enough to tell you the value of another attribute (or set of attributes) in the same table.

Saying that there is a dependency between attributes in a table is the same as saying that there is a functional dependency between those attributes. If there is a dependency in a database such that attribute B is dependent upon attribute A, you would write this as

“A -> B”.  

Page 34: B & c

For example,

In a table listing employee characteristics including Social Security Number (SSN) and name, it can be said that name is dependent upon SSN (or SSN -> name) because an employee's name can be uniquely determined from their SSN. However, the reverse statement (name -> SSN) is not true because more than one employee can have the same name but different SSNs.
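Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below is a minimal illustration: the `holds` function, the column names and the placeholder SSN values are all assumptions, and a check on sample data can only show that a dependency is not violated by those rows, not that it is true of the design in general.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand value must always map to the same right-hand value.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"ssn": "001", "name": "John Smith"},
    {"ssn": "002", "name": "John Smith"},
    {"ssn": "003", "name": "Mary Jones"},
]

ssn_determines_name = holds(employees, ["ssn"], ["name"])
name_determines_ssn = holds(employees, ["name"], ["ssn"])
```

Here two employees share a name but have different SSNs, so SSN -> name holds while name -> SSN does not.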

Types of Dependencies

Trivial Functional Dependencies

A trivial functional dependency occurs when you describe a functional dependency of an attribute on a collection of attributes that includes the original attribute. For example, “{A, B} -> B” is a trivial functional dependency, as is “{name, SSN} -> SSN”. This type of functional dependency is called trivial because it can be derived from common sense. It is obvious that if you already know the value of B, then the value of B can be uniquely determined by that knowledge.

Full Functional Dependencies

A full functional dependency occurs when you already meet the requirements for a functional dependency and the set of attributes on the left side of the functional dependency statement cannot be reduced any further. For example, “{SSN, age} -> name” is a functional dependency, but it is not a full functional dependency because you can remove age from the left side of the statement without impacting the dependency relationship.

Page 35: B & c

Transitive Dependencies

A transitive dependency occurs when there is an indirect relationship that causes a functional dependency. For example, “A -> C” is a transitive dependency when it is true only because both “A -> B” and “B -> C” are true.
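Dependencies that follow indirectly from others can be found by computing an attribute closure, the set of all attributes determined by a starting set. The sketch below is one simple way to do it; the `closure` function and the representation of dependencies as pairs of Python sets are assumptions made for illustration.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes`.

    dependencies is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If we already know all of lhs, we also know all of rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

a_closure = closure({"A"}, fds)
a_determines_c = {"C"} <= a_closure  # A -> C follows from A -> B and B -> C
b_closure = closure({"B"}, fds)
```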

Multivalued Dependencies

A multivalued dependency occurs when the presence of one or more rows in a table implies the presence of one or more other rows in that same table. For example, imagine a car company that manufactures many models of car, but always makes both red and blue colors of each model. If you have a table that contains the model name, color and year of each car the company manufactures, there is a multivalued dependency in that table. If there is a row for a certain model name and year in blue, there must also be a similar row corresponding to the red version of that same car.


Join dependency (JD)

A join dependency (JD) can be said to exist if the join of R1 and R2 over C is equal to relation R, where R1 and R2 are the decompositions R1(A, B, C) and R2(C, D) of a given relation R(A, B, C, D). Equivalently, R1 and R2 form a lossless decomposition of R. In other words, *((A, B, C), (C, D)) is a join dependency of R if the join of these projections is equal to relation R. Here, *(R1, R2, R3, …) indicates that relations R1, R2, R3 and so on are a join dependency (JD) of R. Therefore, a necessary condition for a relation R to satisfy a JD *(R1, R2, …, Rn) is that

R = R1 ∪ R2 ∪ … ∪ Rn

that is, every attribute of R appears in at least one Ri.

Thus, whenever we decompose a relation R into R1 = X ∪ Y and R2 = (R − Y) based on an MVD X →→ Y that holds in relation R, the decomposition has the lossless join property. The lossless-join property can therefore be defined as a property of a decomposition which ensures that no spurious tuples are generated when the relations are recombined through a natural join operation.
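The lossless-join idea can be sketched in a few lines of Python: project a relation onto two attribute sets, natural-join the projections, and compare the result with the original. The helper names (project, natural_join, is_lossless) and the sample relations are assumptions made for illustration only.

```python
def project(rows, attrs):
    """Project rows (a list of dicts) onto attrs, removing duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two projections produced by project()."""
    joined = set()
    for lt in left:
        for rt in right:
            ld, rd = dict(lt), dict(rt)
            common = ld.keys() & rd.keys()
            # Tuples combine only if they agree on every shared attribute
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical relation R(A, B, C, D) in which C determines D.
r = [
    {"A": 1, "B": "x", "C": 10, "D": "p"},
    {"A": 2, "B": "y", "C": 10, "D": "p"},
    {"A": 3, "B": "z", "C": 20, "D": "q"},
]
# Hypothetical relation in which C determines neither D nor (A, B).
lossy = [
    {"A": 1, "B": "x", "C": 10, "D": "p"},
    {"A": 2, "B": "y", "C": 10, "D": "q"},
]

print(is_lossless(r, ["A", "B", "C"], ["C", "D"]))      # True
print(is_lossless(lossy, ["A", "B", "C"], ["C", "D"]))  # False: spurious tuples appear
```

In the second case the join produces four tuples from two original rows, which is exactly the "spurious tuples" problem described above.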


What is SQL?

SQL (Structured Query Language) is a language for manipulating databases, developed at IBM in the 1970s. Most relational database management systems use SQL to access data or to communicate with a database server. The RDBMS is the core platform for SQL; systems that support it include Oracle, MS SQL Server, IBM DB2, MySQL, Microsoft Access, PostgreSQL, SQLite, Firebird, and many more. SQL grew out of the mathematical work of Codd, who founded the theory of relational databases, and supports three types of manipulation on the database:

1 The maintenance of tables: create, delete, and modify the table structure.

2 The manipulation of databases: selecting, modifying, deleting records.

3 The management of access rights to tables (data control): access rights, commit the changes.

The advantage of SQL is that it is a standard database manipulation language: you can use it on nearly any relational database, even one you have not worked with before. Thus, with SQL you can manage not only an Access database but also, for example, Paradox, dBase, SQL Server, Oracle or Informix (among the most widely used databases). SQL is built on the RDBMS; examples of relational database management systems include MySQL, MS Access and SQL Server. MySQL is one of the best-known SQL implementations and is used by a large share of the scripts on the Internet.


Techopedia explains Structured Query Language (SQL)

One of the most fundamental DBA rites of passage is learning SQL, which begins with writing the first SELECT statement or SQL script without a graphical user interface (GUI). Increasingly, relational databases use GUIs for easier database management, and queries can now be simplified with graphical tools, e.g., drag-and-drop wizards. However, learning SQL is imperative because such tools are never as powerful as SQL.

SQL code is divided into four main categories:

Queries are performed using the ubiquitous yet familiar SELECT statement, which is further divided into clauses, including SELECT, FROM, WHERE and ORDER BY.

Data Manipulation Language (DML) is used to add, update or delete data. It comprises the INSERT, DELETE and UPDATE statements and is commonly grouped with transaction control statements, e.g., BEGIN TRANSACTION, SAVEPOINT, COMMIT and ROLLBACK.

Data Definition Language (DDL) is used for managing tables and index structures. Examples of DDL statements include CREATE, ALTER, TRUNCATE and DROP.

Data Control Language (DCL) is used to assign and revoke database rights and permissions. Its main statements are GRANT and REVOKE.


SQL Basics

Basic SQL Statements include:

CREATE - a data structure

SELECT - read one or more rows from a table

INSERT - one or more rows into a table

DELETE - one or more rows from a table

UPDATE - change the column values in a row

DROP - a data structure
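The six statements listed above can be tried end to end with a short script. This is only a sketch: it assumes Python's built-in sqlite3 module and an in-memory SQLite database, and it borrows two rows from the CAR table used later in this material; other database systems may differ in details such as data type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# CREATE - a data structure
cur.execute(
    "CREATE TABLE car (regno TEXT PRIMARY KEY, make TEXT, colour TEXT, "
    "price INTEGER, owner TEXT)"
)
# INSERT - one row per statement
cur.execute("INSERT INTO car VALUES ('F611 AAA', 'FORD', 'RED', 12000, 'Jim Smith')")
cur.execute("INSERT INTO car VALUES ('K555 GHT', 'FIAT', 'GREEN', 6000, 'Bob Jones')")
# UPDATE - change a column value in one row
cur.execute("UPDATE car SET price = 6500 WHERE regno = 'K555 GHT'")
# DELETE - remove one row
cur.execute("DELETE FROM car WHERE regno = 'F611 AAA'")
# SELECT - read what is left
remaining = cur.execute("SELECT regno, price FROM car").fetchall()
print(remaining)  # [('K555 GHT', 6500)]
# DROP - remove the data structure itself
cur.execute("DROP TABLE car")
tables = cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(tables)  # []
```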

In the remainder of this section only simple SELECT statements are considered.


Simple SELECT

The syntax of a SELECT statement is:

SELECT column FROM tablename

This would produce all the rows from the specified table, but only for the particular column mentioned. If you want more than one column shown, you can put in multiple columns, separating them with commas, like:

SELECT column1,column2,column3 FROM tablename

If you want to see all the columns of a particular table, you can type:

SELECT * FROM tablename

Let's see it in action on CAR...

SELECT * FROM car;

REGNO     MAKE      COLOUR  PRICE  OWNER
F611 AAA  FORD      RED     12000  Jim Smith
J111 BBB  SKODA     BLUE    11000  Jim Smith
A155 BDE  MERCEDES  BLUE    22000  Bob Smith
K555 GHT  FIAT      GREEN   6000   Bob Jones
SC04 BFE  SMART     BLUE    13000


SELECT regno FROM car;

REGNO
F611 AAA
J111 BBB
A155 BDE
K555 GHT
SC04 BFE

SELECT colour, owner FROM car;

COLOUR  OWNER
RED     Jim Smith
BLUE    Jim Smith
BLUE    Bob Smith
GREEN   Bob Jones
BLUE

In SQL, you can put extra space characters and return characters just about anywhere without changing the meaning of the SQL. SQL is also case-insensitive (except for things in quotes). In addition, SQL in theory should always end with a ';' character. You need to include the ';' if you have two different SQL queries so that the system can tell when one SQL statement stops and another one starts. If you forget the ';' the online interface will put one in for you. For these reasons all of the following statements are identical and valid.

SELECT REGNO FROM CAR;
SELECT REGNO FROM CAR
Select REGNO from CAR
select regno FROM car
SELECT regno FROM car;
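The equivalence of these spellings can be checked mechanically. The sketch below assumes Python's built-in sqlite3 module and a one-row stand-in for the CAR table; the variant with embedded line breaks is added here to illustrate the point about return characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (regno TEXT, colour TEXT)")
conn.execute("INSERT INTO car VALUES ('F611 AAA', 'RED')")

variants = [
    "SELECT REGNO FROM CAR;",
    "SELECT REGNO FROM CAR",
    "Select REGNO from CAR",
    "select regno FROM car",
    "SELECT\n  regno\nFROM car;",  # extra spaces and line breaks
]
results = [conn.execute(sql).fetchall() for sql in variants]
print(all(r == results[0] for r in results))  # True

# Text inside quotes stays case-sensitive: 'red' does not match 'RED'
# (this is SQLite's default behaviour; some systems compare strings differently).
lowercase_match = conn.execute(
    "SELECT regno FROM car WHERE colour = 'red'"
).fetchall()
print(lowercase_match)  # []
```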


SELECT filters

Displaying all the rows of a table can be handy, but if we have tables with millions of rows then this type of query could take hours. Instead, we can add "filters" onto a SELECT statement to only show specific rows of a table. These filters are written into an optional part of the SELECT statement, known as a WHERE clause.

SELECT columns FROM table WHERE rule

The "rule" section of the WHERE clause is checked for every row that a select statement would normally show. If the whole rule is TRUE, then that row is shown, whereas if the rule is FALSE, then that row is not shown.

The rule itself can be quite complex. The simplest rule is a single equality test, such as "COLOUR = 'RED'".

Without the WHERE clause, the query would show:

SELECT regno from CAR;

REGNO

F611 AAA

J111 BBB

A155 BDE

K555 GHT

SC04 BFE


Comparisons

SQL supports a variety of comparison rules for use in a WHERE clause. These include =, !=, <>, <, <=, >, and >=.

Note that when dealing with strings, like RED, you must say 'RED'. When dealing with numbers, like 10000, you can say '10000' or 10000; many systems convert the quoted form for you, but the unquoted form is the safer choice.

Examples of a single rule using these comparisons are:

WHERE colour = 'RED'     The colour attribute must be RED
WHERE colour != 'RED'    The colour must be a colour OTHER THAN RED
WHERE colour <> 'RED'    The same as !=
WHERE PRICE > 10000      The price of the car is MORE THAN 10000
WHERE PRICE >= 10000     The price of the car is EQUAL TO OR MORE THAN 10000
WHERE PRICE < 10000      The price of the car is LESS THAN 10000
WHERE PRICE <= 10000     The price of the car is EQUAL TO OR LESS THAN 10000
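To see a few of these rules applied to the CAR table shown earlier, here is a small sketch using Python's built-in sqlite3 module. The helper regnos is a hypothetical convenience for this example; it pastes the rule into the SQL text, which is acceptable for a fixed demonstration but not for user-supplied input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car (regno TEXT, make TEXT, colour TEXT, price INTEGER, owner TEXT)"
)
conn.executemany(
    "INSERT INTO car VALUES (?, ?, ?, ?, ?)",
    [
        ("F611 AAA", "FORD", "RED", 12000, "Jim Smith"),
        ("J111 BBB", "SKODA", "BLUE", 11000, "Jim Smith"),
        ("A155 BDE", "MERCEDES", "BLUE", 22000, "Bob Smith"),
        ("K555 GHT", "FIAT", "GREEN", 6000, "Bob Jones"),
        ("SC04 BFE", "SMART", "BLUE", 13000, None),
    ],
)


def regnos(rule):
    """Return the registration numbers of the rows for which the rule is TRUE."""
    sql = "SELECT regno FROM car WHERE " + rule + " ORDER BY regno"
    return [row[0] for row in conn.execute(sql)]


print(regnos("colour = 'RED'"))   # ['F611 AAA']
print(regnos("colour <> 'RED'"))  # ['A155 BDE', 'J111 BBB', 'K555 GHT', 'SC04 BFE']
print(regnos("price >= 12000"))   # ['A155 BDE', 'F611 AAA', 'SC04 BFE']
print(regnos("price < 10000"))    # ['K555 GHT']
```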


Database Languages

A DBMS must provide appropriate languages and interfaces for each category of users to express database queries and updates. Database languages are used to create and maintain a database on a computer. Many database systems, such as Oracle, MySQL, MS Access, dBase and FoxPro, provide their own database languages. SQL statements commonly used in Oracle and MS Access can be categorized as data definition language (DDL), data control language (DCL) and data manipulation language (DML).

Data Definition Language (DDL)

It is a language that allows users to define data and their relationship to other types of data. It is mainly used to create files, databases, data dictionaries and tables within databases. It is also used to specify the structure of each table, the set of values associated with each attribute, integrity constraints, security and authorization information for each table, and the physical storage structure of each table on disk. The following table gives an overview of the usage of DDL statements in SQL:


Data Manipulation Language (DML)

It is a language that provides a set of operations to support the basic data manipulation operations on the data held in the databases. It allows users to insert, update, delete and retrieve data from the database. The part of DML that involves data retrieval is called a query language.

The following table gives an overview about the usage of DML statements in SQL:


Data Control Language (DCL)

DCL statements control access to data and the database using statements such as GRANT and REVOKE. A privilege can be granted to a user with the help of the GRANT statement. The privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX, etc. In addition to granting privileges, you can also revoke them (take them back) by using the REVOKE command.

The following table gives an overview about the usage of DCL statements in SQL:

In practice, the data definition and data manipulation languages are not two separate languages. Instead they simply form parts of a single database language such as Structured Query Language (SQL). SQL represents a combination of DDL and DML, as well as statements for constraint specification and schema evolution.


Create Table :-

Used to create the tables where data will be stored.

Create a table to store personnel data, with a Staff ID column as primary key.

1. Type this SQL statement in the SQL query design window:

CREATE TABLE Personnel (
    StaffID text(9) CONSTRAINT StaffPK PRIMARY KEY,
    LastName text(15) not null,
    FirstName text(15) not null,
    Birthday date,
    Department text(12) null);

2. Execute the statement. If Access reports syntax errors, find and correct them.

3. Save the query as DefinePersonnel and close it. In the database window, check the Query list for this DDL query (notice that the icon for the query is different from the icon for SELECT queries) and check the Table list for the new Personnel table.

4. Run a query to select all records from the new table:

SELECT * FROM Personnel;

The query returns one blank record (in other databases: 0 rows). Close the query.

5. Open the new table in datasheet view – it is empty and ready for data entry.

6. Change to design view and compare with the SQL statement. Also, choose View / Indexes and compare with the constraint created on StaffID.

Close the Personnel table.


DDL

Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples:

CREATE - to create objects in the database

ALTER - alters the structure of the database

DROP - delete objects from the database

TRUNCATE - remove all records from a table, including all space allocated for the records

COMMENT - add comments to the data dictionary

RENAME - rename an object


DML

Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:

SELECT - retrieve data from a database

INSERT - insert data into a table

UPDATE - update existing data within a table

DELETE - delete records from a table; the space for the records remains

MERGE - UPSERT operation (insert or update)

CALL - call a PL/SQL or Java subprogram

EXPLAIN PLAN - explain access path to data

LOCK TABLE - control concurrency

DCL

Data Control Language (DCL) statements. Some examples:

GRANT - give users access privileges to the database

REVOKE - withdraw access privileges given with the GRANT command


TCL

Transaction Control (TCL) statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions.

COMMIT - save work done

SAVEPOINT - identify a point in a transaction to which you can later roll back

ROLLBACK - restore the database to its state as of the last COMMIT

SET TRANSACTION - change transaction options like isolation level and what rollback segment to use


Data manipulation language

DML statements are used to work with the data in tables. When you are connected to most multi-user databases (whether in a client program or by a connection from a Web page script), you are in effect working with a private copy of your tables that can’t be seen by anyone else until you are finished (or tell the system that you are finished). You have already seen the SELECT statement; it is considered to be part of DML even though it just retrieves data rather than modifying it.

The insert statement is used, obviously, to add new rows to a table.

INSERT INTO <table name> VALUES (<value 1>, ... <value n>);

The comma-delimited list of values must match the table structure exactly in the number of attributes and the data type of each attribute. Character type values are always enclosed in single quotes; number values are never in quotes; date values are often (but not always) in the format 'yyyy-mm-dd' (for example, '2006-11-30').

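Yes, in this basic form you will need a separate INSERT statement for every row (many systems also accept several comma-separated value lists in a single statement).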
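A short sketch of the INSERT form described above, using Python's built-in sqlite3 module. The customers table and its rows are hypothetical. Note that SQLite checks the number of values but is lenient about data types, so only the count mismatch is demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, joined TEXT)")

# One INSERT per row: number unquoted, text and date in single quotes
conn.execute("INSERT INTO customers VALUES (1, 'Ann Lee', '2006-11-30')")
conn.execute("INSERT INTO customers VALUES (2, 'Raj Patel', '2006-12-01')")

# A value list that does not match the table structure is rejected
try:
    conn.execute("INSERT INTO customers VALUES (3, 'Too Few')")
    mismatch_rejected = False
except sqlite3.Error:
    mismatch_rejected = True

count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(mismatch_rejected, count)  # True 2
```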


The update statement is used to change values that are already in a table.

UPDATE <table name> SET <attribute> = <expression> WHERE <condition>;

The update expression can be a constant, any computed value, or even the result of a SELECT statement that returns a single row and a single column. If the WHERE clause is omitted, then the specified attribute is set to the same value in every row of the table (which is usually not what you want to do). You can also set multiple attribute values at the same time with a comma-delimited list of attribute=expression pairs.

The delete statement does just that, for rows in a table.

DELETE FROM <table name> WHERE <condition>;

If the WHERE clause is omitted, then every row of the table is deleted (which again is usually not what you want to do)—and again, you will not get a “do you really want to do this?” message.
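The effect of the WHERE clause on UPDATE and DELETE can be seen in a small sketch, again assuming Python's built-in sqlite3 module and a cut-down, hypothetical version of the CAR table. Running a SELECT COUNT(*) with the same condition first is one cautious habit, since no confirmation is asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (regno TEXT, colour TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO car VALUES (?, ?, ?)",
    [("F611 AAA", "RED", 12000), ("J111 BBB", "BLUE", 11000), ("K555 GHT", "GREEN", 6000)],
)

# UPDATE with a WHERE clause: two attributes changed in one row only
conn.execute("UPDATE car SET price = price + 500, colour = 'BLACK' WHERE regno = 'K555 GHT'")
fiat = conn.execute("SELECT colour, price FROM car WHERE regno = 'K555 GHT'").fetchone()
print(fiat)  # ('BLACK', 6500)

# Check how many rows a condition matches before deleting with it
matching = conn.execute("SELECT COUNT(*) FROM car WHERE price > 11500").fetchone()[0]
conn.execute("DELETE FROM car WHERE price > 11500")
after_filtered_delete = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(matching, after_filtered_delete)  # 1 2

# DELETE without WHERE removes every remaining row
conn.execute("DELETE FROM car")
after_full_delete = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(after_full_delete)  # 0
```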


 If you are using a large multi-user system, you may need to make your DML changes visible to the rest of the users of the database. Although this might be done automatically when you log out, you could also just type:

COMMIT;

If you’ve messed up your changes in this type of system, and want to restore your private copy of the database to the way it was before you started (this only works if you haven’t already typed COMMIT), just type:

ROLLBACK;

Although some single-user systems don’t support commit and rollback statements, they are used in large systems to control transactions, which are sequences of changes to the database. Transactions are frequently covered in more advanced courses.
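The following sketch shows COMMIT and ROLLBACK in action. It assumes Python's built-in sqlite3 module, whose connection methods commit() and rollback() issue the corresponding SQL statements; the account table and its single row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100)")
conn.commit()  # make the inserted row permanent

# A mistaken change (no WHERE clause) that has not been committed yet...
conn.execute("UPDATE account SET balance = 0")
conn.rollback()  # ...can be undone
after_rollback = conn.execute("SELECT balance FROM account").fetchone()[0]
print(after_rollback)  # 100

# Once a change is committed, a later rollback no longer affects it
conn.execute("UPDATE account SET balance = 150 WHERE name = 'alice'")
conn.commit()
conn.rollback()
after_commit = conn.execute("SELECT balance FROM account").fetchone()[0]
print(after_commit)  # 150
```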

Privileges

If you want anyone else to be able to view or manipulate the data in your tables, and if your system permits this, you will have to explicitly grant the appropriate privilege or privileges (select, insert, update, or delete) to them. This has to be done for each table. The most common case where you would use grants is for tables that you want to make available to scripts running on a Web server, for example:

GRANT select, insert ON customers TO webuser;


SQL: EXISTS CONDITION

The SQL EXISTS condition is used in a SQL query and is considered "to be met" if the subquery returns at least one row. It can be used in a SELECT, INSERT, UPDATE, or DELETE statement.

SQL EXISTS SYNTAX

The syntax for the SQL EXISTS condition is:

WHERE EXISTS ( subquery );

NOTE

SQL statements that use the SQL EXISTS condition can be inefficient, since the sub-query may be re-run for every row in the outer query's table. There are often more efficient ways to write such queries that do not use the SQL EXISTS condition.


SQL EXISTS EXAMPLE - SELECT STATEMENT

Let's look at a simple example. The following is a SQL SELECT statement that uses the SQL EXISTS condition:

SELECT * FROM suppliers WHERE EXISTS (SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id);

This SQL EXISTS condition example will return all records from the suppliers table where there is at least one record in the orders table with the same supplier_id.

SQL EXISTS EXAMPLE - SELECT STATEMENT USING NOT EXISTS

The SQL EXISTS condition can also be combined with the SQL NOT operator. For example,

SELECT * FROM suppliers WHERE NOT EXISTS (SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id);

This SQL EXISTS example will return all records from the suppliers table where there are no records in the orders table for the given supplier_id.
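The two SELECT examples above can be run against a small data set to see which suppliers each one returns. This sketch assumes Python's built-in sqlite3 module; the supplier names and order numbers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER)")
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex"), (3, "Initech")],
)
# Acme has two orders, Initech has one, Globex has none
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(100, 1), (101, 1), (102, 3)])

with_orders = [
    row[0]
    for row in conn.execute(
        "SELECT supplier_name FROM suppliers WHERE EXISTS "
        "(SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id) "
        "ORDER BY supplier_id"
    )
]
without_orders = [
    row[0]
    for row in conn.execute(
        "SELECT supplier_name FROM suppliers WHERE NOT EXISTS "
        "(SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id) "
        "ORDER BY supplier_id"
    )
]
print(with_orders)     # ['Acme', 'Initech']
print(without_orders)  # ['Globex']
```

Acme appears only once even though it has two orders: EXISTS tests whether any matching row is present, it does not join the tables.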


SQL EXISTS EXAMPLE - INSERT STATEMENT

The following is an example of a SQL INSERT statement that uses the SQL EXISTS condition:

INSERT INTO suppliers (supplier_id, supplier_name) SELECT account_no, name FROM suppliers WHERE EXISTS (SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id);

SQL EXISTS EXAMPLE - UPDATE STATEMENT

The following is an example of a SQL UPDATE statement that uses the SQL EXISTS condition:

UPDATE suppliers SET supplier_name = (SELECT customers.name FROM customers WHERE customers.customer_id = suppliers.supplier_id) WHERE EXISTS (SELECT customers.name FROM customers WHERE customers.customer_id = suppliers.supplier_id);


What sets SQL apart from most other programming languages is the way its code is processed. Most programming languages process statements from top to bottom. By contrast, SQL Server processes the clauses of a query in a specific order known as the logical query processing phases. These phases generate a series of virtual tables, with each virtual table feeding into the next phase (the virtual tables are not visible to the user). The phases and their order are as follows:

1. FROM

2. ON

3. OUTER

4. WHERE

5. GROUP BY

6. CUBE | ROLLUP

7. HAVING

8. SELECT

9. DISTINCT

10. ORDER BY

11. TOP

 

Order of Execution of SQL Queries


Following is the sequence of execution of a SQL query

Step 1: FROM clause - Identify the objects

Step 2: FROM clause Joins - Identified objects are joined based on the conditions (filtering data)

Step 3: WHERE clause - Where condition is applied which again filters data based on the conditions applied.

Step 4: GROUP BY clause - Records are grouped based on the condition.

Step 5: HAVING clause - Having conditions are applied to filter the grouped data.

Step 6: SELECT clause - The mentioned columns are selected.

Step 7: ORDER BY clause - The selected rows are finally sorted based on the ORDER BY condition and displayed to the user.
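The steps above can be traced on a single query. The sketch below assumes Python's built-in sqlite3 module and a reduced, hypothetical copy of the CAR table; the step numbers in the SQL comments refer to the list above. It illustrates the logical order only, since a database engine is free to execute a query physically in any way that gives the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (make TEXT, colour TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO car VALUES (?, ?, ?)",
    [
        ("FORD", "RED", 12000),
        ("SKODA", "BLUE", 11000),
        ("MERCEDES", "BLUE", 22000),
        ("FIAT", "GREEN", 6000),
        ("SMART", "BLUE", 13000),
    ],
)

query = """
    SELECT colour, COUNT(*) AS n, SUM(price) AS total  -- step 6
    FROM car                                           -- step 1
    WHERE price < 20000                                -- step 3: drops the MERCEDES row
    GROUP BY colour                                    -- step 4
    HAVING SUM(price) > 10000                          -- step 5: drops the GREEN group
    ORDER BY total DESC                                -- step 7
"""
result = conn.execute(query).fetchall()
print(result)  # [('BLUE', 2, 24000), ('RED', 1, 12000)]

# An aggregate cannot be used in WHERE, because WHERE is applied before grouping
try:
    conn.execute("SELECT colour FROM car WHERE COUNT(*) > 1 GROUP BY colour")
    aggregate_in_where_rejected = False
except sqlite3.Error:
    aggregate_in_where_rejected = True
print(aggregate_in_where_rejected)  # True
```

The BLUE group counts two cars rather than three, which shows that the WHERE filter removed the 22000 row before the rows were grouped.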


What is a view? What are its advantages and disadvantages?

A view is a virtual table in the database defined by a query. A view does not exist in the database as a stored set of data values. To reduce redundant data to the minimum possible, Oracle allows the creation of an object called a view.

The reasons for creating views are:

1) When data security is required.

2) When data redundancy is to be kept to the minimum while maintaining data security.

There are 3 types of views:

Horizontal view

Vertical view

Joined view


Horizontal view restricts a user’s access to selected rows of a table.

Vertical view restricts a user’s access to selected columns of a table.

A joined view draws its data from two or three different tables and presents the query results as a single virtual table. Once the view is defined, one can use a single table query against the view for the requests that would otherwise each require a two or three table join.
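The three kinds of view described above can be sketched as follows, assuming Python's built-in sqlite3 module. The employee and department tables, the view names and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, dept_id INTEGER, salary INTEGER)")
conn.execute("CREATE TABLE department (dept_id INTEGER, dept_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ann", 10, 50000), (2, "Raj", 20, 60000), (3, "Mei", 10, 55000)],
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(10, "Sales"), (20, "Research")])

conn.executescript("""
    -- Horizontal view: only the rows of one department
    CREATE VIEW sales_staff AS
        SELECT * FROM employee WHERE dept_id = 10;
    -- Vertical view: only some columns (salary is hidden)
    CREATE VIEW staff_directory AS
        SELECT emp_id, name FROM employee;
    -- Joined view: data drawn from two tables
    CREATE VIEW staff_with_dept AS
        SELECT e.name, d.dept_name
        FROM employee e JOIN department d ON e.dept_id = d.dept_id;
""")

horizontal = [r[0] for r in conn.execute("SELECT name FROM sales_staff ORDER BY emp_id")]
vertical_columns = [c[0] for c in conn.execute("SELECT * FROM staff_directory").description]
# A single-table query against the joined view replaces an explicit join
joined = conn.execute("SELECT dept_name FROM staff_with_dept WHERE name = 'Ann'").fetchone()[0]

print(horizontal)        # ['Ann', 'Mei']
print(vertical_columns)  # ['emp_id', 'name']
print(joined)            # Sales
```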

Advantages of views

Security: each user can be given access to the database only through a view that exposes a specific set of rows of a table.

Query simplicity: by using joined views, data from different tables can be accessed with a single-table query.

Data integrity: if data is accessed and entered through a view, the DBMS can automatically check the data to ensure that it meets specified integrity constraints.


Disadvantages of views

Performance:

The DBMS must translate the query against the view into queries against the underlying source tables. If a view is defined by a multi-table query, then even a simple query against the view becomes a complicated join, and it may take a long time to complete. This also applies to insert, delete and update operations.

Update restrictions:

When a user tries to update rows of a view, the DBMS must translate the request into an update on rows of the underlying source table. This is possible for simple views, but more complicated views cannot be updated.


DCL: Granting and Revoking Privileges

DCL commands are used to enforce database security in a multiple user database environment. Two types of DCL commands are GRANT and REVOKE. Only database administrators or owners of the database object can provide or remove privileges on a database object.

SQL GRANT Command

SQL GRANT is a command used to provide access or privileges on the database objects to the users.

The syntax for the GRANT command is:

GRANT privilege_name ON object_name TO {user_name | PUBLIC | role_name} [WITH GRANT OPTION];


privilege_name is the access right or privilege granted to the user. Some of the access rights are ALL, EXECUTE, and SELECT.

object_name is the name of a database object like TABLE, VIEW, STORED PROC and SEQUENCE.

user_name is the name of the user to whom an access right is being granted.

PUBLIC is used to grant access rights to all users.

ROLES are a set of privileges grouped together.

WITH GRANT OPTION - allows a user to grant access rights to other users.


For example:

GRANT SELECT ON employee TO user1;

This command grants the SELECT permission on the employee table to user1. You should use WITH GRANT OPTION carefully: if you grant the SELECT privilege on the employee table to user1 using WITH GRANT OPTION, then user1 can grant the SELECT privilege on the employee table to another user, such as user2. Later, if you revoke the SELECT privilege on employee from user1, user2 may, depending on the database system, still have the SELECT privilege on the employee table.

SQL REVOKE Command:

The REVOKE command removes user access rights or privileges to the database objects. The syntax for the REVOKE command is:

REVOKE privilege_name ON object_name FROM {user_name | PUBLIC | role_name}

For example:

REVOKE SELECT ON employee FROM user1;

This command revokes the SELECT privilege on the employee table from user1. When you revoke the SELECT privilege on a table from a user, the user will not be able to SELECT data from that table anymore. However, if the user has received SELECT privileges on that table from more than one user, he/she can SELECT from that table until everyone who granted the permission revokes it. You cannot revoke privileges that were not initially granted by you.


Privileges and Roles

Privileges: Privileges define the access rights provided to a user on a database object. There are two types of privileges.

System privileges - These allow the user to CREATE, ALTER, or DROP database objects.

Object privileges - These allow the user to EXECUTE, SELECT, INSERT, UPDATE, or DELETE data from database objects to which the privileges apply.

A few CREATE system privileges are listed below:

System Privileges   Description
CREATE object       allows users to create the specified object in their own schema.
CREATE ANY object   allows users to create the specified object in any schema.

The above rules also apply for ALTER and DROP system privileges. A few of the object privileges are listed below:

Object Privileges   Description
INSERT              allows users to insert rows into a table.
SELECT              allows users to select data from a database object.
UPDATE              allows user to update data in a table.
EXECUTE             allows user to execute a stored procedure or a function.


Limitations of SQL query creation

A query cannot be created using a view that is derived from a user-defined function. This is a known limitation.

Incorrect SQL is generated for a multitable data graph. Multitable graphs are not supported on Informix® Dynamic Server 9.3.

An error occurs when running an SQL file for a second time on a Sybase 12 database. If you are running an SQL file on Sybase, change the Sybase SET CHAINED option to OFF.


SQL Language Limitations

SQLFire has limitations and restrictions for SQL statements, clauses, and expressions.

ALTER TABLE Limitations

This release of SQLFire has the following restrictions for ALTER TABLE. SQLFire throws a SQLException “Feature not implemented” with SQLState “0A000” if any of these actions are attempted: 

Adding or dropping a column when the table has data, or when the table had data at some point after creation.

Dropping a primary key column with or without data.

Adding or dropping a primary key constraint when the table has data, or when the table had data at some point after creation.

In addition, the ALTER COLUMN clause as in the SQL-92 standard is not implemented and SQLFire will throw an SQLException with state “0A000” though it is not treated as a syntactical error.


Auto-Generated Columns

This release of SQLFire supports auto-generated IDENTITY columns, but has the following limitations:

Only INT and BIGINT column types can be marked as auto-generated IDENTITY columns.

The START WITH and INCREMENT BY clauses are supported only for GENERATED BY DEFAULT identity columns.

If the maximum permissible value for the type is reached in any insert, then SQLFire throws an overflow exception (SQLState: “42Z24”). This does not necessarily mean that all possible values of that type have been used up, because it is possible that some values remain unused.

Applications should not depend on identity values being incremental across the distributed system, because SQLFire provides no ordering guarantee for concurrent inserts from multiple members. However, inserts from a single member will have the generated values in ascending order and applications can use that for ordering purposes.


LONG/LOB Column Restrictions

SQLFire does not support using columns of the following data types in indexes, ORDER BY clauses, GROUP BY clauses, DISTINCT clauses, UNION clauses, or other set operations: 

BLOB

CLOB

LONG VARCHAR FOR BIT DATA

Columns of type LONG VARCHAR are supported in these cases.

Bulk Update Limitations

If a SQL statement performs a bulk update operation on multiple SQLFire members, any exception that occurs during the bulk update can leave some rows updated while other rows are not updated. Use transactions with bulk update statements to ensure that all updates succeed or roll back as a whole.


Cascade DELETE Not Supported

SQLFire does not support cascade delete operations.

Locking Prioritizes DML over DDL

The SQLFire locking behavior prioritizes DML execution over DDL statements. DDL statements may receive a lock timeout exception (SQLState: 40XL1) if your system is processing numerous concurrent DML and DDL statements. You can configure the maximum amount of time that DDL statements wait for locks using sqlfire.max-lock-wait.

Expiration and Eviction Limitations

EXPIRE ENTRY WITH IDLETIME works only when a primary-key-based query is fired. Otherwise, the system does not update the entry's access time when table scans or index scans happen, and the entry is destroyed.

EXPIRATION or EVICTION with action as DESTROY should not be set on a parent table having child tables with foreign key reference to it. This is due to a lack of cascade delete support in SQLFire. If an attempt is made to create a child table having foreign key reference to a table with such a policy then a SQLException is thrown (SQLState: "X0Y99").


INSERT with subselect

SQLFire has limited support for INSERT statements that use a subselect statement. Nested selects and selects having aggregates are not supported; these queries throw a feature not implemented exception (SQLSTATE 0A000).

LOCK TABLE

The LOCK TABLE statement is not supported in this release of SQLFire.

Procedure Invocation (Data-Aware and Non-Data-Aware Procedures)

When you use the ON TABLE extension in a CALL statement, the WHERE clause is mandatory. If you need to route a data-aware procedure to all members that host the table (without any pruning), then you must specify some extraneous condition that always evaluates to true (such as WHERE 1=1).

A server can only handle Java procedure definitions that exactly match the JDBC parameter types in a CREATE PROCEDURE statement. If a procedure specifies parameter types that use the base class of a corresponding java type (for example, if a procedure uses java.util.Date instead of java.sql.Date) then the invocation from the client side fails.


UNION, INTERSECT, and EXCEPT Operators

SQLFire does not support any query that has either nested set operators or a set operator with either a join, function expression, SQL procedure, view, or sub-query. There is no explicit support provided for ORDER BY, GROUP BY, or complex filters in the WHERE clause in either child of a query that uses a set operator. Also, transactions and high availability features are not supported for queries that use a set operator.

In this context, a set operator includes any of these operators: UNION DISTINCT, UNION, UNION ALL, INTERSECT DISTINCT, INTERSECT, INTERSECT ALL, EXCEPT DISTINCT, EXCEPT, or EXCEPT ALL.

VIEW Limitations

SQLFire does not support views that involve grouping, aggregate, distinct, or join operations on a partitioned table.

SQLFire queries have a unique set of capabilities and limitations that are inherent to the distributed database design.