The process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.
1) Translate logical data model for target DBMS
This step is to produce a relational database schema from the logical data model that can be implemented in the target DBMS.
3.1) Design base relation
To decide how to represent the base relations identified in the logical data model in the target DBMS.
To start the physical design process, we first collate and assimilate the information about the relations produced during logical database design. The necessary information can be obtained from the data dictionary and the definition of the relations described using the Database Design Language (DBDL). For each relation identified in the logical data model, we have a definition consisting of:
· The name of the relation;
· A list of simple attributes in brackets;
· The primary key and, where appropriate, alternate keys (AK) and foreign keys (FK);
· Referential integrity constraints for any foreign keys identified.
· From the data dictionary, we also have for each attribute:
· Its domain, consisting of a data type, length, and any constraints on the domain;
· An optional default value for the attribute;
· Whether the attribute can hold nulls;
· Whether the attribute is derived and, if so, how it should be computed.
3.2) Design representation of derived data
To decide how to represent any derived data present in the logical data model in the target DBMS.
From a physical database design perspective, whether a derived attribute is stored in the database or calculated every time it is needed is a tradeoff. The designer should calculate:
· The additional cost to store the derived data and keep it consistent with operational data from which it is derived;
· The cost to calculate it each time it is required
Document design of derived data
The design of derived data should be fully documented along with the reasons for selecting the proposed design. In particular, document the reasons for selecting one approach where many alternatives exist.
3.3) Design general constraints
This step is to design the general constraints for the target DBMS.
Updates to relations may be constrained by integrity constraints governing the ‘real world’ transactions that are represented by the updates. In Step 3.1 we designed a number of integrity constraints: required data, domain constraints, and entity and referential integrity.
In this step we have to consider the remaining general constraints. The design of such constraints is again dependent on the choice of DBMS; some systems provide more facilities than others for defining general constraints. As in the previous step, if the system is compliant the SQL standard, some constraints may be easy to implement. For example, Administrator has a rule that prevents a member of staff from managing more than 5 ATM Account at the same time. We could design this constraint into the SQL CREATE
TABLE statement for account using the following clause:
//////////////////////////////////////////
USE ATMREPORT
CONSTRAINT ACCOUNTNO
CHECK( SELECT ACCOUNTNO
FROM ACCOUNT
GROUP BY ACCOUNTNO
HAVING COUNT(*)<5)
Document design of general constraints
The design of general constraints should be fully documented. In particular, document the reasons for selecting one approach where many alternatives exist.
2) Design file organization & indexes
To determine the optimal file organizations to store the base relations and the indexes that are required to achieve acceptable performance, that is, the way in which relations and tuples will be held on secondary storage.
One of the main objectives of physical database design is to store and access data in an efficient way. While some storage structures are efficient for bulk loading data into the database, they may be inefficient after that. Thus, we may have to choose to use an efficient storage structure to set up the database and then choose another for operational use.
Again, the types of file organization available are dependent on the target DBMS; some systems provide more choice of storage structures than others. It is extremely important that the physical database designer fully understands the storage structures that are available, and how the target system uses these structures. This may require the designer to know how the system’s query optimizer functions. For example, there may be circumstances where the query optimizer would not use a secondary index, even if one were available. Thus, adding a secondary index would not improve the performance of the query, and the resultant overhead would be unjustified.
As with logical database design, physical database design must be guided by the nature of the data and its intended use. In particular, the database designer must understand the typical workload that the database must support. During the requirements collection and analysis stage there may have been requirements specified about how fast certain transactions must run or how many transactions must be processed per second. This information forms the basis for a number of decisions that will be made during this step.
4.1) Analyze Transactions
To understand the functionality of the transactions that will run on the database and to analyze the important transactions.
To carry out physical database design effectively, it is necessary to have knowledge of the transactions or queries that will run on the database. This includes both qualitative and quantitative information. In analyzing the transactions, we attempt to identify performance criteria, such as:
· The transactions that run frequently and will have a significant impact on performance;
· The transactions that is critical to the operation of the business;
· The times during the day/week when there will be a high demand made on the database.
4.2) Choose file organizations
This step is to determine an efficient file organization for each base relation.
One of the main objectives of physical database design is to store and access data in an efficient way. For example, if we want to retrieve staff tuples in alphabetical order of name, sorting the file by staff name is a good file organization. However, if we want to retrieve all staff whose salary is in a certain range, searching a file ordered by account name would not be particularly efficient. To complicate matters, some file organizations are: Efficient for bulk loading data into the database but inefficient after that. In other words, we may want to use an efficient storage structure to set up the database and then change it for normal operational use.
Document choice of file organizations
The choice of file organizations should be fully documented, along with the reasons for the choice. In particular, document the reasons for selecting one approach where many alternatives exist.
4.3) Choose indexes
To determine whether adding indexes will improve the performance of the system.
One approach to selecting an appropriate file organization for a relation is to keep the tuples unordered and create as many secondary indexes as necessary. Another approach is to order the tuples in the relation by specifying a primary or clustering index. In this case, choose the attribute for ordering or clustering the tuples as:
· The attribute that is used most often for join operations, as this makes the join operation more efficient, or
· The attribute that is used most often to access the tuples in a relation in order of that attribute.
If the ordering attribute chosen is a key of the relation, the index will be a primary index. if the ordering attribute is not a key, the index will be a clustering index. Remember that each relation can only have either a primary index or a clustering index.
Specifying indexes
For example, to create a primary index on the ATMCARD relation based on the ATMNO attribute, we might use the following SQL statement:
USE ATMREPORT
CREATE UNIQUE INDEX ATMNOInd ON ATMCARD (ATMNO);
To create a clustering index on the ATMCARD relation based on the ACCOUNTNO attribute, we will use the following statement
USE ATMREPORT
CREATE INDEX ACCOUNTNOInd ON BRANCH (ACCOUNTNO) /*CLUSTER*/;
Creating Secondary Indexes
USE ATMREPORT
SELECT ACCOUNTNO, COUNT(WTHDRAWAMNT)
FROM ACCOUNT_LEDGER
GROUP BY ACCOUNTNO;
4.4) Estimate disk space requirements
To estimate the amount of disk space that will be required by the database.
It may be a requirement that the physical database implementation can be handled by the current hardware configuration. Even if this is not the case, the designer still has to estimate the amount of disk space that is required to store the database, in the event that new hardware has to be procured. The objective of this step is to estimate the amount of disk space that is required to support the database implementation on secondary storage.
As with the previous steps, estimating the disk usage is highly dependent on the target DBMS and the hardware used to support the database. In general, the estimate is based on the size of each tuple and the number of tuples in the relation. The latter estimate should on the query transactions for the USERS view of ATMCARD.
3) Design User Views
To design the user views that was identified during the requirements collection and analysis stage of the database system development lifecycle.
Following an analysis of the data requirements for these user views, we used the centralized approach to merge the requirements for the user views as follows:
· Branch, consisting of the Administrator user views.
· Staff, consisting of the Assistant user views.
Document design of user views
The design of the individual user views should be fully documented.
4) Design Security Mechanisms
To design the security mechanisms for the database as specified by the users during the requirements and collection stage of the database system development lifecycle.
A database represents an essential corporate resource and so security of this resource is extremely important. During the requirements collection and analysis stage of the database system development lifecycle, specific security requirements should have been documented in the system requirements specification. The objective of this step is to decide how these security requirements will be realized. Some systems offer different security facilities than others. Again, the database designer must be aware of the facilities offered by the target DBMS. Relational DBMS generally provide two types of database security:
· System security;
· Data security.
System security covers access and use of the database at the system level, such as a user name and password.
Data security covers access and use of database objects (such as relations and views) and the actions that users can have on the objects. Again, the design of access rules is dependent on the target DBMS; some systems provide more facilities than others for designing access rules. We have previously discussed three particular ways to create access rules using the discretionary GRANT and REVOKE statements of the ISO SQL standard, Microsoft Office Access.
---
DBMS Selection
Application Design
Prototyping
Implementation
Data Conversion and Loading
Testing
Operational Maintenance
---- Thanks guys for visiting and reading this post--
Goto First Page
DBMS Selection
It is the selection of an appropriate DBMS to support the database system.
Ø Data definition
Primary key enforcement
Foreign key specification
Data types available
Data type extensibility
Domain specification
Ease of restructuring
Integrity controls
View mechanism
Data dictionary
Data independence
Underlying data model
Schema evolution
Ø Physical definition
File structures available
File structure maintenance
Ease of reorganization
Indexing
Variable length fields/records
Data compression
Encryption routines
Memory requirements
Storage requirements
Finally, all the weighted scores for each assessed group of features are summed to produce a single score for the DBMS product, which is compared with the scores for the other products. The product with the highest score is the ‘winner’.
In addition to this type of analysis, we can also evaluate products by allowing vendors to demonstrate their product or by testing the products in-house. In-house evaluation involves creating a pilot test bed using the candidate products. Each product is tested against its ability to meet the users’ requirements for the database system. Benchmarking reports published by the Transaction Processing Council can be found at www.tpc.org.
Application Design
The design of the user interface and the application programs that use and process the database.
The importance of user interface design is sometimes ignored or left until late in the design stages. However, it should be recognized that the interface may be one of the most important components of the system. If it is easy to learn, simple to use, straightforward and forgiving, the users will be inclined to make good use of what information is presented. On the other hand, if the interface has none of these characteristics, the system will undoubtedly cause problems.
Ø Transaction Design
Retrieval transactions are used to retrieve data for display on the screen or in the production of a report. For example, the operation to search for and display the details of a property (given the property number) is an example of a retrieval transaction.
Update transactions are used to insert new records, delete old records, or modify existing records in the database. For example, the operation to insert the details of a new property into the database is an example of an update transaction.
Mixed transactions involve both the retrieval and updating of data. For example, the operation to search for and display the details of a property (given the property number) and then update the value of the monthly rent is an example of a mixed transaction.
.
.
.
.
.
.
.
.
.
.
.
.
.
Prototyping
Prototyping is “Building a working model of a database system”
A prototype is a working model that does not normally have all the required features or provide all the functionality of the final system. The main purpose of developing a prototype database system is to allow users to use the prototype to identify the features of the system that work well, or are inadequate, and if possible to suggest improvements or even new features to the database system. In this way, we can greatly clarify the users’ requirements for both the users and developers of the system and evaluate the feasibility of a particular system design. Prototypes should have the major advantage of being relatively inexpensive and quick to build.
There are two prototyping strategies in common use today: requirements prototyping and evolutionary prototyping. Requirements prototyping uses a prototype to determine the requirements of a proposed database system and once the requirements are complete the prototype is discarded. While evolutionary prototyping is used for the same purposes, the important difference is that the prototype is not discarded but with further development becomes the working database system.
Implementation
Implementation is the physical realization of the database and application designs.
On completion of the design stages (which may or may not have involved prototyping), we are now in a position to implement the database and the application programs. The database implementation is achieved using the Data Definition Language (DDL) of the selected DBMS or a Graphical User Interface (GUI), which provides the same functionality while hiding the low-level DDL statements. The DDL statements are used to create the database structures and empty database files. Any specified user views are also implemented at this stage.
The application programs are implemented using the preferred third or fourth generation language (3GL or 4GL). Parts of these application programs are the database transactions, which are implemented using the Data Manipulation Language (DML) of the target
DBMS, possibly embedded within a host programming language, such as Visual Basic (VB), VB.net, Python, Delphi, C, C++, C#, Java, COBOL, Fortran, Ada, or Pascal. We also implement the other components of the application design such as menu screens, data entry forms, and reports. Again, the target DBMS may have its own fourth generation tools that allow rapid development of applications through the provision of non-procedural query languages, reports generators, forms generators, and application generators.
Security and integrity controls for the system are also implemented. Some of these controls are implemented using the DDL, but others may need to be defined outside the DDL using, for example, the supplied DBMS utilities or operating system controls.
Data Conversion and Loading
Transferring any existing data into the new database and converting any existing applications to run on the new database.
This stage is required only when a new database system is replacing an old system. Nowadays, it is common for a DBMS to have a utility that loads existing files into the new database. The utility usually requires the specification of the source file and the target database, and then automatically converts the data to the required format of the new database files. Where applicable, it may be possible for the developer to convert and use application programs from the old system for use by the new system. Whenever conversion and loading are required, the process should be properly planned to ensure a smooth transition to full operation.
Testing
The process of running the database system with the intent of finding errors is called testing.
Before going live, the newly developed database system should be thoroughly tested.
This is achieved using carefully planned test strategies and realistic data so that the entire testing process is methodically and rigorously carried out. Note that in our definition of testing we have not used the commonly held view that testing is the process of demonstrating that faults are not present. In fact, testing cannot show the absence of faults; it can show only that software faults are present. If testing is conducted successfully, it will uncover errors with the application programs and possibly the database structure. As a secondary benefit, testing demonstrates that the database and the application programs appear to be working according to their specification and that performance requirements appear to be satisfied. In addition, metrics collected from the testing stage provide a measure of software reliability and software quality.
As with database design, the users of the new system should be involved in the testing process. The ideal situation for system testing is to have a test database on a separate hardware system, but often this is not available. If real data is to be used, it is essential to have backups taken in case of error.
Testing should also cover usability of the database system. Ideally, an evaluation should be conducted against a usability specification.
Operational Maintenance
The process of monitoring and maintaining the database system following installation is called operational maintenance.
In the previous stages, the database system has been fully implemented and tested. The system now moves into a maintenance stage, which involves the following activities:
Ø Monitoring the performance of the system. If the performance falls below an acceptable level, tuning or reorganization of the database may be required.
Ø Maintaining and upgrading the database system (when required). New requirements are incorporated into the database system through the preceding stages of the lifecycle.
Once the database system is fully operational, close monitoring takes place to ensure that performance remains within acceptable levels. A DBMS normally provides various utilities to aid database administration including utilities to load data into a database and to monitor the system. The utilities that allow system monitoring give information on, for example, database usage and locking and query execution strategy. The Database Administrator (DBA) can use this information to tune the system to give better performance, for example, by creating additional indexes to speed up queries, by altering storage structures, or by combining or splitting tables.
The monitoring process continues throughout the life of a database system and in time may lead to reorganization of the database to satisfy the changing requirements. These changes in turn provide information on the likely evolution of the system and the future resources that may be needed. This, together with knowledge of proposed new applications, enables the DBA to engage in capacity planning and to notify or alert senior staff to adjust plans accordingly. If the DBMS lacks certain utilities, the DBA can either develop the required utilities in-house or purchase additional vendor tools, if available. We discuss database administration in more detail in Section 9.15.
When a new database application is brought online, the users should operate it in parallel with the old system for a period of time. This safeguards current operations in case of unanticipated problems with the new system. Periodic checks on data consistency between the two systems need to be made, and only when both systems appear to be producing the same results consistently, should the old system be dropped. If the changeover is too hasty, the end-result could be disastrous. Despite the foregoing assumption that the old system may be dropped, there may be situations where both systems are maintained.
---- Thanks guys for visiting and reading this post--
Post A Comment:
0 comments: