Monday, March 23, 2009

DATABASE ADMINISTRATION
Database administration is the function of managing and maintaining database management systems (DBMS) software. Mainstream DBMS software such as Oracle, IBM DB2 and Microsoft SQL Server need ongoing management. As such, corporations that use DBMS software often hire specialized IT (Information Technology) personnel called Database Administrators or DBAs.
Contents

1 DBA Responsibilities
2 Types of database administration
3 Nature of database administration
4 Database administration tools
5 The impact of IT automation on database administration
6 Learning database administration
7 See also
8 External references
//
DBA Responsibilities
Installation, configuration and upgrading of Database server software and related products.
Evaluate Database features and Database related products.
Establish and maintain sound backup and recovery policies and procedures.
Take care of the Database design and implementation.
Implement and maintain database security (create and maintain users and roles, assign priveledges).
Database tuning and performance monitoring.
Application tuning and performance monitoring.
Setup and maintain documentation and standards.
Plan growth and changes (capacity planning).
Work as part of a team and provide 7x24 support when required.
Do general technical trouble shooting and give consultation to development teams.
Interface with Oracle / Microsoft Corporation for technical support.
Types of database administration
There are three types of DBAs:
1. Systems DBAs (sometimes also referred to as Physical DBAs, Operations DBAs or Production Support DBAs)
2. Development DBAs
3. Application DBAs
Depending on the DBA type, their functions usually vary. Below is a brief description of what different types of DBAs do:
1. Systems DBAs usually focus on the physical aspects of database administration such as DBMS installation, configuration, patching, upgrades, backups, restores, refreshes, performance optimization, maintenance and disaster recovery.
2. Development DBAs usually focus on the logical and development aspects of database administration such as data model design and maintenance, DDL (data definition language) generation, SQL writing and tuning, coding stored procedures, collaborating with developers to help choose the most appropriate DBMS feature/functionality and other pre-production activities.
3. Application DBAs are usually found in organizations that have purchased 3rd party application software such as ERP (enterprise resource planning) and CRM (customer relationship management) systems. Examples of such application software includes Oracle Applications, Siebel and PeopleSoft (both now part of Oracle Corp.) and SAP. Application DBAs straddle the fence between the DBMS and the application software and are responsible for ensuring that the application is fully optimized for the database and vice versa. They usually manage all the application components that interact with the database and carry out activities such as application installation and patching, application upgrades, database cloning, building and running data cleanup routines, data load process management, etc.
While individuals usually specialize in one type of database administration, in smaller organizations, it is not uncommon to find a single individual or group performing more than one type of database administration.
Database administration tools
Often, the DBMS software comes with certain tools to help DBAs manage the DBMS. Such tools are called native tools. For example, Microsoft SQL Server comes with SQL Server Enterprise Manager and Oracle has tools such as SQL*Plus and Oracle Enterprise Manager/Grid Control. In addition, 3rd parties such as BMC, Quest Software, Embarcadero, EMS Database Management Solutions and SQL Maestro Group offer GUI tools to monitor the DBMS and help DBAs carry out certain functions inside the database more easily.
Another kind of database software exists to manage the provisioning of new databases and the management of existing databases and their related resources. The process of creating a new database can consist of hundreds or thousands of unique steps from satisfying prerequisites to configuring backups where each step must be successful before the next can start. A human cannot be expected to complete this procedure in the same exact way time after time - exactly the goal when multiple databases exist. As the number of DBAs grows, without automation the number of unique configurations frequently grows to be costly/difficult to support. All of these complicated procedures can be modeled by the best DBAs into database automation software and executed by the standard DBAs. Software has been created specifically to improve the reliability and repeatability of these procedures such as Stratavia's Data Palette and GridApp Systems Clarity.
The impact of IT automation on database administration
Recently, automation has begun to impact this area significantly. Newer technologies such as Stratavia's Data Palette suite and GridApp Systems Clarity have begun to increase the automation of databases causing the reduction of database related tasks. However at best this only reduces the amount of mundane, repetitive activities and does not eliminate the need for DBAs. The intention of DBA automation is to enable DBAs to focus on more proactive activities around database architecture, deployment, performance and service level management.




DATABASE ADMINISTRATOR:
A database administrator (DBA) is a person who is responsible for the environmental aspects of a database. The role of a database administrator has changed according to the technology of database management systems (DBMSs) as well as the needs of the owners of the databases. For example, although logical and physical database design are traditionally the duties of a database analyst or database designer, a DBA may be tasked to perform those duties.
Duties
The duties of a database administrator vary and depend on the job description, corporate and Information Technology (IT) policies and the technical features and capabilities of the DBMS being administered. They nearly always include disaster recovery (backups and testing of backups), performance analysis and tuning, data dictionary maintenance, and some database design.
Some of the roles of the DBA may include
Installation of new software — It is primarily the job of the DBA to install new versions of DBMS software, application software, and other software related to DBMS administration. It is important that the DBA or other IS staff members test this new software before it is moved into a production environment.
Configuration of hardware and software with the system administrator — In many cases the system software can only be accessed by the system administrator. In this case, the DBA must work closely with the system administrator to perform software installations, and to configure hardware and software so that it functions optimally with the DBMS.
Security administration — One of the main duties of the DBA is to monitor and administer DBMS security. This involves adding and removing users, administering quotas, auditing, and checking for security problems.
Data analysis — The DBA will frequently be called on to analyze the data stored in the database and to make recommendations relating to performance and efficiency of that data storage. This might relate to the more effective use of indexes, enabling "Parallel Query" execution, or other DBMS specific features.
Database design (preliminary) — The DBA is often involved at the preliminary database-design stages. Through the involvement of the DBA, many problems that might occur can be eliminated. The DBA knows the DBMS and system, can point out potential problems, and can help the development team with special performance considerations.
Data modeling and optimization — By modeling the data, it is possible to optimize the system layouts to take the most advantage of the I/O subsystem.
Responsible for the administration of existing enterprise databases and the analysis, design, and creation of new databases.
Data modeling, database optimization, understanding and implementation of schemas, and the ability to interpret and write complex SQL queries
Proactively monitor systems for optimum performance and capacity constraints
Establish standards and best practices for SQL
Interact with and coach developers in Structured Query Language (SQL) scripting
METADATA:
Metadata (meta data, or sometimes metainformation) is "data about other data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, for example a database schema. In data processing, metadata is definitional data that provides information about or documentation of other data managed within an application or environment..
Purpose
· Metadata provides context for data.
· Metadata is used to facilitate the understanding, characteristics, and management usage of data. The metadata required for effective data management varies with the type of data and context of use. In a library, where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location
Examples of metadata
· Book:Examples of metadata regarding a book would be the title, author, date of publication, subject, a unique identifier (such an International Standard Book Number), its dimensions, number of pages, and the language of the text.
· Photograph
Metadata for a photograph would typically include the date and time at which it was taken and details of the camera settings (such as focal length, aperture, exposure). Many digital cameras record metadata in exchangeable image file format (EXIF).
· Audio
A digital audio file, such as an MP3 of a song, might include the album name, song title, genre, year, composer, contributing artist, track number and album art.
· Web page
The HTML used to mark-up web pages allows for the inclusion of a variety of types of meta data, from simple descriptive text, dates and keywords to highly-granular information such as the Dublin Core and e-GMS standards. Pages can be geotagged with coordinates. Metadata may be included in the page's header or in a separate file. Microformats allow on-page data to be marked up as meta data. The Hypertext Transfer Protocol used to link web pages also includes metadata.
Difference between data and metadata
Usually it is not possible to distinguish between (plain) data and metadata because:
Something can be data and metadata at the same time. The headline of an article is both its title (metadata) and part of its text (data).
Data and metadata can change their roles. A poem, as such, would be regarded as data, but if there is a song that uses it as lyrics, the whole poem could be attached to an audio file of the song as metadata. Thus, the labeling depends on the point of view.
Use

· Metadata helps to bridge the semantic gap. By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process even more complex filter and search operations. For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page. This approach, called knowledge representation, is of special interest to the semantic web and artificial intelligence.

· Other descriptive metadata can be used to automate workflows. For example, if a "smart" software tool knows content and structure of data, it can convert it automatically and pass it to another "smart" tool as input. As a result, users save the many copy-and-paste operations required when analyzing data with "dumb" tools.

· Metadata has become important on the World Wide Web because of the need to find useful information rom the mass of information available


The database development life cycle
A development life cycle
Database development is just one part of the much wider field of software engineering, the process of developing and maintaining software. A core aspect of software engineering is the subdivision of the development process into a series of phases, or steps, each of which focuses on one aspect of the development. The collection of these steps is sometimes referred to as a development life cycle.

Here, we start with an overview of the waterfall model such as you will find in most software engineering text books. This illustrates a general waterfall model which could apply to any computer system development. It shows the process as a strict sequence of steps where the output of one step is the input to the next and all of one step has to be completed before moving onto the next. However, in reality there is usually some degree of refinement and feedback as the product proceeds through the development stages (it would be rare to find that each task is performed perfectly and never needs revisiting – although that is one possibility!).

Figure : A general model of system development: the waterfall model
We can use Figure 4 as a means of identifying the tasks that are required, together with the input and output for each activity. What is important is the scope of the activities, which can be summarised as follows:
Establishing requirements involves consultation with, and agreement among, stakeholders as to what they want of a system, expressed as a statement of requirements.
Analysis starts by considering the statement of requirements and finishes by producing a system specification. The specification is a formal representation of what a system should do, expressed in terms that are independent of how it may be realised.
Design begins with a system specification and produces design documents, and provides a detailed description of how a system should be constructed.
Implementation is the construction of a computer system according to a given design document and taking account of the environment in which the system will be operating (for example specific hardware or software available for the development). Implementation may be staged, usually with an initial system than can be validated and tested before a final system is released for use.
Testing compares the implemented system against the design documents and requirements specification and produces an acceptance report or, more usually, a list of errors and bugs that require a review of the analysis, design and implementation processes to correct
Maintenance involves dealing with changes in the requirements, or the implementation environment, bug fixing or porting of the system to new environments (for example migrating a system from a standalone PC to a UNIX workstation or a networked environment). Since maintenance involves the analysis of the changes required, design of a solution, implementation and testing of that solution over the lifetime of a maintained software system, the waterfall life cycle will be repeatedly revisited.




Database Security Issues: Inference
Databases introduce a number of unique securityrequirements for their users and administrators. On one hand, databases are designed to promote open and flexible access to data. On the other hand, it’s this same open access that makes databases vulnerable to many kinds of malicious activity. One of the main issues faced by database security professionals is avoiding inference capabilities. Basically, inference occurs when users are able to piece together information at one security level to determine a fact that should be protected at a higher security level. It’s best explained through a practical example.
· Imagine that you are the database administrator for a military transportation system. You have a table named cargo in your database that contains information on the various cargo holds available on each outbound airplane. Each row in the table represents a single shipment and lists the contents of that shipment and the flight identification number. The flight identification number may be cross-referenced with other tables to determine the origin, destination, flight time and similar data. Thecargo table appears as follows:
Flight ID
Cargo Hold
Contents
Classification
1254
A
Boots
Unclassified
1254
B
Guns
Unclassified
1254
C
Atomic Bomb
Top Secret
1254
D
Butter
Unclassified
Suppose that General Jones (who has a Top Secret security clearance) comes along and requests information on the cargo carried by flight 1254. The general would (correctly) see all four shipments. On the other hand, if Private Smith (who has no security clearance) requests the data, the private would see the following table:
This correctly implements the security rules that prohibit someone from seeing data classified above their security level. However, assume that there is a unique constraint on flight ID and that nothing is scheduled for hold C on flight 1254, he might attempt to insert a new record to transport some vegetables on that flight. However, when he attempts to insert the record, his insert will fail due to the unique constraint. At this point, Private Jones has all the data he needs to infer that there is a secret shipment on flight 1254. He could then cross-reference the flight information table to find out the source and destination of the secret shipment and various other information.

This leads to a natural question –
what can you do about inference? Basically, you have two options. First, you can include the classification column in the unique constraint. This technique, known as polyinstantiation, allows different records to exist in the same table at various security levels. Private Jones would never learn of the Top Secret shipment. On the other hand, he might wind up consequently double-booking the cargo hold, leaving a truckload of vegetables stranded at the airport. The second option is to simply leave the tables as-is. Private Jones will know that a classified shipment is taking place but won't have access to the contents of the shipment.
Neither solution is ideal, but both represent the types of trade-offs that must be made to balance security and practicality.

No comments:

Post a Comment