What is the Distributed Database Option?
A distributed system is one in which both data and transaction processing are divided between one or more computers connected by a network, each computer playing a specific role in the system. This configuration has multiple databases, each of which is accessed directly by a single server and can be accessed indirectly by other instances through server/server cooperation. Distributed systems allow you to have data physically located at several sites, and each site can transparently access all of the data.
Each node can be used for database processing, but the data is permanently partitioned among the nodes. Several smaller server machines can be cheaper and more flexible than one large, centrally located server.
The key goals of a distributed database system are availability, accuracy, concurrency, and recoverability.
The Client-Server Model and Distributed Systems
The client-server model is basic to distributed systems. It is a response to the limitations presented by the traditional mainframe client-host model, in which a single mainframe provides shared data access to many dumb terminals. The client-server model is also a response to the local area network (LAN) model, in which many isolated systems access a file server that provides no processing power.
Client-server architecture provides integration of data and services and allows clients to be isolated from inherent complexities, such as communication protocols. The simplicity of the client-server architecture allows clients to make requests that are routed to the appropriate server.
These requests are made in the form of transactions. Client transactions are often SQL or PL/SQL procedures and functions that access individual databases and services.
Distributed Database Characteristics
This article describes the twelve specifications for the ideal distributed database management system and how ORACLE conforms to these specifications.
Oracle's distributed architecture, comprising SQL*Net, Open Gateway and the Oracle Server, provides an effective solution to the challenge of sharing data in a networked environment.
The Oracle Server's distributed architecture provides effective data sharing in a networked environment using both client-server and distributed database architectures.
In a client-server environment, communication takes place between two processes that reside on different machines. The client executes the application or application tool and sends requests to the server for data.
The received data is processed at the client machine. This is known as distributed processing. The ideal distributed system should look like a non-distributed system. Twelve specifications for the ideal distributed database were developed by C.J. Date.
The Oracle Server supports most of the ideal distributed features.
1. Site Autonomy
Site autonomy means that each server participating in a distributed database is administered independently from all other databases.
The data is owned and managed locally. Local operations remain purely local.
One site (node) in the distributed system does not depend on another site to function successfully.
2. No reliance on a central site
All sites are treated as equals. Each site has its own data dictionary.
3. Continuous Operation
Incorporating a new site has no effect on existing applications and does not disrupt service.
4. Location Independence
Users can retrieve and update data independent of the site.
5. Fragmentation Independence
Users can store parts of a table at different locations. Both horizontal and vertical partitioning of data is possible.
6. Replication Independence
Stored copies of data can be located at multiple sites. Read-only snapshots and updatable snapshots provide read-only and updatable copies of tables, respectively. Symmetric Replication using triggers make readable and writable replication possible.
7. Distributed Query Processing
Users can query a database residing on another node. The query is executed at the node where the data is located.
8. Distributed Transaction Management
A transaction can update, insert or delete data from multiple databases.
The two-phase commit mechanism in Oracle ensures the integrity of distributed transactions. Row level locking ensures a high level of data concurrency.
9. Hardware Independence
Oracle runs on all major hardware platforms.
10. Operating System Independence
A specific operating system is not required. Oracle runs under a variety of operating systems.
11. Network Independence
The Oracle Server's SQL*Net supports most popular networking software.
Network independence allows communication across homogeneous and heterogenous networks. Oracle's MultiProtocol Interchange enables applications to communicate with databases across multiple network protocols.
12. DBMS Independence
DBMS Independence is the ability to integrate different databases. Open Gateway supports connections to non-Oracle databases.
13. Distributed Database Security
The database supports all of the security features that are available with a non-distributed database environment for distributed database systems, including:
Password authentication for users and roles
Some types of external authentication for users and roles including:
Kerberos version 5 for connected user links
DCE for connected user links
DISTRIBURTED TRANSACTIONS AND THE TWO PHASE COMMIT
Two phase commit only comes in play during a commit of a distributed transaction. The whole purpose is to maintain the integrity of the "global" database. In other words, two phase commit guarantees that everything will either commit or rollback.
TRANSACTION TYPES:
LOCAL TRANSACTION contains ONLY statements on the local node.
REMOTE TRANSACTION contains one or more statements which ALL reference the same remote node.
DISTRIBUTED TRANSACTION contains statements that modify data two or more distinct nodes. The only place where 2-PHASE COMMIT into play.
TERMS:
CLIENTS (C) are nodes that references information from another database server in a distributed transaction.
SERVERS (S) are nodes that are directly referenced in a distributed transaction, or are requested to participate in a transaction because another node requires data from it.
GLOBAL COORDINATOR (GC) is the node in which the distributed transaction originates.
LOCAL COORDINATOR (LC) is the node that references data on other nodes to complete its part in the distributed transaction.
COMMIT POINT SITE (CPS) is the site with the highest commit point strength "init.ora" parameter. It is usually the most critical site that can not afford collisions in case of an in-doubt transaction.
System Commit Number (SCN) is essentially an internal database clock. This is a monotonically increasing an unique number for each transaction.
COMMIT_POINT_STRENGTH (CPstr) is the init.ora parameter that determines the COMMIT POINT SITE.
When you attempt to commit a distributed transaction, you will enter ORACLE'S TWO PHASE COMMIT MECHANISM.
TWO-PHASE COMMIT:
PREPARE PHASE:
1. Commit point site is determined.
2. Global coordinator asks all participating nodes (except commit point site) to promise to COMMIT or ROLLBACK the transaction regardless of failure).
This information is propagated by the local coordinators. The servers have to be prepared before the local or global coordinators (except the commit point site). The local coordinator is responsible for asking dependant nodes to prepare.
POSSIBLE RESPONSES FROM NODES:
* PREPARED
* ABORT
* READ-ONLY NODES
Locks obtained during the distributed transaction will continue to be held.
Redo is flushed to the local redo logs.
3. Each node will pass back the SCN for his node.
4. Global Coordinator determines the max SCN.
After all the nodes have prepared successfully, we enter the commit phase. All transactions except those found in the commit point site are "in-doubt" until the commit point phase completes successfully.
COMMIT PHASE:
1. Global coordinator sends the max SCN to the commit point site and asks it to commit.
2. Commit Point Site will try to commit or everything is rolled back. The locks are released in the commit point site first.
3. If committed/rolled back, the Commit Point Site will inform the Global Coordinator which will commit/rolled back at that time.
4. The information will propagate down to its clients/local coordinators and they will commit/roll back and propagate the information down to their servers until there are no more servers.
(note: READ ONLY nodes do not participate in 2-phase commit.)
PREPARE PHASE:
1. COMMIT POINT SITE IS PARANOID
The global coordinator will already know what the commit point strength of each node prior to the commit. Read-only nodes are not included.
2. All nodes except for PARANOID is asked to prepare.
3. HAWAII, the local coordinator, is responsible to ask her dependent nodes to prepare before she prepares. In this case, PARANOID is a commit point site; thus, it is ignored.
4. The highest SCN is sent to LOCAL node via the local coordinators. The highest SCN is 1000.
5. All nodes which PREPARED will flush entries of the transaction to the redo logs if not already done.
If any of the nodes send an "ABORT" message back, then the transaction is rolled back at this time. Any failure after the PREPARE phase will result with "in-doubt" transactions.
COMMIT PHASE:
1. PARANOID IS ASKED TO COMMIT OR ROLLBACK BY THE LOCAL (GC).
2. PARANOID commits at a SCN greater than 1000.
a. Redo is flushed.
b. Locks are released.
c. outcome is relayed back to the LOCAL node (GC).
Assume success:
3. After receiving the information, GC will commit at the same SCN and pass the information to its dependents.
a. commit flushed to redo logs.
b. data locks are released.
c. GC will pass the information to HAWAII and HQ.
They, in turn, will commit and HAWAII will pass the information to PARANOID.
If all is successful, every statement will commit with the same SCN and then RECO will delete the entries from "dba_2pc_pending" and "dba_2pc_neighbors"
tables. Afterwards, the nodes will "forget" the transaction.
Forget Phase
1. After the participating nodes notify the commit point site that they have committed, the commit point site can forget about the transaction. The Following step occur:
2. After receiving notice from the global coordinator that all nodes have committed, the commit point site erases status information about this transaction.
3. The commit point site informs the global coordinator that it has erased the status information.
4. The global coordinator erases its own information about the transaction
No comments:
Post a Comment
Thanks for your comment.