Lecture 17 16/3/15 Distributed Databases Site Autonomy • Each server participating in a distributed database is administered independently (for security and backup operations) from the other databases, as though each database was a non-distributed database. • Although all the databases can work together, they are distinct, separate repositories of data and are administered individually. 2 Data Dictionary • A read-only set of tables that provides information about the database. A data dictionary contains: • The definitions of all schema objects in the database (tables, views, indexes, clusters, synonyms, sequences, procedures, functions, packages, triggers, and so on) • How much space has been allocated for, and is currently used by, the schema objects • Default values for columns • Integrity constraint information • The names of Oracle users • Privileges and roles each user has been granted • Auditing information, such as who has accessed or updated various schema objects • Other general database information 3 Use of a Data Dictionary • The data dictionary has three primary uses: • Oracle accesses the data dictionary to find information about users, schema objects, and storage structures. • Oracle modifies the data dictionary every time that a data definition language (DDL) statement is issued. • Any Oracle user can use the data dictionary as a read-only reference for information about the database 4 Continued.. • The Oracle Database user SYS owns all base tables and useraccessible views of the data dictionary. • Data in the base tables of the data dictionary is necessary for Oracle Database to function. • Therefore, only Oracle Database should write or change data dictionary information. • No Oracle Database user should ever alter rows or schema objects contained in the SYS schema because such activity can compromise data integrity. The security administrator must keep strict control of this central account. 5 Database Schema • A schema is the set of metadata (data dictionary) used by the database, typically generated using DDL. A schema defines attributes of the database, such as tables, columns, and properties. A database schema is a description of the data in a database. • SCOTT is a schema that includes the EMP, DEPT and BONUS tables with various grants, and other stuff. • SYS is a schema that includes tons of tables, views, grants, etc etc etc. • SYSTEM is a schema..... 6 Continued 7 Schema Objects and Naming in a Distributed Database • A schema object (for example, a table) is accessible from all nodes that form a distributed database. • Therefore, just as a non-distributed local DBMS architecture must provide an unambiguous naming scheme to distinctly reference objects within the local database, a distributed DBMS must use a naming scheme that ensures that objects throughout the distributed database can be uniquely identified and referenced. • To resolve references to objects (a process called name resolution) within a single database, the DBMS usually forms object names using a hierarchical approach. 8 Continued.. • For example, within a single database, a DBMS guarantees that each schema has a unique name, and that within a schema, each object has a unique name. • Because uniqueness is enforced at each level of the hierarchical structure, an object's local name is guaranteed to be unique within the database and references to the object's local name can be easily resolved. • DDBMs simply extend the hierarchical naming model by enforcing unique database names within a network. 9 10 Two-Phase Commit Mechanism • A DBMS must guarantee that all statements in a transaction, distributed or non-distributed, are either committed or rolled back as a unit, so that if the transaction is designed properly, the data in the logical database can be kept consistent. • The effects of a transaction should be either visible or invisible to all other transactions at all nodes; this should be true for transactions that include any type of operation, including queries, updates, or remote procedure calls (RPCs). 11 Transparency Continued.. • A distributed DBMS architecture should also provide facilities to transparently replicate data among the nodes of the system. • Maintaining copies of a table across the databases in a distributed database is often desired so that tables that have high query and low update activity can be accessed faster by local user sessions because no network communication is necessary. • If a database that contains a critical table experiences a prolonged failure, replicates of the table in other databases can still be accessed. • A DBMS that manages a distributed database should make table replication transparent to users working with the replicated tables. • Finally, the functional transparencies explained above are not sufficient alone. The distributed database must also perform with acceptable speed. 12 Transparency in a Distributed Database System • The functionality of a distributed database system must be provided in such a manner that the complexities of the distributed database are transparent to both the database users and the database administrators. • Location transparency exists if a user can refer to the same table the same way, regardless of the node to which the user connects. Location transparency is beneficial for the following reasons: • Access to remote data is simplified, because the database users do not need to know the location of objects. • Objects can be moved with no impact on end-users or database applications. 13 Continued.. • A distributed database system should also provide query, update, and transaction transparency. For example, standard SQL commands, such as SELECT, INSERT, UPDATE, and DELETE, should allow users to access remote data without the requirement for any programming. • Transaction transparency occurs when the DBMS provides the functionality described below using standard SQL COMMIT, SAVEPOINT, and ROLLBACK commands, without requiring complex programming or other special operations to provide distributed transaction control. • The statements in a single transaction can reference any number of local or remote tables. 14 Distributed Databases • In a non-heterogeneous environment, the client and server character sets should be either the same as or subsets of the main server character set In a heterogeneous environment, the NLS settings of the client, the transparent gateway and the nonOracle data source, should be either the same or a subset of the Oracle server NLS character set 15 Homogenous Environment Heterogonous Environment Data Replication • store a separate copy of the full tables in each location • if a copy is stored at every site: Full Replication • Advantages: • reliability • fast response • Disadvantages • storage requirements • complexity and cost of updating 16 Database Replication • Replication is the process of copying and maintaining database objects in multiple databases that make up a distributed database system. • Changes applied at one site are captured and stored locally before being forwarded and applied at each of the remote locations. • Replication provides user with fast, local access to shared data, and protects availability of applications because alternate data access options exist. • Even if one site becomes unavailable, users can continue to query or even update the remaining locations. 17 Continued • In a multimaster replication environment, the replication groups are called master groups. Corresponding master groups at different sites must contain the same set of replication objects • Multi-master replication allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. • The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group, and resolving any conflicts that might arise between concurrent changes made by different members. • Master group "SCOTT_MG" contains an exact replica of the replicated objects at each master site. 18 Master Group 19 Snapshot site "Group A" at the snapshot site maintains only a partial replica of master group "Group A" at the master site, while the "Group B" snapshot and master groups maintain a complete replica 20 Replication Sites • A master site maintains a complete copy of all objects in a replication group. • A snapshot site supports read-only and updateable snapshots of the table data at an associated master site. • All master sites in a multimaster replication environment communicate directly with one another to propagate data and schema changes in the replication group. • A replication group at a master site is more specifically referred to as a master group. • A replication group's master definition site is a master site serving as the control point for managing the replication group and objects in the group. • A snapshot site's table snapshots can contain all or a subset of the table data within a replication group. • However, these must be simple snapshots with a one-to-one correspondence to tables at the master site. • For example, a snapshot site may contain snapshots for only selected tables in a replication group. • And a particular snapshot might be just a selected portion of a certain replicated table. • A replication group at a snapshot site is more specifically referred to as a snapshot group. • A snapshot group can also contain other replication objects. 21 Replication Conflicts • Asynchronous multimaster and updateable snapshot replication environments must address the possibility of replication conflicts that may occur when, for example, two transactions originating from different sites update the same row at nearly the same time. • When data conflicts do occur, you need a mechanism to ensure that the conflict will be resolved in accordance with your business rules and that the data converges correctly at all sites. 22 Replication Conflicts Continued.. • In addition to logging any conflicts that may occur in your replicated environment, Oracle replication offers a variety of conflict resolution methods that will allow you to define a conflict resolution system for your database that will resolve conflicts in accordance with your business rules. • If you have a unique situation that Oracle's pre-built conflict resolution methods cannot resolve, you have the option of building and using your own conflict routines. 23
© Copyright 2025