MySQL Technical Architecture

MySQL is based on a tiered architecture, consisting of both primary subsystems and support components that interact with each other to read, parse, and execute queries, and to cache and return query results.

There are three primary subsystems within the MySQL architecture, as discussed in the following sections.

Memory and Connection Management
This subsystem manages user connections, via modules for network connection management with clients, and synchronizes competing tasks and processes, via modules for multithreading, thread locking, and performing thread-safe operations. It also handles all memory management issues between requests for data by the query subsystem and the data storage subsystem.

Query Parsing and Execution
Query parsing and execution is handled by two interrelated components: the syntax parser and the query optimizer. The syntax parser decomposes the SQL commands it receives from calling programs into a form that can be understood by the MySQL engine. It also checks the objects being referenced to ensure that the privilege level of the calling program allows it to use them. The query optimizer then prepares the most efficient plan for query execution, making decisions about table-versus-index scans, join methods, and range optimization, and using a bottom-up methodology to detect the optimal execution plan.

Data Storage
The data storage subsystem interfaces with the operating system (OS) to write to disk all of the data in the user tables, indexes, and logs, as well as the internal system data. MySQL 5.1 also introduced a new pluggable architecture, which allows developers to create new table storage mechanisms and “plug them in” to the server at run-time. This pluggable architecture also creates a level of abstraction between the data storage subsystem and the rest of the MySQL server, making it possible for developers to add new data storage engines that interact with the other MySQL subsystems through a standard API.

MySQL is designed on the assumption that the vast majority of its applications will be running on a TCP/IP (Transmission Control Protocol/Internet Protocol) network. This is a fairly good assumption, given that TCP/IP is not only highly robust and secure, but is also common to UNIX, Windows, OS/2, and almost any other serious operating system you’ll likely encounter. When the client and the server are on the same UNIX machine, MySQL uses TCP/IP with UNIX sockets, which operate in the UNIX domain; that is, they are generally used between processes on the same UNIX system, as opposed to Internet sockets, which operate between networks.

Standards Compliance
The Structured Query Language (SQL) is an open standard that has been maintained by the American National Standards Institute (ANSI) since 1986. Although it’s true that the implementation of this standard does differ in varying degrees from vendor to vendor, it’s fair to say that SQL is today one of the most widely used cross-vendor languages. As with other implementations, such as SQL Server’s T-SQL (Transact-SQL) and Oracle’s SQL, MySQL has its own variations of the SQL standard that add power beyond what is available within the standard. Beginning with v5.1, MySQL also includes support for data import and export using Extensible Markup Language (XML), a widely accepted, vendor-neutral format for data markup and sharing.

In the SQL context, a transaction consists of one or more SQL statements that operate as a single unit. Each SQL statement in such a unit is dependent on the others, and the unit as a whole is indivisible. If one statement in the unit does not complete successfully, the entire unit will be rolled back, and all the affected data will be returned to the state it was in before the transaction was started. Thus, a transaction is said to be successful only if all the individual statements within it are executed successfully. The MySQL transaction system fully satisfies the ACID tests for transaction safety via its InnoDB and BDB table types (older table types, such as the MyISAM type, do not support transactions).

• Atomicity is handled by storing the results of transactional statements (the modified rows) in a memory buffer and writing these results to disk and to the binary log from the buffer only once the transaction is committed. This ensures that the statements in a transaction operate as an indivisible unit and their effects are seen either collectively or not at all.

• Consistency is primarily handled by MySQL’s logging mechanisms, which record all changes to the database and provide an audit trail for transaction recovery. In addition to the logging process, MySQL provides locking mechanisms that ensure that all of the tables, rows, and indexes that make up the transaction are locked by the initiating process long enough to either commit the transaction or roll it back.

• Server-side semaphore variables and locking mechanisms act as traffic managers to help programs manage their own isolation mechanisms. MySQL’s BDB table handler, for example, uses page-level locking to safely handle multiple simultaneous transactions, while the InnoDB table handler uses a more fine-grained row-level locking.

• MySQL implements durability by maintaining a binary transaction log file that tracks changes to the system during the course of a transaction. In the event of a hardware failure or abrupt system shutdown, recovering lost data is a relatively straightforward task by using the last backup in combination with the log when the system restarts.

Because transactional tables incur some performance overhead, it’s also possible to specify whether to use transactions on a per-table basis.

Query Caching
If a query returns a given set of records, repeating the same query should return the same set of records, unless the underlying data has somehow changed. As obvious as this sounds, few of the other major RDBMS vendors provide features that take advantage of this principle. Other database products are efficient in storing optimized access plans that detail the process by which data is retrieved; such plans allow queries similar to those that have been issued previously to bypass the process of analyzing indexes yet again to get to the data.

Result-set caching takes this principle a step further by storing the result sets themselves in memory, thus circumventing the need to search the database at all. The data from a query is simply placed in a cache, and when a similar query is issued, this data is returned as if in response to the query that created it in the first place. The MySQL engine uses an extremely efficient result set–caching mechanism, known as the Query Cache, that dramatically enhances response times for queries that are called upon to retrieve the exact same data as a previous query. This mechanism is so efficient that a major computing publication declared MySQL queries to be faster than those of Oracle and SQL Server (which are both known for their speed). If implemented properly, decision support systems using MySQL with canned reports or data-driven web pages can provide response speeds far beyond those that would be expected without the Query Cache.

In keeping with its open-source roots, MySQL makes the original source code available as part of the distribution, which permits developers to add new functions and features that are compiled into the engine as part of the core product. MySQL also allows separate C and C++ libraries to be loaded in the same memory space as the engine when MySQL starts up.

MySQL also allows developers to add new functions at run-time through a special user-defined function interface. User-defined functions are created initially as special C/C++ libraries and are then added and removed dynamically by means of the CREATE FUNCTION and DROP FUNCTION statements.

Symmetric Multiprocessing Support
To take advantage of multiprocessor architecture, MySQL is built using a multithreaded design, which allows threads to be allocated between processors to achieve a higher degree of parallelism. This is important to know not only for the database administrator, who needs to understand how MySQL takes best advantage of processing power, but also for developers, who can extend MySQL with custom functions. All custom functions must be thread-safe—that is, that they must not interfere with the workings of other threads in the same process as MySQL.

MySQL makes use of various thread packages, depending on the platform. POSIX threads are used on most UNIX variants, such as FreeBSD and Solaris. LinuxThreads are used for Linux distributions. For efficiency reasons, Windows threads are used on the Windows platform, but the code that handles them is designed to simulate POSIX threads. Because MySQL is a threaded application, it is able to let the operating system take over the task of coordinating the allocation of threads to balance the workload across multiple processors. MySQL uses a global connection thread to handle all connection requests and creates a new dedicated thread to handle authentication and SQL query processing for each connection. In addition, in replication, master-host synchronization is handled by separate threads.

Of course, another way to take advantage of multiprocessing is to run multiple instances of MySQL on the same machine, thereby spawning a separate process for each instance. This approach is especially practical for hosting companies and even for internal hosting within corporate environments. By running multiple instances of MySQL on the same computer, you can easily accommodate multiple user bases that need different configuration options.

The process of accessing a MySQL database can be broken down into two tasks: connecting to the MySQL server itself and accessing individual objects, such as tables or columns, in a database. MySQL has built-in security to verify user credentials at both stages.

• MySQL manages user authentication through user tables, which check not only that a user has logged on correctly with the proper username and password, but also that the connection is originating from an authorized TCP/IP address.

• Once a user is connected, a system administrator can bestow user-specific privileges on objects and on the actions that can be taken in MySQL. For example, you might allow to perform only SELECT queries against an inventory table, while allowing to run INSERT, UPDATE, and DELETE statements against the same table.

The actual data that travels over a network, such as query results, isn’t encrypted and is, therefore, open to viewing by a hacker. To secure your data, you can use one of the Secure Shell (SSH) protocols; you’ll need to install it on both the client applications and the operating system you’re using. If you’re using MySQL 4.0 or later, you can also use the Secure Socket Layer (SSL) encryption protocol, which can be configured to work from within MySQL, making it safe for use over the Internet or other public network infrastructures.

Application Programming Interfaces
For application developers, MySQL provides a client library that is written in the C programming language and a set of APIs that provide an understandable set of rules by which host languages can connect to MySQL and send commands. Using an API protects client programs from any underlying changes in MySQL that could affect connectivity. Currently, MySQL provides hooks to C, C++, Eiffel, Java, Perl, PHP, Python, Ruby, and Tcl, and connectors are also available for JDBC, ODBC, and .NET applications.

Source of Information : MCGraw Hill - SQL the Complete Reference 3rd Edition


Subscribe to Developer Techno ?
Enter your email address:

Delivered by FeedBurner