Understanding SQL Server Architecture

At early stage of my career, SQL Server Database was like a black box to me. I submitted data and SQL Server stored it. When required, it gave me back like magic. Eventually, the magic is unfolded to me. I discovered that behind the SQL Server there is a highest level of engineering and orchestrated data flow. Today, I am going to share this magic.

SQL Server

MSSQL Server is relational database management system. Which follows typical client-server architecture. Meaning, client submits its request, server accepts and processes it. Finally, server sends back the responses to the client.

Diagram: SQL Server Architecture  

The above SQL Server Architecture diagram figured out the three major components of SQL Server. Mainly:

  1. Relational Engine
  2. Protocol Layer and
  3. Storage Engine

Let's deep drive all the major components.

1. SQL Server Network Interface (SNI) or Protocol Layer

This layer interacts mostly with outside world. Which takes input from outside world and provides output. It supports couple of protocols for communication.

  • Shared memory   
  • Named pipes
  • TCP/IP
  • TDS (Tabular Data Stream)

Shared Memory

When the client application and SQL Server resides in the same machine then this protocol is used. For example, when we use "." (dot), 127.0.0.1, localhost or Machine\Instance to connect with SQL Server, shared memory protocol is used.

TCP/IP

If the client application and SQL Server resides in separate machines, then they communicated through TCP/IP protocol. The default TCP port of SQL Server is 1433.

Named Pipes

The Named Pipe protocol is used for inter-process communication and can improve performance in certain scenarios. By default, this protocol is disabled, as it allows anonymous access to certain named pipes. This can be enabled by the SQL Configuration Manager. The default port of SQL Server named pipes is 445.

Tabular Data Stream

Tabular Data Stream (TDS) Protocol is an application layer protocol that facilitates interaction between database server and client application. It also provides for authentication and channel encryption negotiation facility. All 3 protocols use TDS packets. TDS is encapsulated in Network packets.

2. Relational Engine

In SQL Server, the Relational Engine is also known as the Query Processor. It determines what a query needs to do and how it can be executed. It requests data from storage engine, processes it and serves to client. It has 3 major components.

  1. CMD Parser
  2. Optimizer and
  3. Query Executor

CMD Parser

CMD parser receives query from Protocol Layer and goes through a series of steps to validate the data.

Diagram:- CMD Parser

  1. Syntactic check - First CMD parser checks whether the received query conforms with SQL syntax. If there is any error then throws exceptions otherwise pass to next level processing.
  2. Semantic check - Next step is validating supplied object names. In this phase, CMD parser checks whether supplied columns, tables, views or functions etc. exists in the schema. If exists it binds with query which is called Binding.
  3. Query Tree - Finally, CMD parser prepares different execution trees for the query. All the trees will produce the same desired output.

Optimizer

Optimizer takes the execution tree from CMD parser and checks whether a plan existing in plan cache. If there is no such plan then it produces an execution plan. As per the plan the query will be executed. Only DML commands like SELECT, INSERT, DELETE, and UPDATE commands are optimized. CREATE and ALTER (DDL commands) commands are not optimized. They are compiled into an internal form. Optimization cost is calculated based on CPU, memory, and I/O usage. Optimizer finds out the cheapest, not the best, cost-effective execution plan. Optimizer has 3 phases.

Diagram:- Optimizer

  1. Pre-optimization - In some cases, searching for a optimal plan would not reduce the execution time. So, optimizer search for a trivial plan. If there is no such plan, the optimizer moves to the next phase.
  2. Transaction Processing Plan - In this phase, optimizer tries to find out a simple plan that incorporates one index per table. If it doesn't exist, the optimizer searches for a more complex plan with multiple indexes per table.
  3. Parallel Processing and Optimization - If the optimizer fails to find out any suitable execution plan from above two phases, it opts for parallel processing based on the processing capabilities and configuration of the user’s machine. If it still fails, optimizer tries to find out any other possible options to execute the query in the cheapest way. Microsoft uses its propriety algorithms for the final optimization phase.

Query Executor

Query executor executes the query as per the execution plan by calling the Access Method. Once data is retrieved from Storage Engine, the result is forwarded to the Protocol layer. Finally, data is sent back to the end user.

3. Storage Engine

The 3rd major components of SQL Server is storage engine which is responsible for all sort of storing and retrieval data operations with storage. It has also three major parts:

  1. Access Manager - The storage manager receives the request from the query executor and sends them to the different access methods. If it is a SELECT statement then request goes to the buffer manager. For NON-SELECT (UPDATE, ADD, DELETE) request, it lands to the transaction manager.
  2. Transaction Manager - Transaction manager receives NON-SELECT (UPDATE, ADD, DELETE) requests from the access manager; interacts with the lock and log manager and uses transaction logs to track every executed query.
    • Log Manager - It tracks of all updates done in the system via logs in Transaction Logs. Every has a log sequence number, transaction ID and data modification record. Which is used for keeping track of Transaction Committed and Transaction Rollback.
    • Lock Manager - It locks the associated data in Data Storage in lock state during transaction. Which ensures data consistency and isolation. This is also known as ACID properties.
  3. Buffer Manager - It gets SELECT queries and executes associated with plan cache, data parsing, and dirty pages to serve the request.
    • Plan Cache
      • Existing Query Plan: Buffer manager checks whether there is an existing execution plan stored in the cache. If exists, it is used along with the associated data.
      • First-time Query Plan: If there no such plan in cache, then first-time query execution plan is stored for future usage. This makes the faster query execution.
    • Data Parsing
      • Buffer Cache – Soft Parsing: Buffer manager checks whether data exists in the data cache. If exists, then this data is used by Query Executor. This improves the performance as the total I/O operation is reduced when fetching data from the cache.
      • Data Storage – Hard Parsing: If data is not in the cache, Buffer Manager searches the data in the storage and supplies to the upper layer. Data is also saved in cache for future usage
      • Dirty Page - This is stored as a processing logic of Transaction Manager.

SQL Operating Systems (SQLOS)

 

Diagram: SQLOS Architecture  

The SQLOS is an application layer beneath the SQL Server database engine layer. It handles couple of critical scheduling and resource management tasks:

  1. Scheduler and IO completion: The SQLOS handles the scheduling of threads for CPU usage in SQL Server. Most threads operate in cooperative mode, meaning each thread must voluntarily yield control to allow other threads access to the CPU. Additionally, most I/O operations are asynchronous. The SQLOS is also responsible for signaling threads when their I/O operations are completed.
  2. Synchronization: SQLOS manages thread synchronizations.
  3. Memory management: Different components within SQL Server, example plan cache, CLR, lock manager etc request memory from the SQLOS. Therefore, the SQLOS can control how much memory a component within SQL Server is consuming.
  4. Deadlock detection and management of the same.
  5. Exception handling framework.
  6. Hosting services for external components such as CLR and MDAC. SQL Server will run threads that are associated with external component in preemptive mode. Preemptive mode allows the SQLOS to prevents runaway threads (threads which will not yield and allow other threads to get CPU execution). Also the SQLOS can keep track of the memory these external components consume. For example, for CLR the SQLOS can invoke garbage collection if the CLR process is taking up too much memory.

Final Words

As an enthusiastic SQL Server learner, hopefully you enjoy the SQL Server architecture.

Pingbacks and trackbacks (1)+

Add comment