How Buffer Manager reads the data from database files

What is a logical read in SQL Server ?

When Query engine processor requests page from Buffer cache

What is a Physical read in SQL server?

When a requested page is not available in Buffer pool a Physical read then  reads the page from physical disk and copies into the Buffer cache

When there is a request to access the data the most effective to access the data is controlled by the relational engine(table scan or index scan) and once it is determined

then the Buffer manager in the storage engine optimizes the best way to get the data

Read-ahead

  • The Database Engine supports a performance optimization mechanism called read-ahead
  • Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query, so that processor starts utilizing the data and IO will be still reading the data , so CPU and IO both work simultaneously.
  • The read-ahead mechanism allows the Database Engine to read up to 64 contiguous pages (512KB) from one file
  • The read is performed such way that that IO can read non-contiguous pages still write in to one buffer or IO can still read from multiple buffer and write into a single stream of data this is called  scatter/gather I/O

Reading Data Pages

Table scans used to read data pages are very efficient in the Database Engine. The index allocation map (IAM) pages in a SQL Server database list the extents used by a table or index. The storage engine can read the IAM to build a sorted list of the disk addresses that must be read. This allows the storage engine to optimize its I/Os as large sequential reads that are performed in sequence, based on their location on the disk

Reading Index Pages

  Clustered Indexes

The storage engine uses the information in the intermediate index page above the leaf level to schedule serial read-aheads.

Non- Clustered Indexes

The storage engine uses prefetching to speed base table lookups from nonclustered indexes. The leaf rows of a nonclustered index contain pointers to the data rows that contain each specific key value. As the storage engine reads through the leaf pages of the nonclustered index, it also starts scheduling asynchronous reads for the data rows whose pointers have already been retrieved. This allows the storage engine to retrieve data rows from the underlying table before it has completed the scan of the nonclustered index. Prefetching is used regardless of whether the table has a clustered index

 

Advanced scanning:

 

the advanced scan feature allows multiple tasks to share full table scans. If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the Database Engine detects that the table is already being scanned for another execution plan, the Database Engine joins the second scan to the first, at the current location of the second scan. The Database Engine reads each page one time and passes the rows from each page to both execution plans. This continues until the end of the table is reached.

At that point, the first execution plan has the complete results of a scan, but the second execution plan must still retrieve the data pages that were read before it joined the in-progress scan. The scan for the second execution plan then wraps back to the first data page of the table and scans forward to where it joined the first scan. Any number of scans can be combined like this. The Database Engine will keep looping through the data pages until it has completed all the scans. This mechanism is also called “merry-go-round scanning” and demonstrates why the order of the results returned from a SELECT statement cannot be guaranteed without an ORDER BY clause.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s