MSSQLWIKI

Karthick P.K on SQL Server


SQL Server Operating system (SOS) – Series 3

Posted by Karthick P.K on February 11, 2013

Thread synchronization

When we discussed threads, I mentioned that in multi-threaded applications each thread has to synchronize its activities with the other threads. Sometimes a thread has to wait for another thread to complete before it can execute (Ex: SQL Server blocking), and sometimes a thread has to synchronize with another thread and then continue execution (Ex: CXPACKET). If we allow multiple threads to access the same resource without synchronization, the resource can end up corrupted or inconsistent.

Windows offers different ways to synchronize multiple threads. Before we jump into the different synchronization techniques, let us see why thread synchronization is so important using the small program below.

In the program below I have declared two globals called a and b and initialized both to 0. The number of threads we are going to create is defined in a global named Threadcount (64).

We create 64 threads in the main function, and each thread starts executing a function called Submain. In Submain each thread increments the values of a and b 1000 times.

So ideally the values of a and b should both be 64,000 at the end of program execution (64 threads * 1000 increments).

Let us check what happens.

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

long a = 0;               // incremented with no synchronization
long b = 0;               // incremented under the g_InUse spinlock
long g_InUse = FALSE;     // spinlock flag guarding b
int  Threadcount = 64;
int  s = Threadcount;
bool d = FALSE;

DWORD WINAPI Submain(LPVOID x)
{
    for (int L = 0; L < 1000; L++)
    {
        a = a + 1;        // non-atomic read-modify-write: concurrent updates can be lost

        // Spin until we flip g_InUse from FALSE to TRUE; only one thread wins at a time.
        while (InterlockedExchange(&g_InUse, TRUE) == TRUE)
        {
            Sleep(0);
        }
        //Sleep(10); //-->How Spinlock can cause CPU Spike
        b = b + 1;        // protected increment

        InterlockedExchange(&g_InUse, FALSE);   // release the spinlock
    }

    /*
    s = s - 1;  // Simple synchronization technique. May be useful if you like to increase the
                // thread count beyond MAXIMUM_WAIT_OBJECTS (64) supported by WaitForMultipleObjects.
    if (s == 0)
    {
        d = TRUE;
    }
    */
    return 0;
}

int main()
{
    HANDLE *hThreads = new HANDLE[Threadcount];

    for (int i = 0; i < Threadcount; i++)
    {
        hThreads[i] = CreateThread(NULL, 0, Submain, NULL, 0, NULL);
        if (hThreads[i] == NULL)
        {
            printf("\nThread creation failed for thread %d with error %lu", i, GetLastError());
        }
    }
    SetLastError(0);

    // Threadcount must not exceed MAXIMUM_WAIT_OBJECTS (64) for a single call.
    DWORD rw = WaitForMultipleObjects(Threadcount, hThreads, TRUE, INFINITE);

    //while(!d); //Simple synchronization technique

    printf("Value of a is:%ld\n", a);
    printf("Value of b is:%ld\n", b);
    delete[] hThreads;
    system("pause");
    return 0;
}

 
(Screenshot of the program output: b reaches 64,000 while a falls short.)

Why are the values of a and b different, and why is b accurate while a is not? If you look at the program closely, atomic access to global b is guaranteed using the InterlockedExchange function, so only one thread can modify b at any time, while there was no synchronization for global a, so its end value is incorrect.

Thread synchronization can be achieved in user mode or by using kernel objects.

User mode thread synchronization: Threads can be synchronized in user mode using the interlocked family of functions or using critical sections. User mode thread synchronization is faster than using kernel objects. In the above program we used an interlocked function to synchronize the threads’ access to global b. Spinning on a lock built with the interlocked functions should be used with caution on multiprocessor systems and avoided on uniprocessor machines, where the spinning thread only burns its quantum while the owning thread cannot run to release the resource.
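
As a minimal sketch (not part of the original program), the unsynchronized global a could also be protected without any lock at all by using another member of the interlocked family, InterlockedIncrement, which performs the whole read-modify-write as a single atomic operation:

#include <windows.h>
#include <stdio.h>

volatile LONG a = 0;                     // shared counter

DWORD WINAPI Submain(LPVOID)
{
    for (int L = 0; L < 1000; L++)
    {
        InterlockedIncrement(&a);        // atomic increment, no spinning or kernel wait
    }
    return 0;
}

int main()
{
    HANDLE hThreads[64];
    for (int i = 0; i < 64; i++)
        hThreads[i] = CreateThread(NULL, 0, Submain, NULL, 0, NULL);

    WaitForMultipleObjects(64, hThreads, TRUE, INFINITE);
    printf("Value of a is:%ld\n", a);    // 64,000 on every run
    return 0;
}

With this change every increment is preserved, so a reaches 64,000 on every run.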

Spinlock: A method by which we continuously check whether the resource is available. In the above program, globals a and b are the resources. Since we guaranteed atomic access to global b, only one thread can access it at any time; the other threads continuously spin, checking whether the resource has become available. Look at the portion of the above program shown below. The while loop checks the value of g_InUse. If the value is FALSE, the resource was not in use; the calling thread sets the value to TRUE so other threads cannot access it, and continues execution. Once it completes its task, i.e. incrementing the value of b, it sets the value of g_InUse back to FALSE so other threads can access it. If the value is TRUE, some other thread is currently using the global resource b, and the while loop continues to spin.

        // Spin until we flip g_InUse from FALSE to TRUE.
        while (InterlockedExchange(&g_InUse, TRUE) == TRUE)
        {
            Sleep(0);
        }
        //Sleep(10); //-->How Spinlock can cause CPU Spike
        b = b + 1;

        InterlockedExchange(&g_InUse, FALSE);

 

Incorrect use of spinlocks can waste CPU and cause extreme CPU spikes. In the above program, uncomment the line “Sleep(10); //-->How Spinlock can cause CPU Spike”, build the exe and execute it. Look at your Task Manager and check the CPU utilization. It will be extremely high because each time a thread takes the lock on g_InUse it sleeps for 10 milliseconds, increments the value of b and then releases the lock, while the other threads continuously spin checking whether the resource is available, which causes the CPU spike. In real life a thread may not sleep after taking a lock, but assume it is performing some task which takes time: the other threads will keep spinning and consuming CPU.

Critical section: Like the interlocked family of functions, a critical section is also used to guarantee atomic access to a resource. The major difference between the interlocked functions and a critical section is that when the critical section is owned by another thread, the calling thread is immediately placed in a wait state, so the thread transitions from user mode to kernel mode, and this transition is expensive (about 1,000 CPU cycles, per Jeffrey Richter). When the thread which owns the critical section releases it, one of the waiting threads is signaled and scheduled. Programmers should make a wise decision on when to use interlocked functions and when to use critical sections. The above program caused a severe CPU spike after we uncommented the line “Sleep(10); //-->How Spinlock can cause CPU Spike”; let us do the same implementation using a critical section in the program below.

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

long a = 0;
long b = 0;
int  Threadcount = 64;
int  s = Threadcount;
bool d = FALSE;
CRITICAL_SECTION gcs;     // guards access to b

DWORD WINAPI Submain(LPVOID x)
{
    for (int L = 0; L < 1000; L++)
    {
        a = a + 1;                    // still unsynchronized

        EnterCriticalSection(&gcs);   // waiters sleep in the kernel instead of spinning
        Sleep(10);                    // same 10 ms hold time that caused the CPU spike with the spinlock
        b = b + 1;
        LeaveCriticalSection(&gcs);
    }

    /*
    s = s - 1;  // Simple synchronization technique. May be useful if you like to increase the
                // thread count beyond MAXIMUM_WAIT_OBJECTS (64) supported by WaitForMultipleObjects.
    if (s == 0)
    {
        d = TRUE;
    }
    */
    return 0;
}

int main()
{
    HANDLE *hThreads = new HANDLE[Threadcount];
    InitializeCriticalSection(&gcs);

    for (int i = 0; i < Threadcount; i++)
    {
        hThreads[i] = CreateThread(NULL, 0, Submain, NULL, 0, NULL);
        if (hThreads[i] == NULL)
        {
            printf("\nThread creation failed for thread %d with error %lu", i, GetLastError());
        }
    }
    SetLastError(0);

    DWORD rw = WaitForMultipleObjects(Threadcount, hThreads, TRUE, INFINITE);
    DeleteCriticalSection(&gcs);

    //while(!d); //Simple synchronization technique

    printf("Value of a is:%ld\n", a);
    printf("Value of b is:%ld\n", b);
    delete[] hThreads;
    system("pause");
    return 0;
}

 

After building the above program, run the executable and you will notice that it doesn’t consume high CPU. Does this mean a critical section is better than the interlocked functions? No, it depends. In this exe the lock is held for a long time, so a critical section was ideal. Assume each thread would have got access to the resource after spinning once or twice: then the interlocked functions would definitely have been the better choice, because we would have avoided the expensive transition of each thread from user mode to kernel mode. There is also an API called InitializeCriticalSectionAndSpinCount. What is the difference between InitializeCriticalSection and InitializeCriticalSectionAndSpinCount? InitializeCriticalSectionAndSpinCount spins trying to acquire the resource n number of times, and only if all those attempts fail does the thread transition to kernel mode.
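
As a small sketch of the difference, the critical-section program above could be initialized with InitializeCriticalSectionAndSpinCount instead; the spin count of 4000 below is only an illustrative value, not a recommendation from this post:

#include <windows.h>
#include <stdio.h>

long b = 0;
CRITICAL_SECTION gcs;

DWORD WINAPI Submain(LPVOID)
{
    for (int L = 0; L < 1000; L++)
    {
        EnterCriticalSection(&gcs);   // spins up to the spin count first, waits in the kernel only if still owned
        b = b + 1;
        LeaveCriticalSection(&gcs);
    }
    return 0;
}

int main()
{
    // 4000 is an illustrative spin count; pick it by measuring how long the lock is typically held.
    InitializeCriticalSectionAndSpinCount(&gcs, 4000);

    HANDLE hThreads[64];
    for (int i = 0; i < 64; i++)
        hThreads[i] = CreateThread(NULL, 0, Submain, NULL, 0, NULL);

    WaitForMultipleObjects(64, hThreads, TRUE, INFINITE);
    DeleteCriticalSection(&gcs);
    printf("Value of b is:%ld\n", b);
    return 0;
}

When the lock is held only briefly, most acquisitions succeed during the spin and the expensive user-to-kernel transition is avoided; when the lock is held for a long time, the thread still falls back to a kernel wait instead of burning CPU.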

Thread deadlock: Similar to SQL Server lock deadlocks, what happens when two threads each wait to acquire a critical section owned by the other? If there is no timeout, the threads will wait forever and will never get scheduled. In SQL Server we have the deadlock monitor to detect this condition and roll back one of the transactions, but Windows doesn’t offer any such facility for critical sections.
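
A minimal sketch of such a deadlock (hypothetical code, not from an actual SQL Server component): two threads each acquire one critical section and then block forever waiting for the one the other thread owns.

#include <windows.h>

CRITICAL_SECTION csA, csB;

DWORD WINAPI Worker1(LPVOID)
{
    EnterCriticalSection(&csA);
    Sleep(100);                      // widen the window so both threads hold one lock
    EnterCriticalSection(&csB);      // waits forever: Worker2 owns csB and wants csA
    LeaveCriticalSection(&csB);
    LeaveCriticalSection(&csA);
    return 0;
}

DWORD WINAPI Worker2(LPVOID)
{
    EnterCriticalSection(&csB);
    Sleep(100);
    EnterCriticalSection(&csA);      // waits forever: Worker1 owns csA and wants csB
    LeaveCriticalSection(&csA);
    LeaveCriticalSection(&csB);
    return 0;
}

int main()
{
    InitializeCriticalSection(&csA);
    InitializeCriticalSection(&csB);

    HANDLE h[2];
    h[0] = CreateThread(NULL, 0, Worker1, NULL, 0, NULL);
    h[1] = CreateThread(NULL, 0, Worker2, NULL, 0, NULL);

    // Never returns: both workers are deadlocked. Acquiring the two locks in the
    // same order in every thread is the usual way to avoid this.
    WaitForMultipleObjects(2, h, TRUE, INFINITE);
    return 0;
}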

Orphaned or unreleased critical section: When a thread takes a critical section, it is expected to release it. Assume a flaw in the code or an exception caused a thread to abort after taking a critical section and before releasing it: the critical section taken by the terminated thread is never released, and all the other threads will wait on it indefinitely.
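
One common defensive pattern (again only a sketch, not something from the original post) is to pair EnterCriticalSection with LeaveCriticalSection in a __try/__finally block, so the lock is released even if the guarded code fails part-way. Note that this does not help if the thread is killed outright, for example with TerminateThread.

#include <windows.h>

CRITICAL_SECTION gcs;
long b = 0;

void DoWork()
{
    EnterCriticalSection(&gcs);
    __try
    {
        b = b + 1;                   // guarded work; may fail or return early
    }
    __finally
    {
        LeaveCriticalSection(&gcs);  // always runs, so the lock is never left orphaned
    }
}

int main()
{
    InitializeCriticalSection(&gcs);
    DoWork();
    DeleteCriticalSection(&gcs);
    return 0;
}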

If you liked this post do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group MSSQLWIKI

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Disclaimer
The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties and confer no rights.

Posted in SQL Server Engine, SQLServer SOS | 2 Comments »

SQL Server Operating system (SOS) – Series 1

Posted by Karthick P.K on January 10, 2013

Before we start studying SQLOS, let us recollect some of the basic OS concepts.

What is a process?

A process is an instance of a service or application that is running. Each process has an address space that contains the executable, DLLs, data, thread stacks, etc. The operating system maintains a kernel object for each process to manage it. One or more threads run under the context of the process and execute the code in its address space. Each thread executes code and maintains its own set of CPU registers and its own stack. When a process is created, the primary thread of the process is created and starts executing main() or a similar entry-point function. The primary thread can create additional threads using the CreateThread function, whose lpStartAddress parameter (the thread entry point) defines the function that the newly created thread will execute.

 

 

Ex: Ureadfile will be the thread entry point for the new thread created by the code below.

CreateThread(0, 0, (LPTHREAD_START_ROUTINE)Ureadfile, (LPVOID)&PSUreadfile[i], 0, NULL);
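
Since PSUreadfile is not shown here, below is a self-contained sketch of the same idea; the Ureadfile routine and its parameter are hypothetical stand-ins, not an actual SQL Server component:

#include <windows.h>
#include <stdio.h>

// Hypothetical thread entry point: lpStartAddress in CreateThread points here.
DWORD WINAPI Ureadfile(LPVOID param)
{
    int fileNo = *(int *)param;
    printf("Thread %lu processing file %d\n", GetCurrentThreadId(), fileNo);
    return 0;
}

int main()
{
    int fileNo = 1;
    HANDLE hThread = CreateThread(NULL, 0, Ureadfile, &fileNo, 0, NULL);
    if (hThread != NULL)
    {
        WaitForSingleObject(hThread, INFINITE);   // primary thread waits for the worker
        CloseHandle(hThread);
    }
    return 0;
}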

 

 

What is a thread?

Threads execute the code in a process. Every process can have one (single-threaded) or more (multi-threaded) threads. In multi-threaded applications each thread has to synchronize its activities with the other threads. Ex: Allowing one thread to modify a global while another is reading it can cause race conditions.

SQL Server uses synchronization techniques like spinlocks, latches, events, etc.
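
As an illustration of the event primitive at the Windows API level (this is not how SQL Server implements its internal events), one thread can sleep on an event without consuming CPU until another thread signals it:

#include <windows.h>
#include <stdio.h>

HANDLE hEvent;

DWORD WINAPI Worker(LPVOID)
{
    printf("Worker: waiting for the event...\n");
    WaitForSingleObject(hEvent, INFINITE);   // sleeps without consuming CPU
    printf("Worker: signaled, continuing\n");
    return 0;
}

int main()
{
    // Auto-reset event, initially non-signaled.
    hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

    HANDLE hThread = CreateThread(NULL, 0, Worker, NULL, 0, NULL);
    Sleep(1000);                             // simulate some work on the main thread
    SetEvent(hEvent);                        // wake the waiting worker

    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
    CloseHandle(hEvent);
    return 0;
}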

   

Thread states

Threads can be in the core states below (there are other states, which we will discuss as needed).

 

Waiting

The wait state indicates the thread is waiting for some resource. A thread in this state is not eligible to be scheduled by the OS.

SQL Server threads can enter a wait state in multiple places. A thread requesting a lock has to wait until the lock is available; it goes to sleep and is signaled when the lock becomes available.

A thread can call WaitForSingleObject and wait without competing for CPU.

 

Ex:

HANDLE LMHandle = CreateMemoryResourceNotification(LowMemoryResourceNotification);
WaitForSingleObject(LMHandle, INFINITE);

In the above example the thread will wait until Windows raises a low-memory resource notification.

Running

The thread is running on a CPU.

 

Ready

The thread is ready to run and is waiting for its CPU slice. In SQL Server, threads which are ready to run on a scheduler stay in the scheduler’s runnable list until they get a chance to run.

Quantum
Every thread that executes in the operating system gets a time slice to run on the CPU, called a quantum. A thread is switched off the scheduler after its quantum is completed.

 

Scheduling

 

Preemptive scheduling:

The operating system can interrupt thread execution at any time: it can halt the running thread and schedule another thread to run.

 

Non-Preemptive scheduling:

The operating system cannot interrupt thread execution at an arbitrary time. The worker owns the scheduler until it yields to another worker on the same CPU. If the thread running on the CPU (scheduler) doesn’t yield in time, it monopolizes the CPU until it finishes.

Windows 3.x and DOS used non-preemptive scheduling. In non-preemptive scheduling, context switching is generally reduced because the operating system does not interrupt code execution, and it is easier to implement a multi-threaded application because synchronization may be less of an issue. However, a badly behaved application can easily ‘hang’ the entire system if one of its threads does not yield the CPU to allow other applications’ threads to execute.
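
To illustrate only the cooperative idea (Windows itself still schedules preemptively, and this is not how SQLOS is actually implemented), a worker can voluntarily give up the CPU at regular points; if it never yielded, other cooperative workers bound to the same scheduler would starve until it finished:

#include <windows.h>

// Sketch of a cooperative worker: it decides when to give up the CPU.
DWORD WINAPI CooperativeWorker(LPVOID)
{
    for (int i = 0; i < 1000000; i++)
    {
        // ... do a small unit of work ...

        if (i % 1000 == 0)
        {
            SwitchToThread();   // voluntary yield: lets another ready thread run
        }
    }
    return 0;
}

int main()
{
    HANDLE h = CreateThread(NULL, 0, CooperativeWorker, NULL, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}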

 

If you liked this post do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group MSSQLWIKI

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Disclaimer
The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties and confer no rights.

Posted in SQL Server Engine, SQLServer SOS | 2 Comments »