MSSQLWIKI

Karthick P.K on SQL Server

Posts Tagged ‘sql server performance’

SQL Server Parameter sniffing

Posted by Karthick P.K on October 8, 2012

When a stored procedure, prepared queries and queries submitted via sp_executesql  is compiled for the first time, the values of the parameters supplied with the execution call are used for cardinality estimation, to optimize the statements within that stored procedure and create the query plan. This is known as parameter sniffing because the optimizer sniffs the current parameter value during compilation.

If these values are typical and the data distribution is even in the underlying tables, all the calls to the stored procedure will benefit from this query plan since the plan is reused. However, parameter sniffing can cause problems if the "sniffed" parameter value is not typical of the values which are actually used during a typical execution or the data in underlying tables are very skewed, because plan generated for “sniffed” parameter value may not be optimal for current parameter passed and since the plan is reused there can be performance degradation.

Consider the following scenario we have a table with two columns (country and some column ). This table has 10001 rows.  10000 rows has USA in country column and 1 row has brazil in country column.

 

This table has NONCLUSTERED INDEX called NC on country column.

 

 

create table data(country char(10),somecolumn char(10))

go

insert into data values (‘BRAZIL’,‘somedata’)

go

insert into data values (‘USA’,‘somedata’)

go 10000

 

CREATE NONCLUSTERED INDEX [NC] ON [dbo].[data]

(

[country] ASC

)WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

GO

 

create proc sniffing @p1 char(10)

as

begin

select country,somecolumn from data where country=@p1

end

Go

 

 

 

–Let us execute stored procedure sniffing with the with parameter brazil.

 

exec sniffing ‘BRAZIL’

go

 

clip_image002

 

 

 

 

Optimizer picked up Index-seek in Non-clustered index and Row-ID lookup on table.

 

What happens when we execute the same procedure with parameter ‘USA’. Since the plan is already created and cached for ‘BRAZIL’ it is reused and plan which is generated for BRAZIL is Not an optimal plan for parameter USA.

exec sniffing ‘USA’ 

go

 

 

 

clip_image004

 

 

How to identify if the optimizer is using plan which compiled for sniffed parameter values  and not the current parameters value.

 

Let us enable statistics xml on

 

 

set statistics xml on

 

exec sniffing ‘USA’ 

go

 

Look at the XML plan for the ParameterCompiledValue and ParameterRuntimeValue.

 

Below is extract from XML plan and this output proves that the plan is compiled for parameter BRAZIL (ParameterCompiledValue) and it is used for parameter USA (ParameterRuntimeValue)

{

<ColumnReferenceColumn="@p1"ParameterCompiledValue="‘BRAZIL    ‘"ParameterRuntimeValue="‘USA       ‘" />

}

 

 

 

We will also see a huge difference in estimated and actual rows count if the parameter sniffing is impacting the plan

(Remember out dated stats can also cause optimizer to estimate incorrect rows so difference in estimate and actual rows doesn’t mean it is because of parameter sniffing ). 

 

What would have been the optimal plan if the parameter ‘USA’?

 

 

Let us execute the same procedure with recompile option

 

 

exec sniffing ‘USA’ with recompile

go

 

 

clip_image006

 

 

 

 

How to fix Parameter sniffing?

 

1. USE RECOMPILE: when you create the stored procedure. so the parameter is compiled every time it is called. This method can be used if the compile time is very less compared to execution time of bad plan

   Ex: create proc sniffing @p1 char(10) with recompile

 

2.  OPTION (RECOMPILE): for the statement which impacted by the parameter sniffing.  If the procedure has multiple statements recompile will impact only the particular statement.

 

3.  OPTIMIZE FOR HINT: Instructs the query optimizer to use a particular value for a local variable when the query is compiled and optimized (or)  OPTIMIZE FOR  UNKNOWN WHICH Instructs the query optimizer to use statistical data

 

instead of the initial values for all local variables when the query is compiled and optimized.  This value is used only during query optimization and actual values are used during execution.

{

alter proc sniffing @p1 char(10)  as
begin
select country,somecolumn from data where country=@p1
option (optimize for (@p1 = ‘USA’))

–option (optimize for (@p1 unknown))
end
}

 

4.   Assign the incoming parameter values to local variables and use the local variables in the query.  If you are in SQL Server2000 in which we don’t have OPTIMIZE FOR hint.

      Ken Henderson has blogged about it in http://blogs.msdn.com/b/khen1234/archive/2005/06/02/424228.aspx

 

5.   Trace Flag 4136 which is introduced in SQL Server 2008 R2 Cumulative Update 2, SQL Server 2008 SP1 Cumulative Update 7 and SQL Server 2005 SP3 Cumulative Update 9 introduce trace flag 4136 that can be used to disable  the "parameter sniffing" process  more details on http://support.microsoft.com/kb/980653

 

 

 

If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Posted in Optimizer, Performance, SQL General, SQL Server Engine | Tagged: , , , , , , , , , , , , , , , , , | 6 Comments »

I/O requests taking longer than 15 seconds to complete on file

Posted by Karthick P.K on August 27, 2012

Do you see warnings like one below in your SQL Server error log?

SQL Server has encountered  x occurrence(s) of I/O requests taking longer than 15 seconds to complete on file .

The OS file handle is 0x000006A4. The offset of the latest long I/O is: 0x00000

(or)

BobMgr::GetBuf: Sort Big Output Buffer write not complete after n seconds.

This indicates SQL Server I/O Bottlenecks. SQL Server performance highly relies on the Disk performance.  SQL Server I/O Bottleneck can be identified through

1. PAGEIOLATCH_xx or WRITELOG wait types in Sys.Sysprocesses and other DMV’s

2. I/O taking longer than 15 seconds in SQL Server Error log.

{

SQL Server has encountered X occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [ ] in database [IOTEST (7). The OS file handle is 0x000006A4. The offset of the latest long I/O is:
0x000001

}

3. By looking at I/O latch wait statistics in sys.dm_os_wait_stats

{

Select  wait_type,         waiting_tasks_count,         wait_time_ms  from    sys.dm_os_wait_stats where    wait_type like ‘PAGEIOLATCH%’
order by wait_type

}

4. By looking at pending I/O requests and isolating the disks,File and database in which we have I/O Bottleneck.

{

select     database_id,     file_id,     io_stall,     io_pending_ms_ticks,     scheduler_address from    sys.dm_io_virtual_file_stats(NULL, NULL)t1,         sys.dm_io_pending_io_requests as t2
where    t1.file_handle = t2.io_handle

}

Following are common reasons for I/O Bottleneck in SQL Server.

1. SQL Server is spawning more I/O requests than what I/O disk subsystem could handle.

2 . There could be an Issue with I/O subsystem (or) driver/firmware issue (or) Misconfiguration in I/O Subsystem (or) Compression and  so the Disks are performing very slow and hence SQL Server is affected.

3. Some other process on the system is saturating the disks with I/O requests. Common application includes AV Scan,System Backup Etc. So I/O requests posted by SQL Server becomes slow.

 

How to  troubleshoot?

1.  Exclude SQL Server files from antivirus scan.

2. Do not place SQL Server FILES on compressed drives.

3. Distribute SQL Server data files and transaction log files across drives.

4. If the “I/O request taking longer” warning is for tempdb , Enable trace flag 1118 and increase the tempdb data files refer:http://support.microsoft.com/kb/2154845

5. If none of the above resolves the issue collect the below perfmon counters.

 

Perfmon counters can help us in understanding “If disk is slow” or  “SQL Server is spawning more I/O then what disk could handle” or “Some other process is saturating disk with I/O”

Note:It is important to get  throughput of the disk subsystem in MB/SEC before we look at disk counters. Normally it will be more than 150 MB for SAN disk and greater 50 MB for Single disk .When you look at the perfmon counter look at Max value.

 

Avg. Disk sec/Transfer –> Time taken to perform the I/O operation

Ideal value for Disk sec/Transfer is 0.005-0.010 sec. If you consistently notice this counter is beyond 0.015 then there is a serious I/O bottleneck.

1. Look for Disk Bytes /sec when Avg. Disk sec/Transfer  is greater than 0.015. If it is below 200 MB for SAN disk and Below 50 MB for Single disk then the problem is with I/O subsystem Engage hardware vendor.

2. If the Disk Bytes /sec  is greater than  200 MB for SAN disk or greater than 50 MB for Single disk when the  Avg. Disk sec/Transfer  is greater than 0.015. Look at the Process:IO Data Bytes/Sec for the same time and identify which process is spawning I/O. If the identified process is not SQL Server involve the team which supports that process. If the  the identified process is SQL Server tune SQL Server queries which are I/O intensive by creating dropping indexes etc.

 

Disk Bytes /sec  –> Total read and write to disk per second in bytes.

Collect the values for each logical disks in which SQL Server files are placed and look at the Max value for this counter ideally it has to be greater than the throughput of the disk subsystem. If you don’t have the throughput for the disk then this value to be greater than 200MB for SAN or greater than 50 MB for single disk.

If it is below the expected value you can consider that your disks are not performing well. Involve the hardware vendor. 

Important: Value for this counter will be low when there is no I/O happening on the drives. So you have to look at the this counter during the time you see I/O warnings or When Disk sec/Transfer >0.010 for the same drive. 

 

Process:IO Data Bytes/Sec –> Total read and write to disk per second in bytes by each process.

Collect this counter for all the processes running on the server. This counter will help us understand if any other process is saturating the disk with excessive I/O.

Example: Let us consider a disk with max throughput of 250MB per second. If antivirus is spawning 200MB of I/O per second and if SQL Server data files are placed in same drive and SQL Server is spawning 150MB obviously there will be I/O waits.

 

Buffer Manager: Page Read/sec + Page Writes/sec –>Total read and write to disk per second in bytes by SQL Server process.

Note: If you are analyzing the .BLG file collected and not live perfmon focus on Maximum value for each counter don’t look at average.  

 

If (Avg. Disk sec/Transfer> ==0.015 ) and ( (Disk Bytes /sec < 150MB (For San)) or (Disk Bytes /sec < 50MB (For Local) or (Disk Bytes /sec < Speed of disk as per Vendor ))

{

There is Issue with I/O subsystem (or) driver/firmware issue (or) Misconfiguration in I/O Subsystem.

}

If (Disk sec/Transfer > ==0.015 Consistently) and ( (Disk Bytes /sec >= 150 (For San)) or (Disk Bytes /sec >= 50MB (For Local) or (Disk Bytes /sec >= Speed of disk as per Vendor ))

{

Identify the process which is posting excessive I/O request using Process:IO Data Bytes/Sec.

If ( Identified process == SQLServer.exe )

{

Identify and tune the queries which is Spawning excessive I/O.

(Reads+Writes column in profiler, Dashboard reports or sys.dm_exec_query_stats and sys.dm_exec_sql_text

can be used to identify the query). Use DTA to tune the query

}

If ( Identified process != SQLServer.exe )

{

Engage the owner of application which is spawning excessive I/O

}

}

Many thanks to Joseph Pilov from whom I learned many techniques like the one above.

 

If you liked this post, do like us on FaceBook at https://www.facebook.com/mssqlwiki and join our FaceBook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

Thank you,

Karthick P.K  |My Facebook Page |My Site| Blog space| Twitter

Posted in Performance, SQL Server Engine, SQL Server I/O | Tagged: , , , , , , | 22 Comments »

A significant part of SQL Server process memory has been paged out

Posted by Karthick P.K on June 27, 2012

When you get “A significant part of SQL Server process memory has been paged out. This may result in performance degradation.This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 2007640, committed (KB): 4594040, memory utilization: 43%.” message in SQL Server error log

Pay attention to Working set (KB), committed (KB) and memory utilization:% (Percentage of SQL Server memory in RAM) in above warning message. Above warning message is logged in SQL Server error log when working set reaches 50% or  below of the overall committed memory by SQL Server memory manager.

 

What is working set: Memory allocated by the process which is currently in RAM.

 

Committed: Total memory that is allocated by process (allocated bytes can be in RAM or Page file)

 

Working Set trimming:  Windows is moving the allocated bytes of the process from physical RAM to page file because of memory pressure. Memory pressure is most commonly caused by applications or windows components that are requesting more memory causing OS to start trimming working set of other processes to satisfy these new requests.

 

Before we step in to troubleshooting working set trimming warnings, here are few basics about how SQL Server memory management is designed to dynamically adjust the committed memory based on the amount of available memory on the system.

SQL Server uses CreateMemoryResourceNotification to create a memory resource notification object  and SQL Server Resource monitor threads calls QueryMemoryResourceNotification every time it runs to identify if there is any notification. If a low memory notification comes from Windows, SQL Server scales down its memory usage and when Windows sends the high memory notification, SQL Server Server can grow its memory usage target. Low memory notification is signaled by windows when the available physical memory drops approximately below 32 MB per 4 GB, to a maximum of 64 MB. The default level that signals a high-memory-resource notification event is three times the default low-memory value. As soon as the SQL Server resource monitor threads finds low-memory-resource notification it scales down SQL Server memory usage.

 

Why do I see “A significant part of sql server process memory has been paged out. This may result in performance degradation.” By SQL Server In spite of having above mechanism to detect the system level memory pressure and scale SQL Server memory?

 

There are couple of situations where SQL Server Process working set might be paged out by Windows despite these memory resource notification mechanism.

1.If windows is not sending the correct notifications to all listening processes at the right moment and thresholds

2.If SQL Server is not responding fast enough to the low memory resource notification from Windows

3.When low physical memory notification is received by SQL Server it will scale down its memory usage by trimming internal caches. This effect is applied for 5 seconds and then paused for 1 minute. This is to avoid any hungry/faulty application to consume all the memory making SQL Server to  scale its usage continuously. If low memory conditions still exist after 1 minute pause, the effect is applied again. So if there is physical memory pressure even after SQL Server scaling its usage for 5 seconds windows will still page out SQL Server’s working set.

4.Conditions in Windows where working sets of all processes are trimmed

5.Windows might decide to trim a certain percentage of working set of various or specific processes

 

 

We can also increase the LowMemoryThreshold value so the OS will notify applications such as SQL on low memory conditions much earlier and SQL Server can respond to memory pressure much early before the system is starving for memory.

 

How to set the LowMemoryThreshold value (in MB)?

 

In Regedit -> go to

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SessionManager\MemoryManagement

Right click on the right pane,

Select New -> select click DWORD Value -> enter LowMemoryThreshold

Double Click LowMemoryThreshold -> value (choose decimal) -> 512

System Reboot is required to take effect.

 

Default values as per MSDN:

“The default level of available memory that signals a low-memory-resource notification event is approximately 32 MB per 4 GB, to a maximum of 64 MB. The default level that signals a high-memory-resource notification event is three times the default low-memory value.”

 

 

We can use the below query to extract information about the condition of OS memory and SQL memory using a query like the following. Looking at this query, you will be able to easily determine the various indicators that would have triggered the Windows to page various processes including SQL Server. Use the following query to obtain the memory notification-related information from the XML data of the ring buffer

SELECT CONVERT (varchar(30), GETDATE(), 121) as [RunTime],
dateadd (ms, (rbf.[timestamp] - tme.ms_ticks), GETDATE()) as [Notification_Time],
cast(record as xml).value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(30)') AS [Notification_type],
cast(record as xml).value('(//Record/MemoryRecord/MemoryUtilization)[1]', 'bigint') AS [MemoryUtilization %],
cast(record as xml).value('(//Record/MemoryNode/@id)[1]', 'bigint') AS [Node Id],
cast(record as xml).value('(//Record/ResourceMonitor/IndicatorsProcess)[1]', 'int') AS [Process_Indicator],
cast(record as xml).value('(//Record/ResourceMonitor/IndicatorsSystem)[1]', 'int') AS [System_Indicator],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[1]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[2]/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[2]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@type)[1]', 'varchar(30)') AS [type],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@state)[1]', 'varchar(30)') AS [state],
cast(record as xml).value('(//Record/ResourceMonitor/Effect[3]/@reversed)[1]', 'int') AS [reserved],
cast(record as xml).value('(//Record/ResourceMonitor/Effect)[3]', 'bigint') AS [Effect],

cast(record as xml).value('(//Record/MemoryNode/ReservedMemory)[1]', 'bigint') AS [SQL_ReservedMemory_KB],
cast(record as xml).value('(//Record/MemoryNode/CommittedMemory)[1]', 'bigint') AS [SQL_CommittedMemory_KB],
cast(record as xml).value('(//Record/MemoryNode/AWEMemory)[1]', 'bigint') AS [SQL_AWEMemory],
cast(record as xml).value('(//Record/MemoryNode/SinglePagesMemory)[1]', 'bigint') AS [SinglePagesMemory],
cast(record as xml).value('(//Record/MemoryNode/MultiplePagesMemory)[1]', 'bigint') AS [MultiplePagesMemory],
cast(record as xml).value('(//Record/MemoryRecord/TotalPhysicalMemory)[1]', 'bigint') AS [TotalPhysicalMemory_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') AS [AvailablePhysicalMemory_KB],
cast(record as xml).value('(//Record/MemoryRecord/TotalPageFile)[1]', 'bigint') AS [TotalPageFile_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailablePageFile)[1]', 'bigint') AS [AvailablePageFile_KB],
cast(record as xml).value('(//Record/MemoryRecord/TotalVirtualAddressSpace)[1]', 'bigint') AS [TotalVirtualAddressSpace_KB],
cast(record as xml).value('(//Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint') AS [AvailableVirtualAddressSpace_KB],
cast(record as xml).value('(//Record/@id)[1]', 'bigint') AS [Record Id],
cast(record as xml).value('(//Record/@type)[1]', 'varchar(30)') AS [Type],
cast(record as xml).value('(//Record/@time)[1]', 'bigint') AS [Record Time],
tme.ms_ticks as [Current Time]
FROM sys.dm_os_ring_buffers rbf
cross join sys.dm_os_sys_info tme
where rbf.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR' --and cast(record as xml).value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(30)') = 'RESOURCE_MEMPHYSICAL_LOW'
ORDER BY rbf.timestamp ASC

We can use below query to to check the health of SQL Server including SQL Server working set information in past

SELECT  CONVERT (varchar(30), GETDATE(), 121) as runtime, DATEADD (ms, a.[Record Time] - sys.ms_ticks, GETDATE()) AS Notification_time,    a.* , sys.ms_ticks AS [Current Time]
FROM   (SELECT x.value('(//Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS [ProcessUtilization],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS [SystemIdle %],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/UserModeTime) [1]', 'bigint') AS [UserModeTime],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/KernelModeTime) [1]', 'bigint') AS [KernelModeTime],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/PageFaults) [1]', 'bigint') AS [PageFaults],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/WorkingSetDelta) [1]', 'bigint')/1024 AS [WorkingSetDelta],
x.value('(//Record/SchedulerMonitorEvent/SystemHealth/MemoryUtilization) [1]', 'bigint') AS [MemoryUtilization (%workingset)],
x.value('(//Record/@time)[1]', 'bigint') AS [Record Time]  FROM (SELECT CAST (record as xml) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_SCHEDULER_MONITOR') AS R(x)) a  CROSS JOIN sys.dm_os_sys_info sys ORDER BY DATEADD (ms, a.[Record Time] - sys.ms_ticks, GETDATE())
  • Process Utilization shows the percentage of overall used system CPU time that was consumed by sqlservr.exe. Process Utilization is calculated as sql_process_cpu_time/total_system_cpu_time for the current time interval.
  • SystemIdle is the percentage of time that the system’s CPU’s have been idle.
  • Page Faults value is the number of hard + soft page faults that have occurred since the last snapshot.
  • Working Set is the change in the working set size  in KB, since the last snapshot.
  • Memory Utilization is working set/committed memory which is another way of saying that the number is the percentage of the process’ committed memory that is in RAM.  The farther below 100% that this number falls, the larger the percentage of SQL memory that has been trimmed and moved to the page file.

Common Side Effects of Working set Trimming

1.  When OS starts trimming the working set of SQL Server we would see drastic performance drop, increased I/O,non-yielding Resource Monitor / scheduler dumps etc..

2.  IS-alive check failures resulting in SQL Server resource failure.

3. Resource monitor thread can start  Garbage collector  if SQLCLR is enabled on this instance of SQL. When Garbage collector  is kicked off during memory pressure all other threads in the process are suspended. So if Garbage collector is taking a long time reosurce monitor thread appears stuck and hence the non-yielding errors and dumps are generated. (Refer http://support.microsoft.com/kb/2504603)

How to troubleshoot?

1.  Capture perfmon counters (Process: Private bytes and Working set ) to determine which applications / windows component are requesting memory and causing OS to start trimming the working set of processes including SQL Server.

2.  Use This exe which will print the memory information of all the processes and system wide memory information (Global memory status) when the operating system signals  low memory notification.

 

3.  Cap the SQL Server MAX Server Memory after considering the memory required by other applications, Operating system, Drivers , SQL Server Non- Bpool allocations etc. Make sure you have adequate available physical memory even when the system is under heavy load.

 

4.  We can consider using the Lock pages in memory privilege. Remember it protects only the BPool from paging and Non-Bpool allocations can still be paged out.

If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group  https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

 

https://mssqlwiki.com/sqlwiki/sql-performance/basics-of-sql-server-memory-architecture/

https://mssqlwiki.com/sqlwiki/sql-performance/windows-2008-and-windows-2008-r2-known-issues-related-to-working-set-memory/

https://mssqlwiki.com/2012/05/18/sql-server-performance-degraded-in-32-bit-sql-server-after-i-adding-additional-ram/

https://mssqlwiki.com/sqlwiki/sql-performance/troubleshooting-sql-server-memory/

https://mssqlwiki.com/sqlwiki/sql-performance/io-bottlenecks/

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Posted in Memory, Performance, SQL Server Engine, SQL Server memory | Tagged: , , , , , , , , , , , , , , , | 29 Comments »

SQL Server performance degraded in 32-Bit SQL Server after adding additional RAM.

Posted by Karthick P.K on May 18, 2012

 

Do you know that adding additional RAM can affect the performance of SQL Server Sometimes?

I am not going to write how Optimizer can some times choose suboptimal plans when we have large amount of memory on the system but We will see how the memory which can be used by other memory clerks (aks: stolen memory) can shrink when we have large physical memory and AWE enabled.

 

If you notice  performance of 32-Bit SQL Server degraded after you added additional RAM or if you see SQL Server memory errors like ones below after adding RAM then it could be because of Large BUF structures which reduced the size of Bpool.

 

Errors:

SQL Server 2005/2008

 

Buffer Pool errors:

    BPool::Map: no remappable address found.

 

Either BPool or MemToLeave errors:

    Error: 17803 “Insufficient memory available..”

    Buffer Distribution:  Stolen=7901 Free=0 Procedures=1 Inram=201842 Dirty=0 Kept=572…

 

Extract from SQL Server memory design

{

SQL Server "User address space" is broken into two regions: MemToLeave and Buffer Pool

 

Size of MemToLeave (MTL) and Buffer Pool (BPool) is determined by SQL Server during start up as below.

 

MTL (Memory to Leave)= (Stack size * max worker threads) + Additional space to load Dll’s.

 

Stack size =512 KB per thread for 32 Bit SQL Server (904K under WOW)

 

I.e. = (256 *512 KB) + 256MB =384MB

 

Additional space to load Dll’s= 256 MB from SQLServer2000. This space is used to store COM objects, Extended stored procedure, Linked server in SQL Server process

 

Note: Additional space to load Dll’s can be modified using -g startup parameter.

 

on any machine with less than 4 processors the Maximum worker Thread’s is always 256 by default (unless we change the value using SP_configure)

 

SQL Server Buffer Pool is minimum of “Physical RAM “ or “user mode memory(2GB or 3GB) – MTL-  BUF structures”

 

BPool = Minimum (Physical memory, User address space – MTL) – BUF structures

 

}

 

 

When AWE is enabled in 32-Bit SQL Server M_pbuf (part of BUF structures) which is mentioned earlier is calculated and allocated for entire physical memory on the system . Regardless of “MAX Server Memory”   This is to adjust Max server memory without restarting SQL Server.

 

SQL Server requires 8MB to create M_pbuf for every 1GB of RAM available on the server.

 

Machine with 64 GB RAM can consume 64 (RAM) *8MB (M_pbuf for each GB) =512 MB just for the BUF array alone.

 

So the amount of BPOOL available for SQL Server is adversely affected.

 

Going back to the previous formula for BPOOL. Size of Bpool for 32-Bit SQL Server with AWE enabled and 64 GB of RAM would be.

 

BPool = Minimum (Physical memory, User address space – MTL) – BUF structures

 

BPool= Minimum (64GB, (2GB-384MB)) – BUF structures (512+ MB)

 

Bpool would approximately become 1GB.  Since size BPOOL become very small we might end up with memory errors.

 

Note:  In 32-Bit SQL Server Only data pages an index pages can be placed in AWE memory. So the memory available for other SQL Server memory objects is still limited to BPOOL and MTL.  

 

How to resolve this issue?

Remove few GB of RAM from server J if you can convince your management that removing RAM will improve performance.

(Or)

There is a startup trace flag TF 836 which you can use to indicate that BUF’s need to be allocated only for the configured max server memory setting. Enable this Trace Flag (836) and Reduce the “MAX Server Memory” of SQL Server.

(Or)

Enable /3GB. This will increase the Size of SQL Server BPOOL by 1GB providing relief to SQL Server BPOOL pressure.

Note: When the physical RAM in the system exceeds 16 GB and the /3GB switch is used, the operating system will ignore the additional RAM until the /3GB switch is removed.

 

 

 

Related posts

Troubleshooting steps for all SQL Server Memory errors

Trouble shooting working set trim “A significant part of SQL Server process memory has been paged out”

SQL Server lock pages in memory should I use it?

SQL Server memory leak

What is new in SQL Server 2012 Memory

How to set max server memory and min server memory

 

If you liked this post do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group MSSQLWIKI

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Disclaimer
The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Posted in Memory, Performance, SQL General, SQL Server Engine | Tagged: , , , , , , , | 6 Comments »