MSSQLWIKI

Karthick P.K on SQL Server

Archive for the ‘Configuration’ Category

The connection to the primary replica is not active. The command cannot be processed

Posted by Karthick P.K on June 20, 2013

When you configure SQL Server always on available group from management studio it may fail with below error while joining secondary replica to the availability group.

 

Error 1

 

{

Joining database on secondary replica resulted in an error.  (Microsoft.SqlServer.Management.HadrTasks)

——————————

ADDITIONAL INFORMATION:

Failed to join the database ‘AG’ to the availability group ‘AG1’ on the availability replica ‘NODE2’. (Microsoft.SqlServer.Smo)

An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)

——————————

The connection to the primary replica is not active.  The command cannot be processed. (Microsoft SQL Server, Error: 35250)

}

 

Error 2

 

{

TITLE: Microsoft SQL Server Management Studio

——————————

Failed to join the instance ‘NODE2’ to the availability group ‘AG1’. (Microsoft.SqlServer.Management.SDK.TaskForms)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=11.0.2100.60+((SQL11_RTM).120210-1917+)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&LinkId=20476

——————————

ADDITIONAL INFORMATION:

Failed to join local availability replica to availability group ‘AG1’.  The operation encountered SQL Server error 41106 and has been rolled back.  Check the SQL Server error log for more details.  When the cause of the error has been resolved, retry the ALTER AVAILABILITY GROUP JOIN command. (Microsoft SQL Server, Error: 41158)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=11.00.2100&EvtSrc=MSSQLServer&EvtID=41158&LinkId=20476

}

 

 

You may get below error when you configure AG availability group using  alter database command mentioned below or synchronization might fail with 35250 error mentioned below.

 

ALTER DATABASE [AG] SET HADR AVAILABILITY GROUP = [Group name];

 

Error 1

 

Msg 35250, Level 16, State 7, Line 1

The connection to the primary replica is not active.  The command cannot be processed.

 

 

To resolve  above errors

 

1. Ensure always on endpoint ([Hadr_endpoint]) are not blocked by firewall (Default port 5022).

 

2. Make sure startup account of primary server is added to all secondary server’s and Startup accounts of all secondary servers are added to primary servers.(Startup account of each replica to be added to other replica’s)

 

3. If log on account of SQL Server is “Nt service\” or local system account then ensure system account (Domainname\systemname$) of each replica is added to other replicas.

{

CREATE LOGIN [MSSQLWIKI\node2$] FROM WINDOWS

}

 

4. Grant connect on always on endpoints created on each replicas for startup account of other replica servers (Grant connect on endpoints even if startup account of other replicas are added as sysadmins).

{

GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [MSSQLWIKI\node1$]

}

 

5.  Make sure SQL Server name (select @@servername) matches with hostname.

6. Make sure cluster service startup account is part of SQL Server logins (More details in This link).

 

 

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Disclaimer:

The views expressed on this website/blog are mine alone and do not reflect the views of my company or anyone else. All postings on this blog are provided “AS IS” with no warranties, and confers no rights

 

Posted in Always On, Configuration, Connectivity, Security, SQL General | Tagged: , , , , , , | 28 Comments »

SQL Server fails to start with error "Failed allocate pages: FAIL_PAGE_ALLOCATION 1" During startup

Posted by Karthick P.K on January 6, 2013

SQL Server fails to start and If you look at the SQL Server Error log you will find "Failed allocate pages: FAIL_PAGE_ALLOCATION" and SQL Server generating exception dump. Similar to the SQL Server error log below.

 

Note: This blog is applicable when you out get of memory error during startup (or) with event ID: 2019 in system event log. For general troubleshooting of SQL Server out of memory errors follow steps in Troubleshooting SQLServer Memory

{

2013-01-02 12:31:20.91 Server      Microsoft SQL Server 2008 R2 (SP1) – 10.50.2500.0 (Intel X86)

                Jun 17 2011 00:57:23

                Copyright (c) Microsoft Corporation

                Enterprise Edition on Windows NT 5.2 <X86> (Build 3790: Service Pack 2)

2013-01-02 12:31:20.91 Server      (c) Microsoft Corporation.

2013-01-02 12:31:20.91 Server      All rights reserved.

2013-01-02 12:31:20.91 Server      Server process ID is 1583.

2013-01-02 12:31:20.91 Server      Authentication mode is MIXED.

2013-01-02 12:31:20.91 Server      Logging SQL Server messages in file ‘C:\Microsoft SQL Server\MSSQL10_50.MSSQLWIKIServer\MSSQL\Log\ERRORLOG’.

2013-01-02 12:31:20.91 Server      This instance of SQL Server last reported using a process ID of 9240 at 1/3/2013 7:31:20 PM (local) 1/4/2013 12:31:20 AM (UTC). This is an informational message only; no user action is required.

2013-01-02 12:31:20.91 Server      Registry startup parameters:

                 -d C:\Microsoft SQL Server\MSSQL10_50.MSSQLWIKIServer\MSSQL\DATA\master.mdf

                -e C:\Microsoft SQL Server\MSSQL10_50.MSSQLWIKIServer\MSSQL\Log\ERRORLOG

                -l C:\Microsoft SQL Server\MSSQL10_50.MSSQLWIKIServer\MSSQL\DATA\mastlog.ldf

2013-01-02 12:31:20.92 Server      SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.

2013-01-02 12:31:20.92 Server      Detected 24 CPUs. This is an informational message; no user action is required.

2013-01-02 12:31:20.94 Server      Address Windowing Extensions is enabled. This is an informational message only; no user action is required.

2013-01-02 12:31:27.33 Server       Failed allocate pages: FAIL_PAGE_ALLOCATION 1

2013-01-02 12:31:27.33 Server     

Memory Manager                                   KB

—————————————- ———-

VM Reserved                                 1534584

VM Committed                                  51576

AWE Allocated                                     0

Reserved Memory                                1024

Reserved Memory In Use                            0

2013-01-02 12:31:27.33 Server      Error: 17311, Severity: 16, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.

2013-01-02 12:31:27.33 Server      Using ‘dbghelp.dll’ version ‘4.0.5’

2013-01-02 12:31:27.34 Server      **Dump thread – spid = 0, EC = 0x00000000

2013-01-02 12:31:27.34 Server      ***Stack Dump being sent to C:\Microsoft SQL Server\MSSQL10_50.MSSQLWIKIServer\MSSQL\LOG\SQLDump0008.txt

2013-01-02 12:31:27.34 Server      * *******************************************************************************

2013-01-02 12:31:27.34 Server      *

2013-01-02 12:31:27.34 Server      * BEGIN STACK DUMP:

2013-01-02 12:31:27.34 Server      *   01/03/13 19:31:27 spid 4344

2013-01-02 12:31:27.34 Server      *

2013-01-02 12:31:27.34 Server      * ex_handle_except encountered exception C0000005 – Server terminating

 

}

 

 

Why would SQL Server fail with out of memory error (FAIL_PAGE_ALLOCATION)during the startup? Only possible reason that I could think of is Paged or NonPaged pool is empty.

How to prove if my Paged / NonPaged pool is empty?  Look at the system event log for the Event ID: 2019

 

You will find error in system event log similar to one you see below.

{

Event Type:        Error

Event Source:    Srv

Event Category:                None

Event ID:              2019

Date:                     2013-01-02

Time:                     12:31:00 PM

User:                     N/A

Computer:          MSSQLWIKIServer

Description:

The server was unable to allocate from the system nonpaged pool because the pool was empty.

}

 

Above error indicates nonpaged pool is empty, When Nonpaged pool is empty every application would fail. How to identify who is consuming Nonpaged pool?

 

Use poolmon.exe from windows support tools. (Steps are documented in This KB).

If you r OS is windows 2003 or above you can simple run the exe from command prompt and identify who is consuming (Leaking J) space in Paged / NonPaged pool.

Below is sample output of poolmon.exe which I collected from my test system

 

clip_image002[4]

 

Memory consumption by each tag is printed in above output. After finding the tag which is leaking the memory (Highest bytes)identify the Driver which is using the tag by using  find command or strings utility from sysinternals (search for TAG in drivers folder %Systemroot%\System32\Drivers). Once you identify the driver, check if there are any known issue with the driver or you may have to contact the vendor of the driver to identify why the driver is consuming large amount of pooled /Non-pooled memory.

 

If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki , join our Facebook group MSSQLWIKI and post your SQL Server questions to get answered by experts.

 

 

Thank you,

Karthick P.K |

 My Facebook Page |My Site| Blog space| Twitter

 

Disclaimer

The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Posted in Configuration, Memory, SQL Server Engine, SQL Server memory, Startup failures | Tagged: , , , , | 26 Comments »

Troubleshooting SQL Server high CPU usage

Posted by Karthick P.K on October 4, 2012

Troubleshooting SQL Server high CPU usage

 

First thing to determine when there is High CPU on systems is, if SQL server is consuming the CPU resource or other applications/service.

 

Use query in  THIS LINK to get CPU usage history (or) Task manager (or) Perfmon counter to determine that. In Perfmon, Process %Process time can also be used. Remember this counter is not based on 100%.  It is based on number of processor.  If you see 200 for sqlservr.exe and the system has 8 CPU, CPU consumed by sqlservr.exe is 200 out of 800 (only 25%).)

 

If the CPU spike is caused by other application involve application team.

 

Next step is to determine if the CPU consumed is kernel time or user time.

 

We can use Process %Privileged  time and %user Time counters in perfmon. Task manager will show kernel times which will also help us understand

 

Kernel CPU:  In general, if kernel CPU remains below 10%, it’s normal.  But if you see sustained kernel CPU at 30% or above, you should start looking at system drivers , Antivirus etc.  some known issues which can increase Kernel CPU time are

1.       Few Anti-virus software’s can cause high kernel time.  Temporarily disable anti-virus software to rule this out

 

2.       We have seen high resolution timer in SQL 2008 or SQL 2005 SP3 caused high kernel time in Virtual Machines because of outdated BIOS .  Temporarily disabling high resolution timer by turning on trace flag 8038 (configure as startup parameter) to prove this. Check for BIOS update and do not use 8038 in long term.

 

High user CPU: Some of the most common causes for High CPU in SQL Server  are

1.       Query execution causing CPU spike (Most commonly caused by optimizer picking bad plan).

 

2.       High compiles and recompiles. (schema, Stats change, Use of Temp table, Recompile hint).

 

3.       System threads spiking CPU (Ghost cleanup, Lazy writer, Resource monitor).

 

4.       Running many traces.

 

 

1. Query execution causing CPU spike:

 

Query execution  takes long times and spikes CPU commonly because of in-correct cardinality estimates caused by outdated statistics, Lack of Index, Server configuration, Distributed queries, etc.

 

When the server is experiencing this problem run the query in below link to list all the queries which are executing in the server order by CPU time desc along with plan.

{

Get SQL Text and Query Plan for statements which are executing now

}

 It could be one query which is driving the majority CPU time or Multiple queries each driving the CPU. Look at the CPU time of the above query output.

 

If it is single query/Store procedure which is driving the majority of CPU.

 

1.        Update the stats of tables and indexes used by the query (If the stats are up to date Estimated rows and estimated execution will  be approximately

same in execution plan .If there is huge difference stats are out dated and requires update) .

 

2.       Identify if the query has used bad plan because of parameter sniffing (If the ParameterCompiledValue and ParameterRuntimeValue is different in XML plan). Refer THIS LINK to know more about Parameter Sniffing

 

3.        If updating the stats and fixing the parameter sniffing doesn’t resolve the issue it is more likely optimizer is not able to create efficient plan because of lack of indexes and correct statistics. Run the query which is driving the CPU in database tuning advisor and apply the recommendations. (You will find missing index detail in xml plan but DTA is more efficient). You can follow the steps in Tune queries using SQL Server Database tuning advisor .

 

4.       If the query which is spiking the CPU is linked server query try changing the security of linked server to ensure linked server user has ddl_admin or dba/sysadmin on the remote server. More details regarding the issue in THIS LINK.

 

5.       Ensure optimizer is not aborting early and creating bad plan. For details refer THIS LINK.

 

6.       Ensure the query which is spiking the CPU doesn’t have plan guides (xml plan will have PlanGuideDB attribute.  Also sys.plan_guides will have entries) and query hints(index= or (option XXX join) or inner (Join Hint) join).

 

7. Ensure that SET options are not changed.

 

If it is Multiple queries/Store procedure are driving the CPU together.

 

1.       Update the stats of all the tables and indexes in the database. Using the query in below link Rebuild index and update statistics for all the tables in database

 

2.       If updating stats doesn’t help and rebuilding the indexes doesn’t bring down the CPU we have to tune the queries 1 by 1.

 

3.       Ensure Large amount of RAM is not causing optimizer to choose inefficient plan http://support.microsoft.com/kb/2413549

 

4.       Ensure that we do not run many traces at same time (commonly from monitoring tools). Use query in below link to list all the active traces.

{

Find all the profiler traces running on SQL Server

}

 

2. If the system thread is consuming most of the CPU.

 

1.       If none of the SQL queries are consuming majority of CPU,  we can identify if the back ground threads is consuming the majority of CPU by looking at  sysprocesses output for background threads. select * from sys.sysprocesses where spid<51.

 

2.       Check if you are hitting any of the known issues.

{

Resource Monitor may consume high CPU: http://support.microsoft.com/kb/968722

The Ghost Cleanup task uses 100% of the CPU on an idle system in SQL Server 2008 or in SQL Server 2005: http://support.microsoft.com/?id=978430

}

 

3. High compiles and recompiles: I will blog about high compiles and recompiles shortly

                               

4. Other factors which can impact SQL Server query performance

1.       Maximum degree of parallelism. Ensure MAX DOP is set properly (you can follow the steps in How to set Max degree of parallelism (MAXDOP)

 

2.       Priority boost. (Do not enable priority boot)

 

3.       Do not enable Fiber mode.

 

4.       Tweaking affinity mask (Spikes few CPU).

 

5.       TokenAndPermUserStore. http://support.microsoft.com/kb/927396

 

6.       CPU power plan degrade the server performance http://support.microsoft.com/kb/2207548

 

7.       SQL Server that’s uses  .Net Framework can cause high CPU Refer THIS LINK

 

 

If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group

Related blogs: 

Tuning SQL Server query

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Disclaimer

The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

Posted in Configuration, Optimizer, Performance, SQL Server Engine | Tagged: , , , , , , , , , , , , , , , , , , , , | 28 Comments »

Database Mail errors in SQL Server (Troubleshooting steps)

Posted by Karthick P.K on August 25, 2012

Troubleshooting Database Mail issues in SQL Server

 

 

Use the Database Mail Configuration Wizard, change the Logging Level to Verbose and send a test mail to investigate the point of failure.

 

Right click database mail –View database mail log to see error or we can SELECT * FROM msdb.dbo.sysmail_event_log ;

 

Check the sent_Status column in the sysmail_allitems table. The four values are sent, unsent, retrying and failed.

If the status is sent and the recipients  hasn’t received the email yet, that the Database Mail external program successfully delivered the e-mail message to the SMTP server but it failed to deliver the message to the final recipient. At this point, the SMTP needs to be troubleshooted (perhaps engaged your Exchange or Mail server team)

 

If the status is unsent or retrying, it means that the Database Mail has not yet processed the e-mail message or is in the process of retrying after a failed attempt. This could be due to network conditions, volume of messages, SMTP server issues, etc. If the problem persists, use another profile or another mail host database.

 

If the status is failed, it means that the Database Mail was unable to deliver the message to the SMTP server. Check the sysmail_log table and the destination address. Also be sure that there are no Network or SMTP issues.

 

Send a test email outside SQL Server using below script or Other mail clients and check if the  recipients   are receiving mails. If they do not receive problem is outside SQL Server. Engage Exchange or other mail server teams to identify why we are not able to send emails from below script or Office outlook or Other mail clients.

Set objMessage = CreateObject("CDO.Message")
 objMessage.Subject = "Hello"
 objMessage.From = """SENDER NAME""<e-mail ID>"
 objMessage.To = "To address@mssqlwiki.com"
 objMessage.HTMLBody = "<h1><font face=arial>Hello,<br>How are you?."
 objMessage.Configuration.Fields.Item _
 ("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
 objMessage.Configuration.Fields.Item _
 ("http://schemas.microsoft.com/cdo/configuration/smtpserver") = "smtphost.dns.Mailserver.com"
 objMessage.Configuration.Fields.Item _
 ("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/smtpauthenticate") = 2
objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/smtpusessl") = False
objMessage.Configuration.Fields.Item _
("http://schemas.microsoft.com/cdo/configuration/smtpconnectiontimeout") = 60
objMessage.Configuration.Fields.Update
objMessage.Send

If the mail has successfully reached  to recipients from above script problem is with in SQL Server mail configuration. 
Verify the following

 

1.Verify if  Service Broker is enabled (select is_broker_enabled from sys.databases where name=‘MSDB’ (0 – disabled, 1- enabled).

To enable service broker on your database run the following query: ALTER DATABASE MSDB SET ENABLE_BROKER

Note: You will be required to have exclusive access to the database while running this statement.  If you do not you will get the following error message: 
Msg 5061, Level 16, State 1, Line 1. ALTER DATABASE failed because a lock could not be placed on database MSDB. Try again later.

                               Msg 5069, Level 16, State 1, Line 1

ALTER DATABASE statement failed.

You will have to stop SQL Server agent to enable broker on MSDB

2.Check if Database mail stored procedures are  enabled (Surface Area Configuration >> “Surface Area Configuration for Features” >> Under MSSQLSERVER, expand Database Engine, and then click Database Mail. >> Ensure that Enable Database Mail stored procedures is selected, and then click Apply).

 

3.Check if the user is part of DatabaseMailUserRole.

 

4.Check what parameters and values are used in configuration by running

   exec msdb..sysmail_help_configure_sp

   A list of default values are given in BOL, topic: “sysmail_help_configure_sp (Transact-SQL)”. To modify a parameter or value you can use the following stored procedure

   exec msdb..sysmail_configure_sp ‘parameter_name’, ‘parameter_value’

 Check if ReadFromConfigurationFile is enabled if yes check if the DatabaseMail90.exe.config file (The default path is < drive >\Program Files\Microsoft SQL   Server\MSSQL.1\MSSQL\Binn) and has proper parameters.

 

5.Verify that the Database Mail executable is located in the correct directory – e.g. C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn

 

6.Verify that the service account for SQL Server has permission to run the executable, DatabaseMail90.exe, which requires network access to the SMTP servers specified in Database Mail accounts. Therefore, the service account for SQL Server must have permission to access the network, and the SMTP servers must allow connections from the computer that runs SQL Server.

 

If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

 

Thank you,

Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

Posted in Configuration, Database mail, SQL General | Tagged: , | 5 Comments »

How to analyze Non-Yielding scheduler or Non-yielding IOCP Listener dumps ……

Posted by Karthick P.K on August 17, 2012

Note: If you are interested only in finding a quick resolution for  Non-Yielding scheduler or Non-yielding IOCP Listener dumps or  Non-yielding  resource monitor Jump to THIS LINK. Continue reading this article if you like to learn How to analyze Non-Yielding scheduler dumps and Non-yielding IOCP listener dumps

How to analyze Non-Yielding scheduler dumps and Non-yielding IOCP listener dumps?

This blog is targeted towards analyzing Non-Yielding scheduler dumps and not targeted on explaining how Non-Yield Detection works please read http://technet.microsoft.com/en-us/library/cc917684.aspx to understand how the Non-Yield Detection works but let us recollect few key points before we get in to analysis.

 

1.   SQL Server has its own logical schedulers to schedule the SQL Server workers.

2.   The scheduler is called the User Mode Scheduler (UMS) in SQL Server 2000 and the SQL Server Operating System (SOS) Scheduler in SQL Server 2005

3.   Logical scheduler makes the worker non-preemptive to the database engine. The worker owns the scheduler until it yields to another worker on the same scheduler.

 

What if the threads which owns the scheduler executes for long time without yielding (or) forever and does not yield to give quantum for the other threads waiting in the scheduler?

Answer: Other threads would not get CPU cycles and starve the SQL Server performance.

 

What if the thread is not able to finish its work with in quantum –(4 Milliseconds) for example large for loop?

SQL Server worker thread’s Quantum target is 4ms which means the thread(worker) is expected to yield back to SQL Server scheduler when it exceeds 4ms and rescheduled when the other threads which are already waiting in SOS Scheduler (runnable list) finishes its execution or quantum.

 

What if the thread did not yield after 4 Milliseconds?

SQL Server has its scheduler monitor to track this. SchedulerMonitor algorithm is to check non-Yield condition every 5 seconds during which the basic check (Check if the thread is executing for >4Ms) is done . When the basic check evaluates to true, tracking of the worker begins and if the thread doesn’t yield beyond 10seconds (Nonyield threshold) after the tracking begins then threshold check becomes true. So there is approximately 15 seconds between the time of the last yield on the scheduler and the time that the threshold check becomes true and tracking continues.

 

A dump is taken when an specific nonyield situation has reached 60 seconds in total duration. Once a 17883 mini-dump is captured, no further 17883 mini-dumps are captured until trace flag -T1262 is enabled or the SQL Server process is restarted. However, 17883 error message reporting continues, regardless of the mini-dump capture.  Also when –T1262 is enabled mini-dump is captured when the Non-Yield threshold check becomes true (15 seconds)

and at subsequent 60-second intervals for the same nonyield occurrence. A new nonyielding occurrence causes dump captures to occur again.

When the SQL Server decides to take the minidump on nonyield occurrence it copies the CONTEXT of the nonyielding thread to a global structure and then initiates the dump because Sometimes it is possible that by the time SQLDumper gets the dump, the non-yielding thread has already yielded. So to get the exact snapshot of the thread we need to trust on CONTEXT saved in global structure  also we can compare the current stack of the thread with the one which is copied  and check if the thread is progressing.

Non-yielding IOCP Listener

An identical algorithm is used to detect non-yielding I/O completion routines, counting completed I/O completion routines instead of number of yields. Scheduler Monitor takes a dump when it notices the IOCP has not moved for 10 seconds. Analyzing Non-yielding IOCP Listener is also same as analyzing non-yielding scheduler dump

 

 

Let us step in to analysis of  non-yielding scheduler dump which I got in SQL Server 2012

 

Sample 1

When a non-yielding scheduler dump is generated following error message is logged in SQL Error log and SQLDump000n.mdmp is generated in log folder.

 

{

    * *******************************************************************************

    *

    * BEGIN STACK DUMP:

    *   04/16/12 10:09:58 spid 6256

    *

    * Non-yielding Scheduler

    *

     * *******************************************************************************

Process 0:0:0 (0x1cb0) Worker 0x0000003054F62160 appears to be non-yielding on Scheduler 0. Thread creation time: 12979065797278. Approx Thread CPU Used: kernel 0 ms, user 0 ms. Process Utilization 0%. System Idle 97%. Interval: 70110 ms.

}

 

To analyze the dump download and Install Windows Debugger from This  link

Step 1:

Open Windbg .  Choose File menu –> select Open crash dump –>Select the Dump file (SQLDump000#.mdmp)

 

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86

Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Users\karthick \Desktop\Karthick\SQLDump0009.mdmp]

User Mini Dump File: Only registers, stack and portions of memory are available

 

Comment: ‘Stack Trace’

Comment: ‘Non-yielding Scheduler’  èType of the dump

Symbol search path is: *** Invalid ***

 

 

Executable search path is:

Windows 7 Version 7601 (Service Pack 1) MP (24 procs) Free x64

Product: Server, suite: Enterprise TerminalServer SingleUserTS  à Windows version and system information

Machine Name:

Debug session time: Mon Apr 16 09:09:59.000 2012 (GMT-7)

System Uptime: 9 days 15:57:03.155

Process Uptime: 0 days 0:06:48.000

……………………………………………………….

……………………………..

Step 2:

on command window type
.sympath srv*c:\Websymbols*http://msdl.microsoft.com/download/symbols;

Step 3:

Type .reload /f and hit enter. This will force debugger to immediately load all the symbols.

Step 4:

Verify if symbols are loaded for  SQL Server by using the debugger command lmvm

 

0:146> lmvm sqlservr                                               

start             end                 module name

00000000`ffad0000 00000000`ffb0e000   sqlservr T (pdb symbols)          c:\websymbols\sqlservr.pdb\21553ADC31784A4D933974A386EE2E052\sqlservr.pdb

    Loaded symbol image file: sqlservr.exe

    Image path: C:\Program Files\Microsoft SQL Server\MSSQL11.S1\MSSQL\Binn\sqlservr.exe

    Image name: sqlservr.exe

    Timestamp:        Fri Apr 06 08:19:38 2012 (4F7F098A)

    CheckSum:         00036498

    ImageSize:        0003E000

    File version:     2011.110.2316.0                                                         

    Product version:  11.0.2316.0 èSQL Server Version

    File flags:       0 (Mask 3F)

    File OS:          40000 NT Base

    File type:        1.0 App

    File date:        00000000.00000000

    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

 

Step 5:

Use !findstack command to find scheduler monitor thread (sqlservr!SQL_SOSNonYieldSchedulerCallback )

 

 

0:146> !findstack sqlservr!SQL_SOSNonYieldSchedulerCallback

Thread 006, 1 frame(s) match  è Thread ID of scheduler monitor.

        * 07 00000000336be420 000007fee36e0955 sqlservr!SQL_SOSNonYieldSchedulerCallback+0x47f

 

Step 6:

Switch to scheduler monitor thread using ~[threadID]s command

 

0:146> ~[006]s 

ntdll!NtWaitForSingleObject+0xa:

00000000`76d3135a c3              ret

 

 

Step 7:

Use kC or kP command to look at the stack on scheduler monitor thread.

 

0:006> kP                        Child-SP          RetAddr           Call Site

00000000`3369c218 000007fe`fcd210ac ntdll!NtWaitForSingleObject+0xa

00000000`3369c220 00000000`ffaeecce KERNELBASE!WaitForSingleObjectEx+0x79

00000000`3369c2c0 00000000`ffaef1a4 sqlservr!CDmpDump::DumpInternal+0x20e

00000000`3369c360 000007fe`dbe50794 sqlservr!CDmpDump::Dump+0x24

00000000`3369c3a0 000007fe`dbe511e6 sqllang!SQLDumperLibraryInvoke+0x2e4

00000000`3369c640 000007fe`dbe16ddb sqllang!CImageHelper::DoMiniDump+0x426

00000000`3369c830 00000000`ffae307f sqllang!stackTrace+0xbdb

00000000`3369e270 000007fe`e36e0955 sqlservr!SQL_SOSNonYieldSchedulerCallback+0x47f

00000000`336be430 000007fe`e36866da sqldk!SOS_Scheduler::ExecuteNonYieldSchedulerCallbacks+0x375

00000000`336bebf0 000007fe`e364b53f sqldk!SchedulerMonitor::CheckScheduler+0x307

00000000`336bed60 000007fe`e364aa8f sqldk!SchedulerMonitor::CheckSchedulers+0x211

00000000`336bf1f0 000007fe`e371c779 sqldk!SchedulerMonitor::Run+0xfb

00000000`336bf320 000007fe`e3642f10 sqldk!SchedulerMonitor::EntryPoint+0x9

00000000`336bf350 000007fe`e3642d04 sqldk!SOS_Task::Param::Execute+0x21e

00000000`336bf950 000007fe`e36429e6 sqldk!SOS_Scheduler::RunTask+0xa8

00000000`336bf9c0 000007fe`e365a29f sqldk!SOS_Scheduler::ProcessTasks+0x299

00000000`336bfa40 000007fe`e365a3b0 sqldk!SchedulerManager::WorkerEntryPoint+0x261

00000000`336bfae0 000007fe`e3659fcf sqldk!SystemThread::RunWorker+0x8f

00000000`336bfb10 000007fe`e365aaf8 sqldk!SystemThreadDispatcher::ProcessWorker+0x3c8

00000000`336bfbc0 00000000`76ad652d sqldk!SchedulerManager::ThreadEntryPoint+0x236

 

Step 8:

Switch to the thread which is reported as Non-Yielding in SQL Server error log using ~~[ThreadID]s command.

 

Recollect the below error in SQL error log.  Process 0:0:0 (0x1cb0) Worker 0x0000003054F62160 appears to be non-yielding on Scheduler 0.

 

0:006> ~~[0x1cb0]s 

ntdll!NtWaitForSingleObject+0xa:

00000000`76d3135a c3              ret

 

Step 9:

Look at the current stack of Non-yielding thread. using kc command

 

0:146> kc 10

Call Site

ntdll!NtWaitForSingleObject

KERNELBASE!WaitForSingleObjectEx

sqldk!SOS_Scheduler::SwitchContext

sqldk!SOS_Scheduler::SuspendNonPreemptive

sqldk!WorkDispatcher::DequeueTask

sqldk!SOS_Scheduler::ProcessTasks

sqldk!SchedulerManager::WorkerEntryPoint

sqldk!SystemThread::RunWorker

sqldk!SystemThreadDispatcher::ProcessWorker

sqldk!SchedulerManager::ThreadEntryPoint

kernel32!BaseThreadInitThunk

ntdll!RtlUserThreadStart

 

 

 

 

Recollect the information which we discussed earlier in this blog. When the SQL Server decides to take the minidump on nonyield occurrence it copies the CONTEXT of the nonyielding thread to a global structure and then initiates the dump because Sometimes it is possible that by the time SQLDumper gets the dump, the non-yielding thread has already yielded. So to get the exact snapshot of the thread we need to trust on CONTEXT saved in global structure  also we can compare the current stack of the thread with the one which is copied  and check if the thread is progressing.

 

 

Look at the above stack it cannot be Non-Yield thread because we see SwitchNonPreemptive  and SwitchContex  in the thread. 

{

SwitchPreemptive or SuspendNonPreemptive forces another worker to become owner of the scheduler. It does this by making the head of the runnable list the new owner and removing the current worker from logical scheduler control. The worker transitions ownership and is removed from SQL scheduler control until the external activity is complete. When the external activity is complete, the worker returns to the end of the runnable list by calling SwitchNonPreemptive.

}

 

Step 10:

Search for the copied stack structure using X commad

 

0:146> X sqlmin!*copiedStack*    

000007fe`df11bfe0 sqlmin!g_copiedStackInfo = <no type information>               

 

It is sqlmin!g_copiedStackInfo in this dump because this is dump is from SQL Server2012. In earlier versions of SQL Server it is sqlservr!g_copiedStackInfo

 

Step 11:

We know copied CONTEXT is stored in g_CopiedStackInfo how to find the valid offset in this structure ? If the CONTEXT is valid Rip,Rbp and RSP registers has to be valid if the dump is from X64 system and Eip,Ebp and Esp has to be valid if it is X86 system.

 

 

Let us do dd on sqlmin!g_copiedStackInfo  (remember it is sqlservr!g_copiedStackInfo in SQL2008/2005/2000)

 

0:146> dd sqlmin!g_copiedStackInfo     

000007fe`df11bfe0  00000001 00000000 3369e2e0 00000000

000007fe`df11bff0  0000a998 00000000 00000000 00000000

000007fe`df11c000  00000000 00000000 00000000 00000000

000007fe`df11c010  00000000 00000000 00000000 00000000

000007fe`df11c020  00000000 00000000 00000000 00000000

000007fe`df11c030  0010000b 00001f80 00000033 00000000

000007fe`df11c040  002b0000 00000246 00000000 00000000

000007fe`df11c050  00000000 00000000 00000000 00000000

 

Step 12:

Let us dump each address with context and verify if Rip,Rbp and RSP registers are valid. This dump is from 64-bit SQL Server so we are using Rip,Rbp and RSP registers. If the dump is from x86 system use Eip,Ebp and Esp .

 

0:146> dt 000007fe`df11bfe0 CONTEXT Rip Rsp Rbp     èType cast  000007fe`df11bfe0  with CONTEXT  .  RsP,Rbp and Rip is invalid so this is not valid offset.

ole32!CONTEXT

   +0x098 Rsp : 2

   +0x0a0 Rbp : 0x5a4d

   +0x0f8 Rip : 0

0:146> dt 000007fe`df11bff0 CONTEXT Rip Rsp Rbp      è Type cast 000007fe`df11bff0  with CONTEXT  . RIP is invalid

ole32!CONTEXT

   +0x098 Rsp : 0x72120000

   +0x0a0 Rbp : 0x3369e3cc

   +0x0f8 Rip : 0xf2

0:146> dt 000007fe`df11c000 CONTEXT Rip Rsp Rbp   è Type cast 000007fe`df11c000 with CONTEXT  . RIP ,RSP  and Rbp is valid. So this should be a valid context.Let us switch to this context and verify

ole32!CONTEXT

   +0x098 Rsp : 0x3369e2e0

   +0x0a0 Rbp : 0x3369e498

   +0x0f8 Rip : 0x76d3139a

 

Now we know 000007fe`df11c000 is valid context.  So 000007fe`df11c000 -sqlmin!g_copiedStackInfo =0x20 hence  we can use .cxr sqlmin!g_copiedStackInfo+0X20 directly to switch to the context of copied stack.

 

Step 13:

 

Switch the context of copied stack using .cxr 000007fe`df11c000  or .cxr sqlmin!g_copiedStackInfo+0X20

 

0:146> .cxr 000007fe`df11c000

rax=0000000000000002 rbx=000000003369e3cc rcx=0000000000005a4d

rdx=0000000072120000 rsi=000000000000026c rdi=0000000000000000

rip=0000000076d3139a rsp=000000003369e2e0 rbp=000000003369e498

r8=00000000000000b0  r9=0000000084a85310 r10=0000000000000000

r11=0000000000000000 r12=0000000000000000 r13=0000000000000004

r14=00000000000000f2 r15=0000000000000001

iopl=0         nv up ei pl zr na po nc

cs=0033  ss=002b  ds=0000  es=0000  fs=0000  gs=0000             efl=00000246

ntdll!NtWriteFile+0xa:

00000000`76d3139a c3              ret

 

Step 14:

 

Dump the stack of copied context using kP or kc (kc displays clean stack trace. each display line includes only the module name and the function name)

 

 

0:146> Kc

  *** Stack trace for last set context – .thread/.cxr resets it

Call Site

ntdll!NtWriteFile

KERNELBASE!WriteFile

kernel32!WriteFileImplementation

sqllang!CErrorReportingManager::WriteToErrLog

sqllang!CErrorReportingManager::SendErrorToErrLog

sqllang!CErrorReportingManager::CwchFormatAndPrint

sqllang!ReportLoginFailure

sqllang!FRedoLogin

sqllang!login

sqllang!process_login_finish

sqllang!process_commands

sqldk!SOS_Task::Param::Execute

sqldk!SOS_Scheduler::RunTask

sqldk!SOS_Scheduler::ProcessTasks

sqldk!SchedulerManager::WorkerEntryPoint

sqldk!SystemThread::RunWorker

sqldk!SystemThreadDispatcher::ProcessWorker

sqldk!SchedulerManager::ThreadEntryPoint

kernel32!BaseThreadInitThunk

ntdll!RtlUserThreadStart

Now compare the current stack and the copied stack to see if the thread has progressed after No-Yield condition. Stack look completely different So the Non-Yield thread has progressed and completed .It is doing new work now. Also to understand why the thread was Non-Yielding look at the copied stack and not the current unless both the stacks are same. 

 

 

Current thread stack which we dumped using the thread ID in SQL Errorlog.

0:146> kc                                                                                    

Copied thread stack which SQL Server copied to global structure before generating the dump.

0:146> Kc 10

ntdll!NtWaitForSingleObject

KERNELBASE!WriteFile

KERNELBASE!WaitForSingleObjectEx

kernel32!WriteFileImplementation

sqldk!SOS_Scheduler::SwitchContext

sqllang!CErrorReportingManager::WriteToErrLog

sqldk!SOS_Scheduler::SuspendNonPreemptive

sqllang!CErrorReportingManager::SendErrorToErrLog

sqldk!WorkDispatcher::DequeueTask

sqllang!CErrorReportingManager::CwchFormatAndPrint

sqldk!SOS_Scheduler::ProcessTasks

sqllang!ReportLoginFailure

sqldk!SchedulerManager::WorkerEntryPoint

sqllang!FRedoLogin

sqldk!SystemThread::RunWorker

sqllang!login

sqldk!SystemThreadDispatcher::ProcessWorker

sqllang!process_login_finish

sqldk!SchedulerManager::ThreadEntryPoint

sqllang!process_commands

kernel32!BaseThreadInitThunk

sqldk!SOS_Task::Param::Execute

ntdll!RtlUserThreadStart

sqldk!SOS_Scheduler::RunTask

 

sqldk!SOS_Scheduler::ProcessTasks

 

sqldk!SchedulerManager::WorkerEntryPoint

 

sqldk!SystemThread::RunWorker

 

sqldk!SystemThreadDispatcher::ProcessWorker

 

sqldk!SchedulerManager::ThreadEntryPoint

 

Now let us read the copied stack and understand what would have caused a Non-Yield condition (read from bottom to top)

 

ntdll!NtWriteFile  -> WriteFile function is at top of the stack and did not complete in expected time.          
KERNELBASE!WriteFile       
kernel32!WriteFileImplementation 
sqllang!CErrorReportingManager::WriteToErrLog  ->Write the error to errorlog
sqllang!CErrorReportingManager::SendErrorToErrLog  ->Send the error to SQL Server errorlog
sqllang!CErrorReportingManager::CwchFormatAndPrint  ->format the error
sqllang!ReportLoginFailure  ->Login failed
sqllang!FRedoLogin        
sqllang!login ->Login task is processed

 

 

From the above stack we are able to understand SQL Server is writing login failed information to SQL Error log (Synchronously) and the writefile function has taken long time and did not return.So there is Non-Yield scheduler dump.

 

When will writefile operation take long time?

When there is Disk bottleneck. So the obvious solution for this issue is to  fix the performance of the disk.

Similarly there could be numerous other reasons for Non-Yield condition so look at the stack of your Non-Yield scheduler dump using the method above and make out what could have caused the Non-Yield condition.

Also refer THIS LINK to check if your stack matches with any of the known issues in SQL Server.

To Be continued…………………………

Related posts:

  • How to Analyze "Deadlocked Schedulers" Dumps?
  • Non-yielding IOCP Listener, Non-yielding Scheduler and non-yielding resource monitor known issues and fixes
  • SQL Server generated Access Violation dumps while accessing oracle linked servers.
  • SQL Server Latch & Debugging latch time out

    If you liked this post, do like us on Face Book at https://www.facebook.com/mssqlwiki and join our FaceBook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

    Thank you,

    Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

  • Posted in Configuration, Debugging, Performance, SQL General, SQL Server Engine | Tagged: , , , , , , , , , , , , , , , , , , | 28 Comments »

    (SQLServer) Initializing the FallBack certificate failed with error code: 1, state: 1, error number: -2146893802.

    Posted by Karthick P.K on April 19, 2012

    SQL Server might fail to start with below error

    Server Error: 17190, Severity: 16, State: 1.

    Server Initializing the FallBack certificate failed with error code: 1, state: 1, error number: -2146893802.

    Server Unable to initialize SSL encryption because a valid certificate could not be found, and it is not possible to create a self-signed certificate

    Error: 15466, Severity: 16, State: 1.

    spid7s An error occurred during decryption.

    Cause

    CryptAcquireContext function is used by SQL Server to acquire a handle to key containers, create key containers and destroy key containers.

    By default CryptAcuireContext function create key in “Roaming\Microsoft\Crypto\..” under path mentioned in below registry

    HKEY_USERS\S-1-X-XXX\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders\AppData

    If the AppData Key is missing or if the user don’t have permission in path mentioned in above registry or if user profile is corrupted we might end up with above error.

    To narrow down the issue outside SQL-Server run THIS executable which will Open or Create key container if it doesn’t exist. If the exe fails look at error code returned by exe and troubleshoot further.

    To check if the problem is because of corrupted profile modify the path mentioned in HKEY_USERS\S-1-X-XXX\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders\AppData folder to a

    different path and check if the exe is able to create the key container.

     

    Source  code for Exe is below

    #include <windows.h> 
    #include <string> 
    #include <winbase.h> 
    #include <iostream> 
    using namespace std;
    #include <Wincrypt.h >
     
                                          
    void main()
    {
    LPCSTR rgwchKeyContName = "Test123456";  
    HCRYPTPROV m_hCryptoProviderFB;
    BOOL ret;
    BOOL ret2;
    
    ret=CryptAcquireContext(&m_hCryptoProviderFB, rgwchKeyContName, MS_ENHANCED_PROV, PROV_RSA_FULL, CRYPT_SILENT);
        
    if (!ret && GetLastError() == NTE_BAD_KEYSET)
    
    {
        
        printf("\nUnable to open Keyset.CryptAcquireContext failed with error: 0x%X . \nWe will try creating key",GetLastError());
    
        ret2=CryptAcquireContext(&m_hCryptoProviderFB, rgwchKeyContName, MS_ENHANCED_PROV, PROV_RSA_FULL, CRYPT_NEWKEYSET | CRYPT_SILENT);
            if (!ret2)
            {
            printf("\nCryptAcquireContext failed creating key.Error: 0x%X",GetLastError());
            }
            else
            {
            printf("\nKey created");
            }
        exit;
    }
    
    
    else if (!ret && GetLastError() == NTE_BAD_KEYSET)
    {
    printf("CryptAcquireContext failed with error: 0x%X",GetLastError());
    }
    
    else
    {
    
        printf("CryptAcquireContext opened key. Return value is 0x%X.",ret);
    }
    
        if (CryptReleaseContext(m_hCryptoProviderFB,0))
        {
        printf("\nHandle is released.\n");
        }
        else
        {
        printf("\nHandle could not be released.\n");
        }
    
    }

     

     

    Thanks

    Karthick P.K

    Posted in Configuration, Security, Startup failures | Tagged: , , , , , , , | 19 Comments »

    SQL-Server resource fails to come online IS Alive check fails

    Posted by Karthick P.K on January 31, 2012

    SQL-Server resource fails to come online with below Error:

    [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 35; message = [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible.

    Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.

     

    Resolution:

    Look at the version of (c:\windows\system32\sqsrvres.dll) and install the same version of SQL Server native client.

    Cause:

    When Higher version of SQL-Server is installed on a cluster in which lower version of SQL Server is already installed, the lower version SQL Server Resource DLL (c:\windows\system32\sqsrvres.dll) is upgraded to higher version and Higher resource DLL will be loaded by the resource monitor process to monitor Lower version as well.

    For example: The Denali SQL Server Resource uses SNAC 11.0 to connect to the SQL instance and because SNAC 11.0 can be used to connect to Shiloh, Yukon and Katmai as well this side by side configuration will work. However if Denali is uninstalled, the Denali SQL Server resource DLL is not downgraded to Katmai, Yukon or Shiloh version and hence care should be taken to not uninstall SNAC 11.0 otherwise Yukon or Shiloh instance cannot be brought online.

    Similarly When we install Yukon and Shiloh together, Yukon SQL Server Resource uses SNAC to connect to the SQL instance and because SNAC can be used to connect to Shiloh as well this side by side configuration will work. However if Yukon is uninstalled, the Yukon SQL Server resource DLL is not  downgraded to Shiloh version and hence care should be taken to not uninstall SNAC otherwise Shiloh instance cannot be brought online.

     

    If you liked this post do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group MSSQLWIKI

    Thank you,

    Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

    Disclaimer
    The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

    Posted in Configuration, Connectivity, SQL General, SQL Server Cluster | Tagged: , , , , | 8 Comments »

    Linked server connection fails with “An error occurred during decryption”

    Posted by Karthick P.K on January 9, 2012

    We might get Error: 15466, Severity: 16, State: 2  An error occurred during decryption while installing Projects servers (or) Sending mails using database mail (or) Linked server connections might fail with Msg 15593, Level 16, State 1, Line 1

    Linked server connection fails with below error

    {

    Failed to retrieve data for this request. (Microsoft.SqlServer.Management.Sdk.Sfc)
    ——————————
    ADDITIONAL INFORMATION:
    An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
    ——————————
    An error occurred during decryption. (Microsoft SQL Server, Error: 15466)

    Msg 15593, Level 16, State 1, Line 1

    An error occurred while decrypting the password for linked login ‘distributor_admin’ that was encrypted by the old master key. The error was ignored because the FORCE option was specified.

    }

    Database mail might fail with below error

    {

    Set mail server login password failed for MailServer ‘Domain’.  (Microsoft.SqlServer.Smo)

    An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)

    An error occurred during decryption. (Microsoft SQL Server, Error: 15466)

    }

    Or

    You notice below errors in SQL Server errorlogs

    spid10s Error: 15581, Severity: 16, State: 3.
    Please create a master key in the database or open the master key in the session before performing this operation.

    Cause

    SQL Server service account was changed from services control manager (or) service master key was not backed up and restored when migrating SQL Server to another computer domain.

    {

    http://msdn.microsoft.com/en-us/library/ms187788.aspx

    To change the SQL Server service account, use SQL Server Configuration Manager. To manage a change of the service account, SQL Server stores a redundant copy of the service master key protected by the machine account that has the necessary permissions granted to the SQL Server service group. If the computer is rebuilt, the same domain user that was previously used by the service account can recover the service master key. This does not work with local accounts or the Local System, Local Service, or Network Service accounts. When you are moving SQL Server to another computer, migrate the service master key by using backup and restore.

    The REGENERATE phrase regenerates the service master key. When the service master key is regenerated, SQL Server decrypts all the keys that have been encrypted with it, and then encrypts them with the new service master key. This is a resource-intensive operation. You should schedule this operation during a period of low demand, unless the key has been compromised. If any one of the decryptions fail, the whole statement fails.

    The FORCE option causes the key regeneration process to continue even if the process cannot retrieve the current master key, or cannot decrypt all the private keys that are encrypted with it. Use FORCE only if regeneration fails and you cannot restore the service master key by using the RESTORE SERVICE MASTER KEY statement.

    }

     

    Resolution

    Regenerate the service master key using ALTER SERVICE MASTER KEY REGENERATE

    If you receive the following error message when running ALTER SERVICE MASTER KEY REGENERATE.

    {

    The current master key cannot be decrypted. If this is a database master key, you should attempt to open it in the session before performing this operation. The FORCE option can be used to ignore this error and continue the operation but the data encrypted by the old master key will be lost.

    }

    We are left with only option to force regenerating service master key using ALTER SERVICE MASTER KEY FORCE REGENERATE “.

    Note:The service master key is the root of the SQL Server encryption hierarchy. The service master key directly or indirectly protects all other keys and secrets in the tree. If a dependent key cannot be decrypted during a forced regeneration, the data the key secures will be lost.

     

    If you liked this post, do like us on Facebook at https://www.facebook.com/mssqlwiki and join our Facebook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/

    Thank you,

    Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

    Posted in Configuration, Connectivity, Security, SQL Server Tools | Tagged: , , , , , , , , , , | 9 Comments »

    System stored procedures like sp_addsrvrolemember or sp_addserver may fail because of McAfee Host Intrusion Prevention

    Posted by Karthick P.K on June 26, 2011

    We might get Incorrect syntax near while applying the snapshot or stored procedure like sp_addsrvrolemember or sp_addserver might fails when we have Host Intrusion Prevention antivirus.

     

    Last week two DBA’s came to me with two different errors after breaking their head for hours….

    Error 1:

    When i Run “EXEC sp_addsrvrolemember  ‘VC’, ‘sysadmin’”        I get

    [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionCheckForData (CheckforData()).
    Server: Msg 11, Level 16, State 1, Line 0
    General network error. Check your network documentation.
    Connection Broken

    Error 2:

    I get Access Violation when i Install SQL Server 2008 and here is error in errolog

    *   Exception Address = 7814500A Module(MSVCR80+0001500A)

      *   Exception Code    = c0000005 EXCEPTION_ACCESS_VIOLATION

      *   Access Violation occurred writing address 43D3FFFC

      *   Input Buffer 428 bytes –

        declare @ServerName nvarchar(255) if not exists (select * fro

    *  m sysservers) begin select @ServerName = Convert(nvarchar(255), SERVERPR

    *  OPERTY(N’ServerName’)) execute sys.sp_addserver @ServerName, local end

      *             declare @ServerName nvarchar(255) if not exists (select * fro

      *  m sysservers) begin select @ServerName = Convert(nvarchar(255), SERVERPR

      *  OPERTY(N’ServerName’)) execute sys.sp_addserver @ServerName, local

     

     

    Error is raised while executing sp_addsrvrolemember  or sp_addserver  or while applying the initial snapshot for database replication. I collected memory dump from both the systems and interestingly there was same 3rd party Dll’s in SQL Server address space of both systems. Its  McAfee Host Intrusion Prevention. Disabled this and things started working.

    https://kc.mcafee.com/corporate/index?page=content&id=KB65845

     

    Thank you,

    Karthick P.K |My Facebook Page |My Site| Blog space| Twitter

    Disclaimer:

    The views expressed on this website/blog are mine alone and do not reflect the views of my company. All postings on this blog are provided “AS IS” with no warranties, and confers no rights.

    Posted in Configuration, Security, SQL Server Engine | Tagged: , , , | 4 Comments »

    Runtime error: ActiveX component can’t create object: ‘SQLDMO.SQLServer’

    Posted by Karthick P.K on July 28, 2010

     

    When you use SQLDMO ???

    Install SQL Server2005 Backward compatibility

    http://download.microsoft.com/download/3/1/6/316FADB2-E703-4351-8E9C-E0B36D9D697E/SQLServer2005_BC.msi

    Also remember if you have installed 32-Bit SQL Server on 64-Bit you have to use Cscript.exe from SysWOW64

    Posted in Configuration, SQL Server Setup | 1 Comment »

    Configuring SSL for SQL Server using Microsoft Certificate Authority Server

    Posted by Karthick P.K on June 12, 2010

    Configuring SSL for SQL Server using Microsoft Certificate Authority Server 

    Refer attached document  for detailed steps

    1. Install IIS Server from ADD/Remove Windows Components (if it is not installed already)

    2. Install Certificate Server ADD/Remove Windows Components (if it is not installed already)

    3. OPEN Certsrv browser console by either of below mentioned ways,

    A.  IIS Manager and browse to Machine Name — Web sites — CertSrv
    B.  IE open
    http://localhost/certsrv
    C.  From IE open
    http://<machinename&gt; /certsrv
    e.g., http://pjhome1/certsrv

    4. To Install CA (Root) Certificate

    A. Click on ‘Download a CA Certificate, Certificate Chain, or CRL’

    B. Click on Install this CA certificate chain

    C. Click YES

    D. CA chain (Root Certificate) installed successfully

    5. Create a SERVER Side Authentication Certificate

    A. Go to Certsrv site and click on ‘Request a certificate’

    B. Click on ‘Advanced Certificate request’

    C.Cick on ‘Create and submit a request to this CA’

    D. Enter the certificate information

    • 1. Type the FQDN (Fully Qualified Domain Name) for the name
    • 2. Select ‘Server Authentication Certificate’ for Type of Certificate Needed.
    • 3. Check the ‘Mark Keys as exportable’ option
    • 4. Click on Submit

    E. Click on YES to complete

    F. We need to make a note of the ‘Request Id’ from the below screen.

    6. Issue the certificate.

    A. In MMC add ‘Certificates’&’Certificate Authority’ using ‘Add/Remove Snap-in’ options.

    B. Click on ‘Pending requests’ in ‘Certificate Authority’.
    (We would see certificate with Request ID which we generated in STEP 5.i.e., 7 here)

    C. Right click on the certificate –> All Tasks –> ISSUE

    D. Now we should see the certificate under ‘Issued Certificates’

    7. Install the certificate
    A. Click on ‘View the status of a pending certificate request’

    B. Click on the certificate.

    C. Click on ‘Install this certificate’

    D. Click on YES

    E. We will see the successfully installed screen.

    8. Assign the certificate to the SQL Server instance.

    A. Open SQL Server Configuration Manager
    B. Right click on ‘Protocol on <instance name>’
    (for the instance which we need, here it is STANDARD)

    C. In the certificate tab and select the certificate we created earlier.

    D. Click on Apply and restart the SQL Server instance to get this change applied.

    9.After the successful deployment of the certificate (Server side) we should see the below message in our SQL Error Log file during the server startup.
    The certificate was successfully loaded for encryption.

     

    Regards

    Karthick P.K

    Posted in Configuration, Connectivity, Security | Tagged: , , , , | 6 Comments »

    DBCC CheckDB fails with error "The database could not be checked as a database snapshot could not be created and the database or table could not be locked

    Posted by Karthick P.K on March 6, 2010

     

    DBCC CheckDB may with fail with error

    Error

    Msg 5030, Level 16, State 12, Line 1 The database could not be exclusively locked to perform the operation.
    Msg 7926, Level 16, State 1, Line 1
    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

     

    Possible Causes Snapshot Creation Failure Reason

    1. Database is having read only file group.

    Check if the database is having read only file groups.

    2. No Parse file support by the file system.

    A. Parse file is not supported in FAT32 check the file system of the datafiles. If you use FAT32   use DBCC CheckDB with Tablock Option

    B. To get the volume information of file system in which we have the data files SQL Server use  GetVolumeInformation API.

    This API would fail if SQL Server startup account do not have full permission on Volume in which the data file is located.
    Grant full permission for the startup account of SQL Server on the root volume of all the data files. To verify if the startup account of SQL Server has permission on volume in which data file is created use This EXE. It uses GetVolumeInformation to list the information about volume.

    3. No alternate stream support.

     

    Regards

    Karthick P.K

    Posted in Configuration, DBCC | Tagged: , , | 3 Comments »

    How to add an IP Address when we Add new NIC to node where SQLServer2005 instance is running.

    Posted by Karthick P.K on April 14, 2009

    The only possible way to add an IP to SQLServer2005 after adding new NIC to the
    node where SQLServer2005 is installed (or) After adding second ip address to
    existing NIC is to either manually edit the registry or run Service pack/Hotfix setup again.
    The only possible way to add an IP to SQLServer2005 after adding new NIC to the
    node where SQLServer2005 is installed (or) After adding second ip address to
    existing NIC is

    Option1

    Manually edit the registry

    1. Add a new registry key under
    HKLM\software\microsoft\microsoftSQLServer\mssql.1\mssqlserver\supersocketnetlib\tcp
    for IP2

    simplest way to make it is export the Key

    HKLM\software\microsoft\microsoftSQLServer\mssql.1\mssqlserver\supersocketnetlib\tcp
    ,

    Open the exported file using the notepad and Copy the Following string ,make
    required changes(Modify the IPAddress,IPn.. Ie: n stands for number of IP and MSSQL.N Ie nstands for instance ID )

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL
    Server\MSSQL.X\MSSQLServer\SuperSocketNetLib\Tcp\IPn]
    “Enabled”=dword:00000000
    “Active”=dword:00000001
    “TcpPort”=””
    “TcpDynamicPorts”=””
    “DisplayName”=”Specific IP Address”
    “IpAddress”=”65.52.17.19”

    2. Import the registry again.

    3..Now the new ip is listed in SQLServer configuration manager for editing.
    Option2

    Install the service pack again.

    1. After the installatin of Service packs/HotFIx SQLServer recognize the new NIC but
    Assings wrong IPaddresses to SQLserver.it makes SQLServer Listen on wrong ip’s and
    so the SQLServer might fail to start with error

    “Error: 17120, Severity: 16, State: 1.
    2007-08-25 00:27:10.83 Server SQL Server could not spawn FRunCM thread.

    Check the SQL Server error log and the Windows event logs for information about
    possible related problems.”

    2. We can find duplicate IP in SSCM(SQLServer Configuration manager). we can edit
    the IP addresses in SQLserver configuration manager to fix right IP and start the SQLServer.

    Regards

    Karthick P.K

    Posted in Configuration, SQL Server Tools | Tagged: | 4 Comments »

    How to Browse (or) view objects and there code in mssqlsystemresource Database

    Posted by Karthick P.K on January 15, 2009

    How to Browse (or) view mssqlsystemresource Database.

    1. Stop the SQL Server service
    Copy the mssqlsystemresource.MDF and mssqlsystemresource.LDF to a new path (This
    two files will be in same path where master databse is located).
    Start the SQL Server Service
    Ues the following command to attach the data and log file as a new user database.

    EXEC sp_attach_db
    ‘mssqlsystemresource _Copy’, ‘<Path where you copied>\resource_copy.mdf’,
    ‘<Path where you copied>\resource_copy.ldf’
    Now you browse through all the system objects and there code.

    OR

    1. Start SQL Server in single user mode.

    2. Open SSMS and connect using DAC (ADMIN) Connection and change DB context to “mssqlsystemresource”.

    use mssqlsystemresource

    3. Query Resource DB objects.

    Regards

    Karthick P.K

    Posted in Configuration, SQL General, SQL Server Engine | Tagged: , , | 3 Comments »

    Behavior of SQL Server default instance on a NON-Default port

    Posted by Karthick P.K on January 15, 2009

    Unable to connect to SQLServer listening on Non-Default Port?

    If you connect to SQLServer using the “hostname” it connects to the instance which is listening on 1433.

    If you make named instance listen on 1433 and connect using “hostname” it connects to named instance which is listening on 1433.

    if you connect to SQLServer using hostname or ip-address (without “\instance name”) it connects to instance which is listening on 1433, it can be named instance as well.(We have a belief that it connects to default instance because default port of default instance is 1433).

    When you are connecting to SQLServer using hostname (without “\instance name”) client will not communicate with SQL Browser.
    SQLClient would communicate with SQLBrowser only if there is “\instancename”  followed by host name.

    Dynamic port detection is only available for named instances of SQL Server 2000.
    The behavior of SQL Server 2000 for a default instance is exactly the same as in earlier versions of SQL Server. The network libraries assume either 1433 or the
    global default port established with the Client Configuration Utility.

    If a default instance is listening on a port other than the standard 1433 port, you can provide an alias or change the global default port. You can also connect to the
    instance of SQL Server by using its server name, its FQDN, or its IP address followed by a comma and the port number.

    The mdac rules :

    If you connect to a Named Instance, then you need to either specify the Server\Instance or Server,Port.
    If you change the default port number for the default instance, then you need to specify Port number as well.

    Regards

    Karthick P.K

    Posted in Configuration, Connectivity | Tagged: , , , | 16 Comments »

    Unable to start SQLServer agent resource on cluster after upgrading to 9.00.3186 or Higher

    Posted by Karthick P.K on January 14, 2009

    SQLServer agent resource fails to come online on cluster after upgrading to build 9.00.3186 or Higher

     

    Error

    2008-03-19 23:06:23 – ? [100] Microsoft SQLServerAgent version 9.00.3186.00 (x86
    unicode retail build) : Process ID 2428
    2008-03-19 23:06:23 – ? [101] SQL Server SNETNAME version 9.00.3186 (0 connection
    limit)
    2008-03-19 23:06:23 – ? [102] SQL Server ODBC driver version 9.00.3042
    2008-03-19 23:06:23 – ? [103] NetLib being used by driver is DBNETLIB.DLL; Local
    host server is
    2008-03-19 23:06:23 – ? [310] 4 processor(s) and 2560 MB RAM detected
    2008-03-19 23:06:23 – ? [339] Local computer is SNETNAME running Windows NT 5.2
    (3790) Service Pack 2
    2008-03-19 23:06:23 – ? [432] There are 11 subsystems in the subsystems cache
    2008-03-19 23:06:38 – ! [364] The Messenger service has not been started – NetSend
    notifications will not be sent
    2008-03-19 23:06:38 – ? [129] SQLSERVERAGENT starting under Windows NT service
    control
    2008-03-19 23:06:38 – + [260] Unable to start mail session (reason: No mail profile
    defined)
    2008-03-19 23:06:38 – + [396] An idle CPU condition has not been defined – OnIdle
    job schedules will have no effect
    2008-03-19 23:06:38 – + [408] SQL Server MSSQLSERVER is clustered – AutoRestart has
    been disabled
    2008-03-19 23:06:39 – ! [298] SQLServer Error: 22022, CryptUnprotectData() returned
    error -2146893813, ‘Key not valid for use in specified state.’ [SQLSTATE 42000]
    2008-03-19 23:06:39 – ! [442] ConnConnectAndSetCryptoForXpstar failed (0).
    2008-03-19 23:06:40 – ? [098] SQLServerAgent terminated (normally)
    Error2

    2008-03-18 12:18:30 – ? [100] Microsoft SQLServerAgent version 9.00.3200.00
    ((Unknown) unicode retail build) : Process ID 6512
    2008-03-18 12:18:30 – ? [101] SQL Server PISTONDIST version 9.00.3200 (0 connection
    limit)
    2008-03-18 12:18:30 – ? [102] SQL Server ODBC driver version 9.00.3042
    2008-03-18 12:18:30 – ? [103] NetLib being used by driver is DBNETLIB.DLL; Local
    host server is np:pistondist
    2008-03-18 12:18:30 – ? [310] 16 processor(s) and 32765 MB RAM detected
    2008-03-18 12:18:30 – ? [339] Local computer is PISTONDIST running Windows NT 5.2
    (3790) Service Pack 2
    2008-03-18 12:18:31 – ? [432] There are 11 subsystems in the subsystems cache
    2008-03-18 12:18:31 – ! [364] The Messenger service has not been started – NetSend
    notifications will not be sent
    2008-03-18 12:18:31 – ? [129] SQLSERVERAGENT starting under Windows NT service
    control
    2008-03-18 12:18:32 – + [396] An idle CPU condition has not been defined – OnIdle
    job schedules will have no effect
    2008-03-18 12:18:32 – + [408] SQL Server MSSQLSERVER is clustered – AutoRestart has
    been disabled
    2008-03-18 12:18:32 – + [162] Internal request (from SetJobNextRunDate [reason:
    schedule will not run again]) to deactivate schedule 66
    2008-03-18 12:18:32 – ! [298] SQLServer Error: 22022, CryptUnprotectData() returned
    error -2146892987, ‘The requested operation cannot be completed. The computer must
    be trusted for delegation and the current user account must be configured to allow
    delegation.’ [SQLSTATE 42000]
    2008-03-18 12:18:32 – ! [442] ConnConnectAndSetCryptoForXpstar failed (0).
    2008-03-18 12:18:33 – ? [098] SQLServerAgent terminated (normally)

     

    Resolution

    Modify the the following Key.
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.X\SQLServerAgent
    Modify the value data of the serverhost key to np:Virtualservername

    Ie:HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL
    Server\MSSQL.X\SQLServerAgent
    ServerHost
    Value: np:Virtualservername.

    This will force the SQLServer agent to connect with SQLserver using Named Pipes so
    delegation is not used.

    —————————————————————————————————————–

    We have a HOTFIX available for this issue and it is included in the cumulative update pack9 for SQLServer service pack2. http://support.microsoft.com/?id=956378
    Note: Before applying the Hotfix. you have to follow the  steps mentioned in Resolution else hotfix would fail. Revert the steps after applying the fix.

     

    Regards

    Karthick PK

    Posted in Configuration, Connectivity, SQL Cluster Setup, SQL Server Cluster | Tagged: , , , , , | 2 Comments »

    How to enable Constraint delegation for SQLServer2005

    Posted by Karthick P.K on January 14, 2009

    Constraint delegation for SQLServer2005

    Follow the below steps to set the constraint delegation for SQLServer2005

    1. In active directory users and computers for the startup account of SQLServer
    a.Select “Trust this user for delegation to specified services
    only”
    b.Add host service for each node of the cluster
    2. In active directory users and computers for each node under the delegation
    tab
    a. Select “Trust this computer for specified services only”
    b. Add CIFS and Protected storage for each node of the cluster and each DC which
    SQLserver may use to authenticate.

    Take the SQLServer Group offline in clutser administrator and bring it online.

    Please refer the attached document for additional information with Image and example   ConstraintDelegationforSQLServer2005

    Regards

    Karthick PK

    Posted in Configuration, SQL Server Cluster | Tagged: , , | 10 Comments »