Why do Windows servers hang?


Bruce Mackenzie-Low, Contributor
04.04.2008
Troubleshooting a hung or nonresponsive Windows server can be a challenging endeavor. Simply hitting the reset button is no longer acceptable as more companies use these servers for business-critical operations. This three-part series will explore the reasons why a Windows server may hang and provide a cookbook approach to diagnosing the underlying issues with the Windows Kernel Debugger (WinDbg).

Background

When Microsoft released the early versions of its server operating system (Windows NT 3.5x and NT4), there was no easy way to troubleshoot a hung server. Other mainstream operating systems, such as Digital Equipment Corp.'s VAX/VMS, offered ways to manually intervene by forcing a crash dump whereby the server's state could be captured at the time of the hang. This dump could then be analyzed to determine why the server hung. The only option for early Windows platforms, however, was to reset the box.

As Windows servers became more prevalent in the business world, hitting the reset button became unacceptable. As a result, in Windows 2000 Server and later versions, it became possible to force a crash dump to help determine why the server hung. Microsoft documents this feature in Knowledge Base article 244139. Once enabled, it allows a keystroke combination (hold the right CTRL key and press SCROLL LOCK twice) to generate a crash dump on PS/2-type keyboards. Microsoft extended this feature in Windows Server 2003 with a hotfix to the Kbdhid.sys driver to accommodate USB-type keyboards.
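
The keyboard-initiated dump is off by default and is switched on through a registry value, followed by a reboot. As a minimal sketch, the Python below only builds the text of a .reg file that sets the CrashOnCtrlScroll value for the PS/2 driver (i8042prt) and, optionally, for the USB keyboard driver (kbdhid). The helper name and its parameter are illustrative assumptions; check the keys against KB 244139 for your Windows version before importing anything.

```python
# Sketch: build a .reg file that enables the keyboard-initiated crash dump.
# The helper name is hypothetical; the registry value name is CrashOnCtrlScroll.
PS2_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"
USB_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters"


def build_crash_on_ctrl_scroll_reg(include_usb: bool = False) -> str:
    """Return .reg text that sets CrashOnCtrlScroll to 1 (REG_DWORD)."""
    keys = [PS2_KEY] + ([USB_KEY] if include_usb else [])
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in keys:
        lines.append(f"[{key}]")
        lines.append('"CrashOnCtrlScroll"=dword:00000001')
        lines.append("")  # blank line between sections
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file content; save it as a .reg file and review before importing.
    print(build_crash_on_ctrl_scroll_reg(include_usb=True))
```

Generating the file rather than writing to the registry directly keeps the change reviewable and lets the same text be applied to several servers.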

Several other options now exist to force a crash dump. Microsoft provides the Windows Special Administrative Console (SAC) Crashdump command as part of Windows Emergency Management Services (EMS), which allows for "headless" servers with no local graphical console. Vendor-specific options also exist to force a crash dump, including the HP Integrity server's Management Processor TC (transfer of control) command, an NMI (non-maskable interrupt) button on some Integrity models, or the Integrated Lights-Out (iLO) virtual NMI button. We'll take a closer look at each of these options later in the series.

Why a server hangs

There are a variety of reasons why a server may hang, including both hardware and software issues. The most common hardware reason for a server hang is spurious interrupts from a failing device. For example, a network interface controller (NIC) may have a bad component or be attached to a bad cable, causing false interrupts to occur. These interrupts are serviced at an elevated interrupt request level (IRQL) and monopolize the processor(s), leaving lower-priority (user-level) requests unanswered. As a result, the server appears to be hung.
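
To illustrate why the server only appears hung, here is a toy simulation. It is not a model of the actual Windows dispatcher: on each tick a single processor services the highest-level pending request, so a device that raises an interrupt on every tick never lets user-level work run. The level numbers, function name, and tick model are all assumptions made for illustration.

```python
import heapq

DEVICE_LEVEL = 5   # hypothetical elevated interrupt level
USER_LEVEL = 0     # user-mode work runs at the lowest level


def simulate(ticks: int, interrupt_every: int, user_requests: int) -> int:
    """Return how many user requests complete within `ticks`.

    Each tick the single CPU services exactly one pending item, always
    picking the highest level first (strict preemption, no fairness).
    A failing device raises one interrupt every `interrupt_every` ticks.
    """
    # heapq is a min-heap, so store negated levels to pop the highest first.
    pending = [(-USER_LEVEL, i) for i in range(user_requests)]
    heapq.heapify(pending)
    completed_user = 0
    for tick in range(ticks):
        if tick % interrupt_every == 0:
            heapq.heappush(pending, (-DEVICE_LEVEL, tick))
        if not pending:
            continue  # idle tick
        neg_level, _ = heapq.heappop(pending)
        if -neg_level == USER_LEVEL:
            completed_user += 1
    return completed_user


healthy = simulate(ticks=100, interrupt_every=10, user_requests=50)
storm = simulate(ticks=100, interrupt_every=1, user_requests=50)
print(f"healthy device: {healthy} user requests served")
print(f"interrupt storm: {storm} user requests served")
```

With an occasional interrupt all 50 user requests finish; with an interrupt on every tick none do, even though the processor is busy the whole time.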

Another example of a hardware-induced hang involves storage requests going unanswered. For example, consider a case where a disk drive fails, causing outstanding I/O requests to be queued up. Eventually, these pending requests trigger a cascading effect of user and system threads to hang, leading to a system-wide outage.

More often, however, server hangs are a result of software issues. These issues come in several flavors, including:

  • System resource depletion (e.g., out of memory pool) -- The most common type of software hang, this typically is the result of a memory leak by a driver or kernel mode thread. Resource depletion can also result from exceeding architectural limits of paged and nonpaged memory pools (typically experienced on an x86 32-bit operating system).
  • Deadlock conditions -- A deadlock occurs when contention exists for common resources between two or more threads. For example, a deadlock exists when one thread owns an exclusive lock on a resource that another thread wants, and that thread exclusively owns a resource that the initial thread wants.
  • Spinlock conditions -- Spinlock hangs are similar to deadlocks, but involve contention for a spinlock that is used to synchronize access to data structures in a multi-processor environment. Other permutations of these conditions include a driver holding a lock while performing other activities for an extended period of time. Actual examples of deadlock and spinlock hangs will be provided later.
  • High-priority, compute-bound threads -- A software hang can also occur if high-priority, compute-bound thread(s) are dominating the processors. Since the Windows operating system permits varying levels of thread priority, one or more threads may execute at a higher priority than typical user threads. The result is that applications and users at normal priority are starved for CPU time, causing a perceived software hang.
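
The deadlock case in the list above can be expressed as a cycle in a wait-for graph: one thread waits for a resource owned by a second thread, which in turn waits for a resource owned by the first. The sketch below is illustrative only. The thread and resource names are made up, each thread is assumed to wait on at most one resource, and this is not how the kernel debugger itself finds deadlocks.

```python
def find_deadlock(owns: dict[str, str], wants: dict[str, str]) -> list[str]:
    """Return the threads forming a wait cycle, or [] if there is none.

    owns:  resource -> thread that holds it exclusively
    wants: thread   -> resource it is blocked waiting for
    """
    # A blocked thread waits on whichever thread owns the resource it wants.
    waits_for = {
        thread: owns[resource]
        for thread, resource in wants.items()
        if resource in owns
    }
    for start in waits_for:
        path = [start]
        current = waits_for[start]
        # Follow the chain until it ends or revisits a thread on the path.
        while current in waits_for and current not in path:
            path.append(current)
            current = waits_for[current]
        if current == start:
            return path  # the chain led back to where it began: a cycle
    return []


owns = {"lock_1": "thread_A", "lock_2": "thread_B"}
wants = {"thread_A": "lock_2", "thread_B": "lock_1"}
print(find_deadlock(owns, wants))
```

If thread_B were not waiting on lock_1, the chain would end at thread_B and the function would return an empty list.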

The big picture

So, as you can see, there are numerous reasons why a server may hang. To give you a better idea of what happens when you force a crash to generate a memory dump, and subsequently analyze the crash to determine what caused the hang, see Figure 1 below.

Starting on the left-hand side, you can see the server crashes or hangs. In the event of a crash, the server would generate a memory dump if the dumpfile and pagefile are properly configured (see Microsoft Knowledge Base articles 254649, 197379 and 889654).

In the event of a hang, manual intervention would be required to force a crash dump as previously described. In either case, the contents of memory are written to pagefile.sys before the server is rebooted. During the reboot, the contents of pagefile.sys are written to the memory.dmp file. Finally, once the server has rebooted, you can use the Windows Kernel Debugger (WinDbg) to analyze the memory dump using a symbol server (as documented in KB article 311503) to translate memory references to meaningful functions and variables.
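
For the analysis step, the debugger needs a symbol path that points at a symbol server. As a small sketch, the helper below composes the srv*<local cache>*<server URL> string that WinDbg accepts, for example through the _NT_SYMBOL_PATH environment variable. The cache directory and the function name are assumptions; the URL is Microsoft's public symbol server.

```python
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir: str, server: str = MS_SYMBOL_SERVER) -> str:
    """Compose a symbol-server path of the form srv*<local cache>*<server>."""
    if "*" in cache_dir:
        # '*' is the field separator, so it cannot appear inside the cache path.
        raise ValueError("cache_dir must not contain '*'")
    return f"srv*{cache_dir}*{server}"


# C:\symbols is an arbitrary example location for the local symbol cache.
print(symbol_path(r"C:\symbols"))
```

With the path in place, opening memory.dmp in WinDbg and running !analyze -v is a common first step.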

Figure 1: Overview of memory dump process and analysis

Now that you have a better idea of why server hangs occur, the next article in this series will look at the preparation process for troubleshooting a hung Windows server.


TROUBLESHOOTING A HUNG WINDOWS SERVER

Part 1: Why do servers hang?
Part 2: Preparing to troubleshoot
Part 3: Resolving the issue

Bruce Mackenzie-Low, MCSE/MCSA, is a systems software engineer with HP providing third-level worldwide support on Microsoft Windows-based products, including Clusters and Crash Dump Analysis. With more than 20 years of computing experience at Digital, Compaq and HP, Bruce is a well-known resource for resolving highly complex problems involving clusters, SANs, networking and internals.
