Archive for the ‘Windows 2008’ Category

Failover Cluster Network Design with Hyper-V–How many NICs are required?

July 25, 2011

Failover Clustering reliability and stability also depend strongly on the underlying network design and *drivers*, but that's another story. Let's focus here on the design part.

Since there is no longer a hard requirement for a dedicated “HB” (heartbeat) network, as there was in Windows 2003 based clusters (MSCS), there is some uncertainty around network design for Failover Clustering on Windows 2008, especially when virtualization workloads are involved.

Cluster intra-communication (heartbeat traffic) now goes over every cluster network by default, unless you disable a network for cluster use, as is typically done for iSCSI:

image

NOTE: It is a well-known best practice to disable cluster communication on iSCSI networks, so that they are dedicated to iSCSI traffic only!
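
On Windows Server 2008 R2, where the FailoverClusters PowerShell module is available, the same setting can also be inspected and changed from the shell. A minimal sketch, assuming a cluster network named "iSCSI" (a hypothetical name; use whatever Get-ClusterNetwork reports in your cluster):

```powershell
# Requires the FailoverClusters module (Windows Server 2008 R2 or later).
Import-Module FailoverClusters

# List every cluster network with its role:
#   0 = not used by the cluster
#   1 = cluster communication only
#   3 = cluster communication and client access
Get-ClusterNetwork | Format-Table Name, Role, Address, AddressMask -AutoSize

# Exclude the iSCSI network from cluster communication.
# "iSCSI" is an assumed network name for illustration only.
(Get-ClusterNetwork -Name "iSCSI").Role = 0
```

Running the first command again afterwards should show Role 0 for that network.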

The “golden” rule: for a general Failover Cluster, Microsoft recommends at least 2 redundant network paths between the cluster nodes. Often you will want more than the recommended minimum, either for additional redundancy and/or performance in your network connectivity (e.g. NIC teaming) or because you use features like Hyper-V (CSV, LM) that bring their own network requirements.

Depending on the workloads running on top of Failover Clustering, the number of required physical NICs can grow fast. For example, in Hyper-V Failover Clustering with Live Migration and iSCSI for VM guests, the recommended minimum is roughly 4 physical NICs; more are required when NIC teaming is used for redundancy and/or performance.

Here are a few example scenarios with the minimum recommended number of physical NIC ports per cluster node:

Scenario 1:

Failover Cluster with 2 Nodes and Hyper-V (1 x Virtual Switch dedicated) in use without LM/CSV

image

=> min. 3 physical NICs are recommended => 2 Cluster Networks are automatically discovered and added to Cluster

Scenario 2:

Failover Cluster with 2 Nodes and Hyper-V (1 x Virtual Switch dedicated) in use with LM/CSV

image

=> min. 4 physical NICs are recommended => 3 Cluster Networks are automatically discovered and added to Cluster

Scenario 3:

Failover Cluster with 2 Nodes and Hyper-V (1 x Virtual Switch dedicated) in use with LM/CSV and ISCSI at host

image

=> min. 5 physical NICs are recommended (see note below for ISCSI) => 4 Cluster Networks are automatically discovered and added to Cluster

Scenario 4:

Failover Cluster with 2 Nodes and Hyper-V (2 x Virtual Switch dedicated) in use with LM/CSV and ISCSI at host and guest

image

=> min. 6 physical NICs are recommended (see note for ISCSI) => 4 Cluster Networks are automatically discovered and added to Cluster

Scenario 5:

Failover Cluster with 2 Nodes and Hyper-V (3 x Virtual Switch dedicated) in use with LM/CSV and ISCSI at host and guest

image

=> min. 7 physical NICs are recommended (see note for ISCSI) => 4 Cluster Networks are automatically discovered and added to Cluster

NOTE: For iSCSI it is recommended to have at least 2 physical network paths for redundancy (availability). NIC TEAMING IS NOT SUPPORTED HERE; MPIO or MCS must be used for reliability and availability. As a best practice you should disable “cluster communication” through the iSCSI interfaces!

Of course, when you use techniques like NIC teaming for networks such as Management, Hyper-V switches, or CSV, the number of required physical NICs grows accordingly.

Generally, the cluster service (“NETFT”, network fault tolerant) automatically discovers each network based on its subnet and adds it to the cluster as a cluster network. iSCSI networks should generally be disabled for cluster use (cluster communication).
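
The five scenarios above follow a simple pattern that can be sketched as a small PowerShell function. The formula is only inferred from the scenario counts and is not official guidance: two NICs common to every scenario, plus one per dedicated virtual switch, plus one for LM/CSV, plus one for iSCSI at the host. The function name and its parameters are made up for illustration, and the sketch deliberately ignores NIC teaming and the second iSCSI path recommended in the note.

```powershell
# Hypothetical helper: reproduces the NIC counts of scenarios 1-5.
function Get-MinimumNicCount {
    param(
        [int]$VirtualSwitches = 1,   # dedicated Hyper-V virtual switches
        [switch]$LiveMigrationCsv,   # LM/CSV network in use
        [switch]$HostIscsi           # iSCSI used at the host
    )
    # Two NICs are common to all scenarios; each virtual switch adds one.
    $count = 2 + $VirtualSwitches
    if ($LiveMigrationCsv) { $count += 1 }
    if ($HostIscsi)        { $count += 1 }
    return $count
}

# Scenario 1: one virtual switch, no LM/CSV, no iSCSI
Get-MinimumNicCount -VirtualSwitches 1

# Scenario 4: two virtual switches, LM/CSV and iSCSI at the host
Get-MinimumNicCount -VirtualSwitches 2 -LiveMigrationCsv -HostIscsi
```

Treat the result as a lower bound for planning; each teamed network adds further physical ports.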

Further official guidance around network design in Failover Clustering environments can be found here:

Network in a Failover Cluster
http://technet.microsoft.com/en-us/library/cc773427(WS.10).aspx

Network adapter teaming and server clustering
http://support.microsoft.com/kb/254101

Hyper-V: Live Migration Network Configuration Guide
http://technet.microsoft.com/en-us/library/ff428137(WS.10).aspx

Requirements for Using Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2
http://technet.microsoft.com/en-us/library/ff182358(WS.10).aspx

Designating a Preferred Network for Cluster Shared Volumes Communication
http://technet.microsoft.com/en-us/library/ff182335(WS.10).aspx

Appendix A: Failover Cluster Requirements
http://technet.microsoft.com/en-us/library/dd197454(WS.10).aspx

Cluster Network Connectivity Events
http://technet.microsoft.com/en-us/library/dd337811(WS.10).aspx

Understanding Networking with Hyper-V
http://www.microsoft.com/download/en/details.aspx?amp;displaylang=en&displaylang=en&id=9843

Achieving High Availability for Hyper-V
http://technet.microsoft.com/en-us/magazine/2008.10.higha.aspx

Windows Server 2008 Failover Clusters: Networking (Part 1-4)
http://blogs.technet.com/askcore/archive/2010/02/12/windows-server-2008-failover-clusters-networking-part-1.aspx
http://blogs.technet.com/askcore/archive/2010/02/22/windows-server-2008-failover-clusters-networking-part-2.aspx
http://blogs.technet.com/askcore/archive/2010/02/25/windows-server-2008-failover-clusters-networking-part-3.aspx
http://blogs.technet.com/askcore/archive/2010/04/15/windows-server-2008-failover-clusters-networking-part-4.aspx 

Description of what to consider when you deploy Windows Server 2008 failover cluster nodes on different, routed subnets
http://support.microsoft.com/kb/947048

Stay tuned…

Regards

Ramazan

Running Domain Controller on top of Hyper-V and Failover Cluster?

July 15, 2011

This question is currently a much-discussed topic in my customer scenarios, so I would like to cover the important points about virtualized DCs, especially when Failover Clustering is involved.

Hyper-V:

Virtualizing the DC role is generally supported, provided you understand the caveats. In production environments you should NOT use the snapshot/save state features for DCs, especially in multi-DC deployments but also in single-DC ones. The reason, even for single-DC environments, is that domain members update their computer account passwords regularly, and these no longer match once you apply a previous snapshot (see KB175468 on machine account passwords). Of course there are some workarounds, but from my perspective none of them is suitable for production environments.

If you have read the articles below and know exactly what to watch out for, then “yes, you can” use these features in lab scenarios, for example by snapshotting all domain members at the same time or by resetting the computer passwords after applying an earlier DC snapshot. But GENERALLY YOU SHOULD NEVER USE THE SNAPSHOT/SAVE STATE FUNCTIONS IN PRODUCTION for DC role(s)!

So when running a domain controller within a Hyper-V virtual machine do NOT use:

1. Save states OR,
2. Virtual machine snapshots

In Hyper-V deployments there are some general considerations to keep in mind when deploying virtualized domain controllers. Here are some great articles that cover this in detail and also give some guidelines:

Running Domain Controllers in Hyper-V
http://technet.microsoft.com/en-us/library/virtual_active_directory_domain_controller_virtualization_hyperv(WS.10).aspx

Things to consider when you host Active Directory domain controllers in virtual hosting environments
http://support.microsoft.com/kb/888794/en-us

The Domain Controller Dilemma
http://blogs.msdn.com/b/virtual_pc_guy/archive/2008/11/24/the-domain-controller-dilemma.aspx

Problems with virtual machines and domain membership
http://blogs.msdn.com/b/virtual_pc_guy/archive/2006/03/28/561508.aspx

Hyper-V and Domain Controllers – Demo Tips and Tricks
http://blogs.msdn.com/b/virtual_pc_guy/archive/2009/11/20/hyper-v-and-domain-controllers-demo-tips-and-tricks.aspx

Effects of machine account replication on a domain
http://support.microsoft.com/kb/175468

Running Domain Controllers within Virtual Server 2005
http://www.microsoft.com/downloads/details.aspx?FamilyId=64DB845D-F7A3-4209-8ED2-E261A117FC6B&displaylang=en

Failover Cluster:

Especially in Failover Cluster environments it is a recommended best practice to have at least 1 physical or virtual DC available outside the cluster environment, because the cluster service requires DC communication before it can start (VCO/CNO).

Check out the following blog post from my MVP colleague Lai Yoong Seng (MVP Virtual Machine), which discusses the issues that arise when you put your DCs on top of a Failover Cluster:

http://www.ms4u.info/2011/05/why-you-should-not-running-domain.html

We call this the “Henne und Ei Problem” in German, which translates with the same meaning to “chicken and egg issue”.

Windows 2003 MSCS:

Determining Domain Controller Access for Server Clusters (Windows 2003)
http://technet.microsoft.com/en-us/library/cc779512(WS.10).aspx

Active Directory, DNS and Domain Controllers (Windows 2003)
http://technet.microsoft.com/en-us/library/cc775654(WS.10).aspx

Cluster Networking Requirements (Windows 2003)
http://technet.microsoft.com/es-es/library/cc783193(WS.10).aspx

Stay tuned…

Regards

Ramazan

How to generate and correctly interpret Failover Cluster Log

July 2, 2011

Very often you need to generate the cluster log when the event log information is not enough to nail down the root cause, or to get a better understanding of what is happening under the hood of your failover cluster.

Here are some useful articles and techniques on how to generate cluster logs and, of course, how to read and interpret them correctly:

Anatomy of a Cluster Log Entry
http://technet.microsoft.com/en-us/library/cc962179.aspx

Techniques for Tracking the Source of a Problem
http://technet.microsoft.com/en-us/library/cc962185.aspx

Interpreting the Cluster Log
http://technet.microsoft.com/en-us/library/cc961673.aspx

Cluster Log Basics
http://technet.microsoft.com/en-us/library/cc962184.aspx

Understanding the Cluster Debug Log in 2008
http://blogs.technet.com/b/askcore/archive/2010/04/13/understanding-the-cluster-debug-log-in-2008.aspx

Windows Server 2008 and R2 Cluster Log Appears to Be Missing Gaps of Data
http://blogs.technet.com/b/thbrown/archive/2010/07/31/windows-server-2008-and-r2-cluster-log-and-missing-gaps-of-data.aspx

View Events and Logs for a Failover Cluster
http://technet.microsoft.com/en-us/library/cc772342.aspx

TechNet Webcast: Failover Cluster Validation and Troubleshooting with Windows Server 2008 (Level 300)
https://msevents.microsoft.com/CUI/EventDetail.aspx?culture=en-US&EventID=1032364832&CountryCode=US

Failover Clustering: Pro Troubleshooting in Windows Server 2008 (PPT)
http://media.ch9.ms/teched/na/2011/ppt/WSV309.pptx
http://ecn.channel9.msdn.com/o9/te/NorthAmerica/2010/pptx/WSV314.pptx

How to create the cluster.log in Windows Server 2008 Failover Clustering
http://blogs.msdn.com/b/clustering/archive/2008/09/24/8962934.aspx

DBA 101: Collecting and Interpreting Failover Cluster Logs
http://blogs.msdn.com/b/joesack/archive/2010/02/09/dba-101-collecting-and-interpreting-failover-cluster-logs.aspx

Introduction to Cluster Diagnostics and Verification Tool for Exchange Administrators
http://technet.microsoft.com/en-us/library/aa996161(EXCHG.65).aspx

How to turn on cluster logging in Microsoft Cluster Server (W2K3)
http://support.microsoft.com/kb/168801

The meaning of state codes in the Cluster log
http://support.microsoft.com/kb/286052

Failover Cluster Troubleshooting
http://msdn.microsoft.com/en-us/library/ms189117.aspx

Troubleshooting Cluster Logs 101 – Why did the resources failover to the other node?
http://blogs.technet.com/b/askcore/archive/2008/02/06/troubleshooting-cluster-logs-101-why-did-the-resources-failover-to-the-other-node.aspx

Hope this helps you understand better what is going on under the hood of your failover cluster. Especially in root cause analysis (RCA), test, and/or proof-of-concept scenarios it is really helpful to be able to read and interpret the cluster logs.

Stay tuned
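
On Windows Server 2008 R2 the log can be generated with the Get-ClusterLog cmdlet from the FailoverClusters module (on Windows Server 2008 without R2, `cluster.exe log /g` serves the same purpose). A minimal sketch; the destination folder C:\Temp\ClusterLogs is an assumption, so pick any folder you like:

```powershell
# Requires the FailoverClusters module (Windows Server 2008 R2 or later).
Import-Module FailoverClusters

# Assumed output folder; create it if it does not exist yet.
$dest = "C:\Temp\ClusterLogs"
New-Item -ItemType Directory -Path $dest -Force | Out-Null

# Generate the cluster log on every node and copy the files to $dest.
# -TimeSpan limits the log to the last N minutes (here: 60).
Get-ClusterLog -Destination $dest -TimeSpan 60
```

Keep in mind that the timestamps in the generated log are in UTC when you correlate them with event log entries.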

Best Regards

Ramazan

Deploying Microsoft Windows Server 2008 Networking with Cisco

June 6, 2011

Cisco has published a really useful whitepaper on networking deployments with Cisco gear that provides great technical detail on the enhanced capabilities of the new networking stack in Windows Server 2008/Vista.

Techniques like Teredo, ISATAP, NAT, SMBv2, and IPv6 are discussed in detail, and multisite cluster DNS is covered as well:

Deploying Microsoft Windows Server 2008 and Vista on a Cisco Network
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf

Stay tuned…

Regards

Ramazan

New Hyper-V Performance Blog

June 5, 2011

Tony Voellm has left Microsoft and his previous role as Hyper-V Performance Lead. He gave great insights into Hyper-V performance monitoring and shared really deep technical knowledge on Hyper-V performance on his “All topics fundamental” blog:

http://blogs.msdn.com/b/tvoellm/

BUT we can (hopefully) still get his really useful articles on his new blog:

http://perfguy.blogspot.com/

Especially his first post, on the question of which hypervisor provides better performance (Hyper-V or VMware), makes me confident that we will read more from Tony in the future.

http://perfguy.blogspot.com/2011/03/which-is-faster-windows-hyper-v-or.html

Stay tuned…

Regards

Ramazan

Every 20 min an Event 1069 and 1558?

May 15, 2011

DISCLAIMER: Event 1069 is a generic event, so you should check and decide for yourself whether this blog post is a possible solution for your scenario and environment. You can verify this in the cluster logs by looking for “Failed to create cluster directory on witness”, which is generated by the Quorum Agent.

Problem:

In a customer scenario we saw events 1069 and 1558 every 20 minutes. They pointed to quorum disk issues, which we could not confirm at first because the disk was online and could be failed over to other nodes without any issues.

Event ID 1069 — Clustered Service or Application Availability
http://technet.microsoft.com/en-us/library/cc756225(WS.10).aspx

Event ID 1558 — Cluster Witness Functionality
http://technet.microsoft.com/en-us/library/dd353960(WS.10).aspx

After digging deeper and analyzing the cluster log, I found a very interesting pointer there:

ERR mscs::QuorumAgent::PostOnline: ERROR_PATH_NOT_FOUND(3)’ because of ‘Failed to create cluster directory on witness, path \\?\Volume{5c65f7b0-15e4-11e0-b316-002655db949a}\Cluster

This tells me that the cluster service has problems accessing the quorum disk when it wants to write its cluster hive (= configuration) to that disk.

As the cluster hive is available redundantly on each node in the cluster and can also be re-created manually at any time by changing the quorum configuration, I used this behavior to troubleshoot my issue.
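
To check whether your own cluster log contains the same witness error, a simple text search is enough. A minimal sketch: the helper name Find-WitnessDirectoryError is made up, and the sample array stands in for a real log file (the first sample line is invented filler, the second is the excerpt from above):

```powershell
# Sample input; in practice read the real file instead, e.g.
#   Get-Content C:\Windows\Cluster\Reports\Cluster.log
$sample = @(
    "INFO  [RES] some unrelated entry (made-up filler line)",
    "ERR mscs::QuorumAgent::PostOnline: ERROR_PATH_NOT_FOUND(3)' because of 'Failed to create cluster directory on witness, path \\?\Volume{5c65f7b0-15e4-11e0-b316-002655db949a}\Cluster"
)

# Hypothetical helper: returns every line that contains the witness error text.
function Find-WitnessDirectoryError {
    param([string[]]$Lines)
    # -SimpleMatch treats the pattern as literal text, not as a regex.
    $Lines | Select-String -SimpleMatch "Failed to create cluster directory on witness"
}

$hits = @(Find-WitnessDirectoryError -Lines $sample)
"{0} matching line(s)" -f $hits.Count
```

If the search returns nothing for your log, events 1069/1558 in your environment probably have a different cause than the one described here.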

Solution:

1. Change the quorum model temporarily to “Node Majority” so that the quorum disk “Q:” can be removed from the cluster (Note: when changing the quorum model, be aware of the available “votes” in your cluster and keep a majority)

image

image

2. Remove the quorum disk from the cluster, reformat it with NTFS, and add it back to the cluster

3. Restore the quorum model, in my case “Node and Disk Majority”, and point it to the newly formatted quorum disk Q:

image

Result: The cluster hive is newly created on quorum disk Q:\

image

After the cluster service has successfully created the cluster hive, all the entries in the cluster logs and the events 1069 and 1558 are gone.
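
On Windows Server 2008 R2 the three steps can also be carried out with the FailoverClusters PowerShell module instead of the GUI. This is only a sketch under assumptions: "Cluster Disk 1" is a hypothetical resource name for the quorum disk Q:, and the reformat itself is left as a manual step.

```powershell
# Requires the FailoverClusters module (Windows Server 2008 R2 or later).
Import-Module FailoverClusters

# Assumed resource name of the quorum disk; check with Get-ClusterResource.
$quorumDisk = "Cluster Disk 1"

# Show the current quorum configuration before changing anything.
Get-ClusterQuorum

# Step 1: switch temporarily to Node Majority so the disk is no longer the witness.
Set-ClusterQuorum -NodeMajority

# Step 2: remove the disk from the cluster, reformat it with NTFS
# (manually, e.g. in Disk Management), then add it back.
Remove-ClusterResource -Name $quorumDisk -Force
# ... reformat the disk with NTFS here ...
# Note: this adds ALL disks that are currently available to the cluster.
Get-ClusterAvailableDisk | Add-ClusterDisk

# Step 3: restore Node and Disk Majority on the re-added disk.
# The disk may come back under a different resource name; adjust $quorumDisk if so.
Set-ClusterQuorum -NodeAndDiskMajority $quorumDisk
```

As in the GUI procedure, make sure the cluster keeps a majority of votes while it runs in Node Majority.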

Information:

General information about cluster logs can be found here:

How to create the cluster.log in Windows Server 2008 Failover Clustering
http://blogs.msdn.com/b/clustering/archive/2008/09/24/8962934.aspx

Troubleshooting Cluster Logs 101 – Why did the resources failover to the other node?
http://blogs.technet.com/b/askcore/archive/2008/02/06/troubleshooting-cluster-logs-101-why-did-the-resources-failover-to-the-other-node.aspx

Introduction to Cluster Diagnostics and Verification Tool for Exchange Administrators
http://technet.microsoft.com/en-us/library/aa996161(EXCHG.65).aspx

…wish you good luck with “troubleshooting”!

Best Regards

Ramazan

Every 20 Minutes an Event ID 1069 and 1558?

May 15, 2011

NOTE: Since event 1069 is very generic and can point to different sources of error, this is only one possible example solution and is meant as a field report. To verify whether this article can help with your problem, please check the cluster logs for “Failed to create cluster directory on witness”!

Problem:

At a customer site we saw events 1069 and 1558 roughly every 20 minutes, although there were no visible problems with the quorum disk (cluster resource online) and we could move it to other nodes without any problems.

Event ID 1069 — Clustered Service or Application Availability
http://technet.microsoft.com/en-us/library/cc756225(WS.10).aspx

Event ID 1558 — Cluster Witness Functionality
http://technet.microsoft.com/en-us/library/dd353960(WS.10).aspx

After further analysis of the system and cluster logs I discovered the following interesting pointer:

ERR mscs::QuorumAgent::PostOnline: ERROR_PATH_NOT_FOUND(3)’ because of ‘Failed to create cluster directory on witness, path \\?\Volume{5c65f7b0-15e4-11e0-b316-002655db949a}\Cluster

This points to an error where the cluster service could not access (read/write) the quorum disk, which is why we received events 1069/1558.

Since the quorum structure (cluster hive = cluster configuration) is available redundantly in the cluster and can also be regenerated manually through the quorum configuration, the solution in my case was quite simple:

Solution:

1. Temporarily changed the quorum model so that quorum disk “Q:” could be removed from the cluster

image

image

2. Removed the quorum disk from the cluster, reformatted it, and added it back to the cluster

3. Restored the quorum configuration, in my case Node and Disk Majority

image

Result: The cluster DB was newly created on the quorum disk

image

After the cluster “hive” had been created successfully on the quorum disk, the entries in the cluster log disappeared and the event logs were clean again.

Information:

General information about cluster logs can be found at:

How to create the cluster.log in Windows Server 2008 Failover Clustering
http://blogs.msdn.com/b/clustering/archive/2008/09/24/8962934.aspx

Troubleshooting Cluster Logs 101 – Why did the resources failover to the other node?
http://blogs.technet.com/b/askcore/archive/2008/02/06/troubleshooting-cluster-logs-101-why-did-the-resources-failover-to-the-other-node.aspx

Introduction to Cluster Diagnostics and Verification Tool for Exchange Administrators
http://technet.microsoft.com/en-us/library/aa996161(EXCHG.65).aspx

Have fun “troubleshooting”!

Regards

Ramazan

Powershell–How to write your own custom functions in Powershell?

May 15, 2011

How to write your own functions within Powershell?

As most of you know, PowerShell is a really powerful tool for administrators who deal with a lot of systems. Most of the time you repeat the same operations and need to write your own scripts. Here I usually use the "function" keyword to build my own PowerShell functions, which I can then call anywhere later in my script.

Function Example: Enumerate all drives/partitions on a server
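function GetDrives {
    # Win32_DiskPartition has no VolumeName property (that one belongs to Win32_LogicalDisk), so Size is listed instead
    Get-WmiObject Win32_DiskPartition | Select-Object Name, Size, DiskIndex, Index
}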

Function Example: Start all VMs
function StartVM ([string]$VMName) {
    Write-Host ('INFOTEXTBOX: Starting all VM now: ' + $VMName)
    # Start the VM whether it is currently suspended or stopped
    Get-VM -Suspended $VMName | Start-VM -Force
    Get-VM -Stopped $VMName | Start-VM -Force
    # Wait until the VM is reported as running
    while (!(Get-VM $VMName -Running)) { Start-Sleep 1 }
}

Function Example: Cleanup or Remove all VM
function RemoveVM ([string]$VMName) {
    Write-Host ('INFO: Remove VMs now: ' + $VMName)
    Get-VM $VMName | Remove-VM -Force | Out-Null
    # Wait until the VM is no longer listed
    while (Get-VM $VMName) { Start-Sleep 3 }
}

Function Example: Remove ISO drive from VM
function RemoveISO ([string]$VMName) {
    # Find all DVD drives of the VM that currently have an ISO image mounted
    $DVD = Get-VMDisk $VMName | Where-Object {$_.DriveName -like 'DVD Drive'} | Where-Object {$_.DiskImage -like '*.iso'}
    if ($DVD) {
        ForEach ($actDVD in $DVD) {
            Write-Host ('INFO: Remove Mounted ISO from VM ' + $VMName)
            Remove-VMDrive $VMName -ControllerID $actDVD.ControllerID -LUN $actDVD.DriveLUN -DiskOnly | Out-Null
        }
    }
}

These are examples for Hyper-V and cluster environments, where we use commands like “Remove-VMDrive” or “Get-VM” from the PowerShell Hyper-V library module on CodePlex or from the built-in Failover Cluster module.
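
One detail that often trips people up when calling such functions: PowerShell arguments are separated by spaces, not wrapped in parentheses and separated by commas. A small self-contained sketch; Get-Greeting is a made-up function that needs no Hyper-V or cluster module:

```powershell
# Hypothetical example function with one string and one optional int parameter.
function Get-Greeting ([string]$Name, [int]$Count = 1) {
    # Repeat the greeting $Count times, joined with spaces.
    (1..$Count | ForEach-Object { "Hello $Name" }) -join " "
}

# Positional arguments, separated by spaces
Get-Greeting "VM01" 2

# Named arguments, in any order
Get-Greeting -Count 1 -Name "VM02"

# Pitfall: this passes ONE argument, the array ("VM01", 2), to $Name
# Get-Greeting("VM01", 2)
```

The same calling convention applies to the functions above, for example `StartVM "VM01"` rather than `StartVM("VM01")` once further parameters come into play.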

PowerShell: calling a function with parameters
http://weblogs.asp.net/soever/archive/2006/11/29/powershell-calling-a-function-with-parameters.aspx

PowerShell Tutorial 10: Functions and Filters
http://www.powershellpro.com/powershell-tutorial-introduction/powershell-functions-filters/

Stay tuned for further PS examples

Regards

Ramazan

VMMUpdate – How to (automatically) check for available updates for SCVMM, Hyper-V and Failover Cluster?

April 11, 2011

Jonathan Jordan (Support Escalation Engineer) has released a really helpful tool to check the patch status of your SCVMM, Hyper-V and Failover Cluster environments.

The solution here is called VMMUpdate!

VMMUpdate creates a report of required updates for technologies used by the SCVMM server and all Hosts.

– Windows
– Hyper-V
– Failover Cluster

As well as components SCVMM and other technologies leverage…

– WinRM
– BITS
– WMI
– VDS
– VSS

After extracting the archive you can start the script with “VMMUpdatev2.4.cmd”; when the scan has completed you receive a text file with all the relevant information:

image

image

These log files are saved in C:\Windows\SCVMM\vmmupdate\logs

More details on VMMUpdate can be found on Jonathan’s blog:

http://blogs.technet.com/b/jonjor/archive/2010/10/14/vmmupdate.aspx

Download VMMUpdate is here

Kudos to Jonathan!

Stay tuned…. 😉

Regards

Ramazan