Previously, in Part 1, we looked at creating the groups that policies are assigned to.
Let’s now take a look at the current policies set in the environment. You can see from the image below that I have selected one of my hosts in the virtualiseme cluster, and in the bottom right the highlighted area says “default policy”.
Next we want to have a quick look at the current virtual machine capacity forecast to see what the environment looks like under the default policy, so we can see the difference once we create our new production policy. The image below shows the virtualiseme cluster on the default policy.
You will notice that it is currently saying the cluster has 0 days left and is over capacity by 8 VMs.
To get there, select the cluster on the left, go to the planning tab, select Views and look for the virtual machine capacity view.
Creating Capacity Profile
Profiles are what vCOPS uses to apply alert configuration and capacity information to objects in your environment. We will only be looking at the capacity side of the profile, but there is also an alert and badge side, which lets you choose when a badge should change colour (yellow or green) and allows you to disable alert types. You can do this fine tuning later, once capacity is sorted.
1) OK, let’s get creating a capacity policy. The first thing we need to do is click on the configuration link on the top left of the vCOPS vSphere UI dashboard, as shown in the image below.
2) You will now be presented with the policy editor dialog box, similar to the image below. In this box you will see all the policies you have created; if you have not created any, you will only see the default policy. Now click on the little green plus icon to kick off a new policy.
3) We now have the new policy dialog window. On the left are all the different areas of the policy, and on the right is where we make the changes. Enter a descriptive name and a good description explaining what this policy is for. The clone from drop down is basically your starting point for the policy. For example, say you have already created a production policy, new hosts have since been purchased and a new environment configured, and all you need is a small change to the existing production policy. You would select the old production policy to clone from, so there is very little you need to change on the new policy.
In the image below I have set the name to say it’s production for the DL380 servers, along with a good description of the policy.
4) Next up is the associations; this is where we will use the group we created before. You can select as many groups as you like to be part of a policy. You can even have a single policy for your whole environment if you want. As in the image below, check the box next to the group we created and hit the add button, which will move it to the right column.
5) Once we have associated our group with this new policy we will move onto section 3. You can revisit the badges section later; those settings will be set to whatever the cloned policy uses, so in this case the default.
As you can see from the image below, there are a number of configuration settings we can make. Let’s have a look at each one.
The biggest killer of using this setting in most environments is guest based AV and backups; these services cause big and quite lengthy “peaks” which we don’t want to include in capacity. If you are running host based backups and AV then definitely check this option; it will then include VM peaks in the capacity management numbers.
6) Click next to move onto 3b Usable Capacity. In this section we choose what we want subtracted from our physical (or actual) capacity to make what is called usable capacity. In the image below you will notice that under Usable Capacity Rules we have only selected use HA settings. This means the number of hosts you have configured for failover capacity in vCenter for this cluster will be removed from the actual capacity. The buffer settings are just that, a buffer, so if you want a 10% buffer on CPU and memory then enter that in. This gives you and your organisation plenty of time for procurement of new equipment, or a just in case reserve.
For Capacity Calculation Rules it is recommended to select last known capacity, because it uses historical data for calculations. I have seen issues when a host has gone offline before the capacity calculation takes place, which by default is at 1am every morning. This can have a serious impact on the capacity management information displayed, and then you spend most of the next day explaining to the boss why it doesn’t look right.
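To make the usable capacity idea from step 6 a bit more concrete, here is a rough Python sketch. It is not how vCOPS calculates things internally; it assumes HA reserves the largest hosts and that the buffer is taken as a percentage of what is left. The function name, the four 128GB hosts and the single failover host are all made up for illustration.

```python
def usable_capacity(host_capacities, ha_failover_hosts=0, buffer_pct=0.0):
    """Estimate usable capacity from a list of per-host capacities.

    Assumptions (not taken from vCOPS itself): the HA failover hosts
    are the largest ones, and the buffer is a percentage of whatever
    remains after HA has been subtracted.
    """
    if ha_failover_hosts >= len(host_capacities):
        raise ValueError("failover hosts must be fewer than total hosts")
    # Drop the largest hosts to account for HA failover capacity.
    keep = len(host_capacities) - ha_failover_hosts
    after_ha = sum(sorted(host_capacities)[:keep])
    # Hold back the buffer percentage as a reserve.
    return after_ha * (1 - buffer_pct / 100)


# Hypothetical cluster: four hosts with 128GB each, one HA failover
# host and a 10% memory buffer.
memory_gb = usable_capacity([128, 128, 128, 128],
                            ha_failover_hosts=1, buffer_pct=10)
print(round(memory_gb, 1))
```

In this made up cluster, 512GB of actual memory shrinks to roughly 345.6GB of usable memory, which is why the HA and buffer choices have such a big effect on the days remaining figure.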
7) Click next, or select the next one down, 3c Usage Calculation, on the left hand side. You should now see the Usage Calculation settings page, similar to the image below. As you can see there are 2 parts to this page:
As you can see in the image below, we have set a 6:1 vCPU to pCPU ratio, meaning that for every physical core the host has we can allocate 6 virtual CPUs. For memory we have put in 50%, which means that if your host has 128GB of RAM you can allocate 192GB of RAM to virtual machines. For production environments this can vary, but for development and test environments you can stretch this out to 100% or greater.
The Disk Space setting only matters if you are using thin provisioning; you can set this to what best suits the type of storage you are running.
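The overcommit numbers from step 7 are simple arithmetic, sketched below in Python. The function names and the 16 core host are my own assumptions for the example; only the 6:1 ratio, the 50% memory setting and the 128GB host come from the settings above.

```python
def allocatable_vcpus(physical_cores, vcpus_per_core):
    """vCPUs that can be allocated at a given vCPU to pCPU ratio."""
    return physical_cores * vcpus_per_core


def allocatable_memory_gb(physical_gb, overcommit_pct):
    """Memory that can be allocated at a given overcommit percentage."""
    return physical_gb * (1 + overcommit_pct / 100)


# A hypothetical 16 core host at the 6:1 ratio used in this policy.
print(allocatable_vcpus(16, 6))         # 96
# The 128GB host from the text with 50% memory overcommit.
print(allocatable_memory_gb(128, 50))   # 192.0
# Stretching to 100% for a development or test environment.
print(allocatable_memory_gb(128, 100))  # 256.0
```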
8) Click next and now we are onto 4a, to configure the thresholds for powered off and idle machines. This section is very straightforward; in most cases the default settings, which are pictured below, meet the definition of what idle and powered off are.
If a virtual machine is powered off for 90% or more of the time it is considered powered off, and if a virtual machine spends 90% or more of its time averaging less than 100MHz of CPU usage, less than 20KBps of disk IO and less than 1KBps of network IO then it is idle. Of course you can change this to whatever you feel better defines an idle or a powered off machine.
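The idle and powered off rules above can be sketched as a small Python classifier. This is only an illustration of the thresholds, assuming we have a list of per-interval averages for a virtual machine; the `Sample` type and the `classify` function are hypothetical and are not part of vCOPS.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Average figures for one collection interval (hypothetical shape)."""
    powered_on: bool
    cpu_mhz: float = 0.0
    disk_kbps: float = 0.0
    net_kbps: float = 0.0


def classify(samples, time_pct=90, cpu_mhz=100, disk_kbps=20, net_kbps=1):
    """Return 'powered off', 'idle' or 'active' using the default thresholds."""
    total = len(samples)
    if total == 0:
        raise ValueError("need at least one sample")
    off = sum(1 for s in samples if not s.powered_on)
    # Powered off for time_pct or more of the intervals.
    if off * 100 >= time_pct * total:
        return "powered off"
    # Idle means below ALL three thresholds in an interval.
    idle = sum(
        1 for s in samples
        if s.powered_on
        and s.cpu_mhz < cpu_mhz
        and s.disk_kbps < disk_kbps
        and s.net_kbps < net_kbps
    )
    if idle * 100 >= time_pct * total:
        return "idle"
    return "active"


# Nine quiet intervals and one busy one: idle at the 90% default.
quiet = Sample(True, cpu_mhz=40, disk_kbps=5, net_kbps=0.2)
busy = Sample(True, cpu_mhz=900, disk_kbps=300, net_kbps=50)
print(classify([quiet] * 9 + [busy]))  # idle
```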
9) Click next and we move onto Oversized and Undersized thresholds. This is where we configure the settings that will impact the right sizing recommendations vCOPS makes for you. If you remember back to chapter 2, the oversized VMs section mentions the default settings when talking about waste: a virtual machine is considered oversized if its CPU demand is below 30% for 1% of the time. This means that over 30 days, if it spends 7.2 hours below 30% demand, it is oversized:
720 hours(30 days) x 0.01 = 7.2 Hours
As you can see from the image below, we have changed this to 50%. This means that if the virtual machine spends 360 hours demanding below 30%, it is oversized:
720 hours(30 days) x 0.50 = 360 Hours
This is a more realistic number. Undersized machines work exactly the same way but at the other end of the scale: a machine must spend at least 50% of the time demanding more than 80%.
10) Click next again and we are on the last part to do with capacity management, which is 4c Underuse and Stress. These settings are very similar to the oversized and undersized settings we just made, but this time we are talking about underuse and stress. These settings apply to “containers”. What is a container, you might be asking? Well, there are several fixed containers in vCOPS. These are:
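If you want to play with these thresholds before changing the policy, the sums can be sketched in a few lines of Python. The function names are my own, and the oversized and undersized checks assume you have one demand reading per hour, which is a simplification of whatever vCOPS does internally.

```python
def hours_needed(time_pct, window_days=30):
    """Hours in the window that must meet the rule before it triggers."""
    return window_days * 24 * time_pct / 100


def is_oversized(hourly_demand_pct, demand_below=30, time_pct=50):
    """True if demand is below the threshold for time_pct of the readings."""
    if not hourly_demand_pct:
        raise ValueError("need at least one reading")
    hits = sum(1 for d in hourly_demand_pct if d < demand_below)
    return hits * 100 >= time_pct * len(hourly_demand_pct)


def is_undersized(hourly_demand_pct, demand_above=80, time_pct=50):
    """True if demand is above the threshold for time_pct of the readings."""
    if not hourly_demand_pct:
        raise ValueError("need at least one reading")
    hits = sum(1 for d in hourly_demand_pct if d > demand_above)
    return hits * 100 >= time_pct * len(hourly_demand_pct)


print(hours_needed(1))   # 7.2, the default setting
print(hours_needed(50))  # 360.0, the setting used in this policy

# A made up VM that sits at 10% demand for 400 of 720 hours.
readings = [10] * 400 + [60] * 320
print(is_oversized(readings))   # True
print(is_undersized(readings))  # False
```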
• World
• vCenters
• Datacenter
• Clusters
• Hosts
• Datastore
The group we just made for this policy is also a container; basically, a container is anything that holds other objects. The way to look at this is that it is oversized and undersized for containers. The default settings are 1% of the time spent under or over x. It is recommended to change this, as we have done in the image below.
Datastore waste is the only setting that differs from the previous ones we made, and it is very straightforward: you choose how many days until a snapshot is considered waste. Let’s hope you don’t have a snapshot older than 180 days, which is the default, although this could be the case if you are using linked clones, which require a virtual machine with a snapshot to work.
11) Now it’s time to hit finish; we have gone through all of the capacity management side of the settings. It’s time to see what difference this has made to the environment’s capacity recommendations. When we hit finish we should see the new policy, as shown in the image below.
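The snapshot waste rule is just an age check, sketched here in Python. The function name and the dates are hypothetical; only the 180 day default comes from the policy setting above.

```python
from datetime import date


def is_snapshot_waste(created, today, max_age_days=180):
    """True if the snapshot is older than the waste threshold in days."""
    return (today - created).days > max_age_days


# Made up dates: a snapshot taken on 1 January, checked on 1 August.
print(is_snapshot_waste(date(2013, 1, 1), date(2013, 8, 1)))  # True
# A one month old snapshot is not counted as waste.
print(is_snapshot_waste(date(2013, 7, 1), date(2013, 8, 1)))  # False
```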
Let’s now have a look and see whether the profile and the group have worked. If I navigate to the cluster, or a host in the cluster that we configured, I should see what the active profile is. The image below shows that the new policy and group have worked perfectly.
Now remember, before we started creating the policy we had an image showing us what the remaining capacity was: 0 days remaining and 8 virtual machines over capacity. Let’s look at the image below and see what the new policy has done for the capacity.
We are still at zero days remaining, but only just over, by .57 of a virtual machine. I think this is right on the money. I know this environment and that number feels right; I know that if any more virtual machines were provisioned, performance issues would start to appear. Well done, you have now created your first capacity management policy and now it is
Next up is part 3 – what if?
Thank you for this well written article, much appreciated.
No problem, thank you for the great feedback.