Thursday, May 14, 2015

What's So New in vSphere 6?

Going by the announcement and the datasheet, quite a lot of functionality has been added.  However, some of the more critical changes are not widely known, and these are the ones that those who have been using vSphere since version 4 or earlier have been waiting to see improved or resolved.

There has been much discussion about storage UNMAP with thin provisioning, and many called it a "myth".  This was also discussed heavily in our Facebook VMUG - ASEAN group.  The confusion came from the many changes between VMFS3 and VMFS5.  For those who have missed out, Cody wrote a long history of what changed here.

A KB was also released, and it sparked some discussion about how VMFS3 with different block sizes would benefit thin provisioning, so to speak, before vSphere 5.0 Update 1.  Sadly, after that, UNMAP was no longer possible via the GUI or automatically; it could only be done via the command line or a script.
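As a rough illustration of the command-line route, the sketch below assembles the `esxcli storage vmfs unmap` reclaim command that vSphere 5.5 and later provide for VMFS datastores.  The helper function, the datastore name and the reclaim unit value are made up for illustration; the function only builds the argument list and does not run anything against a host.

```python
from typing import List, Optional


def build_unmap_command(datastore_label: str,
                        reclaim_unit: Optional[int] = None) -> List[str]:
    """Assemble (but do not run) an esxcli VMFS space-reclaim command.

    datastore_label: name of the VMFS datastore (hypothetical in the example).
    reclaim_unit:    optional number of VMFS blocks to reclaim per iteration.
    """
    if not datastore_label:
        raise ValueError("a datastore label is required")

    cmd = ["esxcli", "storage", "vmfs", "unmap",
           "--volume-label=" + datastore_label]

    if reclaim_unit is not None:
        if reclaim_unit <= 0:
            raise ValueError("reclaim_unit must be a positive number of blocks")
        cmd.append("--reclaim-unit=" + str(reclaim_unit))
    return cmd


# Example with an assumed datastore name; print the command for review.
print(" ".join(build_unmap_command("Datastore01", reclaim_unit=200)))
```

Building the command as a list keeps it easy to review or log before it is pasted into an ESXi shell session or handed to whatever script runner you already use.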

I tried asking internally as well, and luckily Cormac has listed his findings and the answers to these questions here.  Sadly, Linux still cannot be supported due to the legacy SCSI version.  At least we are on the right track now, with Windows supported.

vSphere Data Protection (VDP) was first introduced in vSphere 5.1, replacing VMware Data Recovery.  VDP runs as a virtual appliance version of EMC Avamar and was first offered in a normal edition and an Advanced edition.  The Advanced edition (VDPA) had to be purchased and came with three agents (SQL, SharePoint, Exchange) and deduplicated storage of up to 8TB per appliance instead of the 2TB of the normal edition.

With VDPA, customers could also purchase a per-OS-instance license to back up their physical servers, as shown here.

With vSphere 6, VDPA is now simply known as VDP and is provided free; it is no longer a separate purchase.  So the next question that arose was: can users use VDP in vSphere 6 to back up physical servers via the agent?  The answer is yes.  Is there a cost to this?  VDP is now free, so the simple answer is no, it is free!  How good is that!

VADP (vSphere APIs for Data Protection) now supports Linux, keeping the file system consistent when taking snapshots and backups.  Windows 2012 deduplicated volumes are also supported.  Performance using VADP has been optimized and is faster.

vMotion Support for MSCS
vMotion is now supported for MSCS with Cluster Across Boxes (CAB) and for clusters with a physical server (VM as the standby node).  This is supported from Windows 2008 R2 onwards.  Storage vMotion / XvMotion is not supported.  For CAB, make sure DRS anti-affinity constraints are enforced.

Lockdown Mode
There are two different Lockdown modes:
  • Normal Lockdown
  • Strict Lockdown
These are explained here.  A KB on this is also provided.

Exception Users are also introduced.  Only users with administrative privileges who have been added to the Exception Users list will be able to access the DCUI in Normal Lockdown mode.  The other option is to add the user to DCUI.Access in the advanced options to give access to the DCUI.

In Strict Lockdown, the DCUI is disabled.  Only when SSH or ESXi Shell is enabled will users with administrative privileges in Exception Users be able to access the ESXi server.  If not, and the host can no longer be reached through vCenter Server, a reinstall is required.
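To make the two modes easier to compare, here is a small sketch that encodes only the access rules described above.  The `HostUser` class and the `allowed_access` function are hypothetical names of my own and are not part of any VMware API; anything the post does not state (for example shell access in the normal mode) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class HostUser:
    name: str
    is_admin: bool = False            # has administrative privileges on the host
    in_exception_users: bool = False  # listed under Exception Users
    in_dcui_access: bool = False      # listed in the DCUI.Access advanced option


def allowed_access(user: HostUser, mode: str, shell_enabled: bool = False) -> set:
    """Return the local access paths open to a user, per the rules in this post.

    mode is "normal" or "strict"; shell_enabled means SSH or ESXi Shell is on.
    """
    privileged_exception = user.is_admin and user.in_exception_users
    paths = set()
    if mode == "normal":
        # DCUI stays available to admin Exception Users and DCUI.Access users.
        if privileged_exception or user.in_dcui_access:
            paths.add("dcui")
    elif mode == "strict":
        # DCUI is disabled; only the shell path can remain, if it is enabled.
        if shell_enabled and privileged_exception:
            paths.add("shell")
    else:
        raise ValueError("mode must be 'normal' or 'strict'")
    return paths


admin = HostUser("svc-backup", is_admin=True, in_exception_users=True)
print(allowed_access(admin, "normal"))                      # DCUI only
print(allowed_access(admin, "strict", shell_enabled=True))  # shell only
print(allowed_access(admin, "strict", shell_enabled=False)) # nothing left
```

The last call returns an empty set, which is the situation to plan for before enabling the strict mode.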

NIOC versions 2 and 3 coexist in vSphere 6.0, and the differences are recorded here.  A performance improvement white paper has also been produced.

Support for 10GbE cards has increased from 8 in vSphere 5.5 to 16.

SR-IOV limitations include vMotion, DRS, Suspend/Resume, Snapshot, Memory Overcommit, Network I/O Control and Fault Tolerance.  vSphere 6.0 supports a total of 64 VFs per 10GbE card compared to a total of 8 VFs per 10GbE card in vSphere 5.5, which is an 8 times increase!

vSphere Replication
Many might not be aware of, or have not been made aware of, the changes made to vSphere Replication (vR).  There are actually enhancements to it that have not been made widely known.  One of the major enhancements is compression, which reduces the amount of data to be replicated across and effectively saves you bandwidth.  Also mentioned here is the introduction of a dedicated network for NFC instead of sharing the Management Network as in the past.  Other changes are the inclusion of Linux OS quiescing and the removal of the need for a full sync whenever a Storage vMotion is triggered.  A white paper just on vR is also provided here.

I have previously written an article on the new vNUMA improvement here.  With this improvement, memory locality can be increased across NUMA nodes.

I will add more information here on things that are not really made known as I get hold of it.  I hope this shows you the beauty of this release.

Monday, May 11, 2015

vNUMA Improvement in vSphere 6

NUMA is always a very interesting topic in design and operations in the virtualization space.  We need to understand it so that we can size a VM effectively and efficiently for the application to perform at its optimum.

To understand what NUMA is and how it works, a very good article to read is here.  Mathias has explained it in very simple terms with good pictures, so I do not have to reinvent the wheel.  How I wish I had this article back then.

Starting from ESX 3.5, ESX servers were made NUMA aware, allowing for memory locality via the NUMA node concept.  This helps address memory locality performance.

In vSphere 4.1, wide-VM was introduced.  This was to handle VMs being allocated more vCPUs than the physical cores per CPU (larger than a NUMA node).  Check out Frank's post.

In vSphere 5.0, vNUMA was introduced to improve CPU scheduling performance by exposing the physical NUMA architecture to the VM.  Understanding how this works helps to explain why, as a best practice, we try not to place different makes of ESXi servers in the same cluster.  You can read more about it here.

All these improvements to NUMA help address memory locality issues.  But how does memory allocation work when using Memory Hot-Add, since Memory Hot-Add was not vNUMA aware?

With the release of vSphere 6, there are also improvements in NUMA in terms of memory.  One of them is that Memory Hot-Add is now vNUMA aware.  However, many weren't aware of how memory was previously allocated.

Here I will illustrate with some diagrams to help in understanding.

Let's start with what happened prior to vSphere 6 when memory is hot-added to a VM.

Take a VM with 3 GB of virtual memory configured.

When an additional 3 GB of memory is hot-added to the VM, the memory is placed on the first NUMA node, followed by the next node once memory is insufficient, one node after another in sequence.

In vSphere 6.0, Hot-Add memory is now more NUMA friendly.

Memory allocation is now balanced evenly across all the NUMA nodes instead of putting everything in one basket on the first NUMA node.  This avoids memory being accessed mostly from the first NUMA node and thus increases the chance of a local memory access.
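The difference between the two behaviours can be sketched with a toy model.  This is only an illustration of the description above and not how the ESXi scheduler is implemented; the two-node layout, the 4 GB per-node capacity and the even split of the initial 3 GB are assumptions I picked so the numbers are easy to follow.

```python
from typing import List


def hot_add_sequential(allocated: List[float], capacity: List[float],
                       add_gb: float) -> List[float]:
    """Pre-vSphere 6 style as described: fill the first node, then spill over."""
    result = list(allocated)
    remaining = add_gb
    for node in range(len(result)):
        free = capacity[node] - result[node]
        take = min(free, remaining)
        result[node] += take
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough free memory across the NUMA nodes")
    return result


def hot_add_balanced(allocated: List[float], add_gb: float) -> List[float]:
    """vSphere 6 style as described: spread the added memory evenly."""
    share = add_gb / len(allocated)
    return [current + share for current in allocated]


# Assumed starting point: 3 GB VM split evenly over two nodes of 4 GB each.
before = [1.5, 1.5]
capacity = [4.0, 4.0]

print("sequential:", hot_add_sequential(before, capacity, 3.0))
print("balanced:  ", hot_add_balanced(before, 3.0))
```

With these assumed numbers the sequential model ends up lopsided (most of the new memory lands on the first node), while the balanced model keeps both nodes at the same size.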

We would wish that this could be smarter, but of course we cannot predict which NUMA node memory will be accessed from when a process is running.

I hope this gives you a better picture when doing sizing and enabling the hot-add function.

Sunday, May 3, 2015

Applications for Storage or Storage for Applications?

With many new start-ups offering everything from storage arrays, converged and hyper-converged systems to software-defined storage (SDS), users now have lots of choices to make.

Recently I have encountered many questions on which one to choose and which is better.  However, there is no straight answer, as there are just too many choices to choose from, just like in a supermarket.  In the end, some may choose the one that advertises the best and leaves the strongest impression in their mind.  To be truthful, you will not buy one and replace the rest; rather, some end up with a hybrid environment for reasons which we will go through later.

With several asks and questions, I would like to give some guidelines for deciding.  Here I will do my best to start with no bias towards any technology; this is my personal opinion and may not be the same as others'.

1.  Ease of management: A big word often misused by marketing, I would say.  Assess it and ask yourself whether you have a team to manage the different components, or a lean team to manage it all.  How is ease of use defined?  Walk through the daily tasks you commonly need to perform on a traditional setup and compare them with the new technology you are evaluating.

2.  What applications are you running on it, and can the components support the performance?  When performance comes into play, many only look at storage throughput and IOPS, but we also need to look at daily operational tasks.  How fast can it spin up a workload in a server landscape and a VDI landscape (if you are using one as well)?  Test everything; do not just look at a demonstration of one scenario.  Give everything a score and decide which you can do with or without.  There won't be one that fits the whole bill.  Pay for what you need now, not for extras and the future.

3.  How do you intend to protect this application?
You know it can meet your requirements in day 0 operations, but do you need to protect this application?  If you are doing backups, can it support a backup API, whether from Microsoft or VMware?  Weigh the cost between the two.  Would you need storage snapshots, and if so, would your workload need to be application/data consistent?  Can the storage in the new device you are looking at do it?  If it can, is it built in, or done via a script or an agent?  How easy is it?

4.  How are you going to do disaster recovery?
The cheapest way might be to leverage a host-based replication technology that will work with any of the devices chosen.  However, what if you need to perform some kind of storage replication?  Will your workload be application/data consistent?  Can the storage in the new device you are looking at do it?  If it can, is it built in, or done via a script or an agent?  Which applications does it support if you are going to run them on this new equipment?

5.  Is it easy to do maintenance, such as physical component upgrades, firmware upgrades and software upgrades?
This is important, as you will definitely have to do this as time goes along.  We can't accept something which gives you easy day 0 operations yet creates lots of work for maintenance.

6.  Does it come with prerequisites?
The fine print that always exists in this world of things.  Ask whether, other than the equipment you are choosing, it comes with requirements you need to meet, or whether it can work with your existing infrastructure components and leverage your existing investment.

7.  Proof of Concept: Before you perform a pilot or proof of concept, decide whether you are placing real data or dummy data on it.  You need to know whether this data can be removed easily from the equipment later, and who is responsible for doing that.  If it is you, know how you are supposed to do it before you start any activity.  You definitely do not want the decision to be made because your data is on it after the test, instead of because it meets your requirements.

8.  Can it offload storage activities, e.g. full copy and snapshot activities, to the storage, or will these use your hosts' CPU cycles?  Understanding this helps to identify the specification requirements for the nodes or servers you are using, so that you do not find contention later.

9.  Can the new device leverage your current investment, e.g. reuse existing SAN, IP storage, etc.?  For converged and hyper-converged, can it use both its built-in storage and existing storage?  For a new storage array, can it work with your existing equipment, e.g. server HBAs, network cards, etc.?

Beyond the considerations above there might be more; these are just some questions that ought to be thought through.  Definitely no single piece of equipment can fulfill everything, which also means you might end up with a mixture for different workloads, some of which might still need your traditional setup.
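The "give everything a score" idea from point 2 can be kept honest with a simple weighted scoring sheet.  The sketch below is one way to do it, not a prescribed method; the criteria names, weights, option names and ratings are all made-up placeholders that you would replace with your own.

```python
from typing import Dict, List, Tuple


def weighted_score(weights: Dict[str, float], ratings: Dict[str, float]) -> float:
    """Weighted average of the ratings; every weighted criterion must be rated."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(sorted(missing)))
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must add up to a positive number")
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def rank_options(weights: Dict[str, float],
                 options: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (option, score) pairs, best score first."""
    scored = {name: weighted_score(weights, r) for name, r in options.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder weights (importance) and ratings (1 = poor, 5 = excellent).
weights = {"management": 3, "performance": 3, "protection": 2, "maintenance": 2}
options = {
    "option_a": {"management": 5, "performance": 3, "protection": 4, "maintenance": 2},
    "option_b": {"management": 3, "performance": 4, "protection": 4, "maintenance": 4},
}

for name, score in rank_options(weights, options):
    print(name, round(score, 2))
```

Agreeing on the weights with your team before you start testing helps keep the loudest advertisement from deciding the result.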
