Data Automation
Posted on Mon, Mar 14, 2011
Author: Michael Traves, Practice Lead for Data Management
Consolidating and virtualizing infrastructure can deliver many benefits, but those benefits should not come at the cost of more management effort or more complex processes over the life of the asset and solution. One key to this is automation: notably, the ability to take tasks that normally require administrative intervention or complex policy and script writing, and have them run on demand or on a schedule without a user having to analyze and execute them.
Today’s storage arrays come with a wide assortment of software management capabilities that automate provisioning tasks which, in the past, had to be performed manually, such as RAID parity configuration, striping, LUN creation, and device discovery. The most notable capabilities, however, are wide striping, thin provisioning, and dynamic data placement/caching. Here is a quick overview of each to illustrate its greatest benefits:
Wide Striping places the data within a LUN (volume, filesystem) across as many disks as possible. Sometimes this is limited to a pool of disks; other times it spans the entire set of disks in the array. The objective is to create as much parallel IO as possible, across as many devices as makes sense, to drive IO rates up for all co-located data.
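To make the idea concrete, here is a minimal sketch of how a logical block might be mapped round-robin across a pool of disks. The function names, the `stripe_blocks` parameter, and the simple round-robin layout are assumptions for illustration only; real arrays use their own, usually more elaborate, layouts.

```python
def locate_block(logical_block: int, num_disks: int, stripe_blocks: int = 1) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Hypothetical round-robin layout: consecutive stripe units of
    `stripe_blocks` blocks land on consecutive disks in the pool.
    """
    stripe_unit = logical_block // stripe_blocks        # which stripe unit holds the block
    disk = stripe_unit % num_disks                      # round-robin across the pool
    row = stripe_unit // num_disks                      # how many full passes over the pool
    offset = row * stripe_blocks + logical_block % stripe_blocks
    return disk, offset


def disks_touched(first_block: int, count: int, num_disks: int, stripe_blocks: int = 1) -> set[int]:
    """Return the set of disks that serve a sequential run of logical blocks."""
    return {
        locate_block(block, num_disks, stripe_blocks)[0]
        for block in range(first_block, first_block + count)
    }


if __name__ == "__main__":
    # The same 8-block read is served by more spindles as the pool widens.
    for pool_size in (2, 4, 16):
        print(pool_size, "disks in pool ->", len(disks_touched(0, 8, pool_size)), "disks working in parallel")
```

The wider the pool, the more devices share any given run of blocks, which is where the parallel IO comes from.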
Generally, wide striping is enabled hand-in-hand with Thin Provisioning, another feature commonly available at the array, pool, or aggregate level. Thin Provisioning allocates storage across a set of disks only as it is needed, rather than pre-allocating it ahead of time. This greatly improves storage efficiency and lets administrators provision for the life of the asset rather than only for initial requirements.
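The allocate-on-write behaviour can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular array implements it: the `ThinPool` class, its method names, and the block-granular bookkeeping are all hypothetical.

```python
class ThinPool:
    """Toy thin-provisioned pool: physical blocks are assigned on first write."""

    def __init__(self, physical_blocks: int):
        self.free = list(range(physical_blocks))   # unallocated physical blocks
        self.volumes = {}                          # name -> {"size": int, "map": {logical: physical}}

    def create_volume(self, name: str, logical_blocks: int) -> None:
        # Creating a volume consumes no physical space at all.
        self.volumes[name] = {"size": logical_blocks, "map": {}}

    def write(self, name: str, logical_block: int) -> int:
        vol = self.volumes[name]
        if not 0 <= logical_block < vol["size"]:
            raise IndexError("write beyond the end of the volume")
        if logical_block not in vol["map"]:
            if not self.free:
                # The risk of over-provisioning: the pool can run dry.
                raise RuntimeError("pool exhausted")
            vol["map"][logical_block] = self.free.pop()
        return vol["map"][logical_block]

    def provisioned(self) -> int:
        """Total logical capacity promised to hosts."""
        return sum(v["size"] for v in self.volumes.values())

    def allocated(self) -> int:
        """Physical blocks actually in use."""
        return sum(len(v["map"]) for v in self.volumes.values())


if __name__ == "__main__":
    pool = ThinPool(physical_blocks=100)
    pool.create_volume("db", 80)
    pool.create_volume("mail", 80)
    pool.write("db", 0)
    print("provisioned:", pool.provisioned(), "allocated:", pool.allocated())
```

Note that the pool above promises more capacity than it physically has. That is the point of thin provisioning, and also why pool utilization needs to be monitored.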
Thin Provisioning implies a certain amount of block-level virtualization, which is why Wide Striping is usually discussed at the same time. Because the logic to virtualize a block’s location is already in place, other functions that work at the block level can be implemented as well. These can include encryption, compression, and data deduplication, features that generally improve security and storage efficiency. Block-level deduplication of shared blocks is particularly beneficial in environments where those blocks end up being served from cache frequently: because one cached copy serves every workload that shares the block, less cache is required to improve application performance across a large number of workloads, and IO spikes are avoided. This is very noticeable in applications such as Virtual Desktops, particularly during the boot and anti-virus storms that tend to occur with regularity.
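As a rough illustration of why shared blocks cost so little once block locations are virtualized, here is a small content-addressed store. The `DedupStore` class and its reference-counting scheme are assumptions made for this sketch; production arrays differ in hashing, granularity, and collision handling.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one physical copy per unique block content."""

    def __init__(self):
        self.blocks = {}     # digest -> block data (the single stored copy)
        self.refcount = {}   # digest -> number of logical blocks pointing at it
        self.block_map = {}  # (volume, lba) -> digest

    def write(self, volume: str, lba: int, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        old = self.block_map.get((volume, lba))
        if old == digest:
            return                      # same content rewritten, nothing to do
        if old is not None:
            self._release(old)          # overwritten block loses one reference
        if digest not in self.blocks:
            self.blocks[digest] = data  # first time this content is seen
            self.refcount[digest] = 0
        self.refcount[digest] += 1
        self.block_map[(volume, lba)] = digest

    def read(self, volume: str, lba: int) -> bytes:
        return self.blocks[self.block_map[(volume, lba)]]

    def _release(self, digest: str) -> None:
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blocks[digest]


if __name__ == "__main__":
    store = DedupStore()
    os_block = b"identical operating system block"
    for desktop in range(50):
        store.write(f"desktop-{desktop}", 0, os_block)
    print("logical blocks:", len(store.block_map), "physical blocks:", len(store.blocks))
```

Fifty virtual desktops booting from the same image reference a single stored block here, so a single cached copy can serve all of them.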
Dynamic Data Placement is a capability that puts data on the right tier at the right time, at the block (or, in some cases, sub-LUN) level. It lets the intelligence in the storage array work out which tier the data should be on, based on real-time statistics gathering, and then moves the data to the appropriate tier on either a scheduled or a dynamic basis.
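A simplified version of that decision might look like the following. The two-tier model, the `plan_tiers` and `plan_moves` names, and the "hottest blocks win" policy are assumptions for illustration; actual arrays weigh many more statistics than a raw access count.

```python
def plan_tiers(access_counts: dict[int, int], fast_capacity: int) -> dict[int, str]:
    """Assign the most frequently accessed blocks to the fast tier.

    access_counts maps a block (or sub-LUN extent) id to the number of IOs
    observed in the last statistics window. Ties are broken by block id.
    """
    ranked = sorted(access_counts, key=lambda block: (-access_counts[block], block))
    hot = set(ranked[:fast_capacity])
    return {block: ("fast" if block in hot else "slow") for block in access_counts}


def plan_moves(current: dict[int, str], target: dict[int, str]) -> list[tuple[int, str, str]]:
    """List (block, from_tier, to_tier) for every block whose tier should change."""
    return [
        (block, current[block], target[block])
        for block in sorted(target)
        if current.get(block) != target[block]
    ]


if __name__ == "__main__":
    counts = {0: 100, 1: 2, 2: 50, 3: 1}          # IOs per block in the last window
    current = {0: "slow", 1: "fast", 2: "slow", 3: "slow"}
    target = plan_tiers(counts, fast_capacity=2)
    for block, src, dst in plan_moves(current, target):
        print(f"move block {block}: {src} -> {dst}")
```

Running the planner on a schedule corresponds to the scheduled mode described above; running it whenever the statistics change corresponds to the dynamic mode.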
Associated with this type of data movement is Dynamic Data Caching. Rather than actually moving a block between tiers, the array caches a copy of it in flash or SSD. The benefit of this approach is that the array’s read cache becomes much larger and can hold larger working sets (such as those found in virtualized environments), without having to relocate all of the data onto expensive SSD. As data in the cache ages, it falls out and is replaced automatically by more active data.
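The aging behaviour described here can be approximated with a least-recently-used (LRU) read cache. LRU is an assumption chosen for this sketch, as are the `FlashReadCache` class and the `backend_read` callback; the article does not say which eviction policy any given array uses.

```python
from collections import OrderedDict
from typing import Callable


class FlashReadCache:
    """Toy LRU read cache sitting in front of slower disk storage."""

    def __init__(self, capacity: int, backend_read: Callable[[int], bytes]):
        self.capacity = capacity
        self.backend_read = backend_read   # hypothetical function that reads from disk
        self.cache = OrderedDict()         # lba -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, lba: int) -> bytes:
        if lba in self.cache:
            self.cache.move_to_end(lba)    # mark as most recently used
            self.hits += 1
            return self.cache[lba]
        self.misses += 1
        data = self.backend_read(lba)      # the block itself stays where it is on disk
        self.cache[lba] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # least recently used block ages out
        return data


if __name__ == "__main__":
    cache = FlashReadCache(capacity=2, backend_read=lambda lba: f"block-{lba}".encode())
    for lba in (1, 2, 1, 3, 1):
        cache.read(lba)
    print("hits:", cache.hits, "misses:", cache.misses, "cached:", list(cache.cache))
```

Nothing is relocated on the back end: eviction just drops the cached copy, which is why this approach avoids the cost of moving data between tiers.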
Having gotten this far, you must be asking yourself: does my storage array have any of these features? Maybe, but more likely not, unless you purchased it recently. Whether the features are licensed, or in use, is another matter altogether. All too often, features get marketed but are never sold or properly implemented. Sometimes this is due to budgeting, but more often they just don't work as advertised and get turned off before deployment into production.
If you find yourself in this situation, or are looking at options to replace, upgrade, augment, or properly deploy what you already have, get in touch through the form on this page. Sometimes it's just a matter of better using what you already own - or replacing it when its lifecycle ends, with something capable of delivering the value you're looking for in a consolidated, virtualized storage infrastructure to support your application and business services.