
Data Center Architecture: Converged, HCI, and Hyperscale


A comparison of three approaches to enterprise infrastructure.

If you are planning an infrastructure refresh or designing a greenfield data center, the hype around converged infrastructure, hyperconverged infrastructure (HCI) and hyperscale might have you scratching your head. In this blog, I'll compare and contrast the three approaches and consider scenarios where one infrastructure architecture would be a better fit than the others.

Converged infrastructure

Converged infrastructure (CI) incorporates compute, storage and networking in a pre-packaged, turnkey solution. The primary driver behind convergence was server virtualization: extending its flexibility to the storage and network components. With CI, administrators could use automation and management tools to control the core components of the data center, allowing a single admin to provision, de-provision and change compute, storage or networking resources on the fly.

Converged infrastructure platforms use the same silo-centric infrastructure components as traditional data centers; they're simply pre-architected and pre-configured by the manufacturers. The glue that unifies the components is specialized management software. One of the earliest and most popular CI examples is Virtual Computing Environment (VCE), a joint venture of Cisco Systems, EMC and VMware that developed and sold converged infrastructure solutions of various sizes known as Vblock. Today, Vblock systems are sold by the combined Dell-EMC entity, Dell Technologies.

CI solutions are a great choice for infrastructure pros who want an all-in-one solution that's easy to buy and arrives pre-packaged from the factory. CI is also easier from a support standpoint: if you maintain support contracts on your CI system, the manufacturer will assist in troubleshooting end to end. That said, many vendors are shifting their focus toward hyperconverged infrastructure.

Hyperconverged infrastructure

HCI builds on CI. Rather than bundling discrete compute, network and storage components, hyperconverged infrastructure uses software to integrate them into a single unit. This design offers performance advantages and eliminates a great deal of physical cabling compared to siloed and CI-based data centers.

Hyperconverged solutions also provide far more capability in terms of unified management and orchestration. The mobility of applications and data is greatly improved, as is the setup and management of functions like backups, snapshots and restores. These operational efficiencies make HCI architectures more attractive from a cost-benefit standpoint than traditional converged infrastructure solutions.

In the end, a hyperconverged solution is all about simplicity and speed. A great use case for HCI would be a new virtual desktop infrastructure (VDI) deployment. Using the orchestration and automation tools available, you have the ideal platform to easily roll out hundreds or thousands of virtual desktops.

Hyperscale

The key attribute of hyperscale computing is the decoupling of compute, network and storage software from the hardware. That's right: while HCI combines everything into a single chassis, hyperscale decouples the components.

This approach, as practiced by hyperscale companies like Facebook and Google, provides more flexibility than hyperconverged solutions, which tend to grow in a linear fashion. For example, if you need more storage on your HCI system, you typically must add a node that includes both compute and built-in storage. Some hyperconverged solutions are better than others in this regard, but most fall prey to linear scaling problems if your workloads don't scale in step.

Another benefit of hyperscale architectures is that you can manage both virtual and bare-metal servers on a single system, which is ideal for databases that tend to run non-virtualized. Hyperscale is most useful when you need to scale out one resource independently of the others. A good example is IoT, which typically requires a lot of data storage but not much compute. A hyperscale architecture also helps where it's beneficial to keep running bare-metal compute resources yet manage storage in elastic pools.




Data Center Transformation at ConocoPhillips


IT leaders at ConocoPhillips were already working on a major data center consolidation initiative before oil prices plummeted. The company couldn’t keep adding storage and servers; it just wasn’t sustainable, especially for a company that was looking to get serious about the cloud. The industry downturn added urgency to their efforts.

That meant taking some dramatic action to cut IT operating costs and save jobs, according to Scott Duplantis, global IT director of server, storage and data center operations at ConocoPhillips. The transformation, which focused on two data centers in the US, included fast-tracking adoption of newer technology like all-flash arrays with full-time data reduction, and refreshing compute platforms with control-plane software that manages virtual CPU and memory allocations.

All the hard work combined with a fearless approach to data center modernization paid off: The company reduced its data center footprint by more than 50%, slashed its SAN floor space consumption by 80%, cut its power and cooling costs by $450,000 a year, improved reliability, and saved jobs along the way, all in about 30 months.

“We have fewer objects under management, which means not having to add staff as we continue to grow,” Duplantis said. “Our staff can do a better job of managing the infrastructure they have, and it frees them up to pursue cloud initiatives.”

ConocoPhillips’ data center transformation initiative earned first place in the InformationWeek IT Excellence Awards infrastructure category.

Reducing the storage footprint

For its storage-area network, network-attached storage, and backup and recovery, ConocoPhillips traditionally relied on established storage vendors. The SAN alone had 62 racks of storage between the two data centers.

ConocoPhillips decided that flash storage was the way to go, and conducted a bakeoff between vendors that had the features it wanted: ease of management, data deduplication and compression, replication, and snapshotting. The company wound up choosing a relatively new vendor to supply all-flash storage for its SAN, and buying AFAs from one of its incumbent vendors for its NAS. It also focused on buying larger controllers, which, when combined with the flash, provided better performance and reduced the number of objects the staff has to manage.

The work reduced raw SAN storage from 5.6 to 1.8 petabytes. Altogether, the consolidation cuts down on object maintenance and support contracts tied to storage hardware.

Improved power and cooling efficiency from the flash storage adoption has ConocoPhillips reevaluating how its data centers are cooled. "We have to do some reengineering in our data centers to accommodate for almost half of the power footprint they had, and a significant drop in heat because these all-flash arrays don't generate much heat at all," he said.

The company also is relearning how to track and trend storage capacity needs; with full-time data reduction, measuring capacities has become a bit tricky.

While some argue that flash has a limited lifecycle, ConocoPhillips has experienced improved SAN storage reliability, Duplantis said. 

Server consolidation

On the compute side, ConocoPhillips deployed faster, more powerful servers, along with the control-plane technology that automates the management of CPU and memory workloads. Virtual server densities shot up dramatically, from 20:1 to 50:1.

The control-plane technology, from a startup, provides a level of optimization that goes beyond human scale, according to Duplantis. Combined with the flash storage, it’s helped cut performance issues to near zero.

“You really can’t just stick with the mainstream players,” he advised. “In the industry today, a lot of the true innovation is coming out of the startup space.”

Lessons learned

While the data center modernization project went smoothly for the most part, without disrupting end users, there were some hiccups with the initial flash deployment. Duplantis said the company was pleased with the support they received from the vendor, which was especially important given that the vendor was newer.

Internally, the data center transformation did require a culture shift for the IT team. IT administrators become attached to the equipment they manage, so they need to see a lot of proof that the new technology is reliable and easy to manage, Duplantis said.

“Today, we understand mistakes are made and technology can fail,” he said. “Once they saw they could take a chance and wouldn’t be in trouble if it didn’t work perfectly, they could breathe easy.”

The fact that jobs were saved amid the economic downturn, despite all the cost-cutting measures, turned employees into champions for the new technology, he said. "They see they're part of the process that helped save jobs, save costs, and increase reliability," he said.

Looking ahead

ConocoPhillips plans to continue to right-size its storage and virtual server environments; the process is now just part of the corporate DNA. On the virtual side, the team examines the number of hosts every month and decides to either keep them on premises or put them in a queue for the cloud, Duplantis said.

The team also is working to build up its cloud capability to ensure it’s ready when the economy picks up and the company increases its drilling activity. “We want to be nimble and agile when the business needs it,” he said.




Enterprise Data Storage Shopping Tips


Enterprise data storage used to be an easy field. Keeping up meant just buying more drives from your RAID vendor. With all the new hardware and software today, this strategy no longer works. In fact, the radical changes in storage products impact not only storage buys, but ripple through to server choices and networking design.

This is actually a good-news scenario. In data storage, we spent the better part of three decades with gradual drive capacity increases as the only real excitement. The result was a stagnation of choice, which made storage predictable and boring.

Today, the cloud and solid-state storage have revolutionized thinking and are driving much of the change in the industry. The cloud brings low-cost storage-on-demand and simplified administration, while SSDs make server farms much faster and drastically reduce the number of servers required for a given job.

Storage software is changing rapidly, too. Ceph is the prime mover in open-source storage code, delivering a powerful object store with universal storage capability: all three mainstream storage modes (block, file and object) served from a single storage pool. Separately, there are storage management solutions that create a single storage address space spanning everything from NVDIMMs to the cloud, compression packages that typically shrink raw capacity needs roughly fivefold, virtualization packages that turn server storage into a shared clustered pool, and tools to solve the "hybrid cloud dilemma" of where to place data for efficient and agile operations.
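To make that "universal storage" idea concrete, here is a minimal sketch of exposing block, file and object access from a single Ceph cluster; the pool names, sizes and user ID below are placeholders rather than recommendations:

    # Block: create a pool and carve an RBD image out of it
    ceph osd pool create block_pool 64
    rbd create block_pool/vol01 --size 10240

    # File: create a CephFS file system from metadata and data pools
    ceph osd pool create cephfs_data 64
    ceph osd pool create cephfs_metadata 64
    ceph fs new cephfs01 cephfs_metadata cephfs_data

    # Object: create a RADOS Gateway (S3/Swift) user
    radosgw-admin user create --uid=demo --display-name="Demo User"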

A single theme runs through all of this: storage is getting cheaper, and it's time to reset our expectations. The traditional model of a one-stop shop at your neighborhood RAID vendor is giving way to a savvier COTS buying model, where interchangeability of component elements is so good that integration risk is negligible. We are still not all the way home on the software side, but hardware is now like Lego, with the parts always fitting together. The rapid uptake of all-flash arrays has demonstrated just how easily COTS-based solutions come together.

The future of storage is “more, better, cheaper!” SSDs will reach capacities of 100 TB in late 2018, blowing away any hard-drive alternatives. Primary storage is transitioning to all-solid-state as we speak and “enterprise” hard drives are becoming obsolete. The tremendous performance of SSDs has also replaced the RAID array with the compact storage appliance. We aren’t stopping here, though. NVDIMM is bridging the gap between storage and main memory, while NVMe-over-Fabric solutions ensure that hyperconverged infrastructure will be a dominant approach in future data centers.

With all these changes, what storage technologies should you consider buying to meet your company’s needs? Here are some shopping tips.





How To Shrink Your Data Storage Footprint


I remember a few years ago parsing through all the files on a NAS box. I was amazed at all the duplicate files, but a bit more investigation revealed that we had a mix of near duplicates in with the genuine replicas. All had the same name, so it was hard to tell the valid files from the trash. I asked around and the responses I got were mostly along the lines of, “Why are we keeping that? No one uses it!”

This raises the question: do we ever throw data away anymore? Laws and regulations like the Sarbanes-Oxley Act (SOX) and HIPAA stipulate that certain data must be kept safe and encrypted. The result is that data subject to these laws tends to be kept carefully forever, but so does most of the rest of our data, too.

Storing all this data isn't cheap. Even on Google Nearline or Amazon Glacier, there is a cost associated with all of that data, its backups and its replicas. In-house, we go through the ritual of moving cold data off primary storage to bulk disk drives, and then on into the cloud, in an almost mindless manner.
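To show how mechanical that ritual has become, here is a minimal sketch of the kind of lifecycle rule many shops set once and then forget; the bucket name, prefix and thresholds are hypothetical:

    lifecycle.json (push objects under archive/ to Glacier after 90 days):

      {
        "Rules": [{
          "ID": "tier-cold-data",
          "Status": "Enabled",
          "Filter": { "Prefix": "archive/" },
          "Transitions": [{ "Days": 90, "StorageClass": "GLACIER" }]
        }]
      }

    # Apply the rule to a bucket (name is hypothetical)
    aws s3api put-bucket-lifecycle-configuration --bucket my-archive-bucket --lifecycle-configuration file://lifecycle.json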

Excuses range from "Storage is really cheap in the cloud" and "It's better to keep everything, just in case" to "Cleaning up data is expensive," or simply that it's too complicated. Organizations often invoke big data as another reason for their data stockpiling, since there may be nuggets of gold in all that data sludge. The reality, though, is that most cold, old data is just that: old and essentially useless.

As I found with that NAS box, analyzing a big pile of old files is not easy. Data owners are often no longer with the company, and even when they are, remembering what an old file was for is often impossible. The relationship between versions of files is hard to recreate, especially for desktop data from departmental users. In fact, it's mainly a glorious waste of time. Old data is just a safety blanket!

So how can companies go about reducing their data storage footprint? Continue on to learn about some data management best practices and tools that can help.





Docker Data Security Complications


Docker containers represent a real sea change in the way applications are written, distributed and deployed. The aim of containers is to be flexible, allowing applications to be spun up on demand, whenever and wherever they are needed. Of course, wherever we use our applications, we need data.

There are two schools of thought on how data should be mapped into containers. The first says we keep the data only in the container; the second says we have persistent data outside of the container that extends past the lifetime of any individual container instance. In either scenario, the issue of security poses big problems for data and container management.

Managing data access

As discussed in my previous blog, there are a number of techniques for assigning storage to a Docker container. Temporary storage capacity, local to the host running the container, can be assigned at container run time. Assigned storage volumes are stored on the host in a specific subdirectory mapped to the application. Volumes can be created when the container is instantiated, or in advance using the "docker volume" command.

Alternatively, local storage can be mapped as a mount point into the container. In this case, the "docker run" command specifies a local directory as the mount point within the container. The third option is to use a storage plugin that directly associates external storage with the container.
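As a minimal sketch of those three approaches (the image name, host paths and plugin driver below are purely illustrative):

    # 1. Named volume managed by Docker under /var/lib/docker/volumes
    docker volume create app_data
    docker run -d --name app1 -v app_data:/var/lib/app myimage

    # 2. Bind mount of an arbitrary host directory
    docker run -d --name app2 -v /srv/app2/data:/var/lib/app myimage

    # 3. Volume backed by an external storage plugin
    docker volume create --driver example/storage-driver external_vol
    docker run -d --name app3 -v external_vol:/var/lib/app myimage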

Open access

In each of the described methods, the Docker framework provides no inherent security model for data. For example, any host directory can be mounted into a container, including sensitive system folders like /etc, and the container can then modify those files, since permissions are granted using standard, simple Unix permission settings. An alternative and possibly better practice is to use non-root containers, which means running containers under a different Linux user ID (UID). This is relatively easy to do, but it does mean building a methodology for securing each container with a UID or group ID (GID), since permissions checking is done on UID/GID numbers.
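For illustration only (the alpine image and the UID/GID values are arbitrary choices here), the first command shows how freely a sensitive host path can be handed to a container, and the second shows the non-root alternative:

    # Nothing in Docker itself prevents bind-mounting a sensitive host path
    docker run --rm -v /etc:/host-etc alpine ls /host-etc

    # Running the container as a non-root UID:GID limits what it can do to mounted files
    docker run --rm --user 1000:1000 -v /srv/app/data:/data alpine id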

Here we run into another problem: Using non-root containers with local volumes doesn’t work, unless the UID used to run the container has permissions to the /var/lib/docker/volumes directory. Without this, data can’t be accessed or created. Opening up this directory would be a security risk; however, there’s no inherent method to set individual permissions on a per-volume basis without a lot of manual work.
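One common workaround, sketched here with a placeholder UID and volume name, is to pre-create the volume and change the ownership of its mount point from a short-lived root container before the non-root application container uses it:

    # Pre-create the volume and chown its mount point as root
    docker volume create app_data
    docker run --rm -v app_data:/data alpine chown -R 1000:1000 /data

    # A non-root container (UID 1000) can now read and write the volume
    docker run --rm --user 1000:1000 -v app_data:/data alpine sh -c 'echo ok > /data/test.txt'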

If we look at how external storage is mounted to a container, many solutions simply present a block device (a LUN) to the host running the container and format a file system onto it. This is then presented into the container as a mount point. At that point, the security on directories and files can be set from within the container itself, reducing some of the issues we've discussed. However, if this LUN/volume is reused elsewhere, there are no security controls over how it is mounted or used by other containers, as there is no security model built directly into the container/volume mapping relationship. Everything depends on trusting the commands run on the host.

This is where we hit yet another issue: a lack of multi-tenancy. When we run containers, each container instance may run for a separate application. As in traditional storage deployments, storage assigned to containers should have a degree of separation to ensure data can't be inadvertently or maliciously accessed across applications. There's currently no easy way to do this at the host level, other than to trust the orchestration tool running the container and mapping it to data.

Finding a solution

Obviously, some of the issues presented here are Linux/Unix specific. For example, the mount namespace abstraction provides different entry points for our data, but there's no equivalent abstraction of permissions: I can't map user 1,000 to user 1,001 without physically updating the ACL (access control list) data associated with each file and directory, and making large-scale ACL changes could impact performance. For local volumes, Docker could easily set the permissions of the host directory that backs a new volume to match the UID of the container being started.

External volumes provide a good opportunity to move away from the permissions structure on the host running containers. However, this means that a mechanism is required to map data on a volume to a known trusted application running in a specific container instance. Remember that containers have no inherent “identification” and can be started and stopped at will. This makes it hard to determine whether any individual container is the owner of a data volume.

Today the main solution is to rely on the orchestration platform that manages the running of the containers themselves. We put the trust into these systems to map volumes and containers accurately. In many respects, this isn’t unlike traditional SAN storage or the way virtual disks are mapped to virtual machines. However, the difference for containers is the level of portability they represent and the need to have a security mechanism that extends to the public cloud.

There’s still some work to be done here. For Docker, its acquisition of storage startup Infinit may spur ideas about how persistent data is secured. This should hopefully mean the development of an interface that all vendors can work towards — storage “batteries included” but optional.



