
6 Reasons SSDs Will Take Over the Data Center


The first samples of flash-based SSDs surfaced 12 years ago, but only now does the technology appear poised to supplant hard drives in the data center, at least for primary storage. Why has it taken so long? After all, flash drives are as much as 1,000x faster than hard-disk drives for random I/O.

Part of the answer is a misunderstanding that focused on individual storage elements and CPUs rather than whole systems. This led the industry to compare cost per terabyte, when the real comparison should have been the total cost of a solution with and without flash. Simply put, most systems are I/O bound, and using flash means fewer systems are needed for the same workload, which typically offsets the difference in drive cost.
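To make that total-cost argument concrete, here is a rough back-of-the-envelope sketch in Python; every number in it (required IOPS, per-server throughput, server prices) is an illustrative assumption, not vendor data.

```python
# Rough total-cost comparison for an I/O-bound workload.
# All figures (IOPS, prices) are illustrative assumptions, not vendor data.
import math

required_iops = 500_000            # workload target (assumed)

hdd_iops_per_server = 2_000        # ~10 HDDs at ~200 random IOPS each (assumed)
ssd_iops_per_server = 200_000      # a few NVMe SSDs per server (assumed)

hdd_server_cost = 6_000            # server plus HDDs, USD (assumed)
ssd_server_cost = 9_000            # server plus SSDs, USD (assumed, pricier per TB)

hdd_servers = math.ceil(required_iops / hdd_iops_per_server)
ssd_servers = math.ceil(required_iops / ssd_iops_per_server)

print(f"HDD build: {hdd_servers:4d} servers, ${hdd_servers * hdd_server_cost:,}")
print(f"SSD build: {ssd_servers:4d} servers, ${ssd_servers * ssd_server_cost:,}")
```

Even with a hefty per-server premium for flash, the I/O-bound cluster shrinks so much that the SSD build comes out far cheaper overall.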

The turning point in the storage industry came with all-flash arrays: simple drop-in devices that instantly and dramatically boosted SAN performance. This has evolved into a model of two-tier storage with SSDs as the primary tier and a slower, but cheaper, secondary tier of HDDs.

Applying the new flash model to servers provides much higher server performance, just as price points for SSDs are dropping below enterprise hard drive prices. With favorable economics and much better performance, SSDs are now the preferred choice for primary tier storage.

We are now seeing the rise of Non-Volatile Memory Express (NVMe), which aims to replace SAS and SATA as the primary storage interface. NVMe is a very fast, low-overhead protocol that can handle millions of IOPS, far more than its predecessors. In the last year, NVMe pricing has come close to SAS drive prices, making the solution even more attractive. This year, we’ll see most server motherboards supporting NVMe ports, likely as SATA-Express, which also supports SATA drives.
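For a quick look at which NVMe devices a server already exposes, the standard Linux sysfs layout can be read directly; this is a minimal sketch, assuming a Linux host with the usual /sys/class/nvme entries.

```python
# Minimal sketch: enumerate NVMe controllers via Linux sysfs.
# Assumes the standard /sys/class/nvme layout on a Linux host.
from pathlib import Path

for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
    model = (ctrl / "model").read_text().strip() if (ctrl / "model").exists() else "unknown"
    serial = (ctrl / "serial").read_text().strip() if (ctrl / "serial").exists() else "unknown"
    print(f"{ctrl.name}: model={model} serial={serial}")
```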

NVMe is internal to servers, but a new NVMe over Fabrics (NVMe-oF) approach extends the NVMe protocol from a server out to arrays of NVMe drives and to all-flash and other storage appliances, complementing, among other things, the new hyper-converged infrastructure (HCI) model for cluster design.

The story isn’t all about performance, though. Vendors have promised to produce SSDs with 32 and 64TB capacity this year. That’s far larger than the biggest HDD, which is currently just 16TB and stuck at a dead end at least until HAMR is worked out.

The brutal reality, however, is that solid-state storage opens up form-factor options that hard disk drives can’t match. Large HDDs will remain tied to the 3.5-inch form factor. We already have 32TB SSDs in a 2.5-inch size, plus newer form factors such as M.2 and the “ruler” (an elongated M.2) that pack a lot of capacity into a small appliance. Intel and Samsung are talking about petabyte-sized storage in 1U boxes.

The secondary storage market is slow and cheap, making for a stronger barrier to entry against SSDs. The rise of 3D NAND and new Quad-Level Cell (QLC) flash devices will close the price gap to a great extent, while the huge capacity per drive will offset the remaining price gap by reducing the number of appliances.

Solid-state drives have a secret weapon in the battle for the secondary tier. Deduplication and compression become feasible because of the extra bandwidth in the whole storage structure, effectively multiplying capacity by factors of 5X to 10X. This drops QLC-flash solutions below HDDs in price per available terabyte.
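As a rough illustration of how that multiplier arises, the sketch below deduplicates fixed-size blocks and compresses only the unique ones; the sample data and the 4KB block size are assumptions for demonstration only.

```python
# Rough sketch: estimate the effective-capacity multiplier from
# block-level deduplication plus compression. Block size and the
# sample data are illustrative assumptions.
import hashlib
import zlib

BLOCK = 4096

def effective_multiplier(data: bytes) -> float:
    seen = set()
    stored = 0
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK]
        digest = hashlib.sha256(block).digest()
        if digest not in seen:                    # keep only unique blocks...
            seen.add(digest)
            stored += len(zlib.compress(block))   # ...and compress what we keep
    return len(data) / stored if stored else float("inf")

# Toy example: highly repetitive data dedupes and compresses well.
sample = (b"virtual machine image block " * 200)[:BLOCK] * 100
print(f"effective capacity multiplier: {effective_multiplier(sample):.1f}x")
```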

In the end, perhaps in just three or four years flash and SSDs will take over the data center and kill hard drives off for all but the most conservative and stubborn users. On the next pages, I drill down into how SSDs will dominate data center storage.





SNIA Releases Data Protection Guidance for Storage Pros


Data storage professionals may not be accustomed to dealing with data security and privacy issues like due diligence, but with the European Union’s General Data Protection Regulation about to take effect, many will need to learn some new concepts.

That’s what makes a new white paper from the Storage Networking Industry Association especially timely, Eric Hibbard, chair of SNIA’s Security Technical Work Group, told me in an interview. SNIA, a nonprofit focused on developing storage standards and best practices, put together a document that provides guidance on data protection, specifically as it relates to storage.

“The storage industry has for many years been insulated from having to worry about traditional security and, to a lesser degree, the privacy issues,” Hibbard said. “With GDPR, the definition of a data breach moved from unauthorized access to include things like unauthorized data destruction or corruption. Why is that important to storage professionals? If you make an update to a storage system that causes corruption of data, and if that’s the only copy of that data, it could constitute a data breach under GDPR. That’s the kind of thing we want to make sure the storage industry and consumers are aware of.”

The GDPR, which sets mandatory requirements for businesses, becomes enforceable May 25. It applies to any business storing data of EU citizens.

The white paper builds on the ISO/IEC 27040 storage security standard, which doesn’t directly address data protection, by providing specific guidance on topics such as data classification, retention and preservation, data authenticity and integrity, monitoring and auditing, and data disposition/sanitization.

For example, the issue of data preservation, retention, and archiving is barely touched on in the standard, so the paper expands on that and explains what the potential security issues are from a storage perspective, said Hibbard, who holds several certifications, including CISSP-ISSAP, and serves roles in other industry groups such as the Cloud Security Alliance.

The paper explains the importance of due diligence and due care – concepts that storage managers aren’t used to dealing with, Hibbard said.

“In many instances, the regulations associated with data protection of personal data or PII (privacy) do not include details on the specific security controls that must be used,” SNIA wrote in its paper. “Instead, organizations are required to implement appropriate technical and organizational measures that meet their obligations to mitigate risks based on the context of their operations. Put another way, organizations must exercise sufficient due care and due diligence to avoid running afoul of the regulations.”

Failure to take steps to understand and address data exposure risks can demonstrate lack of due care and due diligence, the paper warns, adding: “Storage systems and ecosystems are such integral parts of ICT infrastructure that these concepts frequently apply, but this situation may not be understood by storage managers and administrators who are responsible and accountable.”

One of the components of due diligence is data disposition and sanitization. “When you’re done with data, how do you make sure it actually goes away so that it doesn’t become a source of a data breach?” Hibbard said.
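As a simple illustration of disposition at the file level, the sketch below overwrites a file before deleting it. This is only a teaching example: SSD wear leveling and cloud object stores can retain stale copies, so cryptographic erasure (destroying the encryption key) is generally the safer route there.

```python
# Minimal sketch: overwrite-then-delete disposition for a single file.
# Illustration only; on SSDs and object storage back ends, stale copies
# can survive, and crypto-erase (destroying the key) is preferred.
import os

def shred_file(path: str, passes: int = 3) -> None:
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # overwrite with random data
            f.flush()
            os.fsync(f.fileno())        # push the write to the device
    os.remove(path)

# shred_file("/tmp/retired_report.csv")   # hypothetical path
```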

The SNIA paper spends some time defining data protection, noting that the term means different things depending on whether someone works in storage, privacy, or information security. SNIA defines data protection as “assurance that data is not corrupted, is accessible for authorized purposes only, and is in compliance with applicable requirements.”

The association’s Storage Security: Data Protection white paper is one of many it produces, all of which are freely available. Other papers cover topics such as cloud storage, Ethernet storage, hyperscaler storage, and software-defined storage.




Facebook Debuts Data Center Fabric Aggregator


At the Open Compute Project Summit in San Jose on Tuesday, Facebook engineers showcased their latest disaggregated networking design, taking the wraps off new data center hardware. Microsoft, meanwhile, announced an effort to disaggregate solid-state drives to make them more flexible for the cloud.

The Fabric Aggregator, built on Facebook’s Wedge 100 top-of-rack switch and its Facebook Open Switching System (FBOSS) software, is designed as a distributed network system to accommodate the social media giant’s rapid growth. The company is planning to build its twelfth data center and is expanding one in Nebraska from two buildings to six.

“We had tremendous growth of east-west traffic,” said Sree Sankar, technical product manager at Facebook, referring to the traffic flowing between buildings in a Facebook data center region. “We needed a change in the aggregation tier. We were already using the largest chassis switch.”

The company needed a system that would provide power efficiency and have a flexible design, she said. Engineers used Wedge 100 and FBOSS as building blocks and developed a cabling assembly unit to emulate the backplane. The design provides operational efficiency, 60% better power efficiency, and higher port density. Sankar said Facebook was able to deploy it quickly in its data center regions in the past nine months. Engineers can easily scale Fabric Aggregator up or down according to data center demands.

“It redefines network capacity in our data centers,” she said.

Facebook engineers wrote a detailed description of Fabric Aggregator in a blog post. They submitted the specifications for all the backplane options to the OCP, continuing their sharing tradition. Facebook’s networking contributions to OCP include its Wedge switch and Edge Fabric traffic control system. The company has been a major proponent of network disaggregation, saying traditional proprietary network gear doesn’t provide the flexibility and agility they need.

Seven years ago, Facebook spearheaded the creation of the Open Compute Project with a focus on open data center components such as racks and servers. The OCP now counts more than 4,000 engineers involved in its various projects and more than 370 specification and design packages, OCP CEO Rocky Bullock said in kicking off this week’s OCP Summit, which drew some 3,000 attendees.  

Microsoft unveils Project Denali

While Facebook built on its disaggregated networking approach, Microsoft announced Project Denali, an effort to create new standards for flash storage to optimize it for the cloud through disaggregation.

Kushagra Vaid, general manager of Azure Infrastructure at Microsoft, said cloud providers are top consumers of flash storage, which amounts to billions of dollars in annual spending. SSDs, however, with their “monolithic architecture,” aren’t designed to be cloud-friendly, he said.

Any SSD innovation requires that the entire device be tested, and new functionality isn’t provided in a consistent manner, he said. “At cloud scale, we want to drive every bit of efficiency,” Vaid said. Microsoft engineers wanted to figure out a way to provide the same kind of flexibility and agility with SSDs as disaggregation brought to networking.

“Why can’t we do the same thing with SSDs?” he said.

Project Denali “standardizes the SSD firmware interfaces by disaggregating the functionality for software-defined data layout and media management,” Vaid wrote in a blog post.

“Project Denali is a standardization and evolution of Open Channel that defines the roles of SSD vs. that of the host in a standard interface. Media management, error correction, mapping of bad blocks and other functionality specific to the flash generation stays on the device while the host receives random writes, transmits streams of sequential writes, maintains the address map, and performs garbage collection. Denali allows for support of FPGAs or microcontrollers on the host side,” he wrote.

Vaid said this disaggregation provides a lot of benefits. “The point of creating a standard is to give choice and provide flexibility… You can start to think at a bigger scale because of this disaggregation, and have each layer focus on what it does best.”

Microsoft is working with several partners including CNEX Labs and Intel on Project Denali, which it plans to contribute to the OCP.





Data Protection in the Public Cloud: 6 Steps


While cloud security remains a top concern in the enterprise, public clouds are likely to be more secure than your private computing setup. This might seem counterintuitive, but cloud service providers have economies of scale that let them spend far more on security tools than any single enterprise, while the cost of that security is spread across millions of users, amounting to fractions of a cent each.

That doesn’t mean enterprises can hand over all responsibility for data security to their cloud provider. There are still many basic security steps companies need to take, starting with authentication. While this applies to all users, it’s particularly critical for sysadmins: a password compromise on an admin’s phone could be the equivalent of handing over the corporate master keys. For admins, multi-factor authentication is essential for secure operations, and adding smartphone biometrics as a second or third factor is the latest wave; there are plenty of creative strategies.
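As one concrete example of a second factor, the sketch below provisions and verifies a time-based one-time password (TOTP); it assumes the third-party pyotp library, and the admin name and issuer are hypothetical placeholders.

```python
# Minimal TOTP second-factor check, a common MFA building block.
# Assumes the third-party pyotp library (pip install pyotp); the
# admin name and issuer are hypothetical placeholders.
import pyotp

admin_secret = pyotp.random_base32()        # provisioned once per admin
totp = pyotp.TOTP(admin_secret)

print("Provisioning URI for an authenticator app:")
print(totp.provisioning_uri(name="admin@example.com", issuer_name="StorageOps"))

code = totp.now()                           # in practice, typed by the admin
print("code accepted:", totp.verify(code))  # True within the time window
```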

Beyond guarding access to cloud data, what about securing the data itself? We’ve heard of major data exposures occurring when a set of instances is deleted but the corresponding data isn’t. After a while, those orphaned files get loose and can make for some interesting reading. That’s pure carelessness on the part of the data owner.

There are two answers to this issue. For larger cloud setups, I recommend a cloud data manager that tracks all data and spots orphan files. That should stop the wandering buckets, but what about the case when a hacker gets in, by whatever means, and can reach useful, current data? The answer, simply, is good encryption.
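A very rough sketch of that kind of orphan check against S3, assuming boto3 with valid credentials and a hypothetical inventory of buckets that live workloads still claim:

```python
# Rough sketch: flag S3 buckets that no live workload claims.
# Assumes boto3 and AWS credentials; the inventory below is a
# hypothetical list maintained by your cloud data manager or CMDB.
import boto3

known_buckets = {"prod-app-data", "analytics-archive"}   # hypothetical inventory

s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    if name not in known_buckets:
        print(f"possible orphan bucket: {name} (created {bucket['CreationDate']})")
```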

Encryption is a bit more involved than using PKZIP on a directory. AES-256 encryption or better is essential. Key management is crucial; having one admin with the key is a disaster waiting to happen, while writing it down on a sticky note goes to the opposite extreme. One option offered by cloud providers is drive-based encryption, but this fails on two counts. First, drive-based encryption usually has only a few keys to select from and, guess what, hackers can readily access a list on the internet. Second, the data has to be decrypted by the network storage device to which the drive is attached. It’s then re-encrypted (or not) as it’s sent to the requesting server. There are lots of security holes in that process.

Far better is end-to-end encryption, where data is encrypted with a key kept in the server before it ever leaves. That takes downstream vulnerabilities out of play while also adding protection against packet sniffing.
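A minimal sketch of that client-side model using AES-256-GCM, assuming the third-party cryptography package; in practice the key would live in the server’s key manager and never travel to the cloud.

```python
# Minimal sketch: encrypt data in the server before it goes to cloud
# storage, using AES-256-GCM. Assumes the third-party "cryptography"
# package; key handling here is deliberately simplified.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # keep in your key manager, not in the cloud
aesgcm = AESGCM(key)

plaintext = b"customer records destined for object storage"
nonce = os.urandom(12)                      # unique per object
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Store nonce + ciphertext in the cloud; only the key holder can read it back.
recovered = aesgcm.decrypt(nonce, ciphertext, None)
assert recovered == plaintext
```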

Data sprawl is easy to create in the cloud, and it opens up another security risk, especially if much of cloud management is decentralized to departments or even individual users. Cloud data management tools address this far better than written policies. It’s also worth adding global deduplication to the storage management mix, which reduces the exposure footprint considerably.

Finally, the whole question of how to backup data is in flux today. Traditional backup and disaster recovery has moved from in-house tape and disk methods to the cloud as the preferred storage medium. The question now is whether a formal backup process is the proper strategy, as opposed to snapshot or continuous backup systems. The snapshot approach is growing, due to the value of small recovery windows and limited data loss exposure, but there may be risks from not having separate backup copies, perhaps stored in different clouds.
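For example, with a cloud SDK a point-in-time snapshot takes only a few lines; this sketch assumes boto3 and a hypothetical EBS volume ID, and a real policy would also schedule and prune snapshots.

```python
# Minimal sketch: take a point-in-time EBS snapshot as part of a
# snapshot-style backup policy. Assumes boto3 and AWS credentials;
# the volume ID is a hypothetical placeholder.
import boto3
from datetime import datetime, timezone

ec2 = boto3.client("ec2")
volume_id = "vol-0123456789abcdef0"          # hypothetical

snap = ec2.create_snapshot(
    VolumeId=volume_id,
    Description=f"nightly-{datetime.now(timezone.utc):%Y-%m-%d}",
)
print("started snapshot:", snap["SnapshotId"])
```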

On the next pages, I take a closer look at ways companies can protect their data when using the public cloud.





6 Ways to Transform Legacy Data Storage Infrastructure


So you have a bunch of EMC RAID arrays and a couple of Dell iSCSI SAN boxes, topped with a NetApp filer or two. What do you say to the CEO who reads my articles and knows enough to ask about solid-state drives, all-flash appliances, hyperconverged infrastructure, and all the other new innovations in storage? “Er, er, we should start over” doesn’t go over too well! Thankfully, there are some clever — and generally inexpensive — ways to answer the question, keep your job, and even get a pat on the back.

SSD and flash are game-changers, so they need to be incorporated into your storage infrastructure. SSDs beat enterprise-class hard drives on total cost because they speed up your workload and reduce the number of storage appliances and servers needed. It’s even better if your servers support NVMe, since the interface is becoming ubiquitous and will replace both SAS and (a bit later) SATA, simply because it’s much faster and has lower overhead.

As for RAID arrays, we have to face the harsh reality that RAID controllers can only keep up with a few SSDs. The answer is either an all-flash array, with the RAID arrays relegated to cool or cold secondary storage, or a move to a new architecture based on hyperconverged appliances or compact storage boxes tailored for SSDs.

All-flash arrays become a fast storage tier, today usually Tier 1 storage in a system. They are designed to bolt onto an existing SAN and require minimal change in configuration files to function. Typically, all-flash boxes have smaller capacities than the RAID arrays, since they have enough I/O cycles to do near-real-time compression coupled with the ability to down-tier (compress) data to the old RAID arrays.

With an all-flash array, which isn’t outrageously expensive, you can boast to the CEO about 10-fold boosts in I/O speed, much lower latency, and, as a bonus, a combination of flash and secondary storage that usually has 5X effective capacity due to compression. Just tell the CEO how many RAID arrays and drives you didn’t buy. That’s worth a hero badge!

The idea of a flash front-end works for desktops, too. Use a small flash drive for the OS (C-drive) and store colder data on those 3.5” HDDs. Your desktop will boot quickly, especially with Windows 10, and program loads will be a snap.

Within servers, the challenge is to make the CPU, rather than the rest of the system, the bottleneck. Adding SSDs as primary drives makes sense, with HDDs in older arrays doing duty as bulk secondary storage, just as with all-flash solutions. This idea has evolved into the hyperconverged infrastructure (HCI) concept, where the drives in each node are shared with other servers in lieu of dedicated storage boxes. While HCI is a major philosophical change, the effort to get there isn’t that huge.

For the savvy storage admin, RAID arrays and iSCSI storage can both be turned into powerful object storage systems. Both support a JBOD (just a bunch of drives) mode, and if the JBODs are attached across a set of server nodes running “free” Ceph or Scality Ring software, the result is a decent object-storage solution, especially if compression and global deduplication are supported.
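Once such a cluster is running, applications can reach it through Ceph’s S3-compatible RADOS Gateway; a minimal sketch, assuming boto3 and a hypothetical gateway endpoint and credentials:

```python
# Minimal sketch: write an object to a Ceph cluster through its
# S3-compatible RADOS Gateway. Assumes boto3; the endpoint URL and
# credentials are hypothetical placeholders for your own cluster.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.internal.example:8080",    # hypothetical RGW endpoint
    aws_access_key_id="CEPH_ACCESS_KEY",                 # placeholder
    aws_secret_access_key="CEPH_SECRET_KEY",             # placeholder
)

s3.create_bucket(Bucket="cold-archive")
s3.put_object(Bucket="cold-archive", Key="2017/q4-report.parquet",
              Body=b"archived data")
print(s3.list_objects_v2(Bucket="cold-archive")["KeyCount"], "object(s) stored")
```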

Likely by now, you are using public clouds for backup. Consider “perpetual” storage using a snapshot tool or continuous backup software to reduce your RPO and RTO. Use multi-zone operations in the public cloud to converge DR onto the perpetual storage setup, as part of a cloud-based DR process. Going to the cloud for backup should save a lot of capital expense.

On the software front, the world of IT is migrating to services-centric software-defined storage (SDS), which allows data services to be scaled and chained as virtualized microservices. Even older SANs and server drives can be pulled into the methodology, with software making all legacy boxes in a data center operate as a single pool of storage. This simplifies storage management and makes data center storage more flexible.

Encryption ought to be added to any networked storage or backup. If this prevents even one hacker from reading your files in the next five years, you’ll look good! If you are running into a space crunch and the budget is tight, separate out your cold data, apply one of the “Zip” programs and choose the encrypted file option. This saves a lot of space and gives you encryption!
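A quick sketch of that cold-data tactic, assuming the third-party pyzipper library for AES-encrypted zip archives; the archive name, file path, and passphrase are all placeholders.

```python
# Quick sketch: compress and AES-encrypt cold data in one step.
# Assumes the third-party pyzipper library (pip install pyzipper);
# file names and the passphrase are hypothetical placeholders.
import pyzipper

with pyzipper.AESZipFile("cold-2016.zip", "w",
                         compression=pyzipper.ZIP_DEFLATED,
                         encryption=pyzipper.WZ_AES) as zf:
    zf.setpassword(b"use-a-long-passphrase-from-your-vault")
    zf.write("old_projects/report-2016.docx")   # hypothetical cold file

# Reading it back requires the same passphrase.
with pyzipper.AESZipFile("cold-2016.zip") as zf:
    zf.setpassword(b"use-a-long-passphrase-from-your-vault")
    print(zf.namelist())
```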

Let’s take a closer look at what you can do to transform your existing storage infrastructure and extend its life.



