
Combining Data Center Innovations to Reduce Ecological Footprints


The big tech companies are vying for positive coverage of their environmental initiatives. Microsoft just promoted its achievements in renewable energy, which will comprise 60 percent of the company’s electricity usage by the end of the year. Facebook made headlines for a forthcoming 100 percent renewable-powered facility in Los Lunas, New Mexico, while both Apple and Google claim 100 percent carbon neutrality.

These green milestones are important, but renewables represent only one environmental solution for the data center industry. Energy-intensive technologies, such as AI and blockchain, complicate the quest for clean, low-impact electricity generation. Additionally, the sector remains a large consumer of the planet’s other resources, including water and raw materials. Unfortunately, the search for energy efficiency can negatively affect other conservation efforts.

Current State of Play on the Search for Energy Efficiency

A case in point is adiabatic cooling, which evaporates water to ease the burden on HVAC systems. At a time when 2.7 billion people suffer from water scarcity, this approach can lead to intense resource competition, such as in Maharashtra, India, where drinking water had to be imported as thirsty colocation facilities proliferated.

Bolder strategies will be necessary to deliver the compute power, storage capacity, and network connectivity the world demands with fewer inputs of fossil fuels, water, rare earth metals, and other resources. Over the long term, there is hope for quantum computing, which proponents claim could cut energy usage for certain workloads by many orders of magnitude compared with conventional technologies. Savings on that scale could shrink a hyperscaler's annual consumption from gigawatt-hours to a tiny fraction of today's draw, reducing the need to build more solar panels, wind turbines, and hydropower stations along the way.

Commercial launches – such as IBM’s Q System One – notwithstanding, the quantum moonshot still lies at least a decade away by most accounts, and the intervening barriers are significant. Quantum calculations remain vulnerable to complex errors, new programming approaches are required, and the nearest-term use cases tend toward high-end modeling, not replacing the standard web server or laptop.

Green Technology Solutions Closer to Earth

Fortunately, there are other technologies nearer at hand and more accessible for the average data center, colocation provider, or even regional office. For example, AI-based tools are being trained as "zombie killers," using machine learning to improve server allocation and power off the estimated 25% of physical servers and 30% of virtual servers that are running but doing no useful work. Repurposing underutilized IT assets not only helps realize energy savings, it can delay new equipment purchases as well.
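
To make the idea concrete, here is a minimal sketch of the kind of utilization screening such tools start from, written in Python purely as an illustration: the thresholds, metric format, and server names are assumptions, and a real system would weigh far more signals before powering anything off.

```python
# Hypothetical sketch of "zombie" server screening: flag servers whose
# CPU and network utilization stay below small thresholds over a window.
# Thresholds and the metrics source are illustrative assumptions.
from statistics import mean

def find_zombie_candidates(metrics, cpu_threshold=0.05, net_threshold=0.01):
    """metrics maps server name -> list of (cpu_util, net_util) samples,
    each expressed as a fraction of capacity over the lookback window."""
    candidates = []
    for server, samples in metrics.items():
        avg_cpu = mean(s[0] for s in samples)
        avg_net = mean(s[1] for s in samples)
        if avg_cpu < cpu_threshold and avg_net < net_threshold:
            candidates.append(server)
    return candidates

# Example: in practice this would hold weeks of samples per server.
history = {
    "web-01": [(0.42, 0.30), (0.55, 0.28)],
    "legacy-07": [(0.01, 0.00), (0.02, 0.00)],
}
print(find_zombie_candidates(history))  # ['legacy-07'] -> review before powering off
```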

 

Then there is liquid cooling, well known from the industry's mainframe origins. Although many companies won't be able to redesign entire facilities along the lines of Facebook's custom builds, hardware manufacturers are delivering off-the-shelf liquid-cooled products. Rear-door heat exchangers and direct-to-chip cooling can help lower PUE from 1.5 or more down toward 1.1, and immersion cooling can deliver power savings of up to 50 percent. These technologies also enable greater density, which means doing more with less space – a good thing, as land, too, is a natural resource.
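
For a quick sense of what a lower PUE means in practice, here is a small worked example. PUE is total facility energy divided by IT equipment energy, and the 1 MW load below is an illustrative figure, not a number from the article.

```python
# PUE = total facility energy / IT equipment energy.
# Illustrative numbers only: a 1 MW IT load at PUE 1.5 vs. PUE 1.1.
it_load_kw = 1000.0

for pue in (1.5, 1.1):
    total_kw = it_load_kw * pue
    overhead_kw = total_kw - it_load_kw   # cooling, power distribution, etc.
    print(f"PUE {pue}: facility draw {total_kw:.0f} kW, overhead {overhead_kw:.0f} kW")

# Moving from 1.5 to 1.1 trims roughly 400 kW of non-IT overhead for the same IT work.
```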

Consolidation trends will shift more of the environmental burden to the few outfits with pockets deep enough to do the seemingly impossible: sink data centers in the ocean for natural cooling, launch them into space, and “accelerate” workloads with the earliest, sure to be exorbitantly expensive, quantum computers ready for mission critical applications.

What’s Next for the “Green” Data Center

None of today’s available technologies, from AI-driven DCIM systems to advanced load balancers, is a panacea. With blockchain’s intense processing demands and consumers’ insatiable appetite for technology, among other pressures, the IT industry faces numerous forces working against its efforts to shrink resource consumption and carbon emissions.

While we await a breakthrough with the exponential impact of quantum computing, we will have to combine various solutions to drive incremental progress. In some cases, that will mean a return of cold storage to move rarely accessed information off powered storage arrays in favor of tape backups and similar “old school” methods. In others, it will mean allowing energy efficiency and component recyclability to tip the balance during hardware acquisition decisions. And in still others, newer edge computing applications may integrate small, modular pods that work on solar-wind hybrid energy systems.
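
As a rough illustration of the cold-storage idea, the sketch below lists files that have not been accessed within a chosen window so they can be reviewed for a move to tape or another offline tier. The 180-day cutoff and the directory path are assumptions, and access-time metadata is not always reliable, so treat this as a starting point rather than a policy.

```python
# Minimal sketch of a cold-storage sweep: list files that have not been
# accessed in a given number of days so they can be moved to tape or
# another offline tier. The 180-day cutoff is an illustrative assumption.
import os
import time

def cold_candidates(root, days_idle=180):
    cutoff = time.time() - days_idle * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getatime(path) < cutoff:   # last access time
                yield path

for path in cold_candidates("/srv/archive-review"):   # placeholder directory
    print("candidate for cold tier:", path)
```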

Hopefully, the craving these dominant tech players display for positive environmental headlines, paired with a profit motive rewarding tiny efficiency gains achieved at hyperscale, will continue to propel advances in green solutions that can one day be implemented industry-wide.




5 Things You Need to Know About Data Lakes


Still waters run deep, the old proverb tells us. The same can be said for data lakes, storage repositories that hold vast amounts of raw data in native format until required by an application, such as predictive analytics.

Like still water, data lakes can be dark and mysterious. This has led to several misconceptions about the technology, some of which can prove damaging or even fatal to new data lake projects.

Before diving in, here are five key things you need to know about data lakes.

1. Data lakes and data warehouses are not the same thing

A data warehouse contains data that has been loaded from source systems based on predefined criteria. “A data lake, on the other hand, houses raw data that has not been manipulated in any way prior to entering the lake and enables multiple teams within an organization to analyze the data,” noted Sue Clark, senior CTO and architect at Sungard Availability Services.

Although separate entities, data lakes and data warehouses can be packaged into a hybrid model. “This combined approach enables companies to stream incoming data into a data lake, but then move select subsets into relational structures,” said Ashish Verma, a managing director at Deloitte Consulting. “When data ages past a certain point or falls into disuse, dynamic tiering functionality can automatically move it back to the data lake for cheaper storage in the long term.”
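
The hybrid pattern Verma describes can be pictured with a small, hypothetical sketch: rows older than a cutoff are exported from a relational table to cheap file storage standing in for the lake. The table name, cutoff, and storage layout are all illustrative; real platforms provide dynamic tiering features that handle this automatically.

```python
# Hypothetical sketch of age-based tiering: rows older than a cutoff are
# exported from a warehouse table to cheap storage (the "lake") and removed
# from the relational store. Names and layout are illustrative only.
import sqlite3
import json
import datetime

CUTOFF_DAYS = 365  # illustrative aging threshold

def tier_out_old_rows(db_path, lake_dir):
    cutoff = (datetime.date.today() - datetime.timedelta(days=CUTOFF_DAYS)).isoformat()
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT id, created_at, payload FROM events WHERE created_at < ?", (cutoff,)
    ).fetchall()
    # Write the aged rows to the lake as a flat file, then drop them from the table.
    with open(f"{lake_dir}/events_pre_{cutoff}.json", "w") as f:
        json.dump([{"id": r[0], "created_at": r[1], "payload": r[2]} for r in rows], f)
    con.execute("DELETE FROM events WHERE created_at < ?", (cutoff,))
    con.commit()
    con.close()
    return len(rows)

# moved = tier_out_old_rows("warehouse.db", "/data/lake/archive")  # example call
```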

2. Don’t treat a data lake like a digital dump

Although a data lake can store structured, unstructured, and semi-structured data in raw form, it should never be regarded as a data dumping ground. “Since data is not processed or analyzed before entering the lake, it’s important that the data lake is maintained and updated on a routine basis, and that all users know the sources of the data in the lake to ensure it’s analyzed appropriately,” Clark explained.

From a data scientist's point of view, the most important component when creating a data lake is the process of adding data while ensuring the accompanying catalogs are updated, current, and accessible, observed Brandon Haynie, chief data scientist at Babel Street, a data discovery and analysis platform provider. Otherwise, potentially useful datasets may be set adrift and lost. “The catalog will provide the analyst with an inventory of the sources available, the data’s purpose, its origin, and its owner,” he said. “Knowing what the lake contains is critical to generating the value to support decision-making and allows data to be used effectively instead of generating more questions surrounding its quality or purpose.”
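
A minimal catalog entry along the lines Haynie describes might capture nothing more than the source, purpose, origin, and owner of each dataset. The sketch below is illustrative only; the field names are assumptions rather than any particular catalog product's schema.

```python
# Minimal sketch of a data lake catalog entry capturing the fields Haynie
# mentions (source, purpose, origin, owner); field names are illustrative.
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class CatalogEntry:
    dataset: str
    source: str          # system or feed the data came from
    purpose: str         # why it was collected / intended analyses
    origin: str          # original producer or geography
    owner: str           # accountable team or person
    ingested_on: date
    location: str        # path or table in the lake

entry = CatalogEntry(
    dataset="web_clickstream_raw",
    source="cdn-logs",
    purpose="funnel and churn analysis",
    origin="public website",
    owner="analytics-team@example.com",
    ingested_on=date.today(),
    location="s3://example-lake/raw/clickstream/",
)
print(asdict(entry))
```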

3. A data lake requires constant management

It’s important to define management approaches in advance to ensure data quality, accessibility, and necessary data transformations. “If a data lake isn’t properly managed from conception, it will turn into a ‘data swamp,’ or a lake with low-quality, poorly cataloged data that can’t be easily accessed,” Verma said.

It’s important for IT leaders to know that data governance is critical for ensuring data is consistent, accurate, contextualized, accessible, and protected, noted Jitesh S. Ghai, vice president and general manager of data quality, security, and governance, at software development company Informatica. “With a crystal-clear data lake, organizations are able to capitalize on their vast data to deliver innovative products and services, better serve customers, and create unprecedented business value in the digital era,” he explained.

4. Don’t become a data hoarder

Many organizations feel they must store everything in order to create an endless supply of valuable data. “Unless someone decides to keep reprocessing all of the data continuously, it is sufficient to create a ‘digestible’ version of the data,” observed Dheeraj Ramella, chief technologist at VoltDB, a firm that offers an in-memory database to support applications requiring real-time decisions on streaming data. “This way, you can refine the model with any new training data.” Once training is complete and the information that's meaningful to the enterprise has been captured, data that falls outside compliance and regulatory retention timeframes can be purged.
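
A retention check in the spirit of Ramella's advice might look like the hypothetical sketch below. The categories and retention periods are placeholders, not legal guidance; actual retention windows come from your compliance and legal teams.

```python
# Illustrative retention check: keep records only as long as their category's
# retention window requires. Categories and periods are assumptions, not
# legal guidance; real retention rules come from compliance teams.
from datetime import date, timedelta

RETENTION_DAYS = {"transactions": 7 * 365, "telemetry": 90, "training_snapshots": 365}

def is_purgeable(category, created_on, today=None):
    today = today or date.today()
    window = RETENTION_DAYS.get(category)
    if window is None:
        return False                      # unknown category: never auto-purge
    return created_on < today - timedelta(days=window)

print(is_purgeable("telemetry", date(2018, 1, 1)))     # True -> safe to purge
print(is_purgeable("transactions", date(2018, 1, 1)))  # False -> keep
```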

5. A data lake is not a “prophet-in-a-box”

The truth is that gaining meaningful insights or creating accurate forecasts still requires a significant amount of analytical work and problem-solving using a tool that’s capable of accessing and working the stored data, Haynie advised. “The data lake is just a step in the overall problem-solving process.”

Takeaway

Staying competitive in today’s data-driven world requires a modern analytics platform that can turn information into insight, and both data lakes and data warehouses have an essential role to play, Verma said. “By developing a clear understanding of where they each make sense, IT leaders can help their organizations invest wisely and maximize the value of their information assets.”

 

 




Unigine 2.8 Brings Better Vegetation, Improved Asynchronous Data Streaming



While there are no major games currently shipping that use the Unigine 2 engine, the company appears to be seeing great success in the industrial simulation space and continues to make strides with features for its cross-platform engine. Unigine 2.8 was released this week as the newest feature release.

Unigine 2.8 brings improved asynchronous data streaming, cached shadows optimization work, various shadow improvements, better FPS stabilization functionality, improved vegetation, better subsurface scattering, interleaved lights rendering, and multiple other rendering improvements.

For developers, Unigine 2.8 also brings better performance profiling support, a refactored editor, and other development improvements.

More details on Unigine 2.8 via the Unigine DevLog.


How to Get Your Data Center Ready for 100G


Today, the focus for many data centers is accommodating 100 Gbps speeds: 28 percent of enterprise data centers have already begun their migration.

Here are three considerations to guide upgrade projects that take into consideration the current and future states of your data center.

1) Understand your options for 100G links

Understanding the options for Layer 0 (physical infrastructure) and what each can do will help you determine which best matches your needs and fits your budget.

There are several options, depending on the fiber plant you already have and the lengths of your runs.

For example, if you’re at 10G right now and you have a fiber plant of OM3 with runs up to 65 meters, and you’re trying to move to 100G, you have two options (SWDM4 and BiDi) for staying with your legacy infrastructure.

On the other hand, if you’re at 10G and trying to get to 100G and you have a fiber plant of OM4 with many runs longer than 100 meters, you’ll need to upgrade these runs to single-mode fiber.
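
To make that mapping concrete, the sketch below encodes just the two cases described above. The rules are deliberately incomplete and ignore connector loss budgets and exact optic reach specifications, so treat it as a starting point for your own planning, not a definitive selector.

```python
# Rough decision helper echoing the two scenarios above; the rules reflect only
# what is stated in the text and omit many real-world variables (loss budgets,
# exact optic reach specs), so treat the output as a starting point.
def options_for_100g(fiber_type, run_length_m):
    if fiber_type == "OM3" and run_length_m <= 65:
        return ["100G SWDM4", "100G BiDi"]          # reuse the legacy multimode plant
    if fiber_type in ("OM3", "OM4") and run_length_m > 100:
        return ["upgrade run to single-mode fiber (duplex or parallel SMF)"]
    return ["check optic reach specs for this fiber type and distance"]

print(options_for_100g("OM3", 60))
print(options_for_100g("OM4", 150))
```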

But there’s more.

For the longer runs, you have the option of using duplex or parallel SMF runs – which should you choose? For “medium” length runs (greater than 100 meters but less than 500 meters), the extra cost of installing parallel vs. duplex SMF is moderate, while the savings from being able to use PSM4 optics instead of CWDM4 can be large (as much as 7x).

Bottom line: do your own cost analysis. And don’t forget to consider the future: parallel SMF has a less expensive upgrade path to 400G. Added bonus: the individual fiber pairs in parallel fibers can be separated in a patch panel for higher-density duplex fiber connections.
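
Here is one way to frame that cost analysis as a back-of-the-envelope calculation. Every price in the sketch is a placeholder assumption; substitute quotes from your own vendors for fiber, installation, and optics.

```python
# Back-of-the-envelope cost comparison for a medium-length run (100-500 m):
# duplex SMF + CWDM4 optics vs. parallel SMF + PSM4 optics. All prices below
# are placeholder assumptions -- use your own vendor quotes.
def link_cost(fiber_cost_per_m, fiber_strands, run_m, optic_cost_each):
    # one optic at each end of the link, plus the installed fiber
    return fiber_cost_per_m * fiber_strands * run_m + optic_cost_each * 2

run_m = 300
duplex_cwdm4  = link_cost(fiber_cost_per_m=0.15, fiber_strands=2, run_m=run_m, optic_cost_each=900)
parallel_psm4 = link_cost(fiber_cost_per_m=0.15, fiber_strands=8, run_m=run_m, optic_cost_each=350)

print(f"duplex SMF + CWDM4:  ${duplex_cwdm4:,.0f}")
print(f"parallel SMF + PSM4: ${parallel_psm4:,.0f}")
```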

2) Consider your future needs before choosing new fiber

Again, it’s best to upgrade your data center with the future in mind. If you’re laying new fiber, be sure to consider which configuration will offer the most effective “future proofing.”

For long runs, you may be better off using parallel SMF. However, there's a point at which the cost of the extra fiber may outweigh the benefit of cheaper optics, so be sure to do the calculations for your data center.

And remember: planning for future needs is a business decision as much as a technical one, so you’ll want to consider questions like these:

How soon will you need to upgrade to 400G, judging by the elapsed time between your upgrades from 1G to 10G and from 10G to 100G?

Is upgrading to 100G capability right now the best move, given the planned direction of your business?

3) Consider the evolution of data center technology

Technology solutions get cheaper the longer they’re on the market. So if your data center can wait two to three years to make upgrades, then waiting may be the most cost-effective option.

For example, a smaller form factor is coming out soon. In the next two to three years, 100G will be moving to the SFP-DD form factor, which is higher density than QSFP28, meaning you can fit more ports per switch – good for tight server closets and anyone paying by the square foot for colocation.

SFP-DD ports are also backwards-compatible with SFP+ modules and cables, allowing users to upgrade at their own pace. So even if you’re not ready for all 100G ports, you can upgrade the switch but still use your existing 10G SFP+ devices until you need to upgrade them to 100G.

Proceed with caution

Upgrading a data center means managing a lot of moving pieces, so there’s plenty of room for things to go wrong. Consider this example: a data center manager noticed that his brand-new 25G copper links (server to switch) were performing poorly – dropping packets and losing the link.

Remote diagnostics showed no problems, so he decided to physically inspect the new installation. On inspection, he saw that the installers had used plastic cable ties to attach all the cables to the racks. That was fine for old twisted-pair cables, but the new 25G twinax copper cables are highly engineered and have strict specs on bend radius and crush pressure.

The tightly cinched cable ties bent the cables and put pressure on the jacketing, which actually changed the cables’ properties and caused intermittent errors. All the cables had to be thrown away and replaced – obviously, not a very cost-effective endeavor.

So, if you're weighing your options, think through performance, cost, loss budgets, distance, and other factors as you upgrade your data center to 100G.




Why Cloud-based DCIM is not Just for Data Centers


Just as technology and its use are evolving at a tremendous pace, the physical infrastructure which supports IT equipment is also being transformed to support these advances. There are some significant trends driving new approaches to the way technology is being deployed, but there are also important ramifications for the way that the basics – power, cooling, space – have to be provisioned and, more importantly, managed.

Firstly, a massive shift towards hybrid infrastructure is underway, says Gartner. The analyst predicts that by 2020, cloud, hosting, and traditional infrastructure services will be on a par in terms of spending. This follows on from earlier research which indicates an increase in the use of hybrid infrastructure services. As companies have placed an increasing proportion of IT load into outsourced data center services and cloud, both the importance and proliferation of distributed IT environments have been heightened.

Secondly, the IoT – or more specifically the Industrial IoT – has quietly been on the rise for a couple of decades. Industrial manufacturing and processing firms have long used data to remain competitive and profitable, but they must continually strive to optimize efficiency and productivity. The answer is being sought through more intelligent and more automated decision-making – most of it data-driven – with the data almost exclusively gathered and processed outside traditional data center facilities.

Thirdly, rapidly developing applications such as gaming and content streaming, as well as emerging uses like autonomous vehicles require physical resources which are sensitive to both latency and bandwidth limitations. Closing the physical distance between data sources, processing and use, is the pragmatic solution, but it also means that centralized data centers are not the answer. Most of the traction for these sorts of services is where large numbers of people reside – exactly where contested power, space and connectivity add unacceptable cost for large facility operations.

The rise of distributed IT facilities and edge data centers

In each of these examples – and there are more – IT equipment has to be run efficiently and reliably. Today there’s little argument with the fact that the best way to enable this from an infrastructure point of view is within a data center. Furthermore, the complexity of environments and the business criticality of many applications means that data center-style management practices need to be implemented in order to ensure that uptime requirements are met. And yet, data centers per se only partially provide the answer, because distributed IT environments are becoming an increasingly vital part of the mix.

The key challenges that need to be resolved where multiple edge and IT facilities are being operated in multiple or diverse locations include visibility, availability, security, and automation – functions which DCIM has a major role in fulfilling for mainstream data centers. You could also add human resources to the list, because most data center operations, including service and maintenance, are delivered by small, focused professional teams. When you add the complication of distributed locations, you have a recipe for having the wrong people in the wrong place at the wrong time.

Cloud-based DCIM answers the need for managing Edge Computing infrastructure

DCIM deployment in any network can be both complex and potentially high cost (whether delivered using on-premise or as-a-service models). By contrast, cloud-based DCIM, or DMaaS (Data Center Management-as-a-Service), overcomes this initial inertia to offer a practical solution for the challenges being posed. Solutions such as Schneider Electric EcoStruxure IT enable physical infrastructure in distributed environments to be managed remotely for efficiency and availability using no more than a smartphone.


DMaaS combines simplified installation and a subscription-based approach coupled with a secure connection to cloud analytics to deliver smart and actionable insights for the optimization of any server room, wiring closet or IT facility. This means that wherever data is being processed, stored or transmitted, physical infrastructure can be managed proactively for assured uptime and Certainty in a Connected World.
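
Conceptually, the data flow is simple: a lightweight on-site agent samples power and environmental readings and pushes them to a cloud analytics service. The sketch below illustrates that pattern only; the endpoint URL, payload fields, and token are hypothetical and are not the EcoStruxure IT API.

```python
# Generic sketch of the DMaaS pattern: a lightweight on-site agent samples
# environmental readings and pushes them to a cloud analytics endpoint over
# HTTPS. The URL, payload fields, and token are hypothetical placeholders,
# not any vendor's actual API.
import json
import time
import urllib.request

ENDPOINT = "https://dcim.example.com/api/v1/readings"   # placeholder endpoint
API_TOKEN = "replace-me"                                 # placeholder credential

def push_reading(site_id, rack_id, temp_c, power_kw):
    payload = json.dumps({
        "site": site_id, "rack": rack_id,
        "temperature_c": temp_c, "power_kw": power_kw,
        "timestamp": int(time.time()),
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:   # raises on network/auth errors
        return resp.status

# push_reading("branch-042", "rack-1", temp_c=24.5, power_kw=3.2)  # example call
```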

Read this blog post to find out more about the appeal of cloud-based data center monitoring, or download our free white paper, “Why Cloud Computing is Requiring us to Rethink Resiliency at the Edge.”

 


