Tag Archives: Data

Why Cloud-based DCIM is not Just for Data Centers | IT Infrastructure Advice, Discussion, Community

Just as technology and its use are evolving at a tremendous pace, the physical infrastructure which supports IT equipment is also being transformed to support these advances. There are some significant trends driving new approaches to the way technology is being deployed, but there are also important ramifications for the way that the basics – power, cooling, space – have to be provisioned and, more importantly, managed.

Firstly, a massive shift towards hybrid infrastructure is underway, says Gartner. The analyst predicts that by 2020, cloud, hosting, and traditional infrastructure services will be on a par in terms of spending. This follows on from earlier research which indicates an increase in the use of hybrid infrastructure services. As companies have placed an increasing proportion of IT load into outsourced data center services and cloud, both the importance and proliferation of distributed IT environments have been heightened.

Secondly, the IoT – or more specifically the Industrial IoT – has quietly been on the rise for a couple of decades. While industrial manufacturing and processing have utilized data for some time in order to maintain their ability to remain competitive and ensure profitability, companies must continually strive to optimize efficiency and productivity. The answer is being sought through more intelligent and more automated decision-making – most of it data-driven – with the data almost exclusively gathered and processed outside traditional data center facilities.

Thirdly, rapidly developing applications such as gaming and content streaming, as well as emerging uses like autonomous vehicles, require physical resources which are sensitive to both latency and bandwidth limitations. Closing the physical distance between data sources, processing and use is the pragmatic solution, but it also means that centralized data centers are not the answer. Most of the traction for these sorts of services is where large numbers of people reside – exactly where contested power, space and connectivity add unacceptable cost for large facility operations.

The rise of distributed IT facilities and edge data centers

In each of these examples – and there are more – IT equipment has to be run efficiently and reliably. Today there’s little argument with the fact that the best way to enable this from an infrastructure point of view is within a data center. Furthermore, the complexity of environments and the business criticality of many applications means that data center-style management practices need to be implemented in order to ensure that uptime requirements are met. And yet, data centers per se only partially provide the answer, because distributed IT environments are becoming an increasingly vital part of the mix.

The key challenges that need to be resolved where multiple edge and IT facilities are being operated in multiple or diverse locations include visibility, availability, security, and automation – functions which DCIM has a major role in fulfilling for mainstream data centers. You could also add human resources to the list, because most data center operations, including service and maintenance, are delivered by small and focused professional teams. When you add the complication of distributed localities, you have a recipe for having the wrong people in the wrong place at the wrong time.

Cloud-based DCIM answers the need for managing Edge Computing infrastructure

DCIM deployment in any network can be both complex and potentially high cost (whether delivered using on-premise or as-a-service models). By contrast, cloud-based DCIM, or DMaaS (Data Center Management-as-a-Service), overcomes this initial inertia to offer a practical solution for the challenges being posed. Solutions such as Schneider Electric EcoStruxure IT enable physical infrastructure in distributed environments to be managed remotely for efficiency and availability using no more than a smartphone.


DMaaS combines simplified installation and a subscription-based approach coupled with a secure connection to cloud analytics to deliver smart and actionable insights for the optimization of any server room, wiring closet or IT facility. This means that wherever data is being processed, stored or transmitted, physical infrastructure can be managed proactively for assured uptime and Certainty in a Connected World.

Read this blog post to find out more about the appeal of cloud-based data center monitoring, or download our free white paper, “Why Cloud Computing is Requiring us to Rethink Resiliency at the Edge.”



What You Need to Know About Data Lakes, Marts, Vaults, and Warehouses

It seems like everywhere you turn, someone is talking about big data this, or data analytics that. Supporting this move to data-driven businesses is a whole range of different data infrastructures, but it can be difficult to wrap your head around where your data lakes and data warehouses meet, and why you might even need a data vault.

At their hearts, each of these concepts boils down to finding ways to ingest and manage your data in an effective way for today’s level of insight-driven decision-making. So, what are the options, how do they relate, and what are they used for?

Data lakes: Data lakes are huge collections of data, ranging from raw data that has not been organized or processed, through to varying levels of curated data sets. One of their benefits for analytics is that different types of consumers can each access data appropriate to their needs. This makes them well suited to some of the newer use cases such as data science, AI and machine learning, which are viewed by many companies as the future of analytics work. A data lake is a great way to store masses of raw data on scalable storage without attempting traditional ETL (extract, transform, load) or ELT (extract, load, transform), which can be expensive at this volume. However, for more traditional analytics, this type of data environment can be unwieldy and confusing – which is why organizations turn to other solutions to manage essential data in more structured environments.

In terms of positioning within a data infrastructure, data lakes are, if you like, up-stream of other data infrastructure, and can be used as a staging area for a more structured approach such as a data warehouse, as well as providing for data exploration and data science.

Data warehouses: A data warehouse, or an enterprise data warehouse as it is sometimes known, is a more curated repository of data. It is invaluable for providing business users with access to the right information in a usable format – and can include both current and historical information. As data enters the data warehouse environment, it is cleansed, transformed, categorized and tagged – making it easier to manage, use and monitor from a compliance perspective, which is where automation comes in.

The volume and velocity of data experienced by businesses today mean that manually ingesting this data, processing it, and making sure it’s stored and accessible in a way that meets compliance requirements within a data warehouse is no longer feasible. However, with businesses constantly looking to data as the source of both reports and forecasts, a data warehouse is invaluable. It’s important that data lakes do not subsume the role of a more structured data infrastructure just because of the perceived effort of ingestion. Automation can help speed the ingestion and processing to fast-track time to value with data-driven decision-making in a data warehouse.
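As a rough illustration of what an automated ingestion step might look like, the sketch below cleanses, transforms, categorizes and tags a single raw record before it would be loaded into a warehouse table. The field names, the 1,000 threshold for a "large" order and the "pii" tag are all hypothetical choices made for this example, not part of any particular product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class WarehouseRow:
    customer: str
    amount: float
    order_date: date
    category: str
    tags: list = field(default_factory=list)


def ingest(raw: dict) -> WarehouseRow:
    """Cleanse, transform, categorize and tag one raw record."""
    # Cleanse: trim stray whitespace and normalize capitalization.
    customer = raw["customer"].strip().title()
    # Transform: parse string fields into typed values.
    amount = float(raw["amount"])
    order_date = date.fromisoformat(raw["order_date"])
    # Categorize: hypothetical business rule, assumed for illustration.
    category = "large" if amount >= 1000 else "standard"
    # Tag: flag records that carry personal data so they can be
    # monitored from a compliance perspective.
    tags = ["pii"] if raw.get("email") else []
    return WarehouseRow(customer, amount, order_date, category, tags)


row = ingest({
    "customer": "  acme corp ",
    "amount": "1250.50",
    "order_date": "2018-12-08",
    "email": "buyer@example.com",
})
print(row)
```

In practice a rule set like this would be generated or maintained by tooling rather than written by hand for every source, which is the point of automating the process.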

Data marts: A data mart is a specific subset of a data warehouse, often used for curated data on one specific subject area, which needs to be easily accessible in a short amount of time. Due to its specificity, it is often quicker and cheaper to build than a full data warehouse. However, a data mart is unable to curate and manage data from across the business to inform business decisions.

Data vaults: Data vault modeling is an approach to data warehousing which looks to address some of the challenges posed by transforming data as part of the data warehousing process. One of the great advantages of a data vault is that it makes no assessment as to what data is “valuable” and what isn’t, whereas once data is processed and cleansed into a warehouse environment, this decision has typically been made. Data vaults have the flexibility to manage this, and to address changing sources of data, leading the data vault approach to be credited with providing a “single version of the facts” rather than a “single version of the truth.”

For enterprises with large, growing and disparate datasets, a data vault approach to data warehousing can help tame the beast of big data into a manageable, business-centric solution, but can take time to set up. Data vault automation is a critical component to ensuring organizations can deliver and maintain data vaults that adhere to the stringent requirements of the Data Vault 2.0 methodology and will be able to do so in a practical, cost-effective and timely manner. 
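To make the “single version of the facts” idea more concrete, here is a minimal sketch of the insert-only loading pattern commonly associated with data vault modeling. It assumes the usual hub and satellite structures and a hashed business key, none of which are described in detail above, and the class and field names are hypothetical. Every incoming version is stored with its load time and record source, and nothing is judged, overwritten or discarded.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    """Derive a stable key from a normalized business key."""
    normalized = business_key.strip().upper()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


class CustomerVault:
    """Toy in-memory vault: one hub plus one append-only satellite."""

    def __init__(self):
        self.hub = {}        # hash key -> normalized business key
        self.satellite = []  # every attribute version ever received

    def load(self, business_key, attributes, record_source):
        hk = hash_key(business_key)
        # The hub records each business key once.
        self.hub.setdefault(hk, business_key.strip().upper())
        # The satellite only ever appends, so conflicting values from
        # different sources are all kept as facts.
        self.satellite.append({
            "hash_key": hk,
            "load_date": datetime.now(timezone.utc),
            "record_source": record_source,
            "attributes": dict(attributes),
        })
        return hk

    def history(self, business_key):
        hk = hash_key(business_key)
        return [row for row in self.satellite if row["hash_key"] == hk]


vault = CustomerVault()
vault.load("c-42", {"city": "Paris"}, "crm")
vault.load(" C-42 ", {"city": "Lyon"}, "billing")
for version in vault.history("C-42"):
    print(version["record_source"], version["attributes"])
```

Deciding which version counts as the “truth” is deliberately left to downstream layers such as a data mart; the vault itself just keeps the facts.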

Each of these different data approaches has its own role to play in ingesting, managing, and delivering data across an organization. A broad brush understanding of how they fit together can be invaluable for IT managers and business leaders when trying to understand what is and isn’t possible as big data becomes as much a business prerogative as a technology one. Finding ways to speed up the establishment and management of these practices using technologies such as automation is essential for helping organizations reduce the time to value and succeed in the data-driven business landscape.



Digital Transformation Hinges on Tearing Down Data Silos

Here are two quick questions for which IT managers should have a ready answer: Where’s your critical data and how quickly can you access it? Do you have the right infrastructure to manage it?

For a variety of reasons, many IT teams cannot answer these questions easily because they do not have command of their data landscape.

The primary problem is uncontrolled duplication. Today, the average company maintains more than nine copies of a single piece of information. “Data” in fact is no longer a singular entity – it’s many similar, but often distinct copies of the same information. A typical company might have seven, eight or more copies of their data, each of which is intended to support a singular purpose: business resiliency, disaster recovery, DevOps, incident response, legal workflow, archive, ediscovery, and on and on.
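One simple way to start getting a handle on duplication is to fingerprint each copy and group identical ones. The sketch below is only an illustration under stated assumptions: the location names are hypothetical, the data is passed in as in-memory bytes, and SHA-256 matching only finds byte-for-byte identical copies, not the “similar, but often distinct” ones.

```python
import hashlib
from collections import defaultdict


def find_duplicate_copies(datasets: dict) -> dict:
    """Group storage locations that hold identical content.

    `datasets` maps a location name to the raw bytes stored there.
    Returns a mapping of content digest to the sorted locations that
    share it, keeping only content that exists in more than one place.
    """
    by_digest = defaultdict(list)
    for location, content in datasets.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(location)
    return {
        digest: sorted(locations)
        for digest, locations in by_digest.items()
        if len(locations) > 1
    }


copies = find_duplicate_copies({
    "backup/customers.csv": b"id,name\n1,Ada\n",
    "legal/customers.csv": b"id,name\n1,Ada\n",
    "devops/customers.csv": b"id,name\n1,Ada\n2,Bob\n",
})
for locations in copies.values():
    print(locations)
```

A real inventory would stream files from each backup platform rather than hold them in memory, but the grouping step would look much the same.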

Most companies have all these copies of data scattered across numerous data warehouses and data lakes. Getting to the right data at the right time can be a complicated, Herculean feat. When something goes wrong and a particular piece of information is needed, “How do I get the right data to solve this challenge?” becomes the top question, and today’s complexity means there’s no easy answer. This is something more businesses are confronting. Over the past two years, more than 75 percent of businesses have been unable to surface the right data, according to a 2017 survey by Forrester. It is a two-headed problem, caused by both the number of data copies and the increasingly intricate data infrastructure IT must navigate.

Such complexity crept into our data management over time, born of siloed operational channels, shifting requirements, changing operational needs, and architectural evolution.

A typical story might go something like this: the legal staff needed data but didn’t know the “backup people” (and probably wouldn’t have talked to them anyway), and legal probably couldn’t trust that the right data for discovery could be obtained in a legally permissible way. Thus, it became inevitable that the legal team demanded, and now retains, its own backup copy – and so it goes across the organization.

Data copies are siloed not just by department or use case, but also by platform. Backup solutions span technologies, including traditional on-premises backup for physical servers and databases, hybrid solutions that run on-premises but push data into designated archives (cloud or local), hosted backup (from a managed services provider or MSP), backup-as-a-service, and, in some cases, backups from edge computing such as IoT systems. The Forrester survey found that 79 percent of organizations have at least three backup solutions, and more than 26 percent have five or more solutions.

Today, this landscape is further complicated by increasing dependence on SaaS solutions such as Salesforce, G Suite and Office 365, resulting in additional silos and making it more difficult to reliably protect data.

Digital transformation hinges on tearing down data silos and creating a more complete data lake from which new intelligence, more effective decision-making, and informed automation emerge. In this new world, protecting data has become a data management requirement that demands a holistic and unified approach. And it’s important to note that keeping multiple copies of data for different purposes ultimately creates more problems than it solves.

This fragmented and siloed approach to data management isn’t sane or manageable, particularly given today’s rate of data growth. In fact, more than half (56 percent) of IT professionals cite “exponential data growth” as a challenge.

Regulatory and compliance requirements (such as GDPR and others) fuel increased data volumes, with large enterprises stating that these requirements are the No. 1 driver of data growth. Compounding the challenge, regulations like GDPR require organizations to have a better handle on their data than ever.

This problem was not created overnight, and it won’t be solved quickly, but the shape of the solution is becoming clear. We must use the cloud to unify our view of all of the information spread across data silos in this landscape. Beyond the well-known business benefits of the cloud (cost, simplicity, instant scalability and agility), the cloud makes it easier to augment the enterprise’s architecture to support different use cases. Using the cloud in this way makes it feasible to use a single dataset to support each unique requirement. Because the cloud is inherently collaborative, it enables a unified approach that allows different departments to address distinct use cases using the same underlying data.

As we continue to see, the need for disaster recovery planning is only increasing in the wake of natural disasters, and cloud-based operations and backup copies offer an immediate solution for data recovery needs. They eliminate the need for offsite locations and enable business continuity without the need for specialized, dedicated resources or large investments.

But data management is more than backups. DevOps teams can take advantage of continuous cloud backups to satisfy their hunger for the latest copy of data to support A/B testing and rapid development. Backup data can be analyzed for machine learning and AI applications. For example, a medical device company can learn from its archive of surgical data to improve the precision times of its devices.

Incremental progress is the way. This refocus on data – in the singular sense of the term but modernized to reflect the need for a proper data lake – won’t be instituted enterprise-wide in a single push. Use cases such as disaster recovery, forensics and DevOps are already accustomed to working directly with data management and therefore provide natural starting points for leading the transition to data sanity.





New World of Edge Data Center Management

Big changes are happening with data center management as emphasis shifts from core to edge operations. The core is no less important, but the move to the edge opens new challenges as the environment becomes more complex. IT management roles, and the supporting tools and infrastructure, must change in line with the transition to new edge data centers.

A new world of data center management is being driven by rapid growth in the implementation of edge computing environments and “non-traditional” IT, with analysts forecasting that 80% of enterprise applications will become cloud-based by 2025. Underscoring these drivers is a hunger for data with actionable insights and an increased focus on customer experience. Whether for internal users or external clients, services will be hosted in and accessed from multiple locations. Where and how a service is delivered is of no concern to the user; only quality of service is important.

Minimize Complexity in Edge Data Center Management for Better Business Outcomes

For IT teams, the shift is away from equipment management to application provision and service delivery – wherever and whenever the user wants it. The challenge for IT professionals is to deliver a seamless user experience.

This new focus is accelerated by both internal and external forces. Internally, business has traditionally had little interest in IT operations. Today, there is even less concern about what IT is – the business is really only interested in what it does and how much it costs. IT teams are being told to focus on running applications on behalf of the business, not operating the data center on behalf of the IT department. At the same time, the business expects the management of the infrastructure assets to be automated and efficiently provisioned from the centralized hub to the edge.

Externally, rapid and monumental changes in multi-cloud delivery, edge computing, and AI are creating new challenges and opportunities for IT management. For example, new applications such as AI will ingest data in the cloud, on-premise, and at the edge at volumes previously unseen. AI is about driving business value and cannot be constrained by equipment failures or sub-optimal performance. Ensuring that the infrastructure is available and performing as required will demand new levels of management and monitoring visibility.

Take Advantage of the Evolving Edge Ecosystem to Meet Business Demands

IT was once relatively simple: keep up with the latest tech industry advances which make it into product sets, and invest wisely in those with the clearest roadmaps. Today, with a focus on business outcomes and fewer resources, you need automation, AI, and technology to help manage the edge and the data center.

To respond to business’ demands for fast and accurate information, IT as a service has focused on application delivery, not infrastructure management. The choices available on how to deliver a particular application have never been greater and increasingly involve cloud hosting and edge solutions.

The Future Lies with Visibility at the Edge

In a widely read blog post, Gartner’s Dave Cappuccio provided a vision for the future of the data center, declaring that by 2025 the enterprise data center as we know it today will be dead.

Gartner’s obituary for the data center is timely and may prove to be correct. Cappuccio recognizes that it is not yet time to issue the last rites, and he is wise enough not to greatly exaggerate reports of its immediate demise. We can be certain that there is no exaggeration in reports of the need to transition to the next stage. This starts with gaining visibility of all infrastructure operations from the cloud to the edge. A successful edge data center management strategy should include a cloud-based management platform that offers visibility across the entire IT infrastructure. Coupled with a data lake empowered by the smarts of power and cooling specialists, such a platform gives IT teams the time to focus on more strategic activities that drive business success.



Linux 4.19.8 Released With BLK-MQ Fix To The Recent Data Corruption Bug


Hopefully you can set aside some time this weekend to upgrade to Linux 4.19.8 as there’s the BLK-MQ fix in place for the recent “EXT4 corruption issue” that was plaguing many users of Linux 4.19.

Greg Kroah-Hartman just released a number of stable kernel point releases. Linux 4.19.8 has only some minor additions, such as support for the ELAN0621 touchpad, quirks for all PDP Xbox One gamepads for better support, and some minor fixes… Linux 4.19.8 wouldn’t be worthy of a shout-out had it not been for Jens Axboe’s BLK-MQ patches that are part of this release.

Earlier this week the Linux 4.19+ data corruption issue was resolved and turned out not to be an EXT4 problem but rather an issue with the multi-queue block I/O queuing mechanism that could cause some data corruption when running without an I/O scheduler. Once that was figured out, the Linux 4.20 kernel quickly picked up the fixes, and they have now been back-ported to the 4.19.8 release. So particularly if using BLK-MQ with “none” as your I/O scheduler selection, make sure you upgrade to this latest release for data safety.
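If you are unsure whether any of your block devices are running with “none” as the scheduler, a quick check is possible through sysfs. The sketch below assumes the usual /sys/block/<device>/queue/scheduler layout, where the active scheduler is shown in square brackets (for example "[none] mq-deadline kyber"); the helper names are made up for this example, and it only reports devices without changing anything.

```python
from pathlib import Path


def active_scheduler(scheduler_line: str) -> str:
    """Return the scheduler marked active (in brackets) in a sysfs line."""
    for token in scheduler_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some devices list a single entry without brackets.
    return scheduler_line.strip()


def devices_without_scheduler(sys_block: str = "/sys/block") -> list:
    """List block devices whose active I/O scheduler is 'none'."""
    devices = []
    for path in sorted(Path(sys_block).glob("*/queue/scheduler")):
        if active_scheduler(path.read_text()) == "none":
            # .../<device>/queue/scheduler -> <device>
            devices.append(path.parent.parent.name)
    return devices


if __name__ == "__main__":
    for device in devices_without_scheduler():
        print(f"{device}: running with 'none' as the I/O scheduler")
```

Any device it lists is in the configuration described above, which makes moving to a kernel with the fix more pressing.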

Greg also released Linux 4.14.87 and 4.9.144 as the latest for these LTS kernels albeit with no high profile changes.