Monday, December 12, 2011

Efficient Data Center Transformation Requires Consolidation and Standardization Across Critical IT Tasks

Transcript of a sponsored podcast discussion in conjunction with an HP video series on the best practices for developing a common roadmap for DCT.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: HP.

For more information on The HUB, HP's video series on data center transformation, go to www.hp.com/go/thehub.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you’re listening to BriefingsDirect. Today, we present a sponsored podcast discussion on quick and proven ways to attain significantly improved IT operations and efficiency.

We'll hear from a panel of HP experts on some of their most effective methods for fostering consolidation and standardization across critical IT tasks and management. This is the second in a series of podcasts on data center transformation (DCT) best practices and is presented in conjunction with a complementary video series. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

Here today we will specifically explore building quick data center project wins, leveraging project tracking and scorecards, as well as developing a common roadmap for both facilities and IT infrastructure. You don’t need to go very far in IT to find people who are diligently working to do more with less, even as they're working to transform and modernize their environments.

One way to keep the interest high and those operating and investment budgets in place is to show fast results and then use that to prime the pump for even more improvement and even more funding with perhaps even growing budgets.

With us now to explain how these solutions can drive successful data center transformation is our panel, Duncan Campbell, Vice President of Marketing for HP Converged Infrastructure and small to medium-sized businesses (SMBs); Randy Lawton, Practice Principal for Americas West Data Center Transformation & Cloud Infrastructure Consulting at HP, and Larry Hinman, Critical Facilities Consulting Director and Worldwide Practice Leader for HP Critical Facility Services and HP Technology Services. Welcome to you all.

Let's go first to Duncan Campbell on communicating an ongoing stream of positive results, why that’s important and necessary to set the stage for an ongoing virtuous adoption cycle for data center transformation and converged infrastructure projects.

Duncan Campbell: You bet, Dana. We've seen that when a customer is successful in breaking down a large project into a set of quick wins, there are some very positive outcomes from that.

Breeds confidence

Number one, it breeds confidence, and this is a confidence that is actually felt within the organization, within the IT team, and into the business as well. So it builds confidence both inside and outside the organization.

The other key benefit is that when you can manifest these quick wins in terms of some specific return on investment (ROI) business outcome, that also translates very nicely as well and gets a lot of key attention, which I think has some downstream benefits that actually help out the team in multiple ways.

Gardner: I suppose it's not only getting these quick wins, but effectively communicating them well. People really need to know about them.

Campbell: Right. So this is one of the things that some of the real leaders in IT realize. It's not just about attracting the best talent and executing well, but it's about marketing the team’s results as well.

One of the benefits in that is that you can actually break down these projects just in terms of some specific type of wins. That might be around standardization, and you can see a lot of wins there. You can quickly consolidate to blades. You can look at virtualization types of quick wins, as well as some automation quick wins.

We would advocate that customers think about this in terms of almost a step-by-step approach, knocking that down, getting those quick wins, and then marketing this in some very tangible ways that resonate very strongly.

Gardner: When you start to develop a cycle of recognition, incentives, and buy-in, I suppose we could also start to see some sort of a virtuous adoption cycle, whereby that sets you up for more interest, an easier time evangelizing, and so on.

Campbell: That’s exactly right. A virtuous cycle is well put. That really allows the team to get the additional green light to go to the next step in terms of the blueprint that they are trying to execute on. It gets a green light also in terms of additional dollars and, in some cases, additional headcount to add to their team as well.

What this does, and I like this term, the virtuous cycle, is not only allow you to attract key talent, but it really allows you to retain folks. That means you're getting the best team possible to duplicate that, to get those additional wins, and it really does indeed become a virtuous cycle.

Gardner: I suppose one last positive benefit here might be that, as enterprises adopt more of what we call social networking and social media, the rank and file, those users involved with these products and services, can start to be your best word-of-mouth marketing internally.

TCO savings

Campbell: That’s right. A good example is where we have been able to see a significant total cost of ownership (TCO) type of savings with one of our customers, McKesson, that in fact was taking one of these consolidated approaches with all their development tools. They saw a considerable savings, both in terms of dollars, over $12.9 million, as well as a percentage of TCO savings that was upwards of 50 percent.

When you see tangible exciting numbers like that, that does grab people’s attention and, you bet, it becomes part of the whole social-media fabric and people want to go to a winner. Success breeds success here.

Gardner: Thank you. Next, we're going to go to Randy Lawton and hear some more about why tracking scorecards and managing expectations through proven data and metrics also contributes to a successful ongoing DCT activity.

Randy, why is it so important to know your baseline tracks and then measure them each and every step along the way?

Randy Lawton: Thank you, Dana. Many of the transformation programs we engage in with our customers are substantially complex and span many facets of the IT organization. They often involve other vendors and service providers in the customer organization.

So there’s a tremendous amount of detail to pull together and organize in these complex engagements and initiatives. We find that there’s really no way to do that, unless you have a good way of capturing the data that’s necessary for a baseline.

It’s important to note that we manage these programs through a series of phases in our methodology. The first phase is strategy and analysis. During that phase, we typically run a discovery on all IT assets that would include the data center, servers, storage, the network environment, and the applications that run on those environments.

From that, we bridge into the second phase, which is architect and validate, where we begin to solution out and develop the strategies for a future-state design that includes the standardization and consolidation approaches, and on that begin to assemble the business case. In a detailed design, we build out those specifications and begin to create the data that determines what the future-state transformation is.

Then, through the implementation phase, we have detailed scorecards that are required to be tracked to show progress of the application teams and infrastructure teams that contribute to the program in order to guarantee success and provide visibility to all the stakeholders as part of the program, before we turn everything over to operations.

During the course of the last few years, our services unit has made investments in a number of tools that help with the capture and management of the data, the scorecarding, and the analytics through each of the phases of these programs. We believe that helps offer a competitive advantage for us and helps enable more rapid achievement of the programs from our customer perspective.

Gardner: As we heard from Duncan about why it’s important to demonstrate wins, I sense that organizations are really data driven now more than ever. It seems important to have actual metrics in place and be able to prove your work each step of the way.

Complex engagements

Lawton: That’s very true. In these complex engagements, it’s normally some time before there are quick-win type of achievements that are really notable.

For example, in the HP IT transformation program we undertook over several years back through 2008, we were building six new data centers so that we could consolidate 185 worldwide. So it was some period of time from the beginning of the program until the point where we moved the first application into production.

All along the way we were scorecarding the progress on the build-out of the data centers. Then, it was the build-out of the compute infrastructure within the data centers. And then it was a matter of being able to show the scorecarding against the applications, as we could get them into the next generation data centers.

If we didn't have the ability to show and demonstrate the progress along the way, I think our stakeholders would have lost patience or would not have felt that the momentum of the program was going on the kind of track that was required. With some of these tools and approaches and the scorecarding, we were able to demonstrate the progress and keep very visible to management the movements and momentum of the program.

Gardner: Randy, I know that many organizations are diligent about the scorecarding across all sorts of different business activities and metrics. Have you noticed in some of these engagements that these readouts and feedback in the IT and data center transformation activities are somehow joined with other business metrics? Is there an executive scorecard level that these feed into to give more of a holistic overview? Is this something that works in tandem with other scorecarding activities in a typical corporation?

Lawton: It absolutely is, Dana. Often in these kinds of programs there are business activities and projects that are going on within the business units. There are application projects that work into the program and then there are the infrastructure components that all have to be fit together at some level.

What we typically see is that the business will be reporting its set of metrics, each of the application areas will be reporting their metrics, and it’s typically from the infrastructure perspective where we pull together all of the application and infrastructure activities and sometimes the business metrics as well.

We've seen multiple examples with our customers where they are either all consolidated into executive scorecards that come out of the reporting from the infrastructure portion of the program that rolls it all together, or that the business may be running separate metrics and then application teams and infrastructure are running the IT level metrics that all get rolled together into some consolidated reporting on some level.

Gardner: And that, of course, ensures that IT isn’t the odd man out, when it comes to being on time and in alignment with these other priorities. That sounds like a very nice addition to the way things may have been done five or 10 years ago.

Lawton: Absolutely.

Gardner: Any examples, Randy, either with organizations you could name, or use cases where you could describe, where the use of this ongoing baselining, tracking, measuring, and delivering metrics facilitates some benefits? Any stories that you can share?

Cloning applications

Lawton: A very notable example is one of our telecom customers we worked with during the last year and finished a program earlier this year. The company was purchasing the assets of another organization and needed to be able to clone the applications and infrastructure that supported business processes from the acquired company.

Within the mix of delivery for stakeholders in the program, there were nine different companies represented. There were some outsourced vendors from the application support side in the acquiree’s company, outsourcers in the application side for the acquiring company, and outsourcers in the data centers that operated data center infrastructure and operations for the target data centers we were moving into.

What was really critical in pulling all this together was to be able to map out, at a very detailed level, the tasks that needed to be executed, and in what time frame, across all of these teams.

The final cutover migration required over 2,500 tasks across these nine different companies that all needed to be executed in less than 96 hours in order to meet the downtime window required by the acquiring company’s executive management.

It was the detailed scorecarding and operating war rooms to keep those scorecards up to date in real-time that allowed us to be able to accomplish that. There’s just no possible way we would have been able to do that ahead of time.

I think that HP was very helpful in working with the customer and bringing that perspective into the program very early on, because there had been a failed attempt to operate this program prior to that, and with our assistance and with developing these tools and capabilities, we were able to successfully achieve the objectives of that program.

Gardner: One thing that jumped out at me there was your use of the words real time. How important is it to capture this data and adjust it and update it in real-time, where there’s not a lot of latency? How has that become so important?

Lawton: In this particular program, because there were so many activities taking place in parallel by representatives from all over the world across these nine different companies, the real-time capture and update of all of the data and information that went into the scorecarding was absolutely essential.

In some of the other programs we've operated, there was not such a compressed time frame that required real-time metrics, but we, at minimum, often required daily updates to the metrics. So each program, the strategies that drive that program, and some of the time constraints will drive what the need is for the real-time update.

We often can provide the capabilities for the real-time updates to come from all stakeholders in the program, so that the tools can capture the data, as long as the stakeholders are providing the updates on a real-time basis.

Gardner: So as is often the case, good information in, good results back.

Lawton: Absolutely.
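The scorecarding Lawton describes, with thousands of cutover tasks rolled up across companies and checked against a fixed downtime window, can be sketched roughly as follows. This is a minimal illustration, not HP's tooling: the `Task` fields, the company labels, and the straight-line pacing rule in `on_track` are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    company: str        # hypothetical label for one participating company
    done: bool = False


def scorecard(tasks):
    """Roll task status up into per-company (done, total) counts."""
    counts = defaultdict(lambda: [0, 0])
    for task in tasks:
        counts[task.company][1] += 1
        if task.done:
            counts[task.company][0] += 1
    return {company: (done, total) for company, (done, total) in counts.items()}


def on_track(tasks, hours_elapsed, window_hours=96.0):
    """Assumed pacing rule: share of tasks done >= share of the window used."""
    done = sum(task.done for task in tasks)
    return done / len(tasks) >= hours_elapsed / window_hours


# 2,500 tasks spread over nine companies, as in the cutover described above.
tasks = [Task(i, f"company-{i % 9}") for i in range(2500)]
for task in tasks[:1000]:
    task.done = True

for company, (done, total) in sorted(scorecard(tasks).items()):
    print(f"{company}: {done}/{total}")
print("on track at hour 24:", on_track(tasks, 24))
```

A real program would weight tasks by dependency and duration rather than counting them equally; the point here is only that a per-company rollup plus a pacing check is cheap to recompute every time a stakeholder posts an update.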

Organizing infrastructure

Gardner: Let’s move now to our third panelist today. We're going to hear about why organizing facilities and infrastructure planning in conjunction with one another is so important.

Now to Larry Hinman. Larry, let’s go historical for a second. Has there usually been a completely separate direction for facilities planning and IT infrastructure? Why was that the case, and why is it so important to end that practice?

Larry Hinman: Hi, Dana. If you look over time and over the last several years, everybody has data centers and everybody has IT. The things that we've seen over the last 10 or 15 years are things like the Internet and criticality of IT and high density and all this stuff that people are talking about these days. If you look at the ways companies organized themselves several years ago, IT was a separate organization, facilities was a separate organization, and that actually still exists today.

One of the things that we're still seeing today is that, even though there is this push to try to get IT groups and facilities organizations to talk and work with each other, there is this gap that exists in truly how to glue all of this together.

If you look at the way people do this traditionally -- and when I say people, I'm talking about IT organizations and facilities organizations -- they typically will model IT and data centers and, even if they are attempting to glue them together, they try to look at power requirements.

One of the things that we spotted a few years ago was that when companies do this, the risk of over provisioning or under provisioning is very high. We tried to figure out a way to back this up a few notches.

How can we remedy this problem and how can we bring some structure to this and bring some, what I would call, sanity to the whole equation, to be able to have something predictable over time? What we figured out was that you have to stop and back up a few notches to really start to get all this glued together.

So we took this whole complex framework and data center program and broke it into four key areas. It looks simplistic in the way we've done this, and we have done this over many, many years of analysis and trying to figure out exactly what direction we should take. We've actually spun this off in many directions a few times, trying to continually make it better, but we always keep coming back to these four key profiles.

Business and risk is the first profile. IT architecture, which is really the application suite, is the second profile. IT infrastructure is the third. Data center facilities is the fourth.

One of the things that you will start to hear from us, if you haven’t heard it already via the data center transformation story that you guys were just recently talking about, is this nomenclature of IT plus facilities equals the data center.

Getting synchronized

Look at that, look at these four profiles, and look at what we call a top-down approach, where I start to get everybody synchronized on what risk profiles are and tolerances for risk are from an IT perspective and how to run the business, gluing that together with an IT infrastructure strategy, and then gluing all that into a data center facility strategy.

What we found over time is that we were able to take this complex program of trying to have something predictable, scalable, all of the groovy stuff that people talk about these days, and have something that I could really manage. If you're called into the boss’s office, as I and others have been over the many years in my career, and asked what the data center is going to look like over the next five years, at least I would have some hope of trying to answer that question.

That is kind of the secret sauce here, and the way we have developed our framework was breaking this complex program into these four key areas. I'm certainly not trying to say this is an easy thing to do. In a lot of companies, it’s culture changes. It’s a threat to the way the very organization is organized from an IT and a facilities perspective. The risk and recovery teams and the management teams all have to start working together collaboratively and collectively to be able to start to glue this together.

Gardner: You mentioned earlier the issues around energy and the ongoing importance around the cost structure for that. I suppose it's not just fitting these together, but making them fit for purpose. That is to say, IT and facilities on an ongoing basis.

It’s not really something that you do and sit still, as would have been the case several years ago, or in the past generation of computing. This is something that's dynamic. So how do you allow a fit-for-purpose goal with data-center facilities to be something that you can maintain over time, even as your requirements change?

Hinman: You just hit a very important point. One of the big lessons learned for us over the years has been this ability to not only provide this kind of modeling and predictability over time for clients and for customers. We had to get out of this mode of doing this once and putting it on a shelf, deploying a future-state data center framework, and getting the client pointing in the right direction.

The data, as you said, gets archived, and they pick it up every few years and do it again and again and again, finding out that a lot of times there's an "aha" moment during those periods, the gaps between doing it again and again.

One thing that we have learned is to not only have this deliberate framework and break it into these four simplistic areas, where we can manage all of this, but to redevelop and re-hone our tools and our focus a little bit, so that we could use this as a dynamic ongoing process to get the client pointing the right direction. Build a data center framework that truly is right size, integrated, aligned, and all that stuff. But then, to have something that was very dynamic that they could manage over time.

That's what we've done. We've taken all of our modeling tools and integrated them to common databases, where now we can start to glue together even the operational piece, of data center infrastructure management (DCIM), or architecture and infrastructure management, facilities management, etc., so now the client can have this real-time, long-term, what we call a 10-year view of the overall operation.

So now, you do this. You get it pointing the right direction, collect the data, complete the modeling, put it in the toolset, and now you have something very dynamic that you can manage over time. That's what we've done, and that's where we have been heading with all of our tools and processes over the last two to three years.

EcoPOD concept

Gardner: I also remember with great interest the news from HP Discover in Las Vegas last summer about your EcoPOD and the whole POD concept toward facilities and infrastructure. Does that also play a part in this and perhaps make it easier when your modularity is ratcheted up to almost a mini data center level, rather than at the server or rack level?

Hinman: With the various what we call facility sourcing options, of which PODs are certainly one these days, we've also been very careful to make sure that our framework is completely unbiased when it comes to a specific sourcing option.

What that means is, over the last 10-plus years, most people were really targeted at building new green-field data centers. It was all about space, then it became all about power, then about cooling, but we were still in this brick-and-mortar age, and modularity and scalability have been driving everything.

With PODs coming on the scene with some of the other design technologies, like multi-tiered or flexible data center, what we've been able to do is make sure that our framework is targeted at almost a generic framework where we can complete all the growth modeling and analysis, regardless of what the client is going to do from a facilities perspective.

It lays the groundwork for the customer to get their arms around all of this and tie together IT and facilities with risk and business, and then start to map out an appropriate facility sourcing option.

We find these days that POD is actually a very nice fit with all of our clients, because it provides high-density server farms, it provides things that they can implement very quickly, and gets the power usage effectiveness (PUE) and power and operational cost down. We're starting to see that take a strong hold in a lot of customers.
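For reference, PUE is conventionally defined as total facility power divided by the power delivered to IT equipment, so a value closer to 1.0 means less overhead for cooling and power distribution. The ratio can be sketched as below; the function name and the kilowatt figures are illustrative assumptions, not measurements from any HP POD.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment load must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical site: 1,000 kW of IT load plus 500 kW of cooling and power losses.
it_load_kw = 1000.0
overhead_kw = 500.0
print(pue(it_load_kw + overhead_kw, it_load_kw))  # prints 1.5
```

Because the IT load is the denominator, cutting overhead (cooling, power conversion, lighting) is what moves PUE toward 1.0, which is why dense, purpose-built enclosures are usually discussed in these terms.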

Gardner: As we begin to wrap up, I should think that these trends are going to be even more important, these methods even more productive, when we start to factor in movement toward private cloud. There's the need to support more of a mobile tier set of devices, and the fact that we're looking for of course even more savings on those long-term energy and operating costs.

Back to you, Randy Lawton. Any thoughts about how scorecards and tracking will be even more important in the future, as we move, as we expect we will, to a more cloud-, mobile-, and eco-friendly world?

Lawton: Yes, Dana. In a lot of ways, there is added complexity these days with more customers operating in a hybrid delivery model, where there may be multiple suppliers in addition to their internal IT organizations.

Greater complexity

Just like the example case I gave earlier, where you spread some of these activities not only across multiple teams and stakeholders, but also into separate companies and suppliers who are working under various contract mechanisms, the complexity is even greater. If that complexity is not pulled into a simplified model that is data driven, that is supported by plans and contracts, then there are big gaps in the programs.

The scorecarding and data gathering methods and approaches that we take on our programs are going to be even more critical as we go forward in these more complex environments.

Operating the cloud environments simplifies things from a customer perspective, but it does add some additional complexities in the infrastructure and operations of the organization as well. All of those complexities add up, meaning that even more attention needs to be brought to the details of the program and where those responsibilities lie within stakeholders.

Gardner: Larry Hinman, we're seeing this drive toward cloud. We're also seeing consolidation and standardization around data center infrastructure. So perhaps more large data centers to support more types of applications to even more endpoints, users, and geographic locations or business units. Getting that facilities and IT equation just right becomes even more important as we have fewer, yet more massive and critical, data centers involved.

Hinman: Dana, that's exactly correct. If you look at this, you have to look at the data center facilities piece, not only from a framework or model or topology perspective, but all the way down to the specific environment.

It could be that based on a specific client’s business requirements and IT strategy that it will require possibly a couple of large-scale core data centers and multiple remote sites and/or it could just be a bunch of smaller types of facilities.

It really depends on how the business is being run and supported by IT and the application suite, what the tolerances for risk are, whether it’s high availability, synchronous, all the groovy stuff, and then coming up with a framework that matches all those requirements that it’s integrating.

We tell clients constantly that you have to have your act together with respect to your profile, and start to align all of this, before you can even think about cloud and all the wonderful technologies that are coming down the pike. You have to be able to have something that you can at least manage to control cost and control this whole framework and manage to a future-state business requirement, before you can even start to really deploy some of these other things.

So it all glues together. It's extremely important that customers understand that this really is a process they have to do.

Gardner: Very good. You've been listening to a sponsored BriefingsDirect podcast discussion on how quick and proven ways to attain productivity can significantly improve IT operations and efficiency.

This is the second in an ongoing series of podcasts on data center transformation best practices and is presented in conjunction with a complementary video series.

I'd like to thank our guests, Duncan Campbell, Vice President of Marketing for HP Converged Infrastructure and SMB; Randy Lawton, Practice Principal in the Americas West Data Center Transformation & Cloud Infrastructure Consulting at HP, and Larry Hinman, Critical Facilities Consulting Director and Worldwide Practice Leader for HP Critical Facility Services and HP Technology Services. So thanks to you all.

This is Dana Gardner, Principal Analyst at Interarbor Solutions. Also, thanks to our audience for listening, and come back next time.

Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.

Wednesday, November 30, 2011

Big Data Meets Complex Event Processing: AccelOps Delivers a Better Architecture to Attack the Data Center Monitoring and Analytics Problem

Transcript of a BriefingsDirect podcast on how enterprises can benefit from capturing and analyzing systems data to improve IT management in real-time.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: AccelOps.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on how new data and analysis approaches are significantly improving IT operations monitoring, as well as providing stronger security. We'll examine how advances in big data analytics and complex event processing (CEP) can come together to provide deep and real-time, pattern-based insight into large-scale IT operations.

AccelOps has developed the technology to correlate events with relevant data across IT systems, so that operators can gain much better insights faster, and then learn as they go to better predict future problems before they emerge. [Disclosure: AccelOps is a sponsor of BriefingsDirect podcasts.]

With us now to explain how these new solutions can drive better IT monitoring and remediation response -- and keep those critical systems performing at their best -- is our guest, Mahesh Kumar, Vice President of Marketing at AccelOps. Welcome to BriefingsDirect, Mahesh.

Mahesh Kumar: Dana, glad to be here.

Gardner: It's always been difficult to gain and maintain comprehensive and accurate analysis of large-scale IT operations, but it seems, Mahesh, that this is getting more difficult. I think there have been some shifts in computing in general in these environments that make getting a comprehensive view of what’s going on perhaps more difficult than ever. Is that fair in your estimation?

Kumar: Absolutely, Dana. There are several trends that are fundamentally questioning existing and traditional ways of monitoring a data center.

Gardner: Of course we're seeing lots of virtualization. People are getting into higher levels of density, and so forth. How does that impact the issue about monitoring and knowing what’s going on with your systems? How is virtualization a complexity factor?

Kumar: If you look at trends, there are on average about 10 virtual machines (VMs) to a physical server. Predictions are that this is going to increase to about 50 to 1, maybe higher, with advances in hardware and virtualization technologies. So that’s one trend: the increase in density of VMs is a complicating factor for capacity planning, capacity management, performance management, and security.

Corresponding to this is just the sheer number of VMs being added in the enterprise. Analysts estimate that just in the last few years, we have added as many VMs as there were physical machines. In a very short period of time, you have in effect seen a doubling of the size of the IT management problem. So there are a huge number of VMs to manage and that introduces complexity and a lot of data that is created.

Moreover, your workloads are constantly changing. vMotion and DRS are causing changes to happen in hours, minutes, or even seconds, whereas in the past, it would take a week or two for a new server to be introduced, or a server to be moved from one segment of the network to the other.

So change is happening much more quickly and rapidly than ever before. At the very least, you need monitoring and management that can keep pace with today’s rate of change.

Cloud computing

Cloud computing is another big trend. All analyst research and customer feedback suggests that we're moving to a hybrid model, where you have some workloads on a public cloud, some in a private cloud, and some running in a traditional data center. For this, monitoring has to work in a distributed environment, across multiple controlling parties.

Last but certainly not the least, in a hybrid environment, there is absolutely no clear perimeter that you need to defend from a security perspective. Security has to be pervasive.

Given these new realities, it's no longer possible to separate performance monitoring aspects from security monitoring aspects, because of the distributed nature of the problem. You can’t have two different sets of eyes looking at multiple points of presence, from different angles and then try to piece that together.

Those are some of the trends that are causing a fundamental rethink in how IT monitoring and management systems have to be architected.

Gardner: And even as we're seeing complexity ramp-up in these data centers, many organizations are bringing these data centers together and consolidating them. At the same time, we're seeing more spread of IT into remote locations and offices. And we're seeing more use of mobile and distributed activities for data and applications. So we're not only talking about complexity, but we're talking about scale here.

Kumar: And very geographically distributed scale. To give you an example, every office with voice over IP (VoIP) phones needs some servers and network equipment in their office, and those servers and network equipment have to be secured and their up-time guaranteed.

So what was typically thought of as a remote office now has a mini data center, or at least some elements of a data center, in it. You need your monitoring and management systems to have the reach to easily and flexibly bring those under management and ensure their availability and security.

Gardner: What are some of the ways that you can think about this differently? I know it’s sort of at a vision level, but typically in the past, people thought about a system and then the management of that system. Now, we have to think about clouds and fabrics. We're just using a different vocabulary to describe IT. I suppose we need to have a different vocabulary to describe how we manage and monitor it as well.

Kumar: The basic problem you need to address is one of analysis. Why is that? As we discussed earlier, the scale of systems is really high. The pace of change is very high. The sheer number of configurations that need to be managed is very large. So there's data explosion here.

Since you have a plethora of information coming at you, the challenge is no longer collection of that information. It's how you analyze that information in a holistic manner and provide consumable and actionable data to your business, so that you're able to actually then prevent problems in the future or respond to any issues in real-time or in near real-time.

You need to nail the real-time analytics problem and this has to be the centerpiece of any monitoring or management platform going forward.

Fire hose of data

Gardner: In the past, this fire hose of data was often brought into a repository, perhaps indexed and analyzed, and then over time reports and analysis would be derived from it. That’s the way that all data was managed.

But we really can't take the time to do that, especially when we have to think about real-time management. Is there a fundamental change in how we approach the data that’s coming from IT systems in order to get a better monitoring and analysis capability?

Kumar: The data has to be analyzed in real-time. By real-time I mean in streaming mode before the data hits the disk. You need to be able to analyze it and make decisions. That's actually a very efficient way of analyzing information. Because you avoid a lot of data sync issues and duplicate data, you can react immediately in real time to remediate systems or provide very early warnings in terms of what is going wrong.

The challenges in doing this streaming-mode analysis are scale and speed. The traditional approaches with pure relational databases alone are not equipped to analyze data in this manner. You need new thinking and new approaches to tackle this analysis problem.
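
To make the streaming-mode idea concrete, here is a minimal Python sketch, not AccelOps' implementation. It assumes a hypothetical feed of CPU readings and a simple rule (flag any reading more than twice the mean of the last five). The class name, window size, and threshold are all illustrative assumptions. The point is only that each sample is evaluated in memory as it arrives, before anything is written to disk.

```python
from collections import deque


class SlidingWindowDetector:
    """Flags a sample that is far above the mean of a small in-memory window."""

    def __init__(self, window_size=5, threshold=2.0):
        # deque with maxlen drops the oldest sample automatically
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def observe(self, value):
        """Return True if value exceeds threshold x window mean, then record it."""
        alert = False
        if len(self.window) == self.window.maxlen:
            mean = sum(self.window) / len(self.window)
            alert = value > self.threshold * mean
        self.window.append(value)
        return alert


def analyze_stream(samples, detector):
    """Yield (index, value) for each flagged sample as it arrives."""
    for index, value in enumerate(samples):
        if detector.observe(value):
            yield index, value


# Hypothetical CPU utilization readings, in percent
cpu_samples = [20, 22, 21, 19, 23, 95, 24]
alerts = list(analyze_stream(cpu_samples, SlidingWindowDetector()))
print(alerts)
```

Because the detector keeps only a fixed-size window, memory use stays constant no matter how long the stream runs, which is one reason this style of analysis suits high-velocity data.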

Gardner: Also for issues of security, you don't want to find out about security weaknesses by going back and analyzing a bunch of data in a repository. You want to be able to look and find correlations about what's going on, where attacks might be originating, and how that might be affecting different aspects of your infrastructure.

People are trying different types of attacks. So this needs to be in real-time as well. It strikes me that if you want to solve security as well as monitoring, that also has to happen in real-time, and not be something that you go back to every week or month.

Kumar: You might be familiar with advanced persistent threats (APTs). These are attacks where the attacker tries their best to be invisible. These are not the brute-force attacks that we have witnessed in the past. Attackers may hijack an account or gain access to a server, and then over time, stealthily, be able to collect or capture the information that they are after.

These kinds of threats cannot be effectively handled only by looking at data historically, because these are activities that are happening in real-time, and there are very, very weak signals that need to be interpreted, and there is a time element of what else is happening at that time. What seems like disparate sets of activity have to be brought together to be able to provide a level of defense or a defense mechanism against these APTs. This too calls for streaming-mode analysis.

If you notice, for example, someone accessing a server, a database administrator accessing a server for which they have an admin account, it gives you a certain amount of feedback around that activity. But if on the other hand, you learn that a user is accessing a database server for which they don’t have the right level of privileges, it may be a red flag.

You need to be able to connect this red flag that you identify in one instance with the same user trying to do other activity in different kinds of systems. And you need to do that over long periods of time in order to defend yourself against APTs.
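
As a rough illustration of connecting one red flag with the same user's activity elsewhere, the Python sketch below keeps a running record per user and reports anyone who touches several systems without the right privileges. The event fields, user names, and the `min_systems` cutoff are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def correlate_red_flags(events, privileges, min_systems=2):
    """Return users seen accessing at least min_systems systems without privileges."""
    flagged = defaultdict(set)
    for event in events:
        user, system = event["user"], event["system"]
        # A red flag: the user has no recorded privilege on this system
        if system not in privileges.get(user, set()):
            flagged[user].add(system)
    return {
        user: sorted(systems)
        for user, systems in flagged.items()
        if len(systems) >= min_systems
    }


# Hypothetical privilege map and access events
privileges = {"alice": {"db01"}, "bob": {"web01"}}
events = [
    {"user": "alice", "system": "db01"},  # legitimate DBA access
    {"user": "bob", "system": "db01"},    # red flag
    {"user": "bob", "system": "web01"},   # legitimate
    {"user": "bob", "system": "fs01"},    # second red flag, different system
    {"user": "carol", "system": "web01"}, # single red flag only
]
suspects = correlate_red_flags(events, privileges)
print(suspects)
```

A real deployment would also need to retain this per-user state over weeks or months, since the weak signals of an APT are spread out in time.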

Advances in IT

Gardner: So we have the modern data center, we have issues of complexity and virtualization, we have scale, we have data as a deluge, and we need to do something fast in real-time and consistently to learn and relearn and derive correlations.

It turns out that there are some advances in IT over the past several years that have been applied to solve other problems that can be brought to bear here.

This is one of the things that really jumped out at me when I did my initial briefing with AccelOps. You've looked at what's being done with big data and in-memory architectures, and you've also looked at some of the great work that’s been done in services-oriented architecture (SOA) and CEP, and you've put these together in an interesting way.

Let's talk about what the architecture needs to be in order to start doing for IT what we have been doing with retail data or looking at complex events in a financial environment to derive inference into what's going on in the real world. What is the right architecture, now that we need to move to for this higher level of operations and monitoring?

Kumar: Excellent point, Dana. Clearly, based on what we've discussed, there is a big-data angle to this. And, I want to clarify here that big data is not just about volume.

Doug Laney, a META Group and later Gartner analyst, probably put it best when he highlighted that big data is about volume, the velocity or the speed with which the data comes in and out, and the variety or the number of different data types and sources that are being indexed and managed. I would add to this a fourth V, which is verdicts, or decisions, that are made. How many decisions are actually impacted or potentially impacted by a slight change in data?

For example, in an IT management paradigm, a single configuration setting can have a security implication, a performance implication, an availability implication, and even a capacity implication in some cases. Just a small change in data has multiple decision points that are affected by it. From our angle, all these different types of criteria affect the big data problem.

When you look at all these different aspects of IT management and how it impacts what essentially presents itself as a big data challenge or a big data problem, that’s an important angle that all IT management and monitoring products need to incorporate in their thinking and in their architectures, because the problem is only going to get worse.

Gardner: Understanding that big data is the issue, and we know what's been done with managing big data in this most comprehensive definition, how can we apply that realistically and practically to IT systems?

It seems to me that you are going to have to do more with the data, cleansing it, discovering it, and making it manageable. Tell me how we can apply the concepts of big data that people have been using in retail and these other applications, and now point that at the IT operations issues and make it applicable and productive.

Couple of approaches

Kumar: I mentioned the analytics ability as central to monitoring systems – big-data analytics to be specific. There are a couple of approaches. Some companies are doing some really interesting work around big-data analysis for IT operations.

They primarily focus on gathering the data, heavily indexing it, and making it available for search, thereby deriving analytical results. It allows you to do forensic analysis that you could not easily do with traditional monitoring systems.

The challenge with that approach is that it swings the pendulum all the way to the other end. Previously we had very rigid, well-defined relational data models or data structures, and the index-and-search approach is much more free-form. So the pure index-and-search approach is at the other end of the spectrum.

What you really need is something that incorporates the best of both worlds and puts that together, and I can explain to you how that can be accomplished with a more modern architecture. To start with, we can't do away with this whole concept of a model or a relationship diagram or entity relationship map. It's really critical for us to maintain that.

I’ll give you an example. When you say that a server is part of a network segment, and a server is connected to a switch in a particular way, it conveys certain meaning. And because of that meaning, you can now automatically apply policies, rules, patterns, and automatically exploit the meaning that you capture purely from that relationship. You can automate a lot of things just by knowing that.
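
The server-and-segment example can be sketched in a few lines of Python. This is an assumed, simplified model, not the product's schema: the class names, policy names, and switch identifiers are hypothetical. It shows how, once the relationship is recorded, policies follow from it without anyone writing a query.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    policies: list = field(default_factory=list)


@dataclass
class Server:
    name: str
    segment: Segment
    switch: str


def policies_for(server):
    """A server inherits every policy attached to its network segment."""
    return list(server.segment.policies)


def servers_behind(switch, servers):
    """Use the recorded connection to find what a switch outage would affect."""
    return [s.name for s in servers if s.switch == switch]


# Hypothetical inventory
dmz = Segment("dmz", ["block-inbound-ssh", "log-all-connections"])
internal = Segment("internal", ["log-all-connections"])
inventory = [
    Server("web01", dmz, "sw-1"),
    Server("web02", dmz, "sw-2"),
    Server("db01", internal, "sw-1"),
]
print(policies_for(inventory[0]))
print(servers_behind("sw-1", inventory))
```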

If you stick to a pure index-and-search approach, you basically zero out a lot of this meaning and you lose information in the process. Then it's the operators who have to handcraft queries to reestablish the meaning that’s already out there. That can get very, very expensive pretty quickly.

Even at a fairly small scale, you'll find more and more people having to do things, and a pure index and search approach really scales with people, not as much with technology and automation. Index and search certainly adds a positive dimension to traditional IT monitoring tools -- but that alone is not the answer for the future.

Our approach to this big-data analytics problem is to take a hybrid approach. You need a flexible and extensible model that you start with as a foundation, that allows you to then apply meaning on top of that model to all the extended data that you capture and that can be kept in flat files and searched and indexed. You need that hybrid approach in order to get a handle on this problem.
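
One way to picture the hybrid approach is a free-form search whose hits are then given meaning by the model. The Python sketch below is an illustration under stated assumptions: the log layout (host name as the first token), the host names, and the `segment` attribute are all hypothetical.

```python
def search(log_lines, keyword):
    """Free-form side: plain keyword search over flat log lines."""
    return [line for line in log_lines if keyword in line]


def enrich(hits, model):
    """Model side: attach what the model knows about each host to its hits."""
    enriched = []
    for line in hits:
        host = line.split()[0]  # assumed layout: host name is the first token
        context = model.get(host, {})
        enriched.append(
            {"host": host, "segment": context.get("segment", "unknown"), "line": line}
        )
    return enriched


# Hypothetical model and flat-file log data
model = {"web01": {"segment": "dmz"}, "db01": {"segment": "internal"}}
logs = [
    "web01 sshd: authentication failure for root",
    "db01 backup completed",
    "lab07 sshd: authentication failure for admin",
]
results = enrich(search(logs, "authentication failure"), model)
for result in results:
    print(result["host"], result["segment"])
```

The host that is missing from the model still shows up in the results, marked as unknown, so free-form data is not lost just because the model has not caught up with it.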

Gardner: I suppose you also have to have your own architecture that can scale. So you're going to concepts like virtual appliances and scaling on-demand vis-à-vis clustering, and taking advantage of in-memory and streaming capabilities to manage this. Tell me why you need to think about the architecture that supports this big data capability in order for it to actually work in practical terms?

Kumar: You start with a fully virtualized architecture, because it allows you not only to scale easily but also to extend your reach. With a virtualized architecture, you're able to reach into these multiple disparate environments, capture and analyze that information, and bring it in. So a virtualized architecture is absolutely essential to start with.

Auto correlate

Maybe more important is the ability for you to auto-correlate and analyze data, and that analysis has to be distributed analysis. Because whenever you have a big data problem, especially in something like IT management, you're not really sure of the scale of data that you need to analyze and you can never plan for it.

Let me put it another way. There is no server big enough to be able to analyze all of that. You'll always fall short of compute capacity because analysis requirements keep growing. Fundamentally, the architecture has to be one where the analysis is done in a distributed manner. It's easy to add compute capacity by scaling horizontally. Your architecture fits how computing models are evolving over the long run. So there are a lot of synergies to be exploited here by having a distributed analytics framework.

Think of it as applying a MapReduce type of algorithm to IT management problems, so that you can do distributed analysis, and the analysis is highly granular or specific. In IT management problems, it's always about the specificity with which you analyze and detect a problem that makes all the difference between whether that product or the solution is useful for a customer or not.
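
For readers who have not met the pattern, here is a minimal single-process Python sketch of a MapReduce-style analysis. It is only an illustration of the shape of the computation: the event fields and partitions are hypothetical, and in a real system each map step would run on a separate worker or VM.

```python
from collections import Counter
from functools import reduce


def map_partition(events):
    """Map step: count error events per host within one partition."""
    return Counter(e["host"] for e in events if e["severity"] == "error")


def reduce_counts(left, right):
    """Reduce step: merge two partial results."""
    return left + right


# Hypothetical partitions, e.g. one per collector or worker VM
partitions = [
    [
        {"host": "web01", "severity": "error"},
        {"host": "web01", "severity": "info"},
        {"host": "db01", "severity": "error"},
    ],
    [
        {"host": "web01", "severity": "error"},
        {"host": "web01", "severity": "error"},
    ],
]

partials = [map_partition(p) for p in partitions]  # independent, so parallelizable
totals = reduce(reduce_counts, partials, Counter())
print(totals.most_common())
```

Since each partition is processed independently, adding capacity means adding workers rather than buying a bigger server, which matches the horizontal scaling described above.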

Gardner: In order for us to meet our requirements around scale and speed, we really have to think about the support underneath these capabilities in a new way. It seems like, in a sense, architecture is destiny when it comes to the support and monitoring for these large volumes and this velocity of data.

Let's look at the other part of this. We talked about the big data, but in order for the solution to work, we're looking at CEP capabilities in an engine that can take that data, analyze it for programmable events, and look for certain patterns.

Now that we understand the architecture and why it's important, tell me why this engine brings you to a higher level and differentiates you in the field around the monitoring.

Kumar: A major advantage of distributed analytics is that you're freed from the scale-versus-richness trade-off, from the limits on the type of events you can process. If I wanted to do more complex events and process more complex events, it's a lot easier to add compute capacity by just simply adding VMs and scaling horizontally. That’s a big aspect of automating deep forensic analysis into the data that you're receiving.

I want to add a little bit more about the richness of CEP. It's not just around capturing data and massaging it or looking at it from different angles and events. When we say CEP, we mean it is advanced to the point where it starts to capture how people would actually rationalize and analyze a problem.

For example, the ability for people in a simple visual snapshot to connect three different data points or three events together and say that they're interrelated and they point to a specific problem.

The only way you can automate your monitoring systems end-to-end and get more of the human element out of it is when your CEP system is able to capture those nuances that people in the NOC and SOC would normally use to rationalize when they look at events. You not only look at a stream of events, you ask further questions and then determine the remedy.
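
The idea of tying three events together the way an operator would can be sketched as a simple ordered-pattern match in Python. This is a toy illustration, not a CEP engine: the event types, the 60-second window, and the function name are assumptions made for the example.

```python
from collections import defaultdict


def match_sequence(events, pattern, window_seconds):
    """Return hosts where the event types in pattern occur in order within the window."""
    by_host = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        by_host[event["host"]].append(event)

    matches = []
    for host, host_events in by_host.items():
        for i, first in enumerate(host_events):
            if first["type"] != pattern[0]:
                continue
            matched = 1  # number of pattern steps seen so far
            for later in host_events[i + 1:]:
                if matched == len(pattern):
                    break
                if later["time"] - first["time"] > window_seconds:
                    break
                if later["type"] == pattern[matched]:
                    matched += 1
            if matched == len(pattern):
                matches.append(host)
                break
    return sorted(matches)


# Hypothetical events; time is in seconds
events = [
    {"host": "web01", "type": "config_change", "time": 0},
    {"host": "web01", "type": "cpu_spike", "time": 30},
    {"host": "web01", "type": "login_failure", "time": 50},
    {"host": "db01", "type": "config_change", "time": 0},
    {"host": "db01", "type": "cpu_spike", "time": 500},
    {"host": "db01", "type": "login_failure", "time": 520},
]
pattern = ["config_change", "cpu_spike", "login_failure"]
related = match_sequence(events, pattern, window_seconds=60)
print(related)
```

Both hosts see the same three event types, but only on one of them do they fall close enough together to suggest a single underlying problem.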

No hard limits

To do this, you should have a rich data set to analyze, i.e., there shouldn’t be any hard limits placed on what data can participate in the analysis, and you should have the flexibility to easily add new data sources or types of data. So it's very important for the architecture to be able to not only event on data that is stored in traditional models or well-defined relational models, but also event against data that’s typically serialized and indexed in flat-file databases.

This hybrid approach basically breaks the logjam in terms of creating these systems that are very smart and that could substitute for people in terms of how they think and how they react to events that are manifested in the NOC. You are not bound to data in an inflexible vendor defined model. You can also bring in the more free-form data into the analytics domain and do deep and specific analysis with it.

Cloud and virtualization are also making this possible. Although they’ve introduced more complexity due to change frequency, distributed workloads etc., they’ve also introduced some structure into IT environments. An example here is the use of converged infrastructure (Cisco UCS, HP Blade Matrix) to build private-cloud environments. At least at the infrastructure level it introduces some order and predictability.

Gardner: All right, Mahesh, we've talked about the problem in the market, we've taken a high-level look at the solution and why you need to do things differently, and why having the right architecture to support that is important, but let's get into the results.

If you do this properly, if you leverage and exploit these newer methods in IT -- like big data, analytics, CEP, virtual appliances and clustered instances of workload and support, and when you apply all those to this problem about the fire hose of data coming out of IT systems, a comprehensive look at IT in this fashion -- what do you get? What's the payoff if you do this properly?

Kumar: I want to answer this question from a customer standpoint. It is no surprise that our customers don’t come to us saying we have a big data problem, help us solve a big data problem, or we have a complex event problem.

Their needs are really around managing security, performance and configurations. These are three interconnected metrics in a virtualized cloud environment. You can't separate one from the other. And customers say they are so interconnected that they want these managed on a common platform. So they're really coming at it from a business-level or outcome-focused perspective.

What AccelOps does under the covers is apply techniques such as big-data analysis, complex event processing, etc., to then solve those problems for the customer. That is the key payoff -- the customer’s key concerns that I just mentioned are addressed in a unified and scalable manner.

An important factor for customer productivity and adoption is the product user-interface. It is not of much use if a product leverages these advanced techniques but makes the user interface complicated -- you end up with the same result as before. So we’ve designed a UI that’s very easy to use, requires one or two clicks to get the information you need; a UI-driven ability to compose rich events and event patterns. Our customers find this very valuable, as they do not need super-specialized skills to work with our product.

Gardner: What's important to think about when we mention your customers is not just applying this value to an enterprise environment. Increasingly the cloud, with the virtualization, the importance of managing performance to very high standards, these are also impacting the cloud providers, managed service providers (MSPs), and software-as-a-service (SaaS) providers.

Up and running

This sounds like an architecture, an approach and a solution that's going to really benefit them, because their bread and butter is about keeping all of the systems up and running and making sure that all their service level agreements (SLAs) and contracts are being managed and adhered to.

Just to be clear, we're talking about an approach for a fairly large cross-section of the modern computing world -- enterprises and many different stripes of what we consider as service providers.

Kumar: Service providers are a very significant market segment for us and they are some of our largest customers. The reason they like the architecture that we have, very clearly, is that it's scalable. They know that the architecture scales as their business scales.

They also know that they get both the performance management and the security management aspects in a single platform. They're actually able to differentiate their customer offerings compared to other MSPs that may not have both, because security becomes really critical.

For anyone wanting to outsource to an MSP, the first question, or one of the first questions, that they are going to ask, in addition to the SLAs, is how are you going to ensure security? So to have both of those options is absolutely critical.

The third piece really is the fact that our architecture is multi-tenant from day one. We're able to bring customers on board with a one-touch mechanism, where they can bring the customer online, apply the right types of policies, whether it's SLA policies or security policies, automatically in our product and completely segment the data from one customer to the other.

All of that capability was built into our products from day one. So we didn’t have to retrofit any of that. That’s something our cloud-service providers and managed service provider customers find very appealing in terms of adopting AccelOps products.

Subscription based licensing, which we offer in addition to perpetual licensing, also fits well with the CSP/MSP business model.

Gardner: All right. Let's introduce your products in a little bit more detail. We understand you have created a platform, an architecture, for doing these issues or solving these issues for these very intense types of environments, for these large customers, enterprises, and service providers. Tell us a little bit about your portfolio.

Key metrics

Kumar: What we've built is a platform that monitors data center performance, security, and configurations -- the three key interconnected metrics in virtualized cloud environments. Most of our customers really want that combined and integrated platform. Some of them might choose to start with addressing security, but they soon bring in the performance management aspects into it also. And vice versa.

And we take a holistic cross-domain perspective -- we span server, storage, network, virtualization and applications.

What we've really built is a common consistent platform that addresses these problems of performance, security, and configurations, in a holistic manner and that’s the main thing that our customers buy from us today.

Gardner: It sounds as if we're doing business intelligence for IT. We really are getting to the point where we can have precise dashboards, and we are not just making inferences and guesses. We're not just doing Boolean searches on old or even faulty data.

We're really looking at the true data, the true picture in real-time, and therefore starting to do the analysis that I think can start driving productivity to even newer heights than we have been accustomed to. So is that the vision, business intelligence (BI) for IT?

Kumar: I guess you could say that. To break it down, from an IT management and monitoring standpoint, the goal is to continuously reduce the per capita management cost. As you add VMs or devices, you simply cannot let the management cost scale in a linear fashion. You want to have a continuously falling management cost for every new VM added or new device introduced.

The way you do that is obviously through automation and through a self-learning process, whereby as you continue to learn more and more about the behavior of your applications and infrastructure, you're able to start to easily codify more and more of those patterns and rules in the system, thereby taking sort of the human element out of it bit by bit.

What we have as a product and a platform is the ability for you to increase the return on investment (ROI) on the platform as you continue to use that platform day-to-day. You add more information and enrich the platform with more rules, more patterns, and complex events that you can detect and potentially take automated actions on in the future.

So we create a virtuous cycle, with our product returning higher and higher return on your investment with time. Whereas, in traditional products, scale and longevity have the opposite effect.

So that’s really our vision. How do you reduce the per capita management cost as the scale of the enterprise starts to increase, and how do you increase automation as one of the elements of reducing the management cost within IT?

Gardner: You have given people a path to start in on this, sort of a crawl-walk-run approach. Tell me how that works. I believe you have a trial download, an opportunity for people to try this out for free.

Free trial download

Kumar: Most of our customers start off with the free trial download. It’s a very simple process. Visit www.accelops.com/download and download a virtual appliance trial that you can install in your data center within your firewall very quickly and easily.

Getting started with the AccelOps product is pretty simple. You fire up the product and enter the credentials needed to access the devices to be monitored. We do most of it agentlessly, and so you just enter the credentials, the range that you want to discover and monitor, and that’s it. You get started that way and you hit Go.

The product then uses this information to determine what’s in the environment. It automatically establishes relationships among the devices it finds, and automatically applies the rules, policies, and basic thresholds that come out of the box with the product, so that you can actually start measuring the results. Within a few hours of getting started, you'll have measurable results and trends and graphs and charts to look at and gain benefits from.

That’s a very simple process, and I encourage all our listeners and readers to download our free trial software and try AccelOps.

Gardner: I also have to imagine that your comments a few moments ago about not being able to continue on the same trajectory when it comes to management is only going to accelerate the need to automate and find the intelligent rather than the hard or laborious way to solve this when we go to things like cloud and increased mobility of workers and distributed computing.

So the trends are really in your favor. It seems that as we move toward cloud and mobile that at some point or another organizations will hit the wall and look for the automation alternative.

Kumar: It’s about automation and distributed analytics and about getting very specific with the information that you have, so that you can make far more predictable decisions, correct 99.9 percent of the time, and do that in an automated manner. The only way you can do that is if you have a platform that’s rich enough and scalable and that allows you to then reach that ultimate goal of automating most of the management of these diverse and disparate environments.

That’s something that's sorely lacking in products today. As you said, it's all brute-force today. What we have built is a very elegant, easy-to-use way of managing your IT problems, whether it’s from a security standpoint, performance management standpoint, or configuration standpoint, in a single integrated platform. That's extremely appealing for our customers, both enterprise and cloud-service providers.

I also want to take this opportunity to encourage those of you listening or reading this podcast to come meet our team at the 2011 Gartner Data Center Conference, Dec. 5-9, at Booth 49 and learn more. AccelOps is a silver sponsor of the conference.

Gardner: I am afraid we will have to leave it there. You've been listening to a sponsored BriefingsDirect podcast. We've been talking about how new data and analysis approaches from AccelOps are attaining significantly improved IT operations monitoring as well as stronger security.

I'd like to thank our guest, Mahesh Kumar, Vice President of Marketing at AccelOps. Thanks so much, Mahesh.

Kumar: Thank you, Dana.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks again for listening and come back next time.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: AccelOps.

Transcript of a BriefingsDirect podcast on how enterprises can benefit from capturing and analyzing systems data to improve IT management in real-time. Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.

Tuesday, November 29, 2011

HP Discover Case Study: Vodafone Ireland IT Group Sees Huge ROI By Emphasizing Business Service Delivery

Transcript of a BriefingsDirect podcast in conjunction with HP Discover 2011 in Vienna on how a major telecom provider has improved service to customers by shifting from a technology emphasis to business service delivery.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to a special BriefingsDirect podcast series coming to you in conjunction with the HP Discover 2011 Conference in Vienna.

We’re here in the week of Nov. 28, to explore some major case studies from some of Europe’s leading enterprises. We'll see how a series of innovative solutions and an IT transformation approach to better support business goals is benefiting these companies, their internal users, and their global customers.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions and I'll be your host throughout this series of HP-sponsored Discover live discussions.

Our next customer case study interview highlights how a shift from a technology emphasis to a business services delivery emphasis has created significant improvements for a large telecommunications provider, Vodafone.

To learn more, we’re here with Shane Gaffney, Head of IT operations for Vodafone Ireland in Dublin. Welcome to the show, Shane.

Shane Gaffney: Thank you, Dana.

Gardner: Tell me what was the challenge that you faced when you decided to switch from a focus on technology purely to one more of a business user mentality? Why did you think you needed to do that?

Gaffney: Back in summer of 2010, when we looked at the business perception of the quality of service received from IT, the confidence was lower than we’d like in terms of predictable and optimal service quality being provided.

There was a lack of transparency. Business owners didn’t fully understand what quality was being received and they didn’t have simple meaningful language that they were receiving from IT operations in terms of understanding service quality: good, bad, or indifferent.

Within IT operations, as a function, we also had our own challenges. We were struggling to control our services. We were under the usual pressure that many of our counterparts face in terms of having to do more with less, and downward pressure on cost and headcount. We were growing a dynamic complex IT estate, plus customers are naturally becoming ever more discerning in terms of their expectations of IT.

Transactional and reactive


As a large multinational, we had a fragmented organization, with some services being provided centrally vs. locally, some offshore vs. onshore. With that, we had a number of challenges, and the type of relationship we had with our customers was transactional and reactive in nature.

So with that backdrop, we knew we needed to take some radical steps to really drive our business forward.

Gardner: And before we learn more about that, Shane, tell me a little bit about Vodafone Ireland? Tell us the extent of your services and your reach there?

Gaffney: Vodafone is Ireland’s leading telecommunications operator. We have in excess of 2.4 million subscribers, about 1,300 employees in a mixture of on-premise and cloud operations. I mentioned the complex and dynamic IT estate that we manage. To put a bit of color around that, we’ve got 230 applications, about 2,500 infrastructure nodes that we manage either directly or indirectly -- with substantial growth in traffic, particularly the exponential growth in the telecom data market.

Gardner: When you decided to change your emphasis to try to provide more of that business confidence, the services orientation, clearly just using technology to do that probably wasn't going to be sufficient. There are issues around people, process, culture, and so forth. How do you, at a philosophical level, bridge the continuum among and between technology and these other softer issues like culture?

Gaffney: That's a really important point. The first thing we did, Dana, was engage quite heavily with all of our business colleagues to define a service model. In essence what we were looking at there was having our business unit owners define what services were important to them at multiple levels down to the service transactions, and defining the attributes of each of those services that make them successful or not.

Once we had a very clear picture of what that looked like across all business functions, we used that as our starting point to be able to measure success through the customer eyes.

That's the focus and continues to be the core driver behind everything else we do in IT operations. We essentially looked to align our people, revamp our processes, and look at our end-to-end tool strategy, all based around that service model.

The service model has enforced a genuine service orientation and customer centricity that’s driven through all activities and behaviors, including the culture within the IT ops group in how we service customers. It’s really incorporating those commercial and business drivers at the heart of how we work.

Gardner: Shane, I've heard from other companies that another important aspect of moving to the shift on services delivery is to gain more awareness of the IT products in total. It involves, I suppose, gaining insight and then analysis, not at the point-by-point basis in the products themselves, but at that higher abstraction of how the users themselves view these services? Has that been important for you as well?

Helicopter view

Gaffney: We’ve taken the service view at a number of levels. Essentially, the service model is defined at a helicopter view, which is really what’s important to our respective customers. And we’ve drilled down into a number of customer or service-oriented views of their services, as well as mapping in, distilling, and simplifying the underlying complexities and event volumes within our IT estate.

Gardner: In order to get that helicopter view and abstract things in terms of a process level, what have you done in terms of gaining insight? What has become important for you to be able to do that?

Gaffney: There are a number of things we’ve considered there. Without having a consolidated or rationalized suite of tools, we found previously that it's very difficult to get control of our services through the various tiers. By introducing the HP Application Performance Management tools portfolio, we gained a number of modules that have allowed us to achieve the goals we’ve set and the desired control.

Gardner: Before we go into any detail on products and approaches, let’s pause and step back. What does this get for you -- if you do it right? What is it that you've been able to attain by shifting your emphasis to the business services level and employing some new approaches in culture? What did you get? What’s the payoff?

Gaffney: First of all for IT, we build confidence within the team in terms of having a better handle on the quality of service that we’re offering. Having that commercial awareness really does drive the team forward. It means that we’re able to engage with our customers in a much more meaningful way to create genuine value-add, and move away from routine transactional activity, to helping our customers to innovate and drive business forward.

We’ve certainly enjoyed those types of benefits through our transformation journey by automating a lot of the more core routine and repeatable activity, facilitating focus on our relationship with our customers in terms of understanding their needs and helping them to evolve the business.

Gardner: Have you done any surveys or presented key performance indicators (KPIs) against some of this? Is it still early? How might we look to some more concrete results, if you're able to provide that?

Gaffney: In terms of how we measure success, Dana, we try to take a 360 view of our service quality. So we have a comprehensive suite of KPIs at the technology layer. We also do likewise in terms of our service management and establishing KPIs and service level agreements (SLAs) at the service layer. We've then taken a look at what quality looks like in terms of customer experience and perception, seeking to correlate metrics between these perspectives.

As an example, we routinely and rigorously measure our customer net promoter score, which essentially assesses whether the customers, based on their experience, would recommend our products and services to others.
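For readers unfamiliar with the metric, the following is a minimal sketch of how a net promoter score is commonly calculated, assuming the standard 0-10 "would you recommend" survey scale (promoters score 9-10, detractors 0-6). The function name and the sample ratings are hypothetical, invented for illustration; the discussion does not describe Vodafone's actual survey method or results.

```python
def net_promoter_score(ratings):
    """Return the NPS (-100 to 100) for a list of 0-10 survey ratings.

    Standard convention: promoters rate 9-10, detractors rate 0-6,
    and passives (7-8) count only toward the total.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, invented for illustration only.
sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(net_promoter_score(sample_ratings))
```

In this sample, five promoters and three detractors out of ten responses give a score of 20.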

To give a flavor of the type of KPI improvements that we’ve seen at an operational level over the last year, we measure "customer lost hours," which are effectively due to any diminished performance in, or availability of, our services. We measure the impact to the end customer in terms of the adverse impact they would suffer.

Reduction in lost hours

We’ve seen a 66 percent reduction in customer lost hours year on year from last summer to this. We’ve also seen a 75 percent reduction in mean time to repair, or average service restoration time.
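As a rough sketch of the arithmetic behind year-on-year figures like these, the snippet below computes a mean time to repair from incident durations and a percentage reduction between two periods. All numbers are hypothetical and chosen only so that the results match the percentages quoted; the transcript does not give the underlying hours or incident data.

```python
from statistics import mean


def percent_reduction(before, after):
    """Percentage by which a metric fell from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (before - after) / before


# Hypothetical restoration times (hours per incident), for illustration only.
repair_hours_last_year = [6.0, 2.0, 4.0]
repair_hours_this_year = [1.5, 0.5, 1.0]

mttr_before = mean(repair_hours_last_year)  # mean time to repair, last year
mttr_after = mean(repair_hours_this_year)   # mean time to repair, this year
print(percent_reduction(mttr_before, mttr_after))

# Hypothetical customer lost hours for the two periods.
print(percent_reduction(1200.0, 408.0))
```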

Another statistic I'd call out briefly is that at the start of this process, we were identifying root cause for incidents that were occurring in about 40-50 percent of cases on average. We’re now tracking consistently between 90-100 percent in those cases and have thereby been able to better understand, through our capabilities and tools, what’s going on in the department and what’s causing issues. We consequently have a much better chance of avoiding repetition in those issues impacting customers.

At a customer satisfaction level, we’ve seen similar improvements that correlate with the improved operational KPIs. From all angles, we’ve thankfully enjoyed very substantial improvements. If we look at this from a financial point of view, we’ve realized a return on investment (ROI) of 300 percent in year one and, looking solely at the cost to fix and the cost of failure in terms of not offering optimal service quality, we’ve been able to realize cost savings in the region of €1.2 million OPEX through this journey.

Gardner: Let me just dig into that ROI. That’s pretty amazing, 300 percent ROI in one year. And what was that investment in? Was that in products, services, consulting, how did you measure it?

Gaffney: Yes, the ROI is in terms of the expenditure that would have related primarily to our investment in the HP product portfolio over the last year as well as a smaller number of ancillary solutions.

The payback, in terms of the benefits realized from a financial perspective, relates to the cost savings associated with having fewer issues and, in the event that we do have issues, the ability to detect those faster and spend less labor investigating and resolving them, because the tools, in effect, are doing a lot of that legwork and much of the intelligence is built into that product portfolio.
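A return on investment percentage of this kind is usually expressed as net benefit divided by cost. The sketch below assumes that conventional definition; the transcript does not state how the 300 percent figure was derived or what the investment amount was, so the inputs here are hypothetical placeholders.

```python
def roi_percent(total_benefit, total_cost):
    """Conventional ROI: net benefit as a percentage of the amount invested."""
    if total_cost <= 0:
        raise ValueError("cost must be positive")
    return 100.0 * (total_benefit - total_cost) / total_cost


# Hypothetical figures, for illustration only: benefits worth four times
# the amount invested correspond to a 300 percent return.
print(roi_percent(total_benefit=400_000, total_cost=100_000))
```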

Gardner: I suppose this would be a good time to step back and take a look at what you actually do have in place. What specifically does that portfolio consist of for you there at Vodafone Ireland?

Gaffney: We have a number of modules in HP's APM portfolio that I'll talk about briefly. In terms of looking to get a much broader and richer understanding of our end-user experience which we lacked previously, we’ve deployed HP’s Business Process Monitors (BPMs) to effectively emulate the end-user experience from various locations nationwide. That provides us with a consistent measure and baseline of how users experience our services.

We’ve deployed HP Real User Monitoring (RUM), which gives us a comprehensive micro and macro view of the actual customer experience to complement those synthetic transactions that mimic user behavior. Those two views combined provide a rich cocktail for understanding at a service level what our customers are experiencing.

Events correlation

We then looked at events correlation. We were one of the first commercial customers to adopt HP’s BSM version 9.1 deployment, which gives us a single pane of glass into our full service portfolio and the related IT infrastructure.

Looking a little bit more closely at BSM, we've used HP’s Discovery and Dependency Mapping Advanced (DDMa) to build out our service model, i.e. effectively mapping our configuration items throughout the estate, back up to that top-down service view. DDMa effectively acts as an inventory tool that granularly links the estate to service. We’ve aligned the DDMa deployment with our service model which, as I mentioned earlier, is integral to our transformation journey.

Beyond that, we’ve looked at HP’s Operations Manager i (OMI) capability, which we use to correlate our application performance and our system events with our business services. This allows our operators to reduce a lot of the noisy events by distilling those high-volume events into unique actionable events. This allows operators to focus instead on services that may be impacted or need attention and, of course, our customers and our business.
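The general idea of distilling high-volume events into a few actionable ones can be sketched as grouping raw events by the service and symptom they share. This is a simplified illustration only, not how HP's OMI performs correlation; the function name, event fields, and sample data are all hypothetical.

```python
from collections import Counter


def distill_events(raw_events):
    """Collapse repeated raw events into one item per (service, symptom).

    Returns the grouped items ordered from most to least frequent, so an
    operator sees a short list instead of every individual alert.
    """
    counts = Counter((e["service"], e["symptom"]) for e in raw_events)
    return [
        {"service": service, "symptom": symptom, "occurrences": n}
        for (service, symptom), n in counts.most_common()
    ]


# Hypothetical raw events, invented for illustration only.
raw = [
    {"service": "billing", "symptom": "db timeout", "node": "db01"},
    {"service": "billing", "symptom": "db timeout", "node": "db02"},
    {"service": "billing", "symptom": "db timeout", "node": "db01"},
    {"service": "top-up", "symptom": "slow response", "node": "web03"},
]
for item in distill_events(raw):
    print(item)
```

Here four raw events reduce to two items, one per affected service, which is the kind of noise reduction described above.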

We’ve gone further and looked at ArcSight Logger, software which we’ve deployed to a single location that collects log files throughout our estate. This allows us to quickly and easily search across all log files for abnormalities that might be related to a particular issue.

By integrating ArcSight Logger with OMI -- and I believe we’re one of the first HP customers to do this -- we’ve enriched operator views with security information as well as the hardware, OS, and application layer events. That gives us a composite view of what’s happening with our services through multiple lenses, holistically across our technology landscape and products and services portfolio.

Additionally, we’ve used HP’s Operations Orchestration to automate many of our routine procedures and, picking up on the ROI, this has allowed us to free up operators’ time to focus on value-add and effectively to do more with less. That's been quite a powerful module for us, and we’ve further work to exploit that capability.

The last point to call out in terms of the HP portfolio is we’re one of the early trialists of HP’s Service Health Analyzer. A year ago, we were to a degree reactive in terms of how we provided service. At this point, we’re proactive in how we manage services.

Service Health Analyzer will allow us to move to the next level of our evolution, moving toward predictive service quality. I prefer to call the Service Health Analyzer our “crystal ball,” because that’s essentially what we’re looking at. It’s taking trends that are occurring with the services or transactions, and predicting what's likely to happen next and what may be in jeopardy of breaking down the line, so you can take early intervention and remedial action before there’s any material impact on customers.

We’re quite excited about seeing where we can go there. One of the sub-modules of Service Health Analyzer is Service Health Reporter, and that’s a tool that we expect to act as our primary capacity planning capability across our full IT estate going forward.

Throughout our implementation, partnership was a key ingredient to success. Vodafone had the business vision and appetite to evolve. HP provided the thought leadership and guidance. And, Perform IT, HP's partner, brought hands-on implementation and tuning expertise into the mix.

Gardner: That’s very impressive. You’re certainly juggling a lot of balls and keeping them in the air. One of the things that I've seen in the market, when it comes to gaining this sort of single-pane-of-glass view into operations, is that organizations are starting to share it, as a sort of dashboard or graphical scorecard, with more of the business leadership.

Have you been able to take some of the analysis and insights and not just use them in the context of IT operations, but provide them back to the business, so that it helps them manage their strategy and operational decision making?

Full transparency

Gaffney: Absolutely. One of our core principles throughout this journey has been to offer full transparency to our customers in terms of the services they receive and enjoy from us. On one hand, we provide the BSM console to all of our customers to allow them to have a view of exactly what the IT teams see, but with a service orientation.

We’re actually going a step further and we’re building out a cloud-based service portal that takes a rich feed in from the full BSM portfolio, including the modules that I've called out earlier. It also takes feeds in from a remedy system, in order to get the view of core processes such as incident management, problem management, change management.

Bringing all of that information together gives customers a comprehensive view of the services they receive from IT operations. That's our aim -- to provide customers with everything they need at their fingertips.

It's essentially providing simple and meaningful information with customized views and dynamic drill-down capabilities, so customers can look at a very high level of how the services are performing, or really drill into the detail, should they so desire. The portal, we believe, is likely to act as a powerful business enabler. Ultimately, we believe there's opportunity to commercialize or productize this capability down the line.

Gardner: We’re about out of time, but Shane, now that you've gone through quite a bit of this, and as an early adopter, I wonder if you could share some 20-20 hindsight for those users around the world who are examining some of the products and services available, thinking about culture, re-emphasizing the business process issues, rather than just pure technology issues. What would you tell them as advice when they get started? Any recommendations now that you've been through this yourself?

Gaffney: For customers embarking on this type of transformation initiative, first off, I would suggest: engage with your customers. Speak with your customers to deeply understand their services, and let them define what success looks like.

Look to promote quick wins and win-wins. Look at what works for the IT community and what works for the customer. Both are equally important. Buy-in is required, and people across those functions all need to understand what success looks like, and believe in it.

I would recommend taking a holistic approach from a couple of angles. Don’t just look at your people, technology, or processes, but look at those collectively, because they need to work in harmony to hit the service quality sweet spot. Holistically, it's important to prepare your strategy, but look top down from the customer view down into your IT estate and vice versa, mapping all configuration items back into those top level services.

Rationalize and automate

Rationalize and automate wherever possible. We had a suite of over two dozen tools, acting as a cumbersome patchwork solution for operators. We’ve vastly rationalized those tools into a much more manageable single console that the teams use now.

We’ve automated all resource-intensive and transactional activities wherever possible, which again frees up time to allow engineers to focus on the business relationship.

I’d also recommend that people incrementally build on success. We started out with a modest budget, but by targeting early wins through that investment, and by building subsequent business cases, particularly with the service model, we were easily able to get the buy-in from stakeholders, because the story was compelling, based on the commercial advantages and the broader business benefits that were accrued from the earlier investment.

Lastly, for IT teams I would strongly suggest that you look to establish a dedicated surveillance capability, whether that’s round the clock or whatever is appropriate for your business model. Moving from a traditional support model to this type of service-oriented view, the key to success is having people managing the eyes and ears across your services at all times. It really does pay back in spades.

Gardner: Excellent. A big thank you to you, Shane Gaffney, Head of IT Operations at Vodafone Ireland. This has been a great story, and thank you for sharing how a shift from a technology emphasis to a business services delivery emphasis has created some significant improvements and has set the stage for yet greater business productivity from IT.

I also want to thank our audience for joining us for this special BriefingsDirect podcast coming to you in conjunction with the HP Discover 2011 Conference in Vienna. I hope you have a great show. I appreciate your time, Shane.

Gaffney: Thank you, Dana.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this series of HP-sponsored Discover live discussions. Thanks again for listening and come back next time.

Transcript of a BriefingsDirect podcast in conjunction with HP Discover 2011 in Vienna on how a major telecom provider has improved service to customers by shifting from a technology emphasis to business service delivery. Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.
