Wednesday, August 03, 2011

Case Study: MSP InTechnology Improves Network Services Via Automation and Consolidation of Management Systems

Transcript of a BriefingsDirect podcast discussion on how InTechnology uses network management automation to improve delivery and service performance for network and communications services.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Sponsor: HP.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on a UK-based managed service provider’s journey to provide better information and services for its network, voice, VoIP, data, and storage customers. Their benefits have come from an alignment of many service management products into an automated lifecycle approach to overall network operations.

We'll hear how InTechnology has implemented a coordinated, end-to-end solution using HP solutions that actually determine the health of its networks by aligning their tools to ITIL methods. And, by using their system-of-record approach with a configuration management database, InTechnology is better serving its customers with lean resources by leveraging systems over manual processes. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

We're here with an operations manager from InTechnology to learn about their choices and outcomes when it comes to better operations and better service for their hundreds of enterprise customers.

Please join me now in welcoming Ed Jackson, Operational System Support Manager at InTechnology. Welcome, Ed.

Ed Jackson: Thanks. Hi.

Gardner: Your organization is a managed service provider (MSP) for both large enterprises and small to medium-sized companies, and you've been facing an awful lot of growth over the past several years. But you have also been dealing with heterogeneity in terms of many different products in place for network operations. It sounds like you've tried to tackle two major things at once: growth and complexity. How has that worked out?

Jackson: In terms of our network growth, we've basically been growing exponentially year over year. In the past four years, we've grown our network about 75 percent. In terms of our product set, we've basically tripled that in size, which obviously leads to major complexity on both our network and how we manage the product lifecycle.

Previously, we didn’t have anything that could scale as well as the systems that we have in place now. We couldn’t hope to manage 8,000 or 9,000 network devices, plus being able to deliver a product lifecycle, from provisioning to decommission, which is what we have now.

Gardner: So our audience better understands the hurdles and challenges you've faced, you're providing voice, both VoIP and traditional telephone, and telephony services. You have data, managed Microsoft Exchange, managed servers, and virtual hosting. You're providing storage, backup and restore, and of course a variety of network services. So this is a really full set of different services and a whole lot of infrastructure to support that.

Jackson: Yeah. It's pretty massive in terms of the technologies involved. A lot of them are cutting-edge. We have many partners. And you are right, our suite of cloud services is very diverse and comprises what we believe is the UK’s most complete and "joined-up" set of pay-monthly voice and data services.

Their own pace

In practice what we aim to do is help our customers engage with the cloud at a pace that works for them. First, we provide connectivity to our nationwide network ring – our cloud. Once their estate is connected they can then cherry pick services from our broad pay-as-you-go (PAYG) menu.

For example, they might be considering replacing their traditional "tin" PBXs with hosted IP telephony. We can do that and demonstrate massive savings. Next we might overlay our hosted unified communications (UC) suite providing benefits such as "screen sharing," "video calling," and "click-to-dial." Again, we can demonstrate huge savings on planes, trains and automobiles.

Next we might overlay our exciting new hosted call recording package -- Unity Call Recording (UC) -- which is perfect if they are in a regulated industry and have a legal requirement to record calls. It’s got some really neat features including the ability to tag and bookmark calls to help easy searching and playback.

While we're doing this, we might also explore the data path. For example our new FlexiStor service provides what we think is the UK’s most straightforward PAYG service designed to manage data by its business "value" and not just as one big homogenous lump of data.

It treats data as critical, important or legacy and applies an appropriate storage process to each ... saving up to 40 percent against traditional data management methods. There’s much more of course, but that gives you a flavor, I hope.

Imagine trying to manage this disparate set of systems. It would be pretty impossible. But due to the HP product set that we have, we've been able to utilize all the integrations and have a fully managed, end-to-end lifecycle of the service, the devices, and the product sets that we have as a company.

Gardner: I have to imagine too that customer service and support is a huge part of what you do, day in and day out. You also have had to manage the help desk and provide automated alerts, fixes, and notifications, so that the manual help desk, which is of course quite costly, doesn’t overwhelm you. Can you address what you've attempted to do and what you have managed to do when it comes to automated support?

Jackson: In terms of our service and support, we've basically grown the network massively, but we haven’t increased any headcount for managing the network. Our 24/7 guys are the same as they were four or five years ago in terms of headcount.

We get on average around 5,000 incidents a month automatically generated from our systems and network devices. Of these incidents, only about 560 are linked to customer facing Interactions using our Service Desk Module in the Service Manager application.

Approximately 80 percent of our total incidents are generated automatically. They are either proactively raised, based on things like CPU and memory of network devices or virtual devices or even physical servers in our data centers, or reactively raised based on for example device or interface downs.
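As a rough illustration of the proactive and reactive incident raising described here, the following Python sketch checks one polled device sample against thresholds. The threshold values, the `DeviceSample` fields, and the `raise_incidents` function are hypothetical; they are not taken from HP Service Manager or Operations Manager, whose actual rules the discussion does not describe.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the transcript does not state the real values.
CPU_LIMIT = 90.0
MEMORY_LIMIT = 85.0


@dataclass
class DeviceSample:
    name: str
    cpu_percent: float
    memory_percent: float
    interface_up: bool


def raise_incidents(sample: DeviceSample) -> list[str]:
    """Return incident descriptions for one polled device sample."""
    incidents = []
    # Reactive: something is already down.
    if not sample.interface_up:
        incidents.append(f"reactive: {sample.name} interface down")
    # Proactive: resource use has crossed a warning threshold.
    if sample.cpu_percent > CPU_LIMIT:
        incidents.append(f"proactive: {sample.name} CPU at {sample.cpu_percent:.0f}%")
    if sample.memory_percent > MEMORY_LIMIT:
        incidents.append(f"proactive: {sample.name} memory at {sample.memory_percent:.0f}%")
    return incidents
```

A healthy sample yields an empty list, so nothing reaches the service desk unless a rule fires.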

Massive burden

When you've got like 80 percent of all incidents raised automatically, it takes a massive burden off the 24/7 teams and the customer support guys, who are not spending the majority of their time creating incidents but actually working to resolve them.

Gardner: Let's back it up. Five years ago, when you didn't have any integrated systems and you were dealing with lots of data, perhaps spurious data, what did you think? I know that you're an ITIL shop and so you had to bring in that service management mindset, but what did you do in order to bring these products together or even add more products, but without them being also unwieldy in terms of management?

Jackson: It was spurred by really bad data that we had in the systems. We couldn't effectively go forward. We couldn't scale anymore. So, we got the guys at HP to come in and design us a solution based on products that we already had, but with full integration, and add in additional products such as HP Asset Manager and device Discovery and Dependency Mapping Inventory (DDMI).

With the systems that we already had in place, we utilized mainly HP Service Desk. So we decided to take the bold leap to go to Service Manager, which then gave us the ability to integrate it fully into the Operations Manager product and our Network Node Manager product.

Since we had the initial integrations, we've added extra integrations like Universal Configuration Management Database (UCMDB), which gives us a massive overview on how the network is progressing and how it's developing. Coupled with this, we've got Release Control, and we've just upgraded to the latest version of Service Manager 9.2.

So it has given us a huge benefit in terms of process control, how ITIL is related. More importantly, one of the main things that we are going for at the moment is payment card industry (PCI) and ISO 27001 compliance.

For any auditor that comes in, we have a documented set of reports that we can give them. That will hopefully help us get this compliance and maintain it. One of the things as an MSP is that we can be compliant for the customer. The customer can have the infrastructure outsourced to us with the compliance policy in that. We can take the headache of compliance away from our customers.

Gardner: Having that full view and the ability to manage also discretely is not only good business, but it sounds like it's an essential ingredient for the way in which you go to market?

Jackson: More and more these days, we have a lot of solicitors and law firms on our books, and we're getting "are you compliant" as a request before they place business with us. We're finding all across the industry that compliance is a must before any contract is won. So to keep one step ahead of the game, this is something that we're going to have to achieve and maintain, and the HP product set that we have is key in that.

Gardner: I suppose too that a data flow application like Connect-It 4.1 provides an opportunity to not only pull together disparate products and give that holistic view, but also provides that validation for any audits or compliance issues?

Recently upgraded

Jackson: We recently upgraded Connect-It from 4.1 to 9.3, and with that, we upgraded Asset Manager System to 9.3. Connect-It is the glue that holds everything together. It's a fantastic application that you can throw pretty much any data at, from a CSV file, to another database, to web services, to emails, and it will formulate it for you. You can do some complex integrations in that. It will give you the data that you want on the other side and it cleanses and parses, so that you can pass it on to other systems.
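To make the "cleanses and parses" step concrete, here is a minimal Python sketch of the kind of normalisation an integration layer might apply to incoming CSV records before passing them on. It does not use or reproduce Connect-It itself; the column names (`hostname`, `serial`) and the `cleanse_devices` function are assumptions chosen for the example.

```python
import csv
import io


def cleanse_devices(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV device records, normalise fields, and drop duplicates."""
    seen_serials = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical columns: 'hostname' and 'serial'.
        hostname = row["hostname"].strip().lower()
        serial = row["serial"].strip().upper()
        if not hostname or serial in seen_serials:
            continue  # skip records with no hostname or a repeated serial
        seen_serials.add(serial)
        cleaned.append({"hostname": hostname, "serial": serial})
    return cleaned
```

Keying duplicates on the serial number is one possible choice; a real integration would use whatever identifier its system of record treats as authoritative.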

From our DDMI system, right through to our Service Manager, then into our Network Node Manager, we now have a full set of solutions that are held together by Connect-It.

We can discover the device on the network. We can then propagate it into Service Manager. We can add lots of financial details to it from other financial systems outside of the HP product set, but which are easy to integrate. We can therefore provision the circuit and provision the device and add to monitoring automatically, without any human intervention, just by the fact that the device gets shipped to the site.

It gets loaded up with the configuration, and then it's good to go. It's automatically managed right through to the decommissioning stage, or the upgrade stage, where it's replaced by another device. HP systems give us that capability.

Gardner: So these capabilities really do allow you to take on a whole new level of business and service. It sounds like the maintenance of the network, the integrity, and then the automation really helps you go to market in a whole new way than you could have just several years ago.

Jackson: Definitely. One of the key benefits is it gives us a unique calling card for our potential customers. I don’t know of many other MSPs that have such an automated set of technology tools to help them manage the service that they provide to their customers.

Five years ago, this wasn't possible. We had disparate systems and duplicate data held in multiple areas. So it wasn’t possible to have the integration and the level of support that we give our customers now for the new systems and services that we provide.

Gardner: Of course, HP has been engineering more integration into its product and you have been aggressive in adopting some of the newer versions, which is an important element of that, but I have to imagine that there is also a systems integrations function here or professional services. Have you employed any professional services or relied on HP for that?

Jackson: When we originally decided to take the step to upgrade from Service Desk to Service Manager and to get the network discovery product set in, we used HP’s Professional Services to effectively design the solution and help us implement it.

Within six months, we had Service Desk upgraded to Service Manager. We had an asset manager system that was fully integrated with our financials, our stock control. And we also had a Network Discovery toolset that was inventorying our estate. So we had a fully end-to-end solution.

Automatic incidents

Into that, we have helped to develop the Network Operations Management Solution into being able to generate automatic incidents. HP PS services provided a pivotal role in providing us with the kind of solutions that we have now.

Since then, we took that further, because we have very good in-house knowledgeable guys that really understand the HP systems and services. So we've taken it a bit of a step further, and most of the stuff that we do now in terms of upgrades and things is done in-house.

Gardner: It's a very compelling story. I wonder if we have more than just the show-and-tell here. Do we have any metrics of success? Have you been able to point to faster time to resolution, maintaining service-level agreements (SLAs), or something along those lines, that we could help people appreciate what this does, not only functionally in terms of bringing new services to your customers, but also in terms of how you operate and some important metrics that affect your bottom line?

Jackson: Mean time to restore has come down significantly, by way over 15 percent. As I said, there has been zero increase in headcount over our systems and services. We started off with a few thousand network devices and only three or four different products, in data, storage, networks and voice. Now we've got 16 different kinds of product sets, with about 8,000, 9,000 network devices.

In terms of cost saving, and increased productivity, this has been huge. Our 24/7 teams and customer support teams are more proactive in using knowledge bases and Level 1 triage. Resolution of incidents has gone up by 25 percent by customer support teams and level 1 engineers; this enables the level 3 engineers to concentrate on more complex issues.


If you take a Priority 3 or Priority 4 incident, 70 percent of those are now fixed by Level 1 engineers, which was unheard of five or six years ago. Also, we now have a very good knowledge base in the Service Manager tool that we can use for our Level 1 engineers.

In terms of SLAs, we manage the availability of network devices. It gives us a lot more flexibility in how we give these availability metrics to the customers. Because we're business driven by other third party suppliers, we can maintain and get service credits from them. We've also got a fully documented incident lifecycle. We can tell when the downtime has been on these services, and give our suppliers a bit of an ear bashing about it, because we have this information to hand them. We didn’t have that five or six years ago.

Gardner: So, by having event correlation and data to back up your assertions there's much less finger pointing. You know exactly who had dropped the ball.

Jackson: Exactly. With event correlation, we reduced our operations browsers down to just meaningful incidents. We filtered our events from over 100,000 a month to fewer than 20,000; many of these are duplicates and are correlated together. Most events are associated with knowledge base articles in Service Manager and contain instructions to escalate or how to resolve the event, increasingly by a Level 1 engineer.

We can also run automatic actions from these events, and we can send the information to the relevant parties, and also raise an incident and send it directly to the correct assignment groups or teams that are involved in looking after that.
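The duplicate correlation and routing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: events are modelled as `(device, event_type)` pairs, and the `ASSIGNMENT_GROUPS` table and `correlate` function are hypothetical rather than anything from HP Operations Manager or Service Manager.

```python
from collections import Counter

# Hypothetical routing table; real assignment groups are not given in the transcript.
ASSIGNMENT_GROUPS = {"link_down": "network-ops", "high_cpu": "server-ops"}


def correlate(events):
    """Collapse duplicate (device, event_type) pairs into one incident each."""
    counts = Counter(events)  # keeps first-seen order of distinct pairs
    incidents = []
    for (device, event_type), count in counts.items():
        incidents.append({
            "device": device,
            "event": event_type,
            "occurrences": count,
            # Unknown event types fall back to a default group.
            "assigned_to": ASSIGNMENT_GROUPS.get(event_type, "service-desk"),
        })
    return incidents
```

Real correlation would usually also consider time windows and topology; exact-match grouping is the simplest case.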

Internal SLA

For Priority 1 incidents, which by an internal SLA we have 15 minutes to communicate to the customer, we can do that now within two minutes, because the group that’s been assigned the incident are on the ball straight away and they can contact the customer and let them know of the potential or actual problem.

Contacting customers within agreed SLAs and how we can drive our suppliers to provide better service is fantastic because of the information that is available in the systems now. It gives us a lot more heads up on what’s happening around the network.

Gardner: And now that you have had this in place, this integrated lifecycle, end-to-end approach, you've got your UCMDB, is there now, in hindsight, an opportunity to do some analytics, perhaps even refine what your requirements are, and therefore cut your total cost at some level?

Jackson: We're building a lot of information, taken from our financial systems and placing it into our UCMDB and CMDB databases to give us the breakdown of cost per device, cost per month, because now this information is available.

We have a couple of data centers. One of our biggest costs is power usage. Now, by collecting the power information using NNMi, we can break down how much our power is costing per rack, in terms of how many amps have been used over a set period of time, say a week or a month. Previously, we had no way of determining how our power was being used or how much it was actually costing us per rack or per unit.
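The per-rack arithmetic can be worked through in a short Python sketch. The supply voltage (230 V), the tariff, and the function name are assumptions made for the example, and the calculation treats amps times volts as real power, which ignores power factor; none of these figures come from InTechnology.

```python
def rack_power_cost(avg_amps, hours, volts=230.0, price_per_kwh=0.10):
    """Estimate the energy cost of one rack over a period.

    avg_amps:      average current drawn by the rack over the period
    hours:         length of the period in hours (168 for one week)
    volts:         assumed supply voltage (hypothetical default)
    price_per_kwh: assumed tariff per kilowatt-hour (hypothetical default)
    """
    kilowatts = avg_amps * volts / 1000.0
    kilowatt_hours = kilowatts * hours
    return kilowatt_hours * price_per_kwh
```

For a rack averaging 10 amps over one week, that is 2.3 kW for 168 hours, or 386.4 kWh, which at the assumed tariff comes to about 38.64 in whatever currency the tariff is quoted.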

It's given us a massive information boost, and we can really utilize the information, especially in UCMDB, and because it’s so flexible, we can tailor it to do pretty much whatever we want. From this performance information, we can also give our customers extra value reports and statistics that we can charge as a value added managed solution for them.

Gardner: For the benefit of our listeners, now that you've gone through this process, are there any lessons learned, anything you could relay in terms of, "If I had to do this again, I might do blank?" What would you offer to those who would now be testing the waters and embarking on such a journey?

Jackson: One of the main things is to have a clear goal in mind before you start. Plan everything, get it all written down, and have the processes looked at before you start implementing this, because it’s fairly hard to re-engineer if you decided that one of the actual solutions or one of the processes that you have implemented isn’t going to work. Because of the integration of all the systems, you might tend to find that reverse engineering them is a difficult task.

As a company, we decided to go for a clean start and basically said we'd filter all the data, take the data that we actually really required, and start off from scratch. We found that doing it that way, we didn’t get any bad data in there. All the data that we have now has pretty much been cleansed and enriched by the information that we can get from our automated systems, but also by utilizing the extra data that people have put in.

Gardner: Thanks so much. You've been listening now to a sponsored podcast discussion on a UK-based managed service provider, InTechnology, and their journey to provide better information and services for their voice, data, and storage customers. They've employed an automated lifecycle approach and it has benefited them on a number of levels.

Thanks to Ed Jackson, the Operational System Support Manager at InTechnology. Ed, we really appreciated your input.

Jackson: Okay. No problem.

Gardner: And this is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks to our audience, and come back next time.


Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.
