Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Learn more. Sponsor: HP.
Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, we present a sponsored podcast discussion on raising the bar for network performance management.
The IT news headlines are full of incidents of major cloud instances brought down for days, and unfortunately often weeks, with some of the largest of these due to network issues in association with virtualization and storage sprawl. The price in the cloud era for such disruptions is very high and very public.
A big part of the solution to preventing such outages comes from comprehensive, automated, and increasingly integrated network management capabilities. The tasks before network managers have never been more daunting. There are far more devices, hybrid networks, hybrid compute resources, higher levels of virtualization, and there is a need to maintain security and compliance requirements throughout.
What’s more, the pressure to keep cost down and to seek lower cost alternatives for converged infrastructure remains a constant companion to business and IT architects, and therefore an ongoing network challenge.
Into this environment, HP has recently delivered a wide-ranging update to its Network Management Center suite Version 9.1. The emphasis is on a comprehensive lifecycle approach to network management with deep data gathering, automated root cause analytics, and intelligent and proactive response features that enable consistently high performance and network reliability.
I'm here with an HP Network Management Center expert to dig into the new offerings and to better understand why previous fragmented approaches to network performance and stability just won’t hold up for most enterprises. Please join me now in welcoming Ashish Kuthiala, Director of Product Marketing for HP Software’s Network Management Center. Welcome, Ashish.
Ashish Kuthiala: Hi, Dana. Glad to be on this call.
Gardner: This overarching network importance seems to be growing more and more. We're hearing about issues with virtualization and issues with multiple devices. The load on the network is increasing, and the complexity is increasing. Maybe you could help me understand what it is about the new environment that is taxing the older ways of accomplishing a network management function?
Kuthiala: Let’s start with a simple example of a business outage, whether that is your shopping cart, through which you do your business transactions, going down or you come into the office on a Monday morning and your email is really slow or not syncing up to the main server.
When you have a business outage, the blame is put on the IT organization. Then, within the IT organization, you're not sure whether it’s an application issue or it’s a backend database issue. Is it the server that’s not responding or is it the network? When you're not sure, you know the answer first and foremost is that it must be the network. The network is the backbone of any IT organization today.
It’s always the first thing to be blamed and one of the most complex things to diagnose and solve. The network today has become very complex and is becoming more so. With new domains coming in, such as voice over IP (VOIP), webcasts and video traffic, multiprotocol label switching (MPLS) services, unified communications, and cloud computing and virtualization, it just becomes a nightmare to manage your network for your business.
Then, you look at the volume of network devices coming online. Now, everyone wants to be in the instant-on enterprise mode. Everyone has to be connected. Everything has to be connected. Everyone expects immediate gratification and instant results. You have to respond to this opportunity continuously, and "any time, anywhere, any way" is the new tagline for anybody who is working.
Let’s look at the job of the director of network ops in a particular IT organization. Not only does he have to configure, manage, and standardize a network, he has to provision, he has to deliver, and he has to report on it. He has to do it very proactively and he has to do it very strategically at the lowest cost possible.
IT budgets are shrinking or remaining flat, whereas the demands on IT are really going up. It’s estimated that a customer can lose about $70,000 a minute during a network outage, as I'm sure you’ve seen in the recent news. It's a big business inhibitor if the network goes down. The network is what provides the experience to the end user for all the IT services that they use.
Gardner: What about the old ways? Why isn’t the previous mode of network management able to keep up?
Kuthiala: Today, if you were to look into a customer’s IT department managing a network environment, you would often see a war-room-like approach to managing networks. They have multiple tools, legacy approaches, and a lot of band-aids. The inability to tie together what used to be separate domains has become unacceptable.
The inability to cope with the scale and complexity, with the different teams hunched over their different monitors, is what I call the "swiveling chair syndrome." If there is a network outage, you have these 8 or 10 different operators looking at different aspects of the network. They are just swiveling in their chairs, talking to each other and looking for data that should really be on one screen for them to manage. The lack of scalability of such tools just adds to the problem.
Gardner: So they are fragmented and reactive. They're not proactive. Is that right?
Kuthiala: They're very reactive. If your shopping cart goes down during the Christmas shopping season, and a customer tells you about it, that is just unacceptable. By that time, you've lost money, you have damaged your brand, and you have a number of IT people being woken up in their homes at night to resolve this problem -- and you don’t know when this will get resolved.
Gardner: Why is it that an automated approach can work? Now, you have a suite of products. You recognize that you need different tools for different parts of the equation of the problem. I guess it’s abstracting that up to a console or a single view that is a powerful approach here. Is that what’s going on?
Kuthiala: To manage your network today, you really need to understand how your network is constructed from the bottom up, how it ties together, how it changes over time, and how it self-organizes. You need to build that kind of intelligence into your root-cause analysis.
The design of the tools has to be built ground up, based on these decisions. That’s how you need to construct the tools. That’s how they need to be integrated. For an operator, all these need to build upon each other.
It has to be in the right context. It cannot be siloed; that is a nightmare to manage. The desired nirvana for a network team is to reduce the number of point tools needed to manage the various aspects of network management. It has to be proactive, not reactive.
You have compliance management, diagnostics, and change issues that you need to take human error out of, and you need to automate that. You want to reduce the manual effort and the errors, and increase control over your environment. You want to reduce the mean time to repair network outages, and maintain cost optimization as your network grows.
Today for customers, “performance is the new fault." So just because a network device is up and running, and you can ping it, doesn't mean it is providing the quality of service it should to the end user. It’s really the performance that the network is being measured against.
Gardner: So, it’s not so much a red light/green light effect. It’s really what is the level of performance, what are the tradeoffs, how can I remain secure and reliable, and then how can I manage my cost? That’s a fairly big equation.
Kuthiala: Correct. It’s a pretty big equation. It’s all about efficiency, how you reduce your errors, and increase your speed through automation.
Gardner: So, HP has looked at this problem. You've been in the management business for quite some time with a long legacy with OpenView, but you've been building, buying, and partnering for a wider and more comprehensive approach.
Tell me a little bit about the philosophy. I guess there are three aspects to management and a way in which you can broaden your capabilities, but at the same time give a singular view of what’s going on.
Kuthiala: So, just to recap, customers are looking for a solution that's efficient, automated, and secure for them. When they manage a network, they should be able to do things like fault, performance, change, configuration, compliance, trending and reporting, and this ties into their business services.
So, HP looked at this problem. As you know, we've had a long history of about 20 years with the HP OpenView product in network management. As we acquired other companies such as Opsware, they brought in additional tools with them. We looked at the tools and the evolving landscape of the network management domain and, about five years ago, embarked on a re-architecture plan for these products from the ground up.
The approach wasn’t to make these products just work together by putting in connectors, but we wanted them to be integrated from bottom up, from the data level itself, where the data would build upon each other.
Now, as we look at the Network Management Center (NMC), it is a complete portfolio of solutions and tools that lets you do network management in an integrated and automated way.
This really builds upon the HP Network Node Manager i (NNMi), the related special plug-ins that handle complex services such as multicast traffic, VOIP, etc., as well as the network automation piece of it which really helps customers automate and manage their change, compliance, and configuration of network devices that they need to do on an ongoing basis.
Gardner: Ashish, as I recall, you had a pretty large update with this whole Network Node Manager family and a whole set of smart plug-ins. This was about a year ago, Level 9.0. Maybe we should revisit that, before we think about understanding more about 9.1.
Kuthiala: The five-year journey of re-architecting our NMC portfolio completes with the 9.1 release that we are talking about today.
So, 9.0 introduced a number of features including better user interfaces, the ability to scale to large environments, and tying our products together into better functioning solutions. With 9.1, we are building on that.
We've strengthened the ability of our customers to manage cloud services. The most critical capability that a customer must have is to manage the network the same way that they have managed traditional networks, and it doesn’t matter if they have to go across the cloud or are looking at private or public clouds.
Gaining visibility into the network elements, whether they are local or off-premise, and into the health and quality of the cloud services that are being delivered, is the most important step. Can I reach my device? Is it healthy? Is it performing to the expected levels of business needs?
And, of course, configuration compliance management of these devices across the cloud is very important, and corrective actions and rollbacks are very important. Our tools are able to do that across different environments.
The 9.1 release is also focused on the managed service provider (MSP) market's needs. There is a big trend of IT outsourcing to MSPs, and one of the things that customers want to outsource is network management services. So this is a big, growing market, and our MSPs need platforms to manage their customers' network environments in a way that maximizes their profit.
They need to scale and grow with their customer in expanding network environments, reduce their hardware spend and their training costs, as well as grow their revenues and create new lines of business, as their own customers move to new and complex services.
For example, a customer might go from traditional phones to IP telephones, and at that point, the MSP has to manage that aspect of their customer’s environment as well, and they don’t want at this point to buy a new tool.
The size of the customer's network might increase, and you don’t want to buy another server, another set of tools and deploy another set of operators to manage that.
We have introduced multi-tenancy capability and security groups that allow our customers to separate their data and views into secure partitions. This helps them manage multiple customers, departments or sites per single software instance, driving down their cost and giving them a flexible architecture.
We’ve also done a lot of work on the performance-based, time-based thresholds for better alerting. What this means is that the performance data is in the context of the network topology, providing a unique point of your fault monitoring. It gives them proactive notification of performance degradation, so they can fix it proactively and guarantee service delivery levels.
We've also increased the number of months that the data is retained. It's up to 13 months now, which allows you to do forecasting and trending. This is a sufficient data retention period for compliance requirements for real-time and historical data, and allows a very efficient analysis.
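To illustrate what a time-based performance threshold means in practice, here is a minimal sketch in Python. It is not how NNMi implements thresholds; the `Sample` type, the `sustained_breach` function, and the 90 percent and 120 second figures are all assumptions made for illustration. The idea is that an alert is raised only when a metric stays above a limit for a sustained period, so a momentary spike is ignored.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since monitoring started (hypothetical unit)
    value: float      # e.g. interface utilization in percent


def sustained_breach(samples, threshold, min_duration):
    """Return True if the value stays above `threshold` for at least
    `min_duration` seconds of consecutive samples."""
    breach_start = None
    for s in sorted(samples, key=lambda s: s.timestamp):
        if s.value > threshold:
            if breach_start is None:
                breach_start = s.timestamp  # first sample of a new breach
            if s.timestamp - breach_start >= min_duration:
                return True
        else:
            breach_start = None  # the metric recovered, so start over
    return False


# A single spike followed by recovery does not raise an alert.
spike = [Sample(0, 95), Sample(60, 50), Sample(120, 96)]
# Utilization that stays high for two minutes does.
sustained = [Sample(0, 95), Sample(60, 96), Sample(120, 97)]
print(sustained_breach(spike, threshold=90, min_duration=120))
print(sustained_breach(sustained, threshold=90, min_duration=120))
```

A device in the first case would still answer a ping in both situations, which is the point of treating performance, and not just reachability, as the thing to alert on.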
Our user interface (UI) has been enhanced based on the feedback we’ve gotten from customers. The common look and feel UI across all the products and our solution set ensures lower training cost -- train once, leverage across all these tools.
The UIs show relevant contextual information on the nodes and incidents they're managing, giving them a lot of operational efficiency. The breadcrumb history and the easy navigation with right-click menus also allows the operators to get to the root cause more quickly, making them much more efficient and improving the time to resolution.
The analysis pane shows you a number of system components and helps you get key information, including availability and performance graphs, really quickly.
Gardner: In some of these high-profile outages that we've had recently, it seems that they were doing updates and that caused the cascading or spiraling effect and ultimately brought the network down. For these MSPs their credibility is on the line, a lot of the money could be lost, and their service level agreements (SLAs) can't be met, and so forth.
What is it about your suite and your comprehensive approach that could help ameliorate something like that? Are you doing updates, constantly and in a dynamic, constantly changing environment? Tell me how this could be prevented in the future?
Kuthiala: A network constantly needs updates, whether it's configuration updates or being in compliance with a number of different policies -- Sarbanes-Oxley (SOX) or the Health Insurance Portability and Accountability Act (HIPAA), and government regulations.
Typically, customers have a set of people who use multiple tools or manually log into a number of these devices and do these configuration changes manually. This is very dangerous. One, there is human error involved. Second, when something goes wrong, you don't know what has gone wrong, and you are scrambling to fix it. Think about doing this across 50,000, 60,000, 70,000 devices in your network.
Our network automation capabilities allow customers to automatically make these changes through our tools. As they implement these changes, it takes minutes or hours, versus days, to keep these devices configured to the latest and greatest configurations and in compliance.
Think about when you are on the 59,000th device that you are updating and you realize there is an error. This was not the right thing to do, and you need to roll back. If you're doing this manually, you're spending many hours fixing the error while your business is suffering during that time. Our automation capabilities help customers; with a few clicks of buttons they are able to automate all of this.
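The rollback scenario can be sketched in a few lines of Python. This is an illustrative sketch only, not HP's network automation product: `apply_with_rollback`, `push`, and `verify` are hypothetical names, and a real tool would talk to devices over a management protocol instead of updating a dictionary. The sketch snapshots each device's configuration before changing it, and on the first failed check restores every device it has touched.

```python
def apply_with_rollback(devices, new_config, push, verify):
    """Push `new_config` to each device; on the first failure, restore
    every device changed so far.

    devices: dict of device name -> current config (updated in place)
    push:    callable(name, config) that applies a config (hypothetical)
    verify:  callable(name) -> bool, True if the device is healthy
    Returns (succeeded, names of devices that were rolled back).
    """
    snapshots = {}  # device name -> config it had before the change
    for name in list(devices):
        snapshots[name] = devices[name]
        push(name, new_config)
        devices[name] = new_config
        if not verify(name):
            # Undo in reverse order so the latest change is reverted first.
            for changed, old in reversed(list(snapshots.items())):
                push(changed, old)
                devices[changed] = old
            return False, list(snapshots)
    return True, []


devices = {"r1": "v1", "r2": "v1", "r3": "v1"}
log = []
ok, rolled_back = apply_with_rollback(
    devices,
    "v2",
    push=lambda name, cfg: log.append((name, cfg)),
    verify=lambda name: name != "r2",  # pretend r2 fails its health check
)
print(ok, rolled_back, devices)
```

Because the snapshots are taken automatically, the rollback is a loop over saved state and does not depend on anyone remembering what each device looked like before the change.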
Today, customers might be looking at a number of incidents -- 10,000 to 15,000 incidents. For example, if somebody yanks a LAN cord out and puts it back in, what really has happened is the interface has gone down and come back up. And now that is flagged as an incident or an event that the operator has to pay attention to.
With our root cause analysis engine, and the ability to map the topology dynamically in a spiral discovery fashion, the network topology is always up-to-date. The root cause analysis engine helps figure out whether this is an incident that needs to be paid attention to or not, auto-resolving some of that.
The incidents that boil up to the operators are meaningful, and therefore are reduced in number to those that are actionable. We have had customers whose incidents have been reduced from 10,000-12,000 down to 400, and only about 100 of those have to be acted upon and escalated to the next level of management.
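The LAN cord example can be illustrated with a small Python sketch of flap suppression. This is an assumption-laden toy, not the HP root cause analysis engine, which also reasons over the discovered topology: `suppress_flaps`, the event tuple layout, and the 30 second window are all invented for illustration. A "down" event that is followed by an "up" on the same interface within the window is treated as resolved and never reaches the operator.

```python
def suppress_flaps(events, window):
    """Reduce raw interface events to actionable incidents.

    events: list of (timestamp, interface, state) tuples sorted by time,
            where state is "down" or "up" (hypothetical event format)
    window: seconds within which a down/up pair counts as a harmless flap
    Returns a sorted list of (timestamp, interface) for outages that
    lasted longer than `window` or never recovered.
    """
    incidents = []
    pending = {}  # interface -> timestamp of its unresolved "down" event
    for ts, iface, state in events:
        if state == "down":
            pending.setdefault(iface, ts)  # keep the earliest down time
        elif state == "up" and iface in pending:
            down_ts = pending.pop(iface)
            if ts - down_ts > window:
                incidents.append((down_ts, iface))  # a real outage
            # otherwise: auto-resolved flap, nothing to report
    # Interfaces that are still down at the end are always actionable.
    incidents.extend((ts, iface) for iface, ts in pending.items())
    return sorted(incidents)


events = [
    (0, "eth0", "down"), (5, "eth0", "up"),      # cable pulled and replugged
    (10, "eth1", "down"), (100, "eth1", "up"),   # 90-second outage
    (110, "eth2", "down"),                       # never came back
]
print(suppress_flaps(events, window=30))
```

Five raw events become two incidents here, which is the same kind of reduction, on a much smaller scale, as the drop from thousands of incidents to a few hundred described above.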
Automation really takes a lot of the work out of your hands and enables you to fix errors very proactively, and if there is a mistake, fix it right away with a few clicks.
Gardner: Configuration management is something we’ve heard about over the years and often it has been applied to the servers and the application workloads. Are we talking about the same type of configuration management or do you need to do it in an entirely different way on the network?
And then second, your configuration and your management center capabilities are part of the larger business service management suite or set of products and services at HP. Is there a commonality between configuration management of the network and configuration management of some of the other major aspects of a converged infrastructure?
Kuthiala: I'm talking very specifically about the configuration of network devices. The software that your network device comes with is the key differentiator in how they act, and the intelligence that they provide. So this has to be not only managed really well, but there are patches and upgrades, just as you have software patches and upgrades on your servers. These have to be managed. Sometimes, there are government regulations or company regulations that you want to propagate across these devices.
But the tie to the business service management set of tools, or the suite, stems from the fact that, when you look at it from a business service availability aspect, it’s not just about the network. There are servers, there are applications, and they are all tied together. For example, if an application business service is not working, do you know if it’s the server? Do you know if it’s the application? Do you know if it is the network?
Our Business Service Management offering ties in these aspects through our runtime service model. This ties your network, to your application, to your server and is able to give your business a look into how your business service is going to be affected by the failure of any one of these infrastructure elements.
Gardner: Okay. I have seen you referred to as "application-aware network management." Maybe you could help me better understand. What do you mean by that?
Kuthiala: If you go back to the basic premise, the network is there to transit the traffic for applications themselves. It's essential to understand what type of traffic is flowing on your network. This gives you the ability to optimize your network performance and network resiliency.
The true measure of how an application is running is what a user cares about. He doesn’t really care about how the network is running. Your network has to be very application-aware so that you can tune it to the desired performance and resiliency that you need.
Gardner: Now, we've been talking about network performance management in the context of sort of firefighting and preventing outages, but as I mentioned earlier, cost is still such an important element here.
In using your approach to network management, are there some efficiency or total cost of ownership (TCO) benefits, when you have better insight into the network? When you have these root cause analysis data points available, when you have that comprehensive view, can you then perhaps start tweaking and refining the way in which your network operates, in such a way that, sure, you're going to keep availability and performance? Can you also find ways of developing efficiencies and therefore cut total cost?
Kuthiala: A customer that I met last year was on a prior version of our toolset and also had a number of other vendors' tools to manage his network.
We talked about the new NNMi platform, and the customer’s response was, "You know, I have seven or eight people dedicated to managing my network. I have a toolset that works and I'm happy. And, I have a number of other IT projects that I need to attend to. I do understand the value of going to the new platform, but I will do that next year."
As we talked, I was able to articulate the value of how they could reduce the number of operators invested in managing the network, the number of resources, the number of different contracts they had, the server footprint, the cooling costs, etc. The customer agreed that it made a lot of sense to upgrade.
The customer came back to me in about three weeks and said, "The upgrade was easy, we got it up and running. I now have only two people managing my network. I've been able to free six people to put them on other critical IT projects. There has been a lot of savings for me and the ability to redeploy my resources has been tremendous." So, I think a lot of customers of ours are actually realizing tremendous value from taking this new approach.
The other case that I would like to share with you is about HP Enterprise Services. They were looking to deploy 10,000 new remote workers, where people would be able to work from their remote offices or homes. And, per worker that they would deploy, they would have to invest a couple of man hours on their end with somebody on the phone sitting and getting people to configure their new equipment to work with the corporate environment in a seamless fashion.
By using automation tools, they were able to save about two hours per deployment per worker as they rolled this program out, and they deployed about 10,000 workers in a matter of a few weeks versus months. They have had multiple successes with automation across their entire system and deployed it across 350-plus clients to reduce their costs, increase their efficiencies, and reduce errors.
Gardner: And these economic issues are very important to everyone, but I suppose they are especially important to those MSPs, where their margins are lower and their costs, when they cut them, can go directly to the bottom line.
Kuthiala: Absolutely. It enables them to maximize their profits. For example, the new multi-tenancy capabilities enable them to manage multiple customers from a single software instance.
It helps them drive down their ongoing hardware, software, and headcount costs that they can redeploy somewhere else. The scalability of our products is immense. We're able to manage 25,000 devices or up to two million interfaces from a single server instance.
They can partition their customers in their own secure environments and use security groups. So, they can meet their customer SLAs but drive down their costs by going to a single instance of the software.
Gardner: Now Network Management Center is a fairly significant set of different products, but most people already have something in place. So this is not a matter of starting greenfield. This is a matter of coexistence, migration, and transformation. How do you get started? What’s the typical scenario for working with a Network Management Center set by bringing it into an environment where you’ve already got installed management?
Kuthiala: Most customers today have in place something to monitor their networks, but a lot of customers have not automated their configuration, compliance, and diagnostic capabilities that we talked about.
So, let me start with that. We've seen a trend in our customer base where they buy smaller node packs to manage a small number of devices with our automation capabilities. Once they have put that in place, they start to see other efficiency use cases that they can achieve using our network automation capabilities.
We observe that these customers come back and buy more licenses for managing a greater number of network devices. So, that’s almost like a greenfield opportunity here.
But, when we look at most customers managing their networks and doing performance monitoring, for example, if they have an instance of our software, it’s an in-place upgrade. We offer a dual entitlement and run a parallel program that allows customers to seamlessly set up another parallel environment, bring the network up there, start to manage it, and seamlessly shift.
We’ve had an instance of a customer in the EMEA region, where they were testing our latest software and running it in parallel to see how it was functionally different and what effect on productivity it would have on their operators. A couple of weeks went by and their senior management started getting escalations for network problems.
Senior management turned to the network operations team and asked, "We have all these incidents showing up. What is going on? Is something wrong?"
Almost sheepishly, the network operations team had to acknowledge that they were testing the new platform and had completely forgotten about the old tool, which they needed to shut down, because the new platform ignored the incidents that were not meaningful. They had “accidentally” migrated to the new platform and were managing the network much more efficiently.
A lot of our customers use this approach to migrate to the new platform, and of course, our approach is modular. Start with the core product and add the special plug-ins to manage your IP telephony, MPLS, or multicast capabilities.
Gardner: Okay, for those folks, thinking about evaluating these entry points and looking at the wider benefits of an automated managed approach to configuration on the networks, do you have any landing pages, vanity pages, whitepapers? Where can people go for more detail and more information?
Kuthiala: We have an hp.com page, which is www.hp.com/go/nmc for downloading trial software, reading whitepapers, customer case studies, product capabilities and features. That’s a good starting point.
We also blog about customer experiences and the stories they share with us as well.
To see the HP Automated Network Management (ANM) Solution in action, you can watch a short overview and the ANM 9.10 Video Demo. This recording will explain the NMC components that make up the ANM solution and walk you through a use case to demonstrate the automated capabilities of HP Automated Network Management 9.10.
Gardner: You’ve been listening to a sponsored podcast discussion on raising the bar for network performance management and learning more details about HP’s new Network Management Center 9.1 release. I’d like to thank our guest. We’ve been here with Ashish Kuthiala. He is the Director of Product Marketing for HP Software’s Network Management Center. Thank you, Ashish.
Kuthiala: Thank you, Dana.
Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks for listening, and come back next time.
Transcript of a sponsored podcast on the increasing demands placed on IT network managers and the tools available to help them. Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.