Tuesday, June 15, 2010

HP Data Protector, a Case Study on Scale and Completeness for Total Enterprise Data Backup and Recovery

Transcript of a BriefingsDirect podcast from the HP Software Universe Conference in Washington, DC on backing up a growing volume of enterprise data using HP Data Protector.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to a special BriefingsDirect podcast series coming to you from the HP Software Universe 2010 Conference in Washington, DC. We're here the week of June 14, 2010 to explore some major enterprise software and solutions trends and innovations making news across HP's ecosystem of customers, partners, and developers.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, and I'll be your host throughout this series of HP-sponsored Software Universe Live Discussions.

Our topic for this conversation focuses on the challenges and progress in conducting massive and comprehensive backups of enterprise live data, applications, and systems. We'll take a look at how HP Data Protector is managing and safeguarding petabytes of storage per week across HP's next-generation data centers.

The case study sheds light on how enterprises can consolidate their storage and backup efforts to improve response and recovery times, while also reducing total costs.

To learn more about high-performance enterprise scale storage and reliable backup, please join me in welcoming Lowell Dale, a technical architect in HP's IT organization. Welcome to BriefingsDirect, Lowell.

Lowell Dale: Thank you, Dana.

Gardner: Lowell, tell me a little bit about the challenges that we're now facing. It seems that we have ever more storage and requirements around compliance and regulations, as well as the need to cut cost. Maybe you could just paint a picture for me of the environment that your storage and backup efforts are involved with.

Dale: One of the things that everyone is dealing with these days is pretty common and that's the growth of data. Although we have a lot of technologies out there that are evolving -- virtualization and the globalization effect with running business and commerce across the globe -- what we're dealing with on the backup and recovery side is an aggregate amount of data that's just growing year after year.

Some of the things that we're running into are the effects of consolidation. For example, we end up trying to back up databases that are getting larger and larger. Some of the applications and servers that consolidate will end up being more of a challenge for some of the services such as backup and recovery. It's pretty common across the industry.

In our environment, we're running about 93,000-95,000 backups per week with an aggregate data volume of about 4 petabytes of backup data and 53,000 run-time hours. That's about 17,000 servers worth of backup across 14 petabytes of storage.
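To put those weekly totals in perspective, here is a rough back-of-the-envelope calculation of the averages they imply. It is only a sketch: the midpoint of the 93,000-95,000 range, decimal units (1 PB = 10^15 bytes), and an even spread of load across the week are assumptions, not figures from the interview.

```python
# Back-of-the-envelope averages from the weekly figures quoted above.
# Assumptions (not from the interview): midpoint of the 93,000-95,000 range,
# decimal units (1 PB = 10**15 bytes), and load spread evenly over 7 days.

BACKUPS_PER_WEEK = 94_000
DATA_PER_WEEK_BYTES = 4 * 10**15
RUNTIME_HOURS_PER_WEEK = 53_000
SERVERS = 17_000
SECONDS_PER_WEEK = 7 * 24 * 3600


def weekly_averages():
    """Return the per-backup and per-server averages implied by the totals."""
    return {
        "gb_per_backup": DATA_PER_WEEK_BYTES / BACKUPS_PER_WEEK / 10**9,
        "minutes_per_backup": RUNTIME_HOURS_PER_WEEK / BACKUPS_PER_WEEK * 60,
        "backups_per_server": BACKUPS_PER_WEEK / SERVERS,
        "sustained_gb_per_second": DATA_PER_WEEK_BYTES / SECONDS_PER_WEEK / 10**9,
    }


if __name__ == "__main__":
    for name, value in weekly_averages().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, the average backup is roughly 43 GB and runs for about 34 minutes, each server is backed up between five and six times a week, and the environment as a whole moves on the order of 6.6 GB per second around the clock.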

Gardner: Tell me a bit about applications. Is this a comprehensive portfolio? Do you do triage and take some apps and not others? How do you manage what to do with them and when?

Slew of applications

Dale: It's pretty much every application that HP's business is run upon. It doesn’t matter if it's enterprise warehousing or data warehousing or if it's internal things like payroll or web-facing front-ends like hp.com. It's the whole slew of applications that we have to manage.

Gardner: Tell me what the majority of these applications consist of.

Dale: Some of the larger data warehouses we have are built upon SAP and Oracle. You've got SQL databases and Microsoft Exchange. There are all kinds of web front-ends, whether it’s with Microsoft IIS or any type of Apache. There are things like SharePoint Portal Services, of course, that have database back-ends that we back up as well. Those are just a few that come to mind.

Gardner: What are the major storage technologies that you are focusing on that you are directing at this fairly massive and distributed problem?

Dale: The storage technologies are managed across two different teams. We have a storage-focused team that manages the storage technologies. They're currently using HP Surestore XP Disk Array and EVA as well. We have our Fibre Channel networks in front of those. In the team that I work on, we're responsible for the backup and recovery of the data on that storage infrastructure.

We're using the Virtual Library Systems that HP manufactures as well as the Enterprise System Libraries (ESL). Those are two predominant storage technologies for getting data to the data protection pool.

Gardner: One of the other trends, I suppose, nowadays is that backup and recovery cycles are happening more frequently. Do you have a policy or a certain frequency that you are focused on, and is that changing?

Dale: That's an interesting question, because often times, you'll see some induced behavior. For example, we back up archive logs for databases, and often, we'll see a large increase in those. As the volume and transactional growth goes up, you’ll see the transactional log volume and the archive log volume backups increase, because there's only so much disk space that they can house those logs in.

You can say the same thing about any transactional type of application, whether it's messaging, which is Exchange with the database, with transactional logs, SQL, or Oracle.

So, we see an increase in backup frequency around logs to not only mitigate disk space constraints but to also mitigate our RTO, or RPO I should say, and how much data they can afford to lose if something should occur like logical corruption or something akin to that.
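The two constraints described here, log disk space and RPO, can be sketched as a simple calculation. This is a hypothetical illustration only: the function name, the headroom factor, and the example numbers are assumptions and do not come from HP's environment or from Data Protector.

```python
def log_backup_interval_hours(log_space_gb, log_rate_gb_per_hour,
                              rpo_hours, headroom=0.8):
    """Longest interval between log backups that satisfies both constraints.

    Hypothetical model: logs must be backed up (and freed) before they fill
    `headroom` of the disk set aside for them, and at least once per RPO
    window so that no more than `rpo_hours` of transactions can be lost.
    """
    if log_rate_gb_per_hour <= 0:
        # No log growth: only the RPO limits the interval.
        return rpo_hours
    hours_until_full = headroom * log_space_gb / log_rate_gb_per_hour
    return min(rpo_hours, hours_until_full)
```

With, say, 200 GB of log space and a 4-hour RPO, a log rate of 25 GB per hour leaves the RPO as the limiting factor. If transactional growth doubles the rate to 50 GB per hour, disk space becomes the limit and the interval has to shrink to about 3.2 hours, which is the induced increase in backup frequency described above.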

Gardner: Let's take a step back and focus on the historical lead-up to this current situation. It's clear that HP has had a lot of mergers and acquisitions over the past 10 years or so. That must have involved a lot of different systems and a lot of distribution of redundancy. How did you start working through that to get to a more comprehensive approach that you are now using?

Dale: Well, if I understand your question, you're talking about the effect of us taking on additional IT and consolidating, or are you talking about it from a product standpoint as well?

Gardner: No, mostly on your internal efforts. I know there's been a lot of product activities as well, but let's focus on how you manage your own systems first.

Simplify and reduce

Dale: One of the things that we have to do at the scope or the size that we get to manage is that we have to simplify and reduce the amount of infrastructure. It’s really the amount of choices and configurations that are going on in our environment. Obviously, you won't find the complete set or suite of HP products in the portfolio that we are managing internally. We have to minimize how many different products we have.

One of the first things we had to do was simplify, so that we could scale to the size and scope that we have to manage. You have to find and simplify configuration and architecture as much as possible, so that you can continue to grow out scale.

Gardner: Lowell, what were some of the major challenges that you faced with those older backup systems? Tell me a bit more about this consolidation journey.

Dale: That's a good question as well. Some of the new technologies that we're evolving, such as virtual tape libraries, were among the things that we had to figure out. What was the use case scenario for virtual tape? It's not easy to switch from old technology to something new and go 100 percent at it. So we had to take a step-wise approach on how we adopted the virtual tape library and what we used it for.

We first started with a minimal number of use cases and, little by little, we started learning what it was really good for. We’ve evolved the use case even more, so that it will move forward in our next-generation design. That’s just one example.

Gardner: And that virtual tape is to replace physical tape. Is that right?

Dale: Yes, really to supplement physical tape. We're still using physical tape for certain scenarios where we need the data mobility to move applications or enable the migration of applications and/or data between disparate geographies. We'll facilitate that in some cases.

Gardner: You mentioned a little earlier on the whole issue of virtualization. You're servicing quite a bit more of that across the board, not just with applications, but storage and networks even.

Tell me a bit more about the issues of virtualization and how that provided a challenge to you, as you moved to these more consolidated and comprehensive storage and backup approaches?

Dale: One of the things with virtualization is that we saw something that we did with storage and utility storage. We made it such that it was much cheaper than before and easy to bring up. It had the "If you build it, they will come" effect. So, one of the things that we may end up seeing is an increase in the number of operating systems (OSs) or virtual machines (VMs) that we see out there. That's the opposite of the consolidation effect, where you have, say, 10 one-terabyte databases consolidated into one to reduce the overhead.

Scheduling overhead

With VMs increasing and the use case for virtualization increasing, one of the challenges is trying to work with scheduling overhead tasks. It could be anywhere from a backup to indexing to virus scanning and whatnot, and trying to find out what the limitations and the bottlenecks are across the entire ecosystem to find out when to run certain overhead and not impact production.

That’s one of the things that’s evolving. We are not there yet, but obviously we have to figure out how to get the data to the data protection pool. With virtualization, it just makes it a little bit more interesting.

Gardner: Lowell, given that your target is moving -- as you say, you're a fast growing company and the data is exploding -- how do you roll out something that is comprehensive and consolidating, when at the same time your target is a moving object in terms of scale and growth?

Dale: I talked previously about how we have to standardize and simplify the architecture and the configuration, so that when it comes time to build that out, we can do it en masse.

For example, quite a few years ago, it used to take us quite a while to bring up a backup infrastructure that would facilitate that service need. Nowadays, we can bring up a fairly large scope environment, like an entire data center, within a matter of months if not weeks. This is how long it would take us. The process from there moves towards how we facilitate setting up backup policies and schedules, and even that’s evolving.

Right now, we're looking at ideas and ways to automate that, so that when a server plugs in, basically it’ll configure itself. We're not there yet, but we are looking at that. Some of the things that we’ve improved upon are how we build out quickly and then turn around and set up the configurations, as that business demand is then turned around and converted into backup demand, storage demand, and network demand. We’ve improved quite a bit on that front.

Gardner: And what version of Data Protector are you using now, and what are some of the more interesting or impactful features that are part of this latest release?

Dale: Data Protector 6.11 is the current release that we are running and deploying in our next generation. Some of the features with that release that are very helpful to us have to do with checkpoint recoveries.

For example, if the backup or resource should fail, we have the ability with automation to go out and have it pick up where it left off. This has helped us in multifold ways. If you have a bunch of data that you need to get backed up, you don’t want to start over, because it’s going to impact the next minute or the next hour of demand.

Not only that, but it’s also helped us keep our backup success rates up and our tickets down. Instead of bringing a ticket to light for somebody to go look at, it will attempt a checkpoint recovery a few times. After so many attempts, we’ll bring the issue to light so that someone can look at it.
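The resume-then-escalate behavior described here can be illustrated with a small sketch. It is not Data Protector's implementation or API: the `BackupSession` class, the `run_with_checkpoint_retry` function, and the limit of three attempts are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BackupSession:
    """Hypothetical session: a list of objects plus a resume point."""
    objects: List[str]
    checkpoint: int = 0  # index of the next object still to be backed up
    completed: List[str] = field(default_factory=list)


def run_with_checkpoint_retry(session: BackupSession,
                              backup_one: Callable[[str], None],
                              max_attempts: int = 3) -> str:
    """Back up each object once, resuming at the checkpoint after a failure.

    Returns "ok" when every object is backed up, or "ticket" once the
    session has failed `max_attempts` times in total and needs a person.
    """
    failures = 0
    while session.checkpoint < len(session.objects):
        obj = session.objects[session.checkpoint]
        try:
            backup_one(obj)
        except OSError:
            failures += 1
            if failures >= max_attempts:
                return "ticket"  # escalate instead of retrying forever
            continue  # retry from the checkpoint, not from the beginning
        session.completed.append(obj)
        session.checkpoint += 1
    return "ok"
```

The point of the design is that a failure part-way through does not repeat the objects already written, and a ticket is raised only after the automatic retries are used up.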

Gardner: With this emphasis on automation over the manual, tell us about the impact that’s had on your labor issues, and if you’ve been able to take people off of these manual processes and move them into some, perhaps more productive efforts.

Raising service level

Dale: What it’s enabled us to do is really bring our service level up. Not only that, but we're able to focus on other things that we weren’t able to focus on before. So one of the things is there’s a successful backup.

Being able to bring that backup success rate up is key. Some of the things that we’ve done with architecture and the product -- just the different ways of doing process -- have helped with that backup success rate.

The other thing that it's helped us do is that we’ve got a team now, which we didn’t have before, that’s just focused on analytics, looking at events before they become incidents.

I’ll use an analogy of a car that’s about to break-down, and the check-engine light comes on. We're able to go and look at that prior to the car's breaking down. So, we're getting a little bit further ahead. We're going further upstream to detect issues, before they actually impact our backup success rate or SLAs. Those are just a couple of examples there.

Gardner: How many people does it take to run these petabytes of recovery and backup through your next-generation data center? Just give us a sense of the manpower.

Dale: On the backup and recovery and media management side, we’ve got about 25 people total, spread between engineering and operational activities. Basically, their focus is on backup and recovery and media management.

Gardner: Let’s look at some examples. Can you describe a time when you’ve needed to do very quick or even precise recovery, and how did this overall architectural approach and consolidation efforts help you on that?

Dale: We’ve had several cases where we had to recover data and go back to the data protection pool. That happens monthly in fact. We have a certain rate of restores that we do per month. Some of those are to mitigate data loss from logical corruption or accidental deletion.

But, we also find the service being used to do database refreshes. So, we’ll have these large databases that they need to make a copy of from production. They end up getting copied over to development or test.

The current technology we are using, the current configuration, with the virtual tape libraries and the archive logs, has really enabled us to get the data backed up quickly and restored quickly. That’s been exemplified several times with either database copying or database recoveries, when those types of events do occur.

Gardner: I should think these are some very big deals, when you can deliver the recovered data back to your constituents, to your users. That probably makes their day.

Dale: Oh yes, it does save the bacon at the end of the day.

Gardner: Perhaps you could outline, in your thinking, the top handful of important challenges that Data Protector addresses for you at HP IT. What are the really important paybacks that you're getting?

Object copy

Dale: I’ve mentioned checkpoint recovery. There are also some of the things that we’ve been able to do with object copy, which has allowed us to balance capacity between our virtual tape libraries and our physical tape libraries. In our first generation design, we had enough capacity on the virtual libraries to hold only a subset of the total data.

Data Protector has a very powerful feature called object copy. That allowed us to maintain our retention of data across two different products or technologies. So, object copy was another one that was very powerful.

There are also a couple of things around the ability to do the integration backups. In the past, we were using some technology that was very expensive in terms of disk space on our XPs, using split-mirror backups. Now, we're using the online integrations for Oracle or SQL and we're also getting ready to add SharePoint and Microsoft Exchange.

Now, we're able to do online backups of these databases. Some of them are upwards of 23 terabytes. We're able to do that without any additional disk space and we're able to back that up without taking down the environment or having any downtime. That’s another thing that’s been very helpful with Data Protector.

Gardner: Lowell, before we wrap up, let's take a look into the future. Where do you see the trends pushing this now? I think we could safely say that there's going to still be more data coming down the pike. Are there any trends around cloud computing, mobile business intelligence, warehousing efforts, or real-time analysis that will have an impact on some of these products and processes?

Dale: With some of the evolving technologies and some of the things around cloud computing, at the end of the day, we'll still need to mitigate downtime, data loss, logical corruption, or anything that would jeopardize that business asset.

With cloud computing, if we're using the current technology today with tape-based backup, we have to get the data copied over to a data protection pool. There still would be the same approach of trying to get that data. If there is anything to keep up with these emerging technologies, for example, maybe we approach data protection a little bit differently and spread the load out, so that it’s somewhat transparent.

Some of the things we need to see and we may start seeing in the industry are load management and how loads from different types of technologies talk to each other. I mentioned virtualization earlier. Some of the tools with content-awareness and indexing have overhead associated with them.

I think you're going to start seeing these portfolio products talking to each other. They can schedule when to run their overhead function, so that they stay out of the way of production. It’s just a couple of challenges for us.

We're looking at new configurations and designs that consolidate our environment. So we're looking at reducing our environment by 50-75 percent just by redesigning our architecture and making available more resources that were tied up before. That's one goal that we're working on right now. We're deploying that design today.

And then, there's configuration and capacity management. This stuff is still evolving, so that we can manage the service level that we have today, keep that service level up, bring the capital down, and keep the people required to manage it down as well.

Gardner: Great. I'm afraid we're out of time. We've been focusing on the challenges and progress of conducting massive and comprehensive backups of enterprise-wide data and applications and systems. We've been joined by Lowell Dale, a technical architect in HP's IT organization. Thanks so much, Lowell.

Dale: Thank you, Dana.

Gardner: And, thanks to our audience for joining us for this special BriefingsDirect podcast coming to you from the HP Software Universe 2010 Conference in Washington, DC. Look for other podcasts from this HP event on the hp.com website under HP Software Universe Live podcast, as well as through the BriefingsDirect Network.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this series of HP-sponsored Software Universe live discussions. Thanks again for listening and come back next time.

Transcript of a BriefingsDirect podcast from the HP Software Universe Conference in Washington, DC on backing up a growing volume of enterprise data using HP Data Protector. Copyright Interarbor Solutions, LLC, 2005-2010. All rights reserved.

You may also be interested in:

Delta Air Lines Improves Customer Self-Service Apps Quickly Using Quality Assurance Tools

Transcript of a BriefingsDirect podcast with Delta Air Lines development leaders on gaining visibility into application testing to improve customer self-service experience.

Dana Gardner: Hello, and welcome to a special BriefingsDirect podcast series, coming to you from the HP Software Universe 2010 Conference in Washington, D.C. We're here the week of June 14, 2010, to explore some major enterprise software and solutions trends and innovations making news across HP’s ecosystem of customers, partners, and developers.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, and I'll be your host throughout this series of HP-sponsored Software Universe Live discussions.

Our customer case study today focuses on Delta Air Lines and the use of HP quality assurance products for requirements management as well as mapping the test cases and moving into full production. We are here with David Moses, Manager of Quality Assurance for Delta.com and its self service efforts. Thanks for joining us, David.

David Moses: Thank you, very much. Glad to be here.

Gardner: We're also here with John Bell, a Senior Test Engineer at Delta. Welcome, John.

John Bell: Thank you.

Gardner: Tell me about the market drivers. What is the problem set when it comes to managing the development process around requirements and then quality and test out through your production? What are the problems that you're generally facing these days?

Moses: Generally, the airline industry, along with a lot of other industries I'm sure, is highly competitive. We have a very, very quick, fast-to-market type environment, where we've got to get products out to our customers. We have a lot of innovation that's being worked on in the industry and a lot of competing channels outside the airline industry that would also like to get at the same customer set. So, it's very important to be able to deliver the best products you can as quickly as possible. "Speed Wins" is our motto.

Gardner: What is it about the use of some of the quality assurance products that helps you pull off that dual trick of speed, but also reliability and high quality?

Moses: The one thing I really like about the HP Quality Center suite especially is that your entire software development cycle can live within that tool. Whenever you're using different tools to do different things, it becomes a little bit more difficult to get the data from one point to another. It becomes a little bit more difficult to pull reports and figure out where you can improve.

Data in one place

What you really want to do is get all your data in one place, and Quality Center allows you to do that. We put our requirements in at the beginning. By having those in the system, we can then map to those with our test cases, after we build those in the testing phase.

Not only do we have the QA engineers working on it in Quality Center, we also have the business analysts working on it, whenever they're doing the requirements. That also helps the two groups work together a bit more closely.

Gardner: Do you have anything to add to that, John?

Bell: The one thing that's been very helpful is the way that the Quality Center tabs are set up. It allows us to follow a specific process, looking at the release level all the way down to the actual cycles, and that allows us to manage it.

It's very nice that Quality Center has it all tied into one unit. So, as we go through our processes, we're able to go from tab to tab and we know that all of that information is interconnected. We can ultimately trace a defect back to a specific cycle or a specific test case, all the way back to our requirement. So, the tool is very helpful in keeping all of the information in one area, while still maintaining the consistent process.

Gardner: Can you give us a sense of how much activity you process or how many applications there are -- the size of the workload you’ve got these days?

Bell: There is a lot. I look back to metrics we pulled for 2008. We were doing fewer than 70 projects. By 2009, after we had fully integrated Quality Center, we did over 129 projects. That also included a lot of extra work, which you may have heard about us doing related to a merger.

Gardner: With that increase in the number of applications that you're managing and dealing with, did you have any metrics in terms of the quality that you were able to manage, even though that volume increased so dramatically?

Moses: We were able to do that. That's one of the nice things. You can use your dashboard in Quality Center to pull those metrics up and see those reports. You can point out the projects that were your most troublesome children and look at the projects where you did really well.

Best-case scenario

You can go back and do a best-case scenario, and see what you did great and what you could improve. Having that view into it really helps. It’s also beneficial, whenever you have another project similar to one that was such an issue. You can have a heads up to say, "Okay, we need to treat this one differently this time."

Gardner: It’s the visibility to have repeatability when things go well, and, I suppose, visibility to avoid repeatability when things didn't go well.

Moses: Exactly.

Gardner: Let’s take a look at some of the innovation you've done. Tell me a bit about what you've worked with in terms of Quality Center in some of your own integration or tweaking?

Bell: One thing that we've been able to do with Quality Center is connect it with Quick Test Pro, and we do have Quality Center 10, as well as Quick Test Pro 10. We've been able to build our automation and store those in the Test Plan tab of Quality Center.

This has really been beneficial for us, when we go into our test labs and build our test set. We're able to take all of these automated pieces and combine them into a test set. What this has allowed us to do is run all of our automation as one test set. We've been able to run those on a remote box. It's taken our regression test time from one person for five days, down to zero people and approximately an hour and 45 minutes.

Also, with the Test Lab tab, we're able to schedule these test sets to run during off hours. A lot of times our automation for things such as regression or sanity, can run on off hours. We schedule those to run at perhaps 6 o'clock in the morning. Then, when we come in at 8 o'clock in the morning, all of those tests would have already run.

That frees up our testers to be doing more of the manual functional testing and that allows us to know that we have complete coverage with the automation, as well as our sanity pieces. So, that's a unique way that we've used Quality Center to help manage that and to reduce our testing times by over 50 percent.

Gardner: Thank you, John. David, there have been some ways in which your larger goals as a business have been either improved upon or perhaps better aligned with the whole development process. I guess I'm looking for whether there is some payback here in terms of your larger business goals?

Moses: It definitely is. It goes back to speed to market with new functionality and making the customer's experience better. In all of our self-service products, it's very important that we test from the customers’ point of view.

We deliver those products that make it easier for them to use our services. That's one of the things that always sticks in my mind, when I'm at an airport, and I'm watching people use the kiosk. That's one of the things we do. We bring our people out to the airports and we watch our customers use our products, so we get that inside view of what's going on with them.

A lot on the line

I'll see people hesitantly reaching out to hit a button. Their hand may be shaking. It could be an elderly person. It could be a person with a lot on the line. Say it’s somebody taking their family on vacation. It's the only vacation they can afford to go on, and they’ve got a lot of investment into that flight to get there and also to get back home. Really there's a lot on the line for them.

A lot of people don’t know a lot about the airline industry and they don’t realize that it's okay if they hit the wrong button. It's really easy to start over. But, sometimes they would be literally shaking, when they reach out to hit the button. We want to make sure that they have a good comfort level. We want to make sure they have the best experience they could possibly have. And, the faster we can deliver products to them, that make that experience real for them, the better.

Gardner: I should think the whole notion of self service is hugely important. It's important for the customer to be able to move through and do things their way, and I suppose there are some great cost savings and efficiencies on your end as well.

Dave, could you just highlight a little bit about how the whole notion of self service is embedded into applications, and how some of the quality assurance tools and processes have helped there?

Moses: I go back to anytime you have to give up whenever you're having an issue with products, while you're online. You're on a website, and you have to call customer service. I think most people just sort of feel defeated at that point. People like to handle things themselves. You need a channel there for the customer to go to, if they need additional help.

So many clients and customers these days are so tech savvy. They know the industry they are in, and they know the tools they're working with, especially frequent flyers. I'd venture to say that most frequent flyers can hit the airport, check-in, get through security, and get to their plane really quickly. They just know their airports and they know everything they need to know about their flight, because this is where they live part of their lives.

You don't want to make them wait in line. You don't want to make them wait on a phone tree, when they make a phone call. You want them to be able to walk into the airport, hit a couple of buttons, get through security, and get to their gate.

By offering these types of products to the customers, you give them the best of both worlds. You give them a fast pass to check in. You give them a fast pass book. But, you can also give the less-experienced customer an easy-to-understand path to do what they need as well.

Gardner: And, to get those business benefits, those customer loyalty benefits, is really a function of good software development overall, isn't it?

Moses: Exactly. You have to give the customer the right tools that they want to get the job done for them.

Gardner: For other enterprises that are perhaps going to be working towards a higher degree of quality in their software, but are probably also interested in reducing the time to develop and time to value, do you have any suggestions, now that you’ve gone through this, that you might offer to them?

Interim approach

Bell: In using Quality Center, we've used an interim approach. Initially, we just used the Defects tab of Quality Center. Then, we slowly began to add the Requirements piece, and then Test Cases, and ultimately the Releases and Cycles.

One thing that we've found to be very beneficial with Quality Center is that it shows the development organization that this isn't just a QA tool that a QA team uses. What we've been able to do, by bringing the requirements piece into it and by bringing the defects and other parts of it together, is bring the whole team on board to using a common tool.

In the past, a lot of people have always thought of Quality Center as just a little tool that the QA people use in the corner and nobody else needs to be aware of. Now, we have our business analysts, project managers, and developers, as well as the QA team and even managers, using it, because each person can get a different view of different information.

From Dashboard, your managers can look at your trends and what type of overall development lifecycle is coming through. Your project managers can be very involved in pulling the number of defects and see which ones are still outstanding and what the criticality of that is. The developers can be involved by entering information on defects when those issues have been resolved.

We've found that Quality Center is actually a tool that has drawn together all of the teams. They're all using a common interface, and they all start to recognize the importance of tying all of this together, so that everyone can get a view as to what's going on throughout the whole lifecycle.

Moses: John hits on a really good point there. You have to realize the importance of it, and we did a long time ago. We've realized the importance of automating and we've realized the importance of having multiple groups using the same tool.

In all honesty, we were just miserable in our own history of trying to get those to work. You really take certain shots at it. For the past eight years, if we can go back that far, we've been using Quality Center tools or TestDirector, just trying to get things automated, using the tools we had at the time.

The one thing that we never actually did was dedicate the resources. It's not just a tool. There are people there too. There are processes. There are concepts you're going to have to get in your head to get this to work, but you have to be willing to buy in by having the people resources dedicated to building the test scripts. Then, you're not done. You've got to maintain them. That's where most people fall short and that's where we fell short for quite some time.

Once we were able to finally dedicate the people to the maintenance of these scripts to keep them active and running, that's where we got a win. If you look at a web site these days, it's following one of two models. You either have a release schedule, that’s a more static site, or you have a highly dynamic site that's always changing and always throwing out improvements.

We fit into that "Speed Wins," when we get the product out for the customers’ trading, and improve the experience as often as possible. So, we’re a highly dynamic site. We'll break up to 20 percent of all of our test scripts, all of our automated test scripts, every week. That's a lot of maintenance, even though we're using a lot of reusable code. You have to have those resources dedicated to keep that going.

Gardner: Well, I appreciate your time. We've been talking about the quality assurance process and the use of some HP tools. We've been learning about experiences from Delta Air Lines development executives. I want to thank our guests today, David Moses, Manager of Quality Assurance for Delta.com in the self-service function there. Thank you, David.

Moses: Thank you, very much.

Gardner: We've also been joined by John Bell, Senior Test Engineer there at Delta Air Lines. Thanks to you too, John.

Bell: It's been a pleasure.

Gardner: And, thanks to our audience for joining us for this special BriefingsDirect podcast coming to you from the HP Software Universe 2010 conference in Washington, DC.

Look for other podcasts from this HP event on the hp.com website, as well as via the BriefingsDirect Network.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this series of Software Universe Live Discussions. Thanks again for listening, and come back next time.


Transcript of a BriefingsDirect podcast with Delta Air Lines development leaders on gaining visibility into application testing to improve customer self-service experience. Copyright Interarbor Solutions, LLC, 2005-2010. All rights reserved.


McKesson Shows Bringing Testing Tools on the Road Improves Speed to Market and Customer Satisfaction

Transcript of a BriefingsDirect podcast from the HP Software Universe 2010 Conference in Washington, DC on field-testing software installations using HP Performance Center products.


Dana Gardner: Hello, and welcome to a special BriefingsDirect podcast series, coming to you from the HP Software Universe 2010 Conference in Washington, D.C. We're here the week of June 14, 2010, to explore some major enterprise software and solutions trends and innovations making news across HP’s ecosystem of customers, partners, and developers.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, and I'll be your host throughout this series of HP-sponsored Software Universe Live discussions.

Our customer case-study today focuses on McKesson Corp., a provider of certified healthcare information technology, including electronic health records, medical billing, and claims management software. McKesson is a user of HP’s project-based performance testing products used to make sure that applications perform in the field as intended throughout their lifecycle.

To learn more about McKesson’s innovative use of quality assurance software, please join me in welcoming Todd Eaton, Director of Application Lifecycle Management Tools in the CTO’s office at McKesson. Welcome to the show, Todd.

Todd Eaton: Thank you.

Gardner: Todd, tell me a little bit about what's going on in the market that is making the performance-based testing, particularly onsite, such an important issue for you.

Eaton: Well, looking at McKesson’s businesses, one of the things that we do is provide software for sale for various healthcare providers. With the current federal government regulations that are coming out and some of these newer initiatives that are planned by the federal government, these providers are looking for tools to help them do better healthcare throughout their enterprises.

With that in mind, they're looking to add functionality, they're looking to add systems, and they look to McKesson, as the leader in healthcare, to provide those solutions for them. With that in mind, our group works with the various R&D organizations within McKesson, to help them develop software for the needs of those customers.

Gardner: And what is it about performance-based testing that is so important now? We've certainly had lots of opportunity to trial things in labs and create testbeds. What is it about the real-world delivery that's important?

Eaton: It's one thing that we can test within McKesson. It's another thing when you test out at the customer site, and that's a main driver of this new innovation that we’re partnering with HP on.

When we build an application and sell that to our customers, they can take that application, bring it into their own ecosystem, into their own data center and install it onto their own hardware.

Controlled testing

The testing that we do in our labs is a little more controlled. We have access to HP and other vendors with their state-of-the-art equipment. We come up with our own set of standards, but when our applications go out to the site and get put into those hospitals, we want to ensure that they run at the same speed and with the same performance at their site that we experience in our controlled environment. So, being able to test on their equipment is very important for us.

Gardner: And it's, I suppose, difficult for you to anticipate exactly what you're going to encounter, until you're actually in that data center?

Eaton: Exactly. Just knowing how many different healthcare providers there are out there, you could imagine all the different hardware platforms, different infrastructures, and the needs or infrastructure items that they may have in their data centers.

Gardner: This isn’t just a function of getting set up, but there's a whole life-cycle of updates, patches, improvements, and increased functionality across the application set. Is this something that you can do over a period of time?

Eaton: Yes, and another very important thing is using their data. The hospitals themselves will have copies of their production data sets that they keep control of. There are strict regulations. That kind of data cannot leave their premises. Being able to test using the large amount of data or the large volume of data that they will have onsite is very crucial to testing our applications.

Gardner: Todd, tell me the story behind gaining this capability of that performance-based testing onsite -- how did you approach it, how long has it been in the making, and maybe a little bit about what you’re encountering?

Eaton: When we started out, we had some discussion with some of the R&D groups internally about our performance testing. My group actually provides a performance-testing service. We go out to the various groups, and we’re doing the testing.

We always look to find out what we can do better. We’re always doing lessons learned and things like that and talking with these various groups. We found that, even though we did a very good job of doing performance testing internally, we were still finding defects and performance issues out at the site, when we brought that software out and installed it in the customer’s data center.

After further investigation, it became apparent to us that we weren’t able to replicate all those different environments in our data center. It’s just too big of a task.

The next logical thing to do was to take the testing capabilities that we had and bring it all out on the road. We have these different services teams that go out to install software. We could go along with them and bring the powerful tools that we use with HP into those data centers and do the exact same testing that we did, and make sure that our applications were running as expected on their environments.

Gardner: Getting it right the first time is always one of the most important things for any business activity. Any kind of failure along the way is always going to cost more and perhaps even jeopardize the relationship with the customer.

Speed to market

Eaton: Yeah, it jeopardizes the relationship with the customer, but one of the things that we also drive is speed to market. We want to make sure that our solutions get out there as fast as possible, so that we can help those providers and those healthcare entities in giving the best patient care that they can.

Gardner: What was the biggest hurdle in being able to, as you say, bring the testing capability out to the field? What were some of the hang-ups in order to accomplish that?

Eaton: Well, the tool that we use primarily within McKesson is Performance Center, and Performance Center is an enterprise-based application. It’s usually kept where we have multiple controllers, and we have multiple groups using those, but it resides within our network.

So, the biggest hurdle was how to take that powerful tool and bring it out to these sites. We went back to our HP rep, and said, "Here’s our challenge. This is what we’ve got. We don’t really see anything where you have an offering in that space. What can you do for us?"

Gardner: How far and wide have you been able to accomplish this? Are you doing it in terms of numbers of facilities, in what kind of organizations?

Eaton: Right now we have it across the board in multiple applications. McKesson develops numerous applications in the healthcare space, and we’ve used it across the board. Currently, we have two engagements going on simultaneously with two different hospitals, testing two different groups of applications, and even the applications themselves.

I’ve got one site that’s using it for 26 different applications and another that’s using it for five. We’ve got two teams going out there, one from my group and one from one of the internal R&D groups, that are assisting the customer and testing the applications on their equipment.

Gardner: From these experiences so far, are there metrics of success, paybacks, not only for you and McKesson, but also for the providers that you service?

Eaton: The first couple of times we did this, we found that we were able to reduce the performance defects dramatically. We’re talking something like 40-50 percent right off the bat. Some of the timing that we had experienced internally seemed to be fine, well within SLAs. But as soon as I got out to a site and onto different hardware configurations, it took some application tuning to get it down. We were finding 90 percent increases with the help of continual testing and performance tweaks.

Items like that are just so powerful when you are bringing that out to the various customers and can say, "If you engage us, and we can do this testing for you, we can make sure that those applications will run in the way that you want them to."

Gardner: How about for your development efficiency? Are you learning some lessons on the road that you wouldn’t have had before that you can now bring into the next rev? Is there a feedback loop of sorts?

Powerful feedback

Eaton: Yes. It’s a pretty powerful one back to our R&D groups, because getting back to that data scenario, the volume and types of data that the customers have can be unexpected. The way customers use systems, while it works perfectly fine, is not one of the use cases that is normally found in some applications, and you get different results.

So, finding them out in the field and then being able to bring those back to our R&D groups and say, "This is what we’re seeing out in the field and this is how people are using it," gives them a better insight and makes them able to modify their code to fit those use cases better.

Gardner: Todd, is there any advice that you would give to those considering doing this, that is to say, taking their performance testing out on the road, closer to the actual site where these applications are going to reside?

Eaton: The main one is to work with your HP rep on what they have available for this. We took a product that everybody is familiar with, LoadRunner, and tweaked it so it became portable. The HP reps know a lot more about how they packaged that up and what’s best for different customers based on their needs. Working with a rep would be a big help in trying to roll this out to various groups.

Gardner: Okay, great. We’ve been learning about how McKesson is bringing performance-based testing products out to their customers’ locations and gaining a feedback capability as well as reducing time to market and making the quality of those applications near 100 percent right from the start.

I want to thank our guest. We’ve been joined by Todd Eaton, Director of Application Lifecycle Management Tools in the CTO’s office at McKesson. Thank you so much, Todd.

Eaton: You’re welcome. Nice talking to you.

Gardner: And, thanks to our audience for joining us for this special BriefingsDirect podcast, coming to you from the HP Software Universe 2010 Conference in Washington, DC.

Look for other podcasts from this HP event on the hp.com website under HP Software Universe Live podcast, as well as through the BriefingsDirect Network.

I’m Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this series of HP-sponsored Software Universe Live Discussions. Thanks for listening, and come back next time.



Transcript of a BriefingsDirect podcast from the HP Software Universe 2010 Conference in Washington, DC on field-testing software installations using HP Performance Center products. Copyright Interarbor Solutions, LLC, 2005-2010. All rights reserved.
