Tuesday, September 30, 2008

Improved Insights and Analysis From Systems Logs Reduce Complexity Risks From Virtualization

Transcript of BriefingsDirect podcast on the infrastructure management and security challenges of virtualization.

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: LogLogic.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect. Today, a sponsored podcast discussion about virtualization, and how to improve management of virtualization, to gain better security using virtualization techniques, and also to find methods for compliance and regulation -- but without the pitfalls of complexity and mismanagement.

We're going to be talking about virtualization best practices with several folks who are dealing with this at several different levels. We're going to be hearing from VMware, Unisys and LogLogic.

Let me introduce our panel today. First, we're joined by Charu Chaubal, senior architect for technical marketing, at VMware. Welcome, Charu.

Charu Chaubal: Thank you.

Gardner: We're also joined by Chris Hoff, chief security architect at Unisys. Hi, Chris.

Chris Hoff: Hi, how are you?

Gardner: Great. Also, Dr. Anton Chuvakin, chief logging evangelist and a security expert at LogLogic. Welcome to the show.

Dr. Anton Chuvakin: Hello. Thank you.

Gardner: Virtualization has certainly taken off, and this is nothing new to VMware. Organizations like Unisys are now doing quite a bit to help organizations that utilize, expand, and enjoy the benefits of virtualization. But virtualization needs to be done the correct way, avoiding pitfalls. If you do it too tactically, without allowing it to be part of an IT lifecycle and without management, then the fruits and benefits of virtualization can be largely lost.

Before we get into what virtualization can do, what to avoid, and how to better approach it, I'd like to just take a moment and try to determine why virtualization is really hot and taking off in the market now.

Let's start with Chris Hoff at Unisys. Some of these technologies have been around for many years. What is it about this point in time that is really making virtualization so hot?

Hoff: It's the confluence of quite a few things, and we see this sort of event happen in information technology (IT) quite often. You have the practically perfect storm of economics, technology, culture, and business coming together at one really interesting point in time.

The first thing that comes to mind is when people think about the benefits. The reasons people are virtualizing are cost, cost savings and then cost avoidance, which is usually seconded by agility and flexibility. It’s also about being able to, as an IT organization, service your constituent customers in a manner that is more in line with the way business functions, which is, in many cases, quite a fast pace -- with the need to be flexible.

These things are contributing a lot to the uptake, not to mention the advent of a lot of new technology in both hardware and software, which is starting to enable some of this to be more realistic in a business environment.

Gardner: Now over to VMware. Charu, tell us how deep and wide virtualization is emerging? It seems like people are using it in more and more ways, and in more and more places.

Chaubal: That's right. When x86 virtualization first started out, maybe 10 years ago in a big way, it was largely being used in test and development types of environments. Over the last five years, it's definitely started to enter the production arena as well. We see more and more customers running even mission-critical applications on virtualization technologies.

Furthermore, we also see it across the board in terms of customer size, where everyone, from the smallest customer to the very largest enterprises, is expanding further and further with their virtual environments.

Gardner: Let's go to LogLogic. Tell me, Anton, what sort of security and what sort of preventative measures are you helping your customers with in terms of gaining the visibility and the analytics about what's going on among these many moving parts? Many of these deployments are now in an automated mode, more so than before they were virtualized. What are some of the issues that you are helping people deal with?

Chuvakin: You were exactly right about the visibility into the environments. As people deploy different types of IT infrastructure, first physical and now virtual, there is always a challenge of figuring out what happens with those PCs and on those PCs, which people are trying to connect to them or even attack them, and doing all of this around the clock.

Adding virtualization to the technology that people use in such a massive way as it's occurring now brings up the challenges of how do we know what happens in those environments. Is there anybody trying to abuse them, just use them, or use them inappropriately? Is there a lack of auditability and control in those environments? Logs are definitely one of the ways, or I would say a primary way, of gaining that visibility for most IT compliance, and virtualization is no exception.

As a result, as people deploy VMware and applications in a couple of virtual platforms, the challenge is knowing what actually happens on those platforms, what happens in those virtual machines (VMs), and what happens with the applications. Logging and LogLogic play a very critical role in not only collecting those bits and pieces, but also creating a big picture or a view of that activity across other organizations.

Virtualization definitely solves some of the problems, but at the same time, it brings in and brings out new things, which people really aren't used to dealing with. For example, it used to be that if you monitor a server, you know where the server is, you then know how to monitor it, you know what applications run there.

In virtual environments, that certainly is true, but at the same time it adds another layer of this server going somewhere else, and you monitor where it was moved, where it is now, and basically perform monitoring as servers come up and down, disappear, get moved, and that type of stuff.

Gardner: Now, Chris at Unisys, when you're dealing with customers, based on what we've heard about this expansion of virtualization, you're dealing with it on an applications level, and also on the infrastructure and server level.

What’s more, some folks are now getting into desktop virtualization infrastructure and delivering whole desktop interfaces out to end-user devices. This impacts not just a server. We're talking about network devices and storage devices. This is a bit more than a tactical issue. It really starts getting strategic pretty quickly.

Hoff: That's absolutely correct. If you really look at virtualization as an enabling technology or platform, and look out over the next three years at how large companies plan to use it from the perspective of their strategic plans, you'll notice that there is a large trend toward what you might call "real-time infrastructure."

The notion here is about how you take this enabling technology and the benefits of virtualization and leverage that to provide automation and re-purposing. You have to deal with elements and issues that relate to charge-back for assets, as IT becomes more of a utility service.

If we look further out from there, we look at the governance issues of what it means to not really focus on hardware anymore, or even applications -- but on service and service levels. It gets a lot more strategic at times, played out all along the continuum.

While we focus virtualization on the notion of infrastructure and technology, what's really starting to happen now -- and what's important with the customers that we deal with -- is being able to unite both business process and business strategy, along with the infrastructure and the architecture that support it.

So we're a little excited and frothed up as it relates to all the benefits of virtualization today, and the bigger picture is even more exciting and interesting. That's going to fundamentally continue to cause us to change what we do and how we do it, as we move forward. Visibility is very important, but understanding the organizational and operational impacts that real-time infrastructure and virtualization bring, is really going to be an interesting challenge for folks to get their hands around.

Gardner: Now, Charu at VMware, you obviously are building out what you consider the premier platform and approach to virtualization technically. You've heard, obviously, the opportunity for professional services and methodologies for approaching this, and you have third parties like LogLogic that are trying to provide better visibility across many different systems and devices.

How are you using this information in terms of what you bring to the management table for folks who are moving from, say, tactical to more strategic use of virtualization?

Chaubal: A lot of customers are expanding their virtualization so much now, to the point where they're hitting some interesting challenges that they maybe wouldn't have hit before. One great example is around compliance, such as Payment Card Industry Data Security Standard (PCI) compliance. There are a lot of questions right now around virtualizing those systems that process credit card holder data.

They're asking, "If I do this, am I going to be compliant with PCI? Is this something that's a realistic possibility? If it is, how do I go about demonstrating this to an auditor?"

This is where partners like LogLogic come into play, because they have the tools that can help achieve this. We believe that VMware provides a compliance-ready type of platform, so it is something you can achieve compliance with. But, in order to demonstrate and maintain that compliance, it's useful to have these tools from partners that can help you do that.

Gardner: Now, Anton at LogLogic, you're able to examine a number of different systems, gather information, correlate that information, do analytics, and provide a picture of what should be happening. Or, when something is not happening, you can look for the reasons why and look for aberrant or unusual behavior. So let's address security a little bit.

What are some of the challenges in terms of security when you move from a physical environment for compute power and resources to a virtualized environment? Then second, what about the mixture? It is obviously going to be both physical and virtualized instances of infrastructure and applications. Tell us about the security implications.

Chuvakin: I'll just follow the same logic I used for our recent webcast about virtualization security. In this webcast, I basically presented a full view of things that are the same and that are different in virtualized environments. I'll use the same structure, because some people who get too frothy, as Chris put it, about virtualization just stick to "virtualization changes everything." That is used sometimes as an excuse to not do things that you should continue doing in a virtualized environment.

Let's start from what things are the same. When you migrate from a physical to a virtual infrastructure, you certainly still have servers and applications running in those servers and you have people managing those servers. That leaves you with the need for the same monitoring, the same auditing, and the same security technologies that you use. You shouldn't stop. You shouldn't throw away your firewalls. You shouldn't throw away your log analysis tool, because you still have servers and applications.

They might be easier to monitor in virtual environments. It might sometimes be harder, but you shouldn't change things that are working for you in the physical environment just because virtualization does change a few things. At the same time, the fact that you have applications, servers, and they serve you for business purposes, shouldn't stop you from doing useful things you're doing now.

Now, an additional layer on top of what you already have adds the new things that come with virtualization. The fact that this server might be there one day, but be gone tomorrow -- or not be there one day and be built up and used for a while and then removed -- definitely brings new challenges to security monitoring, security auditing, and figuring out who did what where.

The definition of "who" didn't change. It's still a user, but what and where definitely did change. I mean, if it was done on a certain server, in a virtual environment it might not be a server -- it might be a virtual image, which adds additional complexities.

There are also new things that just don't have any occurrence in the physical environment -- for example, a rogue VM, a VM that is built by somebody who is not authorized to run VMs. It might be the end user who actually has his own little mini infrastructure. It brings up all sorts of forensic challenges that you now have to solve. You don't just investigate a machine. You investigate a machine with a virtual platform, with another server on top, or another desktop on top.

This is my view of things that are the same that you should continue doing and things that are new that you should start learning how to audit and how to analyze the activity in the virtual environments, as well as how to do forensics, if what you have is a machine with potentially a rogue VM.

Gardner: How about you, Chris at Unisys, how do you view implications for security and risk mitigation when it comes to moving increasingly into virtualized environments?

Hoff: I have to take a pretty pragmatic approach. The reality is that there are three conversations and three separate questions that need to be addressed, when you're talking about security in virtualized environments.

Unfortunately, what usually happens is that all three of them are combined into one giant question, which tends to lead to more confusion. So I like to separate the virtualization and security questions into three parts.

One of them is securing virtualization, and understanding what the impacts are on your architecture, your infrastructure, and your business process and models, when you introduce this new virtualization layer. That's really about securing the underlying virtualization platforms and understanding what happens and what changes when you introduce that, assuming that you have a decent understanding of what that means, and how that will ultimately flow down operationally.

The second point or question to address is one of virtualizing security, which is actually the operational element of, "What does it mean, and how do I go about taking what I might do in the physical world, and replicate that and/or even improve it in the virtual world?"

That's an interesting question, assuming that you have a good understanding of architecture and things that matter most to you, and how you might protect them, or how you might not be doing that. You may find several gaps today in your ability to actually do what you do in the physical world.

The third element is security through virtualization, which is okay, assuming that I have a good architectural blueprint and that I understand the impacts, the models, who and what changes operationally, how I have to go about securing things, and what benefits I get out of virtualization.

How do I actually improve my security posture by using these platforms and this technology? If you look at it in that way, you really are able to start dealing with the issues associated with each category. You could probably guess that if you mixed all three of them up, you could go down one path, and very easily be distracted by another.

When we break out the conversations with customers like that, it always comes back to a very basic premise that we seem to have forgotten in our industry. Despite all the technology, despite all the tools, and all of the things that go blinky-blink at night, the reality is that this comes down to being able to appropriately manage risk. That starts with understanding the things that matter to you most and using risk assessment frameworks and processes.

In a gross analogy, when you go to a grocery store and you take time to pack your frozen goods in one bag, and your canned goods and your soft goods in other bags, you use this compartmentalization, understanding what the impact is of all of the wonderful mobility, balanced with compliance and security needs.

If you get home, and you've got canned goods in with your fruit, the reality is that you've not done a good job of compartmentalizing and understanding what the impact of one good might have on the other.

The same thing applies in the virtual world. If you don't take the time to go back to the basics, understanding the impact of the infrastructure and the changes -- you're going to be in a world of hurt later, even if you get the cost benefits and all the wonderful agility and mobility.

We really approach it pragmatically in a rational manner, such that people understand both the pros and the cons of virtualization in their environments.

Gardner: We've determined that virtualization is quite hot. It's ramping up quickly. A number of studies have shown a 50-70 percent increase in the use of virtualization in the last few years. Projections continue for very fast-paced growth.

We also see a number of organizations using multiple vendors, when it comes to virtualization. We've also discussed how security and complexity apply to this, and that you need a comprehensive or contextual view of what's going on with your systems -- particularly if you have a mixture of physical and virtual.

Let's look at some examples of how this has been mitigated, how the risk has actually been decreased, and how the fruits, if you will, of virtualization are enjoyed without the pitfalls.

Let's first go to Charu at VMware. Can you offer some examples of how people have used virtualization, done it the right way, avoided some of these pitfalls, and have gained the visibility and analytics and therefore helped with their matured approach to virtualization?

Chaubal: One thing we've done at VMware over the last year and a half is try to provide as much prescriptive guidance as we can. So a lot of securing of virtualization comes down to making sure you actually deploy it [properly].

So, one thing that we've done is created hardening guides that really aim to show customers how this can be done. That's proved to be very popular among our customers.

Not to get into too much detail, but one of the main issues is the fact that you have a virtualization layer that typically has a management interface in it. Then, you have the interface that goes into your virtual machines. People need to understand that this management layer needs to be completely separated from the actual production network.

That principle is manifested in different recommendations and scenarios, when you plan a deployment and configure it. That's just one example where customers have been able to make use of our prescriptive guidance. Then, they architect something that is actually much more secure than possibly they would have with some preconceived notions that they might have had. I think that's one area where we are seeing success.

Gardner: Let's go to LogLogic. Anton, give us some examples, actual companies or at least use-case scenarios, where the use of LogLogic, or the methodologies that it supports, have been brought to bear on virtualization -- to lower cost, increase performance, gain higher utilization, and so forth -- but without some of these risks.

Chuvakin: I'll give an example of a retail company that was using LogLogic for compliance, as well as for operational usage, such as troubleshooting their servers. This company, in a separate project, was implementing virtualization to convert some of their infrastructure to virtual machines.

At some point, those two projects collided. The company mainly had its log management in place to track operations and to satisfy PCI requirements, and it realized that it now had to collect logs not just from the physical infrastructure, but also from the virtual side that was being built.

What happened was that the logs from the virtual infrastructure were also streamed into LogLogic. Now, LogLogic has the ability to collect any type of a log. In this case, we did use that capability to collect the logs, which were at the time not even supported or analyzed by LogLogic.

The customers understood that they have to collect the logs from the virtual platforms, and that LogLogic has an ability to collect any type of a log. They first started from a log collection effort, so that they could always go back and say, "We've got this data somewhere, and you can go and investigate it."

We also built up a package of contents to analyze the logs as they were starting their collection efforts to have logs ready for users. At LogLogic, we built and set up reports and searches to help them go through the data. So, it was really going in parallel with that, building up some analytic content to make sense of the data, if a customer already has a collection effort, which included logs from the virtual platform.

In this case, it was actually a great success story because we used part of the LogLogic infrastructure that doesn't rely on any preconceived notions of what the logs are. Then, they built up on top of that to help them pinpoint the issues with their VMs to see who accesses the platforms, what applications people use to manage the environment, and, basically, to track all sorts of interesting events in their virtual infrastructure.

I have to admit that it wasn't really tested on their PCI yet, but I'm pretty confident that their PCI auditors will accept what they did for the virtual environment. And, they would satisfy the requirements of PCI, which calls for logging and monitoring, as well as the requirements in the compliance mandate.

At the same time, while they are building it for that use, their analysts are already trying to do searches and look for certain things that might be out of order in their VM environment. An operational use-case spontaneously emerged, and now they not only have their own idea for what to look for, but also our content to do that.

Gardner: You bring up a point here that we shouldn't overlook. This isn't something that you just build and walk away from. It requires ongoing refinement and tuning. The dynamic nature of virtualization, while perhaps automated in terms of allocating resources, is an overall process that needs to be managed in order for these business outcomes to be enjoyed.

Let's go back to Chris at Unisys. Tell us about the ongoing nature of virtualization. How do you keep on top of it? How do you keep it performing well, and perhaps even eke out more optimized utilization benefits?

Hoff: There's not a whole lot of difference in terms of how you might apply the same query to non-virtualized infrastructure. It's not a monolithic single-time event, but, as I alluded to in a previous answer, the next extension should be evolution along the continuum. That notion of real-time infrastructure really does take in the concept of a lot of tasks.

Today, we are quite operationally inefficient in doing that, both from the perspective of practice and infrastructure utilization, and really making sure that our infrastructure, and the compute and storage, and all of the things that make up our infrastructure, become much more efficient, for power, cost efficiency, utility, and flexibility.

When you unite all of those capabilities, what it's going to mean going forward is a much more rich methodology and model for taking business process and instantiating that as an expression of policy within your infrastructure. So, you can say the things that are most important to your business are these processes, and these services.

What you need to be able to do, and ultimately what it means to automation and the efficiency problems, is that the infrastructure needs to self-govern, self-provision and re-provision. You need to be able to allocate cost back to your constituents, and it gets closer and closer to becoming a loose, but federated, group of services. It can essentially play and interact in real-time to service the needs of the business.

All the benefits that we get out of virtualization today are just the beginning and kind of the springboard for what we are going to see in terms of automation, which is great. But we are right at the same problem set, as we kind of pogo along this continuum, which is trying really hard to unite this notion of governance and making sure that just because you can, doesn't mean you should. In certain instances the business processes and policies might prescribe that you don't do some things that would otherwise be harmful in your perspective.

It's that delicate balance of security versus operational agility that we need to get much better at, and much more intelligent about, as we use our virtualization as an enabler. That's going to bring some really interesting and challenging things to the forefront in the way in which IT operates -- benefits and then differences.

Gardner: In the way that you were describing this continuum, it almost sounds like you were alluding to cloud computing, as it's being defined more and more -- and perhaps the “private cloud,” where people would be managing their internal enterprise IT resources from a cloud perspective. Am I overstating it?

Hoff: No, I don't think you're overstating it. I think that's a reasonable assertion and assumption based on what I am saying. The difficulty in using the "cloud" word is that it means a lot of things to lots of people. I think you brought up three definitions in your one sentence.

But the notion of being able to essentially utilize our resources pretty much anywhere, regardless of who owns the infrastructure, is something that's enticing and brings up a host of wonderful issues that make security people like me itchy.

If you read Nicholas Carr's book The Big Switch, and you think about utility or grid computing or whatever you want to call it -- the notion of being able to better utilize my resources, balance that with security, and be very agile -- it's fun times ahead. You are absolutely right. I was alluding to the C-word, yes.

Gardner: Okay. Charu at VMware, organizations are at different rates of adoption around virtualization -- some are just starting to test the waters -- but the end goal for some of these adopters could be this cloud-compute value, this fabric of IT value.

How are people getting started, and how should they get started in a way that sets them up for this longer-term payoff?

Chaubal: That's a very broad question, but I think it is important to recognize that you can go in and use virtualization to consolidate physical servers onto a smaller number of physical servers, and you get savings that way. But if that's the only approach you take, you might end up at a dead-end, or you might get off on a tangent somewhere.

What we find is that there is really a maturity curve when it comes to virtualization adoption, and one of the most important axes along that curve is, in a broad sense, your operational maturity.

When you are starting out, sure, go ahead and consolidate servers. That's a good way to get some quick wins, but you're rapidly going to come to a point where you need to start imposing an operational discipline and policies and procedures that perhaps you didn't have before.

Perhaps you had them, but they weren't all that rigidly adhered to or weren't really followed all the time. The most important thing is that you start thinking about this operational maturity, and then go to things like being able to standardize upon processes and standardize upon the way things are configured.

Any kind of process you do, make sure it goes through the right steps in terms of getting it approved. There is a whole methodology around that, and that's one of the things that we spend a lot of time on with our customers.

We have this graph where, if you can look at how many servers are virtualized over time, we would like to see a steady upward 45-degree angle to that curve. If somebody virtualizes too many too soon, you will see that curve shoot up sharply. Then, you repeat yourself, because you virtualized so much so quickly, and all these other issues that Chris alluded to come into play, and they might bog you down.

On the other hand, you could suffer the other extreme where you virtualize so slowly, that the curve is very shallow, and you end up leaving savings and benefits on the table, because you are just picking them up so slowly.

Gardner: Missed opportunities, right?

Chaubal: Right, exactly. The most important thing, when you are starting out, is to keep in mind that you are not just installing a piece of software that will optimize what you have already. It's really a fundamental transformation in how you do things.

Gardner: Okay, let's take the last question to Anton at LogLogic. How do you recommend people get started, particularly in reaching this balance between wanting not to miss opportunities, wanting to be able to ramp up quickly and to enjoy the benefits that virtualization provides, but doing it in such a way that they get that visibility and analytics, and can set themselves up to be risk resistant, but also strategic in their outlook?

Chuvakin: I'll use the case that I just presented to illustrate the way to do it. As has happened with many technologies before virtualization, people will sometimes deploy it in a manner that really makes auditing and monitoring pretty hard. So they have to go back and figure out what the technologies are doing in terms of transparency and visibility.

I suggest that, as people deploy VMware and other virtualization platforms, they instantly connect those to their log-management tools, and that log collection starts day one.

Admittedly, most of those organizations would not know what to do with those logs, but having those logs as a first step will be important. Even if you don't know how to analyze the log, you don't know what they mean, or what they're trying to tell you, you still have that repository to fall back to.

If you have to investigate an issue, an incident, or an operational issue in an environment, you still have an ability to go back and say, "Oh, something of that sort already happened to me once. Let's see what else occurred at the same time."

Even if you have no skills to delve into the full scope of how to analyze all these signals that virtual infrastructure is sending us, I would focus first on collecting the data and having the data for analysis. When you do that, your future steps or your further steps, when you make sense of the data, will be much easier, much more transparent, and much more doable overall.

You will have to learn what the signals are, what information is being emitted by your virtual infrastructure, and then make conclusions on that. But, to even analyze the information, to make conclusions, and to figure out what's going on, you have to have the original data.

It's easier to collect the data early, because it's really not a big deal. You just send those logs to LogLogic or the log management system, and they are capable of doing that right away. Now, admittedly, you have to pick a system, such as LogLogic, that can support your virtualization infrastructure and then you can build up your analysis and your understanding and build up your true visibility, sort of the next layer of the intelligence as you go. Don't try to do the analysis right away, but start collecting the data on day one.

Gardner: Right, visibility early and often. I appreciate your input. We have been talking about virtualization -- how to do it right, how to enjoy lower risk, understanding security implications, but at the same time moving as aggressively as you can, because there are significant economic benefits.

Helping us understand virtualization in this context, we have been joined by Charu Chaubal, senior architect in technical marketing at VMware. Thank you, sir.

Chaubal: Thank you.

Gardner: Also Chris Hoff, chief security architect at Unisys. I really appreciate your input, Chris.

Hoff: Thanks, very much.

Gardner: And also, Dr. Anton Chuvakin, chief logging evangelist and also a security expert at LogLogic. Thank you, sir.

Chuvakin: Thank you so much for inviting me.

Gardner: I would like to thank our sponsor for this podcast, LogLogic. This is Dana Gardner, principal analyst at Interarbor Solutions. You have been listening to a BriefingsDirect podcast. Thanks, and come back next time.

Transcript of BriefingsDirect podcast on the management and security challenges of virtualization. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.

Oracle and HP Explain History, Role and Future for New Exadata Server and Database Machine

Transcript of BriefingsDirect podcast recorded at the Oracle OpenWorld Conference in San Francisco the week of Sept. 22, 2008.

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Hewlett-Packard.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to a special BriefingsDirect Podcast recorded at the Oracle OpenWorld Conference in San Francisco. We are here the week of Sept. 22, 2008. This HP Live! Podcast is sponsored by Hewlett-Packard, and distributed through the BriefingsDirect Network.

Today we are going to discuss a large and impactful product announcement at Oracle OpenWorld that took place on Sept. 24. It was the introduction of appliances in a cooperative relationship between HP and Oracle to create some of the highest-performing databases and data warehouses in history. We are going to be talking about the Oracle Exadata Storage Server and -- when put together in a very impressive configuration -- what becomes the HP Oracle Database Machine.

Here to help us understand how these impressive server configurations and high-speed, extreme-performance databases came together, we are joined by Rich Palmer, the director of technology and strategy for industry standard servers at HP. We are also joined by Willie Hardie, vice president of Oracle database product marketing. Welcome to the show, Willie.

Willie Hardie: Good to be here, Dana.

Gardner: Tell me a little bit about this very momentous announcement. This has been several years in the making, but it’s not just a product announcement. It seems like an architectural shift, and also an alliance and partnership shift in terms of the cooperation between a hardware provider, in this case HP, and Oracle, until now purely a software company.

Hardie: That’s an excellent question. So what we actually announced this week is the Oracle Exadata Storage Server. Now, the Oracle Exadata Storage Server is an intelligent storage device. We’ve basically taken industry standard hardware and storage components from HP, and we’ve combined that with smart intelligence software from Oracle that allows us to offload query processing from the database servers to the storage servers.

So now they can do a lot of the work for us, stripping off just the rows and columns that we require, and push less data back up through much wider networks.

Gardner: For those of us who are not computer scientists, but are nonetheless interested in the outcomes, architecturally we are putting the intelligence that we usually have in a database server in very close proximity to the data storage itself, connecting that through a very fat pipe in the form of InfiniBand. And, in essence, parallel processing comes to bear, because of the proximity. Is that correct?

Hardie: Absolutely. So what we are able to do for the first time ever is we can use these storage devices to actually do the query processing itself. So the more storage servers and processors we put into our configuration, the more of the workload they can take off, which traditionally is done at the database server.

Gardner: Let’s go to Rich Palmer at HP. Tell us a little bit about the history. How did this come about, and what is it that HP has been doing to improve upon the performance of this long-term database lineage?

Rich Palmer: If you look at HP and Oracle as partners in this industry, we have a long-standing history together. We have several reference configurations, more than 50 reference configurations that we do with industry standard hardware and Oracle solutions, which we’ve been delivering for many years now.

Going back all the way to the introduction of Oracle Real Application Clusters (RAC), and even before RAC introductions, the history of the two companies really stems from two leadership positions. HP does more servers on Oracle than any other company. Oracle does more data warehouses than any other company. You bring those two forces together, and you get a very strong formidable entry into this data warehouse appliance market.

This discussion between HP and Oracle really goes back a couple of years, to a trend in the market of bringing data and server processing power closer together. That trend has escalated over the last couple of years -- especially as data has been growing at exponential rates, every single year. What we found is that you cannot push so much data over a traditional storage fabric. This new technology allows us to do that.

Gardner: And we are talking about very large data sets, of terabytes and larger, right?

Palmer: Enormous data sets. Let me give you an example, and I think we are all very familiar with this example. We all use cell phones in today’s industry. Every one of those cell phone calls is a database record somewhere, be it on AT&T’s database or T-Mobile’s database or whomever's database -- they store that data. Now, when they are storing that data, sometimes they are going to want to move it. If you have a narrow pipe to push that data down, you’re bringing back enormous amounts of data that is extraneous. You don’t need the other data; all you need is what you’re looking for in the query.

So this process allows us to push just the query information across that pipe. Less data over the pipe, a wider pipe, and your performance goes up dramatically.

Gardner: Okay, so let’s unpack this a little bit. We’ve established that the marketplace is demanding better performance, particularly in the use of large data sets, often 1 terabyte and larger, up to 10 terabytes in size. That requires the movement of very large sets of data, and the inhibitor here was the storage’s physical capacity and ability to deliver the data.

So you’ve re-architected, and you’ve brought two companies together to work on this. This raises the question: Why haven’t hardware and software gotten closer before this? Why now?

Palmer: This market is constantly evolving toward a state where you have to bring software tools to the table, and you have to bring high-performance hardware to the table. The evolution of both of those has hit at the perfect time in the last year.

Oracle has been developing the software code for several years now, and HP has been working on the hardware side of this equation to bring together the two forces at this time. We are using industry standard technology, so it’s not as if we are the only hardware guys out there with InfiniBand, and InfiniBand is an evolving technology. But the performance of InfiniBand is at a point now where we can actually leverage it using Oracle software to offload the storage processing from the database server. Those are the two key components -- it’s not just the hardware, and it’s not just the software. You have to marry these two things together.

So why hasn’t it been done in the past? Well, it has, to some degree. There are others who have tried to do this, but they haven’t done both. They haven’t been able to achieve both facets, and that’s really why this is the right product at the right time.

Gardner: Okay, Willie, let’s get into the actual product itself. Explain to me what the Oracle Exadata Storage Server actually is. What are we talking about?

Hardie: The Oracle Exadata Storage Server is basically comprised of an industry standard HP DL180 Storage Server. So inside this storage server we have twelve 3.5-inch disks, which can be 12 SATA drives. We have two Intel quad-core processors, 8 gigabytes of memory, two InfiniBand network connections, and dual power supplies.

So in this storage server we have a lot of storage capacity, we have a lot of processing power, and we have a lot of network bandwidth. Then the real secret sauce here is this intelligence software from Oracle that’s installed into each and every one of those devices. It’s this intelligent software that enables us to offload this query processing, which makes the Oracle Exadata Storage Server really unique.

Gardner: Okay, let's dumb this down a little bit in simplistic terms. Instead of large data sets moving from storage to the database and back, what happens differently now?

Hardie: What happens differently now is, because we are offloading the query processing to the storage server, the storage server can strip out the columns that we don’t need, strip out the rows we don’t need, and return a subset of data back up through this wide InfiniBand network. That’s what makes the difference. We are creating a much smaller data set that we pass up through this network, and the database server can just finish off that query processing much faster than it ever could previously.
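The row-and-column filtering described here can be illustrated with a small sketch. It is only a toy model of the idea, assuming made-up call records and a hypothetical `storage_scan` function; it says nothing about how the Oracle software is actually implemented.

```python
# Made-up call records; the field names are hypothetical.
CALL_RECORDS = [
    {"caller": "555-0100", "callee": "555-0199", "minutes": 12, "region": "west"},
    {"caller": "555-0101", "callee": "555-0142", "minutes": 3, "region": "east"},
    {"caller": "555-0102", "callee": "555-0117", "minutes": 45, "region": "west"},
    {"caller": "555-0103", "callee": "555-0188", "minutes": 7, "region": "south"},
]

def storage_scan(rows, wanted_columns, predicate):
    """Runs at the storage layer: drop unneeded rows, then unneeded columns."""
    return [{column: row[column] for column in wanted_columns}
            for row in rows if predicate(row)]

def cells(rows):
    """Rough stand-in for the amount of data that crosses the network."""
    return sum(len(row) for row in rows)

# Without offload, every row and column travels to the database server.
without_offload = CALL_RECORDS

# With offload, only the matching rows and requested columns travel.
with_offload = storage_scan(
    CALL_RECORDS,
    wanted_columns=["caller", "minutes"],
    predicate=lambda row: row["region"] == "west",
)

print("cells sent without offload:", cells(without_offload))
print("cells sent with offload:", cells(with_offload))
```

In this toy data set the filtered result carries a quarter of the cells, and the database server only has to finish the query on that smaller set.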

Palmer: One of the other values that we achieve here is certainly in the data passing back and forth: less data over a wider pipe. So you’re going to get exponentially better performance. Now at the storage servers you’ve put the processing power for doing the query right at the disks. In every one of these storage servers you have two Intel quad-core processors, so you have eight cores on the input/output (I/O) path directly to the disk.

So there is no external I/O going to your disks. Traditionally you’ve had to go outside of the server, go to the disk that is across the fabric -- and everyone else is sharing that fabric.

So you have many people sharing a fabric, versus now you have a dedicated fabric inside of the server. So it’s a copper-to-copper connection inside the server. Those disks are right on top of the processor. That is really the essence of it -- you can pull the data off of this rapidly because it’s all so much faster. As Willie indicated, you can strip out all the unnecessary data and pass a much smaller data set over a much wider pipe, back to your database servers. There are so many levels of performance improvement here.

Gardner: And to your point on the secret sauce -- you are also taking advantage of all those cores via multiple threads, and the software has been deeply tuned to take advantage of those multiple threads in a concurrent fashion.

Hardie: Oh, absolutely, and Rich touched on that as well.

Palmer: When we add more Exadata Storage Servers into our configuration we can take advantage not just of that additional storage capacity, but also of that additional processing capability at the storage layer, which is a big, big difference.

Gardner: And at the announcement here, Oracle Chairman and CEO Larry Ellison described use cases where improvement typically was 10x to up to 72x over what has been the industry benchmark.

Hardie: Absolutely. When you actually cut away the technology and look at this from a business perspective, what it means for me as a business user is that when you’re accessing those data warehouses that Rich was talking about earlier -- like a call data record data warehouse that has billions of rows -- your queries are going to run much, much faster than they ever did previously. Not only will they run faster, you can have many more queries and more long-running queries running concurrently. That’s what is going to make the big difference.

So when we hear customers talking about getting 20x performance, 30x performance in one particular instance, and in one particular query 72x performance -- those are extreme performance improvements, by anybody’s measurement.

Gardner: Okay, so we have this engine, this Oracle Exadata Storage Server. We also have a new announcement, the HP Oracle Database Machine. Tell me how one relates to the other.

Palmer: The HP Oracle Database Machine is a single rack that contains everything you need to run a large data warehouse. It contains eight ProLiant servers running Oracle Database 11g and RAC. It has four InfiniBand network switches and it has 14 of these Oracle Exadata Storage Servers that we talked about earlier. So in a single unit you have everything you need, ready to load up your data and start running your business queries right away.
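The per-rack totals follow from simple arithmetic on the figures quoted in this conversation (14 storage servers per rack, each with two quad-core processors, 12 disks, and 8 gigabytes of memory). The sketch below only multiplies those quoted numbers; the variable names are illustrative, and the eight database servers are left out because their specifications are not given here.

```python
# Figures quoted in the transcript for one Oracle Exadata Storage Server.
PROCESSORS_PER_STORAGE_SERVER = 2
CORES_PER_PROCESSOR = 4
DISKS_PER_STORAGE_SERVER = 12
MEMORY_GB_PER_STORAGE_SERVER = 8

# Figure quoted for one HP Oracle Database Machine rack.
STORAGE_SERVERS_PER_RACK = 14

cores_per_storage_server = PROCESSORS_PER_STORAGE_SERVER * CORES_PER_PROCESSOR
storage_cores_per_rack = cores_per_storage_server * STORAGE_SERVERS_PER_RACK
storage_disks_per_rack = DISKS_PER_STORAGE_SERVER * STORAGE_SERVERS_PER_RACK
storage_memory_gb_per_rack = MEMORY_GB_PER_STORAGE_SERVER * STORAGE_SERVERS_PER_RACK

print("cores per storage server:", cores_per_storage_server)
print("storage-layer cores per rack:", storage_cores_per_rack)
print("storage-layer disks per rack:", storage_disks_per_rack)
print("storage-layer memory per rack (GB):", storage_memory_gb_per_rack)
```

On those figures, a single rack puts 112 processor cores and 168 disks at the storage layer, before counting the database servers.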

Gardner: Tell us a little bit, Rich, about this 42-slot rack configuration and why it’s right for the market now?

Palmer: Well, so if you look at the market in data warehousing, the appliance type of delivery is a much simpler deployment of hardware and software configurations. That is emerging as a high-growth area in data warehousing. So with this market trend that’s going on, HP and Oracle have been able to come together and put everything a customer needs in one box. We put it at the customer’s site, and that’s on a global basis.

If you look at HP, one of the strengths that HP brings to this relationship is our ability to distribute and deliver globally. We build all of these database servers or database machines in regions around the globe. They are not just built here in the United States; they are built in the United States, they are built in Singapore, they are built in Scotland, and then they are delivered to those regions on a worldwide basis.

So this ability of HP to build the product from the ground up to an exact specification, deliver to the customer, install at the customer's site, and then have Oracle come in and tune the software to make sure it's optimally configured -- that is a no-lose environment. We have the ability here to deliver an appliance-like stack of hardware, put the right software set on that hardware, and target a customer's need for simplicity, high performance, and data reliability -- all in one box.

Gardner: Okay, we've described the marketplace need, the size of data pushing the envelope. Now we are re-architecting to adjust to that. We've described the subset, which is the Exadata Server, and then the configuration, which is the racked Machine. Now, what kind of organizations are going to be interested in having the forklift upgrade to this, bring it right in, drop it in, pre-configured, optimized, and what are they going to do with it? Is this for business intelligence (BI), is this for simply managing scale? What are the speeds that this now provides going to do for companies to improve, or to change, how they do business?

Hardie: The organizations that are going to be interested in Oracle Exadata Storage Server and the HP Oracle Database Machine are those primarily interested in large data warehouses. And by large data warehouses we're talking terabytes, petabytes, and beyond. Now if you look at the organizations that are typically dependent on very large data warehouses, it's the organizations that Rich mentioned earlier. The telcos are an obvious one, with call data records. Retail organizations are very much dependent on analyzing point-of-sale (POS) transactions. You look at other organizations like trading systems: massive amounts of transactions flow through these systems on a daily basis.

Gardner: Especially these days.

Hardie: Absolutely. It is really important to understand what's going on with these transactions, and to make informed business decisions. The beauty of this is you have completely scalable infrastructure from a storage point of view. But more importantly, you've got completely scalable infrastructure from a query performance point of view. As you store more call data records into these systems, more POS transactions, more stock transactions into these systems, you're not going to deteriorate your query performance at all. The more hardware, the more storage servers you put into these systems, the better your performance is going to be.

Gardner: Now that I have this capability to bang on this thing, so to speak, in more ways without degrading performance, in what ways do you expect these companies to actually "bang" on this? Is this going to provide a new and higher level of business intelligence querying? Is this going to provide higher-order analytics? Are there going to be more business applications that can derive near real-time data and analytics from this? All of the above? What's the qualitative payback?

Hardie: There is definitely an element of "all of the above." Let me give you some of the examples of some of the queries that customers have actually been experiencing using the Oracle Exadata Storage Server. This probably fits into the context pretty well. You have organizations out there, retail organizations, telcos, for example. You know, some of the queries they are running are literally running for over half an hour. In some cases it is hours.

Moving to this new architecture is bringing down these execution times. In one particular example, a query that was running for over 30 minutes is now running in under 30 seconds. It's that scale of improvement. Now when you can sit at your terminal, your laptop, or your mobile device and then kick off a query and get an answer within seconds -- then you're going to do more of these. If you know that when you kick off a query it is going to take 30 seconds to return, you're going to pick more times when you choose to kick that off. You don't have to worry about timing that anymore. You can just ask queries when you like, and expect to get a quick answer.

Palmer: Willie, I think you are absolutely right. The ability to capture business information has accelerated so much because of this technology. There are customers that cannot access data records beyond a certain time period simply because of the massive size of those data records, or because of how long a query would take to access a historical group of data. That all goes away now.

Now you have the ability. Historically you might have been able to look at the last week's worth of retail records, or medical records. Now you have the ability to go and look at years and years of data in the same timeframe that you were looking at weeks of data, and query a much bigger dataset, because of this architecture. That's a big business value, because now I can trend my business in a much more effective way. I'm putting more productivity tools in the hands of the user, so that they can actually turn data queries and business intelligence back into a fundamental element of growing their business and being more competitive in their markets.

Gardner: I imagine this will also compel companies to put even more data and information into these warehouses, because they are not going to degrade the performance of these essential queries. They are also going to be able to do more types of queries. And, again, we're improving the quality and breadth of the data types, but still getting even better performance. So it's sort of a qualitative improvement on many different dimensions.

Hardie: It's a qualitative improvement, and it's a quantitative one. I mean, you're absolutely right. Organizations today are more and more dependent on faster access to better information. It's just as simple as that.

Gardner: We've talked about the types of organizations that will use this now in its current configuration. I expect this re-architecting of the database and the storage will also move down market a bit. What possible other use-case scenarios do you envision for leveraging this technology beyond the high end of the market, into other areas of the market?

Palmer: If you look at some of the growing and emerging markets today, just think of cloud computing and all of the massive amounts of data that we're storing in other locations on the Internet, or through a paid service, and the massive amounts of storage that's being deployed for those types of applications. That's not going to slow down at all. This allows us through the Database Machine to go in and drop in a configured environment for that workload, specifically dedicated to a workload.

You can now scale this product by connecting multiple racks together. You can also scale just the storage component, if the processing side of the database environment is sufficient. You can just scale the storage nodes, so it is a scalable grid architecture that can grow on the fly. Cloud computing is a very good example, where we really don't know what the upper limit of that storage is going to be. So deploy a configuration, say, on an HP Oracle Database Machine and then grow it as your needs grow. This is one application where we know this is going to succeed.

Gardner: Willie, we're also aware that organizations will just want the Oracle Exadata Storage Server. They might have their own environments, their own preference for configuring what's available to them, and what would become available to them in the future.

Hardie: Any organization that wants to run their data warehouse on the Oracle Exadata Storage Server -- all they have to do is buy the Oracle Exadata Storage Server. It's just as simple as that. Oracle and HP have long given customers a choice of configurable options. So if a customer feels that something like the HP Oracle Database Machine is not the right fit for their organization, if it does not fit the standard needs for their organization, then they have the option of buying the individual components: the Oracle Exadata Storage Server and the InfiniBand connectors, connecting to the database servers. They have that option.

Gardner: Looking at this again through how to get started, where do organizations go? Now that this is available immediately, both of these configurations, are sales happening through both HP and Oracle?

Palmer: It's a cooperative effort, but Oracle is leading the sales process. So the Oracle sales representatives on a global basis are leading this process, and HP is certainly as their partner going to join with them and make sure that the customer receives the best from both companies.

Gardner: HP is going to service the hardware, but the support comes through Oracle, is that correct?

Hardie: If you want to buy an Oracle Exadata Storage Server, Oracle is your first point of contact. So talk to your local Oracle sales representatives. If you do decide to buy one, and you want to resolve a support issue, you call Oracle, and Oracle will bring in HP as and when required to resolve any issues.

Gardner: To sum up a little bit, for those folks who perhaps are a few steps removed from the IT department, who are doing queries, or using business applications, what's the big take away for them? What about this announcement is going to change their world?

Hardie: For these types of users you just mentioned, a little bit or a couple of steps removed from the IT department ... To be quite honest, they don't really care what their systems run on. What they are interested in is getting fast answers to their business queries. It's just as simple as that. So when these business users know that they can get instantaneous response times, that they can get real extreme performance from their data warehouse or their business intelligence applications -- that's what's going to make a big difference for them.

Gardner: Rich, at HP, let me flip the question to you. For those people inside the IT department, who want to come in Monday morning without big headaches, what does this new configuration and architectural approach mean for them?

Palmer: Simplicity, higher performance, the ability to increase their service level agreements (SLAs) with their customers in the warehousing world. This is a solution built on industry standard hardware, with Oracle software that is just well accepted in the industry as an enterprise software leader. The IT departments are very comfortable with both of those facts. They're very comfortable with HP; they're very comfortable with Oracle. Putting the two together is a natural event for any IT manager.

Gardner: We've been talking about a large and impactful announcement here at Oracle OpenWorld, the introduction of the Oracle Exadata Storage Server -- the first hardware product from Oracle. Isn't that right?

Hardie: Absolutely.

Gardner: We've also looked at the configuration of those Exadata servers into the HP Oracle Database Machine, which is in effect a data warehouse appliance. Joining us to help explain this, we have been happy to have Rich Palmer, director of technology and strategy in the industry standard servers group at HP. And also Willie Hardie, vice president of Oracle database product marketing. Thanks to you both.

Hardie: Thank you, Dana.

Palmer: Thank you very much, Dana.

Gardner: Our conversation comes to you today through a sponsored HP Live! Podcast from the Oracle OpenWorld Conference in San Francisco. Look for other podcasts from this HP Live! event series at hp.com, as well as via the BriefingsDirect Network. I'd like to thank our producers on today's show, Fred Bals and Kate Whalen.

I am Dana Gardner, principal analyst at Interarbor Solutions. Thanks for listening, and come back next time for more in-depth podcasts on enterprise IT topics and strategies. Bye for now.

Transcript of BriefingsDirect podcast recorded at the Oracle OpenWorld Conference in San Francisco. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.

Wednesday, September 17, 2008

iTKO's John Michelsen Explains Roles and Methods for SOA Validation Across Complex Integration Lifecycles

Transcript of BriefingsDirect podcast with iTKO's John Michelsen on SOA testing and virtualization market trends.

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: iTKO.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect. Today, a sponsored podcast discussion about integration, validation, and testing for services-oriented architecture (SOA) and middleware -- particularly for business process management and to extend business processes more efficiently.

We’re going to be looking at how integration nowadays is really across multiple dimensions. We are talking about integrating technology, about various formats, and for extending frameworks, vendors, application sets, and specific application suites. There are also now enterprise service buses (ESBs) that are creating multiple types of integration across services -- from different hosting locations and from different technologies.

Not only that, we’re also dealing with traditional enterprise application integration (EAI) issues and middleware. And, of course, there’s more talk about cloud computing and software as a service (SaaS).

The whole notion of integration in the enterprise has exploded in terms of complexity -- but that puts more onus and importance on validation, testing and understanding what’s actually going on within these integration activities.

To help us understand more about integration, middleware and SOA validation and testing, we’re joined by John Michelsen, chief architect and founder of iTKO. Welcome to the show, John.

John Michelsen: Thanks, Dana, good to be here.

Gardner: We’ve talked several times in the past about the integration in SOA, and what’s been going on. How do you look at integration now among business applications and middleware? Is it, in fact, more onerous and complex than ever, and how would you characterize the current state of the market?

Michelsen: It really is, and it’s for a number of reasons. Most of us can surmise that, as soon as we look at it. We tend not to turn anything off. Existing systems don’t go away, and yet we bring in additional [IT] systems and new things all the time. We’re changing technologies, because we're always looking for the faster, cheaper, more effective way. That's great, and yet today, IT becomes legacy faster than before. In fact, you and I had a conversation a few weeks ago about that.

So, it gets more complex over time. And yet, to get real value out of IT you’ve got to think not from the perspective of these systems, but from the business’s processes, as they need to function. We have to do what we can, considering unreasonable gyrations in the systems, in order to make them reflect the way the business operates.

So there is a real mismatch here, and in order for us to accomplish value for the business, we’ve got to solve for it.

Gardner: Of course, at the same time, IT organizations are under pressure to reduce their complexity, reduce their maintenance and total cost of ownership (TCO). They’re dealing with long-term activities such as datacenter consolidation and application modernization. What is it that brings testing and validation into this mixture, in terms of end-to-end visibility?

Michelsen: Let’s say three or four systems are already interoperating in some way, and now you’ve become a part of a larger organization. You’ve merged into a large organization, or you’ve taken into your organization something you've acquired. You add another three or four end points, and now you’ve got this explosion of additional permutations. The interactions are so many that without good testing and validation, there’s just almost no hope of getting real visibility, and predictability out of these systems.
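The "explosion" can be made concrete with a back-of-the-envelope count. The sketch below assumes, as a simplification that is not stated in the conversation, that any system may interact with any other and counts only pairs of systems; real integrations also involve ordering, data variations, and versions, so this is a lower bound on the test surface. The function name is illustrative.

```python
def pairwise_interactions(endpoints):
    """Number of distinct pairs among the given number of systems: n * (n - 1) / 2."""
    return endpoints * (endpoints - 1) // 2

before = pairwise_interactions(4)   # four systems already interoperating
after = pairwise_interactions(8)    # four more end points added after a merger

print("system pairs before:", before)
print("system pairs after:", after)
```

Doubling the number of systems from four to eight takes the count from 6 pairs to 28, which is why the number of interactions to validate grows much faster than the number of systems.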

When things do fail, which unfortunately happens, you’ll have an extremely long recovery time without this test and validation capability, because knowing that something broke somewhere is the best you can do.

Gardner: I suppose we’re also looking now more at the lifecycle of these applications based on what’s going on at design time. Folks who are using agile development principles and faster iterations of development are throwing services up fairly quickly -- and then changing them on a fairly regular basis. That also throws a monkey wrench into how that impacts the rest of the services that are being integrated.

Michelsen: That’s right, and we’re doing that on purpose. We like the fact that we’re changing systems more frequently. We’re not doing that because we want chaos. We’re doing it because it’s helping the businesses get to market faster, achieving regulatory compliance faster, and all of those good things. We like the fact that we’re changing, and that we have more tightly componentized the architecture. We’re not changing huge applications, but we’re just changing pieces of applications -- all good things.

Yet, if my application is dependent upon your application, Dana, and you change it out from under me, your lifecycle impacts mine, and we have a “testable event,” even though I’m not in a test mode at the moment. What are we going to do about this? We've got to rethink the way that we do services lifecycles, we've got to rethink the way we do integration and deployment.
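One way to picture a "testable event" is as a walk over a dependency map: when one service changes, every service that depends on it, directly or indirectly, becomes a candidate for re-validation. The sketch below is a generic illustration with hypothetical service names and a hypothetical `needs_revalidation` helper; it is not a description of iTKO's product.

```python
from collections import deque

# Hypothetical map: each service and the services it depends on.
DEPENDS_ON = {
    "billing": ["customer-lookup"],
    "order-entry": ["billing", "inventory"],
    "reporting": ["order-entry"],
    "inventory": [],
    "customer-lookup": [],
}

def needs_revalidation(changed):
    """Return every service that directly or indirectly depends on the changed one."""
    # Invert the map so we can look up the consumers of each service.
    consumers = {}
    for service, dependencies in DEPENDS_ON.items():
        for dependency in dependencies:
            consumers.setdefault(dependency, []).append(service)

    # Breadth-first walk outward from the changed service.
    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in impacted:
                impacted.add(consumer)
                queue.append(consumer)
    return sorted(impacted)

print(needs_revalidation("customer-lookup"))
print(needs_revalidation("reporting"))
```

In this made-up map, a change to the customer lookup service touches billing, order entry, and reporting, even though none of their own code changed.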

Gardner: There is, of course, a very high penalty if you don’t do this properly. If you don’t have that visibility, you lose agility, and the business outcomes suffer.

Michelsen: That’s right. And too often, we see customers where they’re in this dynamic of these highly interconnected systems. That frequency of change and the amount of failure that’s occurring because of those changes are actually having such a negative effect that they’re artificially reducing their pace of change -- which is, of course, not the goal for the business -- in order to try to accomplish some level of stability.

This means that we’ve gone through all this effort to provide this highly adaptable and agile platform and we’re doing all this work to get agile and integrated, but we have to then undo the benefit in order to accomplish stability.

Gardner: One of the basic principles of SOA is that you get benefit as a result of the “whole being greater than the sum of the parts,” but many of the parts come from specific vendors and/or open-source projects. They have management capabilities and insights built into them specifically. Yet when you rise up a bit more holistically, that’s where the issue comes in of how to get visibility across multiple systems.

Explain to us how you got started on this journey, and where your background and history come in, in terms of addressing that higher abstraction of visibility.

Michelsen: Right, that’s a good point, because if the world were as simple as we wanted it to be, we could have one vendor produce that system that is completely self-contained, self-managed, very visible or very "monitorable," if you will. That’s great, but that becomes one box of the dozens on the white board. The challenge is that not every box comes from that same vendor.

So we end up in this challenge where we’ve got to get that same kind of visibility and monitoring management across all of the boxes. Yet that’s not something that you just buy and that you get out of the box.

This is exactly what pushed me into this phase throughout the 1990s. Prior to founding this one, I had a company that built mission-critical applications for lots of large companies, including some airlines and financial services companies, logistics, even database engines, and things like this.

The great thing was that I was able to put my little team together to build really cool stuff and deploy it really fast into an organization. They loved it. The challenge was that I was doing this in a very disruptive way to the rest of the IT organization. I'd come, bring in this new capability, and integrate it into the rest of the applications.

Well, in doing so, I’m actually causing this very same dynamic that we’re talking about now -- where all of a sudden my new thing, my new technology, integrated into a bunch of legacy, is causing disruption across all kinds of systems. We just didn’t have a sense for how to do this.

So I had to learn how to do this, how to transform these organizations into integration-based thinking, and put in test-and-validation best practices. That’s what caused us to end up building what we now call LISA.

Gardner: Unfortunately, a lot of organizations, when they face that disruption, their first instinct is probably just to put up a wall and say, “Okay, let’s sequester or isolate this set of issues.” But that, of course, aborts this business process level of innovation and value.

Michelsen: Exactly, and here's a classic example. A number of the types of systems that we built in the late 1990s were the e-commerce applications that were customer facing. The companies said, “I just don’t want to hear that this system can’t talk to that system. I want a Web-based presence that’s brain-dead simple, and that does things the way a customer wants to be able to do them. You’re going to interconnect all those back ends in order to get that to work. … You just do it for me. And if you won’t do it, I’m going to go find a vendor outside that will.”

The challenge is, no matter how it ends up there, now we've got to reckon with it. Frankly, even though those are sometimes difficult conversations the business is having with IT, the business needs those things, because the company that does it gains market share and increases the scope of their growth cycle. That obviously is something that every IT organization wants, because that leads to a bigger budget and a better company, and the success that we want to see.

Gardner: Now, we've certainly established that there is a problem, and that’s been evident for some time. We’ve underscored the fact that we want to get visibility, and offer new elements into an integrated environment, to take advantage of the technologies that are coming online, but not be in disruptive mode, or we certainly want to reduce the risk.

So we know there’s a problem, we know what we want to do. Now, how do you approach this technically, when you’re dealing with so many different vendors, so many variables?

Michelsen: Well, I’m the founder of a product company, and yet you don’t start by going and buying some software, installing it, and thinking you’re done. Let’s start with thinking around a new set of best practices for what this needs to look like. We frequently leverage a framework we call "the 3Cs" in order to accomplish this -- Complete, Collaborative and Continuous.

In a nutshell, we’ve got to be able to touch, from the testing point of view, all these different technologies. We have to be able to create some collaboration across all these teams, and then we have to do continuous validation of these business processes over time, even when we are not in lifecycles.

It’s a very high, broad-stroked approach to our solutions, but essentially, drilling down to the detail with the customer, we can show them how these 3 Cs establish that predictable, highly efficient, lots-of-visibility way to do these kinds of applications.

Gardner: There must be secret sauce? There must be technology in addition to the vision and methodological approach?

Michelsen: Right. Getting that testability across all these technologies, collaboration among all the teams and, of course, continuous validation takes tooling and technology. Of course, we provide that, which is great. I personally like it, just as, from a professional point of view, I like the fact that the way we message to the market is: "These are the ways you’ve got to go about doing it." Once you see that that is an appropriate approach for you -- then you become a great candidate for using our products.

But let’s talk about making sure that this is right for you. Then we’ll talk about our product being useful, because that really is the way things should work. I can’t tell you how many times I’ve seen a customer who has said, “Well, we've run out and bought this ESB and now we’re trying to figure out how to use it.” I've said, “Whoa! You first should have figured out you needed it, and in what ways you would use it that would cause you to then buy it.”

It’s the other way around sometimes. That’s why we’ll start with the best practices, even though we’re not a large services firm. Then, we’ll come in with product, as we see the approach get defined.

Gardner: Are there any specific types of enterprise companies -- whether in a particular maturity around IT or suffering from certain ills or ailments -- that pique your interest to say, “Well, this is a perfect candidate for our solution and product set?” What are some of the indicators that a company is ready for this level of validation and testing?

Michelsen: There are a couple. First, the large-scale, top-down SOA initiatives clearly need this, because this is the perfect example of … interconnecting things, wrapping legacy systems in modernization, creating business-process modeling environments, increasing the pace of change, and distributed development across many different teams. SOA does all of those things for you, and certainly scratches every one of those itches that we’ve been talking about.

The other is when you go into a large integration initiative. There are a lot of partner solutions -- from companies like TIBCO, WebMethods, Oracle Fusion and SAP NetWeaver, and forgive me for not naming all of our friends. When you’re going down this kind of path, you’re going down a path to interconnect your systems in these same kinds of ways. Call it service orientation or call it a large integration effort; either way, the outcome from a systems point of view is the same.

Then, traditionally, by the time a business has been large for many years, they just have this enormous amount of technology. A classic example is a large financial institution that does fixed-asset trades. In order for one trade to be placed, it takes Web services and EJBs, from a Java Swing-based application into CORBA, into messaging, into C code, into two different databases, and out the other end of a Web application.

All of that technology, integrated together, is what the business thinks of as the app. Of course, that takes hundreds of people across many different teams -- U.S., Europe and Asia -- from an IT point of view. But, all of that technology together is the app. So that’s your reality. That’s where we really can sit and where these best practices really get to work.

Gardner: So when you enter these organizations where there’s a pretty powerful need, what is it that they’re getting in terms of value and impact? How do they use these tools? Then, we’ll try to ask a little bit about validation examples of what the outcomes have been.

Michelsen: What they’re doing is adopting these best practices on a team level so that each of these individual components is getting their own tests and validation. That helps them establish some visibility and predictability. It’s just good, old-fashioned automated test coverage at the component level.

As these components start to orchestrate with each other in order to accomplish this higher-level objective -- where this component becomes a part of a larger solution -- then there’s a validation aspect to it. The application that is causing this component-to-component orchestration has a validation challenge to make sure that things continue to work over time, even in the face of change.

As these components come together, there’s a validation layer that’s put in place. At iTKO, we even have a virtualization capability that allows you to do these kinds of things in a very agile way and without some of the constraints that you typically have. At the very end of the process, we are near the glass, if you will, of the user screen. Then you’ve got business-process level validation or testing across the whole thing. So think of it as, “Here’s a business process model that I’ve modeled in a business process modeling (BPM) tool of choice."

The complement of that is one or more tests or validations of that particular business process, where I invoke the process and verify my technical outcomes. So if placing an order means to do this, this, and this in these systems, you do that with a BPM tool. To validate that the business process functions as expected, you’ll invoke that business process with our product LISA and then make sure all of those expected outcomes occurred.

For example, the customer database is going to have an update in it, and the order management system is going to be creating a new order. The account activity system -- which might be completely independent -- the inventory system, or the shipping system, all of these things are going to have to have their expected outcomes verified in order for us to know that this system works as expected.
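
The pattern described here, invoking the process once and then checking each participating system for its expected outcome, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and does not show LISA's actual interface: the in-memory `FakeBackends` class, both order functions, and `validate_order_process` are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBackends:
    """In-memory stand-ins for the real systems (all names are hypothetical)."""
    customers: dict = field(default_factory=dict)
    orders: list = field(default_factory=list)
    inventory: dict = field(default_factory=lambda: {"laptop": 5})


def place_order(backends, customer, item):
    """The business process under test; it should touch three systems."""
    backends.customers[customer] = {"last_item": item}
    backends.orders.append({"customer": customer, "item": item})
    backends.inventory[item] -= 1
    return len(backends.orders)  # order number shown to the user


def broken_place_order(backends, customer, item):
    """Same process with a defect: an order number comes back, but no order is stored."""
    backends.customers[customer] = {"last_item": item}
    backends.inventory[item] -= 1
    return len(backends.orders) + 1


def validate_order_process(process, backends):
    """Invoke the process once, then check each system for its expected outcome."""
    stock_before = backends.inventory["laptop"]
    order_number = process(backends, "dana", "laptop")

    failures = []
    if not order_number:
        failures.append("no order number returned")
    if backends.customers.get("dana", {}).get("last_item") != "laptop":
        failures.append("customer database not updated")
    if not any(o["customer"] == "dana" for o in backends.orders):
        failures.append("order management has no new order")
    if backends.inventory["laptop"] != stock_before - 1:
        failures.append("inventory not decremented")
    return failures


print(validate_order_process(place_order, FakeBackends()))         # []
print(validate_order_process(broken_place_order, FakeBackends()))  # ['order management has no new order']
```

The second call shows why checking only the user-visible result is not enough: the defective process still returns an order number, and only the check against the order system reveals the problem.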

Gardner: This really sounds like a metaview of the integration, paths, occurrences, and requirements. It almost sounds as if you’re moving to what we used to refer to, and still do, as application lifecycle management (ALM). But, it sounds like you’re really approaching this additionally as “integration lifecycle management.”

Michelsen: That’s a great point. In fact, we’ve heard people say, “Wow, it sounds a little bit like also business activity monitoring (BAM), where you’re basically chasing all these transactions through the production system and making sure they are doing their thing.” Certainly, it's a valid point. But let’s be really clear. We must be capable of doing this as a part of our development cycles.

We can’t build stuff, throw it over the wall into the production system to see if it works, and then have a BAM-type tool tell us -- once it gets into the statistics -- "By the way, they’re not actually catching orders. You’re not actually updating inventory or your account. Your customer accounts aren’t actually seeing an increase in their credit balance when orders are being placed."

That’s not when you find out it doesn’t work, right? And the challenge is that’s what we do today. We largely complete these applications. We go into some user-acceptance test mode, where we have people see if they can find any problems with this enormous amount of software, millions of lines of code. We give them a few weeks to see if they can find any bugs, and then we go to production.

We really can’t let that happen any more. These apps are too big, their connections are too many and the numbers of possible testable items are way too great. And, of course, tomorrow we invalidate all the work we just did in that human labor, when something changes somewhere.

So this is why, as a part of lifecycles, we have to do this kind of activity. In doing so, we drive into value, we get something for having done our work.

Gardner: Clearly from my observations, there’s a struggle now under way in the market to find better ways of relating, finding the relationship and dependencies between the design time activities and the run time activities -- and then creating more of a virtual feedback set of loops that allow for this to continue without that handing off; or waiting for the red light/green light value. Tell me how you think LISA provides a bridge, or maybe a catalyst, to increased feedback values between design time and run time, particularly in an SOA environment.

Michelsen: Great question, and I’m glad that you’re seeing that as well, Dana, because we think that it's an indication that things are maturing. When we see our customers asking us, “How do I essentially do that second C of yours, collaboration? How do I better collaborate?”… we know that they’re finally seeing the pain between a silo-based lifecycle, and testing and operations being a disjointed activity. Development and test don’t talk to each other, or with project management. And the business analysts don’t really even know each other.

We know that when we’re hearing questions around collaboration, people are becoming aware that they really need to accomplish it. This is great. One specific way our products can help is by first being a test capability that every one of the teams I just mentioned can use to do their own part of the testing effort. Developers have a test responsibility. Certainly, quality assurance (QA) has one. Operations even has one, from a functional monitoring point of view.

The business analysts have this whole "validate the business process" activity they need to accomplish. Everyone has their part to play, and if we can provide a tool that helps all of them do their part with the same product, there’s an enormous amount of efficiency. More important, there’s a much more highly automated back channel through this lifecycle.

If a business process is not functioning as expected, that failing test case is consumable all the way back to that individual developer who can see the context in which my component is being exercised. [And that comes from seeing] the input and output, seeing the expected outcome, and seeing the unexpected actual outcome. Then I get a really good awareness of what my component is supposed to do in the context of the business process.

When we have this common tooling across the board -- instead of one way of doing it for development, one way of doing it for QA, one way for the business analyst and for operations and everything -- we get much greater collaboration.

One other important point here is that we also have an opportunity to introduce this continuous validation framework, where once we start these integration labs, those components are being delivered into that integration lab, and then into pre-production, performance labs and production. We need an infrastructure for all of this continuous validation that properly notifies whoever should be notified when failures occur.

So our application has lots of good technology for being able to do this as well.
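
As a rough illustration of continuous validation with notification, the Python sketch below runs a set of checks once and routes each failure to the team that owns the stage where it occurred. It is an assumption-based sketch, not iTKO's implementation: the `Check` class, the `OWNERS` routing table, and `validation_pass` are hypothetical names, and a real setup would call the pass on a schedule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    stage: str               # e.g. "integration", "pre-production", "production"
    run: Callable[[], bool]  # returns True when the outcome is as expected


# Hypothetical routing table: which team hears about failures in each stage.
OWNERS = {
    "integration": "development",
    "pre-production": "qa",
    "production": "operations",
}


def validation_pass(checks, notify):
    """Run every check once and notify the owning team for each failure."""
    failures = 0
    for check in checks:
        try:
            ok = check.run()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failures += 1
            team = OWNERS.get(check.stage, "operations")
            notify(team, f"{check.name} failed in {check.stage}")
    return failures


checks = [
    Check("place order", "integration", lambda: True),
    Check("verify shipping", "production", lambda: False),
]
sent = []
validation_pass(checks, lambda team, message: sent.append((team, message)))
print(sent)  # [('operations', 'verify shipping failed in production')]
```

Passing `notify` in as a function keeps the sketch independent of any particular alerting channel; email, paging, or a ticket queue could all sit behind it.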

Gardner: Well, of course, the proof of the pudding is in the eating. Can you give us some examples of organizations that have employed these methods, and then some of these tools? Start to think in terms of the 3 Cs that you’ve outlined. What sorts of results or paybacks are there in terms of return on investment (ROI) and TCO? What validates this?

Michelsen: A great example of this would be Lenovo, the ThinkPad guys, where they went through a major next generation of all of their customer- and partner-facing order management systems. This is www.lenovo.com, and a number of the systems behind it. They went with a new vendor to bring in a new application and interconnected it into all the existing back-end and legacy systems. It's a classic example, as I said a few minutes ago, of when this kind of activity becomes important.

Lenovo realized from their past experiences that they wanted to get better at doing this kind of activity because they didn’t want a repeat of what had happened to them in the past, where application failures underneath the screens would cause the customer experience to degrade -- but you couldn’t even tell at the website.

They were not capturing the order, even though an order number was showing up on the Web page, and things like this. They realized this challenge was too great for them, and they brought our solution in, in order to validate all these individual components and then validate at the user’s business-process level.

They wanted to validate what it means to configure ThinkPads, to price them, to do all of the bundling, to make sure that I can place orders, check orders, verify shipping, and do all these different things. That takes a pretty significant amount of visibility. Of course, our product has some capability to give you that visibility, because you’re going to need it.

So you have this kind of capability, and Lenovo was able to move away from, "I hope this thing continues to run." What was very possible in the past was that the customer update occurred, but the order placement didn’t -- a partial commit.

Instead of having that reality, they now have a reality on a literally continuous basis. From seven different places all over the world, we’re continuously validating the performance and functional integrity of the entire system -- both at the component level and at what I call this orchestration level.

In doing so, they have a whole lot more confidence that the thing actually performs the way they expect it to.

Gardner: There’s no question, John, that the organizations that are advancing, that are deeply into integration issues, are looking for this business process management value, at the orchestration level.

They've moved an abstraction up in terms of the approaches, and the accomplishments of what their IT departments and systems can deliver. But, of course, any time we move up an abstraction technologically into the functions of IT, that requires that the company go up a level in validation, testing and quality.

It makes sense now that you’re going to see a growing market. Is there any sense that you can give us from your business as to how these things are growing now? Are people really getting to that level where they want to bring together a lifecycle approach?

Michelsen: Well, hopefully the Lenovo example means, yes. By the way, a company partner of ours named i2 -- they see this. We all know there’s an amazing amount of effort to do large-scale implementations of either packaged applications or large-scale custom applications. I think we’ve done this long enough to realize that this has to be part of the way to do it.

I’m seeing that more and more. As a consequence, we are able to provide value to many customers. It’s just been thrilling. We brought our product to market in early 2003 with a single customer or two. If our growth rate is an indication -- as an IT discipline – the market has finally realized that we have to get this right, which is terrific. If you think about it, the evangelist in all of us wants to get this right, or wants to do the right thing. I’m seeing it more and more, and that’s certainly terrific.

Gardner: Great. Well, we've been discussing the issues around integration, middleware, and SOA, as well as the need to abstract value up to the integrations and into the business processes. We have talked about how these elements relate to one another, and, of course, explain the need for greater visibility, validation and testing in these environments.

We’ve been talking about LISA and iTKO with John Michelsen, the chief architect and founder of iTKO. I appreciate your input, and we look forward to learning more about how this market evolves. It is an exciting time.

Michelsen: Thanks a lot, Dana. I appreciate the time.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You've been listening to a sponsored BriefingsDirect podcast. Thanks for listening, and come back next time.

Sponsor: iTKO.

Transcript of BriefingsDirect podcast with iTKO's John Michelsen on validation and testing in application integrations. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.