Monday, January 18, 2010

Technical and Economic Incentives Mount Around Seeking Alternatives to Mainframe Applications

Transcript of the third in a series of sponsored BriefingsDirect podcasts on the rationale and strategies for application transformation.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Learn more. Sponsor: Hewlett-Packard.


Gain more insights into "Application Transformation: Getting to the Bottom Line" via a series of HP virtual conferences. For more on Application Transformation, and to get real time answers to your questions, register to access the virtual conferences for your region:

Access the Asia Pacific event.
Access the EMEA event.
Access the Americas event.


Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on why it's time to exploit alternatives to mainframe computing applications and systems. As enterprises seek to cut their total IT costs, they need to examine alternatives to hard to change and manage legacy systems. There are a growing number of technical and economic incentives for modernizing and transforming applications and the data center infrastructure that support them.

Today, we'll examine some case studies that demonstrate how costs can be cut significantly, while productivity and agility are boosted by replacing aging systems with newer, more efficient standards-based architectures.

This podcast is the third and final episode in a series that examines "Application Transformation: Getting to the Bottom Line." The podcast, incidentally, runs in conjunction with a series of Hewlett-Packard (HP) webinars and virtual conferences on the same subject.

Here with us now to examine alternatives to mainframe computing, is John Pickett, Worldwide Mainframe Modernization Program manager at HP. Hello, John.

John Pickett: Hey, Dana. How are you?

Gardner: Good, thanks. We're also joined by Les Wilson, America's Mainframe Modernization director at HP. Welcome to the show, Les.

Les Wilson: Thank you very much, Dana. Hello.

Gardner: And, we're also joined by Paul Evans, Worldwide Marketing Lead on Applications Transformation at HP. Welcome back, Paul.

Paul Evans: Hello, Dana.

Gardner: Paul, let me start with you, if you don't mind. We hear an awful lot about legacy modernization, and we usually look at it from a technical perspective. But it appears to me, in many of the discussions I have with organizations, that they are looking for more strategic levels of benefit -- business agility and flexibility. The technical and economic considerations, while important in the short term, pale in comparison to some of the longer term and more strategic benefits.

Pushed to the top

Evans: Where we find ourselves now -- and it has been brought on by the economic situation -- is that an issue that's been out there for a long time has been pushed to the top. We have seen organizations doing a lot with their infrastructure -- consolidating it, virtualizing it, all the right things. At the same time, we know, and a lot of CIOs or IT directors listening to this broadcast will know, that the legacy applications environment has been somewhat ignored.

Now, with the pressure on cost, people are saying, "We've got to do something." But what can come out of that, and what is coming out of that? People are looking at this and saying, "We need to accomplish two things. We need a longer term strategy, and we need an operational plan that fits into that, supported by our annual budget."

Foremost is this desire to get away from this ridiculous backlog of application changes, to get more agility into the system, and to get these core applications -- the ones that provide the differentiation and the innovation for organizations -- able to communicate with a far more mobile workforce.

At an event last week in America, a customer came up to me and said, "Seventy percent of our applications are batch running on a mainframe. How do I go to a line-of-business manager and say that I can connect that to a guy out there with a smartphone? How do I do that?" Today, it looks like an impossible gap to get across.

What people have to look at is where we're going strategically with our technology and our business alignment. At the same time, how can we have a short-term plan that starts delivering on some of the real benefits that people can get out there?

Gardner: In the first two parts of our series, we looked at several case studies that showed some remarkable return on investment (ROI). So, this is not just a nice-to-have strategic maturity process, but one that really pays dividends financially and then has that longer term strategic roll-out.

Evans: Absolutely. These things have got to pay for themselves. An analyst last week looked me in the face and said, "People want to get off the mainframe. They understand now that the costs associated with it are just not supportable and are not necessary."

One of the sessions you will hear in the virtual conference will be from Geoffrey Moore, where he talks about this whole difference between core applications and context -- context being applications that are there for productivity reasons, not for innovation or differentiation.

Lowest-cost platform


With a productivity application, you want to get delivery on the lowest-cost platform you possibly can. The problem is that 20 or 30 years ago, people put everything on the mainframe. They wrote it all in code. Therefore, the challenge now is: what do you not need in code that can be in a package? What do you not need on the mainframe that could be running on a much lower cost infrastructure or through a completely different means of delivery, such as software as a service (SaaS)?

The point is that there are demonstrably much less expensive ways of delivering these things. People just have to lift their heads up, look around, come and talk to us, and listen to the series. They will begin to see people who have done this before and who have demonstrated that it works, as well as some of the amazing financial rewards that can be generated from this sort of work.

Gardner: John Pickett, let's go to you. We've talked about this, but I think showing it is always more impressive. The case studies that demonstrate the real-world returns tend to be the real education points. Could you share with us some of the case studies that you will be looking at during the upcoming virtual conference and walk us through how the alternative to mainframe process works?

Pickett: Sure, Dana. As Paul indicated, it's not really just about the overall cost, but it's about agility and being able to leverage the existing skills as well.

One of the case studies that I will go over is from the National Agricultural Cooperative Federation (NACF). It's a mouthful, but take a look at the number of banks that the NACF has. It has 5,500 branches and regional offices, so essentially it's one of the largest banks in Korea.

One of the items that they were struggling with was how to overcome some of the technology and performance limitations of the platform that they had. Certainly, in the banking environment, high availability and making sure that the applications and the services are running were absolutely key.

At the same time, they also knew that the path to the future was going to be through the IT systems that they had and they were managing. What they ended up doing was modernizing their overall environment, essentially moving their core banking structure from their current mainframe environment to a system running HP-UX. It included the customer and account information. They were able to integrate that with the sales and support piece, so they had more of a 360 degree view of the customer.

We talk about reducing costs. In this particular example, they were able to save $40 million on an annual basis. That's nice, and certainly saving that much money is significant, but, at the same time, they were able to improve their system response time two- to three-fold. So, it was a better response for the users.

But, from a business perspective, they were able to reduce their time to market. For developing a new product or service, they were able to decrease that time from one month to five days.

Makes you more agile

If you are a bank and now you can produce a service much faster than your competition, that certainly makes it a lot easier and makes you a lot more agile. So, the agility is not just for the data center, it's for the business as well.

To take this story just a little bit further, they saw that in addition to the savings I just mentioned, they were able to triple the capacity of the systems in their environment. So, it's not only running faster and being able to have more capacity so you are set for the future, but you are also able to roll out business services a whole lot quicker than you were previously.

Gardner: I imagine that with many of these mainframe systems, particularly in a banking environment, they could be 15 or 20 years old. The requirements back then were dramatically different. If the world had not changed in 20 years, these systems might be up to snuff, but the world has changed dramatically. Look at the change we have seen in just the last nine months. Is that what we are facing here? We have a general set of different requirements around these types of applications.

Pickett: There are a couple of things, Dana. It's not only different requirements, but it's also being driven by a couple of different factors. Paul mentioned the cost and being able to be more efficient in today's economy. Any data center manager or CIO is challenged by that today. Given the high cost of a legacy mainframe environment, there's a significant amount of money to be saved.

Another example of what we were just talking about: if we shift to the Europe, Middle East, and Africa region, there is a very large insurance company in Spain. It ended up modernizing 14,000 million instructions per second (MIPS). Even though the applications had been developed over a number of years and decades, they were able to make the transition in a relatively short time -- a three- to six-month time frame.

With that, they saw a 2x increase in their batch performance. It's recognized as one of the largest batch re-hosts out there. And it's not just an HP thing; they worked with Oracle on that as well, to be able to drive Oracle 11g within the environment.

So, it's taking the old, but also integrating with the new. It's not a one-size-fits-all. It's identifying the right choice for the application, and the right platform for the application as well.

Gardner: So, this isn't a matter of swapping out hardware and getting a cheaper fit that way. This is looking at the entire process, the context of the applications, the extended process and architectural requirements in the future, and then looking at how to make the transition, the all important migration aspect.

Pickett: Yes. As we heard last week at a conference that both Paul and I were at, if all you're looking to do is to take your application and put it on to a newer, shinier box, then you are missing something.

Gardner: Let's go now to Les Wilson. Les, tell us a little bit about some studies that have been done and some of the newer insights into the incentives as to why the timing now for moving off of mainframes is so right.

Customer cases

Wilson: Thanks, Dana. I spend virtually every day talking directly to customers and to HP account teams on the subject of modernizing mainframes, and I'll be talking in detail about two particular customer case studies during the webinar.

Before I get into those details though, I want to preface my remarks by giving you some higher level views of what I see happening in the Americas. First of all, the team here is enjoying an unprecedented demand for our services from the customer base. It's up by a factor of 2 over 2008, and I think that some of the concepts that John and Paul have discussed around the reasons for that are very clear.

There's another point about HP's capabilities, as well, that makes us a very attractive partner for mainframe modernization solutions. Following the acquisition of EDS, we are really able to provide a one-stop shop for all of the services that any mainframe customer could require.

That includes anything from optimization of code and refactoring of code on the mainframe itself, all the way through re-hosting, migration, and transformation services. We've positioned ourselves as the definitive alternative for IBM mainframe customers.

In terms of customer situations, we've always had a very active business working with organizations in manufacturing, retail, and communications. One thing that I've perceived in the last year specifically -- it will come as no surprise to you -- is that financial institutions, and some of the largest ones in the world, are now approaching HP with questions about the commitment they have to their mainframe environments.

We're seeing a tremendous amount of interest from some of the largest banks in the United States, insurance companies, and benefits management organizations, in particular.

Second, maybe benefiting from some of the stimulus funds, a large number of government departments are approaching us as well. We've been very excited by customer interest in financial services and public sector. I just wanted to give you that by way of context.

In terms of the detailed case studies, when John Pickett first asked me to participate in the webinar, as well as in this particular recording, I was kind of struck with a plethora of choices. I thought, "Which case study should I choose that best represents some of the business that we are doing today?" So, I've picked two.

The first is a project we recently completed at a wood and paper products company. This is a worldwide concern. In this particular instance we worked with their Americas division on a re-hosting project of applications that are written in the Software AG environment. I hope that many of the listeners will be familiar with the database ADABAS and the language, Natural. These applications were written some years ago, utilizing those Software AG tools.

Demand was lowered

They had divested one of the major divisions within the company, and that meant that the demand for mainframe services was dramatically lowered. So, they chose to take the residual applications, the Software AG applications, representing about 300-350 MIPS, and migrate those in their current state, away from the mainframe, to an HP platform.

Many folks listening to this will understand that the Software AG environment can either be transformed and rewritten to run, say, in an Oracle or a Java environment, or we can maintain the customer's investment in the applications and simply migrate the ADABAS and Natural, almost as they are, from the mainframe to an alternative HP infrastructure. The latter is what we did.

By not needing to touch the mainframe code or the business rules, we were able to complete this project in a period of six months, from beginning to end. They are saving over $1 million today in avoiding the large costs associated with mainframe software, as well as maintenance and depreciation on the mainframe environment.

They're very, very pleased with the work that's being done. Indeed, we're now looking at an additional two applications in other parts of their business with the aim of re-hosting those applications as well.

The more monolithic approach to applications development and maintenance on the mainframe is a model that was probably appropriate in the days of the large conglomerates, where we saw a lot of companies trying to centralize all of that processing in large data centers. This consolidation made a lot of sense, when folks were looking for economies of scale in the mainframe world.

Today, we're seeing customers driving for that degree of agility you have just mentioned. In fact, my second case study represents that concept in spades. This is a large multinational manufacturing concern. They never allow their name to be used in these webcasts, so we will just refer to them as "a manufacturing company." They have a large number of businesses in their portfolio.

Our particular customer in this case study is the manufacturer of electronic appliances. One of the driving factors for their mainframe migration was precisely what you just said, Dana: the ability to divest themselves of the large corporate mainframe environment, where most of the processing had been done for the last 20 years.

They wanted control of their own destiny to a certain extent, and they also wanted to prepare themselves for potential investment, divestment, and acquisition, just to make sure that they were masters of their own future.

Gardner: You mentioned earlier, John, a two-times increase in demand since 2008. I wonder if this demand increase is a blip. Is this something that is just temporary, or has the economy -- some people call it the reset economy -- actually changed the game, and therefore IT needs to respond to that?

In a nutshell the question is whether this is going to be a two-year process, or are we changing the dynamic of IT and how business and IT need to come together in general?

Not a blip

Pickett: First, Dana, it's not a blip at all. We're seeing increased movement from mainframe over to HP systems, whether it's on an HP-UX platform or a Windows Server or SQL platform. Certainly, it's not a blip at all.

As a matter of fact, just within the past week, there was a survey by AFCOM, a group that represents data-center workers. It indicated that, over the next two years, 46 percent of the mainframe users said that they're considering replacing one or more of their mainframes.

Now, let that sink in -- 46 percent say they're considering replacing high-end systems over the next two years. That's a remarkably high number. It certainly points to a trend that we're seeing in that particular environment -- not a blip at all.

Dana, that also points to the skills piece. A lot of times when we talk to people in a mainframe environment, the question is, "I've got a mainframe, but what about the mainframe people that I have? They're good people, they know the process, and they have been around for a while." We've found that moving to a centralized HP environment is really a soft landing for these people.

They can use the process skills that they have developed over time. They're absolutely the best at what they do in the process environment, but it doesn’t have to be tied to the legacy platform that they have been working on for the last 10 or 20 years.

We've found that you can take those same processes and apply them to a large HP Integrity Superdome environment, or NonStop environment. We've found that there is a very strong migration for those skills and landing in a place where they can use and develop them for years to come.

Gardner: Les, why do you see this as a longer term trend, and what are the technological changes that we can expect that will make this even more enticing, that is to say, lower cost, more efficient, and higher throughput systems that allow for the agility to take place as well?

Wilson: Good question, Dana, and there are two parts to it. Let me address the first one, about the trend. I've been involved in this kind of business on and off since 1992. We have numbers going back to the late 1980s showing that, at that time, there were over 50,000 mainframes installed worldwide.

When I next got into this business in 2004, the analyst firms confirmed that the number was now around 15,000-16,000. Just this week, we have had information, confirmed by another analyst, that the number of installed mainframes is now at about 10,400. We've seen a 15-20 year trend away from the mainframe, and that will continue, given this unprecedented level of interest we are seeing right now.

You talked about technology trends. Absolutely. Five years ago, it would have been fair to say that there were still mainframe environments and applications that could not be replaced by their open-system equivalents. Today, I don't think that that's true at all.

Airline reservation system

To give you an example, HP, in cooperation with American Airlines, has just announced that we're going to be embarking on a three-year transition of all of the TPF-based airline reservation systems that HP has been providing as services to customers for 20 years.

That TPF environment will be re-engineered in its entirety over the course of the next three years to provide those same and enhanced airline reservation systems to customers on a Microsoft-HP bladed environment.

That's an unprecedented change in what was always seen as a mainframe-centric application -- airline reservations -- given the throughput and the number of transactions that need to be processed every second. When these kinds of applications can be transformed to open-systems platforms, it's open season on any mainframe application.

Furthermore, the trend in terms of open-systems price performance improvement continues at 30-35 percent per annum. You just need to look at the latest Intel processors, whether they be x86 or Itanium-based, to see that. That price performance trend is huge in the open systems market.

I've been tracking what's been going on in the IBM System Z arena, and since there are no other competitors in this market, we see nothing more than 15 percent, maybe 18 percent, per annum price performance improvement. As time goes on, HP and industry standard platforms continue, and will continue, to outpace the mainframe technology. So, this trend is bound to happen.

Gardner: Paul, we've heard quite a bit of compelling information. Tell us about the upcoming conference, and perhaps steps that folks can take to get more information or even get started as they consider their alternatives to mainframes?

Evans: Based on what you've heard from John and Les, there is clearly an interest out there in terms of understanding. I don't think this is, as they say in America, a slam dunk. The usual statement is, "How do you eat an elephant?" and the answer is, "One bite at a time."

The point here is that this is not going to happen overnight. People have to take considered opinions. Investments here are huge. The importance of legacy systems is second to none. All that means that the things that John and Les are talking about are going to happen strategically over a long time. But we have people coming to us every day saying, "Please, can you help me understand how I start? Where do I go now, next week, next month, or next year?"

The reason behind the conference was to take a sort of multi-sided view of this. One side is the business requirement, which people like Geoffrey Moore will be talking about -- where the business is going and what it needs.

We'll be looking at a customer case study from the Italian Ministry of Education, looking at how they used multiple modernization strategies to fix their needs. We'll be looking at tools we developed, so that people can understand what the code is doing. We'll be hearing from Les, John, and customers -- Barclays Bank in London -- about what they have been doing and the results they have been getting.

Then, at the very end, we'll be hearing from Dale Vecchio, vice president of Gartner research, about what he believes is really going on.

Efficiency engine

The thing that underpins this is that the business requirement several decades ago drove the introduction of the mainframe. People needed an efficiency engine for doing payroll, human resources, whatever it may be, moving data around. The mainframe was invented and was built perfectly. It was an efficiency engine.

As time has gone on, people now look to technology to become an effectiveness engine. We've seen the blending of technologies across mainframes, PCs, and midrange systems. People now take this whole efficiency thing for granted. No one runs their own payroll anymore, even to the point that people now look to business process outsourcing (BPO) providers and those sorts of things.

As we go forward, with people being highly mobile, with mobile devices dramatically exploding all over the place in terms of smartphones, Net PCs, or whatever, people are looking to blend technologies that will deliver both the efficiency and the effectiveness, but also the innovation. Technology is now the strategic asset that people will use going forward. There needs to be a technological response to that.

Earlier, either John or Les referred to the enormous amounts of raw power we can now get from, say, an Intel microprocessor. What we want to do is harness that power and give people the ability to innovate and differentiate but, at the same time, run those context applications that keep their companies alive.

That's really what we're doing with the conference -- demonstrating, in real terms, how we can get this technology to the bottom-line and how we can exploit it going forward.

Gardner: Well, great. We've been hearing about some case studies that demonstrate how costs can be cut significantly, while productivity and agility are boosted.

I want to thank our guests in today’s discussion. We've been joined by John Pickett, Worldwide Mainframe Modernization Program manager. Thank you, John.

Pickett: Thank you, Dana.

Gardner: We've also been joined by Les Wilson, America’s Mainframe Modernization director. Thank you, Les.

Wilson: Thank you for the opportunity, Dana.

Gardner: And also Paul Evans, worldwide marketing lead on Applications Transformation at HP. Thanks again Paul.

Evans: It's a pleasure.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You have been listening to a sponsored BriefingsDirect podcast. Thanks for listening, and come back next time.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Learn more. Sponsor: Hewlett-Packard.






Transcript of the third in a series of sponsored BriefingsDirect podcasts on the rationale and strategies for application transformation. Copyright Interarbor Solutions, LLC, 2005-2009. All rights reserved.

Tuesday, January 05, 2010

Game-Changing Architectural Advances Take Data Analytics to New Performance Heights

Transcript of a BriefingsDirect podcast on how new architectural advances in collocating applications with data provide analytics performance breakthroughs.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Learn more. Sponsor: Aster Data Systems.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on how new architectures for data and logic processing are ushering in a game-changing era of advanced analytics.

These new approaches support massive data sets to produce powerful insights and analysis, yet with unprecedented price-performance. As we enter 2010, enterprises are including more forms of diverse data into their business intelligence (BI) activities. They're also diversifying the types of analysis that they expect from these investments.

We're also seeing more kinds and sizes of companies and government agencies seeking to deliver ever more data-driven analysis for their employees, partners, users, and citizens. It boils down to giving more communities of participants what they need to excel at whatever they're doing. By putting analytics into the hands of more decision makers, huge productivity wins across entire economies become far more likely.

But such improvements won’t happen if the data can't effectively reach the application's logic, if the systems can't handle the massive processing scale involved, or the total costs and complexity are too high.

In this discussion we examine how convergence of data and logic, of parallelism and MapReduce -- and of a hunger for precise analysis with a flood of raw new data -- all are setting the stage for powerful advanced analytics outcomes.

Here to help us learn how to attain advanced analytics, and to uncover the benefits from these new architectural activities for ubiquitous BI, is Jim Kobielus, senior analyst at Forrester Research. Welcome, Jim.

Jim Kobielus: Hi, Dana. Hi, everybody.

Gardner: We're also joined by Sharmila Mulligan, executive vice president of marketing at Aster Data. Welcome, Sharmila.

Sharmila Mulligan: Thank you. Hello, everyone.

Gardner: Jim, let me start with you. We're looking at a shift now, as I have mentioned, in response to oceans of data and the need for analysis across different types of applications and activities. What needs to change? The demands are there, but what needs to change in terms of how we provide the solution around these advanced analytical undertakings?

Rethinking platforms

Kobielus: First, Dana, we need to rethink the platforms with which we're doing analytical processing. Data mining is traditionally thought of as being the core of advanced analytics. Generally, you pull data from various sources into an analytical data mart.

That analytical data mart is usually on a database that's specific to a given predictive modeling project, let's say a customer analytics project. It may be a very fast server with a lot of compute power for a single server, but quite often what we call the analytical data mart is not the highest performance database you have in your company. Usually, that high performance database is your data warehouse.

As you build larger and more complex predictive models -- and you have a broad range of models and a broad range of statisticians and others building, scoring, and preparing data for these models -- you quickly run into resource constraints on your existing data-mining platform, really. So, you have to look for where you can find the CPU power, the data storage, and the I/O bandwidth to scale up your predictive modeling efforts. That's the number one thing. The data warehouse is the likely suspect.

Also, you need to think about the fact that these oceans of data need to be prepared, transformed, cleansed, meshed, merged, and so forth before they can be brought into your analytical data mart for data mining and the like.

Quite frankly, the people who do predictive modeling are not specialists at data preparation. They have to learn it and they sometimes get very good at it, but they have to spend a lot of time on data mining projects, involved in the grunt work of getting data in the right format just to begin to develop the models.

As you start to rethink your whole advanced analytics environment, you have to think through how you can automate to a greater degree all these data preparation, data loading chores, so that the advanced analytics specialists can do what they're supposed to do, which is build and tune models of various problem spaces. Those are key challenges that we face.

But there is a third challenge, which is what advanced analytics produces: predictive models. Those predictive models increasingly are deployed inline in transactional applications, like your call center, to provide some basic logic and rules that will drive such important functions as the "next best offer" made to customers, based on a broad variety of historical and current information.

How do you inject predictive logic into your transactional applications in a fairly seamless way? You have to think through that, because, right now, quite often analytical data models, predictive models, in many ways are not built for optimal embedding within your transactional applications. You have to think through how to converge all these analytical models with the transactional logic that drives your business.
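To make that embedding idea concrete, here is a minimal, hypothetical Python sketch of a pre-trained scoring model being called inline from a call-center workflow to pick a "next best offer." The model weights, field names, and thresholds are invented for illustration and do not represent any vendor's actual scoring engine.

```python
# Hypothetical sketch: a pre-trained model embedded inline in transactional
# call-center logic, so the "next best offer" is computed during the call.

import math

# Coefficients a statistician might export from an offline modeling tool.
MODEL_WEIGHTS = {"tenure_years": 0.08, "recent_complaints": -0.9, "monthly_spend": 0.01}
MODEL_INTERCEPT = -1.2

def score_offer_acceptance(customer: dict) -> float:
    """Return the modeled probability that this customer accepts an upgrade offer."""
    z = MODEL_INTERCEPT + sum(w * customer.get(f, 0.0) for f, w in MODEL_WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def handle_call(customer: dict) -> str:
    """Transactional logic: choose the next best offer while the call is in progress."""
    p = score_offer_acceptance(customer)
    if p > 0.6:
        return "Offer premium upgrade"
    elif p > 0.3:
        return "Offer loyalty discount"
    return "No offer; focus on service recovery"

if __name__ == "__main__":
    print(handle_call({"tenure_years": 6, "recent_complaints": 0, "monthly_spend": 80}))
```

The point of the sketch is only the placement of the logic: the scoring function lives inside the transactional path rather than in a separate, after-the-fact analytical report.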

Gardner: Okay. Sharmila, are your users or the people that you talk to in the market aware that this shift is under way? Do they recognize that the same old way of doing things is not going to sustain them going forward?

New data platform

Mulligan: What we see with customers is that the advanced analytics needs and the new generation of analytics that they are trying to do is driving the need for a new data platform.

Previously, the choice of a data management platform was based primarily on price-performance, being able to effectively store lots of data, and get very good performance out of those systems. What we're seeing right now is that, although price performance continues to be a critical factor, it's not necessarily the only factor or the primary thing driving their need for a new platform.

What's driving the need now, and one of the most important criteria in the selection process, is the ability of this new platform to be able to support very advanced analytics.

Customers are very precise in terms of the type of analytics that they want to do. So, it's not that a vendor needs to tell them what they are missing. They are very clear on the type of data analysis they want to do, the granularity of data analysis, the volume of data that they want to be able to analyze, and the speed that they expect when they analyze that data.

They are very clear on what their requirements are, and those requirements are coming from the top. Those new requirements, as it relates to data analysis and advanced analytics, are driving the selection process for a new data management platform.

There is a big shift in the market, where customers have realized that their preexisting platforms are not necessarily suitable for the new generation of analytics that they're trying to do.

Gardner: Let's take a pause and see if we can't define these advanced analytics a little better. Jim, what do we mean nowadays when we say "advanced analytics?"

Kobielus: Different people have their definitions, but I'll give you Forrester's definition, because I'm with Forrester. And, it makes sense to break it down into basic analytics versus advanced analytics.

What is basic analytics? Well, that's BI. It's the core of BI that you build your decision support environment on. That's reporting, query, online analytical processing, dashboarding, and so forth. It's fairly clear what's in the core scope of BI.

Traditional basic analytics is all about analytics against deep historical datasets and being able to answer questions about the past, including the past up to the last five seconds. It's the past that's the core focus of basic analytics.

What's likely to happen

Advanced analytics is focused on how to answer questions about the future. It's about what's likely to happen -- forecasts, trends, what-if analysis -- as well as what I like to call the deep present: current streams for complex event processing. What's streaming in now? And how can you analyze the great gushing streams of information that are emanating from all your applications, your workflows, and from social networks?

Advanced analytics is all about answering future-oriented, proactive, or predictive questions, as well as current streaming, real-time questions about what's going on now. Advanced analytics leverages the same core features that you find in basic analytics -- all the reports, visualizations, and dashboarding -- but then takes it several steps further.

First and foremost, it's all about amassing a data warehouse or a data mart full of structured and unstructured information and being able to do both data mining against the structured information, and text analytics or content analytics against the unstructured content.

Then, in the unstructured content, it's being able to do some important things, like natural language processing to look for entities and relationships and sentiments and the voice of the customer, so you can then extrapolate or predict what might happen in the future. What might happen if you make a given offer to a given customer at a given time? How are they likely to respond? Are they likely to jump to the competition? Are they likely to purchase whatever you're offering? All those kinds of questions.
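As a very rough illustration of the flavor of text analytics being described here (and nothing like a production NLP pipeline with entity extraction, negation handling, or trained classifiers), the Python toy below applies a hand-built sentiment lexicon to a couple of invented customer comments.

```python
# Toy illustration only: a lexicon-based sentiment pass over customer comments.
# Real text analytics would use proper NLP; this just shows the shape of the step.

POSITIVE = {"great", "love", "rocks", "recommend"}
NEGATIVE = {"terrible", "hate", "broken", "cancel"}

def sentiment(comment: str) -> int:
    """Crude score: +1 per positive word, -1 per negative word."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = [
    "I love the new plan, great value",
    "Service was terrible, I want to cancel",
]
for c in comments:
    print(f"{sentiment(c):+d}  {c}")
```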

Gardner: Sharmila, do you have anything to offer further on defining advanced analytics in this market?

Mulligan: Before I go into advanced analytics, I'd like to add to what Jim just talked about on basic analytics. The query and reporting aspect continues to be very important, but the difference now is that the size of the data set is far larger than what the customer has been running with before.

What you've got is a situation where they want to be able to do more scalable reporting on massive data sets with very, very fast response times. On the reporting side, the end result for the customer is similar to the type of report they were producing before, but the difference is that the quantity of data that they're trying to get at, and the amount of data feeding these reports, is far greater than what they had before.

That's what's driving a need for a new platform underneath some of the preexisting BI tools that are, in themselves, good at reporting, but what the BI tools need is a data platform beneath them that allows them to do more scalable reporting than you could do before.

Kobielus: I just want to underline that, Sharmila. What Forrester is seeing is that, although the average data warehouse today is in the 1-10 terabyte range for most companies, we foresee the average warehouse size going, in the middle of the coming decade, into the hundreds of terabytes.

In 10 years or so, we think it's possible, and increasingly likely, that petabyte-scale data warehouses or content warehouses will become common. It's all about unstructured information, deep history, and historical information. A lot of trends are pushing enterprises in the direction of big data.

Managing big data

Mulligan: Absolutely. That is obviously the big topic here, which is, how do you manage big data? And, big data could be structured or it could be unstructured. How do you assimilate all this in one platform and then be able to run advanced analytics on this very big data set?

Going back to what Jim discussed on advanced analytics, we see two big themes. One is the real-time nature of what our customers want to do. There are particular use cases where what they need is to be able to analyze this data in near real-time, because that's critical to being able to get the insights that they're looking for.

Fraud analytics is a good example of that. Customers have been able to do fraud analytics, but they're running fraud checks after the fact and discovering where fraud took place after the event has happened. Then, they have to go back and recover from that situation. Now, what customers want is to be able to run fraud analytics in near real-time, so they can catch fraud while it's happening.

What you see is everything from cases in financial services companies related to product fraud to, for example, online gaming sites, where users of the system are collaborating on the site and trying to commit fraud. Those types of scenarios demand a system that can return the fraud analysis in near real-time, so it can block these users from committing fraud while it's happening.
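For illustration only, here is a small Python sketch of the shape of near real-time fraud screening: each transaction is scored as it arrives rather than in an after-the-fact batch run. The fields, thresholds, and rules are invented and are far simpler than a real fraud model.

```python
# Hedged sketch of near real-time fraud screening: score each incoming
# transaction as it arrives, instead of in a batch job after the fact.

from collections import defaultdict

recent_totals = defaultdict(float)   # running spend per account in the current window

def looks_fraudulent(txn: dict) -> bool:
    """Flag a transaction while it is happening, not hours later in batch."""
    recent_totals[txn["account"]] += txn["amount"]
    if txn["amount"] > 5000:                   # unusually large single transaction
        return True
    if recent_totals[txn["account"]] > 8000:   # rapid accumulation within the window
        return True
    return False

stream = [
    {"account": "A1", "amount": 120.0},
    {"account": "A1", "amount": 7900.0},
    {"account": "B7", "amount": 45.0},
]
for txn in stream:
    action = "BLOCK" if looks_fraudulent(txn) else "allow"
    print(action, txn)
```

In production, the scoring logic would be a trained model running close to the data, but the timing is the point: the decision is made while the transaction is still in flight.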

The other big thing we see is the predictive nature of what customers are trying to do. Jim talked about predictive analytics and modeling. Again, that's a big area where we see massive new opportunity and a lot of new demand. What customers are trying to do there is look at their own customer base and analyze the data, so that they can predict trends in the future.

For example, what are the buying trends going to be, let's say at Christmas, for consumers who live in a certain area? There is a lot around behavior analysis. In the telco space, we see a lot of deep analysis around modeling customers' behavior in terms of voice usage on their mobile devices versus data usage.

By understanding some of these patterns and the behavior of their users in more depth, these organizations are now able to better serve their customers and offer them new product offerings, new packages, and a higher level of personalization.

Predictive analytics is a term that's existed for a while, and it's something that customers have been doing, but it's really reaching new levels in terms of the amount of data that they're trying to analyze and the granularity of the analytics itself, in being able to deliver deeper predictive insights and models.

As I said, the other big theme we see is the push toward analysis that's much nearer to real time than what they were able to do before. This is not a trivial thing to do when it comes to very large data sets, because what you're asking for is very, very quick response times and incredibly high performance on terabytes and terabytes of data, in order to get those kinds of results in real-time.

Gardner: Jim, these examples that Sharmila has shared aren't just rounding errors. This isn't a movement toward higher efficiency. These are game changers. These are going to make or break your business. This is going to allow you to adjust to a changing economy and to shifting preferences by your customers. We're talking about business fundamentals here.

Social network analysis

Kobielus: We certainly are. Sharmila was discussing behavioral analysis, for example, and talking about carrier services. Let's look at what's going to be a true game changer, not just for business, but for the global society. It's a thing called social network analysis.

It's predictive models, fundamentally, but it's predictive models that are applied to analyzing the behaviors of networks of people on the web, on the Internet, Facebook, and Twitter, in your company, and in various social network groupings, to determine classification and clustering of people around common affinities, buying patterns, interests, and so forth.

As social networks weave their way into not just our consumer lives, but our work lives and our lives in general, social network analysis -- leveraging all the core advanced analytics of data mining and text analytics -- will take the place of the focus group. In an online world, everything is virtual. As a company, you're not going to be able, in any meaningful way, to bring your users together in a single room and ask them what they want you to do or provide for them.

What you're going to do, though, is listen to them. You're going to listen to all their tweets and their Facebook updates, and you're going to look at their interactions online through your portal and your call center. Then, you're going to take that huge stream of event information -- we're talking about complex event processing (CEP) -- and bring it into your data warehousing grid or cloud.

You're also going to bring in historical information on those customers and their needs. You're going to apply various social network behavioral analytics models to it, to cluster people into the categories that make us all kind of squirm when we hear them -- things like yuppie, Generation X, and so forth. Professionals in the behavioral or marketing world are very good at creating segmentations of customers, based on a broad range of patterns.
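As a toy illustration of that segmentation idea (not real social network analysis), the Python sketch below groups a handful of invented customers by their dominant affinity; production models would cluster millions of people across far richer behavioral signals.

```python
# Illustrative only: group customers into crude segments by their dominant
# interest -- the flavor of clustering that real behavioral models perform at
# vastly larger scale. Data and segment names are invented.

from collections import Counter, defaultdict

interactions = {
    "alice": ["gadgets", "gadgets", "travel"],
    "bob":   ["travel", "travel", "dining"],
    "carol": ["gadgets", "gaming", "gadgets"],
}

segments = defaultdict(list)
for customer, topics in interactions.items():
    dominant, _count = Counter(topics).most_common(1)[0]   # most frequent affinity
    segments[dominant].append(customer)

for segment, members in segments.items():
    print(segment, "->", members)
```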

Social network analysis becomes more powerful as you bring more history into it -- last year, two years, five years, 10 years' worth of interactions -- to get a sense of how people will likely respond to new offers, bundles, packages, campaigns, and programs that are thrown at them through social networks.

It comes down to things like Sharmila was getting at, simple things in marketing and sales, such as a Hollywood studio determining how a movie is being perceived by the marketplace, by people who go out to the theater and then come out and start tweeting, or even tweeting while they are in the theater -- "Oh, this movie is terrible" or "This movie rocks."

They can get a sense of how a product or service is being perceived in real-time, so that the provider of that product or service can then turn around and tweak the marketing campaign, the pricing, and the incentives in real-time to maximize the yield, the revenue, or the profit of that event or product. That is seriously powerful, and that's what big data architectures allow you to do.

If you can push not just the analytic models, but to some degree bring transactional applications, such as workflow, into this environment to be triggered by all of the data being developed or being sifted by these models, that is very powerful.

Gardner: We know that things are shifting and changing. We know that we want to get access to the data and analytics. And, we know what powerful things those analytics can do for us. Now, we need to look at how we get there and what's in place that prevents us.

Let's look at this architecture. I'm looking into MapReduce more and more. I am even hearing that people are starting to write MapReduce into their requests for proposals (RFPs), as they're looking to expand and improve their situation. Sharmila, what's wrong with the current environment and why do we need to move into something a bit different?

Moving the data

Mulligan: One of the biggest issues that the preexisting data pipeline faces is that the data lives in a repository that's removed from where the analytics take place. Today, with the existing solutions, you need to move terabytes and terabytes of data through the data pipeline to the analytics application, before you can do your analysis.

There's a fundamental issue here. You can't move boulders and boulders of data to an application. It's too slow, it's too cumbersome, and you're not factoring in all your fresh data in your analysis, because of the latency involved.

One of the biggest shifts is that we need to bring the analytics logic close to the data itself. Having it live in a completely different tier, separate from where the data lives, is problematic. This is not a price-performance issue in itself. It is a massive architectural shift that requires bringing analytics logic to the data itself, so that data is collocated with the analytics itself.

MapReduce, which you brought up earlier, plays a critical role in this. It is a very powerful technology for advanced analytics and it brings capabilities like parallelization to an application, which then allows for very high-performance scalability.

What we see in the market these days are terms like "in-database analytics," "applications inside data," and all this is really talking about the same thing. It's the notion of bringing analytics logic to the data itself.

I'll let Jim add a lot more to that since he has developed a lot of expertise in this area.

Gardner: Jim, are we in a perfect world here, where we can take the existing BI applications and apply them to this new architecture of joining logic and data in proximity, or do we have to come up with whole new applications in order to enjoy this architectural benefit?

Kobielus: Let me articulate in a little bit more detail what MapReduce is and is not. MapReduce is, among other things, a set of extensions to SQL -- SQL/MapReduce (SQL/MR). So, you can build advanced analytic logic using SQL/MR that can essentially do the data prep, the data transformations, the regression analyses, the scoring, and so forth, against both structured data in your relational databases and unstructured data, such as content that you may source from RSS feeds and the like.

To the extent that we always, or for a very long time, have been programming database applications and accessing the data through standard SQL, SQL/MR isn't radically different from how BI applications have traditionally been written.

Maximum parallelization

But these are extensions, and they are extensions geared toward enabling maximum parallelization of these analytic processes, so that those processes can then be pushed out and executed not just in databases, but in file systems, such as the Hadoop Distributed File System, or in cloud data warehouses.

MapReduce, as a programming model and as a language, in many ways, is agnostic as to the underlying analytic database, file system, or cloud environment where the information, as a whole lives, and how it's processed.

But no, you can't take your existing BI applications, in terms of the reporting, query, dashboarding, and the like, transparently move them, and use MapReduce without a whole lot of rewriting of these applications.

You can't just port your existing BI applications to MapReduce and database analytics. You're going to have to do some conversions, and you're going to have to rewrite your applications to take advantage of the parallelism that SQL/MR enables.

MapReduce, in many ways, is geared not so much for basic analytics. It's geared for advanced analytics. It's data mining and text mining. In many ways, MapReduce is the first open framework that the industry has ever had for programming the logic for both data mining and text mining in a seamless way, so that those two types of advanced analytic applications can live and breathe and access a common pool of complex data.
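To show the programming model being referred to (this is not SQL/MR syntax itself, and not a distributed Hadoop job), here is a minimal, in-process Python sketch of the map and reduce phases over partitioned data. In a real MPP database or Hadoop cluster these phases would run in parallel across many nodes; the data and keys here are invented.

```python
# Minimal in-process illustration of the MapReduce pattern: a map step runs
# independently over each data partition, and a reduce step aggregates by key.

from collections import defaultdict
from itertools import chain

partitions = [                                   # data already split across "nodes"
    [("alice", "click"), ("bob", "purchase")],
    [("alice", "purchase"), ("alice", "click")],
]

def map_phase(partition):
    """Emit (key, 1) for every event; runs independently per partition."""
    return [(customer, 1) for customer, _event in partition]

def reduce_phase(mapped):
    """Sum the counts per key after all mapped output is shuffled together."""
    totals = defaultdict(int)
    for key, count in mapped:
        totals[key] += count
    return dict(totals)

mapped = chain.from_iterable(map_phase(p) for p in partitions)   # parallelizable step
print(reduce_phase(mapped))    # {'alice': 3, 'bob': 1}
```

The value of the model is that the map function never needs to see the whole data set, which is what lets the work be spread across however many nodes hold the partitions.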

MapReduce is an open standard that Aster clearly supports, as do a number of other database and data warehousing vendors. In the coming year and the coming decade, MapReduce and Hadoop -- and I won't go to town on what Hadoop is -- will become fairly ubiquitous within the analytics arena. And, that’s a good thing.

So, any advanced analytic logic that you build in one tool, in theory, you can deploy and have it optimized for execution in any MapReduce-enabled platform. That’s the promise. It’s not there yet. There are a lot of glitches, but that’s the strong promise.

Mulligan: I'd like to add a little bit to that Dana. In the marriage of SQL with MapReduce, the real intent is to bring the power of MapReduce to the enterprise, so that SQL programmers can now use that technology. MapReduce alone does require some sophistication in terms of programming skills to be able to utilize it. You may typically find that skill set in Web 2.0 companies, but often you don’t find developers who can work with that in the enterprise.

What you do find in enterprise organizations is that there are people who are very proficient at SQL. By bringing SQL together with MapReduce what enterprise organizations have is the familiarity of SQL and the ease of using SQL, but with the power of MapReduce analytics underneath that. So, it’s really letting SQL programmers leverage skills they already have, but to be able to use MapReduce for analytics.

Important marriage

Over time, of course, it’s possible that there will be more expertise developed within enterprise organizations to use MapReduce natively, but at this time and, we think, in the next couple of years, the SQL/MapReduce marriage is going to be very important to help bring MapReduce across the enterprise.

Hadoop itself is obviously an interesting platform, too, for being able to store lots of data cost-effectively. However, customers often also want some of the other characteristics of a data warehouse -- workload management, failover, backup and recovery, and so on -- that the technology may not necessarily provide.

MapReduce, now available with massively parallel processing (MPP) -- the new generation of MPP data warehouse -- brings the best of both worlds. It brings what companies need in terms of enterprise data warehouse capabilities. It lets you put application logic near the data, as we talked about earlier. And it brings MapReduce through the SQL/MapReduce framework, which is primarily designed to ease the adoption and use of MapReduce within the enterprise.

Gardner: Jim, we are on a journey. It’s going to be several years before we are getting to where we want to go, but there is more maturity in some areas than others. And, there is an opportunity to take technologies that are available now and do some real strong business outcomes and produce those outcomes.

Give me a sense of where you see the maturity of the architecture, of the SQL, and the tools and making these technologies converge? Who is mature? How is this shaking out a little bit?

Kobielus: One marker of maturity is adoption as a best practice -- in this case, in-database analytics. As I said, it's widely supported, through proprietary approaches, by many vendors.

If maturity is judged by adoption of an open industry framework with cross-vendor interoperability, then MapReduce and Hadoop are not mature yet. There are pioneering vendors like Aster, but there is also a significant number of established big data warehousing vendors with varying degrees of support, now or in the near future, for these frameworks. We're seeing strong indications. In fact, Teradata is already rolling out MapReduce and Hadoop support in its data warehousing offerings.

We're not yet seeing a big push from Oracle, or from Microsoft for that matter, in the direction of support for MapReduce or Hadoop, but we at Forrester believe that both of those vendors, in particular, will come around in 2010 with greater support.

IBM has made significant progress with its support for Hadoop and MapReduce, but it hasn’t yet been fully integrated into that particular vendor's platform.

Looking to 2010, 2011

If we look at a broad range of other data warehousing vendors, like Sybase, Greenplum, and others, most have it on their roadmaps. To some degree, various vendors have these frameworks in development right now. I think 2010 and 2011 are the years when most of the data warehousing, and also data mining, vendors will begin to provide mature, interoperable implementations of these standards.

There is a growing realization in the industry that advanced analytics is more than just being able to mine information at rest, which is what MapReduce and Hadoop are geared to doing. You also need to be able to mine and do predictive analytics against data in motion. That’s CEP. MapReduce and Hadoop are not really geared to CEP applications of predictive modeling.

There needs to be, and there will be over the next five years or so, a push in the industry to embed MapReduce and Hadoop into those streaming environments. There are a few vendors showing some progress toward CEP predictive modeling, but it's not widely supported yet, and where it exists it's through proprietary approaches.

In this coming decade, we're going to see predictive logic deployed into all application environments, be they databases, clouds, distributed file systems, CEP environments, business process management (BPM) systems, and the like. Open frameworks will be used and developed under more of a service-oriented architecture (SOA) umbrella, to enable predictive logic that’s built in any tool to be deployed eventually into any production, transaction, or analytic environment.

It will take at least 3 to 10 years for a really mature interoperability framework to be developed, for industry to adopt it, and for the interoperability issues to be worked out. It’s critically important that everybody recognizes that big data, at rest and in motion, needs to be processed by powerful predictive models that can be deployed into the full range of transactional applications, which is where the convergence of big data, analytics, and transactions comes in.

Data warehouses, as the core of your analytics environment, need to evolve to become application servers in their own right, able to handle both the analytic applications (traditional data warehousing, BI, and data mining) and the transactional logic, and to handle it all seamlessly, with full security, workload isolation, failover, and so forth.

I'm really excited, for example, by what Aster has rolled out with their latest generation, 4.0 of the Data-Application Server. I see a little bit of progress by Oracle on the Exadata V2. I'm looking forward to seeing if other vendors follow suit and provide a cloud-based platform for a broad range of transactional analytics.

Gardner: Sharmila, Jim has painted a very nice picture of where he expects things to go. He mentioned Aster Data 4.0. Tell us a little bit about that, and where you see the stepping stones lining up.

Mulligan: As I mentioned earlier, one of the biggest requirements for doing very advanced analytics on terabyte- and petabyte-level data sets is to bring the application logic to the data itself. Earlier, I described why you need to do this. You want to eliminate as much data movement as possible, and you want to be able to do this analysis in as near real time as possible.

What we did in Aster Data 4.0 is just that. We're allowing companies to push their analytics applications inside of Aster’s MPP database, where now you can run your application logic next to the data itself, so they are both collocated in the same system. By doing so, you've eliminated all the data movement. What that gives you is very, very quick and efficient access to data, which is what's required in some of these advanced analytics application examples we talked about.
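
As a rough sketch of what eliminating that movement means, the two functions below contrast pulling every row into the application tier with pushing the aggregation down so only the result travels. The tiny in-memory SQLite table is just a stand-in for a warehouse; the table and column names are made up for illustration.

    import sqlite3

    # Tiny in-memory stand-in for a warehouse table, purely to illustrate the idea.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE purchases (user_id TEXT, amount REAL)")
    db.executemany("INSERT INTO purchases VALUES (?, ?)",
                   [("u1", 20.0), ("u2", 35.5), ("u1", 5.0)])

    def analyze_outside(conn):
        """Traditional approach: pull every row out, then aggregate in the app tier."""
        totals = {}
        for user_id, amount in conn.execute("SELECT user_id, amount FROM purchases"):
            totals[user_id] = totals.get(user_id, 0.0) + amount   # data crossed the wire first
        return totals

    def analyze_inside(conn):
        """Push-down approach: the database aggregates; only the small result moves."""
        return dict(conn.execute(
            "SELECT user_id, SUM(amount) FROM purchases GROUP BY user_id"))

    print(analyze_outside(db) == analyze_inside(db))   # True; same answer, far less movement at scale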

Pushing the code

What kind of applications can you push down into the system? It can be any app written in Java, C, C++, Perl, Python, or .NET. It could be an existing custom application that an organization has written and that it needs to scale to work on much larger data sets. That code can be pushed down into the Aster database.

It could be a new application that a customer is looking to write to do a level of analysis that they could not do before, like real-time fraud analytics, or very deep customer behavior analysis. If you're trying to deliver these new generations of advanced analytics apps, you would write that application in the programming language of your choice.

You would push that application down into the Aster system, all your data would live inside of the Aster MPP database, and the application would run inside of the same system collocated with the data.

In addition to that, it could be a packaged application. So, it could be an application like software as a service (SaaS) that you want to scale to be able to analyze very large data sets. So, you could push a packaged application inside the system as well.

One of the fundamental things that we leverage to allow you to do more powerful analytics with these applications is MapReduce. You don’t have to MapReduce-enable an application when you push it down into the Aster system, but you could choose to and, by doing so, you automatically parallelize the application, which gives you very high performance and scalability when it comes to accessing large data sets. You also then leverage some of the analytics capabilities of MapReduce that are not necessarily inherent in something like SQL.

The key components of 4.0 come down to providing a platform that can efficiently and cost-effectively store massive amounts of data, plus a platform that allows you to do very advanced and sophisticated analytics. To run through the key things that we've done in 4.0: first is the ability to push applications inside the system, so apps are collocated with the data.

We also offer SQL/MapReduce as the interface. Business analysts who are working with this application on a regular basis don’t have to learn MapReduce. They can use SQL/MR and leverage their existing SQL skills to work with that app. So, it makes it very easy for any number of business analysts in the organization to leverage their preexisting SQL skills and work with this app that's pushed down into the system.
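
To illustrate what that analyst-facing interface could look like, the snippet below builds a SQL-style query that invokes a pushed-down sessionization function. The function name, clauses, and parameters are illustrative assumptions rather than Aster's documented SQL/MR grammar; the point is simply that the analyst writes familiar SQL while the MapReduce work runs inside the database.

    # Schematic only: the function name, arguments, and exact SQL/MR grammar below are
    # illustrative assumptions, not copied from Aster's documentation.
    def sessionize_query(table="clickstream", gap_seconds=1800):
        """Build an analyst-facing, SQL-style call to a pushed-down MapReduce function."""
        return f"""
            SELECT user_id, session_id, COUNT(*) AS clicks
            FROM sessionize(                 -- MapReduce function running inside the database
                ON {table}
                PARTITION BY user_id         -- parallelized per user across the MPP nodes
                ORDER BY click_time
                TIMEOUT ({gap_seconds})      -- hypothetical parameter: session gap in seconds
            )
            GROUP BY user_id, session_id;
        """

    print(sessionize_query())   # the analyst writes SQL; the heavy lifting runs next to the data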

Finally, in order to support the ability to run applications inside the database, which as I said earlier is nontrivial, we added fundamental new capabilities like Dynamic Mixed Workload Management. Workload management in the Aster system works not just on data queries, but on the application processes as well, so you can balance workloads when you have a system that's managing both data and applications.
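
As a conceptual sketch of what mixed workload management has to do, the toy scheduler below weights short queries against heavier in-database application processes so interactive work is not starved. The weights and class names are assumptions for illustration; this is not how the Aster system is actually implemented.

    import heapq
    from itertools import count

    # Conceptual sketch of mixed workload management; not Aster's implementation.
    WEIGHTS = {"query": 1.0, "app_process": 2.5}   # assumed relative weights: lower runs sooner

    class MixedWorkloadScheduler:
        """Balance interactive queries against long-running in-database app processes."""
        def __init__(self):
            self._heap, self._seq = [], count()

        def submit(self, kind, name):
            heapq.heappush(self._heap, (WEIGHTS[kind], next(self._seq), name))

        def next_to_run(self):
            return heapq.heappop(self._heap)[2]

    sched = MixedWorkloadScheduler()
    sched.submit("app_process", "fraud-model-scoring")   # heavy pushed-down application work
    sched.submit("query", "daily-sales-report")          # short analyst query
    print(sched.next_to_run())   # 'daily-sales-report' is dispatched first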

Kobielus: Sharmila, I think the greatest feature of 4.0 is simply the ability to run predictive models developed in SAS or other tools in their native code, without necessarily converting them to SQL/MR. That means that your customers can then leverage that huge installed pool of intellectual property, all those models, bring it in, and execute it natively within your distributed grid or cloud, as a way of avoiding having to do that rewrite. Or, if they wish, they can migrate or convert them over to SQL/MR. It's up to them.

That's a very attractive feature, because fundamentally the data warehousing cloud is an analytic application server. Essentially, you want that ability to be able to run disparate legacy models in parallel. That's just a feature that needs to be adopted by the industry as a whole.

The customer decides

Mulligan: Absolutely. I do want to clarify that the Aster 4.0 solution can be deployed in the cloud, or it can be installed in a standard implementation on-premise, or it could be adopted in an appliance mode. We support all three. It's up to the customer which of those deployment models they need or prefer.

To talk in a little bit more detail about what Jim is referring to, the ability to take an existing app, have to do absolutely no rewrite, and push that application down is, of course, very powerful to customers. It means that they can immediately take an analytics app they already have and have it operate on much larger data sets by simply taking that code and pushing it down.

That can be done literally within a day or two. You get the Aster system, you install it, and then, by the second day, you could be pushing your application down.

If you choose to leverage the MapReduce analytics capabilities, then as I said earlier, you would MapReduce-enable the app. This simply means you take your existing application and, again, you don’t have to do any rewrite of that logic. You just add MapReduce functions to it and, by doing so, you have MapReduce-enabled it. Then, you push it down and you have SQL/MR as an interface to that app.

The process of MapReduce-enabling an app is also very simple. This is not something that takes weeks and weeks to do; it literally can be done in a couple of days.
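
A minimal sketch of what MapReduce-enabling can look like: the existing scoring routine below stays untouched, and thin map and reduce wrappers are added around it so a parallel framework can partition the work. The business logic, data, and driver are all hypothetical.

    # Existing, unchanged business logic (hypothetical example).
    def fraud_score(transaction):
        """Pretend this is the app's existing scoring routine; it is not rewritten."""
        return 1.0 if transaction["amount"] > 900 else 0.1

    # Thin MapReduce wrappers added around the existing logic.
    def map_fn(transaction):
        yield transaction["account"], fraud_score(transaction)   # reuse the original code as-is

    def reduce_fn(account, scores):
        return account, max(scores)          # flag the riskiest transaction per account

    # A tiny serial driver standing in for the parallel framework.
    transactions = [
        {"account": "a1", "amount": 950.0},
        {"account": "a1", "amount": 40.0},
        {"account": "a2", "amount": 12.0},
    ]
    grouped = {}
    for txn in transactions:
        for key, value in map_fn(txn):
            grouped.setdefault(key, []).append(value)
    print(dict(reduce_fn(k, v) for k, v in grouped.items()))   # {'a1': 1.0, 'a2': 0.1}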

We recently had a retailer who took an existing app they had already written, a new type of analytics application they wanted to deploy. They simply added MapReduce capabilities to it and pushed it down into the Aster system, and it's now operating on very, very large data sets and performing analytics that they weren't originally able to do.

The ease of application push down and the ease of MapReduce enabling is definitely key to what we have done in 4.0, and it allows companies to realize the value of this new type of platform right away.

Gardner: I know it's fairly early in the rollout. Do you have any sense of metrics from some of these users? What do they get back? We talked earlier about what could and should be done nowadays with analysis. Do you have any sense of what they have been able to do with 4.0?

Reducing processing times

Mulligan: For example, we have talked about customers like comScore who are processing 1.6 billion rows of data on a regular basis, and their data volumes continue to grow. They have many business analysts who operate the system and run reports on a daily basis, and they are able to get results very quickly on a large data set.

We have customers who have gone from 5-10 minute processing times on their data set, to 5 seconds, as a result of putting the application inside of the system.

We have had fraud applications that would take 60-90 minutes to run in the traditional approach, where the app was running outside the database, and now those applications run in 60-90 seconds.

Literally, by collocating your application logic next to the data itself, you can see that you are immediately able to go from many minutes of processing time, down to seconds, because you have eliminated all the data movement altogether. You don’t have to move terabytes of data.

Add to that the fact that you can now access terabyte-sized data sets, versus what customers have traditionally been left with, which is only the ability to process data sets in the order of several tens of gigabytes or hundreds of gigabytes. Now, we have telcos, for example, processing four- or five-terabyte data sets with very fast response time.
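
A back-of-the-envelope calculation shows why collocating the logic with the data moves processing from minutes to seconds. The network and scan rates below are assumptions, not benchmarks, but the ratio is what matters: shipping terabytes across the wire dwarfs scanning them in place across an MPP cluster.

    # Back-of-the-envelope only; the rates below are assumptions, not benchmarks.
    data_tb = 4                                   # telco-sized data set from the example
    bytes_total = data_tb * 1e12

    network_gbps = 10                             # assumed link between warehouse and app tier
    transfer_seconds = bytes_total / (network_gbps * 1e9 / 8)

    nodes = 50                                    # assumed MPP cluster size
    scan_mb_per_s_per_node = 500                  # assumed local scan rate per node
    scan_seconds = bytes_total / (nodes * scan_mb_per_s_per_node * 1e6)

    print(f"Ship {data_tb} TB to the app tier: ~{transfer_seconds / 60:.0f} minutes")
    print(f"Scan it in place across {nodes} nodes: ~{scan_seconds:.0f} seconds")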

It's the volume of data, the speed, the acceleration, and response time that really provide the fundamental value here. MapReduce, over and above that, allows you to bring in more analytics power.

Gardner: A final word to you, Jim Kobielus. This really is a good example of how convergence is taking place at a number of different levels. Maybe you could give us an insight into where you see convergence happening, and then we'll have to leave it there.

Kobielus: First of all, with convergence the flip side is collision. I just want to point out a few issues that enterprises and users will have to deal with, as they move toward this best practice called in-database analytics and convergence of the transactions and analytics.

We're talking about a collision of two cultures, or more than two cultures. Data warehousing professionals and data mining professionals live in different worlds, as it were. They quite often have an arm's length relationship to each other. The data warehouse traditionally is a source of data for advanced analytics.

This new approach will require a convergence, rapprochement, or a dialog to be developed between these two groups, because ultimately the data warehouse is where the data mining must live. That's going to have to take place, that coming together of the tribes. That's one of the best emerging practices that we're recommending to Forrester clients in that area.

Common framework

Also, transaction systems -- enterprise resource planning (ERP) and customer relationship management (CRM) -- and analytic systems -- BI and data warehousing -- are again two separate tribes within your company. You need to bring together these groups to work out a common framework for convergence to be able to take advantage of this powerful new architecture that Sharmila has sketched out here.

Much of your transactional logic will continue to live on source systems, the ERP, CRM, supply chain management, and the like. But, it will behoove you, as an organization, as a user to move some transactional logic, such as workflow, in particular, into the data warehousing cloud to be driven by real-time analytics and KPIs, metrics, and messages that are generated by inline models built with MapReduce, and so forth, and pushed down into the warehousing grid or cloud.

Workflow, and especially rules engines, will increasingly be tightly integrated with, or brought into, a warehousing or analytics cloud that has inline logic.

Another key trend for convergence is that data mining and text mining are coming together as a single discipline. When you have structured and unstructured sources of information or you have unstructured information from new sources like social networks and Twitter, Facebook, and blogs, it's critically important to bring it together into your data mining environment. A key convergence also is that data at rest and data in motion are converging, and so a lot of this will be real-time event processing.

Those are the key convergence and collision avenues that we are looking at going forward.

Gardner: Very good. We've been discussing how new architectures for data and logic processing are ushering in this game-changing era of advanced analytics. We've been joined by Jim Kobielus, senior analyst at Forrester Research. Thanks so much, Jim.

Kobielus: No problem. I enjoyed it.

Gardner: Also, we have been talking with Sharmila Mulligan, executive vice president of marketing at Aster Data. Thank you Sharmila.

Mulligan: Thanks so much, Dana.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You've been listening to a sponsored BriefingsDirect podcast. Thanks for listening, and come back next time.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Download the transcript. Learn more. Sponsor: Aster Data Systems.

Transcript of a BriefingsDirect podcast on how new advances in collocating applications with data architecturally provides analytics performance breakthroughs. Copyright Interarbor Solutions, LLC, 2005-2010. All rights reserved.

Monday, December 21, 2009

HP's Cloud Assure for Cost Control Takes Elastic Capacity Planning to Next Level

Transcript of a BriefingsDirect podcast on the need to right-size and fine-tune applications for maximum benefits of cloud computing.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Learn more. Download the transcript. Sponsor: Hewlett-Packard.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on the economic benefits of cloud computing -- of how to use cloud-computing models and methods to control IT cost by better supporting application workloads.

Traditional capacity planning is not enough in cloud-computing environments. Elasticity planning is what’s needed. It’s a natural evolution of capacity planning, but it’s in the cloud.

We'll look at how to best right-size applications, while matching service-delivery resources and demands intelligently, repeatedly, and dynamically. The movement to a pay-per-use model also goes a long way toward promoting such matching of resources and demand, and it reduces wasteful application practices.

We'll also examine how quality control for these applications in development reduces the total cost of supporting applications, while allowing for tuning and an appropriate way of managing applications in the operational cloud scenario.

To unpack how Cloud Assure services can take the mystique out of cloud computing economics and to lay the foundation for cost control through proper cloud methods, we're joined by Neil Ashizawa, manager of HP's Software-as-a-Service (SaaS) Products and Cloud Solutions. Welcome to BriefingsDirect, Neil.

Neil Ashizawa: Thanks very much, Dana.

Gardner: As we've been looking at cloud computing over the past several years, there is a long transition taking place of moving from traditional IT and architectural method to this notion of cloud -- be it private cloud, at a third-party location, or through some combination of the above.

Traditional capacity planning therefore needs to be refactored and reexamined. Tell me, if you could, Neil, why capacity planning, as people currently understand it, isn’t going to work in a cloud environment?

Ashizawa: Old-fashioned capacity planning would focus on the peak usage of the application, and it had to, because when you were deploying applications in house, you had to take into consideration that peak usage case. At the end of the day, you had to be provisioned correctly with respect to compute power. Oftentimes, with long procurement cycles, you'd have to plan for that.

In the cloud, because you have this idea of elasticity, where you can scale up your compute resources when you need them, and scale them back down, obviously that adds another dimension to old-school capacity planning.

Elasticity planning

The new way to look at it within the cloud is elasticity planning. You have to factor in not only your peak usage case, but your moderate usage case and your low-level usage case as well. At the end of the day, if you are going to get the biggest benefit of the cloud, you need to understand how you're going to be provisioned during the various demands of your application.

Gardner: So, this isn’t just a matter of spinning up an application and making sure that it could reach a peak load of some sort. We have a new kind of a problem, which is how to be efficient across any number of different load requirements?

Ashizawa: That’s exactly right. If you were to take, for instance, the old-school capacity-planning ideology to the cloud, what you would do is provision for your peak use case. You would scale up your elasticity in the cloud and just keep it there. If you do it that way, then you're negating one of the big benefits of the cloud. That's this idea of elasticity and paying for only what you need at that moment.

If I'm at a slow period of my applications usage, then I don’t want to be over provisioned for my peak usage. One of the main factors why people consider sourcing to the cloud is because you have this elastic capability to spin up compute resources when usage is high and scale them back down when the usage is low. You don’t want to negate that benefit of the cloud by keeping your resource footprint at its highest level.
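
Here is a small sketch of the arithmetic behind that point, comparing a footprint held at peak all day with one that scales with an assumed daily demand profile. The demand numbers, per-instance capacity, and hourly price are all assumptions for illustration.

    # Assumed inputs: an hourly demand profile (requests/hour) and simple capacity math.
    hourly_demand = [200] * 8 + [800] * 4 + [2500] * 4 + [800] * 8   # 24 hours, peak mid-day
    requests_per_instance_hour = 500
    price_per_instance_hour = 0.50                                   # assumed pay-per-use rate

    def instances_needed(demand):
        return max(1, -(-demand // requests_per_instance_hour))      # ceiling division

    peak_footprint = max(instances_needed(d) for d in hourly_demand)
    static_cost = peak_footprint * len(hourly_demand) * price_per_instance_hour
    elastic_cost = sum(instances_needed(d) for d in hourly_demand) * price_per_instance_hour

    print(f"Provisioned at peak all day: {peak_footprint} instances, ${static_cost:.2f}")
    print(f"Scaled with demand:          ${elastic_cost:.2f}")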

Gardner: I suppose also the holy grail of this cloud-computing vision that we've all been working on lately is the idea of being able to spin up those required instances of an application, not necessarily in your private cloud, but in any number of third-party clouds, when the requirements dictate that.

Ashizawa: That’s correct.

Gardner: Now, we call that hybrid computing. Is what you are working on now something that’s ready for hybrid or are you mostly focused on private-cloud implementation at this point?

Ashizawa: What we're bringing to the market works in all three cases. Whether you're a private internal cloud, doing a hybrid model between private and public, or sourcing completely to a public cloud, it will work in all three situations.

Gardner: HP announced, back in the spring of 2009, a Cloud Assure package that focused on things like security, availability, and performance. I suppose now, because of the economy and the need for people to reduce cost, look at the big picture about their architectures, workloads, and resources, and think about energy and carbon footprints, we've now taken this a step further.

Perhaps you could explain the December 2009 announcement that HP has for the next generation or next movement in this Cloud Assure solution set.

Making the road smoother

Ashizawa: The idea behind Cloud Assure, in general, is that we want to assist enterprises in their migration to the cloud and we want to make the road smoother for them.

Just as you said, when we first launched Cloud Assure earlier this year, we focused on the top three inhibitors, which were security of applications in the cloud, performance of applications in the cloud, and availability of applications in the cloud. We wanted to provide assurance to enterprises that their applications will be secure, they will perform, and they will be available when they are running in the cloud.

The new enhancement that we're announcing now is assurance for cost control in the cloud. Oftentimes enterprises do make that step to the cloud, and a big reason is that they want to reap the benefits of the cost promise of the cloud, which is to lower cost. The thing here, though, is that you might fall into a situation where you negate that benefit.

If you deploy an application in the cloud and you find that it’s underperforming, the natural reaction is to spin up more compute resources. It’s a very good reaction, because one of the benefits of the cloud is this ability to spin up or spin down resources very fast. So no more procurement cycles, just do it and in minutes you have more compute resources.

The situation, though, that you may find yourself in is that you may have spun up more resources to try to improve performance, but it might not improve performance. I'll give you a couple of examples.

If your application is experiencing performance problems because of inefficient Java methods, for example, or slow SQL statements, then more compute resources aren't going to make your application run faster. But, because the cloud allows you to do so very easily, your natural instinct may be to spin up more compute resources to make your application run faster.

When you do that, you find yourself in a situation where your application is no longer right-sized in the cloud, because you have over-provisioned your compute resources. You're paying for more compute resources and you're not getting any return on your investment. When you start paying for more resources without return on your investment, you start to disrupt the whole cost benefit of the cloud.

Gardner: I think we need to have more insight into the nature of the application, rather than simply throwing additional instances of the application at the problem. Is that it, at a very simple level?

Ashizawa: That’s it at a very simple level. Just to make it even simpler, applications need to be tuned so that they are right-sized. Once they are tuned and right-sized, then, when you spin up resources, you know you're getting return on your investment, and it’s the right thing to do.
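
A short sketch, with assumed timings, of why spinning up more instances does not help a code-bound application: each request still waits on the slow SQL statement, so per-request latency stays flat while the hourly bill climbs, whereas tuning the query is what actually right-sizes the app.

    # Assumed per-request timings for an application with a code-level bottleneck.
    slow_query_s = 4.0          # time spent in an inefficient SQL statement
    app_code_s = 0.5            # time spent in application code
    price_per_instance_hour = 0.50

    def latency_and_cost(instances):
        latency = slow_query_s + app_code_s        # each request still waits on the slow query
        hourly_cost = instances * price_per_instance_hour
        return latency, hourly_cost

    for n in (2, 4, 8):
        latency, cost = latency_and_cost(n)
        print(f"{n} instances: {latency:.1f} s per request, ${cost:.2f}/hour")

    # Tuning the query (right-sizing the app) is what actually moves the latency.
    tuned_latency = 0.2 + app_code_s
    print(f"After tuning the SQL: {tuned_latency:.1f} s per request on the same footprint")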

Gardner: Can we do this tuning with existing applications -- you mentioned Java apps, for example -- or is this something for greenfield applications that we are creating newly for these cloud scenarios?

Java and .NET

Ashizawa: Our enhancement to Cloud Assure, which is Cloud Assure for cost control, focuses more on the Java and the .NET type applications.

Gardner: And those would be existing applications or newer ones?

Ashizawa: Either. Whether you have existing applications that you are migrating to the cloud, or new applications that you are deploying in the cloud, Cloud Assure for cost control will work in both instances.

Gardner: Is this new set software, services, both? Maybe you could describe exactly what it is that you are coming to market with.

Ashizawa: The Cloud Assure for cost control solution comprises both HP software and services provided by HP SaaS. Three products make up the software side of the overall solution.

The first one is our industry-leading Performance Center software, which allows you to drive load in an elastic manner. You can scale up the load to very high demands and scale back load to very low demand, and this is where you get your elasticity planning framework.

The second piece, from a software perspective, is HP SiteScope, which allows you to monitor the resource consumption of your application in the cloud. Therefore, you understand when compute resources are spiking or when you have more capacity to drive even more load.

The third software portion is HP Diagnostics, which allows you to measure the performance of your code. You can measure how your methods are performing, how your SQL statements are performing, and if you have memory leakage.

When you have this visibility of end user measurement at various load levels with Performance Center, resource consumption with SiteScope, and code level performance with HP Diagnostics, and you integrate them all into one console, you allow yourself to do true elasticity planning. You can tune your application and right-size it. Once you've right-sized it, you know that when you scale up your resources you're getting return on your investment.
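
The snippet below sketches the kind of consolidated view that elasticity planning relies on, joining driven load, resource utilization, and code-level hot spots per tested load level to decide whether an application is right-sized, resource-bound, or code-bound. The numbers are stand-in data, and none of this uses the actual Performance Center, SiteScope, or Diagnostics APIs.

    # Hypothetical consolidated measurements per tested load level; stand-in data only.
    measurements = [
        # virtual_users, avg_response_s, cpu_utilization, slowest_method_s, instances
        (500,  0.8, 0.35, 0.2, 4),
        (2000, 1.4, 0.60, 0.3, 4),
        (5000, 3.9, 0.72, 2.6, 4),   # response blows up while CPU still has headroom
    ]

    RESPONSE_SLA_S = 2.0
    CPU_CEILING = 0.85

    for users, resp, cpu, slow_method, instances in measurements:
        if resp <= RESPONSE_SLA_S:
            verdict = f"right-sized at {instances} instances"
        elif cpu >= CPU_CEILING:
            verdict = "resource-bound: scaling up should pay back"
        else:
            verdict = f"code-bound: tune the {slow_method:.1f}s hot spot before adding instances"
        print(f"{users} users: {resp:.1f}s response, {cpu:.0%} CPU -> {verdict}")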

All of this is backed by services that HP SaaS provides. We can perform load testing. We can set up the monitoring. We can do the code level performance diagnostics, integrate that all into one console, and help customers right-size the applications in the cloud.

Gardner: That sounds interesting, and, of course, harkens back to the days of distributed computing. We're just adding another level of complexity, that is to say, a sourcing continuum of some sort that needs to be managed as well. It seems to me that you need to start thinking about managing that complexity fairly early in this movement to cloud.

Ashizawa: Definitely. If you're thinking about sourcing to the cloud and adopting it, from a very strategic standpoint, it would do you good to do your elasticity planning before you go into production or you go live.

Tuning the application

The nice thing about Cloud Assure for cost control is that, if you run into performance issues after you have gone live, you can still use the service. You could come in and we could help you right-size your application and help you tune it. Then, you can start getting the global scale you wish at the right cost.

Gardner: One of the other interesting aspects of cloud is that it affects both design time and runtime. Where does something like the Cloud Assure for cost control kick in? Is it something that developers should be doing? Is it something you would do before you go into production, or if you are moving from traditional production into cloud production, or maybe all the above?

Ashizawa: All of the above. HP definitely recommends our best practice, which is to do all your elasticity planning before you go into production, whether it’s a net new application that you are rolling out in the cloud or a legacy application that you are transferring to the cloud.

Given the elastic nature of the cloud, we recommend that you get out ahead of it, do your proper elasticity planning, tune your system, and right-size it. Then, you'll get the most optimized cost and predictable cost, so that you can budget for it.

Gardner: It also strikes me, Neil, that we're looking at producing a very interesting and efficient feedback loop here. When we go into cloud instances, where we are firing up dynamic instances of support and workloads for application, we can use something like Cloud Assure to identify any shortcomings in the application.

We can take that back and use that as we do a refresh in that application, as we do more code work, or even go into a new version or some sort. Are we creating a virtual feedback loop by going into something like Cloud Assure?

Ashizawa: I can definitely see that being that case. I'm sure that there are many situations where we might be able to find something inefficient within the code level layer or within the database SQL statement layer. We can point out problems that may not have surfaced in an on-premise type deployment, where you go to the cloud, do your elasticity planning, and right-size. We can uncover some problems that may not have been addressed earlier, and then you can create this feedback loop.

One of the side benefits obviously to right-sizing applications and controlling cost is to mitigate risk. Once you have elasticity planned correctly and once you have right-sized correctly, you can deploy with a lot more confidence that your application will scale to handle global class and support your business.

Gardner: Very interesting. Because this is focused on economics and cost control, do we have any examples of where this has been put into practice, where we can examine the types of returns? If you do this properly, if you have elasticity controls, if you are doing planning, and you get across this life cycle, and perhaps even some feedback loops, what sort of efficiencies are we talking about? What sort of cost reductions are possible?

Ashizawa: We've been working with one of our SaaS customers, who is doing more of a private-cloud type implementation. What makes this what I consider a private cloud is that they are testing various resource footprints, depending on the load level.

They're benchmarking their application at various resource footprints. For moderate levels, they have a certain footprint in mind, and then for their peak usage, during the holiday season, they have an expanded footprint in mind. The idea here is that, they want to make sure they are provisioned correctly, so that they are optimizing their cost correctly, even in their private cloud.

Moderate and peak usage

We have used our elastic testing framework, driven by Performance Center, to do both moderate levels and peak usage. When I say peak usage, I mean thousands and thousands of virtual users. What we allow them to do is that true elasticity planning.

They've been able to accomplish a couple of things. One, they understand which benchmarks and resource footprints they should be using in their private cloud. They know that they are provisioned correctly at various load levels. They know that, because of that, they're getting all of the cost benefits of their private cloud. At the end of the day, they're mitigating their business risk by ensuring that their application is going to scale to global class and support their holiday season.
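
Conceptually, the output of that benchmarking exercise is a simple lookup: for each expected load level, choose the smallest footprint that was proven to sustain it. The benchmark figures below are invented purely to illustrate the selection step.

    # Assumed benchmark results from elasticity testing: (footprint, max_virtual_users_supported).
    benchmarks = [(4, 1500), (8, 4000), (16, 12000)]     # instances vs. load sustained within SLA

    def footprint_for(expected_users):
        """Pick the smallest benchmarked footprint that covers the expected load."""
        for instances, supported in benchmarks:
            if supported >= expected_users:
                return instances
        raise ValueError("expected load exceeds every benchmarked footprint")

    print("Moderate season:", footprint_for(3000), "instances")    # 8
    print("Holiday peak:   ", footprint_for(10000), "instances")   # 16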

Gardner: And, they're going to be able to scale, if they use cloud computing, without necessarily having to roll out more servers with a forklift. They could find the fabric either internally or with partners, which, of course, has a great deal of interest from the bean counter side of things.

Ashizawa: Exactly. Now, we're starting to relay this message and target customers that have deployed applications in the public cloud, because we feel that the public cloud is where you may fall into that trap of spinning up more resources when performance problems occur, where you might not get the return on your investment.

So as more enterprises migrate to the cloud and start sourcing there, we feel that this elasticity planning with Cloud Assure for cost control is the right way to go.

Gardner: Also, if we're billing people either internally or through these third-parties on a per-use basis, we probably want to encourage them to have a robust application, because to spin up more instances of that application is going to cost us directly. So, there is also a built-in incentive in the pay-per-use model toward these more tuned, optimized, and planned-for cloud types of application.

Ashizawa: You said it better than I could have ever said it. You used the term pay-per-use, and it’s all about the utility-based pricing that the cloud offers. That’s exactly why this is so important, because whenever it’s utility based or pay-per-use, then that introduces this whole notion of variable cost. It’s obviously going to be variable, because what you are using is going to differ between different workloads.

So, you want to get a grasp of the variable-cost nature of the cloud, and you want to make this variable cost very predictable. Once it’s predictable, then there will be no surprises. You can budget for it and you could also ensure that you are getting the right performance at the right price.
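
Making that variable cost predictable is, in the end, a budgeting calculation: multiply an expected usage profile by the unit price and sum it. The price and instance-hour figures below are assumptions for illustration.

    # Assumed unit price and expected instance-hours per month by workload phase.
    price_per_instance_hour = 0.50
    expected_profile = {"low": 6000, "moderate": 3500, "peak": 900}   # instance-hours/month

    projected = {phase: hours * price_per_instance_hour
                 for phase, hours in expected_profile.items()}
    monthly_budget = sum(projected.values())

    for phase, cost in projected.items():
        print(f"{phase:>8}: ${cost:,.2f}")
    print(f"Projected monthly spend to budget for: ${monthly_budget:,.2f}")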

Gardner: Neil, is this something that’s going to be generally available in some future time, or is this available right now at the end of 2009?

Ashizawa: It is available right now.

Gardner: If people were interested in pursuing this concept of elasticity planning, of pursuing Cloud Assure for cost benefits, is this something that you can steer them to, even if they are not quite ready to jump into the cloud?

Ashizawa: Yes. If you would like more information for Cloud Assure for cost control, there is a URL that you can go to. Not only can you get more information on the overall solution, but you can speak to someone who can help you answer any questions you may have.

Gardner: Let's look to the future a bit before we close up. We've looked at cloud assurance issues around security, performance, and availability. Now, we're looking at cost control and elasticity planning, getting the best bang for the buck, not just by converting an old app, sort of repaving an old cow path, if you will, but thinking about this differently, in the cloud context, architecturally different.

What comes next? Is there another shoe to fall in terms of how people can expect to have HP guide them into this cloud vision?

Ashizawa: It’s a great question. Our whole idea here at HP and HP Software-as-a-Service is that we're trying to pave the way to the cloud and make it a smoother ride for enterprises that are trying to go to the cloud.

So, we're always tackling the main inhibitors and the main obstacles that make it more difficult to adopt the cloud. And, yes, where once we were tackling security, performance, and availability, we definitely saw that this idea for cost control was needed. We'll continue to go out there and do research, speak to customers, understand what their other challenges are, and build solutions to address all of those obstacles and challenges.

Gardner: Great. We've been talking about moving from traditional capacity planning towards elasticity planning, and a series of announcements from HP around quality and cost controls for cloud assurance and moving to cloud models.

To better understand these benefits, we've been talking with Neil Ashizawa, manager of HP's SaaS Products and Cloud Solutions. Thanks so much, Neil.

Ashizawa: Thank you very much.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You've been listening to a sponsored BriefingsDirect podcast. Thanks for listening, and come back next time.

Listen to the podcast. Find it on iTunes/iPod and Podcast.com. Learn more. Download the transcript. Sponsor: Hewlett-Packard.

Transcript of a BriefingsDirect podcast on the need to right-size and fine-tune applications for maximum benefits of cloud computing. Copyright Interarbor Solutions, LLC, 2005-2009. All rights reserved.

You may also be interested in: