Monday, December 29, 2008

BriefingsDirect Analysts Make 2009 Predictions for Enterprise IT, SOA, Cloud and Business Intelligence

Edited transcript of BriefingsDirect Analyst Insights Edition podcast, Vol. 35, on how analysts see cloud computing, SOA, the economy, and the Obama Administration in 2009, recorded Dec. 19, 2008.

Charter Sponsor: Active Endpoints.

Dana Gardner: Hello, and welcome to the latest BriefingsDirect Analyst Insights Edition, Vol. 35. This periodic discussion and dissection of IT infrastructure-related news and events, with a panel of industry analysts and guests, comes to you with the help of our Charter Sponsor, Active Endpoints, maker of the ActiveVOS visual orchestration system. I'm your host and moderator, Dana Gardner, principal analyst at Interarbor Solutions.

Our topic this week, and this is the week of Dec. 15, 2008, marks our year-end show. Happy holidays to you all! But, rather than look back at this year in review, because the year changed really dramatically after September, I think it makes a lot more sense to look forward into 2009.

We're going to look at what trends may have changed in 2008, but with an emphasis on the impacts for IT users, and buyers and sellers in the coming year. We're going to ask our distinguished panel of analysts and experts for their predictions for IT in 2009.

To help us gaze into the crystal ball, we're joined by this week's BriefingsDirect Analyst Insights panel. Please let me welcome Jim Kobielus, senior analyst at Forrester Research.

Jim Kobielus: Hi, Dana. Hi, everybody.

Gardner: Tony Baer, senior analyst at Ovum.

Tony Baer: Happy holidays, Dana.

Gardner: Brad Shimmin, principal analyst at Current Analysis.

Brad Shimmin: Hi there, Dana, thanks for having me.

Gardner: Joe McKendrick, independent analyst and prolific blogger.

Joe McKendrick: Hi, Dana, and a happy Festivus to all.

Gardner: Dave Linthicum, founder of Linthicum Group.

Dave Linthicum: Hey, guys.

Gardner: Mike Meehan, senior analyst at Current Analysis.

Mike Meehan: Hello, all.

Gardner: And joining us for the first time, JP Morgenthal, senior analyst at Burton Group. Good to have you, JP.

JP Morgenthal: Thanks, Dana, and I'll jump on the Festivus wagon as well.

Shadow IT

Gardner: Let me start with the predictions. It gives me a chance to steal the thunder and get out there first.

My first prediction for 2009 is that spending from shadow IT activities will actually grow, and that the amount of money devoted to shadow IT activities will come from outside traditional IT budgets, from a variety of different sources, maybe even petty cash, and we'll see a bit of growth in these rogue activities.

At the same time, I think we will see a flattening, and in many cases a reduction, in officially sanctioned IT activities, but that the net result will actually be more spending overall across a variety of activities based on services and consulting as much as actual buying of licensed software and hardware products.

The risk is that these rogue applications can make it complex for governance, management, and even security, but that moving into these areas for business development purposes is going to be an overwhelming temptation. There will be more opportunities in the cloud, software as a service (SaaS), applications as a service, and for folks like marketers, business analysts, and business development professionals to take advantage and move in the market.

We're going to be looking at aggressive sales activities and new ways of reaching consumers of all kinds, across B2B and B2C activities.

I expect very little staff erosion in IT, but I think there will be a change in emphasis as to what IT is, defining it differently. Service-oriented architecture (SOA) is going to continue to grow, but Web-oriented architecture (WOA) will probably overtake it and perhaps become a catalyst to some of these rogue activities. There will be a blurring between which WOA activities happen inside IT and outside.

So, my second prediction is that inside of traditional IT we're going to find a lot of new ways to quickly cut costs. This is going to be a drill for organizations to not spend money or spend less money. Virtualization will be a big part of that. Hypervisors will perhaps go commodity, and the value-add in the virtualized environment is going to be at the stacks -- virtualized stacks or containers at the applications level.

This could then lead to more direction toward a cloud operating system and a de-facto standard could begin to emerge, which would then spur even more adoption of virtualization.

We're going to see a lot more dumping of Unix and mainframes. We are going to sunset a lot of applications that aren't essential and save on the underlying costs of supporting them. There will be some modernization of applications, but only in areas where there is low risk.

There are still going to be a lot of organizations that aren't going to want to tinker with applications that are important, even if they are running on expensive infrastructure.

My third prediction is around extreme business intelligence (BI). There will be a move in scale, larger sets of data, larger sets of content, and more mingling or joining of disparate types of data and content in order to draw inferences about what the customers are willing to do and pay across both B2B and B2C activities.

We'll start to see an increased use of multi-core and parallelism to support these BI activities, and we will begin to see IT have a big role in this. This isn't something you can do as a rogue activity, but it might end up supporting rogue activities. That is to say, these new extreme BI activities might lead organizations to seek out services outside of IT. They then can execute on what they find through their analysis.

I also predict, at number four, that upgrades will suffer. We're not going to see a lot of swapping out of one system for another, unless there's a very compelling return-on-investment (ROI) scenario with verifiable short-term metrics. This is going to hurt companies like SAP and Microsoft, and Oracle and IBM to a lesser extent, given their diversification.

Trouble for Windows 7

I think Windows 7 is in trouble. People are not going to just run to Windows 7. They're going to continue to stay with XP, and this makes the timing around the Vista debacle all the more injurious to Microsoft. In hindsight, Vista needed to be a winner. Now that we're in a downturn, people are going to stick with what they have, and, of course, upgrades are essential for Microsoft to continue with its back-end strategy on data-center architecture and infrastructure.

This provides more of an opening for Linux and non-Microsoft virtualization, and that will continue. This could mean that Microsoft needs to move to its cloud offerings all the more quickly, which then could actually spell earnings troubles for the company, at least in the short to medium term.

My last prediction is that the role of social media and networks will continue to grow and be impactful for enterprises, as marketers and salespeople begin to look to these organizations for the metadata and inference about what customers are willing to buy, particularly under tight economic conditions.

There's going to be a need to tie traditional customer relationship management (CRM) and sales applications with some sort of a process overlay into the metadata that's available from these Web-based cloud environments, where users have shared so much inference and data about themselves.

So, I look for some mashups between social data and the sales and business development, perhaps through these rogue applications and approaches outside of IT, but IT activities nonetheless, in 2009. Thanks.

Jim Kobielus, you're up. What are your five predictions?

Kobielus: I need to go home now. You stole all my predictions. Actually, that was great, Dana. I was taking notes, just to make sure that I don't repeat too many of your points unnecessarily, although I do want to steal everything you just said.

My five predictions for 2009 ... I'll start by listing them under a quick phrase and then I'll elaborate very quickly. I don't want to steal everybody else's thunder.

The five broad categories of prediction for 2009 are: Number one, Obama. Number two, cloud. Number three, recession. Number four, GRC -- that's governance, risk, and compliance. Then, number five, social networking.

Let me just start with [U.S. President Elect Barack] Obama. Obviously, we're going to have a new president in 2009. He'll most likely appoint a national chief technology officer or a national tech policy coordinator. Based on his appointments so far, I think Obama is going to choose a heavy hitter who has huge credibility and stature in the IT space.

We've batted around various names, and I'm not going to add more to the mix now. Whoever it is, it's going to be someone who's going to focus on SOA at a national level, in terms of how we, as a country, can take advantage of reuse, agility, transformation, optimization, and all the other benefits that come from SOA properly implemented across different agencies.

So, number one, I think Obama is going to make a major change in how the government deploys IT assets and spends them.

The maturing of clouds

Number two, cloud. Dana went to town on cloud, and I am not going to say much more, beyond the fact that in 2009, clouds are going to become less of a work in progress, in terms of public clouds and private clouds, and become more of a mature reality, in terms of how enterprises acquire functionality, how they acquire applications and platforms.

I break out the cloud developments in 2009 into a long alliterative list. Clouds will start up in greater numbers. They will stratify, which means that the vendors, like Google, Microsoft, and Amazon and others with their cloud offerings, will build full stacks, strata, in their cloud services that include all the appropriate layers, application components, integration services, and platforms. So, the industry will converge on more of a reference model for cloud in 2009.

They'll also stabilize the clouds. In other words, they'll become more mature, stable and less scary for corporate IT to move applications and data to. They'll standardize, and the clouds will standardize around SOA and WOA standards. There will be more standards, interfaces, and application programming interfaces (APIs) focused on cloud computing, so you can move your applications and data from one cloud to another a bit more seamlessly than you can now with these proprietary clouds that are out there. And, there are other "S" items that I won't share here.

Number three, recession. Clearly, we are in a deep funk, and it might get a lot worse before it gets better. That's clearly hammering all IT budgets everywhere. So, as Dana said, every user and every organization is going to look for opportunities to save money on their IT budgets.

They're going to put a freeze on projects. They're going to delay or cancel upgrades. Their users, as you said very nicely, Dana, are going to dip into petty cash and go around IT to get what they need. They're going to go to cloud offerings. So, the recession will hammer the entire IT industry and all budgets.

As far as GRC, government is cracking down. If it has to bail out the financial-services industry, bail out the auto industry, and bail out other industries, the government is not going to do it with no strings attached.

Compliance, regulations, reporting requirements, the whole apparatus of GRC will be brought to bear on the industries that the government is saving and bailing out.

Then finally, social networking. Dana provided a very good discussion of how social networking will pervade everything in terms of applications and services.

The Obama campaign set the stage clearly for more WOA-style, Web 2.0, or social-networking style governance in this country and other countries. So, we'll see more uptake of social networking.

We'll see more BI become social networking, in the sense of mashup as a style of BI application, reporting, dashboards, and development. Mashups for user self-service BI development will come to the fore. It will be a huge theme in the BI space in 2009 and beyond.

That really plays into the whole cost control theme, which is that IT will be severely constrained in terms of budget and manpower. They're going to push more of the development work to the end user. The end user will build reports that heretofore you've relied on data modelers to build for you. Those are my five.

Gardner: Thank you, Jim. Tony Baer, you're up. What did we miss?

Cost savings, cost savings

Baer: It's going to be hard to top both of you folks, so I'm going to just add some things in the margins. If I were to make one elevator statement on this, I feel like the guy [Kenan Thompson as Oscar Rogers] from Saturday Night Live, the economic expert, who they interview on "Weekend Update." He starts to give all the causes. Then, he just says, "Well, just fix it!"

That's essentially going to be the theme this year. The top five are going to be cost savings, cost savings, cost savings.

That does involve a lot of the strategies that both you and Jim have just described. For one thing, it's going to put a lot more emphasis on using the resources and infrastructure that you already have. It's going to damp down entering into new long-term contracts for anything.

Ironically, one result of that is that for the moment, you'll actually see a little less emphasis on outsourcing, because that does imply a long-term contract. The fact is, I don't think anyone is really doing any meaningful projecting beyond Q1. I was just reviewing Adobe's year-end numbers and projections. Normally, they project out for the full fiscal year, and they are only going to project out for Q1.

I'll just go through a very quick laundry list. For one thing, as I mentioned, it's going to be a lot of low cost, no cost. There will be a lot more use of open source, a lot more. This is definitely the year that the cloud and SaaS come into their own, but with a key qualification.

I think it's going to be managed clouds. Essentially, to take advantage of raw clouds, like Amazon EC2, you have to put in more of your own management infrastructure. I don't see the use of what I would call "clouds in the wild." I see more managed clouds from that standpoint.

For IT organizations, it's going to dictate more attention to IT service management to show that we're not just keeping systems going and keeping the lights on, but more along the lines of, "Here are the services that we're delivering to the business," as they try to justify the system.

On the back-end, it will be "Use more of what you have," and huge renewed investments in BI. So, Jim, I do think you still have a job this year.

Finally, because it's going to take a while for this to unfold -- you just don't regulate overnight -- there will be much greater attention to GRC.

Gardner: Thank you, Tony. Brad Shimmin, you're up.

Shimmin: Thanks, Dana. For my predictions for 2009, I took a different tack in anticipation of a new analytical concern we're starting up here in January. It's going to focus on collaboration. So, everything I did settled on that.

All the predictions I have stem from the themes that you guys have been talking about: cutting cost, such as travel, and squeezing efficiencies out of the IT infrastructure, as well as the users themselves. So, bear that in mind as I go through this.

Collaborative social networks

The first one for me is vendors tackling enterprise-plus-consumer based social networks, a blended view of those. Enterprise-focused vendors are going to do more than simply sync info from public sites like Facebook. They're going to take that information and build into or out from the enterprise into those social networks and drive information from those. It's going to become a two-way street.

You're going to see folks like Facebook, and most notably, LinkedIn, working in the other direction themselves, and with third parties, to develop enterprise-bound social networks. Look for those to emerge next year.

The second thing for me is cloud software, now that it's jumped the shark. I know we've all been talking about it, but it's definitely jumped the shark for me. I see the vendors within the collaboration space settling beyond the small and medium business (SMB) market and looking more toward the larger enterprises that are looking to squeeze more out of their existing IT infrastructure or cut costs.

Folks like IBM and Microsoft have already shown us that they can hit the long tail with stuff like Bluehouse and Microsoft Online Services (MOS) for collaboration. But, you're going to see vendors like Cisco and Oracle take up this challenge with more of a focus on managed hosting services that look more like SaaS, but they are really managed.

That's something that will appeal to the larger enterprises, owing to security, manageability, and other assurances that you get from that, not just pure-play, do-it-yourself SaaS.

The third thing for me is that enterprises are going to move away from a steep hierarchy, or the word might be "oligarchy," of an organizational model internally. This is just about how enterprises structure themselves.

This goes back to what you were saying, Dana, with stuff going off the books, and what Tony was saying about driving revenue from places other than CAPEX. Instead, to become not just more efficient and agile, companies are going to want to self organize to create these internal ecosystems, if you will, where organizations are built around employee experience, associations, interests, and energy levels -- what they want to focus on.

That's going to allow companies to more efficiently harness the users. The people, as Jim was saying earlier, perhaps are going to be tasked with setting up their own BI queries and mashing up their own applications. It's really thinking about those people, giving them the ability to run the show inside of an organization, instead of waiting for everything to come top-down.

The fourth thing for me is -- speaking in terms of communities, both internally and externally -- I am seeing silos break down between those.

Gone are the days of consumer-faced social networking and enterprise-faced social networking existing as independent entities, as I was saying earlier. Thanks to user profile standards like OpenID and expansion of APIs, community providers and third-party aggregation and integration tool vendors are going to allow applications and users to flow between what were heretofore closed communities.

For example, you already have vendors moving in that direction with Yahoo's YOS, which now allows the My Yahoo start page to host third-party applications from nemesis Google.

The fifth and final thing for me -- and this might be more of a wish than a prediction; I'm an eternal optimist I guess -- I'm looking for virtual worlds to gain a foothold in the enterprise.

We've seen folks like [Cisco Chairman and CEO] John Chambers use Second Life to do a dog-and-pony show. Those are great marketing tools, but they're nothing compared to the efficiencies and benefits you can gain from using the software for other things. Dana, you alluded earlier to being able to leverage that mechanism for communication with CRM. I think we're going to see that change how virtual networks can be utilized inside the enterprise.

It's not just for marketing and sales, but also to support B2B and B2C communities, where effective communication between your supply channel members is really paramount. To date, nobody has tackled that.

So, we'll see virtual worlds actually make an impact in terms of allowing these global, loosely coupled entities to communicate more effectively in 2009. That's it for me.

Gardner: Thanks. Joe McKendrick, how do you see things shaking out?

McKendrick: Thanks, Dana. You guys are a hard act to follow. My first prediction -- are you ready for this -- the government, the U.S. Treasury, is going to swoop in with the Troubled Assets Relief Program (TARP) funds and swoop up all the troubled IT assets across the country -- those IBM mainframes, older mainframes, DEC units, Windows NT.

Then, the Fed is going to come in with zero percent liquidity to help finance it, and that's going to raise all boats.

Gardner: Joe, are you defining a new sector called "Toxic IT?"

McKendrick: Toxic IT, there you go.

Gardner: Joe, April 1 is not for several months.

McKendrick: Okay, just kidding. My other prediction: President Obama is going to make Tony Baer the National CTO/CIO, because he wants to "just fix it," and that's a good philosophy.

It's the economy

Okay, all seriousness aside now. The top issue, of course, is the economy. It's going to dominate our thinking through 2009. But, recession planning is so 2008, because SOA, which I focus on as well as IT, is a long-term process. You need to look three years down the road.

The economy is going to turn around. I see it turning around at some point in 2009. That's what economists are saying, and companies have to prepare for a growth mode and the ability to grow within a new environment.

Let's face it. IT has already been tight. IT has been tight since the dot-bomb era of 2001-2002. As some of us have already been saying, there probably is not going to be a huge diminishment in IT departments, because of the fact that the budgets have been lean, things have already been tight, companies already know, or have been running very efficiently, and IT departments have been overworked as it is.

An interesting sidelight is the whole Enterprise 2.0. JP, you and I have discussed this a little bit. The recession and downturn isn't going to be like it's been in the past. People are more empowered with social networking tools, as employees and as people looking for jobs. They're looking to start new businesses.

We have a lot of tools available to us now that we didn't have back in 2000, or we didn't have back in 1991 or 1982, or any of those previous eras. People don't have to be victims of an economic downturn, as they have been in the past. We have the capability to network across the globe. We have the capability to start new businesses.

I've talked on this webcast before about a company that started a business with an $80 investment in IT infrastructure, thanks to cloud computing. I just heard about another company that spent about $200 for its first two months of IT.

Gardner: The question is, Joe, are they getting their money's worth?

McKendrick: I think they are. They don't have to invest in servers. They don't have to go out and buy servers. They don't have to go out and buy disk arrays, and worry about the maintenance, hiring people, and knowing how to maintain those things. There are a lot of opportunities for companies, and we are going to see that. We are going to see folks -- maybe IT people, or people who work for vendors and have been laid off -- have the ability to start their own business at a very low cost of entry.

On the flip side of that, the whole social-networking and cloud-computing phenomena, companies have these tools as well to employ low-cost methods to reach their markets and to interact with their customers. We're going to see a lot more of that as well.

A marketing campaign doesn't have to cost $200,000 to reach your customers. You can use the social network, the Web 2.0 tools, to interact and collaborate and find out what's going on in your markets at a relatively low cost.

Gardner: From your mouth to God's ears. All right. Dave Linthicum, we have the entire future before us. What should we expect?

Linthicum: You guys took a lot of my better ideas, but I'll just expand on some of them.

The first thing I'd like to do is throw my firm out there for a bailout from the government. I think a billion dollars. I'm cash-flow positive, but I think I can do a lot with the money, including throwing one hell of a New Year's Eve party. So, hopefully the money will start coming in.

Cloud comes into its own

Number one is that the interest in cloud computing, which I have been focusing on in my career, at least for the last eight years, is finally going to come into its own, like everybody has been saying here. That's rather obvious at this point.

As far as what I can add to what's been said so far, what we're going to see in 2009 is a lot of startups, specifically some cloud-computing startups. You're going to see even more around what I call "cloud mediation." That is guys like RightScale, and a few other folks in the space that sit between you and the major cloud providers. They basically mediate issues around data semantics, performance management, load balancing, and those sorts of things.

One thing that's a big hole in the cloud computing movement so far is that most of the solutions out there, even the database solutions, are proprietary. They use different APIs, different interfaces, and different sets of standards. It's going to be a play for a lot of companies to get in there and provide more reliable infrastructure in and between these various guys out there.

I'm aware of one startup a week, and they're coming in through the funders, not necessarily through the entrepreneurs, which is unusual.

The links to social networking will be there. They're not going to be quite as pervasive as everybody thinks. Social networking is going to have its place, but once we figure it out, it will be, "Okay, yeah." It's going to have its value, but we're just going to move on as far as this revolution goes. I don't think that's going to happen in 2009.

People are going to use it as a marketing opportunity, just like they used email, Web sites and those sorts of things, and now blogging opportunities, but eventually it's just going to fall into place.

There will be a huge explosion in the rogue cloud movement, as you mentioned, Dana, and also the platform-as-a-service (PaaS) space. The architects and CIOs out there are going to be scrambling around trying to figure out how to place governance around that.

Everybody is going to be building applications, typically using free platforms like Google App Engine. They're going to start launching these things into production, and there is going to be no rhyme or reason around how they fit into the existing infrastructure. That's happening now and it's going to happen more in 2009.

In switching gears to SOA, there's going to be a larger focus on inter-domain SOA technology. The focus will still be on the short-term tactical and the ability to provide quick value in the SOA space to justify it, so you can get additional funding.

As we start building these things, people are going to look at the departments that are implementing their SOA projects and try to figure out how to bind these things at an enterprise level. I call this the micro domain versus the macro domain.

Technology doesn't scale typically to that point, as people are finding, and it's going to take a different set of technologies and a different set of architectural skill sets to solve that problem.

On the downside, the jig will be up for poor SOA technology out there. Guys who haven't been able to get acquired or haven't been able to hit that inflection point and are still stumbling along -- typically making $2-$5 million a year and burning about that much in cash -- are eventually just going to have the plug pulled. And, 2009 is going to be when it's going to happen. They're just going to run out of steam.

We have a few of them right now. Ultimately, they're going to have lots of cuts, start hemorrhaging cash, and they're just going to go out. Some of them may be bought on the cheap, but the majority of them are just going to shut their doors.

Decline of the SOA buzzword

Finally, the SOA buzzword out there is going to diminish in relevancy. I'm talking about the buzzword, not necessarily the notion of SOA. SOA predates when the buzzword was created, and it's going to postdate when the word "SOA" was created. It's going to morph into different things, and the cloud computing movement is going to get into it and define it in different directions.

Enterprise architecture had a chance to get in there and figure out how SOA relates back into their world. They've been fairly successful in some aspects of it, but they have been too slow in moving. The whole SOA movement is going to be more defined by the cloud. That's good for me and probably for everybody on this call.

Gardner: You predicted a couple of years ago, Dave, that SOA would get subsumed into enterprise architecture. I assume that's what you are talking about?

Linthicum: Yeah, that's what I am talking about. Most SOA is going to get practiced in '09 and '10, at least the new stuff, in the cloud-computing movement, even though it's still SOA. Basically, it's going to encompass cloud resources. Enterprise architecture will ultimately morph with SOA, and they'll become fundamentally the same concept.

SOA, which has always been an architectural pattern under the domain of enterprise architecture, will be subsumed by enterprise architecture and will be an architectural pattern under enterprise architecture. But, we're not going to be talking as much about SOA in '09.

Gardner: Just one quick follow-up. In terms of startups, you don't seem to think that there is going to be much funding left, no IPOs to speak of. What's the business model for these startups that you're seeing, the ones that can take advantage of PaaS with low upfront costs? How do they get funded? Do they need funding? And, what's their end strategy as a business?

Linthicum: They do need funding, but they don't need as much funding as a company a couple of years ago, just because of everything you can get on demand. The strategy for the business is basically to glom onto the cloud-computing movement.

Some of the larger enterprises out there, some of my clients who are moving into the cloud-computing space by leaps and bounds, are realizing there are huge holes in the area, such as monitoring, event management, security, data mediation, all these sorts of things that aren't built into the larger cloud providers out there.

They have an immediate demand right now, a pent-up demand that's being created by the desire to lower cost, and driving a lot of these enterprises out into cloud computing. They're seeing these holes, and they are looking for solutions to make these happen. Both the entrepreneurs and the funders have realized that these things exist, and they are scrambling around trying to get them up and running.

As far as funding goes, it doesn't take that much to get a company, the assets, and the infrastructure up and running. Most of these solutions you will find will be leveraging on-demand platforms themselves. So, they'll be coming out of the cloud, providing services to clouds.

Gardner: They might actually find some engineers to hire from all those other startups that went away.

Linthicum: There are a lot of them on the streets right now.

Gardner: All right. Mike Meehan, there must be something we've missed so far.

Meehan: I don't know if there's anything you really missed, but I am going to pretend like you have and try to get some stuff in there.

The first three have to do with the economy, because obviously everybody is dealing with what we expect to be a down economy.

Rise of the 'Yankee Swap'

The first one is going to be a blast from the recent past. If everybody remembers back in 2001, when that recession hit, all of a sudden you could buy wonderful amounts of gear on eBay for next to nothing. I remember talking to one guy who was smiling like a Cheshire Cat, because he had replaced $45,000 worth of Unix with $500 worth of Linux. I think you are going to see a lot of that.

People are going to be shutting down data centers. That's going to cause a glut of servers and storage gear and network gear, and you are going to be able to get it cheap and affordable. That's going to hit the storage and network and server companies.

New sales are going to be tough to come by, because you're going to be able to get previously owned gear at affordable prices.

Gardner: So, a great disruption to the existing channel then?

Meehan: Exactly. It's really going to hit the channel vendors. CIOs are going to be able to come in and say, "Hey, look, I'm a genius. I bought all of this stuff for next to nothing." And, there are going to be other CIOs who come in and say, "Hey, you know what? I was able to get some money by liquidating our assets." That financial pressure is going to affect everybody in the hardware market.

Gardner: They used to call it a Yankee Swap, didn't they?

Meehan: Yeah. I think you are going to see a big international Yankee Swap. So that's going to be out there.

The next one is license wars. The CIOs are coming in, they are going to be asked to cut budget, and there is only so much flesh you can cut out before you have to deal with that maintenance license budget. I think every company in the world is aware of the fact that they pay more in licenses than they want to. They have always theoretically wanted to lower those costs. The pressure now is going to be too great for them to not consider options.

This is going to be great for open source companies, which are going to be able to come in and say, "All right, you don't have to pay me a rolling license. Here is my support cost. See how much it's going to lower your license costs."

It is going to be bad for Microsoft, because again, to a degree they are becoming commoditized across their portfolio, and that's going to hit them right in the breadbasket.

Gardner: Do you agree with me that in hindsight the fact that Vista didn't live up to its potential is really going to hurt them?

Meehan: Absolutely. There are still companies out there working on Windows 2000, and those companies are going to be looking to switch. That they haven't gone to Vista just makes them free agents. And this is going to also apply to Office.

Gardner: Whoever that architect was on that Vista project, he's fired, right?

Meehan: I think he's long gone. I think he is running the charitable foundation. They not only missed it, but they reinforced every negative perception of Microsoft when they came out with Vista: The inability to meet a product deadline; the security flaws that have been long associated with Microsoft; you need a zillion patches just to get it to work and do basic things.

Everything that they were supposed to have addressed, they failed to address, and then they reinforced that. Now, companies are just sitting there asking, "Why am I paying this much money for bad software?"

Bad year on the sell side

Gardner: So, it will be a really good year if you are a negotiator on the buy side, but a terrible year if you're on the sell side.

Meehan: I'd think so. This should hit some enterprise resource planning (ERP) vendors too. Anybody who can sell SaaS in the ERP market is going to be doing better. I think you are going to see some erosion on the SAP and Oracle side, as far as enterprise apps go.

"Make my life easier or go away." That basically means, users are going to need productivity and ease-of-use integration. You're going to see those in requests for proposals (RFPs). If they're not stated explicitly, they will be there implicitly.

Referring to SOA projects, for example, don't come in and tell me how much work I'm going to have to do to make all of this come together. Come in and tell me how this is going to make my life easier on day one. The companies that can deliver that will be the ones making the sales. The ones who are telling you that you're going to need to do eight months of work to get this up and running are going to be pushed to the back burner.

I really think that's the lure of the Web-oriented stuff. I take issue with the notion of WOA, because I don't necessarily buy into the architecture portion of it, but I do buy into the notion that it makes your life easier. It makes things easier to do. If you are a developer, it can get your stuff up and running quickly. If you can do that in some sort of organized governable fashion, then go with that.

What you're going to see in a lot of the SOA projects out there in particular is, "All right. Make it easy for me to assemble an application. Make it easy for me to reuse my assets. Make it easy for me to modify my existing applications. Make it easy for me to integrate different applications and even information between different divisions of my company."

Gardner: When you say "make it easy," are you talking about governance?

Meehan: I'm actually just talking about the mechanical process of doing it. You almost want it to be governable on the fly. What you really want is that you don't have to dedicate too much time and resources to undertake these functions. Users aren't going to have that much time or that many resources.

For example, imagine I'm a financial-services company and I've picked up a good loan portfolio from a distressed corporate loan company that had to sell their good loans off, because they were distressed, because they had made bad private loans. I got a good package of corporate loans from them. I need to integrate that quickly into my system, otherwise I am not going to be able to effectively govern that. I'm also not going to be able to effectively create the future programs around those customers, which is what I am looking to do.

So, how quickly can I do things now, as opposed to how thoroughly can I do things? You're going to want to be thorough to an extent, but really it's going to be speed to market and speed to end of project that's going to be a determinant in there.

Telecom shakeup. The U.S. government is going to start treating telecom like it's our national road system, and you are going to see some serious investment in that area. That's going to become one of the key points in the economic stimulus package that you're going to see.

I also think you are going to see European telcos begin to encroach, either through acquisition or just through offering services into the U.S. market.

The last one, HP buys Sun. Somebody is going to get bought this year, somebody fairly big. I'm saying HP is buying Sun.

Gardner: They don't need to buy them. They can just replace all their servers in the marketplace.

Meehan: Basically.

Gardner: JP Morgenthal, you're up. The predictions swan song. We must be missing something?

Morgenthal: The funny thing is, I have had you on mute, listening to everybody, and struggling, because while this was going on, I had a visit from my media-services-in-the-cloud provider. He had to come set up my new entertainment in-the-cloud service box. We still need people is the point there. So, I found that very interesting and humorous to be going on when everyone was talking about clouds.

Age of reformation

Gardner: You're talking about the cable guy?

Morgenthal: Exactly, the cable guy. The cable guy was here setting up my TiVo box. I'm going to preface my five by saying that I see we're entering into a modern age of reformation, and there are some really interesting things that are going to start occurring this year, moving forward to 2012. I know. It's my own prophecy, and it's out there, hanging on a limb.

My first prediction is that we're going to see a greater focus on the business process. Not business process management (BPM) per se, although initially people will target that thinking they are doing business process, but eventually they will get it.

I think SOA is dead, and I believe companies have no stomach for IT initiatives that cannot immediately be attributed to a value. They're going to do some small-scale business process re-engineering, they're going to get tremendous value from it, and they are going to get it.

They're going to see that simplification is the way to go. Why are we doing all these complex things -- this hooking to that, hooking to this, hooking to that? I can just go into this one box and get everything done there. I don't care that it's not sexy, okay.

The age of disposable computing is here. We have had disposable electronics, disposable cars, and disposable appliances. The age of disposable computing is here.

Number two: The backlash of social networking. We're just on the precipice. Everyone is getting into it, having a little fun. Certain ones of us are on the leading edge. We're already getting bombarded and tired. We're already fried and overloaded from these social networks. The new people think it's a great new toy.

Give it a couple of years and you are going to see a tremendous backlash. You're going to see a rise of firms that will get paid to get people off the grid -- people who made big mistakes in thinking they were having fun during their early social networking experiment.

Gardner: This is sort of like tattoos, but in the cloud?

Morgenthal: Exactly. Angelina Jolie has got to get Bobby off her butt, and it's going to cost her. We're going to start to see that. We'll see the real backlash come into effect in 2010, but we'll start to see forms of it in this coming year.

Third, the pain from the economy is going to impact the open-systems market. We're seeing the rise of what I call the "anti IT." You hit upon that. You read about people reaching into petty cash, doing things on the cheap, finding other ways to get things done.

The one that's going to be the biggest impact is that people are treating open source like free software. That will destroy the open-source market for sure. It's the death knell. It's the stake in the vampire's heart.

People don't get it. I remind every one of my customers of that, when I talk to them, and they ask about an open-source solution. I've got to put my warning out there. Open source is not free software. You're either contributing dollars to the team that's doing it, or you are contributing your time and effort. It's not free software. You just don't take it and use it. That will be the death knell for open source for sure.

Gardner: Wait a minute, a death knell for open source or death knell for commercial open source as a business model?

Morgenthal: That's a good question. I won't differentiate at this point, because I'm looking at it from the perspective of the event horizon, where people are treating it like free software. There is no free lunch. Somewhere it's going to take hold. There's going to be a lack of support or a lack of desire to continue this thing, if people are abusing the system. It happens all the time. Nothing will drive greater abuse of open source than a bad economy, where there are no dollars.

Gardner: Okay. What else have you got?

Morgenthal: Number four: the millennial workforce is starting. This is going to change everything, and it's starting to already. These people have attitude that I haven't seen in a workforce since marketing people came out in the dot-com era.

They definitely feel like, "I want my toys. I want to be able to use my phone at work. I want to use my computer at work. I want to be able to access my sites at work." I see companies dealing with this issue in a unique way.

Their attitude isn't, "If you want a job, then you have to deal with it in our way." It's, "I'm scared. I don't know where I am going to get my workforce for the 21st Century, and I don't know how to deal with these people." Their first inclination isn't to push back with the old adage and the old way of talking about it, saying, "Hey, it's our way or the highway. We've got the money." It's "Okay, what do you want?"

This is going to really change things. How? It's yet to be seen, but clearly the introduction of a much more mobile force, more telecommuters.

Gardner: Most of us.

Morgenthal: That's a lifestyle choice. Yeah, it's pretty interesting. The millennial workforce is going to change things dramatically.

Shift in patent landscape

The last one is that there's a big change coming in Digital Rights Management (DRM) and patent and copyright. It's being led by this initiative out of Harvard with the Recording Industry Association of America (RIAA). RIAA may have just started a war for everybody in the industry who has any copyright or any patent infringement suit. The judge in the case said, "All you people, you big companies with big lawyers and big money, are taking on these poor little schnooks, and it has got to stop. They are coming in here and they don't even know what their legal rights are."

Gardner: Do you think this is what Nathan Myhrvold is up to?

Morgenthal: I didn't see his name associated with it. It was actually a Harvard law class, I believe, represented by a Harvard law professor [Charles Nesson], backing it. They're representing it as unconstitutional. So this case could be landmark for DRM, copyright infringement, and patent infringement.

Gardner: So, the basic message is kill all the patent trolls.

Morgenthal: It could be, and it would have a tremendous impact going into the potential for a startup economy. Dave talked about the startup economy, where downtime is a great time to start a new company and a great time to get out there and get your technology done early.

Landmark cases like this will do a lot to further the opportunities of these firms to go out there and build something without worrying, "Am I going to get taken out by Microsoft? Am I going to get taken out by Apple? I can't afford that." It's really interesting what could happen, given that cases like this are now falling on the side of the small guy, and not on the side of big companies.

Gardner: Right. Big companies were the victims of the patent trolls, now they are becoming patent trolls themselves.

Morgenthal: Yeah. They're hiring companies to go eat these things up, and then they are going after the small guy. We had multi-million dollar lawsuits over patent infringement for technologies that people hadn't even built or owned. I really think that the greed of Wall Street is also going to see that backlash, and it's going to lead to more of the same, or at least help those cases significantly.

People who have made big money pillaging the system over the years, in the age of reformation, are the ones that are going to get hung in the next two to three years.

Gardner: We're just about out of time. Let's go quickly down our list for any last synthesis insights.

Jim Kobielus, senior analyst at Forrester Research, thanks for joining. What's your synthesis of what you have heard?

Kobielus: My synthesis is that we are living in a very turbulent and volatile time in the industry. Things are changing on many levels simultaneously, and a lot of it will just be hammered by the recession. Approaches like cloud, social networking, and everything will be driven by the need to cut cost and to survive through fiscal austerity for an indefinite period.

Gardner: Tony Baer, senior analyst, Ovum, what's your takeaway?

Baer: It's hard to know where to start, but if there is one way to look at it, it's back to basics. There are a lot of complex issues, and I think it's all going to be resolved locally, which in the long run, is going to present a huge governance challenge.

Gardner: Brad Shimmin, principal analyst, Current Analysis, what's your current analysis?

Shimmin: Currently, I'm thinking that the millennial generation and the down economy are converging like a perfect storm to wipe away what we have known for the last 10 years, and then ushering in either perfect terror or a great new economy. I'm not sure which yet.

Gardner: Joe McKendrick, independent analyst and blogger, what's your toxic IT prediction?

McKendrick: We're definitely at a turning point. I agree with what everybody is saying out there about growth mode. Dana, I like your observations about the rogue or the shadow IT. You're going to see a lot more of that. It's been predicted for quite a few years actually that IT is going to be less of an entity unto itself and more of a function that's built into business units.

Business people are getting more involved in IT. Business people are getting more savvy about IT. JP talked about the millennial generation. They're very savvy about what IT and the power of IT can provide. We're going to see less of IT as a distinct area of the business and more part of the business, an enabler of the business. This year is going to accelerate that.

Gardner: Dave Linthicum, founder of Linthicum Group, what are you seeing from what you have heard and what's your net-net?

Linthicum: I think it's going to be one of the most exciting couple of years in IT. Just by sheer cost pressure, we're going to have to get down to simplifying and solving some of these issues, and not just playing around with technology. Things are going to get more simplistic, more effective, and more efficient than they have been over the last 20 years of building layer upon layer of complexity. We just can't afford to do that anymore, and now we are going to have to go fix it.

Gardner: Mike Meehan, senior analyst, Current Analysis, any additional takeaways?

Meehan: There's a lot of panic out there, and in keeping with one of the great holiday traditions, I think the winner is going to be Mr. Potter. The future belongs to warped, frustrated old men.

Gardner: He's buying up all those mortgages for pennies.

Meehan: Exactly.

Gardner: Alright. JP Morgenthal, one last go. What do you see from what you have heard on a high-level takeaway?

Morgenthal: Opportunity and fear -- and it's a matter of which one is stronger. I have no prediction as to which will win out. They're both equally powerful right now, and it's going to be, as Dave said, exciting to watch these two clash and see which one wins.

Gardner: I guess my takeaway is that we don't know how long it's going to take, but we will come out of this period. Survive any way you can, but be mindful that on the other end it's going to be something quite new, with a lot of opportunities, and it's going to look a lot more like Internet time, and the clicks will mean more than the bricks.

Well, thanks all very much. Have a great holiday season. Please take a few days off and relax with your families.

I also want to thank our Charter Sponsor for the BriefingsDirect Analyst Insights Edition podcast series, and that is Active Endpoints, maker of the ActiveVOS visual orchestration system.

This is Dana Gardner, principal analyst at Interarbor Solutions, thanks for listening. Have a good year in 2009, somehow.

Edited transcript of BriefingsDirect Analyst Insights Edition podcast, Vol. 35, on how analysts see cloud computing, SOA, the economy, and Obama in 2009. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.

Tuesday, December 16, 2008

MapReduce-scale Analytics Change Business Intelligence Landscape as Enterprises Mine Ever-Expanding Data Sets

Transcript of BriefingsDirect podcast on new computing challenges and solutions in data processing and data management.

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Greenplum.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, we present a sponsored podcast discussion on the architectural response to a significant and fast-growing class of new computing challenges. We will be discussing how Internet-scale data sets and Web-scale analytics have placed a different set of requirements on software infrastructure and data processing techniques.

Following the lead of such Web-scale innovators as Google, and through the leveraging of powerful performance characteristics of parallel computing on top of industry-standard hardware, we are now focusing on how MapReduce approaches are changing business intelligence (BI) and the data-management game.

More types of companies and organizations are seeking new inferences and insights across a variety of massive datasets -- some into the petabyte scale. How can all this data be sifted and analyzed quickly, and how can we deliver the results to an inclusive class of business-focused users?

We'll answer some of these questions and look deeply at how these new technologies will produce the payback from cloud computing and massive data mining and BI activities. We'll discover how the results can quickly reach the hands of more decision makers and strategists across more types of businesses.

While the challenge is great, the new value for managing these largest data sets effectively offers deep and powerful new tools for business and for social and economic progress.

To provide an in-depth look at how parallelism, modern data infrastructure, and MapReduce technologies come together, we welcome Tim O’Reilly, CEO and founder of O’Reilly Media, and a top influencer and thought leader in the blogosphere. Welcome, Tim.

Tim O’Reilly: Hi, thanks for having me.

Gardner: We're also joined by Jim Kobielus, senior analyst at Forrester Research. Thank you, Jim.

Jim Kobielus: Hi, Dana. Hi, everybody.

Gardner: Also, Scott Yara, president and co-founder at Greenplum. Welcome, Scott.

Scott Yara: Thank you.

Gardner: We're still dealing with oceans of data, even though we have harsh economic times. We see reduction in some industries, of course, but the amount of data and need for analytics across the Internet is still growing rapidly. BI has become a killer application over the past few years, and we're now extending that beyond enterprise-class computing into cloud-class computing.

I want to go to Jim Kobielus first. Jim, why has this taken place now? What is happening in the world that is simultaneously creating these huge data sets, but also making necessary even better analytics across more businesses?

Kobielus: Thanks, Dana. A number of things are happening or have been happening over the past several years, and the trend continues to grow. In terms of the data sets, they're becoming ever more massive for analytics. It’s equivalent to Moore’s Law, in the sense that every several years, the size of the average data warehouse or data mart grows by an order of magnitude.

In the early 1990s or the mid 1990s, the average data warehouse was in gigabytes. Now, in the mid to late 2000s, it's in the terabytes. Pretty soon, in the next several years, the average data warehouse will be in the petabyte range. That’s at least a thousand times larger than the current middle-of-the-road data warehouse.
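The arithmetic behind that estimate can be checked with a minimal sketch. The years and sizes below are illustrative assumptions (roughly 1 GB in 1995 and 1 TB in 2008), not figures given in the discussion, and steady exponential growth is assumed.

```python
import math

GB = 10**9
TB = 10**12
PB = 10**15


def years_per_order_of_magnitude(start_bytes, end_bytes, years_elapsed):
    """Years needed for a 10x increase, assuming steady exponential growth."""
    orders = math.log10(end_bytes / start_bytes)
    return years_elapsed / orders


# Assumed endpoints for illustration only: ~1 GB in 1995, ~1 TB in 2008.
pace = years_per_order_of_magnitude(GB, TB, 2008 - 1995)

# Terabytes to petabytes is another factor of 1,000, i.e. three more orders.
growth_factor = PB // TB
years_to_petabyte = pace * math.log10(growth_factor)

print(f"about {pace:.1f} years per 10x increase")
print(f"TB to PB is a factor of {growth_factor}")
print(f"about {years_to_petabyte:.0f} more years at the same pace")
```

Under these assumed endpoints the pace works out to a bit over four years per order of magnitude, which is consistent with "every several years."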

Why are data warehouses bulking up so rapidly? One key thing is that organizations, especially in tough times when they're trying to cut costs, continue to consolidate a lot of disparate data sets into fewer data centers, onto fewer servers, and into fewer data warehouses that become ever-more important for their BI and advanced analytics.

What we're seeing is that more data warehouses are becoming enterprise data warehouses and are becoming multi-domain and multi-subject. You used to have tactical data marts, one for your customer data, one for your product data, one for your finance data, and so forth. Now, the enterprise data warehouse is becoming the be all and end all -- one hub for all of those sets.

What that means is that you have a lot of data coming together that never needed to come together before. Also, the data warehouse is becoming more than a data warehouse. It's becoming a full-fledged content warehouse, not just structured relational data, but unstructured and semi-structured data -- from XML, from your enterprise content management (ECM) system, from the Web, from various formats, and so forth. It's coming together and converging into your warehouse environment. That’s like the bottom of the iceberg that’s coming up, you're seeing it now, and it's coming into your warehouse.

Also, because of the Web 2.0 world and social networking, a lot of the customer and market intelligence that you need is out there in blogs, RSS feeds, and various formats. Increasingly, that is the data that enterprises are trying to mine to look for customers, marketing opportunities, cross-sell opportunities, and clickstream analysis. That’s a massive amount of data that’s coming together in warehouses, and it's going to continue to grow in the foreseeable future.

Gardner: Let’s go to Tim O’Reilly. Tim, from your perspective, what has changed over the past 10 or 20 years that makes these datasets so important?

Long-term perspective

O'Reilly: If you look at what I would call Web 2.0 in a long-term historical perspective, in one sense it's a story about the evolution of computing.

In the first age of computing, business models were dominated by hardware. In the second age, they were dominated by software. What started to happen in the 1990s, underneath everybody’s nose, but not understood and seen, was the commodification of software via open industry standards. Open source started to create new business models around data, and, in particular, around network applications that built huge data sets through user participation. That’s the essence of what I call Web 2.0.

Look at Google. It's a BI company, based on massive data sets, where, first of all, they are spidering all the activity off of the Web, and that’s one layer. Then, they do this detailed analysis of the link structure of that Web, and that’s another layer. Then, they start saying, "Well, what else can we find?" They start looking at clickstream data. They start looking at browsing history, and where people go afterward. Think of all the data. Then, they deliver service against that.

That’s the essence of Web 2.0, building a massive data set, doing real-time analytics against it, and then figuring out what services you can deliver. What’s happening today is that movement is transferring from the consumer Web into business. People are starting to realize, "Oh, the companies that are doing better are better with their data."

A great example of that is Wal-Mart. You can think of Wal-Mart as a Web 2.0 company. They've got end-to-end analytics in the same way that Google does, except they're doing it with stuff. Somebody takes something off the shelf at Wal-Mart and rings it up. Wal-Mart knows, and it sends a signal downstream to the supplier.

We need to understand that this move to real-time understanding of data at massive scale is going to become more and more important as the lever of competitive advantage -- not just in computer businesses, but in all businesses. Data warehousing and analytics aren't just something that you do in the back office and it's a nice-to-have. It's the very essence of competitive advantage moving forward.

When we think about where this is going, we first have to understand that everybody is connected all the time via applications, and this is accelerating, for example, via mobile. The need for real-time analytics against massive data sets is universal.

Look at some of the things that are happening on the phone. Okay, where am I? What data is relevant to me right now, because you know where I am? Speech recognition is starting to come into focus on the phone. Again, it's a massive data problem, integrating not only speech recognition, but also local dialogs. Oh, wait, local again, you start to see some cross connections between data streams that will help you do better.

Even in talking with someone from Nuance about why Google is able to do some interesting things in the particular domain of search and speech recognition, the answer is that they're able to cross-correlate two different data sets -- the speech data set and the search data set. They say, "Okay, yeah, when somebody says that, they are most likely looking for this, because we know that. When they type, they also are most likely looking for that." So this idea of cross-correlation between data sets is starting to come up more and more.
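As a rough illustration of what cross-correlating a speech data set with a search data set could look like, here is a minimal sketch. It is not a description of Google's or Nuance's systems; the query log, the candidate transcriptions, and their acoustic scores are all hypothetical, and the smoothing scheme is one simple choice among many.

```python
from collections import Counter

# Hypothetical search log: queries that people typed.
search_log = [
    "pizza near me",
    "pizza near me",
    "pizza near me",
    "piazza navona",
]
query_counts = Counter(search_log)

# Hypothetical recognizer output: candidate transcriptions of one spoken
# query, each with an acoustic confidence score.
candidates = {
    "pete's a near me": 0.40,
    "pizza near me": 0.35,
    "piazza near me": 0.25,
}


def rerank(candidates, query_counts):
    """Pick the candidate that best balances acoustic score and how often
    the same text shows up in typed searches."""
    total = sum(query_counts.values())

    def combined(item):
        text, acoustic = item
        # Add-one smoothing so queries never seen in the log are not zeroed out.
        prior = (query_counts[text] + 1) / (total + len(candidates))
        return acoustic * prior

    return max(candidates.items(), key=combined)[0]


best = rerank(candidates, query_counts)
print(best)
```

The acoustically strongest candidate loses here because almost nobody types it, which is the "when somebody says that, they are most likely looking for this" effect in miniature.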

This is a real frontier of competitive advantage. You look at the way that new technologies are being explored by startups. So many of the advantages are in data.

A great example is the company where I'm on the board. It's called Wesabe. They're a personal finance application. People upload their bank statements or give Wesabe information to upload their bank statements. Wesabe is able to do customer analytics for these guys, and say, "Oh, you spent so much on groceries." But, more than that, they're able to say, "The average person who shops at Safeway, spends this much. The average person who shops at Lucky spends this much in your area." Again, it's a massive data problem. That’s the heart of their application.
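The kind of comparison described here comes down to a grouped average. The following is a minimal sketch under stated assumptions: the transaction records, user IDs, and area codes are invented for illustration, and nothing here reflects how Wesabe actually stores or processes data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transactions: (user, merchant, area, amount in dollars).
transactions = [
    ("u1", "Safeway", "94110", 82.50),
    ("u2", "Safeway", "94110", 61.00),
    ("u3", "Lucky", "94110", 47.25),
    ("u1", "Lucky", "94110", 52.75),
    ("u4", "Safeway", "10001", 120.00),
]


def average_spend_per_person(transactions, area):
    """Average total spend per shopper at each merchant within one area."""
    # First total each shopper's spending at each merchant.
    totals = defaultdict(float)
    for user, merchant, txn_area, amount in transactions:
        if txn_area == area:
            totals[(merchant, user)] += amount

    # Then average those per-shopper totals for each merchant.
    per_merchant = defaultdict(list)
    for (merchant, _user), total in totals.items():
        per_merchant[merchant].append(total)
    return {merchant: mean(values) for merchant, values in per_merchant.items()}


averages = average_spend_per_person(transactions, "94110")
print(averages)
```

On five rows this is trivial. The "massive data problem" arises when the same grouping has to run across every customer's statements and stay current.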

Now, you think the banks are going to get clued into this and they are going to start to say, "Well, what services can we offer?" Phone companies: "What services can we offer against our data?"

One thing that’s going to happen is the migration of all the BI competencies from the back office to the front office, from being something that you do and generate reports from, to something that you actually generate real-time services from. In order to do that, you've absolutely got to have high performance at massive scale.

Second, a lot of these data sets are not the old-fashioned data sets where it was simply structured data.

Gardner: Let’s go to Scott Yara. Scott, we need this transformation. We need this competitive differentiation and new, innovative business approaches by more real-time analytics across larger sets and more diverse sets of content and inference. What’s the approach on the solution side? What technologies are being brought to bear, and how can we start dealing with this at the time and scale that’s required?

A big shift

Yara: Sure. For Greenplum, one of the more interesting aspects of what’s going on is that big technology concepts and ideas that have really been around for two or three decades are being brought to bear, because of the big shift that Tim alludes to, and we are big believers. We're now entering this new cycle, where companies are going to be defined by their ability to capture and make use of the data and the user contributions that are coming from their customers and community. That is really being able to make parallel computing a reality.

We look at the other major computing trend today, and it’s a very mainstream thing like virtualization. Well, virtualization itself was born on the mainframe well over 30 years ago. So, why is virtualization today, in 2008, so important?

Well, it took this intersection of major trends. You had x86 and, as Tim mentioned, the commoditization of both hardware and software, and x86 and multi-core machines became incredibly cheap. At the same time, you had a high-level business trend, an industry trend. The rising cost of data centers and power became so significant that CIOs had to think about the efficiency of their data centers and their infrastructure and what could lower the cost of computing.

If you look at running applications on a much cheaper and much more efficient set of commodity systems and consolidating applications through virtualization, that would be a really compelling thing, and we've seen a multi-billion dollar industry born of that.

You're seeing the same thing here, because business is now driven by Web 2.0, by the success of Google, and by their own use and actions of the Web realizing how important data is to their own businesses. That’s become a very big driver, because it turns out that parallel computing, combined with commodity hardware, is a very disruptive platform for doing large-scale data analysis.

You can take very, very cheap machines, as Google has shown -- off-the-shelf PCs -- and, with the right software, combine them into hundreds, thousands, and tens of thousands of systems to deliver analytics at a scale that people couldn’t reach before. It’s that confluence and that intersection of market factors that's actually making this whole thing possible.

While parallel computing has been around for 30 years, the timing has become such that it’s now having an opportunity to become really mainstream. Google has become a thought leader in how to do this, and there are a lot of companies creating technologies and models that are emblematic of that.

But, at the end of the day, the focus is on software that is purpose-built to provide parallelism out of the box. This allows companies to sift through huge amounts of data, whether structured or unstructured. All the fault tolerance, all the parallelism, all those things that you need are done in software, so that you can choose off-the-shelf hardware from HP, IBM, Dell, and white-box systems. That’s a model that's as disruptive a shift as client-server and symmetric multiprocessing (SMP) computing were to the mainframe.

Gardner: Jim Kobielus, speak to this point of moving the analytic results, the fruits of this impressive engine and architectural shift from the back office to the front office. This requires quite a shift in tools. We're not going to have those front-office folks writing long SQL queries. They're not going to study up on some of the traditional ways that we interact with data.

What’s in the offing for development, so developers can create applications that target this data now that’s in a format that we can get out and is cross-pollinated in huge data sets that are themselves diverse? What’s in store for app dev, and what’s in store for the people that are looking for a graphical way to get into the business strategist type of user?

Self-service paradigm

Kobielus: One thing we're seeing in the front-end app development is, to take Tim’s point even further, it’s very much becoming more of a Web 2.0 user-centric, self-service development paradigm for analytics.

Look at the ongoing evolution of the online analytical processing (OLAP) market, for example. A lot is going on in terms of user self-service development of data mining and advanced analytic applications within the browser and within the spreadsheet. Users can pull data from various warehouses and marts, and online transaction processing (OLTP) systems, but in a visual, intuitive paradigm.

That can catch a lot of that information in the front-end -- in other words, on the desktop or in the mobile device -- and allows the user to graphically build ever-richer reports and dashboards, and then be able to share that all out to the others in their teams. You can build a growing and collective analytical knowledge base that can be shared. That whole paradigm is coming to the fore.

At Forrester, we published a number of reports on it. Recently, Boris Evelson and I looked at the next generation of OLAP technology. One very important initiative to look at is what Microsoft is doing with Project Gemini. They're still working on that, but they demoed it a couple of months ago at their BI show.

The front office is the actual end user, and power users are the ones who are going to do the bulk of the BI and analytics application development in this new paradigm. This will mean that for the traditional high priesthood of data modelers and developers and data mining specialists, more and more of this development will be offloaded from them, so they can do more sophisticated statistical analysis, and so forth.

The front office will do the bulk of the development. The back office -- in other words, the traditional IT data-modeling professionals -- will be there. They'll be setting the policies and they'll be providing the tooling that the end users and the power users will use to build applications that are personalized to their needs.

So IT then will define the best practices, and they'll provide the tooling. They'll provide general coaching and governance around all of the user-centric development that will go on. That’s what’s going to happen.

It’s not just Microsoft. You can look at the OLAP tooling, more user-centric in-memory spreadsheet-centric approaches that IBM, Cognos, Oracle, and others are rolling out or have already rolled out in their product sets. This is where it’s all going.

Gardner: Tim O’Reilly, in the past, when we've opened up more technological power to more people, we've often encountered much greater innovation, unpredictably so. Should we expect some sort of a wisdom-of-crowd effect to come into play, when we take more of these data sets and analytic tools and make them available?

O'Reilly: There's a distinction between the wisdom of crowds and collective intelligence. The wisdom-of-crowds thesis, as expounded by Surowiecki, is that if you get a whole bunch of people independently, really independently, to weigh in on some subject, their average guess is better than any individual expert's. That’s really about a certain kind of quantitative stuff.

But, there's also a machine-learning approach in which you're not necessarily looking for the average, but you're finding different kinds of meaning in data. I think it’s important to distinguish those two.

Google realized that there was meaning in links that every other search engine of the day was throwing away. This was a way of harnessing collective intelligence, but it wasn’t just the wisdom of crowds. This was actually an insight into the structure of the data and the meaning that was hidden in it.

The breakthroughs are coming from the ability of people to discern meaning in data. That meaning sometimes is very difficult to extract, but the more data you have, the better you can be at it.

A great example of this recently is from the last election. Nate Silver, who ran, was uncannily accurate in calling the results of the election. The reason he was able to do that was that he looked at everybody’s polls, but didn’t just say, "Well, I'm just going to take the average of them." He used all kinds of deep thinking to understand, "Well, what’s the bias in this one? What’s the bias in that one?" And, he was able to develop an algorithm in which he weighted these things differently.
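
To make the contrast with a plain average concrete, here is a minimal sketch of a bias-adjusted, weighted meta-poll. It is not Silver's actual model, which the discussion does not describe; the Poll fields, the bias estimates, and the weights are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float      # reported lead for candidate A, in points
    house_bias: float  # hypothetical estimate of the pollster's lean toward A
    weight: float      # hypothetical reliability weight (sample size, track record)


def simple_average(polls):
    """The naive approach: every poll counts the same."""
    return sum(p.margin for p in polls) / len(polls)


def weighted_meta_poll(polls):
    """Subtract each poll's estimated bias, then take a weighted mean."""
    total_weight = sum(p.weight for p in polls)
    adjusted = sum(p.weight * (p.margin - p.house_bias) for p in polls)
    return adjusted / total_weight


# Made-up numbers, not real polling data.
polls = [
    Poll(margin=6.0, house_bias=2.0, weight=1.0),
    Poll(margin=3.0, house_bias=0.0, weight=3.0),
    Poll(margin=1.0, house_bias=-1.0, weight=2.0),
]

print(round(simple_average(polls), 2))
print(round(weighted_meta_poll(polls), 2))
```

The two numbers differ because the second function trusts some polls more than others and corrects each one before combining them, which is the general idea described above.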

Gardner: I suppose it’s important for us to take the ability to influence the algorithms that target these advanced data sets and put them into the hands of the people that are closer to the real business issues.

More tools are critical

O'Reilly: That’s absolutely true. Getting more tools for handling larger and more complex data sets, and in particular, being able to mix data sets, is critical.

One of the things that Nate did that nobody else did was that he took everybody’s polls and then created a meta-poll.

Another example is really interesting. You guys probably are familiar with the Netflix Challenge, where Netflix has put up a healthy sum of money to whoever can improve their recommendation algorithm by 10 percent. What’s interesting is that people seem to be stuck at about 8 percent, and they haven’t been able to get the last couple of percent.

It occurred to me in a conversation I was having last night that the breakthroughs will come, not by getting a better algorithm against the Netflix data set, but by understanding some other data set that, when mixed with the Netflix data set, will give better predicted results.

Again, that tells us something about the future of data mining and the future of business intelligence. It is larger, more complex, and more diverse data sets in which you are able to extract meaning in new ways.

One other thing. You were talking earlier about the democratization of these tools. One thing I don’t want to pass by is a comment that was made recently by Joe Hellerstein, who is a computer science professor at UC Berkeley. It was one of those real wake-up-and-smell-the-coffee moments. He said that at Berkeley, every freshman student in CS is now being taught Hadoop. SQL is an elective for seniors. You say, "Whoa, that is a fundamental change in our thinking."

That’s why I think what Greenplum is doing is really interesting, trying to marry the old BI world of SQL with the new business intelligence world of these loose, unstructured data sets that are often analyzed with a MapReduce kind of approach. Can we bring the best of these things together?

That fits with this idea of crossing data sets being one of the new competencies that people are going to have to get better at.
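
For readers who know SQL but not the MapReduce style mentioned above, the following is a toy, single-process sketch of the map, shuffle, and reduce steps. It does not use Hadoop's or Greenplum's APIs; the function names and the sample log lines are assumptions made for illustration. The result is what a SQL query of the form SELECT user, COUNT(*) ... GROUP BY user would return over the same data.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into one result."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical loosely structured log lines: "<user> <action> <object>".
log_lines = ["alice viewed home", "bob viewed cart", "alice bought book"]


def mapper(line):
    user = line.split()[0]
    yield user, 1


def reducer(user, ones):
    return sum(ones)


counts = reduce_phase(shuffle(map_phase(log_lines, mapper)), reducer)
print(counts)
```

In a real cluster the map and reduce calls run in parallel on many machines; the sketch only shows the shape of the computation.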

Kobielus: If I can butt in here just one moment, I want to tie into something that Tim just said, that I said a little bit earlier. One important thing is that when you add more data sets to, say, your analytic environment, it gives you the potential to see more cross-correlations among different entities or domains. So, that’s one of the value props for an all-encompassing or more multi-domain enterprise data warehouse.

Before, you had these subject-specific marts -- customer data here, product data there, finance data there -- and you didn’t have any easy way to cross-correlate them. When you bring them all together into a common repository, implementing common dimensions and hierarchies, and conforming with common metadata, it makes it a whole lot easier for the data miners, the power users, and the end users to build the applications that can tie it all together.

There is the "aha" moment. "Aha, I didn’t realize all these hooked up in these various ways." You can extract more meaning by bringing it all together into a unified, enterprise data warehouse.
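
A small sketch of what a common dimension buys you: once a customer mart and a sales mart share the same customer_id key, cross-correlating them is a simple join. The mart contents, field names, and figures below are invented for illustration, and a real warehouse would do this in the database rather than in application code.

```python
from collections import defaultdict

# Hypothetical customer mart, keyed by a conformed customer_id.
customer_mart = {
    101: {"segment": "family"},
    102: {"segment": "student"},
    103: {"segment": "family"},
}

# Hypothetical sales mart that uses the same customer_id dimension.
sales_mart = [
    {"customer_id": 101, "category": "toys", "amount": 40.0},
    {"customer_id": 102, "category": "books", "amount": 15.0},
    {"customer_id": 103, "category": "toys", "amount": 25.0},
    {"customer_id": 102, "category": "toys", "amount": 5.0},
]


def spend_by_segment_and_category(customers, sales):
    """Join the two marts on customer_id and total spend per (segment, category)."""
    totals = defaultdict(float)
    for sale in sales:
        segment = customers[sale["customer_id"]]["segment"]
        totals[(segment, sale["category"])] += sale["amount"]
    return dict(totals)


result = spend_by_segment_and_category(customer_mart, sales_mart)
for key, total in sorted(result.items()):
    print(key, total)
```

Neither mart alone can answer which segments buy which categories; the shared key is what makes the question answerable.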

Gardner: To you, Scott Yara. There's a great emphasis here on bringing together different data sets from disparate sources, with entirely different technologies underlying them. It's not a trivial problem. It’s not a matter of scale necessarily.

What do you see as the potential? What is Greenplum working on to allow folks to mix and match in such a way that the analytics can be innovative and game-changing in a harsh economic environment?

Price/performance improvement

Yara: A couple of things. One, I definitely agree with the assertion that analysis gets easier the more data you have. Whether those are heterogeneous data sets or just the scale of data that people can collect, it's fundamentally easier, cheaper.

In general, these businesses are pretty smart. The executives, analysts, or people that are driving business know that their data is valuable and that insight in improving customer experience through data is key. It’s just really hard and expensive, and that has made it prohibitive for a long, long time.

Now, we're talking about using parallel computing techniques, open-source software, and commodity hardware. It’s literally a 10- to 100-fold improvement in price performance. When the cost of data analysis comes down 10 to 100 times, that’s when new things become possible.

O'Reilly: Absolutely.

Yara: We see lots of customers now from the New York Stock Exchange. These are all businesses that are across vertical industries, but are all affected by the Web and network computing at some level.

Algorithmic trading is driving financial services in a way that we haven’t seen before. They're processing billions of trades every day. Whether it's security, surveillance, or real-time support that they need to provide to very large trading companies, that ability to mine and sift through billions of transactions on a real-time basis is acute.

We were sitting down with one of our large telecom customers yesterday, and there was this convergence that Tim’s talking about. You've got companies with very large mobile carrier businesses. They're broadband service providers, fixed-line service providers, and Internet companies.

Today, the kind of basic personalization that companies like Amazon, eBay, or Google do, telecom carriers are just at the beginning of trying to do that. They have to aggregate the consumer event stream from all these disparate communication systems, and it’s at massive scale.

Greenplum is solely focused on making that happen and mixing the modalities of data, as Tim suggested. Whether it’s unstructured data, whether those are things that exist in legacy databases, or whether you want to mix and match SQL or MapReduce, fundamentally you need to make it easy for businesses to do those things. That’s starting to happen.

Gardner: I suppose part of the new environment that we are in economically is that incremental change is probably not going to cut it. We need to find new forms of revenue and be able to attain them at a very low cost, upfront if possible, and be transformative in how we can take our businesses out through the public networks to reach more customers and give them more value.

Now that we've established that we have these data sets, we can combine them to a certain degree, and that will improve over time. What are the ways in which companies can start actually making money in new ways using these technologies?

Apple’s Genius comes to mind for me as a way of saying, "Okay, you pick a song in your iTunes library, and we're going to use our data and our analytics, and come back with some suggestions on what you might like as a result of that." Again, this is sort of a first go at this, but it opens my eyes to a lot of other types of business development opportunities. Any thoughts on this, Tim O’Reilly?

O'Reilly: In general, as I said earlier, this is the frontier of competitive advantage. Sure, iTunes has Genius, but it's the same thing with Netflix recommendations. Amazon has been doing this for years. It's part of their competitive advantage. I mentioned earlier how this is starting to be a force in areas like banking. Think about phone companies and all of the opportunities for new local services.

Not only that, one of my pet hobbyhorses is that phone companies have this call-history database, but they're not building new services for users against it. Your phone still only remembers the last few people that you called. Why can’t I do a search for somebody I talked to three months ago? "Who the heck was that? Was it a guy from this company?" You should be able to search that. They've got the data.

So, as I said earlier, the frontier is turning the back office into new user-facing services, and having the analytics in place to be able to do that meaningfully at scale in real-time. This applies to supply chains. It applies to any business that has data that gets better through user interaction.

This is the lesson of the Web. We saw it first in Web applications. I gave you the example earlier of Wal-Mart. They realized, "Oh, wait a minute. Every time somebody buys something, it’s a vote." That’s the same point that Wesabe is trying to exploit. A credit card statement is a voting list.

I went to this restaurant once. That doesn’t necessarily mean anything. If I go back every week, that may mean something. I spent on average this much. It’s going up. That means something. I spend on average this much. It’s going down, and that means something. So, it's about finding meaning in the data that I already have: how could this be useful not just to me but to my users and my customers, and what services could I build?

This is the frontier, particularly in the world that we are entering, in which computing is going mobile, because so many of the mobile services are fundamentally going to be driven by BI. You need to be able to say in real-time or close to real-time, "This is the relevant data set for this person based on where they are right now."
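
The restaurant example above can be sketched as a tiny rule: one visit says little, while repeat visits with a rising or falling average spend say something. The threshold, the labels, and the sample amounts are hypothetical, and a production system would use a proper statistical model rather than comparing two halves of a list.

```python
from statistics import mean


def spending_signal(amounts, min_visits=3):
    """Classify one customer's visits to one merchant, oldest amount first."""
    if len(amounts) < min_visits:
        return "too few visits to mean anything"
    half = len(amounts) // 2
    earlier = mean(amounts[:half])   # average spend over the earliest visits
    later = mean(amounts[-half:])    # average spend over the latest visits
    if later > earlier:
        return "repeat customer, spend rising"
    if later < earlier:
        return "repeat customer, spend falling"
    return "repeat customer, spend steady"


print(spending_signal([40.0]))
print(spending_signal([20.0, 22.0, 30.0, 35.0]))
print(spending_signal([30.0, 28.0, 20.0]))
```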

Needed: future view

Kobielus: I want to underline what Tim just said. Traditionally, data warehouses existed to provide you with perfect hindsight on the customer -- historical data, massive historical data, hopefully on the customer, and that 360 degree view of everything about the customer and everything they have ever done in the past, back to the dawn of recorded time.

Now, it’s coming down to managing that customer relationship and evolving and growing with that relationship. You have to have not so much a past or historical view, but a future view on that customer. You need to know that customer and where they are going better than they know themselves.

In other words, that’s where the killer app of the online recommendation engine becomes critical. Then, the data warehouse, as the platform for recommendation engines, can take both the historical data that persists, but also can take the continuing streams of real-time event data on pricing, on customer interaction in various channels -- be it on the Web or over the phone or whatever -- customer transactions that are going on now, and things and events that are going on in the customer social network.

Then, you feed that all into a recommendation engine, which is a predictive-analytics model running inside the data warehouse. That can optimize that customer’s interaction at every touch point. Let’s say they're dealing with a call-center person live. The call-center person knows exactly how the world looks to that customer right now and has a really good sense for what that customer might need now or might need in three months, six months, or a year, in terms of new services or products, because other customers like them are doing similar things.

It can have recommendations being generated and scripted for the call-center agent in real-time saying, "You know what we think. We recommend that you upgrade to the following service plan because it provides you with these features that you will find useful in your lifestyle, blah, blah, blah."

In other words, it's understanding the customer in their future, in their possible future, and suggesting things to the customers that they themselves didn’t realize until you suggested them. That’s the future of analytics, and competitive advantage.
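
As a rough illustration of "other customers like them are doing similar things," here is a minimal overlap-based recommender. It is a toy under stated assumptions, not any vendor's engine: the customer names, the service names, and the scoring rule (weight each candidate service by how many services the two customers already share) are all hypothetical.

```python
from collections import Counter


def recommend(target, holdings, top_n=1):
    """Suggest services the target lacks, scored by overlap with similar customers."""
    mine = holdings[target]
    scores = Counter()
    for other, theirs in holdings.items():
        if other == target:
            continue
        overlap = len(mine & theirs)   # how alike the two customers are
        if overlap == 0:
            continue
        for service in theirs - mine:  # what the similar customer has that I don't
            scores[service] += overlap
    return [service for service, _ in scores.most_common(top_n)]


# Hypothetical telecom customers and the services each one currently has.
holdings = {
    "ann": {"voice", "data"},
    "bob": {"voice", "data", "family_plan"},
    "cat": {"voice", "roaming"},
    "dan": {"data", "family_plan"},
}

print(recommend("ann", holdings))
```

A call-center script could be generated from the top suggestion; feeding in real-time events would mean updating the holdings and rerunning the scoring as new data arrives.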

O'Reilly: I couldn’t agree more.

Gardner: Scott Yara, we've been discussing this with a little bit of a business-to-consumer (B2C) flavor. In the business-to-business (B2B) world many things are equal in a commoditized market, with traditional types of products and services.

An advantage might be that, as a supplier, I'm going to give you analytics that I can derive from data sets that you might not have access to. I might provide analytical results to you as a business partner free of charge, but as an enticement for you to continue to do business with me, when I don’t have any other way to differentiate. What do you see are some of the scenarios possible on the B2B side?

Yara: You don’t have to look much further than what is doing. In a lot of ways, they're pioneering what it means to be an enterprise technology company that sells services, and ultimately data, back to their customers. By creating a common platform, where applications can be built, they are very much thinking about how the data is being aggregated on the platforms in use, not by their individual customers, but in aggregate.

You're going to see lots of cases where for traditional businesses that are selling services and products to other businesses, the aggregation of data is going to be interesting and relevant. At the same time, you have companies where even the internal analysis of their data is something they haven’t been able to do before.

We were talking about Google, which is an amazing company. They have this big vision to organize the world’s information. What the rest of the business world is finding out is that while it’s a great vision and they have a lot of data, they only have a small fraction of the overall data in the world. Telecommunication companies, financial stock exchange, retail companies, have all of this real-world data that's not being indexed or organized by Google. These companies actually have access to amazing amounts of information about the customers and businesses.

They are saying, "Why can’t we, at the point of interaction -- like eBay, Amazon, or some of these recommendation engines -- start to take some of this aggregate information and turn it into improving businesses in the way that the Web companies have done so successfully?" That’s going to be true for B2C businesses, as well as for B2B companies.

We're just at the beginning of that. That’s fundamentally what’s so exciting about Greenplum and where we're headed.

Gardner: Jim Kobielus, who does this make sense for right away? Some companies might be a little skeptical. They're going to have to think about this. But where is the low-hanging fruit, where are the no-brainer applications for this approach to data and analytics?

Kobielus: No-brainers -- I always hate that term. It sounds like I am condescending, but low-hanging fruit should be one of those "aha!" opportunities that everybody realizes intuitively. You don’t have to explain it to them, so in a sense it's a no-brainer. It’s the call center -- the customer-contact center.

The customer-contact center is where you touch the customer, and where you hopefully initiate, cultivate, nurture, maintain, and grow the customer relationship. It's one of the many places where you do that. There are people in your organization who are in that front-line capacity.

It doesn’t have to be just people. It could be automated programs through your Website that need to be empowered continuously with the full customer context -- the history of that customer's interactions, the customer’s current state, current sentiment and feelings, and with a full context on the customer’s likely future evolution. So, really it's the call center.

In fact, I cover data warehousing for Forrester. I talk to the data warehousing vendors and their customers about in-database analytics, where they are selling this capability right now into real-world deployment. The customer call center is, far and away -- with a bullet -- the number one place for inline analytics to drive the customer interaction in a multi-channel fashion.

Gardner: How about you, Tim O’Reilly. Where are some of the hot verticals and early adopters likely to be on this?

O'Reilly: I've already said several times, mobile apps of various kinds are probably highest on the list. But, I'm a big fan of supply chain. There's a lot to be done there, and there's a huge amount of data. There already is a BI infrastructure, but it hasn’t really been tuned to think about it as a customer-facing application. It's really more a back-office or planning tool.

There are enormous opportunities in media, if you want to put it that way. If you think about the amount of money that’s spent on polling and the power of integrating actual data, rather than stated preference, I think it's huge.

How do we actually figure out what people are going to do? There is a great marketing study. I forget who told this story, but it was about a consumer product. They showed examples of different colors. It was a boom box or something like that.

They said, "How many of you think white is the cool color, how many of you think black, how many, blah, blah, blah?" All the people voted, and then they had piles of the boom boxes by the door that the people took as their thank you gift. What they said and what they did were completely at variance.

One of the things that’s possible today is that, increasingly, we are able to see what people actually do, rather than what they say they will do or think they will do.

Gardner: We're just about out of time. Scott Yara, what’s your advice for those folks who are just getting their heads wrapped around this on how to get started? It’s not a trivial activity. It does require a great deal of concerted effort across multiple aspects of IT, perhaps more so than in the past. How do you get started, what should you be doing to get ready?

Yara: That’s one of the real advantages. In sort of an orthogonal way, the ability to create new businesses online in the age of Web 2.0 has been fundamentally cheaper and faster. Doing something disruptive inside of business with their data has to be a fundamentally cheaper and easier thing. So, not starting with the big vision of where they need to go, but starting with something tactical -- whether it lives in the call center or in some departmental application -- is the best way to get going.

There are technologies, services, and people now that you can actually peel off a real project, and you can deliver real value right away.

I agree with Tim. We're going to see a lot of activity in the mobility and telecommunication space. These companies are just realizing this. If you think about the kind of personalization that you get with almost every major Internet site today, what level of personalization do you get from your carrier, relative to how much data they have? You're going to see lots of telecom companies do things with data that will have real value.

One of our customers was saying that in the traditional old data warehousing world, where it was back office, the service level agreement (SLA) was that when a call got placed and logged, it just needed to make its way into the warehouse seven days later. Seven days from the point of origination of a call, it would make itself into a back-office warehouse.

Those are the kinds of things that are going to change, if we are going to really provide mobility, locality, and recommendation services to customers.

It's having a clear idea of the first application that can benefit from data. Call centers are going to be a good area to provide the service representative with a profile of a customer and be able to change the experience. I think we are going to see those things.

So, they're tractable problems. Start small. What held back enterprise data warehousing before was that companies were looking at these huge investments of people, capital, and infrastructure. I think that’s really changing.

Gardner: I am afraid we have to leave it there. We've been discussing new approaches to managing data, processing data, mixing data types and sets, and extracting real-time business results from that. We've looked at tools, and we've looked at some of the verticals and business advantages.

I want to thank our panel. We've been joined today by Tim O’Reilly, the CEO and founder of O’Reilly Media. Thank you Tim.

O'Reilly: Glad to do it.

Gardner: Jim Kobielus, Forrester senior analyst. Thank you Jim.

Kobielus: Dana, always a pleasure.

Gardner: Scott Yara, president and co-founder of Greenplum. Appreciate it, Scott.

Yara: Great. Thanks everybody.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You've been listening to a sponsored BriefingsDirect podcast. Thanks, and come back next time.

Sponsor: Greenplum.

Transcript of BriefingsDirect podcast on new computing challenges and solutions in data processing and data management. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.