Saturday, February 17, 2007

Transcript of BriefingsDirect SOA Insights Edition Vol. 9 Podcast on TIBCO's SOA Tools News, ESBs as Platform, webMethods Fabric 7, and HP's BI Move

Edited transcript of weekly BriefingsDirect[TM] SOA Insights Edition, recorded Jan. 19, 2007.

Listen to the podcast here. If you'd like to learn more about BriefingsDirect B2B informational podcasts, or to become a sponsor of this or other B2B podcasts, contact Dana Gardner at 603-528-2435.

Dana Gardner: Hello, and welcome to the latest BriefingsDirect SOA Insights Edition, Volume 9. This is a weekly discussion and dissection of Services Oriented Architecture (SOA) related news and events with a panel of IT industry analysts. I’m your host and moderator, Dana Gardner, principal analyst at Interarbor Solutions, ZDNet blogger, and Redmond Developer magazine columnist.

This week, our panel of independent IT analysts includes show regular Steve Garone. Steve is an independent analyst, a former program vice president at IDC and the founder of the AlignIT Group. Welcome back, Steve.

Steve Garone: Hi, Dana. It's great to be here again.

Gardner: Also joining us is Joe McKendrick, an independent research consultant and columnist at Database Trends, as well as a blogger at ZDNet and ebizQ. Welcome back to the show, Joe.

Joe McKendrick: Hi, Dana.

Gardner: Next Neil Ward-Dutton, research director at Macehiter Ward-Dutton in the U.K., joins us once again. Hello, Neil.

Neil Ward-Dutton: Hi, Dana, good to be here.

Gardner: Jim Kobielus, principal analyst at Current Analysis, is also making a return visit. Thanks for coming along, Jim.

Jim Kobielus: Hi, everybody.

Gardner: Neil, you had mentioned some interest in discussing tools. We’ve discussed tools a little bit on the show, but not to any great depth. There have been some recent announcements that highlight some of the directions that SOA tools are taking, devoted toward integration, for the most part.

However, some of the tools are also looking more at the development stage of how to create services and then join up services, perhaps in some sort of event processing. Why don’t you tell us a little bit about some of the recent announcements that captured your attention vis-a-vis SOA tools?

Ward-Dutton: Thanks, Dana. This was really sparked by a discussion I had back in December -- and I think some of the other guys here had similar discussions -- with TIBCO Software around the announcement that they were doing for this thing called ActiveMatrix. The reason I thought it was worth discussing was that I was really kind of taken by surprise. It took me a while to really get my head around it, because what TIBCO is doing with ActiveMatrix is shifting beyond its traditional integration focus and providing a real container for the development and deployment of services, which is subtly different and not what TIBCO has historically done.

It’s much more of a development infrastructure focus than an integration infrastructure focus. That took me by surprise and it took me a while to understand what was happening, because I was so used to expecting TIBCO to talk about integration. What I started thinking about was, "What is the value of something like ActiveMatrix?" Because at first glance, ActiveMatrix appears to be something with JBI, a Java Business Integration implementation, basically a kind of standards-based plug-and-play ESB on steroids. It's probably a crass way of putting it, but you kind of get the idea.

Let’s look at it from the point of view of a development theme. What is required to help those guys get into building high-quality networks of services? There are loads of tools around to help you take existing Java code, or whatever, right-click on it, and create SOAP and WSDL bindings, and so on. But, there are other issues of quality, consistency of interface definitions, and use of schemas -- more leading-edge thinking around using policies, for example. This would involve using policies at design time, and then having those enforced in the runtime infrastructure to do things like manage security automatically and help to manage performance, availability, and so on.

It seems to me that this is the angle they’re coming from, and I haven’t seen very much of that from a lot of the other players in the area. The people who are making most of the noise around SOA are still approaching it from the point of view: "You’ve got all this stuff already, all these assets, and what you’re really doing is service-enabling and then orchestrating those services." So, I just want to throw that out there. It would be really interesting to hear what everyone else thinks. Is what TIBCO is doing useful? Are they out ahead or are there lots of other people doing similar things?

Gardner: TIBCO’s heritage has been in middleware messaging, which then led them into integration, Enterprise Application Integration (EAI), and now they’ve moved more toward a service bus-SOA capability. Just to clarify, this tooling, is it taking advantage of the service bus as a place to instantiate services, production, and management? And is it the service bus that’s key to the fact that they’re now approaching tooling?

Ward-Dutton: That’s how I understand it, except it extends the service bus in two ways. One is into the tooling, if you think about what Microsoft is doing with Windows Communication Foundation. From a developer perspective, they’re abstracting a lot of the glop they need to tie code into an ESB, and TIBCO is trying to do something similar to that.

It’s much more declarative. It’s all about annotations and policies you attach to things, rather than code you have to write. On the other side, what was really surprising to me was that, if I understand it right, [TIBCO] are unlike a lot of the other ESB players. They are trying to natively support .NET, so they actually have a .NET container that you can write .NET components in and hook them into the service bus natively. I haven’t really seen that from anywhere else, apart from Microsoft. Of course, they’re .NET only. I think there are two ways in which they’re moving beyond the basic ESB proposition.
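As a rough illustration of the declarative style described here, the following Python sketch attaches policies to a service as metadata and leaves enforcement to a toy runtime container. It is only a sketch under stated assumptions: every name in it (`policy`, `Container`, `require_role`, `max_payload`, `create_invoice`) is hypothetical and does not represent TIBCO's or Microsoft's actual APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Set


def policy(**rules: Any) -> Callable:
    """Attach declarative policy metadata to a service function (hypothetical helper)."""
    def decorate(func: Callable) -> Callable:
        # Only metadata is recorded here; the service contains no enforcement code.
        func.policies = dict(rules)
        return func
    return decorate


class PolicyViolation(Exception):
    """Raised by the container when a call breaks a declared policy."""


@dataclass
class Container:
    """Toy runtime that hosts services and enforces their declared policies."""
    services: Dict[str, Callable] = field(default_factory=dict)

    def deploy(self, name: str, func: Callable) -> None:
        self.services[name] = func

    def invoke(self, name: str, caller_roles: Set[str], payload: str) -> str:
        func = self.services[name]
        rules = getattr(func, "policies", {})
        # Security policy: the caller must hold the declared role.
        required = rules.get("require_role")
        if required is not None and required not in caller_roles:
            raise PolicyViolation(f"{name}: caller lacks role {required!r}")
        # Size policy: reject payloads longer than the declared limit.
        limit = rules.get("max_payload")
        if limit is not None and len(payload) > limit:
            raise PolicyViolation(f"{name}: payload exceeds {limit} characters")
        return func(payload)


@policy(require_role="billing", max_payload=64)
def create_invoice(payload: str) -> str:
    # Business logic only; security and size limits are declared above.
    return f"invoice created for {payload}"
```

Deploying `create_invoice` into a `Container` and calling `invoke` with a caller who lacks the `billing` role raises `PolicyViolation`, without the service itself containing any security code.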

Gardner: So, the question is about ESB as a platform. Is it an integration platform that now has evolved into a development platform for services, a comprehensive place to manage and produce and, in a sense, direct complex service integration capabilities? Steve Garone, is the definition of ESB, now, much larger than it was?

Garone: I think it is. I agree with Neil. When I looked at this announcement, the first thing that popped into my mind was, "This is JBI." When Sun Microsystems talked about JBI back in 2005, this is what they were envisioning, or was part of what they were envisioning. Basically, as a platform, that raises the level of abstraction to where current ESB thinking was already. At the time, that was confusing users -- and still is -- because they didn’t quite understand how, why, or when they should use an ESB.

In my opinion, this raises that level of abstraction to eliminate a lot of the work developers have to do in terms of coding to a specific ESB or to a specific integration standard, and lets them focus on developing the code they need to make their applications work. But, I would pull back a little bit from the notion that this is purely, or at a very high percentage, a developer play. To me, this is a logical extension of what companies like TIBCO have done in the past in terms of integration and messaging. However, it does have advantages for developers who need to develop applications that use those capabilities by abstracting out some of the work that they need to do for that integration.

Gardner: How about you, Joe? Do you see this as a natural evolution of ESB? It makes sense for architects and developers and even business analysts to start devoting logic of process to the ESB and let the plumbing take care of itself, vis-à-vis standards and module connectors.

McKendrick: In terms of ESBs, there’s actually quite a raging debate out there about the definition of an ESB, first of all, and what the purpose of an ESB should be. For example, I quote Anne Thomas Manes . . .

Gardner: From Burton Group, right?

McKendrick: Right. She doesn’t see ESB as a solution that a company should ultimately depend on or focus on as mediation. She does seem to lean toward the notion of an ESB on the development side as a platform-versus-mediation system. I've also been watching the work of Todd Biske, who is over at MomentumSI. Todd also questions whether ESBs can take on such multiple roles in the enterprise as an application platform versus a mediation platform. He questions whether you can divide it up that way and sell it to two very distinct markets and groups of professionals within the enterprise.

Gardner: How about you, Jim Kobielus? Do you see the role of ESB getting too watered down? Or, do you see this notion of directing logic to the ESB as a way of managing complexity amid many other parts and services, regardless of their origins, as the proper new direction and definition of ESB?

Kobielus: First of all, this term came into use a few years back, popularized by Gartner and, of course, by Progress Software as a grand unification acronym for a lot of legacy and new and emerging integration approaches. I step back and look at ESB as simply referring to a level backplane that virtualizes the various platform dependencies. It provides an extremely flexible integration fabric that can support any number of integration messaging patterns, and so forth.

That said, looking at what TIBCO has actually done with ActiveMatrix Service Grid, it's very much to the virtualization side of what an ESB is all about, in the sense that you can take any integration logic that you want, develop it to any language, for any container, and then run it in this virtualized service grid.

One of the great things about the ActiveMatrix service grid is that TIBCO is saying you don’t necessarily have to write it in a particular language like Java or C++, but rather you can compose it to the JBI and Service Component Architecture (SCA) specifications. Then, through the magic of ActiveMatrix service grid, it can get compiled down to the various implementation languages. It can then get automatically deployed out to be executed in a very flexible end-to-end ESB fabric provided by TIBCO. That’s an exciting vision. I haven’t seen it demonstrated, but from what they’ve explained, it’s something that sounds like it’s exactly what enterprises are looking for.

It’s a virtualized development environment. It’s a virtualized integration environment. And, really, it’s a virtualized policy management environment for end-to-end ESB lifecycle governance. So, yeah, it is very much an approach for overcoming and taming the sheer complexity of an SOA in this level backplane. It sounds like it’s the way to go. Essentially, it sounds very similar to what Sonic Software has been doing for some time. But TIBCO is notable, because they’re playing according to open standards that they have helped to catalyze -- especially the SCA specifications.

Gardner: Now, TIBCO isn’t alone in some releases since the first of the year. We recently had webMethods with its Fabric 7.0. Has anyone on the call taken a briefing with webMethods and can you explain what this is and how it relates to this trend on ESB?

Kobielus: I've taken the briefing on Fabric 7.0, and it’s really like TIBCO with ActiveMatrix in many ways. It's a strong development story there and it’s a strong virtualization story. In the case of webMethods Fabric 7.0, you can develop complex-end-to-end integration process logic in a high-level abstraction. In their case, they’re implementing the Business Process Modeling Notation (BPMN) specification. Then, you can, within their tooling, take that BPMN definition, compile it down to implementation languages like BPEL that can then get executed by the process containers or process logic containers within the Fabric 7.0 environment.

It’s a very virtualized ESB/SOA development environment with a strong BPMN angle to it and a very strong metadata infrastructure. WebMethods recently acquired Infravio, and so webMethods is very deep now both on the UDDI registry side and providing the plumbing for a federated metadata infrastructure that’s necessary for truly platform agnostic ESB and SOA applications.

Gardner: And, I believe BEA has come out through its Liquid campaign with the components that amount to a lot of this as well. I'm not sure there are standards in interoperability, based on TIBCO's announcement, but clearly I think they have the same vision. In the past several weeks, we’ve discussed how complexity has been thrown at complexity in SOA, and that’s been one of the complaints, one of the negative aspects.

It seems to me that this move might actually help reduce some of that by, as you point out, virtualizing to the level where an analyst, an architect, a business process-focused individual or team can focus in on this level of process to an ESB, not down to application servers or Java and C++, and take advantage of this abstraction.

Before we move on to our next topic, I want to go back to the panel. Steve Garone, do you see this as a possible way of reducing the complexity being thrown at complexity issue?

Garone: Yes, I do. A lot of it's going to depend on how well this particular offering -- if you're talking about TIBCO or webMethods, but I think we were sort of focusing mostly on TIBCO this morning.

Gardner: I think I’d like to extend to the larger trend. Elements of what IBM is doing relate to this. Many of the players are trying to work toward this notion of abstracting up, perhaps using ESB as a platform to do so. Let's leave it on a more general level.

Garone: That’s fine, a good point. You’re right. IBM is doing some work in this area, and logically so. Even though they have a lot of integration products, I consider them a platform vendor, which means their viewpoint is a little more about the software stack than a specific integration paradigm.

I think the hurdle that we’ll need to get over here in terms of users taking a serious look at this is the confusion over what an ESB actually is and what it should be used for by customers. The vendors who talk to their customers about this are going to have to get over a perception hurdle that this is somewhat different. It makes things a lot easier and resolves a lot of those confusion points around ESBs. Therefore, it's something they should look at seriously, but in terms of the functionality and the technology behind it, it's the logical way to go.

Gardner: Joe McKendrick, how about you in this notion of simplicity being thrown at complexity? Are we going to retain that? Is this the right direction?

McKendrick: Ah, ha. Well, I actually have fairly close ties with SHARE, the mainframe user group, and put out a weekly newsletter for them. The interesting point about SOA in general is that TIBCO, webMethods and everybody are moving to SOA. They have no choice. They have to begin to subscribe to the standards they agree upon. What else would they do?

When we talk about what was traditionally known as the Enterprise Application Integration (EAI) market, it’s been associated with large-scale, expensive integration projects. What I have seen in the mainframe market is that there is interest in SOA, and there is a lot of experimentation and pilot projects. There are some very clear benefits, but there is also a line of thinking that says, "The application we have running on the mainframe, our CICS application transaction system, works fine. Why do we need to SOA-enable this platform? Why do we need to throw in another layer, an abstraction of service layer, over something that works fine, as-is?"

It may seem archaic or legacy. You may even have green-screen terminals, but it runs. It’s got mainframe power behind it. It’s usually a two-tier type of application. The question organizations have to ask themselves is, Do we really need to add another layer to an operation that runs fine as-is?

Gardner: If they only have isolated operations, and they don’t need to move beyond them, I suppose it's pretty clear for them from cost-benefit analysis to stay with what works. However, it seems that more companies, particularly as they merge and become engaged in partnerships, or as they ally with other organizations and go global, want to bring in more of their assets into a business process-focused benefit. So, that's the larger evolution of where we’re going. It's not islands of individual applications churning away, doing their thing, but associating those islands for a higher productivity benefit.

Kobielus: The notion of what organizations have to examine is right on the money, but I think that’s more of a fundamental issue around SOA in general. I think the question you asked was how does something like this affect the ease with which one can do that, and will it figure into the cost-benefit analysis that an organization does to see if in fact that's the right way to go.

Gardner: Neil, this was your topic. How do you see it? Does this larger notion strike you as moving in a direction of starting to solve this issue of complexity being thrown at complexity? That is to say, there’s not enough clear advantage and reduced risk as an organization for me to embrace SOA. Do you think what you’re seeing now from such organizations as TIBCO and webMethods is ameliorating that concern?

Ward-Dutton: Yes and no. And I think most of my answers on these podcasts end up like that, which is quite a shame. The "no" part of my answer is really the cynical part, which is that, at the end of the day, too much simplicity is bad for business. It’s not really in any vendor’s interest to make things too easy. If you make things too easy, no one’s going to buy any more stuff. And the easiest thing to do, of course, for the company is to say, "You know what? Let’s just put everything on one platform. We’ll throw out everything we’ve got, and rebuild everything from the ground up, using one operating system, one hardware manufacturer, one hardware architecture, and so on."

If the skills problem would go away overnight, that would be fantastic. Of course, it’s not about technology. It’s all of our responsibility to keep reminding everyone that, while this stuff can, in theory, make things simpler, you can’t just consider an end-state. You've got to consider the journey as well, and the complexity and the risk associated with the journey. That’s why so many organizations have difficulties, and that's why the whole world isn't painted Microsoft, IBM, Oracle, or webMethods. We’re in a messy, messy world because the journey is itself a risky thing to do.

So, I think that what's happening with IBM around SCA, and what TIBCO is doing around ActiveMatrix, and what webMethods is doing, have the capability for people with the right skills and the right organizational attributes. They have the ability to create this domain, where change can be made pretty rapidly and in a pretty manageable way. That's much more than just being about technology. It’s actually an organizational, cultural process, an IT process, in terms of how we go about doing things. It's those issues, as well as a matter of buying something from TIBCO. Everything’s bound up together.

Gardner: To pick up on your slightly cynical outlook on vendors who don’t want to make it too simple, they do seem to want to make things simpler from the tooling perspective, as long as that requires the need for their run time, their servers, their infrastructure, and so on.

TIBCO has also recently announced BusinessWorks 5.4, which is a bit more of a complex, turnkey-platform approach, one that a very simplified approach to tools might then encourage an organization to move into. I guess I see your point, but I do think that the tooling and the simplification is a necessary step for people and process to be the focus and the priority, and that the technology needs to help to bring that about.

Ward-Dutton: You’re absolutely right, Dana, but I think part of the point you made when you were asking your question a few minutes ago was around whether we see less technical communities getting more heavily involved in development work. This is kind of the mythical user-programming thing I remember from Oracle 4GL and Ingres 4GL. That was going to be user-programming, and, of course, that didn’t happen either. I do see the potential for a domain where it’s easier to change things and it’s more manageable, but I don’t see that suddenly enabling this big shift to business analysts doing the work -- just like we didn’t do with UML or 4GLs.

Gardner: We’re not yet at the silver-bullet level here.

Kobielus: Neil nailed it on the head here. Everybody thinks of simplicity in terms of, "Well, rather than write low-level code, people will draw high-level pictures of the actual business process, not the technical plumbing." And, voila! The infrastructure will make it happen, and it will be beautiful and the business analysts will drive it.

Neil alluded to the fact that these high-level business processes, though they can be drawn and developed in BPMN, or using flow charting and all kinds of visual tools, are still ferociously complex. Business process logic is quite complex in its own right, and it doesn’t simply get written by the business analyst. Rather, it gets written by teams of business and IT analysts, working hand in hand, in an iterative, painful process to iron out the kinks and then to govern or control changes, over time, to various iterations of these business processes.

This isn’t getting any simpler. In fact, the whole SOA governance -- the development side of the governance process -- is just an ongoing committee exercise of the IT geeks and the business analyst geeks getting together regularly and fighting it out, defining and redefining these complex flow charts.

Gardner: One of the points here is around how the plumbing relates to the process, and so it’s time and experience that ultimately will determine how well this process is defined. As you say, it’s iterative. It’s incremental. No one’s just going to sit there, write up the requirements, and it’s going to happen. But it’s the ability to take iterations and experience in real time and get the technology to keep up with you as you make those improvements that's part of the “promise” of SOA.

McKendrick: The collaboration is messy. You’re dealing with a situation where you’ve got collaboration among primarily two major groups of people who have not really worked a lot together in the past and don’t work that well together now.

Gardner: Well, that probably could be said about most activities from the last 150,000 years. All right, moving on to our next topic: IBM came out with its financials this week, we’re talking about the week of January 15, 2007, and once again, they had a strong showing on their software growth. They had 14 percent growth in software revenues, compared to the year-ago period. This would be for the fourth quarter of 2006, and that's compared to the total income growth for the company of 11 percent -- services growing 6 percent, and hardware growing only 3 percent.

So, suddenly, software, which does include a lot at IBM, but certainly a large contribution from WebSphere and middleware and mainframes. Mainframes are still growing, but not great -- 5 percent. Wow. The poster child at IBM is software. Who'd have thunk it? Anybody have a reaction to that?

Ward-Dutton: Of course, one of the things that's been driving IBM software growth has been acquisitions. I know I’m a bit behind the curve on this one, but the FileNet acquisition was due to close in the fourth quarter. If that did happen, then that probably had quite a big impact. I don’t know. Does anyone else know?

Gardner: I guess we’d have to do a bit more fine-tuning to see what contribution the new acquisition’s made on a revenue basis, but the total income growth being a certain percentage and then the software, as a portion of that, I suppose, is the trend. Even so, if they’re buying their way into growth, software is becoming the differentiator in the growth opportunity for IT companies, not hardware, not necessarily even professional services.

That does point out that where companies are investing, where enterprises are investing, and where they're willing to pay for a high margin and not fall into a commoditization pattern, which we might see in hardware, is in software.

Kobielus: Keep in mind, though, in the fourth quarter of 2006, IBM had some major product enhancements. Those happened both in the third and the fourth quarter in the software space, and those were driving much of this revenue growth. In July, they released DB2 Version 9, formerly code-named Viper, and clearly they were making a lot of sales of new licenses for DB2 V9. Then, in the beginning of the fourth quarter, they released their new Data Integration Suite. That's not so new, but rather enhancements to a variety of point integration tools that they’ve had for a long time, including a lot of these software products that they'd acquired with Ascential.

Gardner: That’s the ETL stuff, right?

Kobielus: Not only that, it's everything, Dana. It’s the ETL, the EII, the metadata, the data quality tools, and the data governance tools. It’s a lot of different things. Of course, they also acquired FileNet during that time. But also in the late third quarter IBM released at least a dozen linked solo-product upgrades. In the late third quarter, they were clearly behind much of the revenue growth, and in the fourth quarter for the software group. In other words, the third and fourth quarters of this past year had announcements that IBM had primed the pump for in terms of the customers’ expectations. And, clearly, there were a lot of pent-up orders in hand from customers who were screaming for those products.

Gardner: So you're saying that this might be a cyclical effect, that we shouldn't interpret the third and fourth quarter software growth as a long-term trend but perhaps as beneficial but nonetheless a "bump in the road" for IBM.

Kobielus: Oh, yeah. Just like Microsoft is finally having a bump, now that it’s got Vista and all those other new products coming downstream. These few quarters are going to be a major bump for Microsoft, just like the last two were a major bump for IBM.

Gardner: Let’s take that emphasis that you have pointed out, and I think is correct, on the issue of data -- the lifecycle of data, and how to free it and expose it to wider uses and productivity in enterprise. IBM has invested quite a bit in that. We just also heard an announcement this week from Hewlett-Packard that they are going to be moving more aggressively into business intelligence (BI) and data warehouse activities, not necessarily trying to sell databases to people, but to show them how to extract, and associate, and make more relevant data that they already have -- a metadata-focused set of announcements. Anyone have reaction to that?

Garone: I don’t know too much about this announcement, but from what I’ve read it seems as if this is largely a services play. HP sees this as a professional services opportunity to work with customers to build these kinds of solutions, and there's certainly demand for it across the board. I’m not so sure this is as much products as it is services.

Kobielus: HP, in the fourth quarter of 2006, acquired a services company in the data warehousing and BI arena called Knightsbridge, and Knightsbridge has been driving HP's foray into the data warehousing market. But, also HP sees that it’s a major hardware vendor, just as Teradata and IBM are, and wants to get into that space. If you look at the growth in data warehousing and BI, these are practically the Number 1 software niches right now.

For HP it’s not so much a software play. They are partnering with a lot of software vendors to provide the various piece parts, such as overall Master Data Management (MDM), data warehousing, and business intelligence product sets. But, very clearly, HP sees this as a services play first and foremost. If you look at IBM, 50 percent of their revenues are now from the global services group, and a lot of the projects they are working on are data warehousing, and master data management, and data integration. HP covets all that.

They want to get into that space, and there’s definitely a lot of room for major powerhouse players like them to get into it. Also, very interestingly, NCR has announced in the past week or so that it’s going to spin off Teradata, which has been operating more or less on an arms-length basis for some time. Teradata has been, without a doubt, the fastest growing product group within NCR for a long time. They're probably Number 1 or a close Number 2 in the data warehousing arena. This whole data warehousing space is so lucrative, and clearly HP has been coveting it for a while. They’ve got a very good competency center in the form of Knightsbridge.

They have got a good platform, this Neoview product that they are just beginning to discuss with the analyst community. I’m trying to get some time on their schedule, because they really haven't made a formal announcement of Neoview. It’s something that’s been trickling out. I’ve taken various informal briefings for the last six months, and they let me in on a few things that they are doing in that regard, but HP has not really formally declared what its product road map is for data warehousing. I expect that will be imminent, because, among other things, there is a trade show in February in Las Vegas, the Data Warehousing Institute, and I’m assuming that they -- just like Teradata and the others -- will have major announcements to share with all of us at that time.

Gardner: Well, thanks for that overview. Anyone else have anything to offer on the role of data warehousing?

McKendrick: Something I always found kind of fascinating is that the purpose and challenges of data warehousing are very much parallel to those of SOA. The goal of data warehousing is to abstract data from various sources or silos across the enterprise and bring it all into one place. And the goal of SOA is to take these siloed applications, abstract them and make them available across the enterprise to users in a single place. The ROI formula interestingly is the same as well.

When you start a data warehouse, you’re pumping in a lot of money. Data warehouses aren't cheap. You need to take a single data source, apply the data warehouse to that, and as that begins to generate some success, you can then expand the warehouse to a second data source, and so forth. It’s very much the same as SOA.
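The incremental approach described here, starting with one data source and adding a second once the first proves out, can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's method; the source names, column names, and the `load_source` helper are all hypothetical.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def load_source(
    warehouse: List[Row],
    source: str,
    rows: Iterable[Row],
    mapping: Dict[str, str],
) -> int:
    """Rename one silo's columns to the shared schema and append the rows."""
    loaded = 0
    for row in rows:
        # Translate silo-specific column names into the warehouse's common names.
        unified = {target: row[column] for column, target in mapping.items()}
        unified["source"] = source  # keep track of where each record came from
        warehouse.append(unified)
        loaded += 1
    return loaded


warehouse: List[Row] = []

# Phase 1: a single source (a hypothetical CRM silo) feeds the warehouse.
crm_rows = [{"cust_name": "Acme", "cust_city": "Boston"}]
load_source(warehouse, "crm", crm_rows, {"cust_name": "name", "cust_city": "city"})

# Phase 2: a second silo with different column names is mapped into the same schema.
billing_rows = [{"account": "Globex", "town": "Nashua"}]
load_source(warehouse, "billing", billing_rows, {"account": "name", "town": "city"})
```

The only per-source work is the column mapping, which is the sense in which the warehouse, like a service layer in SOA, abstracts the silos behind one shared view.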

Kobielus: I agree wholeheartedly with that. Data warehouses are a platform for what’s called master data management. That's the term in the data-management arena that refers to a governance infrastructure to maintain control over the master reference data that you run your business on -- be it your customer data, your finance data, your product data, your supply chain data and so forth.

If you look at master data management, it’s very much SOA but in the data management arena. In other words, SOA is a paradigm about sharing and re-using critical corporate resources and governing all that. Well, what's the most critical corporate resource -- just about the most critical that everybody has? It's that gospel, that master reference data, that single version of the truth.

MDM needs data warehousing, and data warehousing very much depends on extremely scalable and reliable and robust platforms. That’s why you have these hardware vendors like HP, IBM, Teradata, and so forth, that are either major players already in data warehousing or realizing that they can take their scalable, parallel processing platforms, position them into this data warehousing and MDM market, and make great forays.

I don’t think HP, though, will become a major software player in its own right. It’s going to rely on third-party partners to provide much of the data integration fabric, much of the BI fabric, and much of the governance tooling that is needed for full blown MDM and data warehousing.

Gardner: Great. I'd like to thank our panel for another BriefingsDirect SOA Insights Edition, Volume 9. Steve Garone, Joe McKendrick, Neil Ward-Dutton, Jim Kobielus and myself, your moderator and host Dana Gardner. Thanks for joining, and come back next week.

If any of our listeners are interested in learning more about BriefingsDirect B2B informational podcasts or to become a sponsor of this or other B2B podcasts, please feel free to contact me, Dana Gardner at 603-528-2435.

Transcript of Dana Gardner’s BriefingsDirect SOA Insights Edition, Vol. 9. Copyright Interarbor Solutions, LLC, 2005-2007. All rights reserved.

Tuesday, February 06, 2007

Transcript of BriefingsDirect Podcast on Music Search Technology and Implications

Edited transcript of BriefingsDirect[TM] B2B informational podcast on music search with Sun Labs, recorded Jan. 10, 2007.

Listen to the podcast here.

If you'd like to learn more about BriefingsDirect B2B informational podcasts, or to become a sponsor of this or other B2B podcasts, contact Dana Gardner at 603-528-2435.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, a discussion with Paul Lamere, a staff engineer at Sun Microsystems Labs and principal investigator for Search Inside the Music.

This interesting research project is taking search to quite a new level. Instead of using lists of data about the music author, album, or composer, Search Inside the Music actually digs into the characteristics of the music, finding patterns, and then associating that with other music.

The possibilities here are pretty impressive. I want to thank Paul Lamere for joining us. Welcome to the show, Paul.

Paul Lamere: Hi, Dana. Thanks for having me.

Gardner: I had the opportunity to see your presentation face-to-face when I visited the Sun Labs’ campus in Burlington, Mass., back in December, 2006, and it really had me thinking for a few days after. I kept coming back to it and thinking, “Wow! What about this and what about that?” I thought it would be good to bring this to a wider audience and to use the medium of audio, seeing as it is so germane to this project.

Let’s get a little historical context. Tell us how you got involved with this project, why music, and how does that relate to IT in general?

Lamere: Okay, as you know, I work in Sun’s research lab, and our role as researchers is to be the eyes and ears for Sun. We’re trying to look out on the distant horizon for Sun to keep an eye on the interesting things that are happening in the world of computers. Clearly music has been something that's been greatly changed by computers since Napster and iTunes.

So I started to look at what's going on in the music space, especially around music discovery. The thing that I thought was really interesting is looking at Chris Anderson’s book and website, The Long Tail. You know, music is sort of the sweet spot for the long tail; the audio is a nice conveniently sized packet, and people want them. What we’re seeing right now is the major labels putting out 1,000 songs a week. If you start to include some of the independent labels and the music that's happening out on MySpace and that sort of thing, you may start to see more like 10,000 songs a week.

Chris Anderson said, “The key to The Long Tail is just to make everything available.” Now imagine not just every indie artist, but every kid with a drum machine in their basement and a PC putting on their tracks, and every recording of every performance of "Louie Louie" on the Web, and that same thing happening all over the world and sticking that on the Web. Now we may start having millions of songs arriving on the Web every week.

Gardner: Not to mention the past 800 or 1,000 years’ worth of music that has been recorded in the last 50 or 100 years.

Lamere: That’s right. So, we have many orders of magnitude more music to sift through, and 99.9 percent of that music is something you would never ever want to listen to. However, there is probably some music in there that would be among our favorite songs if we could just find it.

I’m really interested in trying to figure out how to get through this slush pile to find the gems that are in there. Folks like Amazon have traditionally used collaborative filtering to work through content. I’m sure you’re familiar with Amazon’s “customers who bought this book also bought this book,” and that works well if you have lots of people who are interested in the content. You can take advantage of the Wisdom of Crowds. However, when you have…

Gardner: They are working on the "short head," but not the long tail.

Lamere: That’s right. When you have millions of songs out there, some that people just haven’t listened to, you have no basis for recommending music. So, you end up with this feedback where, because nobody’s listening to the music, it’s never going to be recommended -- and because it’s never recommended, people won’t be listening to the music. And so there is no real entry-point for these new bands. You end up once again with the short head, where you have U2 and The Beatles who are very, very popular and are recommended all the time because everyone is listening to them. But there is no entry point for that garage band.
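The feedback loop described here can be sketched in a few lines. This is a minimal illustration, assuming a simple "listeners of X also listened to Y" co-occurrence count; the listening histories, the artist name "Garage Band X", and the function names are all hypothetical, and real collaborative filtering systems are considerably more elaborate.

```python
from collections import Counter

# Hypothetical listening histories: each set is one user's played artists.
HISTORIES = [
    {"U2", "The Beatles"},
    {"U2", "The Beatles", "The Rolling Stones"},
    {"The Beatles", "The Rolling Stones"},
]

# Hypothetical catalog; "Garage Band X" is an invented unknown act.
CATALOG = {"U2", "The Beatles", "The Rolling Stones", "Garage Band X"}


def also_listened(seed, histories):
    """Count how often other artists appear alongside `seed`."""
    counts = Counter()
    for history in histories:
        if seed in history:
            counts.update(history - {seed})
    return counts


def never_recommended(catalog, histories):
    """Artists that co-occurrence counts alone can never surface."""
    recommendable = set()
    for artist in catalog:
        recommendable.update(also_listened(artist, histories))
    return catalog - recommendable


print(also_listened("The Beatles", HISTORIES).most_common())
print(never_recommended(CATALOG, HISTORIES))
```

Because the unknown act appears in nobody's history, no seed artist can ever lead to it, which is the "no entry point" situation in miniature.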

Gardner: Yes. When I use Amazon or Netflix and they try to match me up, they tell me things I already know; they haven't told me things that I don’t know.

Lamere: That’s right. Did you really need to know that if you liked The Beatles, you might like The Rolling Stones? So, we’re taking a look at some alternative ways to help weed through this huge amount of music. One of the things that we’re looking at is the idea of doing content-based recommendation. Instead of relying on just the Wisdom of Crowds -- actually rely on the audio content.

We use some techniques very similar to what a speech recognizer does. It will take the audio and will run signal processing algorithms over it and try to extract out some key features that describe the music. We then use some machine-learning techniques basically to teach this system how to recognize music that is both similar and dissimilar. So at the end, we have a music similarity model and this is the neat part. We can then use this music similarity model to recommend music that sounds similar to music that you already like.
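A rough sketch of the recommendation step, assuming each track has already been reduced to a short feature vector. The feature values, track names, and function names below are illustrative assumptions, and plain cosine similarity stands in for the learned similarity model described above; it is not the project's actual method.

```python
import math

# Hypothetical per-track features (say tempo, brightness, energy), standing in
# for whatever the signal-processing stage would actually extract.
FEATURES = {
    "punk_track": [0.9, 0.8, 0.95],
    "indie_rock_track": [0.8, 0.7, 0.85],
    "string_quartet": [0.2, 0.3, 0.10],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(seed, features):
    """Return the name of the track whose features are closest to the seed's."""
    candidates = [name for name in features if name != seed]
    return max(
        candidates,
        key=lambda name: cosine_similarity(features[seed], features[name]),
    )


print(most_similar("punk_track", FEATURES))
```

Note that nothing here depends on metadata or listener counts, so a track nobody has heard yet can still be recommended on the strength of how it sounds.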

Gardner: Yes, this is fascinating to me because you can scan or analyze music digitally and come out and say, this is blues, this is delta blues; this is jazz, this is New Orleans jazz. I mean, it’s amazing how discrete you can get on the type of music.

Lamere: Yes, that’s right, and the key is that you can do this with any kind of music without having any metadata at all. So, you can be given raw audio and you can either classify it into rather rich sets of genres or just say, "Hey, this sounds similar to that Green Day song that you’ve been listening to, so you might like to listen to this, too."

Gardner: Fascinating. So once we’re able to get characteristics and label and categorize music, we can scan all sorts of music and help people find what is similar to what they want. Perhaps they’re experimenting and might listen to something and think, “I wonder if I am interested in that,” and do all kinds of neat things. So, explain the next step.

Lamere: Well, there are lots of ways to go. One of the things that we can do with this similarity model is to provide different ways of exploring their music collections. If you look through current music interfaces, they look like spreadsheets. You have lists of albums, tracks, and artists and you can scroll through them much like you would through Lotus 1-2-3 or whatever spreadsheet you are using.

It should be fun; it should be interesting. When people look for music, they want to be engaged in the music. Our similarity model gives people new and different ways of interacting with their music collections.

We can now take our music collection and essentially toss it into a three-dimensional space based on music similarity, and give the listener a visualization of the space and actually let them fly through their collection. The songs are clustered based on what they sound like. So you may see one little cluster of music that’s your punk and at the other end of the space, trying to be as far away from the punk music, might be your Mozart.

Using this kind of visualization gives you a way of doing interesting things like exploring for one thing, or seeing your favorite songs or some songs that you forgot about. You can make regular playlists or you can make playlists that have trajectories. If you want to listen to high-energy music while driving home from work, you can play music in the high-energy, edgy space part of your music space. If you like to be mellowed out by the time you get home, you have a playlist that takes you gradually from hard-driving music to relaxing and mellow music by the time you pull into the driveway.
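The idea of a playlist with a trajectory can be sketched as follows, assuming each song already has coordinates in a three-dimensional similarity space. The coordinates, song names, and the straight-line path are hypothetical choices made for illustration; real positions would come from the similarity model.

```python
import math

# Hypothetical 3D coordinates in a similarity space; higher values here are
# meant to suggest higher-energy music.
SONGS = {
    "hard_rock": (0.9, 0.8, 0.9),
    "indie_pop": (0.6, 0.6, 0.5),
    "folk": (0.4, 0.3, 0.3),
    "ambient": (0.1, 0.1, 0.1),
    "punk": (1.0, 0.9, 0.8),
}


def trajectory_playlist(songs, start, end, length):
    """Pick distinct songs nearest to evenly spaced points from start to end."""
    remaining = dict(songs)
    length = min(length, len(remaining))  # cannot play more songs than exist
    playlist = []
    for step in range(length):
        t = step / (length - 1) if length > 1 else 0.0
        # Point on the straight line between the two ends of the journey.
        target = tuple(s + t * (e - s) for s, e in zip(start, end))
        nearest = min(
            remaining, key=lambda name: math.dist(remaining[name], target)
        )
        playlist.append(nearest)
        del remaining[nearest]  # never repeat a song
    return playlist


playlist = trajectory_playlist(
    SONGS, start=(1.0, 1.0, 1.0), end=(0.0, 0.0, 0.0), length=4
)
print(playlist)
```

With these made-up coordinates the list starts at the high-energy end of the space and finishes at the mellow end, which is the drive-home scenario described above.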

Gardner: Now, for those who are listening, I’m going to provide some links so you see some of these visualizations. It’s fascinating because it does look like some of these Hubble Telescope cosmos-wide diagrams where you see clusters of galaxies, and then you’re shown the same sort of visualization with clusters of types of music and how they relate.

If we take the scale in the other direction -- down into our brains and with what we know now about brain mapping and where activities take place and how brain cells actually create connections across different parts of the brain -- there is probably a physical mapping within our brains about the music that we like. We’re almost capturing that and then using that as a tool to further our enjoyment of music.

Lamere: That’s an interesting idea.

Gardner: Now, I’m looking here on my PC at my iTunes while we’re talking and I’ve got 4,690 items, 15 days of music, and 26.71GB. And it turns out -- even when I use shuffle and I’ve got my playlists and I’ve dug into this -- I’m probably only listening to about 20 percent or 30 percent of my music. What’s up with that?

Lamere: Yes, exactly. We did a study on that. We looked at 5,000 users and saw that the 80-20 rule really applies to people’s music collections: 80 percent of their listening time is really concentrated in about 20 percent of their music. In fact, we found that these 5,000 users had about 25 million songs on their iPods and we found that 63 percent of the songs had not been listened to even once. So, you can think of your iPod as the place that your music goes to die, because once it’s there, the chances are you will never listen to it again.

Gardner: I don’t want it to be like that. So, clearly we can use some better richer tools. Is that right?

Lamere: That’s right. Shuffle play is great if you have only a few hundred songs that you can pick and put on there, but your iPod is a lot like mine. It has 5,000 songs and it also has my 11-year-old daughter’s High School Musical and DisneyMania tracks. I have Christmas music and some tracks I really don’t want to listen to.

When I hit shuffle play, sometimes those come up. Also with shuffle play, you end up going from something like Mozart to Rammstein. I call that the iPod whiplash. A system that understands a little bit about the content of the music can certainly help you generate playlists that are easier to listen to and also help you explore your music collection.

So you can imagine instead of hitting shuffle play or just playing the same albums over again, you could have a button on your iPod, or your music player, that lets you play music you like. That’s something that is really needed out there.

Gardner: So, instead of a playlist, you could have a "moodlist." For example, I’m stressed and I need to relax, or I really want to get pumped up because I’m going to a party and I want to feel high-energy, or I have the kids in the back seat and I want them to relax. Your mood can, in a sense, dictate how you assemble a playlist.

Lamere: That’s right. Imagine that a few years from now (it’s not hard to extrapolate with the new iPhone) we’re going to have wireless iPods that are probably connected to a cloud of millions of tracks. It’s not hard to imagine all of the world’s music will be available on your hand-held audio player in a few years. Try using shuffle play when you have 500 million songs at your disposal; you’ll never find anything.

Gardner: I don’t have the benefit of a DJ to pick out what I want either, so I’m sort of stuck. I’m not in the old top 40 days, but I’m in the 40 million tracks days.

Lamere: That’s right.

Gardner: Now let’s look at some practical and commercial uses. Professional playlist assemblers who are creating what goes into these channels that we get through satellite or cable could probably use this to advantage. However, that doesn’t strike me as the biggest market. Have you thought about the market opportunity? Would people be willing to pay another five dollars a month to add it to their service so they have all their music readily accessible? How do you foresee it commercializing?

Lamere: I’m sure you’ve heard of Netflix. They are probably one of the biggest DVD shippers and one of their biggest advantages is their recommendation engine. I’m not sure if you’ve heard about the Netflix contest. Any researcher who can improve their recommendation by just 10 percent will receive $1 million from Netflix. I think that really represents how valuable good recommendations are to companies trying to deliver Long Tail content.

Amazon has built their business around helping connect people with their content as well. The same things are going to happen (or are happening now) within the music world. There is a lot of value in hooking people up with music; getting them into the Long Tail.

Gardner: Can we easily transfer this into a full multi-media experience -- video and audio? Have you tried to use this with tracks from a movie or a television show? Is there an option to take just from the audio alone; a characteristic map that people could say, "I like this TV show, now give me ones that are like it?" Is that too far-fetched? How do we go to full multi-media search with this?

Lamere: Those are all really interesting research questions. We haven't got that far yet. There are so many interesting spaces to bring this -- for instance, digital games. People are interacting with characters, monsters, and things like that. Currently, they may be in the same situation 50 times because they keep playing the game over and over again.

Wouldn’t it be nice to hook up to music that matches the mood and follows its changes, or that may even push the latest songs that match the mood into the games, so that instead of listening to the same song, you get new music that does not detract from the game? There are all sorts of really interesting things that could happen there.

Gardner: I saw you demonstrating a 3D map in real time that was shifting as music was going in and out of the library as different songs were playing. It was dynamic; it was visual; there were little icons that represented the cover art from the albums floating around. It was very impressive. Now, that doesn’t happen with a small amount of computer power. Is this a service that could be delivered with the required amount of back-end computational power?

Lamere: Yes. We can take advantage of all of the nifty gaming hardware out there to give us the whiz bang with the 3D visualizations. The real interesting stuff, the signal processing and the machine learning when dealing with millions of songs, is going to use vast computer resources.

If you think music is driving a lot of bandwidth now in storage and computation, in a few years when people start really gravitating toward this content-based analysis, music will be driving lots and lots of CPU cycles. A small company with this new way of content-based recommendation can rent time on a grid at a per-CPU hour rate and process an iTunes-sized music collection (a few million songs) in a weekend as opposed to the five or 10 years it would take on a couple of desktop systems.

Gardner: Interesting. The technology that you’ve applied to music began with speech. Is there a way of moving this back over to speech? We do have quite a bit of metadata and straight text and traditional search capabilities. What if we create an overlay of intonation, emphasis, and some of the audio cues that we get in language that don’t show up in the written presentation or in what is searchable? Does that add another level of capability or "color" to how we manage the spoken word and/or the written word? With my podcasting, for example, I start with audio -- but I go to text, and then try to enjoy the benefits of both.

Lamere: Right. These are all great research questions; the things that researchers could spend years on in the lab. I think one interesting application would be tied to meetings; work meetings, conference meetings, just like when you visited Sun last month.

Suppose we had a computer that was listening to the meeting and maybe trying to do some transcripts, but also noting some of the audio events in the meeting, such as when everybody laughed or when things got really quiet. You could start to use those as keys for searching content in the meetings. So, you could go back to a recording of the meeting and find the introductions again very easily so you can remember who was at the meeting or find that summary slide and the spoken words that went with the conclusion of the talk.

Gardner: Almost like a focus group ability from just some of these audio cues.

Lamere: That’s right.

Gardner: Hey, I’ve got something that you could take to the airlines. I tend to sit on planes for long periods of time and after my battery runs out, I am stuck listening to what the airline audio provides through those little air tubes. Wouldn’t it be nice if there were audio selections that were rich and that really fit my mood? This is a perfect commercialization.

Lamere: Yes, you can have your favorite music as part of your travel profile.

Gardner: This could also work with automakers. Now that everyone has found a way to hook up to their iPods or their MP3 equivalent in their automobile, the automakers can give you what you want off of the satellite feed.

Lamere: Definitely.

Gardner: There are many different directions to take this. Obviously you’ve got some interest in promoting Sun’s strategic direction. There must be some other licensing opportunities for something like this. Is this patented or are you approaching it from an intellectual property standpoint? If our listeners have some ideas that we haven’t thought of, what should they do?

Lamere: When you’re in the labs, the people who run the lab really like to make sure that the researchers are not tempted by other people’s ideas because it can make it difficult down the road. If people have some ideas they want to send my way, it’s always great to hear more things. We do have some patents around the space. We generally don’t try to exploit the patents to charge people entry into a particular space.

Gardner: Since this does fall under the category of search, there are some big companies out there that are interested in that technology. We have a lot of Google beta projects, for example, such as Google News, Google Blogs, Google Podcasts, and Google Base. Why not Google Music?

Lamere: Google has two -- I guess, you may call them Friday projects -- on their labs site around music. One is Google Trends, and the idea there is they’re trying to track which music is popular. If you use Google’s instant messenger, you can download a plug-in that will also track your listening behavior. So every time you play a song, it sends the name of the artist and the track to Google and they keep track of that. They give you charts of top 100 music, top 100 artists, whatever. The other thing they have is a Music Search tailored toward music.

If you type in Coldplay or The Beatles, you’ll get a search result that’s oriented toward music. You’ll see links of the artist page and links to lyrics but, interestingly enough, they haven't done anything in public to my knowledge about indexing music itself. This is interesting because Google has never been shy about indexing.

Its mission is to index all the world’s information, and certainly music falls into that category. They haven't been shy about going up against publishers when it comes to their library project, where they’re scanning all the books in all the libraries despite some of the objections of the book publishers. But for some reason they don’t want to do the same with music. So, it’s an open question. But probably they’ll be announcing the Google Music Search tomorrow.

Gardner: At least we can safely say we’re in the infancy of music search, right?

Lamere: That’s right. I see a lot of companies trying to get into the space. Most of them are trying to use the collaborative filtering models. The collaborative filtering models really require lots of data about lots of users. So they have a hard time attracting users because until they get a lot of users, their recommendations are not that good. And because their recommendations are not that good, they don’t get a lot of users.

Gardner: The classic chicken-and-egg dilemma.

Lamere: Yes, it’s called the "cold start" problem.

Gardner: I firmly believe in the "medium is the message"-effect, and not just for viewing news or getting information. When I was getting my music through the AM radio, that characterized a certain way that I listened and enjoyed music. I had to be in a car, or I had to be close to a radio, and then I might avoid sitting under a bridge.

Then I had my LPs and my eight tracks and they changed from one song into an album format for me. We’re going back a number of years here. I’ve been quite fond of my iPod and iTunes for the last several years and that has also changed the way I relate to music. Now, you have had the chance to enjoy your Search Inside the Music benefit. How has using this changed the way you relate to and use music?

Lamere: That’s a good question. I agree that the medium is the message, and that really affects our way of dealing with music. As we switch over to MP3s, I think listening to music has shifted from the living room to the computer. People are now jogging with their iPod and listening experiences are much more casual.

They may have access to 10,000 tracks. They’re not listening to the same track over and over like we used to. So I think over time music is going to shift back from the computer to the living room, back to the living spaces in our house and away from the computer.

I try to use our system away from the computer -- just because I like to listen to music when I’m living, not just when I’m working. So I can use something like Search Inside the Music to generate interesting playlists that I don’t have to think about.

Instead of just putting the shuffle play on The Beatles, I can start from where I was yesterday, and since we were eating dinner, let’s circle around this area of string quartets and then when we’re done, we will ramp it up to some new indie music. You still have the surprise that you get with shuffle, which is really nice, but you also have some kind of arc to the listening.

Gardner: So you are really creating a musical journey across different moods, sensibilities, and genres. You can chart that beyond one or two songs or a half-dozen songs into a full-day playlist.

Lamere: That’s right.

Gardner: Very interesting. Well, thanks for joining us, Paul. We’ve been talking with Paul Lamere, a researcher and staff engineer at the Sun Microsystems Research Labs in Burlington, Mass. We’ve been discussing Search Inside the Music, a capability that he’s been investigating. A lot of thoughts and possibilities came from this. Paul is the principal investigator. I wish you well on the project.

Lamere: Thanks, Dana. It’s been great talking to you.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You’ve been listening to BriefingsDirect. Come back next time.

If any of our listeners are interested in learning more about BriefingsDirect B2B informational podcasts or to become a sponsor of this or other B2B podcasts, please feel free to contact Dana Gardner at 603-528-2435.

Transcript of BriefingsDirect podcast on music search with Sun Labs. Copyright Interarbor Solutions, LLC, 2005-2007. All rights reserved.

Monday, January 29, 2007

Transcript of BriefingsDirect Podcast on Business Webs and Relationships-Oriented Business Search

Edited transcript of BriefingsDirect[TM] podcast on Business Webs and search with host Dana Gardner, recorded Jan. 9, 2007.

Listen to the podcast here. Sponsor: Zoom Information Inc.

SPECIAL OFFER: Listeners of this podcast are invited to try ZoomInfo's Business Web search and discovery benefits through a special free-trial offer. Just go to www.zoominfo.com/businesswebpodcast to obtain free access to Zoominfo's advanced business search capabilities.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, a sponsored podcast discussion on the burgeoning "Business Web," the concept around how companies and commerce-focused organizations can discover one another and create partnerships, relationships, and alliances.

In doing so, they can exploit and leverage the search technologies around the Semantic Web and around the new opportunity for discovery of assets, resources and tacit knowledge. Joining us to discuss this is a representative from Zoom Information Inc., Russ Glass. He is the vice president of product and marketing at ZoomInfo. Welcome to the show, Russ.

Russ Glass: Thanks for having me.

Gardner: Also joining is John Blossom, president of Shore Communications and a noted analyst covering these topics. Thanks for joining us, John.

John Blossom: Thank you, Dana.

Gardner: Search has been around for a number of years. It’s been used for many -- I suppose you could say -- pedestrian purposes, and has increasingly been focused on business. But business relationships and finding partners is as old as business itself. Finding the right partners can make or break an activity, a go-to-market, an innovation, and any new process that’s brought to the market.

Now, we’re starting to bring these things together, perhaps a "Reese’s Cup" moment -- chocolate and peanut butter, if you will -- a whole greater than the sum of the parts. This seems to be particularly relevant now because we’re entering this Web 3.0 era, where semantic information is more readily available, and context scales massively, but also scales down to the long-tail or micro niches.

So, we want to talk about the need for aggregation, the fact that there is just a tremendous amount of information, even an overload, given what search is capable of. We also want to explore how this new semantic opportunity, business relationships, and the Business Web can be brought to bear on overload and discovery.

I want to start with you, John. Could you help us understand just how large this ocean is that we’re roiling about in with search? What is the need for aggregation, context, or pertinence when it comes to this huge trove of disparate business information that is now available through the Web?

Blossom: Sure thing, Dana. The Internet as everybody knows is a growing database of information, accessible globally. We used to measure things in megabytes and gigabytes, then terabytes, then petabytes and exabytes. Basically, we just keep on putting the zeroes behind the ones, and we have untold quantities of information that are being produced every day around the world and are being published on the open Web.

What’s happening in today’s environment though is that businesses are getting much more intelligent about how they use the Web to present themselves as not just companies with brochureware sites, but as publishers that provide in-depth information about their activities.

At the same time, individuals in business are learning how to expose themselves through that instrument also. Why do they bother? Well, our research shows that most people in business start on the open Web when they’re trying to solve a business problem and find business information, as opposed to going to internal subscription services. Certainly, subscription content is still a very important part of the equation, but many people, knowing that companies are out there publishing information on the Web, are going out to search engines to find the answers to their business questions.

So, search engines have become a natural first point for not just Googling individuals to find general information or haphazard information, but to solve real, specific business and sales problems in the process. In today’s environment, with so much information being published by individuals and institutions, search engines don’t necessarily provide the level of filtering that the average businessperson needs to be able to solve problems effectively.

Gardner: All right, let’s move to Russ. Russ, it's a given that search has become a front-end for more and more business activities. When we’re thinking about exploring new opportunities in different markets, partnerships -- and given that more and more companies are in fact partnering and aligning themselves -- what can be brought to this opportunity and how can we then at the same time solve this overload issue?

Glass: When John talked about how much information is on the Web and how fast it’s growing, he nailed it right on the head. John Battelle calls it the "Database of Intentions," where the Web is now this huge repository for what companies are doing, what partnerships they’re signing up, who works within different companies, what industries they’ve worked in in the past, where they went to school, etc. If there was a way to tie all that together in an efficient manner and with a search interface that made sense for the business user, there’s tremendous value to get out of it.

At ZoomInfo, we’ve got a product called PowerSearch, which is essentially the front-end interface that’s targeted to the business user to be able to pull out these intentions: who’s working with whom, what are they trying to accomplish, how can I discover markets to use in my day to day business life?

Gardner: So, it really isn’t just about businesses. It’s about individuals, because businesses are composed of people. I might be dating myself here, but there was this time when your Rolodex was your standard. How good your Rolodex was indicated how good you were at partnerships and relationships and what your past business history was. That’s now given way to the Web, but again we hit overload. What’s being done technically to align business user needs with an automated approach? In a general sense, are we talking about crawling, and is it traditional, or do we have to go a step further technically in order to accomplish this notion of a Business Web?

Glass: We need to go a step further. Obviously, crawling is a starting point for all search engines. You have to have a mechanism to get out there and actually find the information, but it’s not just enough to say, “Oh, these keywords exist out there.” For example, my name, Russell Glass, is actually two keywords in the eyes of the search engine, when Russell Glass is really an entity. I’m a person. You have to take that level of understanding beyond just crawling, to be able to say that I’m a person, I have certain attachments, I belong to certain organizations, I’ve worked for certain companies.

I have all these characteristics when you really get to semantics, the Semantic Web side of this. I have all this metadata, if you want to call it that, from a technical standpoint, which defines who I am. The next step for search is really to understand that metadata and use that to create a more useful and filtered experience for a business user. For example, if I am looking for people within a certain industry -- let’s say the healthcare industry -- I need to be able to tell that search engine that the results I want are people, and those people have to work within the healthcare industry. That search engine has to be a lot smarter than just bringing back certain keywords that are found out there on the Web.
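[Editor's illustration] The distinction Glass draws between matching keywords and understanding entities can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not a description of how ZoomInfo or any real search engine works: the `Entity` record, the `search` and `keyword_match` functions, and every name in `RECORDS` are hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record: a named thing plus the metadata that describes it."""
    name: str
    kind: str                                # assumed values: "person" or "company"
    industries: frozenset = frozenset()      # industries the entity is tagged with
    employers: frozenset = frozenset()       # organizations a person has worked for


# Invented sample data; none of these are real people or companies.
RECORDS = [
    Entity("Avery Glass", "person", frozenset({"software"}), frozenset({"Example Search Inc."})),
    Entity("Jane Doe", "person", frozenset({"healthcare"}), frozenset({"Example Hospital"})),
    Entity("Glass Repair Co.", "company", frozenset({"construction"})),
    Entity("Example Hospital", "company", frozenset({"healthcare"})),
]


def keyword_match(entities, keyword):
    """Plain keyword search: any record whose name contains the word."""
    return [e.name for e in entities if keyword.lower() in e.name.lower()]


def search(entities, kind, industry):
    """Entity-aware search: filter on what the record is and where it belongs."""
    return sorted(
        (e for e in entities if e.kind == kind and industry in e.industries),
        key=lambda e: e.name,
    )


# The keyword "glass" cannot tell a person from a window-repair company.
print(keyword_match(RECORDS, "glass"))
# Asking for people in healthcare uses the metadata instead of the raw words.
print([e.name for e in search(RECORDS, "person", "healthcare")])
```

The keyword query returns both the person and the company that happen to share a word, while the entity-aware query returns only records that are people and are tagged with the healthcare industry.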

Blossom: That’s a key point, and what you’re pointing out from my perspective, Russ, is that for many years people were talking about the Semantic Web and the need for metadata, so people will be able to do these kinds of searches. But the Web is not self-organizing in the way that it’s published. The information is out there and it can be extracted, but people are not publishing content with that structure in mind.

The technologies that are becoming more prevalent in this era, and that sometimes get bucketed into the label "Web 3.0," are about being able to extract that semantic structure from content that’s been published without semantics in mind. They provide the hooks that will allow specialized searches to go out there and provide semantic structure across a wide range of information, regardless of how it was initially published.

Gardner: John, let’s take a step back for a moment and, for our listeners' benefit, give them a timeline of Web 1.0, Web 2.0 and Web 3.0. I’ll start with a very broad overview. Web 1.0 was HTML content, pretty straightforward and basically a push. It was just a "Here’s a brochure" type of affair.

Web 2.0 is characterized a lot by relationships and user content discussion, give and take, bringing this more to the level of a village, but with massive scale and also massive niche or long-tail opportunity.

Help us explain now this next progression to Web 3.0. What do you mean by "semantic" and "Web," and what sort of tools can be brought in this newer era to align business interests, cut to the wheat, and get rid of the chaff?

Blossom: It’s always dangerous to use labels like Web 3.0, especially when it’s so vague and undefined that Wikipedia has locked down the page for Web 3.0, and won’t even allow people to make an entry on it. What I think we can say safely is that everything old is becoming new again, through the advancement of technology on the Web. If you think of pre-Web information services, they were oriented around databases. Those databases were highly structured and you had indexes and needed specific words to be able to organize records in that database.

Gardner: You practically had to be a developer to know how to approach the database to get basic information?

Blossom: Absolutely. If you look at some of the interfaces that were available in early information products, such as Dialog and other databases of that sort, they came with a virtual programming language unto themselves just to do a simple query. Then, along comes the Web, and what's now termed Web 2.0, and people said, “Well, let’s just not worry about structure too much, let’s put the information out there." There is enough intelligence in search engines for us to be able to find stuff that’s been pushed out there on the Web.

It was a huge step forward, when the search engines provided the ability for information to be aggregated in new ways without the structure of a database, but on a rather ad-hoc basis. Sometimes it was guided by popularity and sometimes by semantic structure, but whatever the combination, information was more easily aggregated from any number of sources, regardless of its structure.

Web 2.0 is probably best described as the read/write Web, where there is an increasing proportion of content that’s being pushed out by individuals as opposed to institutions. We have a broader mix of sources and a broader mix of information going out from a range of publishers. It has some default structure in it because of the standards that are being applied, but it’s still fairly loose information.

Gardner: At least those individuals are not only contributing content, but they also, in a sense, contribute ranking opportunity by voting with their attention, with their activity, and with their links, which information is relevant, and this helps codify into some categorization.

Blossom: Exactly, and the ability for people to be able to build up webs of links, to be able to indicate what’s important, provides a level of importance to semantics. That begins to highlight information more effectively to people, when they’re trying to figure out what’s relevant in a trendy topic.

Gardner: If it works for finding people who are interested in your baseball collection, your baseball card collection, then it might also help in putting together your business community or ecology as well.

Blossom: Absolutely. Business publications and corporations are all beginning to latch onto this idea and to use Web 2.0-style publishing to be able to reach people. We have individuals out there in business with Weblogs. We have CEOs with Weblogs. We have PR departments with Weblogs. Everybody is pushing out and trying to engage the world in a conversation. Now, with all that information out there, the question is how we structure it. What we’re beginning to see in the Web environment is the use of more sophisticated content extraction technologies and analytics to be able to take those relationships and to present them more effectively to information systems.

So, to bring it back to what I was saying originally, we’ve gone back to trying to provide structure in our information, but with a much more sophisticated publishing environment than people used to have in standalone databases. Now, in effect, the entire Web has become a database because of the ability to structure information on the fly.

Gardner: Okay, so the best of the old and the best of the new. Russ, you’re in the position now of creating product and then bringing that product to market at ZoomInfo. What is it about this next step of the best of the old and the new that is exciting to you as a search product developer?

Glass: John made a key point -- when he was talking about how Web 2.0 was driven by the individual -- moving from totally unstructured information to a world where now users can start to give feedback. Businesses have started to follow the individuals' lead to Web 2.0, but they have then taken the next step because all that information is useful to a business only if it’s structured and if it can be efficiently incorporated into the business environment.

Businesses are really leading the drive toward these "Web 3.0" technologies, moving from unstructured to structured information. As a product guy in this space, that’s what excites me, because I think that companies like ZoomInfo, which are focused on semantic technologies, are going to find the most success in the business world in the short term, because the ROI is there. Businesses can make tremendous amounts of money in both the cost side and the top line by aggregating this information and using it effectively.

Gardner: It seems to me that this notion of a Business Web isn't really, "Let’s build it, and see who’ll come." This is really business responding to an opportunity, recognizing that they have need of finding an audience, of finding channels, of finding partnership, extending into new markets -- globalization. Help me understand the chicken and the egg here when it comes to the Business Web?

Glass: Sure. I don’t think we’re creating the Business Web. I think the Business Web’s already out there. It’s out there in the form of Websites and press releases and SEC filings and blogs and anything else with business content. What we’re doing is taking technology and making all of that information accessible in an efficient way. That’s really the next step, because as we’ve talked about already, there’s just too much information out there to absorb. It’s in too many different repositories, in too many different formats.

When we have something that can tie it all together in a way that makes sense to a business user who can then use it to define markets and find people within those markets and do competitive research on different companies within those markets, etc., that’s where the real value is going to be extracted from the Business Web.

Blossom: It's interesting that the sorts of problems that businesses are trying to solve on the open Web are very similar to the problems that they’re also trying to solve behind the firewall, on their intranets and extranets. Even inside corporations, they have created many repositories of different types of information. Being able to unify that effectively with analytics-driven aggregation systems is key for their internal operations. Increasingly, businesses are seeing the open Web as an extension of their business operations. It’s just a matter of which IP routers you are going through to be able to extract business intelligence.

Gardner: Perhaps we have a dual track here. On one hand, we’ve got progression toward the Semantic Web, but at the same time inside the firewall, as you point out, we’ve got a long-term progression toward the semantic enterprise. We’ve been using warehousing and mining business intelligence, and a number of other trends over the past 10 or 15 years.

Help me understand, John, how these come together. What is this "whole greater than the sum of the parts," when we apply the semantic enterprise to the Semantic Web?

Blossom: It’s the same problem solved in a different environment, although there are more structured databases inside your typical corporation. With the proliferation of departmental Web servers and Web 2.0 applications within the corporate environment, there’s not a lot of homogeneity in their internal information. When the corporations look out to the external Web, they are more open to the idea of this being a resource that they can use for business purposes, not just because of the tools that are out there, but because they understand this is the same sort of problem that they have had to solve internally.

There’s a greater openness to look at Web information as the same type of resource that they have available on a corporate basis, and they're more open to the idea of being able to transform it into a resource that has semantic structure and usefulness for enterprise-ready applications.

Glass: Think about it from the end user’s perspective. If I’m a businessperson and I need access to information, at the end of the day, I don’t really care where it comes from as long as I know the authority of the source, and as long as I’m able to quickly and efficiently use the information. Just make sure that it’s holistic, that I get as much information about the topics I’m looking for, and that it’s easy and efficient to use.

Blossom: Users in the corporate environment use open Web search engines as their first point to find information. The internal search engines that they have in the corporations are literally competing for attention with those. It only makes sense that those users are going to be very open to more sophisticated resources that are available to them on the Web.

Gardner: I suppose it boils down to, they want to know what they don’t know about what's going on out in the greater business environment.

Blossom: Absolutely. And, the Web itself is becoming the most efficient channel through which to communicate that information. There’s less need for the info-media ways of the past.

Gardner: Let’s drop down from a few thousand feet to the basement and get concrete here. Are there some examples of companies that have gone out -- perhaps beta customers for you at ZoomInfo -- and applied this kind of cutting edge innovative basis and had some payoffs?

Glass: We’ve got on the order of a couple of thousand customers that are using this technology now and getting bigger ROI out of being able to use information that's been gathered from the open Web and displayed efficiently. Adobe is a good example. They use us within their field sales force. The problem they face is that their field sales are disparate. They look at a lot of different types of companies and a lot of different types of industries that all have different value props for using Adobe products.

A tool like ZoomInfo can define markets and really give a good holistic look at which companies play in a certain market space in a certain location. So, their field sales force has a good understanding of where they can go look for opportunities, all the way down to the specific person within the company -- whether it be the VP of marketing or whoever else would buy Adobe products.

Gardner: So you can take that notion of the Rolodex and just explode it. Not just shooting in the dark, but pinpointing the right people to talk to at the right time in certain companies. That gives you a much better chance of saying, “Aha, you’ve got a problem? I’ve got a solution.”

Glass: It’s a context. It’s understanding. Your Rolodex gives you a name in a company, but if you know that that person makes decisions about certain products, and you know that the person knows someone else that you already have as a customer, it really helps you be able to both warm the call, get yourself in the door, get your foot in the door, and then close the deal.

Gardner: If I'm selling and I find one company that’s a likely candidate for my universe, then I can apply the same criteria and probably come up with a bunch of similar companies, and then come up with an expanding universe of prospects.

Glass: That’s exactly right. It’s kind of like the crystal set. You need that one little crystal to form the entire large crystal. Once you know a single company in a certain area, you can pivot around that company and discover all the other companies that do exactly what that company does.

That’s all because of the way the Web works. Every company is out there trying to describe what they do. Having an intelligent technology that can take the metadata out of that, take the semantics out of that, and understand which companies are actually doing the same things is very powerful for business.
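[Editor's illustration] The "pivot around that company" idea can be sketched as a similarity search over the terms companies use to describe themselves. This is a minimal sketch under stated assumptions, not ZoomInfo's method: it assumes each company has already been reduced to a set of descriptive terms, it uses Jaccard overlap as a stand-in for whatever semantic matching a real product applies, and the company names, terms, and the `threshold` parameter are all hypothetical.

```python
def jaccard(a, b):
    """Share of terms two descriptions have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_companies(seed, profiles, threshold=0.5):
    """Rank every other company by how closely its terms overlap the seed's."""
    seed_terms = profiles[seed]
    scored = [
        (name, jaccard(seed_terms, terms))
        for name, terms in profiles.items()
        if name != seed
    ]
    # Keep only close matches; best score first, ties broken by name.
    return sorted(
        [(name, score) for name, score in scored if score >= threshold],
        key=lambda pair: (-pair[1], pair[0]),
    )


# Invented profiles; in practice these terms would be extracted from Web content.
PROFILES = {
    "Acme Imaging": {"medical", "imaging", "software"},
    "Beta Scan": {"medical", "imaging", "hardware"},
    "Gamma Radiology": {"medical", "imaging", "software", "cloud"},
    "Delta Bakery": {"bread", "pastry"},
}

for name, score in similar_companies("Acme Imaging", PROFILES):
    print(f"{name}: {score:.2f}")
```

Starting from one known company, the function surfaces the others whose self-descriptions overlap most, which is the "seed crystal" behavior described above; the unrelated bakery falls below the threshold and is dropped.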

Gardner: I can apply that to sales and marketing prospects, looking for buyers, but I suppose I can apply the same approach of the Business Web to looking for partners, saying, "Here is a company that is doing something close to what I’m doing and there’s a natural affinity. Let me find all the other companies that could also potentially be in a partnership or a symbiotic relationship with me." Then, I'd have a whole new opportunity to expand my universe.

Glass: As we’ve all seen in the Web and the hi-tech world today, "co-opetition" is the name of the game. You’re competing with companies, and at the same time there are great efficiencies to working with them. If anybody has defined that, it’s Google. I don’t know of a single company that doesn’t have some sort of relationship with Google, even if it’s as simple as AdSense or using them to place ads on their content, but at the same time they’re often competing with Google for advertising dollars.

Blossom: What seems to be happening in the midst of this is that the Web is becoming a more effective tool over time to classify businesses than traditional industry classification schemes. The competitors I had three months ago are not necessarily the competitors I’m going to have three months from now -- or even today -- because business changes at the speed of light, as the Internet pumps information around the world and business strategies change on a dime.

In that sort of environment, it becomes more effective to be able to classify companies and individuals through semantics and relationships that are defined through Web content, than to rely exclusively on databases based on long standing industry classifications and formal relationships. Being able to get those webs of competitors and relationships right on the fly requires an environment like the Web, where that information changes every day.

Gardner: John, it seems to me there are some other mega trends that accelerate this. Globalization, we’ve mentioned. It seems that as companies go more to business-process outsourcing, and if they’re looking to employ services oriented architecture and software-as-a-service, what wires and connects companies together becomes more important than having it all internal, doing it all yourself. Am I too far out here?

Blossom: Not really. There is a great example that Don Tapscott uses in his new book, Wikinomics. He talks about a gold mining company up in Canada that recognized all of a sudden that they had all sorts of potential deposits of gold, but they had no idea exactly where they were, and they wanted to get at them most efficiently. Their CEO happened to go to a seminar that was talking about open-source software and Linus Torvalds's development of Linux.

He said to himself, “Well, let’s open-source our functions." He took all of their normally secretive information about the geology, mineralogy, and mineral deposits on that property, put it out on the open Web, and asked people to find the gold. He put out a little reward money. They got input from over 1,000 people, and doubled the yield of that property in just a matter of weeks.

Glass: Wisdom of crowds, right?

Blossom: The wisdom of crowds, which sounds a little bit frightening at times -- and at times it can be -- but the idea that corporations benefit most from wherever today’s most insightful people are at the drop of a hat, pushes people towards Web infrastructure, where they can collect that knowledge and wisdom as efficiently as possible in an open environment. That doesn't necessarily mean that they have business relationships with those people, but being part of that Web is becoming far more important over time than segmenting too much intellectual property behind the firewall.

Gardner: So, it sort of takes the notion of the knowledge economy a step further, that the coin of the realm is information in the right people’s hands at the right time?

Blossom: Absolutely.

Gardner: And search is a very powerful tool to bring that about. Well, great. This has been a good exploration of this notion of the Business Web. I was sort of fuzzy on it, and now I have a much better handle on it. Was there anything else that we needed to bring to the table for people to appreciate the Business Web? How about this notion, still nebulous as it is, of Web 3.0 and the Semantic Web? Are we in the fifth or sixth inning of a nine-inning ballgame here, or are we only just in the first few minutes of this in terms of what’s going to be possible in the next few years?

Glass: We’re in the first few minutes. The Business Web and what's happening with tools like ZoomInfo is the first step. You can almost think of the Business Web as a microcosm for what's going to happen to the rest of the Web. This will be the first step toward a truly semantic type of search, where you can aggregate all this information and use the power of unstructured content to, as John said earlier, be able to recognize things on the fly, as they’re changing real time, and not have to rely on old categorizations and old information. The rest of the Web will head this direction, led by what's going on in the Business Web -- one big content nation comprised of individuals and institutions cooperating and collaborating to create knowledge that’s going to help all of us.

Gardner: That’s a high note to end on. I want to thank you both for joining us in this discussion exploring the definition of a Business Web, some of the tools that are available to explore that now, and what we can expect in the future. We’ve been taking this discussion to a new level with Russ Glass, Vice President of Product and Marketing at Zoom Information, and John Blossom, President of Shore Communications. Thank you, gentlemen.

Blossom: Thank you very much.

Glass: Thank you, Dana.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You’ve been listening to a sponsored BriefingsDirect podcast. Thanks for listening, and join us next time.

SPECIAL OFFER: Listeners of this podcast are invited to try ZoomInfo's Business Web search and discovery benefits through a special free-trial offer. Just go to www.zoominfo.com/businesswebpodcast to obtain free access to Zoominfo's advanced business search capabilities.

Listen to the podcast here. Sponsor: Zoom Information, Inc.

Transcript of Dana Gardner’s BriefingsDirect podcast on Business Webs and relationships-oriented search. Copyright Interarbor Solutions, LLC, 2005-2007. All rights reserved.