
Monday, August 11, 2008

WSO2 Data Services Provide Catalyst for SOA and Set Stage for New Cloud-Based Data Models

Transcript of BriefingsDirect podcast on data services, SOA and cloud-based data hosting models.

Listen to the podcast. Sponsor: WSO2.

Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect.

Today, a sponsored podcast discussion about data services, how services-oriented architecture (SOA) is enabling an expansive reach for data, particularly enterprise data, and how this will relate to the development of cloud infrastructures and services.

We’ll also examine what technologies and approaches organizations will need to federate and integrate these data sources and services and hosts, but without additional risk. That is to say, to free up data, give it more legs, but without losing control or increasing security risk.

We're also going to look at how open-source software relates to this, and how organizations are bridging the risk reduction and larger availability of data using open-source software.

To help us in this discussion, we are joined by a distinguished panel. First, Paul Fremantle, the chief technology officer at WSO2. Welcome, Paul.

Paul Fremantle: Hi, nice to be here.

Gardner: We are also joined by Brad Svee, the IT manager of development at Concur Technologies, a travel and expense management company in Redmond, Wash. Welcome to the show, Brad.

Brad Svee: Glad to be here.

Gardner: We are also joined by James Governor, a principal analyst and founder at RedMonk. Good to have you back, James.

James Governor: Thank you very much. Hello, everyone.

Gardner: Let's set this up by looking at the problem. I suppose the good news of looking into the past is that data has been secure and controlled. There are lots of relational databases with many, many bells and whistles applied over the years. There has also been a lot of development, middleware, and systems administration work to manage that data, keep it available, but also secure. That's the good news.

The bad news is that it's been limited in some respects by that firewall of personnel, technologies, and standards around it. I want to go first to Paul Fremantle. Can you tell us a little bit about why the old model is not going to be sustainable in the world of services and mixed hosting environments?

Fremantle: It's a very interesting question. There are two key issues around the old model. The first is the just-in-time nature of data systems that are being bought today. Typically, customers are coming onto Websites and expecting to see the status of the world as it is right now.

They don't want to know what happened yesterday. They don't want to know what happened a week ago. They want to know exactly what happened right now. So it's very important that we move away from batch-oriented systems and file dumping and move to a real, live connected world, which is what people expect out of the Internet.

That live, connected world needs to be managed properly and it's very, very difficult to build that as a single monolithic system. So, it's really essential to move the ownership of that data to the people who really know it, create it, and own it, and that really plays to SOA. This is the model of keeping the data where it belongs and yet making it available to the rest of the world. That's the first point.

The second point is, of course, the danger inherent in getting it wrong. I have two stories, which I think will shed some interesting light on this. One is, I was working with a government organization and they were involved in a situation where every day one of the employees had to FTP a data file from a remote system and back load it into their local system.

The employee fell ill, and, of course, this didn't happen. They had a whole process to find out who had the password, who could do this and solve this problem. They had no one there to back load this data. As I was investigating, it turned out that the data in the remote system, from the other organization, was actually coming from within their own organization.

There was another employee uploading the data from the main system to the remote system every day, and they had no clue about this. They didn't realize that this process had built up, where the data from organization A was being sent to organization B, and then re-downloaded to organization A again, every single day, taking up two employees' time to do that.

Gardner: This is sort of the medieval approach to data transfer.

Fremantle: This is the medieval approach to data transfer. And it's not as if this was happening back in 1963. This was actually happening in 2007.

Governor: Medieval or not, the simple fact is that there are vast amounts of exactly that kind of stuff going on out there. Another lovely story told by Martin Fowler talks about a customer -- I believe he was in the U.K. NHS, but I should be a little bit careful there. I should say it was a large organization, and they were freaking out. They said, "We've got to get off Java, because the printer driver is just no good."

He said, "What exactly are you trying to do? Let's have a chat about the situation." "We've got to get off Java. We will just try and work it out." He looked at the work that was involved. Basically, they were getting a document, printing it out, taking it across the room, and then typing it into another system on the other side of the room. He had to tell them, "Well, maybe there is another way of doing it that won't require printer drivers."

Gardner: One of the motivators, it seems, is if nothing dramatic requires you to change your patterns, then you stay with them. It's sort of inertia with people's behavior, and that includes IT. What we're seeing now is an impetus, or an acceleration and automation in services, because they have to, because there are outside organizations involved. A business process is not strictly internal, from one side of the room to the other, but could be across the globe and span many different companies. Does that sound correct, Paul?

Fremantle: Absolutely. I just want to give you a second example, which has been very well publicized in the U.K. where I live, but maybe it hasn't been so well known outside of the U.K. Revenue and Customs in the U.K. had a significant problem recently, where they sent a CD containing 20 million records, including the birth dates, names, national insurance numbers, and bank account details of 20 million people, to another government department.

And, they lost it. They sent it again, and they lost it again. It would not be going too far to say this had significant ramifications on the government and their ability to stay in government. The upshot of this was, they had policemen out searching rubbish dumps. They had to send a personal letter to each of the 20 million people. Banks had to update their security procedures.

The overall cost of this mistake, I imagine, must be in the millions of pounds. Now, the interesting question is, firstly, they didn't encrypt that data properly, but even if they had, there is a huge difference between encrypting an online system and encrypting a CD. If a hacker gets hold of the CD, he can spend as long as it takes to decrypt that system. If it takes him two years of computing power to do that, he can sit there for two years and break it.

If you have an encrypted online system and someone keeps trying to break it, the administrator sees log messages, knows that something is happening, and can deal with that. So it's not just the lack of encryption and the bulk dumping of data from one department to another that's the problem. The model of sticking it on a CD hugely increases the dangers.

Governor: Well, people should be imprisoned for that, or at least lose the right to trade. Obviously, being government organizations, it's difficult to make that stick, but the U.K. government loves the use of the phrase "fit for purpose." Quite frankly, there has been evidence that they are not fit for purpose.

Interestingly enough, one of the things about the importance of data and managing it more effectively is thinking about data in a more rigorous way. I was going to talk on this call about "leaky abstractions." One of the problems with SOA is the notion that, "Oh, we can just take the system as it is and make it available as a service."

Actually, you do want to do some thinking and modeling about your data, your data structures, and how it can be accessible, and so on, because of this notion of leaky abstractions. You can push something in one place and something else is going to pop out in another by just taking a service as it is and making it online. You may not be doing the work required to use it more effectively.

I think that's the kind of thing that Paul is talking about there. What better example of the leaky abstraction is there than somebody sending a disk and not tracking where it goes? Again, the fact that there wasn't any cryptography used is really shocking, but frankly, this is business as usual.

Fremantle: In fact just to completely confirm what you are saying there, the government department that wanted this data did not want the bank account details, the national insurance numbers, or the ages. They didn't want all that data. What actually happened was the revenue and customs team were not sufficiently IT aware to be able to export just the data that was wanted, so they dumped everything onto the disk.

I think that exactly confirms what you are saying about the leaky abstraction. They just pushed out everything, because that was the simplest possible thing to do, when it wasn't what was required. Exporting only what was required is what should have been done.
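The point about exporting only the data that was wanted can be sketched in a few lines of Python. This is purely illustrative and is not the system discussed on the call: the record layout, the sample values, and the `export_fields` helper are all hypothetical, and the assumption that only names were requested is ours.

```python
# Hypothetical record layout; field names and values are illustrative only.
RECORDS = [
    {"name": "A. Example", "birth_date": "1970-01-01",
     "national_insurance": "QQ123456C", "bank_account": "00000000"},
]

def export_fields(records, wanted):
    """Return copies of the records containing only the requested fields."""
    return [{key: record[key] for key in wanted} for record in records]

# Assume the receiving department asked for names only:
# nothing else leaves the source system.
subset = export_fields(RECORDS, ["name"])
print(subset)
```

The design choice is that the filter runs at the source, so sensitive fields are never copied in the first place rather than being stripped out later.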

Gardner: So, it does seem clear that the status quo is not sustainable, that there is inherent risk in the current system, and that simply retrofitting existing data and turning it on as a service is not sufficient. Rather, you need to rationalize, think about the data, and generate the ability to slice it and dice it a little better, because in the case of this disk of vast amounts of information, only a small portion of it was actually required.

Let's look at this also through the lens of, "If we need to change, how do we best do that?" Let's look at an example of how someone who needs to view data in a different sense, in a more modern sense, is adjusting. Let's go to Brad at Concur. Your organization is involved with helping to improve the efficiency and productivity of travel and expense management inside of organizations.

Your data is going to come from a variety of areas that probably could be sensitive data in many organizations. Certainly, people are not happy about having their travel information easily available around the organization or certainly outside of it. And, of course, there are government, tax, and compliance implications as well. Can you give us a little bit of sense of what your data problem set is and if it's different from what we have heard on the "medieval" front? What sort of approaches would you like to take and have you been taking?

Svee: First, I would like to clarify the distinct separation between our research and development team, which actually works on the product that we sell to clients, and my team, which works internally with our internal data.

I would like to draw a clear distinction between those two. I am only able to speak to the internal data, but what we have found is exactly that. Our data is trapped in these silos, where each department owns the data, and there is a manual paper process to request a report.

Requesting a customer report takes a long time, and what we have been able to do is try to expose that data through Web services using mashup type UI technology and data services to keep the data in the place that it belongs, without having a flat file flying between FTP servers, as you talked about, and start to show people data that they haven't seen before in an instant, consumable way.

Gardner: So, not even taking that further step of how this data might be used in an extended enterprise environment or across departmental or organizational boundaries, just inside your organization, as you are trying to modernize and free up the data, you are looking at this through the lens of availability, real time, lower cost and clip, print, and touch from IT personnel. What sort of technologies and approaches have you been taking in order to try to achieve that?

Svee: For the last year or so, we have been pushing an SOA initiative and we have been evaluating the WSO2 product line, since, maybe November. We have been trying to free up our data, as well as rethink the way all our current systems are integrated. We are growing fairly rapidly and as we expand globally it is becoming more and more difficult to expose that data to the teams across the globe. So we have to jump in and rethink the complete architecture of our internal systems.

Gardner: What is it about the architecture that has a bearing on the flexibility and agility you are looking for, but that also protects your sense of reduced risk, security, privacy, and access control?

Svee: Most of the data that we are dealing with is fairly sensitive, and therefore almost all of it has a need for at least per-user access basis, as well as, when we are transporting data, we will have to make sure that it's encrypted or at least digitally signed.
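The two requirements Svee names, per-user access and signed data in transit, can be sketched as follows. This is a minimal Python sketch under stated assumptions, not Concur's implementation: the `ACCESS` table, the `RECORDS` data, the key, and the function names are hypothetical. It also uses an HMAC, which is a shared-key message authentication code and not a public-key digital signature; it stands in here only because it is available in the Python standard library.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared key

# Hypothetical access table: which user may read which record IDs.
ACCESS = {"alice": {1, 2}, "bob": {2}}

# Hypothetical travel records keyed by ID.
RECORDS = {
    1: {"trip": "SEA-LHR", "cost": 900},
    2: {"trip": "SEA-SFO", "cost": 200},
}

def sign(payload: dict) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()

def verify(payload: dict, tag: str) -> bool:
    """Compare tags in constant time to detect tampering in transit."""
    return hmac.compare_digest(sign(payload), tag)

def fetch(user: str, record_id: int, records: dict) -> dict:
    """Return a signed envelope only if this user may see the record."""
    if record_id not in ACCESS.get(user, set()):
        raise PermissionError(f"{user} may not read record {record_id}")
    payload = records[record_id]
    return {"payload": payload, "tag": sign(payload)}

envelope = fetch("alice", 1, RECORDS)
print(verify(envelope["payload"], envelope["tag"]))
```

A receiver that holds the same key can call `verify` on the envelope; any change to the payload after signing makes the check fail. Encryption of the transport itself (for example HTTPS) is a separate concern that this sketch does not cover.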

Gardner: Now, it seems to me that this data will need to be available through a browser-based portal or application to the end users, but that the data is also going to play a role with back office systems, ledgers, and different accounting activities, as this travel and expense content needs to be rectified across the company's books.

Svee: The browser becomes the ubiquitous consumption point for this data, and we are able to mash up the data, providing a view into several different systems. Before, that was not possible, and the additional piece of moving the file between financial systems, for example, we are able to not have to pull files, but actually use Web services to send only the data that has changed, as opposed to a complete dump of the data, which really decreases our network bandwidth usage.
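Sending only the data that has changed, rather than a complete dump, can be sketched as a comparison of two snapshots. This is an illustrative Python sketch, not the mechanism Concur uses: the `changed_records` function, the snapshot layout, and the sample expense rows are hypothetical.

```python
def changed_records(previous, current):
    """Compare two snapshots keyed by record ID and return only the differences."""
    upserts = {key: row for key, row in current.items()
               if previous.get(key) != row}
    deletes = [key for key in previous if key not in current]
    return {"upserts": upserts, "deletes": deletes}

# Hypothetical expense rows keyed by ID.
yesterday = {1: {"amount": 120.0}, 2: {"amount": 45.5}, 3: {"amount": 10.0}}
today = {1: {"amount": 120.0}, 2: {"amount": 50.0}, 4: {"amount": 99.0}}

# Record 1 is unchanged, so it is not sent at all.
delta = changed_records(yesterday, today)
print(delta)
```

Only the modified row, the new row, and the ID of the deleted row cross the network, which is where the bandwidth saving comes from.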

Governor: There's even potentially a green argument in there. I mean, all of this batch is just kind of crazy and unnecessary. We see a lot of it. There is so much data duplicated everywhere. It seems like we, as an industry, are very good at just replicating and getting ridiculous redundancy, and not so good at synchronizing and really thinking about what data does need to be transported and working with that accordingly.

That sort of makes a lot of sense to me. It's very good to hear you are taking that approach. I think sometimes we miscall things SOA, when in fact what you are doing is kind of "suck and play." You take this thing, suck old things out, and then work on the new thing, as opposed to actually thinking about the data structures you need to enable the data to be useful and fit you.

Gardner: Let's go to Paul. Now, here is an instance where the organization has, I think, its feet in both camps. In the old style, there is accounting, the ledgers, and data extension across application sets from a common repository, and how to batch that in such a way that the data is all on the same page, so to speak, across these applications in a time frame.

We also need to take this out through Web applications to individuals and also across applications that are Web services enabled. So, it sounds like what we have here is a situation where the data needs to do many different tricks, not just a couple of old basic tricks.

What is it that WSO2 has done recognizing this kind of need in the market and is able to satisfy this need?

Fremantle: What we have built is what we call WSO2 Data Services, which is a component of our application server. The WSO2 Data Services component allows you to take any data source that is accessible through JDBC, such as MySQL databases, Oracle databases, or DB2, and, in addition, we also have support for Excel, CSV files, and various other formats, and very simply expose it as XML.

Now this isn't just exposed to, for example, Web services. In fact, it can also be exposed by REST interfaces. It can be exposed through XML over HTTP, and can even be exposed as JSON. JavaScript Object Notation makes it very easy to build Ajax interfaces. It can also be exposed over JMS and messaging systems.

So the fundamental idea here is that the database can be exposed through a simple mapping file into multiple formats and multiple different protocols, without having to write new code or without having to build new systems to do that. What we're really replacing there is, for example, where you might take your database and build an object relational map and then you use multiple different programming toolkits -- one Web services toolkit, one REST toolkit, one JMS toolkit -- to then expose those objects.

We take all that pain away, and say, "All you have to do is a single definition of what your data looks like in a very simple way, and then we can expose that to the rest of the world through multiple formats."
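The idea of a single definition exposed in several formats can be sketched in Python. This is not WSO2's mapping file format or API, which the discussion does not show: the `MAPPING` dictionary, the function names, and the `customers` table are hypothetical, and an in-memory SQLite database stands in for a JDBC data source.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET

# A hypothetical "mapping": one query plus the element names to use in XML.
MAPPING = {"query": "SELECT id, name FROM customers ORDER BY id",
           "root": "customers", "row": "customer"}

def fetch_rows(conn, mapping):
    """Run the mapped query on demand and return rows as dictionaries."""
    cursor = conn.execute(mapping["query"])
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

def to_json(rows):
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(rows)

def to_xml(rows, mapping):
    """Serialize the same rows as XML, one element per row and per column."""
    root = ET.Element(mapping["root"])
    for row in rows:
        element = ET.SubElement(root, mapping["row"])
        for column, value in row.items():
            ET.SubElement(element, column).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Stand-in data source with two sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)",
                 [("Ada",), ("Grace",)])

rows = fetch_rows(conn, MAPPING)
print(to_json(rows))
print(to_xml(rows, MAPPING))
```

Because `fetch_rows` queries the database each time it is called and nothing is cached in either output format, a change to the underlying table shows up in both the JSON and the XML on the next request.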

Gardner: When that data changes on the core database, those changes are then reflected across all of these different avenues, channels, and approaches it's sharing. Is that correct?

Fremantle: Absolutely, because it's accessing the data on demand and then exposing it as needed through whichever format they ask for. So, it's not storing those data formats in its own system.

Governor: One of the things that I really like about this story is that we went through a period where there was a view that everything needed to be done with the WS stack, and the only way to do SOA, the only way to do data integration, was to use these large-scale Web standards. But they're not critical in all cases, and it really depends on your requirements for security and so on. Do you really need SOAP and some of the heavier weight protocols and technology?

I think that the approaches that say, "Let's understand is this behind the firewall? What are the levels of protection that are required?" "Can we do this in a simpler fashion?" are very valuable. The point about JSON, for UI related stuff, certainly REST kind of interfaces, but at the end of the day it's a question of, do you have developers that are available out there in your shop or to hire that are going to be able to do the work that's required and some good examples that came out of the Web world?

If you look at eBay, they had a SOAP API, but nobody used it. A great number, or 80 percent plus, of the calls were using RESTful styles. Understanding the nature of your problem and having more flexibility is very, very important.

Gardner: One of the things that I really like about this is that it's almost like Metcalfe's Law. The more participants there are on the network, the more valuable it is. The more people and systems and approaches to distributing data, the more valuable the data becomes. What's been nice is that we've elevated this distribution value with data, at the same time that open source and community-based development have become much more prominent.

That means that the ways in which the data is shared and transferred is not just going to be dependent upon a commercial vendor's decision about which standards to support, but we can open this up to a community where even very esoteric users can get a community involvement to write and create the means for sharing and transferring.

The data can take on many more different integration points, and standards can evolve in new and different ways. Let's discuss a little bit, first with Paul, about the role of open source, community, and opening up the value of data.

Fremantle: I am just a fanatic about open source and community. I think that open source is absolutely vital to making this work, because fundamentally what we're talking about is breaking down the barriers between different systems. As you say, every time you're pushing a proprietary software solution that isn't based on open standards, doesn't have open APIs, and doesn't have the ability to improve it and contribute back, you're putting in another barrier.

Everyone has woken up to this idea of collaboration through Web 2.0 websites, whether through Flickr or FaceParty or whatever. What the rest of the world is waking up to is what open-source developers have been discovering over the last five to ten years. Open source is Web 2.0 for developers. It's how do you collaborate, how do I put my input, my piece of the pie? It's user-generated content for developers, and that power is unbelievable. I think we're going to see that grow even more over the next few years.

Governor: I fundamentally agree with it. Open source was an application of a pattern. Open source was the first real use case for this need for a distributed way of working, and we're certainly seeing that broadened out. People are getting a much, much better understanding of some of the values and virtues of open approaches of exposing data to new sources.

Very often, you will not get the insight, but someone else will, and that sort of openness and transparency, and that's one of the key challenges -- actually just getting organizations to understand some of the value of opening up their data.

I think it is one thing to have the tools to see that. Another is that we all now are beginning to see organizations kind of get it. Certainly, "How do we syndicate our information?" is a really key question. We are seeing media companies ask themselves exactly that. "Do we have an API? How do we build an API? Where do we get an API, so that people can syndicate the information that we have?"

I suppose I'm just double-clicking on what Paul said -- that passion is something that is becoming more and much better understood. Reuters is realizing it has to have an API. The Guardian, which is a British newspaper -- and those Americans certainly of the leftward persuasion are very familiar with it -- now has a team that is also presenting at Web conferences and talking about the API. We've got to think about how to make data more available, and open source will just be the first community to really understand this.

Gardner: I'd like to bounce this off of Brad at Concur. Do you feel a little bit less anxious, or more at ease, knowing that whatever data needs that you have for the future, you don't have to wait for a vendor to come up with the solution? You might be able to go and explore what's available in a community, or if it's not available, perhaps write it yourself or have it written and contribute it back to the community. It seems to me that this would be something that would make you sleep better at night -- that an open-source and community-based approach to data services deliverability gives you more options.

Svee: I personally love open source. I think that it is the movement that's going to fix software and all these proprietary systems. I think that my small team, four developers and myself, would not be able to produce the kind of quality products internally that we're essentially asked to do, without being able to stand on the shoulders of a lot of these geniuses out there who are writing amazing code.

Gardner: Do you agree that there is this sense that you can almost future-proof yourself by recognizing, as you embrace open source, that you're not going to get locked in, that you're going to have flexibility and opportunity in the future?

Svee: Exactly. I find that there are a few products that we have that we've been locked into for quite some time. It's very difficult to try to move forward and evaluate anything new, when we're locked into something that's proprietary and maybe not even supported anymore. With the open-source community out there, we're finding that the answers we get on forums and from mailing lists are every year getting faster and better. More people are collaborating, and we're trying to contribute as much as we can as well.

Gardner: And, of course, over the past several years, we've seen a tremendous uptake in the use of open-source databases and sources from MySQL, Ingres, Postgres, and there are others. Let's bounce this back now to the WSO2 product set. What is it about, when you are developing your products, Paul, that open source becomes an enabler, as well as, in a sense, a channel into the market?

Fremantle: What was interesting about us developing this data services solution was the fact of what we built on top. The data services component that we built actually took us very little time to get to its first incarnation, and obviously we are constantly improving it and adding new capabilities.

We were working on that and it didn't take time, but the very first prototype of this was just a piece of work by one of our team who went out and did this. What enabled that really was the whole framework on which it was built, the Axis2 framework, the app server that we built, and that framework is built on the work of literally hundreds of people around the world.

For example, if we talk about the JMS support, that was a contribution by a developer to that project. The JSON support was a contribution by another developer and relied on the JSON library written by someone else. The fact that we can choose the level of encryption and security from HTTPS all the way up to full digital signatures relies on the work of the Apache XML security guys who have written XML security libraries. That's an incredibly complex piece of work and it's really the pulling together of all these different components to provide a simple useful facility.

I think it's so amazing, because you really stand on the shoulders of giants. That's the only way you can put it. What I like about this is to hear Brad say that he is doing the same, we are doing the same, and all around there is a value chain of people doing small contributions that, when put together, add up to something fantastic. That's just an amazing story.

Gardner: Given that there are many approaches that Brad, as a user organization, is undertaking, and they dovetail somewhat with what you are doing as a supplier, we also have other suppliers that are embracing open source increasingly and building out product sets that have originated from technology that was contributed or project format or license. How do these pieces come together, if we have a number of different open-source infrastructure projects and the products? I'm thinking about perhaps an ESB, and your data solution, and some integration middleware. What's the whole that's greater than the sum of the parts?

Governor: I certainly have some pretty strong opinions here. I think we can learn a lot from the ecosystems as well. One of the absolutely key skills in open source, as a business, is packaging. Packaging is very, very important to open source, and pulling things together and then offering support and service is a very powerful model.

It's really nothing new. If we look at personal computers, you can go out and buy yourself chips from AMD or Intel, you can buy an OEM version of Windows or choose to go with Linux, you can buy RAM from another company, you can buy storage disks from another company, and kind of glom it all together.

But, as that industry has shown us, it really makes a lot more sense to buy it from a specialist packager. That might be Dell, HP, or others. I think that open-source software has really got some similar dynamics. So, if you want an Eclipse IDE, you are likely to be buying it from an IBM or a Genuitec or CodeGear, and a couple of those are our clients. I should disclose that.

In this space we've got the same dynamics. If you are, for example, a Web company, and you don't want to be paying these third parties to do that packaging for you, fine. But, for the great mass of enterprises, it really doesn't make that much sense to be spending all your time there with glue guns, worrying about how pieces fit together, even in Eclipse, where it is a very pluggable architecture.

It makes a great deal of sense to outsource that to a third party, because otherwise it's really a recipe for more confusion, I would argue. So yes, you can do it yourself, but that doesn't necessarily mean you should. The PC example, yes, for a hobbyist or someone who wants to learn about the thing, absolutely, build your own, roll your own. But, for getting on with things in business, it does make sense to work with the packager that's going to offer you full service and support.

Fremantle: I've got to jump in here and say that's exactly our model. We don't just offer the data services. We offer an ESB, a mashup server, and an SOA registry, and we make sure all those things work together. The reality is that there are a lot of programmers out there who are hobbyists, so there are a lot of people who do like to take individual components and pieces and put them together, and we support both of those equally. But I think your analogy of the PC market and that plug and play model is absolutely like open source and specifically open-source SOA. We all focus very much on interoperability to make sure that our products work together.

Open source drives this market of components, and it's exactly the same thing that happened in the PC market. As soon as there was an open BIOS that wasn't owned by a single company, the world opened up to people being able to make those components, work in peace and harmony, and compete on a level playing field. That's exactly where the open-source market is today.

Gardner: So how about that, Brad? Do you really like the idea that you can have a package approach, but you can also shake and bake it your own way?

Svee: That's exactly the sweet part in my opinion. I can shake and bake, I can code up a bunch of stuff, I can prototype stuff rapidly, and then my boss can sleep well at night, when he knows that he can also buy some support, in case whatever I cook up doesn't quite come out of the oven. I see there's a kind of new model in open source that I think is going to be successful because of that.

Gardner: Okay, now we have seen some very good success with this model: have it your way, if you will, on the infrastructure level. We are moving up into data services now. It seems to me that this also sets us up to move an abstraction higher into the realm of data portability. Just as we are seeing the need in social networks, where the end user wants to be able to take their data from one supplier of a social networking function to another, I think we are going to start to see more of that in business ecologies as well.

A business will not want to be locked into a technology, but it also doesn't want to be locked into a relationship with another supplier, another business. They want to be able to walk away from that when the time is right and take their data with it. So, maybe we'll close out our discussion with a little blue-sky discussion about this model taking a step further out into the cloud. Any thoughts about that, Paul?

Fremantle: I think that's a really interesting discussion. I was at a conference with Tim O'Reilly about two years ago and we were having exactly this discussion, which is that openness of services needs to be matched by openness of data. We are definitely seeing that in the Web marketplace through back-end systems like Amazon S3 storage, and we are beginning to see a lot of other people start to jump on this and start to build open accessible databases.

I think that's an absolutely fantastic usage for this kind of data service, which is to say, "It's my data. I don't just want to host things in an open fashion. I don't want to write code in an open fashion. I want open services and open data, so I can get it, move it, protect it myself, and relocate it."

So, I think there's a really interesting idea behind this, which is, once we get to the point where your data is no longer tied to a specific system and no longer has to be co-located with a particular MySQL database, we start to free up that processing. If you look at what Amazon did with the Elastic Cloud Service and their storage system, the storage system came first. The data services were a precursor to having an effective cloud-computing platform. So, it's really a precursor. You have to have data services, before you can start to migrate your processing and scale it up in this fashion.

Gardner: What do you think, James? Is this something that will be in great demand in the market, and is there also a green angle here?

Governor: Yeah, I think undoubtedly it will. Simon Phipps from Sun talks about the freedom to leave. We had a big example recently, Comcast buying Plaxo. They have lost a lot of the users. A lot of Plaxo users just closed up their account there. Interestingly enough, Plaxo had a nice function to do that -- very good for closing the account, not so good for exporting the data. I am not so sure the problems are primarily technical. I think there is a great deal of policy and social problems that we are going to have to deal with.

It's very interesting to me that we call people heroes that are trying to break Facebook terms of service, in some cases with the recent data portability example. We've got some really key challenges about what does data ownership mean. From my perspective, as I said earlier, I think it's very important that we have the mechanisms whereby we have access to data without necessarily allowing replication of it all over the place.

If it is your data, then yes, by all means, you should have permission to take a copy of it. What about if you're on a network and you want to take all the data and all of the surrounding metadata? Really, the discussion becomes about that metadata. Am I allowed to get anything back from Google about my behaviors and other people's behaviors?

It's really a social question, and we, as a society or a number of different societies, have got to think about this, and what we want from our data, what we want from privacy, and what we want from transparency. We can gain wonderful things, I mean wonderful advantages, but there is also the flip side, and I think it's very important that we keep that in mind.

So, it's going to be a wild ride. It's exciting, and I think that it is important that we get the tools in place, so that once we get the policies well understood, we can actually begin to do things more effectively. So, again, it's very exciting, but there are a lot of threats and a lot of risks that we do need to take account of. Those risks are expanded, as I say, by what I sometimes call "information bulimia." This is the notion that we just keep eating and swallowing more and more information and more data and we need more information, and if you do that, what you end up doing is puking it all up.

Gardner: Let's close here with that real-world perspective. Brad, aside from the visual image of puking, does the idea of third-party, neutral, cloud-based data interest you, and does it have any bearing on your real-world issues?

Svee: Well, I can give you an example of what we were able to do with data services. Within a matter of weeks, not even months, we were able to use the data services in the application server from WSO2 to essentially give a complete client picture to the business by reaching into the ERP system, pulling out invoices and products, and then reaching into the CRM system to pull out open issues, as well as the sales manager and probably about 50 data points about each customer, and then exposing those services through a simple JSON-based UI with a smart type-ahead for the customer name. Quickly, we were able to show a picture of our clients that hadn't previously been available -- and within a matter of weeks, actually.
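To make the pattern Svee describes more concrete, here is a minimal sketch in Python. It is not Concur's implementation and does not use the WSO2 API; the in-memory dictionaries stand in for the ERP and CRM data services, and every name in it (ERP_INVOICES, CRM_RECORDS, customer_view, type_ahead, customer_view_json) is hypothetical. It only illustrates merging two sources into one client picture, serializing it as JSON, and matching customer names by prefix for a type-ahead box.

```python
import json

# Hypothetical stand-ins for the ERP and CRM data services. In a real
# deployment each lookup would be a call to a separate data service.
ERP_INVOICES = {
    "C001": [{"invoice": "INV-1", "product": "Expense", "amount": 1200.0}],
    "C002": [{"invoice": "INV-2", "product": "Travel", "amount": 800.0}],
}
CRM_RECORDS = {
    "C001": {"name": "Acme Corp", "sales_manager": "J. Doe",
             "open_issues": ["Login failure"]},
    "C002": {"name": "Acorn Ltd", "sales_manager": "A. Roe",
             "open_issues": []},
}


def customer_view(customer_id):
    """Merge CRM and ERP data for one customer into a single record."""
    crm = CRM_RECORDS[customer_id]
    return {
        "customer_id": customer_id,
        "name": crm["name"],
        "sales_manager": crm["sales_manager"],
        "open_issues": crm["open_issues"],
        "invoices": ERP_INVOICES.get(customer_id, []),
    }


def customer_view_json(customer_id):
    """Serialize the merged record as JSON for a browser-based UI."""
    return json.dumps(customer_view(customer_id), sort_keys=True)


def type_ahead(prefix, limit=10):
    """Return customers whose name starts with the typed prefix."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = [
        (record["name"], customer_id)
        for customer_id, record in CRM_RECORDS.items()
        if record["name"].lower().startswith(needle)
    ]
    return [
        {"customer_id": customer_id, "name": name}
        for name, customer_id in sorted(matches)[:limit]
    ]
```

In this sketch the UI would call type_ahead as the user types a customer name, then request customer_view_json for the selected customer.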

Gardner: That data could have come from any number of different sources if, to James' point, you had the proper permissioning?

Svee: Yeah, and since we are IT and we own the systems, we are able to determine who is who, and we were able to use a Web service, another data service into our HR system, to pull out roles to see whether or not you could access that information.
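The role check Svee mentions can be sketched along the following lines. This is an assumption-laden illustration in Python, not the actual HR data service: HR_ROLES stands in for the HR system, ALLOWED_ROLES is an invented policy, and the function names are hypothetical. The point is only that the caller's roles are looked up first and the customer data is fetched only if one of those roles is permitted.

```python
# Hypothetical stand-in for a data service in front of the HR system.
HR_ROLES = {
    "alice": {"sales", "finance"},
    "bob": {"engineering"},
}

# Assumed policy: only these roles may see the customer picture.
ALLOWED_ROLES = {"sales", "finance"}


def roles_for(user):
    """Look up a user's roles; unknown users have no roles."""
    return HR_ROLES.get(user, set())


def can_view_customer_data(user):
    """True if the user holds at least one permitted role."""
    return bool(roles_for(user) & ALLOWED_ROLES)


def fetch_customer_view(user, customer_id, loader):
    """Call loader(customer_id) only after the role check passes."""
    if not can_view_customer_data(user):
        raise PermissionError(f"{user} may not view customer {customer_id}")
    return loader(customer_id)
```

Passing the loader in as a parameter keeps the permission check separate from whichever data service actually supplies the customer record.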

Gardner: That's highly valuable from a productivity and planning perspective. If you are a business strategist, that's precisely the kind of information you want?

Svee: Exactly, and they were amazed that they had been able to live their lives without it for so long.

Gardner: Paul, do you think much of this common view business, when it comes to data services?

Fremantle: Actually, we are working on another project with a health-care provider, which is providing a single patient view. So, it's exactly the same kind of scenario with significant security and encryption and data challenges to make sure that you don't provide the wrong information to the wrong person. Obviously, all the same issues need to be solved, and being able to pull together everything that is known about a patient from multiple different systems into a single view once again has huge value to the organization.

Gardner: Well, this has to be our swan song on this particular podcast. We are out of time. I want to thank our guests for helping us get into a nice far-reaching discussion about data services, what the problem set has been, what the opportunity is, and how at least one organization, Concur, is making some good use of these technologies. We have been joined by Paul Fremantle, chief technology officer at WSO2. Thank you, Paul.

Fremantle: Thank you, it has been great fun.

Gardner: I also strongly appreciate your input, Brad Svee, IT manager of development at the Redmond, Wash.-based Concur. Thank you, Brad.

Svee: Well, thank you.

Gardner: And always, thank you, James Governor from RedMonk for joining. We appreciate your input.

Governor: Thank you much. It has been an interesting discussion.

Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You have been listening to a sponsored BriefingsDirect Podcast. Thanks and come back next time.


Transcript of BriefingsDirect podcast on data services, SOA and cloud-based data hosting models. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.

Friday, February 08, 2008

New Eclipse-Based Tools Offer Developers More Choices, Migrations and Paths to IBM WebSphere

Transcript of BriefingsDirect podcast on Eclipse-based tool choices for IBM WebSphere shops.

Listen to the podcast here. Sponsor: Genuitec.


Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you’re listening to BriefingsDirect. Today, a sponsored podcast discussion about choices developers have when facing significant changes or upgrades to deployment environments. We'll be looking at one of the largest global installed bases of application servers, the IBM WebSphere platform.

Eclipse-oriented developers and other developers will be faced with some big decisions, as their enterprise architects and operators begin to adjust to the arrival of the WebSphere Application Server 6.1. That has implications for tooling and infrastructure in general.

The platform depends largely on the Rational Application Developer (RAD), formerly known as the WebSphere Studio Application Developer. This recent release is designed to ease implementations into Services Oriented Architecture (SOA) and improve speed for Web services.

However, the new Rational toolset comes with a significant price tag and some significant adjustments. Into this changeable environment, Genuitec, the company behind the MyEclipse IDE, is offering a stepping-stone approach to help with this WebSphere environment tools transition.

The MyEclipse Blue Edition arrives on March 15, after a lengthy beta run, and may be of interest to developers and architects in WebSphere shops as they move and adjust to the WebSphere Application Server 6.1.

To help us understand this transition, the market, and the products, we are joined by Maher Masri, president of Genuitec. Welcome to the show, Maher.

Maher Masri: Thank you, Dana.

Gardner: Also, James Governor, a co-founder and industry analyst at RedMonk. Welcome, James.

James Governor: Hi, Dana.

Gardner: James, let’s start with you. We’re looking at a pretty dynamic marketplace around tools. There are certainly lots of different frameworks and approaches floating around. Folks are dealing with SOA, with Software as a Service (SaaS), with agile development. They are dealing with mashups and Enterprise 2.0 issues. We’re seeing increased use of REST and SOAP. This is just a big, fluid, dynamic environment.

On the other hand, we’re also seeing some consolidation around runtimes. Organizations are looking to cut cost and infrastructure and are trying to bring their data centers under as few runtime environments as possible. So, we’re left with somewhat of a conundrum, and into this market IBM is introducing a major upgrade.

Maybe you could paint a picture for us of what you see from the enterprises and developers you speak to on how they deal with, on one hand, choice and, on the other hand, consolidation.

Governor: It's a great question. In this industry we can expect continuing change. If anything is certain, it's that. When we look at this marketplace, if we go back a couple of years into the late 1990s, there was a truism that you could not make money as a tools company. The only way you could really sustain a business would be connected to, and interwoven with, the application server and the deployment environment. So it's interesting that now, sometime later, we’re beginning to rethink that.

If you look at a business like Genuitec, the economics are somewhat different. The Eclipse economics, in terms of open source and the change there, where there is a code base being worked on, have meant that it's actually easier to maintain yourself as an independent and work on a specific set of problems.

In terms of your question about Web 2.0, agile development, and so on, there are an awful lot of changes going on. That does create some opportunities for the third parties. Frankly, when you look at the very largest firms, it's actually quite difficult for them to maintain the sorts of innovation that we’re seeing from some of the smaller players.

In terms of the new development environments, it might be something like the fact that we’re seeing more Ruby on Rails. P scripting languages continue to be used in the enterprise. So, supporting those is really important, and you are not always going to get that from the lead vendors.

I'll leave it up to Genuitec to pitch what they do, but one of the interesting things they did, which you certainly wouldn’t have seen from IBM, was a while back, when they bridged the Eclipse world with the NetBeans Matisse GUI-building application development set.

Crossing some of those boundaries and being able to deal with that complexity and work on the customer problems, it's not surprising to me that we’ve seen this decoupling, largely driven by open source. Open source is re-enabling companies to focus on one thing, rather than saying, "Okay, we've got to be end-to-end."

Gardner: So, we've got a dynamic environment. We have some amazing uptake in Eclipse over the past several years, becoming a dominant Java-oriented IDE. We have WebSphere as the dominant deployment platform.

As you pointed out, the economics around tools have shifted dramatically. It seems that the value add is not so much in the IDE now, but in building bridges across environments, making framework choices easier for developers, and finding ways of mitigating some of these complexity issues, when it comes to the transition on the platform side.

Let’s go to Maher. Tell me a little bit about why Eclipse has been so successful, and do you agree that it's the value add to the IDE where things are at right now?

Masri: Let me echo James’ point regarding the tools environment, and software companies not being able to make money at that. I think that was based on some perceived notion that people refuse to pay money for software. In fact, what we've found is that people don’t mind paying for value, and perceived value, when it’s provided at their own convenience and at their own price point.

That’s why we set the price for the MyEclipse Enterprise Workbench at such a low point that it could be purchased anywhere in the world without a series of internal financial company decisions, or even a heartbreaking personal decision.

Although the product was just the JSP editor when it was first launched, today it's a fully integrated development environment that rivals any Tier 1 product. It's that continuity of adding value continually with every release, multiple releases within the same year, to make sure that, a) we listen to our customer base, and b) they get the value that they perceive they need to compensate for the cost that we charge them.

Eclipse obviously has become the default standard for the development environment and for building tools on top of it. I don’t think you need to go very far to find the numbers that support those kinds of claims, and those numbers continue to increase on a year-to-year basis around the globe.

When it started, it started not as a one-company project, but a true consortium model, a foundation that includes companies that compete against each other and companies in different spaces, growing in the number of projects and trying to maintain a level of quality that people can build upon to provide software on top of it from a tools standpoint.

A lot of people forget that Eclipse is not just a tools platform. It's actually an application framework. So it could be, as we describe it internally, a floor wax and a dessert topping.

The ability for it to become that motherboard for applications in the future makes it possible for it to move above and beyond a tools platform into what a lot of companies already use it for -- a runtime equation.

The next Ganymede 3.4 and the 4.0 extension of Eclipse are pushing it in exactly that direction. The OSGi adoption is making a lot of people reconsider their thought in terms of, "What application do I write for productivity applications internally, for tools that I provide to my internal and external customers, for which client implementations?"

It's forcing quite a bit of rethinking in terms of the traditional client/server models, or the Web-only application model, because of productivity requirements and so on.

IBM was the company that led the way for all of the IBM WebSphere implementation and many of their internal implementations. A lot of technologies are now based on Eclipse and based on Eclipse runtime.

Gardner: So, we have this big bear, Eclipse, in the market and we have this big bear, WebSphere, in the market. Why is there a need for someone like you to come in between and help developers?

Masri: The story that we hear internally from our own customers is pretty consistent, and it starts with the following. "We love you guys. You provide great value, great features, great support, except I cannot use you beyond a certain point." Companies for whatever internal reasons, from a vendor standpoint, are making the choices today to move forward with WebSphere 6.1, and that’s really the story we keep hearing.

"I am moving into 6.1, and the reason for that is I am re-implementing or have a revival internally for Web services, SOA, rich Internet applications, and data persistence requirements that are evolving out of the evolution of the technology in the broader space, and specifically as implemented into the new technology for 6.1."

Gardner: They need to modernize it.

Masri: But their challenge is similar. Every one of them tells us exactly the same story. "I cannot use your Web service implementation because, a) I have to use the Web services within WebSphere or I lose support, and b) I have invested quite a bit of money in my previous tools like WebSphere Application Developer (WSAD), and that is no longer supported now.

"I have to transition into, not only a runtime requirement, but also a tools requirement." With that comes a very nice price tag that not only requires them to retool their development and their engineers, but also reinvest into that technology.

But the killer for almost all of them is, "I have to start from scratch, in the sense that every project that I have created historically, my legacy model, I can no longer support because of the different project model that’s inside."

For example, Rational 7.0 is one of only a few tool versions that supports WebSphere 6.1 and all of the standards for Web services, AJAX support, and persistence that they need to modernize. They have to adopt it, but they cannot take, for example, an existing WSAD project, import it into Rational 7.0, and continue development. They pretty much start from scratch.

Gardner: Let’s go to James for a moment. James, you’re familiar with the IBM stack and their road map. Why are they doing this? It seems to me that there is an application lifecycle management (ALM) set of benefits that the Rational toolset and platform bring that IBM is trying to encourage people to take advantage of. It does require transition, but they have a larger goal in mind. Perhaps we should address this ALM, or do you have other thoughts about this transition?

Governor: From an IBM perspective, it’s a classic case of kind of running ahead of the stack. If you see the commoditization further down the stack, you want to move on up. So IBM looks at the application developer role and the application development function and thinks to itself, "Hang on a second. We really need to be moving up in terms of the value, so we can charge a fair amount of money for our software," or what they see is a fair amount of money.

From an IBM standpoint, I think they really looked at players such as Genuitec, looked at where Eclipse was going, and they thought, "Wait a second. We really do need to be moving forward with this notion of software development."

If you talk to a lot of developers, they don’t really think of the world that way, but many of their managers do. So, the idea of moving to a situation where there is better integration of the different datasets, where you've got one repository of metadata moving forward with that kind of stuff, that’s certainly the approach they are taking.

The idea is you've got "auditability," as you build applications. You’re going from a classic distributed development, but you’re doing a better job of centralizing, managing, and maintaining all the data that’s associated with that.

The fact that IBM is making that change is indicative of the fact that when they look at the market more broadly, they think to themselves, "Well, where is our margin coming from?"

IBM’s strategy is very much to look at business process as opposed to the focus on just a technical innovation. That certainly explains some of the change that's being made. They want to drive an inflection point. They can't afford to see orders-of-magnitude cheaper software doing the same thing that their products do.

Gardner: As we mentioned earlier, there are so many complexities involved in decision making now, different approaches to creating services, that the operators and the vice presidents of engineering are saying, “Wow, we need to manage this complexity.”

They are looking for life cycle approaches, ways of bridging design time and runtime. IBM is addressing some of these needs, but, as you point out, developers are often saying, "Hey, I just want my tool. I want to stick with what I know." So we’re left with a little bit of a disconnect.

I’m assuming, Maher, that this is where you’re stepping in and saying, "Aha, perhaps we can let the developers have it their way for a time to mitigate the pain of the transition, at the same time recognizing that these vice presidents of engineering and development are going to need to look at a much more holistic life-cycle approach. So, perhaps we can play a role in satisfying both." Am I reading too much into that?

Masri: No. We understand internally that different technologies have different adoption life cycles behind them. ALM is no different. It’s going to take a number of years for it to become the standard throughout the industry, and it is the right direction that almost every company is going to have to face at some time in the future.

The challenge for everybody, us and IBM, is the bottom-up sale process, to provide the tools and the capabilities for companies to embrace, for people to embrace those technologies, and, at the same time, putting the infrastructure in place for managers to be able to continue to manage projects into success.

Our decision is very simple. We looked at the market. Our customers looked back at us and basically gave us the same input: "If you provide us this delta of functionality -- specifically, if you’re able to make my life a little easier in terms of importing projects that exist inside of WebSphere Application Developer into your tool environment, if you can support the Web services standard that’s provided by WebSphere,

"if you can integrate better with ClearCase from a code management standpoint, and if you could provide a richer deployment model into WebSphere so my developers could feel as if they’re deploying it from within the IBM toolset -- I don’t have the need to move outside of your toolset. I can continue to deploy, develop and run all my applications from a developer's standpoint, not from an administrator's."

Obviously if you are an administrator and have one to three people within the company that maintain a runtime version of WebSphere, you will need specific tools for that. We’re not targeting those one to three people. We’re targeting the 10 to 500 developers internally that need to build those applications. That’s really where Blue is coming from.

Governor: Maher, can you be a little bit more specific about it? You just used the bottom-up versus top-down distinction in your argument. Can you talk a little bit more about that and your sales staff?

Certainly, from RedMonk’s standpoint, we do tend to be more aligned with the bottom-up, just in terms of our customer and community base. But, in terms of what you’re seeing and saying, how is what you do different from IBM? I didn’t quite get that from your last comments.

Masri: I'll give you a very simple example. Just take the experience of a developer installing MyEclipse or installing RAD from ground zero. MyEclipse, you can install in a two-megabyte root install. It installs a 600-megabyte version on your desktop that contains all the tools. You no longer need to buy additional tools from somewhere else. If you need to do UML development, if you need to do UI design, all that is included as one bundle within MyEclipse.

If you install RAD, you need a multi-DVD set, six or seven gigabytes, I understand, just to begin the installation. The configuration is a nightmare. Everyone is telling us that it's a very difficult configuration process just to get started.

MyEclipse is part of a very rich, simple profile that a user can download directly through the MyEclipse site or through our managed application environment inside of Pulse. You can be up and running with tools, with runtime configurations, and with examples, literally within minutes, as opposed to within hours or days beyond that.

On the issue of simplicity, the feedback that we keep getting is that our response level in terms of request for features, request for innovations, request in the technologies, we can deliver within months, as opposed to years or multi-months, when looking at the competition. All of that becomes internalized from the developer standpoint into, "I like this better, if it can bridge that gap that I now have to use this technology, in order to satisfy my business requirements."

Gardner: Perhaps another way of asking a similar question is: you are in beta now. You’re going to be coming out on March 15 with MyEclipse Blue Edition. What's the difference between MyEclipse and MyEclipse Blue Edition?

Masri: Excellent point. MyEclipse Blue Edition is inclusive of all MyEclipse professional features. It’s roughly on the order of 1,000 to 1,500 features above and beyond what the Eclipse platform provides, as well as the highly targeted functionalities that I mentioned. It can import and manage an existing project that you had previously inside WebSphere Application Developer and can develop to the Web services SOA standards that are specified in the WebSphere runtime.

It has much better integration into IBM code management, ClearCase technology, and almost an identical implementation of what you possibly could see inside Rational for deployment model and the ability to debug an existing project or a new project into the runtime environment.

Gardner: Developers, of course, are hard to come by in a lot of regions around the globe. There’s a lot of competition. Organizations like to keep their developers happy and productive. At the same time, they need to deal with some of the complexity issues of moving to SOA. If they're WebSphere shops, they know that they are going to be tied into that for some period of time. It does sound like you are trying to give both of these parties something to be a little bit cheery about.

Governor: One of the things that I think is important about open source, and understanding open source in the enterprise but also more broadly, is that sometimes you can think about open source as a personal trainer for proprietary software companies. You've got these fat, flabby toys and they need to get a life. They need to get on the treadmill. They need to get thinner and more agile. They need to get more effective. Frankly, it was ever thus with IBM. IBM is a pretty big beast.

Let me go back to the old mainframe times to think about Amdahl as a third party. When the IBM salesperson came in, you always made sure you had an Amdahl mug on the desk, right in front of the salesperson. Obviously, we’re a few years on now, but that dynamic remains important. As much as organizations balance BEA WebLogic and WebSphere against one another, or WebLogic and JBoss Application Server against one another, you would also want a balance in your toolsets.

One interesting thing here is that because you've got the specificity around WebSphere, and the sort of value prop the third party is putting forward, you're able to start that balance, that conversation to drive innovation, to drive price down. That’s one of the really useful things that Eclipse has enabled and delivered in the marketplace. It helps to keep some of the bigger vendors honest.

Gardner: So, the need to support heterogeneity is going to remain in both tools and runtime, but we’re also facing the time when heterogeneity is going to include hybrid approaches to deployment. And so, we’re seeing more people interested, particularly if they are ISVs or perhaps small- to medium-size businesses, in taking advantage of some of these cloud-computing options. I'm thinking of course of Amazon and some others. Tell us, Maher, how this choice in tool and heterogeneity plays into some of these hybrid approaches of deployment in a cloud of some sort.

Masri: Let me expand on James’ point and then I’ll add to it. I just want to make sure that we’re not trying to present MyEclipse Blue as if we are trying to compete with IBM, which really could be easily perceived there. What we see is an under-served market and people that are trying to make the decision, but cannot afford to make that decision.

There are companies that are always going to be a pure IBM shop and no one is going to be able to change their mind. The ability to provide choice is very important for those that need to make that decision going forward, but they need some form of affordability to make that decision possible. I believe we provide that choice in spades in our current pricing model and our ability to continue to support without the additional premium above that.

Going forward, I fully agree with you that the hybrid model is very interesting, and we see it in the way that companies come back to us with very specific feedback on either MyEclipse or our Pulse product. There's quite a bit of confusion out there, in terms of how Web 2.0, Rich Internet Application (RIA), and Rich Client Application are designed and geared to provide and all the underlying technology to support that in terms of runtime.

There seems to be a dichotomy. I could go in the Web 2.0 world and provide a very rich, all Web enabled, all Web centric technologies for my end-users because I need to control my environment. The other side of that is the rich client application, where I have to have some form of a rich client implementation with full productivity applications for certain people, and I have to divorce the two because there is no way I can either rely on the Web or rely on the technologies or rely on anything else.

Everyone that we’ve talked to so far has a problem with that model. They have to have some form of very strong, rich implementation of not necessarily a very fat client, but some form of a client on the end-user’s desktop. They need to be able to control that, whether you are using very specific implementation of Web Services, talking to somebody else’s Web services, need to use a very specific persistent architecture, or have to integrate with other specific architectures. It gets very dicey very quickly.

That’s really where we saw the future of the market. This is probably not the right time to talk about this specifically, since the topic is Blue, but that’s why we also moved into the managed-application space and into our other product line called Pulse. This is for end-users who are using Eclipse-based technology right now, and in the future far more than that. They'll be able to assemble, share, deploy and manage a stack of applications, regardless of where those applications reside and regardless of the form of technology itself.

Take, for example, a rich-client runtime of Eclipse running on someone’s desktop. All of a sudden, you have a version of software that you can deploy and manage, but it already has an interface into a browser. You can provide other Web 2.0 and RIA models, as well as other rich Internet technology, such as Flex and Flash. These technologies are merging very quickly, and companies have to be right there to make sure they meet those growing demands.

Gardner: It sounds like you're really talking about risk mitigation, trying to find some focal point that allows you to support your legacy, move to the rich-client and SOA activities, as well as be ready to go to what some people call Web Oriented Architecture, and take advantage of these new hybrid deployment options. Does that sound like what you're doing?

Masri: That's a fair statement.

Gardner: James, is this something that we can expect to shake out soon, or are companies going to be dealing with heterogeneity -- not just in terms of technology, but in approaches -- for some time?

Governor: We actually see an acceleration in this area -- tools and apps that span clients and the Web. I’ve taken to calling it the "synchronized Web." How can you have two different sets of services talk to one another? In terms of how you develop in that environment, you’ve got to develop conversationally. It’s about message passing. Because of that, we all are going to see some changes around the language choices.

We're seeing some interest in terms of some interesting new development languages, such as Erlang and Haskell. We are certainly seeing interest from developers in those areas.

It's like enterprise software companies not having an open-source strategy. Basically, you need one. From an economic standpoint, you just don't have a choice. Any software company that doesn’t have a thorough-going strategy for understanding and developing both for Web modes and offline modes is really missing the point.

Whether we're thinking of our clients that come from Google Gears, or whether we are thinking about offline clients using an environment like the Adobe Integrated Runtime (AIR), formerly code-named Apollo, we're already thinking about spanning clients and websites.

From an enterprise standpoint, the same choices need to be made. User expectations now are that they are going to be able to have some of those benefits and centralization, but they are also going to be able to have rich experiences that they're used to on desktop clients.

This is a very important transition and, whether it’s Pulse or any number of the Web apps we're seeing this from, we are definitely seeing this in enterprise Web development. It's really important for us to be thinking about the implications, in terms of the language support and in terms of runtime. We've already mentioned the Amazon Web Services back end. We're going to be seeing more and more of that stuff.

There’s a little company called Coghead, and it’s really focused on these kinds of areas and it’s now excellent. They've chosen Amazon Web Services as a back end and they've chosen Adobe Flex as a front end to give that interactivity. The Amazon model teaches, or should teach, a lot of software companies some important lessons. When I look at developers, certainly grassroots developers, it has almost become a badge of honor that you're getting, "This is what Amazon charged me this week."

The notion of the back end in the cloud is growing in importance again. That’s probably why IBM just announced yet another one of its, "Hey, we're going to take a billion dollars and move it towards cloud-computing" kind of initiatives.

Gardner: Right. We’ve obviously seen a lot of change in the market. Organizations and enterprises that depend on an ongoing evolution on a single-stack approach need to try to come up with the tooling and framework and environment that allow them to accomplish what they need from the backwards-compatibility perspective. They also need to put themselves into as low a risk position as possible for taking advantage of these dynamic environments and the change in the economics and the landscape.

We've been talking about the transition to WebSphere Application Server 6.1 and the implications for tooling, the pending arrival of MyEclipse Blue Edition from Genuitec, helping companies find some additional choices to manage these transitions.

Helping us weed through some of this -- and I have enjoyed the conversation -- we have been joined by Maher Masri, president of Genuitec. Any last words, Maher?

Masri: Just a reminder that the Blue Edition first milestone releases will be available in February. There will be a number of milestone releases that will be available for immediate access and we encourage people to download and try it.

Gardner: Very good. And also James Governor, co-founder and industry analyst at RedMonk. What's your parting shot on this one, James?

Governor: Let’s get specific again. Some of this has been a little bit blue sky. I think it’s very interesting that IBM has posted a pretty good set of financial results today.

Gardner: They're not going away, are they?

Governor: They are not going away. That’s exactly right. It used to be said that IBM is not the competition; it is the environment in which you compete. It seems to me that Genuitec and many others are probably a pretty good example of that. That was well put by you. IBM isn't going away.

Gardner: Well, thanks. This is Dana Gardner, principal analyst at Interarbor Solutions. You’ve been listening to a sponsored BriefingsDirect podcast. Thanks, and come again next time.

Listen to the podcast here. Sponsor: Genuitec.

Transcript of BriefingsDirect podcast on tool choices for WebSphere shops. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.