
Tuesday, July 09, 2013

Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now

Transcript of a BriefingsDirect podcast on current misconceptions about big data and how organizations should best approach a big-data project.

Listen to the podcast. Find it on iTunes. Download the transcript. Sponsor: Dell Software.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on debunking some major myths around big data. It used to be that data was the refuse of business applications, a necessary cleanup chore for audit and compliance sake.

But now, as analytics grow in importance for better running businesses and in knowing and predicting dynamic market trends and customer wants in real-time, data itself has become the killer application.

As the volumes and types of valuable data are brought to bear on business analytics, the means to manage and exploit that sea of data have changed rapidly, too. But that doesn't mean that so-called big data is beyond the scale of mere business mortals or too costly or complex for mid-size companies to master.

So we're here to pose some questions -- many of them the stuff of myth -- and then find better answers to why making data and big data the progeny of innovative insight is critical for more companies.

To help identify and debunk the myths around big data so that you can enjoy the value of those analytics better, please join me in welcoming our guest, Darin Bartik, Executive Director of Products in the Information Management Group at Dell Software. Welcome, Darin. [Disclosure: Dell is a sponsor of BriefingsDirect podcasts.]

Darin Bartik: Thanks, Dana. Good to be with you.

Gardner: We seem to be at an elevated level of hype around big data. I guess a good thing about that is it’s a hot topic and it’s of more interest to more people nowadays, but we seem to have veered away from the practical and maybe even the impactful. Are people losing sight of the business value by getting lost in speeds and feeds and technical jargon? Is there some sort of a disconnect between the providers and consumers of big data?

Bartik: I'm sure we're going to get into a couple of different areas today, but you hit the nail on the head with the first question. We are experiencing a disconnect between the technical side of big data and the business value of big data, and that’s happening because we’re digging too deeply into the technology.

With a term like big data, or any one of the trends that the information technology industry talks about so much, we tend to think about the technical side of it. But with analytics, with the whole conversation around big data, what we've been stressing with many of our customers is that it starts with a business discussion. It starts with the questions that you're trying to answer about the business; not the technology, the tools, or the architecture of solving those problems. It has to start with the business discussion.

That’s a pretty big flip. The traditional approach to business intelligence (BI) and reporting has been one of technology frameworks and a lot of things that were owned more by the IT group. This is part of the reason why a lot of the BI projects of the past struggled, because there was a disconnect between the business goals and the IT methods.

So you're right. There has been a disconnect, and that’s what I've been trying to talk a lot about with customers -- how to refocus on the business issues you need to think about, especially in the mid-market, where you maybe don’t have as many resources at hand. It can be pretty confusing.

Part of the hype cycle

The other thing you asked is, "Are vendors confusing people?" Without disparaging the vendors like us, or anyone else, that’s part of the problem of any hype cycle. Many people jumped on the bandwagon of big data, just like everyone was talking cloud, virtualization, bring your own device (BYOD), and so forth.

Everyone jumps on these big trends. So it's very confusing for customers, because there are many different ways to come at the problem. This is why I keep bringing people back to staying focused on what the real opportunity is. It’s a business opportunity, not a technical problem or a technical challenge that we start with.

Gardner: Right. We don’t want to lose track of the ends because the means seem to be so daunting. We want to keep our focus on the ends and then find the means. Before we go into our myths, tell me a little bit, Darin, about your background and how you came to be at Dell.

Bartik: I've been a part of Dell Software since the acquisition of Quest Software. I was a part of that organization for close to 10 years. I've been in technology coming up on 20 years now. I spent a lot of time in enterprise resource planning (ERP), supply chain, monitoring, performance management, and infrastructure management, especially on the Microsoft side of the world.

Most recently, as part of Quest, I was running the database management area -- a business very well-known for its products around Oracle, especially Toad, as well as our SQL Server management capabilities. We leveraged that expertise when we started to evolve into BI and analytics.

I started working with Hadoop back in 2008-2009, when it was still very foreign to most people. When Dell acquired Quest, I came in and had the opportunity to take over the Products Group in the ever-expanding world of information management. We're part of the Dell Software Group, which is a big piece of the strategy for Dell overall, and I'm excited to be here.

Gardner: Great. Even the name "big data" stirs up myths right from the get-go, with "big" being a very relative term. Should we only be concerned about this when we have more data than we can manage? What is the relative position of big data and what are some of the myths around the size issue?

Bartik: That’s the perfect one to start with. The first word in the definition is actually part of the problem. "Big." What does big mean? Is there a certain threshold of petabytes that you have to get to? Or, if you're dealing with petabytes, is it not a problem until you get to exabytes?

It’s not a size issue. When I think about big data, it's really a trend that has happened as a result of digitizing so much more of the information that we all have already and that we all produce. Machine data, sensor data, all the social media activities, and mobile devices are all contributing to the proliferation of data.

It's added a lot more data to our universe, but the real opportunity is to look for small elements of small datasets and look for combinations and patterns within the data that help answer those business questions that I was referencing earlier.

It's not necessarily a scale issue. What is a scale issue is when you get into some of the more complicated analytical processes and you need a certain data volume to make it statistically relevant. But what customers first want to think about is the business problems that they have. Then, they have to think about the datasets that they need in order to address those problems.

Big-data challenge

That may not be huge data volumes. You mentioned mid-market earlier. When we think about some organizations moving from gigabytes to terabytes, or doubling data volumes, that’s a big data challenge in and of itself.

Analyzing big data won't necessarily contribute to your solving your business problems if you're not starting with the right questions. If you're just trying to store more data, that’s not really the problem that we have at hand. That’s something that we can all do quite well with current storage architectures and the evolving landscape of hardware that we have.

We all know that we have growing data, but the exact size, the exact threshold that we may cross, that’s not the relevant issue.

Gardner: I suppose this requires prioritization, which has to come from the business side of the house. As you point out, some statistically relevant data might be enough. If you can extrapolate and you have enough to do that, fine, but there might be other areas where you actually want to get every possible bit of relevant data or information, because you don't know what you're looking for. They are the unknown unknowns. Perhaps there's some mythology about all data. It seems to me that what’s important is the right data to accomplish what it is the business wants.

Bartik: Absolutely. If your business challenge is an operational efficiency or a cost problem, where you have too much cost in the business and you're trying to pull out operational expense and not spend as much on capital expense, you can look at your operational data.

Maybe manufacturers are able to do that and analyze all of the sensor, machine, manufacturing line, and operational data. That's a very different type of data and a very different type of approach than looking at it in terms of sales and marketing.

If you're a retailer looking for a new set of customers or new markets to enter in terms of geographies, you're going to want to look at maybe census data and buying-behavior data of the different geographies. Maybe you want datasets that are outside your organization entirely. You may not have the data in your hands today. You may have to pull it in from outside resources. So there's a lot of variability and prioritization that all starts with that business issue that you're trying to address.

Gardner: Perhaps it's better for the business to identify the important data, rather than the IT people saying it’s too big or that big means we need to do something different. It seems like a business term rather than a tech term at this point.

Bartik: I agree with you. The more we can focus on bringing business and IT to the table together to tackle this challenge, the better. And it does start with the executive management in the organization trying to think about things from that business perspective, rather than starting with the IT infrastructure management team. 

Gardner: What’s our second myth?

Bartik: I'd think about the idea of people and the skills needed to address this concept of big data. There is the term "data scientist" that has been thrown out all over the place lately. There’s a lot of discussion about how you need a data scientist to tackle big data. But “big data” isn't necessarily the way you should think about what you’re trying to accomplish. Instead, think about things in terms of being more data driven, and in terms of getting the data you need to address the business challenges that you have. That’s not always going to require the skills of a data scientist.

Data scientists rare

I suspect that a lot of organizations would be happy to hear something like that, because data scientists are very rare today, and they're very expensive, because they are rare. Only certain geographies and certain industries have groomed the true data scientist. That's a unique blend between a data engineer and someone like an applied scientist, who can think quite differently than just a traditional BI developer or BI programmer.

Don’t get stuck on thinking that, in order to take on a data-driven approach, you have to go out and hire a data scientist. There are other ways to tackle it. That’s where you're going to combine people who can do the programming around your information, around the data management principles, and the people who can ask and answer the open-minded business questions. It doesn’t all have to be encapsulated into that one magical person that’s known now as the data scientist.

Gardner: So rather than thinking we need to push the data and analytics and the ability to visualize and access this through a small keyhole, which would be those scientists, the PhDs, the white lab coats, perhaps there are better ways now to make those visualizations and allow people to craft their own questions against the datasets. That opens the door to more types of people being able to do more types of things. Does that sum it up a bit?

Bartik: I agree with that. There are varying degrees of tackling this problem. You can get into very sophisticated algorithms and computations for which a data scientist may be the one to do that heavy lifting. But for many organizations and customers that we talk to everyday, it’s something where they're taking on their first project and they are just starting to figure out how to address this opportunity.

For that, you can use a lot of the people that you have inside your organization, as well as, potentially, consultants who can help you break through some of the old barriers, such as thinking about intelligence based strictly on a report and a structured dashboard format.

That’s not the type of approach we want to take nowadays. So often a combination of programming and some open-minded thinking, done with a team-oriented approach, rather than that single keyhole person, is more than enough to accomplish your objectives.

Gardner: It seems also that you're identifying confusion on the part of some to equate big data with BI and BI with big data. The data is a resource that the BI can use to offer certain values, but big data can be applied to doing a variety of other things. Perhaps we need to have a sub-debunking within this myth, and that is that big data and BI are different. How would you define them and separate them?

Bartik: That's a common myth. If you think about BI in its traditional, generic sense, it’s about gaining more intelligence about the business, which is still the primary benefit of the opportunity this trend of big data presents to us. Today, I think they're distinct, but over time, they will come together and become synonymous.

I equate it back to one of the more recent trends that came right before big data, cloud. In the beginning, most people thought cloud was the public-cloud concept. What’s turned out to be true is that it’s more of a private cloud or a hybrid cloud, where not everything moved from an on-premise traditional model, to a highly scalable, highly elastic public cloud. It’s very much a mix.

They've kind of come together. So while cloud and traditional data centers are the new infrastructure, it’s all still infrastructure. The same is true for big data and BI, where BI, in the general sense of how can we gain intelligence and make smarter decisions about our business, will include the concept of big data.

Better decisions

So while we'll be using new technologies, which would include Hadoop, predictive analytics, and other things that have been driven so much faster by the trend of big data, we’ll still be working back to that general purpose of making better decisions.

One of the reasons they're still different today is because we’re still breaking some of the traditional mythology and beliefs around BI -- that BI is all about standard reports and standard dashboards, driven by IT. But over time, as people think about business questions first, instead of thinking about standard reports and standard dashboards first, you’ll see that convergence.

Gardner: We probably need to start thinking about BI in terms of a wider audience, because all the studies I've seen don't show all that much confidence and satisfaction in the way BI delivers the analytics or the insights that people are looking for. So I suppose it's a work in progress when it comes to BI as well.

Bartik: Two points on that. There has been a lot of disappointment around BI projects in the past. They've taken too long, for one. They've never really been finished, which of course, is a problem. And for many of the business users who depend on the output of BI -- their reports, their dashboard, their access to data -- it hasn’t answered the questions in the way that they may want it to.

One of the things in front of us today is a way of thinking about it differently. Not only is there so much data, and so much opportunity now to look at that data in different ways, but there is also a requirement to look at it faster and to make decisions faster. So it really does break the old way of thinking.

Slowness is unacceptable. Standard reports don't come close to addressing the opportunity in front of us, which is to ask a business question and answer it with the new way of thinking supported by pulling together different datasets. That’s fundamentally different from the way we used to do it.

People are trying to make decisions about moving the business forward, and they're being forced to do it faster. Historical reporting just doesn't cut it. It’s not enough. They need something that’s much closer to real time. It’s more important to think about open-ended questions, rather than just say, "What revenue did I make last month, and what products made that up?" There are new opportunities to go beyond that.

Gardner: I suppose it also requires more discipline in keeping your eye on the ends, rather than getting lost in the means. That also is a segue to our next myth, which is, if I have the technology to do big data, then I'm doing big data, and therefore I'm done.

Bartik: Just last week, I was meeting with a customer and they said, "Okay, we have our Hadoop cluster set up and we've loaded about 10 terabytes of sample data into this Hadoop cluster. So we've started our big data project."

When I hear something like that, I always ask, "What question are you trying to answer? Why did you load that data in there? Why did you start with Hadoop? Why did you do all this?" People are starting with the technology first too often. They're not starting with the questions and the business problems first.

Not the endgame

As you said about keeping your eye on the endgame, the endgame is not to spin up a new technology or to try a new tool. Hadoop has been one of those things where people have started to use it and think that they're off and running on a big-data project. It can be part of it, but it isn't where you want to start, and it isn’t the endgame.

The endgame is solving the business problem that you're out there trying to address. It’s either lowering costs inside the business, or it’s finding a new market, figuring out why this customer set loves our products and why some other customer set doesn’t. Answering those questions is the endgame, not starting a new technology initiative.

Gardner: When it comes to these technology issues, do you also find, Darin, that there is a lack of creativity as to where the data and information resides or exists and thinking not so much about being able to run it, but rather acquire it? Is there a dissonance between the data I have and the data I need? How are people addressing that?

Bartik: There is and there isn’t. When we look at the data that we have, that’s oftentimes a great way to start a project like this, because you can get going faster and it’s data that you understand. But if you think that you have to get data from outside the organization, or you have to get new datasets in order to answer the question that’s in front of us, then, again, you're going in with a predisposition to a myth.

You can start with data that you already have. You just may not have been looking at the data that you already have in the way that’s required to answer the question in front of you. Or you may not have been looking at it at all. You may have just been storing it, but not doing anything with it.

Storing data doesn’t help you answer questions. Analyzing it does. It seems kind of simple, but so many people think that big data is a storage problem. I would argue it's not about the storage. It’s like backup and recovery. Backing up data is not that important, until you need to recover it. Recovery is really the game changing thing.

Gardner: It’s interesting that with these myths, people have tended, over the years, without having the resources at hand, to shoot from the hip and second-guess. People who are good at that and businesses that have been successful have depended on some luck and intuition. In order to take advantage of big data, which should lead you to not having to make educated guesses, but to have really clear evidence, you can apply the same principle. It's more how you get big data in place, than how you would use the fruits of big data.

It seems like a cultural shift we have to make. Let’s not jump to conclusions. Let’s get the right information and find out where the data takes us.

Bartik: You've hit on one of the biggest things that’s in front of us over the next three to five years -- the cultural shift that the big data concept introduces.

We looked at traditional BI as more of an IT function, where we were reporting back to the business. The business told us exactly what they wanted, and we tried to give that to them from the IT side of the fence.

Data-driven organization

But being successful today is less about intuition and more about being a data-driven organization, and, for that to happen, I can't stress this one enough, you need executives who are ready to make decisions based on data, even if the data may be counter intuitive to what their gut says and what their 25 years of experience have told them.

They're in a position of being an executive primarily because they have a lot of experience and have had a lot of success. But many of our markets are changing so frequently and so fast, because of new customer patterns and behaviors, because of new ways of customers interacting with us via different devices. Just think of the different ways that the markets are changing. So much of that historical precedent no longer really matters. You have to look at the data that’s in front of us.

Because things are moving so much faster now, new markets are being penetrated and new regions are open to us. We're so much more of a global economy. Things move so much faster than they used to. If you're depending on gut feeling, you'll be wrong more often than you'll be right. You do have to depend on as much of a data-driven decision as you can. The only way to do that is to rethink the way you're using data.

Historical reports that tell you what happened 30 days ago don't help you make a decision about what's coming out next month, given that your competition just introduced a new product today. It's just a different mindset. So that cultural shift of being data-driven and going out and using data to answer questions, rather than using data to support your gut feeling, is a very big shift that many organizations are going to have to adapt to.

Executives who get that and drive it down into the organization, those are the executives and the teams that will succeed with big data initiatives, as opposed to those that have to do it from the bottom up.

Gardner: Listening to you, Darin, I can tell one thing that isn’t a product of hype is just how important this all is. Getting big data right, doing that cultural shift, recognizing trends based on the evidence and in real-time as much as possible is really fundamental to how well many businesses will succeed or not.

So it's not hype to say that big data is going to be a part of your future and it's important. Let's move towards how you would start to implement or change or rethink things, so that you don't fall prey to these myths, but actually take advantage of the technologies, the reduction in costs for many of the infrastructures, and perhaps extend and exploit BI and big data problems.

Bartik: It's fair to say that big data is not just a trend; it's a reality. And it's an opportunity for most organizations that want to take advantage of it. It will be a part of your future. It's either going to be part of your future, or it's going to be a part of your competition’s future, and you're going to be struggling as a result of not taking advantage of it.

The first step that I would recommend -- I've said it a few times already, but I don't think it can be said too often -- is pick a project that's going to address a business issue that you've been unable to address in the past.

"What are the questions that you need to ask and answer about your business that will really move you forward?" Not just, "What data do we want to look at?" That's not the question.

What business issue?

The question is what business issue do we have in front of us that will take us forward the fastest? Is it reducing costs? Is it penetrating a new regional market? Is it penetrating a new vertical industry, or evolving into a new customer set?

These are the kinds of questions we need to ask and the dialogue that we need to have. Then let's take the next step, which is getting data and thinking about the team to analyze it and the technologies to deploy. But that's the first step – deciding what we want to do as a business.

That sets you up for that cultural shift as well. If you start at the technology layer, if you start at the level of let's deploy Hadoop or some type of new technology that may be relevant to the equation, you're starting backwards. Many people do it, because it's easier to do that than it is to start an executive conversation and to start down the path of changing some cultural behavior. But it doesn’t necessarily set you up for success.

Gardner: It sounds as if you know you're going on a road trip and you get yourself a Ferrari, but you haven't really decided where you're going to go yet, so you didn’t know that you actually needed a Ferrari.

Bartik: Yeah. And it's not easy to get a tent inside a Ferrari. So you have to decide where you're going first. It's a very good analogy.

Gardner: What are some of the other ways when it comes to the landscape out there? There are vendors who claim to have it all, everything you need for this sort of thing. It strikes me that this is more of an early period and that you would want to look at a best-of-breed approach or an ecosystem approach.

So are there any words of wisdom in terms of how to think about the assets, tools, approaches, platforms, what have you, or not to limit yourself in a certain way?

Bartik: There are countless vendors that are talking about big data and offering different technology approaches today. Based on the type of questions that you're trying to answer, whether it's more of an operational issue, a sales market issue, HR, or something else, there are going to be different directions that you can go in, in terms of the approaches and the technologies used.

I encourage the executives, both on the line-of-business side as well as the IT side, to go to some of the events that are the "un-conferences," where we talk about the big-data approach and the technologies. Go to the other events in your industry where they're talking about this and learn what your peers are doing. Learn from some of the mistakes that they've been making or some of the successes that they've been having.

There's a lot of success happening around this trend. Some people certainly are falling into the pitfalls, but get smart by going to your peers and going to your industry influencer groups and learning more about how to approach this.

Technical approaches

There are technical approaches that you can take. There are different ways of storing your data. There are different ways of computing and processing your data. Then, of course, there are different analytical approaches that get more to the open-ended investigation of data. There are many tools and many products out there that can help you do that.

Dell has certainly gone down this road and is investing quite heavily in this area, with both structured and unstructured data analysis, as well as the storage of that data. We're happy to engage in those conversations as well, but there are a lot of resources out there that really help companies understand and figure out how to attack this problem.

Gardner: In the past, with many of the technology shifts, we've seen a tension and a need for decision around best-of-breed versus black box, or open versus entirely turnkey, and I'm sure that's going to continue for some time.

But one of the easier ways or best ways to understand how to approach some of those issues is through some examples. Do we have any use cases or examples that you're aware of, of actual organizations that have had some of these problems? What have they put in place, and what has worked for them?

Bartik: I'll give you a couple of examples from two very different types of organizations, neither of which are huge organizations. The first one is a retail organization, Guess Jeans. The business issue they were tackling was, “How do we get more sales in our retail stores? How do we get each individual that's coming into our store to purchase more?”

We sat down and started thinking about the problem. We asked what data would we need to understand what’s happening? We needed data that helps us understand the buyer’s behavior once they come into the store. We don't need data about what they are doing outside the store necessarily, so let's look specifically at behaviors that take place once they get into the store.

We helped them capture and analyze video monitoring information. Basically, it followed each person in the store and tracked their locations inside the store. We tracked that data and then compared it against questions like did they buy, what did they buy, and how much did they buy. We were able to help them determine that if you get the customer into a dressing room, you're going to be about 50 percent more likely to close transactions with them.

So rather than trying to give incentives to come into the store or give discounts once they get into the store, they moved towards helping the store clerks, the people who ran the store and interacted with the customers, focus on getting those customers into a dressing room. That itself is a very different answer than what they might have thought of at first. It seems easy after you think about it, but it really did make a significant business impact for them in rather short order.

Now, they're also thinking about other business challenges that they have and other ways of analyzing data and other datasets, based on different business challenges, but that’s one example.
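As a rough illustration of the kind of comparison described in this example, the sketch below computes purchase rates for visitors who did and did not enter a dressing room, then the relative lift between the two groups. The visit records are entirely made up and were chosen so that the toy result lands near the "about 50 percent" figure quoted above; the names `visits` and `conversion_rates` are hypothetical and do not come from the actual project.

```python
from collections import defaultdict

# Hypothetical, invented visit records: (entered_dressing_room, purchased).
visits = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, True), (False, False), (False, False),
]

def conversion_rates(records):
    """Return the purchase rate for each group, keyed by dressing-room use."""
    counts = defaultdict(lambda: [0, 0])  # group -> [purchases, visits]
    for used_room, purchased in records:
        counts[used_room][0] += int(purchased)
        counts[used_room][1] += 1
    return {group: bought / total for group, (bought, total) in counts.items()}

rates = conversion_rates(visits)

# Relative lift: how much more likely a dressing-room visitor is to buy.
relative_lift = rates[True] / rates[False] - 1

print(f"dressing room: {rates[True]:.0%}, no dressing room: {rates[False]:.0%}")
print(f"relative lift: {relative_lift:.0%}")
```

A comparison like this shows only an association between the two groups. Deciding whether steering customers toward dressing rooms actually causes more purchases would take a test in the stores themselves.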

Another example is on the higher education side. In universities, one of the biggest challenges is having students drop out or reduce their class load. If they take fewer classes or drop out entirely, it obviously goes right to the top and bottom line of the organization, because it reduces tuition, as well as the other extraneous expenses that students incur at the university.

Finding indicators

The University of Kentucky went on an effort to reduce students dropping out of classes or dropping entirely out of school. They looked at a series of datasets, such as demographic data, class data, the grades that they were receiving, what their attendance rates were, and so forth. They analyzed many different data points to determine the indicators of a future dropout.

Now, just raising the student retention rate by one percent would in turn mean about $1 million of top-line revenue to the university. So this was pretty important. And in the end, they were able to narrow it down to a couple of variables that strongly indicated which students were at risk, such that they could then proactively intervene with those students to help them succeed.

The key is that they started with a very specific problem. They started it from the university's core mission: to make sure that the students stayed in school and got the best education, and that's what they are trying to do with their initiative. It turned out well for them.

These were very different organizations or business types, in two very different verticals, and again, neither is a huge organization that has seas of data. But what they did gives us much more manageable and much more tangible examples that many of us can apply to our own businesses.

Gardner: Those really demonstrate how asking the right questions is so important.

Darin, we're almost out of time, but I did want to see if we could develop a little bit more insight into the Dell Software road map. Are there some directions that you can discuss that would indicate how organizations can better approach these problems and develop some of these innovative insights in business?

Bartik: A couple of things. We've been in the business of data management, database management, and managing the infrastructure around data for well over a decade. Dell has assembled a group of companies, as well as a lot of organic development, based on their expertise in the data center for years. What we have today is a set of capabilities that help customers take more of a data-type agnostic view and a vendor agnostic view to the way they're approaching data and managing data.

You may have 15 tools around BI. You may have tools to look at your Oracle data, maybe new sets of unstructured data, and so forth. And you have different infrastructure environments set up to house that data and manage it. But the problem is that it's not helping you bring the data together and cross boundaries across data types and vendor toolset types, and that's the challenge that we're trying to help address.

We've introduced tools to help bring data together from any database, regardless of where it may be sitting, whether it's a data warehouse, a traditional database, a new type of database such as Hadoop, or some other type of unstructured data store.

We want to bring that data together and then analyze it. Whether you're looking at more of a traditional structured-data approach and you're exploring data and visualizing datasets that many people may be working with, or doing some of the more advanced things around unstructured data and looking for patterns, we’re focused on giving you the ability to pull data from anywhere.

Using new technologies

We're investing very heavily, Dana, into the Hadoop framework to help customers do a couple of key things. One is helping the people that own data today, the database administrators, data analysts, the people that are the stewards of data inside of IT, advance their skills to start using some of these new technologies, including Hadoop.

It's been something that we have done for a very long time, making your C players B players, and your B players A players. We want to continue to do that, leverage their existing experience with structured data, and move them over into the unstructured data world as well.

The other thing is that we're helping customers manage data in a much more pragmatic way. So if they are starting to use data that is in the cloud, via Salesforce.com or Taleo, but they also have data on-prem sitting in traditional data stores, how do we integrate that data without completely changing their infrastructure requirements? With capabilities that Dell Software has today, we can help integrate data no matter where it sits and then analyze it based on that business problem.

We help customers approach it more from a pragmatic view, where you're taking a stepwise approach. We don't expect customers to pull out their entire BI and data-management infrastructure and rewrite it from scratch on day one. That's not practical. It's not something we would recommend. Take a stepwise approach. Maybe change the way you're integrating data. Change the way you're storing data. Change, in some respects, the way you're analyzing data between IT and the business, and have those teams collaborate.

But you don't have to do it all at one time. Take that stepwise approach. Tackle it from the business problems that you're trying to address, not just the new technologies we have in front of us.

There's much more to come from Dell in the information management space. It will be very interesting for us and for our customers to tackle this problem together. We're excited to make it happen.

Gardner: Well, great. I'm afraid we'll have to leave it there. We've been listening to a sponsored BriefingsDirect podcast discussion on debunking some major myths around big data use and value. We've seen how big data is not necessarily limited by scale and that the issues around it don't always have to supersede your business goals.

We've also learned more about levels of automation and how Dell is going to be approaching the market. So I appreciate that. With that, we'll have to end it and thank our guest.

We've been here with Darin Bartik, Executive Director of Products in the Information Management Group at Dell Software. Thanks so much, Darin.

Bartik: Thank you, Dana, I appreciate it.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks also to our audience for joining and listening, and don't forget to come back next time.


Transcript of a BriefingsDirect podcast on current misconceptions about big data and how organizations should best approach a big-data project.  Copyright Interarbor Solutions, LLC, 2005-2013. All rights reserved.


Tuesday, June 11, 2013

HP Experts Analyze and Explain the HAVEn Big Data News From HP Discover Conference

Transcript of a BriefingsDirect podcast on how HP's new HAVEn Initiative puts the full power and breadth of big data in the hands of companies.

Listen to the podcast. Find it on iTunes. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Performance Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your moderator for this ongoing discussion of IT innovation and how it’s making an impact on people’s lives.

Once again, we're focusing on how IT leaders are improving their services' performance to deliver better experiences and payoffs for businesses and end users alike, and this time we're coming to you directly from the HP Discover 2013 Conference in Las Vegas. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

We're here in the week of June 10 and we are now joined by our co-host, Chief Evangelist at HP Software, Paul Muller. Welcome, Paul.

Paul Muller: Dana, I'm surprised your voice is holding out after this week.

Gardner: Right, it’s been quite busy. There has been a lot said about big data in the last year, and HP has announced a broader vision for businesses to gain actionable intelligence from literally a universe of potential sources and data types.

We're now joined by two additional HP executives to explore the implications and business value of the HAVEn news at Discover. Please join me now in welcoming our guests. First is Chris Selland, Vice President of Marketing at HP Vertica. Welcome, Chris.

Chris Selland: Thanks Dana, it’s great to be here. It's great to work with you again, Paul, and I'm really looking forward to this.

Gardner: And we're joined by Tom Norton, Vice President for Big Data Technology Services at HP. Welcome, Tom.

Tom Norton: Hello, Dana.

Gardner: Let’s go to Chris first. Fairly recently, only critical data was given this high-falutin' treatment for analysis, warehousing, applying business intelligence (BI) tools, making sure that it was backed up and treated almost as if it were a cherished child.

But almost overnight, the savvy businesses, those who are looking for business results, are more interested in all the data or more information of any kind so that they can run their businesses and find inferences in the areas that they maybe didn’t understand or didn’t even know about.

So what do you think has happened? Why have we moved from this BI-as-sacred-ivory-tower approach to something more pedestrian now?

Competitive issue

Selland: First and foremost, it’s really that it’s become a competitive issue. Competitiveness issue might be a better way to say it. Just about every company will pay attention to their customers.

You can tell senior management that this data is important. We're going to analyze it and give you insights about it, but you start realizing that we have an opportunity to grow our business or we're losing business, because we're not doing a good enough job, or we have an opportunity to do a better job with data.

Social media has been the tip of the arrow here, because just about all industries all of a sudden realize that there is all this data out there floating around. Our customers are actually talking to each other and talking about us, and what are we doing about that? That’s brought a lot of attention above and beyond the CIO and made this an issue that the CMO, the CFO, the COO, and the CEO start to care about.

We’ll drill down on this, as we go through the discussion today. Big data is about far more than social media, but I do think social media gets a lot of the credit for making companies pay a lot more attention. It's, "Wait a minute. There is all this data, and we really need to be doing something with this."

Gardner: Paul Muller, as you travel around the world and speak with businesses and governments, are you seeing a shift in the way that people perceive data as an asset, or have they shifted their thinking about how they want to exploit it?

Muller: At the risk of reaching for the third rail here, which is kind of a San Francisco, West Coast joke, in the conversations that I'm having consistently around the globe, executives, both CIOs and non-IT executives, are realizing that big data is probably not the most helpful phrase. It’s not the size of the data that matters, but it’s what you do with it.

It’s about finding the connections between different data sets to help you improve competitiveness, help you improve efficiency if you are in the public sector, help you to detect fraud patterns. It's what you do with the data in that connected intelligence that matters.

To make that work, it’s about not just the volume of data. That certainly helps, not having to throw out my data or overly summarize it. Having high-fidelity data absolutely helps, but it’s also the variety of data. Less than 15 percent of what we deal with on a daily basis is in structured form.

Most of the people I meet are still dealing with information in rows and columns, because traditionally that’s what a computer has understood. They’ve not built for the unstructured things like video, audio, images, and for that matter social, as Chris just mentioned.

Finally, it’s about timeliness. Nobody wants to be making tomorrow’s decision with last week’s data, if that makes sense. In other words, a lot of the decisions we have to make are usually made through a rear-view mirror, which is not helpful if you're trying to operate today’s thoughts as well.

Variety of systems

Gardner: Chris, it seems as if we have more interest, more business activities, and more constituencies within businesses looking for inputs that help them make decisions or analysis. But we’ve got a variety of systems. We’ve got relational databases, flat files, and all sorts of social APIs that we can draw on.

How do you make sense of this? Is there a common thread now? Is there a way for us to think about data beyond the traditional IT definition of data, and what does that mean for actually then getting access and managing it?

Selland: To pick up on what Paul was saying. I have a love-hate relationship with the term "big data." The love part is the fact that it really has been adopted. People gravitate to it and are starting to realize that there is something here they need to pay attention to. And that’s not just IT.

It’s funny, because if you go to something like Wikipedia and you look for the origins of the term "big data," you’ll actually find that in IT circles, we've been talking about big data for about a dozen years. There are probably five or six different people. There is a discussion on Quora; you can look it up if you are interested in the creation of the term, which was about a dozen years ago.

As a matter of fact, this is the problem that Vertica was created to solve. It was that, as this big data thing became real, which it is now, traditional databases would be unable to handle it. So the good news is that there has been a recognition in business circles outside the CIO -- the CMO, the COO, and the CFO -- that has just started to happen in the last 18 to 24 months, in a big way.

The love part is that people are paying attention to big data. The hate part is that it’s much more than “big”.

I like the Doug Laney definition of big data. Doug is an analyst who is now at Gartner Group, although when he coined the term, he was actually at another firm. He said it is the 3Vs -- volume, velocity, and variety. Volume is a part of it and it’s certainly about big.

But as Paul was just talking about, there is also a tremendous variety these days. We've already talked a little bit about social media, but the fact that people equate "social media" with "big data" is another pet peeve of mine.

Social media is driving big data, but it’s only a very small part of it. But it’s an important part, because it’s what’s brought a lot of that other attention. You're looking at audio, video, and all of this user-created content and such, and there is such a variety. Then, of course, it’s coming in so fast. Then, we’d like to sometimes add the fourth V, which is value. How is this all going to make money for me? What do we do about this strategically as a business?

So there is just a lot going on here and this is really what’s driven the HAVEn initiative and the HAVEn strategy. We have this tremendous portfolio of assets here at HP from software to hardware to services and HAVEn is about putting that portfolio behind these different analytic engines -- Vertica, IDOL, Logger, and Hadoop -- that complement each other and their ability to integrate and build solutions.

Broad strategy

So how do we bring this together under a single broad strategy to help companies and global enterprises get their hands around all of this, because it’s a lot more than big? Big data is great. It’s great that the term has taken off, but it’s a lot bigger than that.

Gardner: All right. Before we go into the HAVEn announcement, I’d like to remind our readers and listeners that there is a lot of information available, if they search online for HP, HAVEn, or HP Discover 2013. But before we go there, let’s go to Tom Norton.

We've been talking about data, big data, the movement and shift in the market, and we also find ourselves talking about platforms and certain types of data format and technologies, but there is more than that. It seems that if we're going to change these organizations so that they use data more effectively, we need to go beyond the technology. Give me an idea from the technology services' perspective of what also needs to be considered when we go about these shifts in the market.

Norton: When you think about a data platform, that’s not new. Both Paul and Chris mentioned that data platforms and data analysis have been around for years, but this is a shift. It is different in a number of ways: We mentioned velocity, volume, and variety, but there is also a demand, as Chris mentioned, to have this access to information faster.

The traditional systems or platforms that IT is used to providing are now becoming legacy. In other words, they're not providing the type of service level to meet the workload demands of the organization. So IT is faced with the challenge of how to transform that BI environment to more of a data refinement model or a big data ecosystem, if you want to still hang on to big data as a term.

IT is challenged there, and the goal overall is to be able to provide that service level that Paul mentioned to be able to support through timeliness, and the type of actions the business wants to take. So the business is now demanding an action from IT.

The ability to respond quickly to this platform transformation is what we want to help our customers do from our technology services' perspective. How can we speed the maturity or speed the transformation of those traditional BI systems, which are more sequential and more structured, to be able to deal with the demands of the business to have relevant and refined information available to them at the time they need it, whether it be 1.5 seconds or 15 hours?

The business needs the information to be able to compete and IT needs to be able to adapt, to have that kind of flexible, secure, and high-performing platform that can deal with the different complexities of raw data that’s available to them today.

Gardner: Tom, on other programs, we’ve talked about application modernization and application transformation. We're following a similar trajectory with data. We're bringing in more data types, but we don’t necessarily want to assimilate them into a common warehouse or format. We're looking to do integration with the data, do hybrid activities with the data, buy-and-sell data, or barter it. It’s really transformed data.

It used to be that the way data came about was as refuse from the application. So is the role of services for managing the data continuum and lifecycle similar to what we did with applications over the past 10 years?

Similar to cloud

Norton: I think it's similar. It’s actually very similar to cloud in some ways, when you think of a platform which enables a service. When you consider the models that people are looking at today concerning cloud, there is a maturity reality that goes with it. You start with a platform and then you start looking at the service-level catalogs, automation, and security, and then you look at the presentation layers.

Data platforms are exactly the same. You have to take what was the very singular service that was offered and start looking at more complex content. So you have to consider data sources, which could come from many different places. You have to consider data source from a cloud, from a traditional BI system, or from other data sources within the organization.

Acquiring data in that context has to be considered. Then, as was mentioned earlier, you have to consider that processing and the service levels for processing of that raw material to produce refined information that’s useful.

And that’s very similar to when you start thinking about what cloud would do. Like the performance from a presentation perspective of how quickly the environment is able to deliver an app, is very similar in terms of presenting information that can be useful to the business. Then you have to look at the presentation format.

We've had discussions about mobile users, for example, on how social media not only produces information, but there are expectations from mobile users today of how they can get access to it. Considering that format, it's very similar to what we've done in terms of applications and very similar to the approach that you need to take. When you look at a cloud platform, you have to look at that.

Data is unique in that it is both the platform and the service. It’s slightly different than cloud at least in that way, where you're presenting services from that. Data is unique because there is a specialized platform that needs to be integrated, but you have to consider the information service that’s presented and approach it like you would an application. It’s a really interesting approach and an interesting transformation for IT.

Gardner: Chris Selland, let’s get back to the news of the day of the HAVEn initiative, the HAVEn vision. Tell us in a nutshell what it is, what it includes, and then we can talk about what it means.

Selland: I talked about the tip of the spear before. In this case the tip of the spear is our analytic engines, our analytic platforms: the Vertica Analytics Platform, Autonomy IDOL, ArcSight Logger. HAVEn is about taking this entire HP portfolio and then combining those with the power of Hadoop.

We have been talking about our open partnership. There are a number of Hadoop distributions, and we support them all. It's taking that software platform, running it on HP’s Converged Infrastructure, wrapping HP’s services around it, and then enabling our customers, our consultants of course, our channel partners, our systems integrators, and our resellers to build these next-generation analytic-enabled solutions and big-data analytic enabled solutions that customers need.

I keep saying that big data is in a classic crossing-the-chasm moment -- for those of you who have read the book, and while I don't want to do a primer on the book, it’s basically about when the attention around this topic starts to shift, and of course IT still remains very much at the center, but now it becomes a business-enabler.

Changing the business

It’s when technology starts to change the business, and that’s what’s going on right now. When you're talking to businesspeople, you can't talk about platforms and you can’t talk about speeds and feeds. When you say Hadoop to a businessperson they usually say, "God bless you," these days.

You have to talk about customer analytics. You have to talk about preventing fraud. You have to talk about being able to operationally be more effective, more profitable, and all of those things that drive the business. It really becomes more-and-more a solutions discussion.

HAVEn is the HP platform that provides our customers, our partners, and of course, our consultants, when our customers choose to have us do it for them, the ability to deliver these solutions. They're big-data solutions, analytic-enabled solutions. They're the solutions that companies, organizations, and global enterprises need to take their businesses forward and to make their customers more satisfied to become more profitable. That's what HAVEn is all about, the fundamental story behind the HAVEn initiative.

Gardner: It’s very interesting and fascinating to think about these working in some sort of concert. When I first looked at the announcement and heard the presentations, I thought, "Oh, ArcSight. Isn’t that an odd man out? Isn't that an outlier?"

Why, in your understanding, would having great insights into all the data from your systems be relevant to all the data that you're deriving from your applications, your outside data sources, your customer interactions, the social media, the whole kit and caboodle? Help me understand better why ArcSight is actually a good partner.

Selland: It really goes back to what I said earlier, that even though social media has been the tip of the spear here for business attention around big data, it’s much, much bigger than that. One of the terms that people are starting to hear now, and you're going to hear a lot more about, is the "Internet of things."

There are various third-party estimates out there that within the next few years, there are going to be about 150 sensors per person worldwide, and that number is going to keep growing. Think about all the things that go on in your car, on a factory floor, in a supply chain.

We tend to think about the fact that everybody is walking around with a computer in their pocket these days, a smartphone, but that’s not just communicating with you. It’s communicating with the network to provide quality of service, to monitor what’s going on, to obviously manage your calls and your downloads, and everything else.

There's so much data flowing around out there. The Logger Engine essentially reads and interprets and connects to all of these different sources, various types of machines, system log files, and real-time data as well. It’s not just about being able to interpret social media. It’s being able to pull in all of these different data types.

As the Internet of things grows, and the sensors go everywhere, McKinsey estimates that, just to give a tangible example, a typical jet engine throws off about two terabytes per hour of data. What do you do with all that data? How do you manage that data?

Internet of things

Think about all of our IT systems, all of our physical systems, all of our network systems. Think about all these sensors that are in this Internet of things. It’s becoming huge and the ability to process this data from machines, systems, and log files is a huge, huge part of this.

Gardner: Paul Muller, we understand now that we can bring Hadoop benefits to Autonomy's breadth and depth of information, unstructured information to Vertica, speed and ability to do analytics very rapidly and efficiently to ArcSight with machine and other data. How do you take this out to an enterprise, a C-class group of people, and make them understand that you are, in fact, giving them some tools that really weren’t available before, and certainly weren’t cobbled together in such a way? How do you put this in business terms so they can get just how powerful this really is?

Muller: Dana, did you just say Hadoop?

Gardner: I did.

Muller: Bless you.

Selland: Well played.

Muller: Had to be done, Chris. That’s ultimately the question. Let me just give you an example that we talk about and that I share with people quite frequently, and it usually generates a bit of a smirk. We’ve all been on the telephone and called a company or a public service, where you've been told by the machine that the call will be monitored for quality of service purposes. And I am sure we’re all thinking, "Gosh, if only."

The scary part is that all those calls are recorded. They're not only recorded, but they're recorded digitally. In other words, they're recorded to a computer. Much like the airline example that Chris just gave, almost all of that data is habitually thrown away, unless there is an exception to the rule.

If there is a problem with the flight or if there is some complaint about the call that escalates to senior management, they may eventually look at it. But think about how much information, how much valuable insight is thrown away on a daily basis across a company, across the country, across the planet. What we've aimed to do with HAVEn is liberate that information for us to find that connected intelligence.

In order to do that, we get back to this key concept that you need to be able to integrate telemetry from your IT systems. What’s happening inside them today? For example, if somebody sends an email to somebody outside of the company, that typically will spawn questions: Who did they send that email to? Was there an attachment there? Is it a piece of sensitive information or not? Typically that would require a person to look at it.

Finally, it's to be able to correlate patterns of activity that are relevant to things like revenue, earnings, or whatever that might be. What we're able to do with the HAVEn announcement is combine those concepts into one integrated platform. The power of that would be something like in that call center example. We can use Autonomy technology to listen to the call, to understand people's emotions, and whether they’ve said, "If you don't solve this problem, I'm never going to buy from you again."

Take that nugget of information, marry that to things like whether they are a high-net-worth customer, what their spending patterns have been, whether they're socially active and more likely to tell people about their bad experience, and correlate that all in real time to help give you insight. That's the sort of thing that HAVEn can do, and that's a real-world application that we're trying to communicate in business.

Norton: I want to echo that. I have one more example of what Paul has just indicated. Take healthcare, for example. We're working with the healthcare providers. There are some three-tier healthcare providers. A major healthcare organization could have as many as 50 different business units. These separate business units have their own requirements for information that they want to feed to hospital systems.

Centralized structure

So you have a centralized organizational IT structure. You have a requirement of a business unit within the organization that has its own processing requirement, and then you have hospital systems that buy and share information with the business unit.

Think about that three-tiered structure and think of some of the component pieces that HAVEn brings to it. You have IT, which can manage some of those central systems that become that data lake or data repository, collecting years and years of historical healthcare information from the hospital systems, from the business units, but also from the global healthcare environment that could be available globally.

IT provides this ecosystem around the data repository that needs to be secured, and that data pool needs to be governed.

Then, you combine that with information that's coming publicly and needs to be secured. You have those corner pieces which are natural to the Hadoop distributed system inside that data lake that keeps that repository of healthcare information.

The business unit has a requirement because it wants to be able to feed information to the healthcare providers or the hospital systems, and to collect from them as well. Their expectation of IT is that they may need instant response. They may need a response from a medical provider in seconds, or they may look at reporting on changes in healthcare in certain environmental situations that are creating change in healthcare. So they might get daily reporting or they might have half-day reporting.

Within HAVEn, you look at Vertica, to drive that immediate satisfaction of that query that comes from the hospital system. Combine that with Hadoop and combine that with the kind of data-governance models that Autonomy brings. Then, look at security policies around the sensors from patients that are being sent to that hospital system. That combination is a very powerful equation. It's going to enable that business to be very successful in terms of how it handles information and how it produces it.

When we start looking at that integration of those components, that's what's driving IT, because they need that very flexible and responsive data repository that can provide the type of insight that the hospital systems need from the business unit that's driving the healthcare IT organization itself.

Those are the fits even in a large enterprise, where you can take that platform and apply it in an industry sense, and it makes complete sense for that industry overall.

Gardner: Chris Selland, I think about what companies, governments, and verticals like healthcare, the leaders and innovators in those areas, can do with this. It could really radically change how they conduct their businesses, not by gut, not by instinct, not by just raw talent, but by empirical evidence that can be then reestablished and retested time after time. It strikes me that it's a fundamentally different value that HP is bringing to the market.

HP has, of course, been a very large company with a long heritage, but are we really stepping outside of the traditional role that HP has played? It sounds as if HP is becoming a business-services company, not a technology services company. Correct me if I'm wrong.

Bridging the gap

Selland: Yes and no. First of all, we do need to acknowledge that there is a need to bridge the gap between the IT organization and the business organization, and enable them to talk the same language and solve problems together.

First of all, IT has to become more of an enabler. Second, and I mentioned this earlier and I really want to play this up, it's absolutely an opportunity for our partners. HP has a number of assets, but one of our greatest assets is HP's partner network -- our partner ecosystem, our global systems integrators, our technology partners, even our services providers, our training providers, all of the companies that work in and around the global HP.

We can't know every nuance of every business at HP. So the HAVEn initiative is very much about enabling our partners to create the solutions we're creating. We're using our own platform to create solutions for the core audiences that we serve, which in many cases, are things like IT management solutions or security solutions which are being featured and will continue to be featured.

We're going to need to get into all of these different nuances of all of these different industries. How do these companies and organizations compete with each other in particular verticals? We can’t possibly know all of that. So we're very reliant on our partners.

The great news is that we have what I believe is the world's greatest partner network, and this is very much about enabling those partners and those solutions. In many cases, those solutions will be delivered by partners and that’s what the solutions are all about as well.

Gardner: Just to drill down on that a bit, if there are these technologies that are available to these ecosystems within verticals and attacking different business problems, what's the next step with HAVEn? Now that we put together the various platforms, given the whole is greater than the sum of the parts in terms of a business value, what's the vision beyond that to making these usable, exploitable?

Are there APIs and tools or is that something also that you are going to look to the partners for, or both? How does it work in terms of the go to market?

Selland: There absolutely are APIs and tools. We need to prime the pump, to some degree, with building and creating some of our own solutions to show what can be done in the markets we serve, which we're doing, and we also have partners on board already.

If you look at the HAVEn announcements, you'll see partners like Avnet and Accenture and other partners that are already adopting and building HAVEn-based solutions. In many cases, we've started delivering to customers already.

It's really a matter of showing what can be done, building what can be built, and delivering them. I mentioned earlier the crossing-the-chasm moment we're having. The other thing that happens, when you get into this market, is you're moving from its being purely a CIO decision to where the business starts getting involved.

Great ROI

There is great return on investment (ROI), there's this big data analytic solution we're going to enable, and we are going to build to deliver better customer loyalty. We are going to deliver better customer retention and lower churn. The first thing I need to say is, "Okay, show me the numbers, show me the money." Those are Jerry Maguire terms, and the best way to do that is to show examples of other companies that have done it.

So you run into a situation where you need to be able to show who is doing it, how they're doing it, and how they're making money with it. You've got to get that early momentum, but we're already in the process of getting it, and we've already got partners on board. So we're really excited.

Gardner: Tom Norton, what are your thoughts about my observation that this takes HP to a different plane in terms of the level of value it can bring to a business, and then perhaps some additional thoughts based on what Chris said in terms of how this fits into a value chain?

Norton: You can take two separate perspectives, but you can't separate them. In order for my group, TS, to be able to help IT transform, IT has to be aligned to that business decision anyway, or they have to be aligned to the business requirements and the workloads that business may be presenting.

For me to help to build an integration plan or to build a design for a data platform like this transformation of a data platform, I have to have some idea of what the workload requirements may be from the business. I have to know if the business is trying to do something that's going to require an immediate type of satisfaction, or they are going to do something that can be done in more of a batch format.

Those expectations of a business in terms of when they want to be presented with that business aligned information, that's going to determine short term and midterm what IT needs to do.

You can't separate those two, especially when we're starting to drive and accelerate the kind of format and the kind of workloads that businesses may need. You may get requirements from 20 different businesses and each business may have 10 different business requirements that they have in terms of the presentation of information.

So how can we get to the point where we can separate from the business, the view of what IT is doing? The business shouldn't need to know about Hadoop, as Chris mentioned earlier. They shouldn't need to know how Hadoop is integrated with Vertica, integrated with Autonomy, or how the three are combined and secured, but they should have an expectation that they're going to get the information that they need at the time they need it.

We really can't design a platform, unless we know that spectrum, and how we can create a road map for how to resolve that and how to mature it. So we have to know that, and the second part is going to be, as you've mentioned before, from how the business needs to access it.

Flexible technology

If the business is going to a more distributed, a remote, or a mobile type of workforce or mobile access, our design requirements for IT have to be for the infrastructure. The technology has to be flexible enough to deliver information to those consumption formats.

If you're dealing with finance, for example, and you're going to have a sales force selling capital investments to their largest investors, $100-million-a-year investors, the expectation of those salespeople to that investment model is that they can provide their customers -- probably the most important customers that that finance organization has -- information within 15-30 minutes. That's the time that the salesperson is talking to them about what may be happening with their portfolio.

Think about how complex that can be. You have to access social media, as was brought up earlier, and be able to get information on Twitter feed so that they can provide a meaning-based analysis on how this stock portfolio is being reflected in the market.

To get that in that time frame of 0-30 minutes requires a different design than the one for someone who is going to look at market reporting trends over a 24-hour period and present that each morning. So it’s very important that we have that alignment between technology and business, and unless we can understand both, we're not going to be able to drive that road map in the direction that's going to satisfy the business requirements.
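[Editor's sketch: one way to picture the point above is that the business states how long it can wait for an answer, and that number, rather than the technology, picks the design. The following is a minimal Python illustration under stated assumptions. The tier names, the latency cutoffs, and the `Workload` and `choose_tier` names are all hypothetical and are not part of HAVEn or any HP product.]

```python
from dataclasses import dataclass

# Hypothetical latency cutoffs, chosen only for illustration.
INTERACTIVE_MAX_SECONDS = 60          # "a response ... in seconds"
NEAR_REAL_TIME_MAX_SECONDS = 30 * 60  # the 30-minute sales conversation


@dataclass(frozen=True)
class Workload:
    name: str
    max_latency_seconds: int  # how long the business can wait for an answer


def choose_tier(workload: Workload) -> str:
    """Map a business latency requirement to a hypothetical processing tier."""
    if workload.max_latency_seconds <= INTERACTIVE_MAX_SECONDS:
        return "interactive"
    if workload.max_latency_seconds <= NEAR_REAL_TIME_MAX_SECONDS:
        return "near-real-time"
    return "batch"


# Example requirements, loosely modeled on the cases in the discussion.
workloads = [
    Workload("provider query", 5),
    Workload("portfolio briefing", 30 * 60),
    Workload("daily market report", 24 * 60 * 60),
]
plan = {w.name: choose_tier(w) for w in workloads}
```

[In this sketch the business side only supplies the latency figure; which engine sits behind each tier stays an IT decision, which is the separation of concerns described above.]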

Gardner: Paul Muller, when we think about the value to the business, and we recognize that IT is in the middle between when data is analyzed and inferences are gathered, acting on those inferences and putting them into place perhaps goes back in through IT.

There are applications that need to be addressed. There are mobile devices that need to be reached. It seems to me that HP is in a unique situation now by pulling together these different data analysis types, making it available in a holistic context, but also being a provider of the means to then be actionable, to create applications, to populate applications, and to allow IT to be the traffic cop on this two-way street or multi-way street.

Tell me how HP is differentiated. Given what we've now seen with the HP Discover announcements with cloud, with converged infrastructure and with HAVEn, give us a bit more of an understanding of how HP is uniquely positioned?

Muller: Dana, you made such a great point. Insight without action is a bit like saying that you have a strategy without execution. In other words, it’s pretty close to hallucination, right?

The ability to take that insight and then reflect that into your business rapidly is critical. I have a point of view that says that almost every enterprise is defined by software these days. In other words, when you make an insight and you want to make a change, you're changing the size. If you are Mercedes, you're changing one of the 100 million lines of code in your typical S class. Some of the major banks around the planet now hire more programmers than Microsoft has working on Windows today.

Most companies are defined by software. So when they do get in an insight, they need to rapidly reflect that insight in the form of a new application or a new service, it’s typically going to require IT.

Absolutely critical

Your ability to quickly take that insight and turn that into something a customer can see, touch, and smell is absolutely critical, and using techniques like Agile delivery, increasing automation levels, and DevOps approaches are all critical to being able to execute to get to that.

I would like to come back up to Chris’ response to just touch on a conversation I had with a CIO last week, where he said to me, "Paul, my problem is actually not about big data. It’s great, and we’ve got it, but I still can’t work out what to do with it. We should have a conversation about innovation in the profits of big data." So, Chris, do you want to maybe take Dana’s question?

Selland: It’s really, first of all, our focus. It's not just big data, but helping our customers be successful in leveraging big data is a core focus and a core pillar of HP strategy. So first of all it’s focus.

Second of all, it’s breadth. I talked about this earlier, so I don’t want to repeat myself too much. The software, hardware, and converged cloud assets, capabilities of services, and of course their service’s portfolio -- all of the resources that the global HP brings to bear -- are focused on big data.

And it’s also the uniqueness. Obviously, being an HP Software Executive, I'm most familiar with the software. If you really look at it, nobody, none of HP’s competitors, has anything like Vertica. None of HP’s competitors has anything like IDOL. None of HP’s competitors has anything like ArcSight Logger. None of HP's competitors has the ability to bring those assets together and get them interoperating with each other and get them solving problems and building solutions.

Then, you take our partner channel, wrap it around that, and you combine it with the power of open-source industry initiatives like Hadoop. HP has very much openness at the core of everything we're doing. We have all sorts of partners helping and supporting us around here.

I haven’t even talked about technology partners, BI, or visualization partners. We're partnering with all of the major Hadoop distributions. So there is just tremendous breadth and depth of resources focused on the problem. At the end of the day, it really is about execution, because that’s the other thing that I talked about earlier, customers. They want to hear big ideas and they want to know how technology helps them get there, but they also want to see proof points.

Muller: Let’s just start from that. Chris, maybe we'll finish on a slightly controversial note here, but it’s worth talking about. Then, maybe this is potentially a good segue to Tom. I met with a CIO again. I was speaking to some of our listeners and met with some CIOs in South Africa a couple of weeks back. This head of manufacturing turned to me and said, "You know, Paul, I understand big data technology is there, I understand. I can pretty much ingest this. At least the potential is there that I can.

"What I'm not sure is, in my industry, how does it matter to me? Don’t just talk to me about technology. How can I turn that into a justifiable business case that the business will want to invest in?" And it kind of struck me that the technology in some respect is slightly ahead of our customer’s ability to think of themselves as innovators rather than as infrastructure managers.

Part of the problem

Selland: You certainly just defined part of the problem. There is no one-size-fits-all big-data-in-a-box solution, because the answer to that question is something that you really need to have a significant understanding of the business and it’s really a consultative question, right?

You’ve got to have a broad enough portfolio to know that you’ve got the confidence and the assets to eventually solve the problem, but at the same time start with understanding the problem, the industry, and solutions. This is where our services, and this is where our partner ecosystem, come into play. And having the breadth of the portfolio of software, hardware, and cloud services to be able to deliver on it is really what it’s all about, but there is no one-size-fits-all answer to the question we just asked.

Gardner: Tom Norton, when we think about the observation that the technology is getting a bit out in front of what the businesses understand they can do with it, it sounds like a really good opportunity for a technology consultant and a technology services organization to come in. It sounds as if you have to bring together disparate parts of companies.

We talked about developers. If the people are allowing for analytics to develop wonderful insights, but they’ve never really dealt with the App Dev people, and the App Dev people have never really dealt with the BI people, what do we need to do to try to bring them together? In your company, how would you go about bringing them together so that as insights develop, new ways of delivering those insights to more people and more situations are possible? I guess we're talking about cultural shifts here?

Norton: HP actually has, from a services perspective, a unique approach to this. You've seen it before in the cloud and you've seen it before in the days of IT transformation, where we started looking at that transformation experience.

HP has developed these workshops over time. They bring IT together with the business to help IT build a plan for how it's going to address the business needs and pull out from the business what the business requirements of IT will be.

It’s no different, now that we're in the data world. Through our services groups within HP, we have the ability from an information management and analytics approach to work with companies to understand the business value that they're trying to drive with information, and ideally try to understand what data is available to them today that is going to provide that business-aligned information.

Through the Big Data Discovery Experience workshops, we're able to ask, "What is the business capable of doing with the data it has available today, and how can that be enhanced with alternative data sources that may fall outside of the organization today?"

As we mentioned earlier, it’s that idea of what can be done. What's the art of the possible here that is going to provide value to the organization? Through services we can take that all the way down, then say, now once you have got the idea, that says I’ve got a road map for analytical value and the management of the information that we have, and we could have made available to the businesses.

Then, you can align that, as I mentioned before, through IT strategies where you do the same thing. You align the business to IT and ask how IT is going to be able to enable those actions that the business wants to take on that information.

Entire lifecycle

So there's an entire lifecycle of raw material data to business-aligned and business-valued information through a services approach, through a consultative approach, that HP is able to bring to our customers.

That’s unique, because we have the ability through that upfront strategy from business value of information to the collection and refinement of raw materials and meeting in the middle in this big data ecosystem. HP can supply that from end to end, all the way from software to hardware to services, very unique.

Muller: I’ve got to summarize this by saying that the great part about HAVEn is that you can pretty much answer any question you could think of. The challenge is whether you can think of smart questions to ask.

Gardner: I think that’s exactly the position that businesses want to be in -- to be able to think about what the questions are to then propel their businesses forward.

Selland: Let me give you a tangible example that I was reading about not long ago in The Wall Street Journal. They were talking about how the airline industry is starting to pay attention to social media. Paul talked before about intersections. What do we mean by intersections?

This article in The Wall Street Journal was talking about how airlines are starting to pay attention to social media, because customers are tweeting when they're stuck at the airport. My flight is delayed, and I am upset. I'm going to be late to go visit my grandmother -- or something like that.

So somebody tweets. Paul tweets "I'm stuck at the airport, my flight is delayed and I am going to be late to grandma’s house." What can you really do about that besides respond back and say, "Oh, I'm sorry. Maybe I can offer you a discount next time," or something like that? But it doesn’t do anything to solve the problem.

Think about the airline industry, customer loyalty programs or frequent-flyer programs. Frequent-flyer programs were among the first customer loyalty programs. They have all this traditional data as well, which some might call customer relationship management (CRM). In the airline industry, they call it reservation systems.

I gave the example before about a jet engine throwing off two terabytes of data per hour. By the way, on any flight that I'm on, I want that to be pretty boring data that just says all systems are go, because that’s what you want.

At the same time, you don’t want to throw it away, because what if there are blips, or what if there are trends? What if I can figure out a way to use that to do a better job of doing predictive maintenance on my jets?

Better job

By doing a better job of predictive maintenance on my jets, I keep my flights on time. By keeping my flights on time, then I do a better job of keeping my customers satisfied. By keeping my customers more satisfied, I keep them more loyal. By keeping my customers more loyal, I make more money.

So all of this stuff starts to come together. You think about the fact there is a relationship between these two terabytes per hour of sensor data that’s coming off the sensors on the engine, and the upset customers, and social media tweeting in the airport. But if you look at the stuff in a stove-piped fashion, we don’t get any of that.
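[Editor's sketch: the chain described above depends on being able to line up sources that normally live in separate silos. The following is a minimal Python illustration under stated assumptions. Every record is made up, the flight identifiers are placeholders, and `link_by_flight` is a hypothetical helper, not a HAVEn API. It also assumes each complaint has already been matched to a flight, which in practice would be a hard problem of its own.]

```python
from collections import defaultdict

# All records below are invented for illustration, not real airline data.
engine_alerts = [
    {"flight": "XX100", "alert": "vibration trend"},
]
flight_delays = [
    {"flight": "XX100", "delay_minutes": 95},
    {"flight": "XX200", "delay_minutes": 0},
]
complaint_tweets = [
    {"flight": "XX100", "text": "Stuck at the airport, late to grandma's house."},
    {"flight": "XX100", "text": "Delayed again."},
]


def link_by_flight(alerts, delays, tweets):
    """Combine the three sources into one record per flight."""
    combined = defaultdict(
        lambda: {"alerts": [], "delay_minutes": 0, "complaints": 0}
    )
    for alert in alerts:
        combined[alert["flight"]]["alerts"].append(alert["alert"])
    for delay in delays:
        combined[delay["flight"]]["delay_minutes"] = delay["delay_minutes"]
    for tweet in tweets:
        combined[tweet["flight"]]["complaints"] += 1
    return dict(combined)


view = link_by_flight(engine_alerts, flight_delays, complaint_tweets)
```

[Looked at one source at a time, none of these lists says much. Keyed on the same flight, the engine alert, the delay, and the complaints appear side by side, which is the non-stove-piped view the example is arguing for.]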

That’s just one example, and I use that example, because most of us are businesspeople and get stuck in airports from time-to-time. We can all relate to it, but there’s a variant of that kind of example in any and every industry.

How do we start to bring this stuff together? This stuff does not sit in a single database and it’s not a single type of structure and it’s coming in all over the place. How do I make sense of it?

As Paul said very well, ask smart questions, figure out the big picture, and ultimately make my organization more successful, more competitive, and really get to the results I want to get to. But really, it’s a much, much bigger set of questions than just "My database is getting really big. Yesterday, I had this many terabytes and I am adding more terabytes a day." It’s a lot bigger than that.

We need to think bigger and you need to work with an organization that has the breadth of resources and the breadth not just inside the organization but within our partnerships to be able to do that. HP has got the unmatched capability to do that, in my view, and that’s why this HAVEn initiative is so very exciting and why we have such great expectations from this.

Gardner: What really jumped out at me in listening to the announcements was that so often in technology we get products and services that allow us to do things faster, better, cheaper, all of which is very important. But what’s quite new here, and different with HAVEn, is that we're able to now start enabling organizations to do things they simply could not have done before or in any other way.

It’s really opening up to me a new chapter in business services enablement, both internal services and external benefits and external services. So last word to each of you, quickly, on why this HAVEn announcement is something that’s unique and is really more than just a technology announcement. Let’s start quickly with you, Tom Norton.

Norton: I think it’s interesting, because we just talked before about integration. Customers with data as complex as it can be, you need models. HAVEn gives us that platform model, which is scalable, flexible, secure, and integrated. It's what the customers need to be able to react quickly, what IT needs to be able to stay relevant, and what the business needs to know they are going to have a predictable and responsive platform that they can base their analytics on. It’s an answer to a very difficult question and very impactful.

Gardner: Paul Muller, why does this go beyond the faster, better, cheaper variety of announcements?

Fundamental difference

Muller: It’s the ability to bring together a set of technologies that allow you to look at all the data all of the time in real-time. I think that that’s the fundamental difference. As I said, shifting the discussion from why can’t we do it to what do we need to do next is an exciting possibility.

Gardner: Last word to you, Chris Selland, why is this going beyond repaving cow paths and charting new territory?

Selland: I just gave a long answer. So I'll give a short one. It’s really about the future, the competitiveness of the business, and IT becoming an enabler for that. It’s about the CIO, really having a chance to play a key role in driving the strategy of the business, and that’s what all CIOs want to do.

We have these inflection points in the marketplace. The last one was like 12 years ago, when the whole e-business thing came along. And, while I just used a competitor's tag line, it changed everything. The web did change everything. It forced businesses to adapt, but it also enabled a lot of businesses to change how they do business, and they did.

Now, we're at another one, a very critical inflection point. It really does change everything, and there is still some skepticism out there. Is this big-data thing real? We think it’s very real and we think you're going to see more-and-more examples. We're working with customers today and showing some of those examples of how it really does change everything.

Gardner: Great. I am afraid we'll have to leave it there. We've been exploring the vision and implications of the HAVEn news that’s been delivered here at Discover and we are learning more about HP strategy for businesses to gain actionable intelligence from a universe of sources and data types. So if you want more information on HAVEn, you can find it online by searching under HP Discover 2013 or HP HAVEn.

I'd like to now wrap up by thanking our co-host, Chief Evangelist at HP Software, Paul Muller. Thanks again so much, Paul.

Muller: It’s not the size; it’s how you use it, when it comes to big data, mate.

Gardner: Also a big thank you to Chris Selland, Vice President of Marketing at HP Vertica. Thank you, Chris.

Selland: It’s great to be here, thanks.

Gardner: And lastly, a thank you to Tom Norton, Vice President of Big Data Technology Services at HP. Thank you, Tom.

Norton: Thank you very much, Dana; it’s been a pleasure.

Gardner: Great. And also of course the biggest thank you to our audience for joining us for this special HP Discover Performance podcast coming to you from the HP Discover 2013 Conference in Las Vegas.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions.

Thanks again for listening and come back next time.

Listen to the podcast. Find it on iTunes. Download the transcript. Sponsor: HP.

Transcript of a BriefingsDirect podcast on how HP's new HAVEn Initiative puts the power of big data in the hands of companies. Copyright Interarbor Solutions, LLC, 2005-2013. All rights reserved.

You may also be interested in: