Transcript
of a BriefingsDirect podcast on how HP Vertica is evolving to meet the
needs of enterprises as data continues to grow.
Listen to the podcast. Find it on iTunes. Download the transcript. Sponsor: HP.
Dana Gardner: Hello, and welcome to the next edition of the
HP Discover Performance Podcast Series. I'm
Dana Gardner, Principal Analyst at
Interarbor Solutions, your moderator for this ongoing discussion of
IT innovation and how it’s making an impact on people’s lives.
Once again, we’re focusing on how IT leaders are
improving their business performance for better access,
use and analysis of their data and information. This time we’re coming to you directly
from the
HP Vertica Big Data Conference in Boston and we're delighted to welcome the General Manager of HP Vertica to his debut on BriefingsDirect.
Please join me in welcoming
Colin Mahony, General Manager at
HP Vertica. Good to have you with us, Colin. [Follow
Colin on Twitter.] [Disclosure:
HP is a sponsor of
BriefingsDirect podcasts.]
Colin Mahony: Thanks, Dana. It’s great to be here. I appreciate you having me.
Gardner:
Well, it's been well over two years since
HP acquired Vertica and, as
we begin the inaugural 2013 Big Data Conference, how would you best
characterize how Vertica has evolved since its
founding back in 2005?
Mahony:
Oh, wow. We’ve evolved quite a bit. It’s been a busy couple of years
here, certainly post the acquisition. But I think at a high level, we’ve
really shifted and expanded from being an
MPP column store, very narrowly-focused database company, really into an
analytic platform company.
With
that comes several developments, obviously
on the product side, but
also as an organization, going through that maturation in terms of being
able to operate at a global scale across the spectrum of what you would
expect an analytics provider to offer.
Gardner:
And how do you characterize the difference between a store and a
platform? Are there many ecosystem players or is this an organic
evolution of your capabilities or both?
Mahony: It’s both,
the ecosystem and the tools that you interact with. And of course, we support a very rich and vibrant ecosystem of
business-intelligence (BI) tools,
extract, transform and load (ETL) tools, and other types of management tools. Not just the ecosystem around it, but also looking within our own products.
So it's adding a lot of the capabilities like
backup and recovery, additional analytics capabilities beyond just standard
SQL with the
SDKs
that Vertica supports, the ability to run both the procedural and the
other types of code within the product, being able to express things
like
MapReduce beyond what a traditional database system would do.
Since
the founding of the company, we've tried to take the best part of the
database world and the best parts of the SQL world, but address the most
challenging issues that traditional databases have had. So whether it
is scalability or it’s being able to run things beyond SQL or it’s just
the performance, those are all the things that we have taken into
account while we built Vertica, and I think we have always been on the
fast track to a platform.
We knew it would be a
journey and we knew that building a product and a platform from the
bottom up is not an easy thing, but we also knew that once we got there,
once we sort of crossed that chasm, if you will, then all those
decisions that we made in the beginning about this product and building an
engine from the bottom up would pay off.
Platform modularity
For
probably the last year, that's where we’ve been. Right now, we're
seeing that it’s easy to add functionality to the platform because of
the modularity of the platform, and we can add that functionality
without giving up any of the performance.

For me, it’s probably the most exciting time. Being part of
HP
offers us so many things that
make it a lot easier to become a platform, not only on the development side, but a much greater
ecosystem, a global scale, being able to support customers globally
24/7.
Gardner: This is a large conference. I'm
pretty impressed with the attendance, but for our audience, this might
be
an introduction. Tell our listeners and readers a bit
more about yourself and your background?
Mahony: I've been with Vertica since the beginning. In fact, long before Vertica, my background has always been
databases.
I've always loved computer science, and had a minor in computer science
in my undergraduate degree. In my first job out of school, I was taking
databases -- it's one of our competitors now, so I won't name them --
but I was using their database, and working with civilian US Government
clients, and getting a lot of information published up to the web in the
earliest days of the web.
I had a couple of other
roles, but they were always very technology focused. Then I got my MBA
on the business side and went into venture capital for seven years.
That's where I met
Mike Stonebraker, the founder of Vertica.
I just loved the idea, everything I
knew about databases and the challenges of traditional database and
everything I knew about the new world order of information -- at the
time we didn’t even talk about the term
big data -- it just seemed to align really well.
So
I decided to leave the dark side of venture capital and I jumped into
something that I have been incredibly passionate about. If you look at
that lifecycle even my own background with Vertica and where we’ve come,
it’s just been great. The timing was great and, as always, it takes a
lot more than just great technology and great people.
There
is definitely a lot of luck and timing, and I had the fortune of
stepping into the right market at the right time, being part of a great
team, and learning from a lot of great people along the way.
This
is
our first user conference. It’s ironic that we've never had one
before, but I think also this is a testament to that scale I was
referring to with what HP can bring. We have wanted a user conference
since the beginning. Obviously, it takes some critical mass to get there
which we now have, but also it takes the support of an organization
that knows how to do these conferences and understands the value of them.
So it's just wonderful to be here. It’s wonderful to
see all of these partners, customers, employees and friends of Vertica
and HP here in Boston, of course Vertica’s hometown, so truly exciting.
Gardner:
You mentioned the marketplace and the timing. I have to go back to that
because in 2005, while scale and performance were very important, this
whole notion of big data being so prevalent in the market really hadn't
happened yet. What’s the
state of the union, if you will, with this
marketplace? Do more and more IT functions and business functions begin
and end with big data? It seems to be at the center of so many things.
Exponential growth
Mahony:
It is. To go back to the founding of Vertica, I remember when Mike
Stonebraker was giving the early presentations on the need for it. He
talked a lot about
the exponential growth of data and how that was
outpacing any laws like
Moore’s law
or other hardware laws. So much information was being created, there
was no way that just using more parallelized hardware was going to be able
to address the issue.
The state of the union back then
was, just as you said, there was no such thing as big data, but I think
Mike, as a visionary, knew what was going to happen in the industry. And it has happened.
It wasn’t a long time ago, but I remember that I was trying to find our first sample
dataset that was over a
terabyte
and we had a difficult time finding it. When we would talk to the early
customers, they looked at us like we were crazy when we were asking
about a terabyte.
We have an easy time now finding
terabytes of data. The state of the union today is that what's driving
so much around big data is that you have obviously the
volume, variety, and velocity that we talk about often, but what's really driving those three things is human information, whether it's
social media,
tweets, or expressive content that’s just so prevalent right now, as well as machine information.
If you look at the traditional
structured database
market by any number, it’s a small percentage of the amount of data
that’s out there. The strength of Vertica, and really the strength of HP
overall, is that we have the best assets for the
unstructured human information in Autonomy, as well as the best assets when it comes to machine information and large data.
That
has some structure. It’s semi-structured information, but it’s not your
traditional transaction system. The power of all of that data comes
together when you can have an engine that applies some structure to it
and then is able to deliver the analytics that the organization needs.
It's both IT as well as line of business, and even this new category we
often talk about, which is the
data scientist.
One of the great things about this show here is that we’ve got
Billy Beane of
Moneyball
fame as our keynote speaker. The reason that we wanted Billy to come
speak here is that Moneyball is exactly what’s happening right now in
the world when it comes to big data.
You have the data
scientist or the statistician, you have the line of business folks, and
you have IT. They all have a part to play in the success of how
information is used in companies. By bringing them together and by
making the software that much easier for them to come together and solve
these problems, you can create very real and differentiated value
within the organization.
So Moneyball is exactly what’s
happening, certainly in corporate America, but also in government and in
many other institutions that want to leverage information to be more
efficient and create a competitive advantage.
Gardner:
Before we delve into the latest and greatest with Vertica, let’s put
some context around this. It’s only been a few months since the
HP Discover 2013 Conference in Las Vegas where the
HAVEn Initiative
was announced. This puts Vertica in a very prominent place among other
HP properties, technologies, platforms and approaches to solving this
big data issue. Recap for us, if you would, what HAVEn is and why
Vertica formed such an important pillar for this larger HP initiative?
Big-data lake
Mahony:
What companies are looking for is this notion of the big-data lake. To
me, it can mean many different things, but at the end of the day,
companies want to take all the information assets that they have and
they want to put them into a safe place, but a place where access to
that information can be used by many different constituencies, whether
it's IT, line of business, or data scientists.
So the
notion of having a safe place, a harbor, or a port is what we announced
as HP HAVEn, which is
HP’s big data platform. It is primarily for
analytics, but it can be used for just about anything when it comes to
information and data.
What's so important about
information right now is that there are different constituencies in the
companies that want to take the information. First of all they want to
capture all the information, not just structured, not just unstructured,
but 100 percent of their information.
They want to get
it to a place where they can leverage it and use it for a lot of
different use cases, but the first part is get that information into the
right place. For us, that is one of three components of HAVEn, which is
the connectors.
We have over 700 connectors as part of HAVEn coming from
Autonomy, coming from our
Enterprise Security Group,
the ArcSight core Logger and those connectors. That can be human
information, extreme log information, or traditional database structured
information.
Step
one is the connectors to get these components. Step two is to put that
data into the best engine for that data. Vertica obviously is one
component, but you also have the
Autonomy IDOL Engine, you have the
ArcSight Logger engine, and also open-source technologies like
Hadoop, which is actually the H in HAVEn. So we’ve got a place to put the information.
Step three is any
N number of applications. What I'm seeing happening in the industry right now is just like we went from
mainframe to
client-server, and client-server to
LAN,
we're in a period now where applications are being developed. They're
certainly web-based and distributed, but they're also analytical in
nature.
They're driven by vast volumes of information
and they close the loop, meaning that the experiences that are happening
with an application, if you're driving a car, or whatever it might be,
information is being passed, closed loop, back to a system that can then
optimize the experience. That is creating a new class of applications.
For
that new class of applications, you need the platform to be able to
drive those. What we're bringing together in HAVEn is Hadoop, Autonomy,
Vertica, Enterprise Security, core assets, and the N number of
applications.
At Discover, we announced some of our
own internal applications, which are powered by the HAVEn platforms. We
announced our
HP Analytics offering, which is built using Hadoop,
Vertica, Enterprise Security, and Autonomy assets.
About community
We're
making some of our own applications, but this is about the community
and getting people to be able to build a new set of applications that can
use these components to really change how people are interacting with
their data.
That’s HAVEn, and I am always careful to
point out to people that HAVEn itself is not a product, but it's a
platform and it’s a broader platform than the one that is just Vertica,
Autonomy, or Enterprise Security. It’s a platform where 1+1+1+1+1,
instead of equaling 5, should equal 8 or 10 or 12, and that's the goal.
Of course, it's also a roadmap into areas that each of these components
are working on to bring those closer together. So it’s exciting.
Gardner:
Let’s look a bit more specifically at Vertica and try to factor why
it’s differentiated in the market, but then also get a sense of where
it’s going.
One of the things that strikes me about
the market nowadays is that there seems to be a sense of tradeoffs going
on when organizations are trying to pick their data engine or their
platform. They have a set of value on one side, but it’s opposed by
value on the other. They can’t have everything. One size does not fit
all.
So how are you at Vertica able to help people
deal with these tradeoffs that they're facing when it comes to a
next-generation data platform?
Mahony:
Before I explain the tradeoffs, I couldn’t agree with you more, Dana.
In fact, Vertica was founded on the premise that one size does not fit
all. Using a single
OLTP transactional database to do everything, including analytics, just doesn't make a lot of sense.
If
you think about the areas that the people have to trade off, usually
it’s scale for performance or analytics functionality for performance.
One of the things that I've spent a lot of time looking at, especially
over the last couple of years, is some of the alternative
platforms, not just for analytics, but for all of the different data
needs.
You can take something like Hadoop as an
example. Hadoop really is a distributed file system and has capabilities
to run rudimentary analytics and transform processed data. But I think
what people love about Hadoop is that it's really easy to load data into
Hadoop. You don't have to define the
schema or anything.
Instead
of schema on write or load time, it’s schema on read time. People like
that. They also like at least the perception that it is free and the
scalability of it. On the database side, what people love about the
database is that you're going to get really good performance, because
the data is structured. If you're using a
NexGen MPP platform like Vertica, you'll get the performance and the scalability.
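[Editor's note: The contrast between schema on write and schema on read can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Hadoop or Vertica is implemented. The sample records, the SCHEMA mapping, and every function name below are hypothetical.]

```python
import json

# Hypothetical sample records; the field names are illustrative only.
RAW_LINES = [
    '{"user": "a", "clicks": "3"}',
    '{"user": "b", "clicks": "oops"}',
]

# Assumed schema for this sketch: column name -> type constructor.
SCHEMA = {"user": str, "clicks": int}


def load_schema_on_write(lines):
    """Validate and type every record at load time; bad rows are rejected up front."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            table.append({name: cast(record[name]) for name, cast in SCHEMA.items()})
        except (KeyError, ValueError):
            rejected.append(line)
    return table, rejected


def load_schema_on_read(lines):
    """Store the raw text untouched; loading needs no schema and never rejects a row."""
    return list(lines)


def total_clicks_on_read(store):
    """Apply the schema only when a query runs; bad rows surface here instead."""
    total = 0
    for line in store:
        try:
            total += int(json.loads(line)["clicks"])
        except (KeyError, ValueError):
            continue  # malformed row is skipped at query time
    return total
```

[In this sketch the schema-on-write loader pays the parsing and validation cost once, so later queries work on typed rows. The schema-on-read loader accepts anything immediately and defers that cost, and the error handling, to every query.]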
So
what we’re trying to do and what we've always done a pretty good job of
at Vertica is look at the things that would make sense for Vertica to
do. We look at expanding the platform in ways that, number one, we have
the expertise and the capability to do, not only from the development
standpoint, but from the support standpoint. And number two, we have the
ability to create something differentiated. If we don't, or it’s not
core, then we won’t do it, sticking to the purity of one size doesn’t
fit all.
Hadoop-like
We've
been doing a lot of work in areas like making it easier to get the data
into the platform, doing more with it, making it seem much more like a
Hadoop-like environment. You can look at our past releases and see that
there's been a lot of work done on that and we continue to make those
investments.
One thing has been consistent at Vertica
since the beginning. What we focus on is to make it really easy for
people to get information onto the platform. Then, we make sure we
continue to deliver new capabilities, performance, and functionality
within the platform.
We make sure we’re enabling our
customers and partners to deploy Vertica anywhere and everywhere,
whether it’s cloud appliances, software, or the like. Those are the
three tenets of the company. It’s all around this notion of making data
matter and helping people make better decisions that lead to better
outcomes with superior information.
There's so much
that can be done in this space, but I think the key for us is to focus
on the things that we know we do really well. The good news is that it's
such a large space with so many demands that we know we can make a huge
impact without trying to take on the world. We know we can make a huge
impact in what we’re doing.
I think you'll continue to
see some interesting developments along the lines of what I'm
describing, and it's very much in line with where we've been.
Gardner:
While we're at the user conference, there are some great use cases and
some examples. It's one of my favorite points of communication that it's
always better to show than to tell.
Of the various
user organizations and use cases here, are there any that jump out at
you personally when you think about what Vertica started out as and what
it became? Are there any ways that some users are putting this to work
to really capture, "This is what we intended, and this is what we went
through those paces to allow, to encourage, and to now see the fruits
of?"
So, from all of the happenings here with the conference, what sort of gets your blood flowing?
Mahony:
One thing I've certainly noticed over the years with our customers is
that the shiny object of why a customer chooses Vertica may look very
different across our customers. For some, it's the price. For some, it's
the performance and the scale, massive volumes. For some it's a
particular analytic function or several pattern matching capabilities.
And for others, it's something entirely different.
But
what's so exciting, especially about this conference, is that no matter
what on-ramp they take, they tend to find a lot of the other
capabilities once they get on. Hopefully, here at the conference, we're
going to accelerate some of that just by getting our customers and our
partners together in an environment where they can share stories.
Partners and customers
In
fact, if you look at the agenda for the conference, it's very light on
Vertica presentations. It's very heavy on
partner and customer presentations, because this is the time that we want our partners and
our customers to learn from each other. We want them to talk about how
they are using it.
To answer your question directly,
what gets me most jazzed up is when a customer is taking advantage of
nearly everything that we do. Again, it's a cycle. It's not something
that can happen immediately.
There are so many
customers here that have been with us for four or five years and have
just been great partners for the Vertica organization in terms of the
features we are developing and the direction that we are taking the
product. They tend to be the ones who are using just about every feature
in the product. So it gets me really excited.
I have
got a customer that's got massive volumes of information, a lot of
diversity in the information, many different lines of business
constituents who are accessing the information, data scientists, DBAs,
programmers, different people who are creating applications and keeping
the system up and through all that change in the organization.
Sometimes
it's not only change in the organization, but potentially change in the
industry and changing the way that people are interacting with data and
maybe changing healthcare outcomes, or drastically improving the
quality of mobile phone service or other types of services.
So
there isn't any one customer of whom I'd say, "You have to go see these
guys." The reality is that you should see all of our customers and hear
what they have to say. For me, that's the most important part of this
conference.
It is about the connection between our
customers and our partners, so that they can talk to each other. We can
just be a fly on the wall and listen to some of the things that they're
saying, good, bad, or ugly -- hopefully very good. But we can even hear
things that they want us to improve. That's an important part of any
company, certainly a software company, and that's what we're hoping to
get out of it. For our customers and partners, they're going to get a
lot out of this just by talking to each other.
Gardner:
Colin, what about the notion of business transformation? We've been
hearing about this for 30 years. It's been a big part of the academic work
in business schools.
Process re-engineering has evolved into
balanced scorecards, and the flavor of the day is about how to change the nature
of companies.
But it strikes me that this whole greater
than the sum of the parts that you alluded to earlier, where data and
analytics is made more available across easier applications to morph
that, is inside the company that can then access more types of
information across the boundaries of the organization into supply chain
and ecosystems.
Getting more detailed information in
real time about the customers and the marketplace probably has as much
or more of an opportunity to transform businesses than just about
anything else that's happened, with the possible exception of the
Internet itself, over the past 20 years.
More than technology
So
without going too much into a
hype curve, the interest of the
incredible amount of attention paid to big data in the past few years is
about more than the technology. It's really about an empirical
data-driven approach, a cultural shift if you will, within businesses.
How have you been seeing that manifest itself here at the conference?
Mahony:
It's an enormous opportunity for business transformation and definitely
the whole is greater than the sum of the parts. What makes companies
really successful with information is not trying to boil the ocean, not
trying to do a traditional enterprise
data warehouse project that's going to take 24 months, if you're lucky, 36 most likely.
They’ll
end up with some monolithic inflexible platform that will probably be
outdated by the time it gets deployed. What is making a lot of companies
successful is they find a particular use, they find a problem area that
they want to drill down on, and they mobilize to do it.
For
that, they need a solution that is quickly deployed, but also has that
capability to become something much larger. Whether it's Vertica,
Talend,
or any of the other portfolios that we offer, we strive to make sure
that somebody can get up and running quickly, whether it's Autonomy and
human information analytics, Vertica and machine data or other types of
transactional structured data.
The most important thing
is that you find that business case, you focus on it, and prove very
quickly. There's something we refer to as “Time to Terabyte,” which is
less than a month, typically for Vertica. You get a
return on investment (ROI)
in less than a month for the investments that you made. If you prove
that out, then everybody in the organization is happy, the line of
business, the technology folks in IT, even the statisticians, data
scientists.
From
there, you start expanding the project, and that's exactly how we win
most of our customers. We very rarely go in and say, "Buy an enterprise
license for our product across the company." We certainly do those, but
more typically we get into a business unit, we find the acute pain, and
we solve that problem.
What they're betting on is the
ability for us to expand and for them to expand in this platform. That's
why we are, on the one hand, all about the platform and the
integration, but on the other hand, not about to lose the flexibility
and the modularity of what we do, because that's also a huge
differentiator for HP's portfolio.
I think that this is
a wonderful time in the world of business transformation, and I think,
unlike what has been talked about for the last 30 years, you now have
the data that can back it up and prove it in real-time to the
organization.
That's the big difference. You gave the
balanced scorecard as an example. If you look at the balanced scorecard
methodology, you can take that methodology and drill down into a
thousand fields of detail and be able to get that information in real
time. That's the opportunity here, and that's I think why this market is
so huge.
It's not just about faster speeds and feeds.
It's about fundamentally stepping back and asking how we're running
this business. What assets, especially information assets, do we have
that could dramatically boost the productivity to the same extent that
computers, when they were first introduced, boosted productivity. That's
the goal that everybody is looking for when it comes to information.
Cloud and hybrid
Gardner:
For our last item today, I wonder if we could take out our crystal ball
apparatus and try to do a little blue-sky thinking. One of the other
big trends these days of course is
cloud computing
and
hybrid models for the distribution of workloads for applications,
but also for data. I'm wondering, as we go down this journey over the
next year or two, how do big data and cloud computing come together?
There's this notion of an
analytics platform-as-a-service (PaaS)
deploy for developers, but now maybe more for data scientists and for
those that are doing BI and other analytic chores. How do you foresee
some of this whole greater than the sum of the parts extending beyond
the technical capabilities into the deployment models, and what does that
portend for additional paybacks or payoffs?
Mahony:
As I mentioned in terms of the three things that we are focused on,
number one is make it easy to get data into the platform. Number two is
do a lot more with the platform, so that there are better analytic
capabilities, better pattern matching, and better analytics packs on top
of it.
Number three is make sure you can
deploy Vertica everywhere, and in the everywhere and anywhere categories, the
cloud is certainly the first name that comes to mind. That is absolutely
the future of computing. In some ways, I guess, it's the past, but it's
interesting how the past repeats itself.
We do run Vertica on hosted environments like
Amazon cloud. We're in a private beta on the
HP Cloud Service. So there are definitely offerings and developments that have been underway here at Vertica for a while.
We
embrace that, and to us, it's not mutually exclusive. What you
described is the hybrid environment, where you can run certain things
locally and burst up to the cloud to do other workloads, especially
if you're looking to pull some quick processing power and storage.
That's going to be the future and that's the way, just like any other
utilities, that we're going to consume some of these capabilities.
This
is one of the strengths of a company the size and scale of HP. We have
these offerings, whether it's software only, appliance, or cloud. We
have the ability to deliver however the customer wants it, and we can
also provide not only the flexible technologies, but the flexible
business capabilities to make that happen with a lot of ease.
It's
an exciting time. If you look at the pillars of HP, we have cloud,
mobility, big data, and security. All four of those pillars tie well
into one another, because they're all related. Of course, all these
activities that are happening up on the cloud are generating a lot of
information, information that will be analyzed, I'm sure, in many
different ways.
So it's something that kind of feeds on
itself, the same way mobility does. All of that is a good thing for
the analytic space, wherever it is. The final thing I would say is
that the most important thing about analytics is that you do want it
embedded into the various applications, just like when you are driving a
car, you just want the GPS system to tell you where you are going.
Analytics
is the same. You want it within the context of whatever it is that you
are doing. Given that so many things are going to be served off the
cloud, it's natural that that's the place that will host some of the
analytics as well.
So it's an incredibly exciting time,
and we're looking forward to having many more of these User Conferences
and are certainly going to enjoy the rest of the show this week.
Gardner:
Well great. I'm afraid we will have to leave it there. We've been
learning more about the ongoing evolution of the HP Vertica platform and
its capabilities, and we've developed a better understanding of
Vertica's growing role in making some of the most challenging big data
analytic chores more successful and impactful.
So, join me in extending a huge thank you to our special guest Colin Mahony, General Manager at HP Vertica. Thanks so much.
Mahony: Thank you, Dana.
Gardner:
And also thank you to our audience for joining us for this special HP
Discover Performance podcast, coming to you from the HP Vertica Big Data
Conference in Boston.
I'm Dana Gardner, Principal
Analyst at Interarbor Solutions; your host for this ongoing series of HP
sponsored discussions. Thanks again for listening and come back next
time.
Copyright Interarbor Solutions, LLC, 2005-2013. All rights reserved.