Transcript
of a BriefingsDirect discussion on how a consumer research and data
analysis firm gleans rich marketing data from customers' shared sales receipts.
Sponsor: HP.
Dana Gardner: Hello, and welcome to the next edition of the
HP Discover Podcast Series. I'm
Dana Gardner, Principal Analyst at
Interarbor Solutions,
your host and moderator for this ongoing sponsored discussion on IT
innovation and how it’s making an impact on people's lives.
Our next big data innovation case study interview highlights how
InfoScout
in San Francisco gleans new levels of accurate insights into retail
buyer behavior by collecting data directly from consumers’ sales
receipts.
In order to better analyze actual retail
behaviors and patterns, InfoScout provides incentives for buyers to
share their receipts, but InfoScout is then faced with the daunting task
of managing and cleansing that essential data to provide actionable and
understandable insights.
To learn more about how big
-- and even messy -- data can be harnessed for near real time business
analysis benefits, please join me in welcoming our guests,
Tibor Mozes, Senior Vice President of Data Engineering at InfoScout. Welcome, Tibor.
Tibor Mozes: Good morning. Thanks for having us.
Gardner: I'm glad you're with us. We're also joined today by
Jared Schrieber, the Co-founder and CEO at InfoScout, based in San Francisco. Welcome, Jared.
Jared Schrieber: Glad to be here.
Gardner:
Jared, let’s start with you. We don’t often get the option of choosing
how the best data comes to us. In your business, you've been able to
uniquely capture strong data, but you need to treat it a lot to use it
and you also need a lot of that data in order to get good trend
analysis. So the payback is that you get far better information on
essential buyer behaviors, but you need a lot of technology to
accomplish that.
Tell us why you wanted to get to this specific kind of data and then your novel way of acquiring it, please.
Consumer panels
Schrieber: A quick history lesson is in order. In the market research industry,
consumer purchase panels
have been around for about 50 years. They started with diaries in
people’s homes, where they had to write down exactly every single
product that they bought, day-in day-out, in this paper diary and mail
it in once a month.
About 20 years ago, with the advent of modems in people’s homes, leading research firms like
Nielsen
would send a custom barcode scanner into people’s homes and ask them to
scan each product they bought and then thumb into the custom scanner
the regular price, the sales price, any coupons or deals that they got,
and details about the overall shopping trip, and then transfer that
electronically. That approach has not changed in the last 20 years.
With the advent of
smartphones and
mobile apps,
we saw a totally new way to capture this information from consumers
that would revolutionize how and why somebody would be willing to share
their purchase information with a market research company.
Gardner:
Interesting. What is it about mobile that is so different from the
past, and why does that provide more quality data for your purposes?
Schrieber: There are two reasons in particular. The first is, instead of having consumers scan the
barcode
of each and every item they purchase and thumb in the pricing details,
we're able to simply have them snap a picture of their shopping receipt.
So instead of spending 20 minutes after a grocery shopping trip
scanning every item and thumbing in the details, it now takes 15 seconds
to simply open the app, snap a picture of the shopping receipt, and be
done.
The
second reason is why somebody would be willing to participate. Using
smartphone apps we can create different experiences for different kinds
of people with different reward structures that will incentivize them to
do this activity.
For example, our
Shoparoo app is a next-generation school fundraiser akin to
Box Tops for Education.
It allows people to shop anywhere, buy anything, take a picture of
their receipt, and then we make an instant donation to their kid’s
school every time.
Another app is more of a
Tamagotchi game called
Receipt Hog,
where if you download the app, you have adopted a virtual runt. You
feed it pictures of your receipt and it levels-up into a fat and happy
hog, earning coins in a piggy bank along the way that you can then
cash-out from at the end of the day.
These kinds of
experiences are a lot more intrinsically and extrinsically rewarding to
the panelists and have allowed us to grow a panel that’s many times
larger than the next largest panel ever seen in the world, tracking
consumer purchases on a day-in day-out basis.
Gardner:
What is it that you can get from these new input approaches and
incentivization through an app interface? Can you provide me some sort
of measurement of an improved or increased amount of participation
rates? How has this worked out?
Leaps and bounds
Schrieber:
It's been phenomenal. In fact, our panel is still growing by leaps and
bounds. We now have 200,000 people sharing with us their purchases on a
day-in day-out basis. We capture 150,000 shopping trips a day. The next
largest panel in America captures just 10,000 shopping trips a day.
In addition to the shopping trip data, we're capturing
geolocation information,
Facebook
likes and interests from these people, demographic information, and
more and more data associated with their mobile device and the email
accounts that are connected to it.
Gardner: So yet another unanticipated consequence of the mobility trend that’s so important today.
Tibor,
let’s go to you. The good news is that Jared has acquired this trove of
information for you. The bad news is that now you have to make sense of
it. It’s coming in, in some interesting ways, as almost a picture or an
image in some cases, and at a great volume. So you have velocity,
variability, and volume. So what does that mean for you as the Vice
President of Data Engineering?
Mozes: Obviously
this is a growing panel. It’s creating a growing volume of data that
has created a massive data pipeline challenge for us over the years, and
we had to engineer the pipeline so that it is capable of processing this
incoming data as quickly as possible.
As
you can imagine, our data pipeline has gone through an evolution. We
started out with a simple solution at the beginning with
MySQL and then we evolved it using Elastic
MapReduce and
Hive.
But
we felt that we wanted to create a data pipeline that’s much faster, so
we can bring data to our customers much faster. That’s how we arrived
at Vertica. We looked at different solutions and found
Vertica a very suitable product for us, and that’s what we're using today.
Gardner:
Walk me through the process, Tibor. How does this information come in,
how do you gather it, and where does the data go? I understand you're
using the HP Vertica platform as a cloud solution in the
Amazon Web Services Cloud. Walk me through the process for the data lifecycle, if you will.
Mozes:
We use AWS for all of our production infrastructure. Our users, as
Jared mentioned, typically download one of our several apps, and after
they complete a receipt scan from their grocery purchases, that receipt
is immediately uploaded to our back-end infrastructure.
We try to
OCR that image of the receipt, and if we can’t, we use
Amazon Mechanical Turk
to try to make sense of the image and turn that image into text. At the
end of the day, when an image is processed, we have a fairly clean
version of that receipt in a text format.
Next phase
In
the next phase, we have to process the text and try to attribute
various items on the receipt and make the data available in our Vertica
data warehouse.
Then, our customers, using a
business intelligence (BI)
platform that we built especially for them, can analyze the data. The
BI platform connects to Vertica, so our customers can analyze various
metrics of our users and their shopping behavior.
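The flow Mozes describes (try OCR, fall back to human transcription, then turn the cleaned text into rows for the warehouse) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names, the `ocr` and `human_transcribe` callables, and the "DESCRIPTION PRICE" line layout are hypothetical and are not InfoScout's actual code or any real OCR or Mechanical Turk API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReceiptText:
    receipt_id: str
    text: str
    source: str  # "ocr" or "human" (hypothetical labels)


def transcribe_receipt(
    receipt_id: str,
    image: bytes,
    ocr: Callable[[bytes], Optional[str]],
    human_transcribe: Callable[[bytes], str],
) -> ReceiptText:
    """Try OCR first; fall back to human transcription if OCR yields nothing."""
    text = ocr(image)
    if text and text.strip():
        return ReceiptText(receipt_id, text.strip(), "ocr")
    return ReceiptText(receipt_id, human_transcribe(image).strip(), "human")


def parse_line_items(receipt: ReceiptText) -> list[dict]:
    """Turn cleaned receipt text into rows ready to load into a warehouse.

    Assumes each item line ends with a price, e.g. "DT COKE 12PK 4.99";
    lines that do not fit that shape are skipped.
    """
    rows = []
    for line in receipt.text.splitlines():
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2:
            continue
        description, price = parts
        try:
            amount = float(price)
        except ValueError:
            continue  # not an item line (header, footer, and so on)
        rows.append(
            {"receipt_id": receipt.receipt_id, "description": description, "price": amount}
        )
    return rows
```

Passing the OCR and human steps in as callables keeps the sketch independent of any particular service; a real pipeline would also need confidence thresholds, retries, and retailer-specific parsing.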
Gardner:
Jared, back to you. There's an awful lot of information on a receipt.
It’s supposed to be very complex, given not just the date and the place
and the type of retail organization, but all the different
SKUs, every item that’s possibly being bought. How do you attack that sort of data problem, from schema and cleansing through
extract, transform, load (ETL), and then make it useful?
Schrieber:
It’s actually a huge challenge for us. It's quite complex, because
every retailer’s receipt is different: the way that they structure the
receipt, the level of specificity about the items on the receipt, and the
existence of product codes, whether they are public product codes like
the kind you see on a
barcode
for a soda product, an internal product code that retailers use
as a stock keeping unit, or just a short description on
the receipt.
One of our challenges as a company is to
figure out the algorithmic methods that allow us to identify what each
one of those codes and short descriptions actually represent in terms of
a real world product or category, so that we can make sense of that
data on behalf of our client. That’s one of the real challenges
associated with taking this receipt-based approach and turning that into
useful data for our clients on a daily basis.
Gardner:
I imagine this would be of interest to a lot of different types of
information and data gathering. Not only are pure data formats and text
formats being brought into the mix, as has been the case for many years,
but this image-based approach, the non-structured approach.
Any
lessons learned here in the retail space that you think will extend to
other industries? Are we going to be seeing more and more of this
image-based approach to analysis gathering?
Schrieber: We certainly are. As an example, just take
Google Maps and
Google Street View,
where they're driving around in cars, capturing images of house and
building numbers, and then associating that to the actual map data.
That’s a very simple example.
A lot of the techniques
that we're trying to apply in terms of making sense of short
descriptions for products on receipts are akin to those being used to
understand and perform social-media analytics. When somebody makes a
tweet,
you try to figure out what that tweet is actually about and means, with
those abbreviated words and shortened character sets. It’s very, very
similar types of natural language processing and regular expression
algorithms that help us understand what these short descriptions for
products actually mean on a receipt.
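To make the idea of regular-expression processing of short receipt descriptions concrete, here is a small Python sketch. The abbreviation table, the size pattern, and the sample strings are invented for illustration; they are assumptions, not InfoScout's actual rules, and a production system would need far larger dictionaries plus statistical matching.

```python
import re

# Hypothetical abbreviation table; real receipts would need many more entries.
ABBREVIATIONS = {
    "DT": "diet",
    "CHKN": "chicken",
    "BRST": "breast",
    "ORG": "organic",
}

# Matches a pack size such as "12PK", "1.5LB" or "16 OZ".
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(OZ|PK|CT|LB)\b", re.IGNORECASE)


def normalize_description(raw: str) -> dict:
    """Expand abbreviations and pull out a pack size from a short description."""
    text = raw.upper().strip()
    size = None
    match = SIZE_PATTERN.search(text)
    if match:
        size = f"{match.group(1)} {match.group(2).lower()}"
        text = SIZE_PATTERN.sub(" ", text, count=1)  # remove the size token
    tokens = re.findall(r"[A-Z]+", text)
    words = [ABBREVIATIONS.get(token, token.lower()) for token in tokens]
    return {"words": words, "size": size}
```

For example, under these assumed rules `normalize_description("DT COKE 12PK")` yields the words `diet` and `coke` with a size of `12 pk`, which could then be matched against a product catalog.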
Gardner: So
we've had some very substantial data complexity hurdles to overcome.
Now we have also the basic blocking and tackling of data transport,
warehouse, and processing platform.
Going back to
Tibor, once you've applied your algorithms, sliced and diced this
information, and made it into something you can apply to a typical data
warehouse and BI environment, how did you overcome these issues about
the volume and the complexity, especially now that we're dealing with a
cloud infrastructure?
Compression algorithms
Mozes:
One of the benefits of Vertica, as we went into the discovery process,
was the compression algorithms that Vertica is using. Since we have a
large volume of data to deal with and build analytics from, it has
turned out to be beneficial for us that Vertica is capable of
compressing data extremely well. As a result of that, some of our core
queries that require a BI solution can be optimized to run super fast.
You
also talked about the cloud solution, why we went into the cloud and
what is the benefit of doing that. We really like running our entire
data pipeline in AWS because it’s super easy to scale it up and down.
It’s
easy for us to build a new Vertica cluster, if we need to evaluate
something that’s not in production yet, and if the idea doesn’t work,
then we can just pull it down. We can scale Vertica up, if we need to,
in the cloud without having to deal with any sort of contractual issues.
Schrieber:
To put this in context, now we're capturing three times as much data
every day as we were six months ago. The queries that we're running
against this have probably gone up 50X to 100X in that time period as
well. So when we talk about needing to scale this up quickly, that’s a
prime example as to why.
Gardner: What has
happened in just the last six months that’s required that ramp up? Is it
just because of the popularity of your model, the impactfulness and
effectiveness of the mobile app acquisition model, or is it something
else at work here?
Schrieber: It’s twofold. Our
mobile apps have gotten more and more popular and we've had more and
more consumers adopt them as a way to raise money for their kid’s school
or earn money for themselves in a gamified way by submitting pictures
of their receipts. So that’s driven massive growth in terms of the data
we capture.
Also, our client base has more than
tripled in that time period as well. These additional clients have
greater demands of how to use and leverage this data. As those increase,
our efforts to answer their business questions multiply the number of
queries that we are running against this data.
Gardner:
That, to me, is a real proof point of this whole architectural
approach. You've been able to grow by a factor of three in your client
base in six months, but you haven’t gone back to them and said, "You'll
have to wait for six months while we put in a warehouse, test it, and
debug it." You've been able to just take that volume and ramp up. That’s
very impressive.
Schrieber: I was just going
to say, this is a core differentiator for us in the marketplace. The
market research industry has to keep up with the pace of marketing, and
that pace of marketing has shifted from months of lead time for TV and
print advertising down to literally hours of lead time to be able to
make a change to a digital advertising campaign, a social media
campaign, or a search engine campaign.
So the pace of
marketing has changed and the pace of market research has to keep up.
Clients aren’t willing to wait for weeks, or even a week, for a data
update anymore. They want to know today what happened yesterday in order
to make changes on-the-fly.
Reports and visualization
Gardner:
We've spoken about your novel approach to acquiring this data. We've
talked about the importance of having the right platform and the right
cloud architecture to both handle the volume as well as scale to a
dynamic rapidly growing marketplace.
Let’s talk now
about what you're able to do for your clients in terms of reports,
visualization, frequency, and customization. What can you now do with
this cloud-based Vertica engine and this incredibly valuable retail data
in a near real-time environment for your clients?
Schrieber:
A few things on the client side. Traditional market research providers
of panel data have to put very tight guardrails on how clients can
access and run reports against the data. These queries are very complex.
The numerators and denominators for every single record of the reports
are different and can be changed on-the-fly.
If, all of a sudden, I want to look at anyone who shopped at
Walmart
in the last 12 months that has bought cat food in the last month and
did so at a store other than Walmart, and I want to see their purchase
behavior and how they shop across multiple retailers and categories, and
I want to do that on-the-fly, that gets really complex. Traditional
data warehousing and BI technologies don't support allowing general
business-analyst users to be able to run those kinds of queries and
reports on-demand, yet that’s exactly what they want.
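The cohort in that example (shopped at Walmart in the last 12 months, and bought cat food in the last month at a store other than Walmart) is essentially an intersection of two shopper sets. The Python sketch below shows that logic on in-memory records; the `Purchase` fields and the 365-day and 30-day windows are assumptions for illustration, and the real platform runs such questions as SQL against Vertica rather than in application code.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Purchase:
    shopper_id: str
    retailer: str
    category: str
    day: date


def select_shoppers(purchases, today, retailer="Walmart", category="cat food"):
    """Shoppers who visited `retailer` in the last year AND bought
    `category` somewhere else in the last 30 days."""
    year_ago = today - timedelta(days=365)
    month_ago = today - timedelta(days=30)

    shopped_at_retailer = {
        p.shopper_id
        for p in purchases
        if p.retailer == retailer and year_ago <= p.day <= today
    }
    bought_elsewhere = {
        p.shopper_id
        for p in purchases
        if p.category == category
        and p.retailer != retailer
        and month_ago <= p.day <= today
    }
    return shopped_at_retailer & bought_elsewhere
```

Because both conditions change from report to report, the set of qualifying shoppers (the denominator for any downstream metric) has to be recomputed per query, which is part of why these reports are hard to precompute.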
They
want to be able to ask those business questions and get answers. That’s
been key to our strategy, which is to allow them to do so themselves,
as opposed to coming back to them and saying, "That’s going to be a
pretty big project. It will require a few of our engineers. We'll come
back to you in a few weeks and see what we can do." Instead, we can hand
them the tools directly in a guided workflow to allow them to do that
literally on-the-fly and have answers in minutes versus weeks.
Gardner:
Tibor, how does that translate into the platform underneath? If you're
allowing for a business analyst type of skill set to come in and apply
their tools, rather than deep
SQL
queries or other more complex querying tools, what is it that you need
from your platform in order to accommodate that type of report, that
type of visualization, and the ability to bring a larger set of
individuals into this analysis capability?
Mozes:
Imagine that our BI platform can throw out very complex SQL queries.
Our BI platform essentially is using, under the hood, a query engine
that's going to run queries against Vertica. Because, as Jared
mentioned, the questions are so complex, some of the queries that we run
against Vertica are very different than your typical BI use cases.
They're very specialized and very specific.
One of the
reasons we went with Vertica is its ability to compute very complex
queries at a very high speed. We look at Vertica not as simply another
SQL database that scales very well and that’s very fast, but we also
look at it as a compute engine.
So as part of our
query engine, we are running certain queries and certain data
transformations that would be very complicated to run outside Vertica.
We take advantage of the fact that you can create and run custom
UDFs
that are not part of the ANSI SQL-99 standard. We also take advantage of some of the
special functions that are built into Vertica allowing data to be
sessionized very easily.
Analyzing behavior
Jared
can talk about some of the use cases where we like to analyze user’s
entire shopping trips. In order to do that, we have to stitch together
different points in time that the user has gone through and shopped at
various locations. And using some of the built-in functions in Vertica
that are not standard SQL, we can look at shopping journeys, we call them
trip circuits, and analyze user behavior along the trip.
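As a rough illustration of what stitching a "trip circuit" together involves, the following Python sketch groups a shopper's trips by calendar day and orders the stops by time. It is an analogy only: the tuple layout and function names are hypothetical, grouping by calendar day is an assumed simplification, and it does not reproduce Vertica's built-in sessionization functions.

```python
from collections import defaultdict
from datetime import datetime


def build_trip_circuits(trips):
    """Group trips into per-shopper, per-day circuits ordered by time.

    `trips` is an iterable of (shopper_id, retailer, timestamp) tuples.
    Returns {(shopper_id, date): [retailer, ...]} in visiting order.
    """
    grouped = defaultdict(list)
    for shopper_id, retailer, timestamp in trips:
        grouped[(shopper_id, timestamp.date())].append((timestamp, retailer))
    return {
        key: [retailer for _, retailer in sorted(stops)]
        for key, stops in grouped.items()
    }


def stop_position(circuit, retailer):
    """1-based position of the first visit to `retailer`, or None if absent."""
    return circuit.index(retailer) + 1 if retailer in circuit else None
```

With circuits in this shape, questions such as "what did shoppers buy when this retailer was their first stop versus their last?" reduce to filtering on `stop_position`.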
Gardner: Tibor, what other ways can you be using and exploiting the Vertica capabilities in the deliverables for your clients?
Mozes:
Another reason we decided to go with Vertica is its ability to optimize
very complex queries. As I mentioned, our BI platform is using a query
engine under the hood. So if a user asks a very complicated business
question, our BI platform turns that question into a very complicated
query.
One of the big benefits of using Vertica is to
be able to optimize these queries on the fly. It’s easy to do this by
running the database optimizer to build custom projections, making
queries run much faster than we could before.
Gardner:
I always think it’s more impactful for us to learn through an example rather
than just hear you describe this. Do you have any specific InfoScout
retail client use cases where you can describe how they've leveraged
your solution and how some of these both technical and feature
attributes have benefited them -- an example of someone using InfoScout
and what it's done for them?
Schrieber: We
worked with a major retailer this holiday season to track in real time
what was happening for them on Thanksgiving Day and Black Friday. They
wanted to understand their core shoppers, versus less loyal shoppers,
versus non-core shoppers, how these people were shopping across
retailers on Thanksgiving Day and Black Friday, so that the retailer
could try to respond in more real time to the dynamics happening in the
marketplace.
You have to look at what it takes to do
that, for us to be able to get those receipts, process them, get them
transcribed, get that data in, get the algorithms run to be able to map
it to the brands and categories and then to calculate all kinds of
metrics. The simplest ones are market share; the most complex ones have
to do with what Tibor had mentioned: the shopper journey or the trip
circuit.
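For the simplest metric mentioned, market share, a minimal Python sketch might look like the following. The field names `brand`, `category`, and `spend` are assumptions about how mapped line items could be represented, and share is computed here on spend; unit-based or trip-based share would follow the same pattern.

```python
from collections import defaultdict


def market_share(line_items, category):
    """Share of spend per brand within one category.

    `line_items` is an iterable of dicts with hypothetical keys
    "brand", "category" and "spend".
    """
    totals = defaultdict(float)
    for item in line_items:
        if item["category"] == category:
            totals[item["brand"]] += item["spend"]

    category_total = sum(totals.values())
    if category_total == 0:
        return {}  # no spend observed in this category
    return {brand: spend / category_total for brand, spend in totals.items()}
```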
We tried to understand, when this retailer
was the shopper's first stop, what were they most likely to buy at that
retailer, how much were they likely to spend, and how is that different
than what they ended up buying and spending at other retailers that
followed? How does that contrast to situations where that retailer was
the second stop or the last stop of the day in that pivotal shopping day
that is Black Friday?
For them to be able to
understand where they were winning and losing among what kinds of
shoppers who were looking for what kinds of products and deals was an
immense advantage to them -- the likes of which they never had before.
Decision point
Gardner:
This must be a very sizable decision point for them, right? This is
going to help you decide where to build new retail outlets, for example,
or how to structure the experience of the consumer walking through that
particular brick-and-mortar environment.
When we bring
this sort of analysis to bear, this isn’t refining at a modest level.
This could be a major benefit to them in terms of how they strategize
and grow. This could be something that really deeply impacts their
bottom line. Is that not the case?
Schrieber: It
has implications as to what kinds of categories they feature in their
television, display advertising campaigns, and their circulars. It can
influence how much space they give in their store to each one of the
departments. It has enormous strategic implications, not just tactical
day-to-day pricing decisions.
Gardner: Now, that
was a retail example. I understand you also have clients that are
interested in seeing how a brand works across a variety of outlets or
channels. Is there another example you can provide on somebody who is
looking to understand a brand impact at a wider level across a geography
for example?
Schrieber:
I'll give you another example that relates to this. A retailer and a
brand were working together to understand why the brand sales were down
at this particular retailer during the summer time. To make it clear for
you, this is a brand of ice-cream. Ice cream sales should go up during
the summer, during the warmer months, and the retailer couldn’t
understand why their sales were underperforming for this brand during
the summer.
To figure this out, we had to
piece-together, along the shopper journey over time, not only in the
weeks during the summer months, but year round to understand this
dynamic of how they were shopping. What we were able to help the client
quickly discover was that during the summer months people eat more
ice-cream. If they eat more ice-cream, they're going to want larger pack
sizes when they go and buy that ice-cream. This particular retailer
tended to carry smaller pack sizes.
So when the summer
months came around, even though people had been buying their ice-cream
at this retailer in the winter and spring, they now wanted larger pack
sizes and they were finding them at other retailers, and switching their
spend over to these other retailers.
So for the brand,
the opportunity was a selling story to the retailer to give the brand
more freezer space and to carry an additional assortment of products to
help drive greater sales for that brand, but also to help the retailer
grow their ice cream category sales as well.
Idea of architecture
Gardner: So just that insight could really help them figure that out. They probably wouldn’t have been able to do it any other way.
We've
seen some examples of how impactful this can be and how much a business
can benefit from it. But let’s go back to the idea of the architecture.
For me, one of my favorite truths in IT is that architecture is
destiny. That seems to be the case with you, using the combination of
AWS and HP Vertica.
It seems to me that you don’t have
to suffer the costs of a large capital outlay of having your own data
center and facilities. You're able to acquire these very advanced
capabilities at a price point that's significantly less from a capital
outlay and perhaps predictable and adjustable to the demand.
Is
that something you then can pass along? Tell me a little bit about the
economics of how this architectural approach works for you?
Mozes:
One of the benefits of using AWS is that it’s very easy for us to
adjust our infrastructure on demand, as we see fit. Jared has referred
to some of the examples that we had before. We did a major analysis for a
large retailer on Black Friday, and we had some special promotions to
our mobile app users going on at that point. Imagine that our data
volume would grow tremendously from one day to the next couple of days,
and then after when the promotion is over and the big shopping season is
over, our volume would come down somewhat.
When you run
an infrastructure in the cloud in combination with online data storage
and data engine, it's very easy to scale it up and down. It’s very cost
efficient to run an operation where you can just add additional
computing power as you need, and then when you don’t need that anymore,
you can scale it down.
We did this during a time
period, when we had to bring a lot of fresh data online quickly. We could
just add additional nodes, and we saw very close to linear scalability
by increasing our cluster size.
Schrieber: On
the business side, the other advantage is we can manage our cash flows
quite nicely. If you think about running a startup, cash is king, and
not having to do large capital outlays in advance, but being able to
adjust up and down with the fluctuations in our businesses, is also
valuable.
Gardner: We're getting close to the
end of our time. I wonder if you have any other insights into the
business benefits from an analytics perspective of doing it this way.
That is to say, incentivizing consumers, getting better data, being able
to move that data and then analyze it at an on-demand infrastructure
basis, and then deliver queries in whole new ways to a wider audience
within your client-base.
I guess I'm looking for how
this stands up both to the competitive landscape, but also to the past.
How new and how innovative is this in marketing? Then we'll talk about
where we go next? Let’s try to get a level set as to how new and how
refreshing this is, given what the technology enables both at cloud
basis and the mobility basis and then the core stuff, the underlying
analytics platform basis.
Product launch
Schrieber:
We have an example that's going on right now around a major new product
launch for a very large consumer goods company. They chose us to help
monitor this launch, because they were tired of waiting for six months
for any insight in terms of who is buying it, how they were discovering
it, how they came about choosing it over the competition, how their
experience was with the product, and what it meant for their business.
So
they chose to work with us for this major new brand launch, because we
could offer them visibility within days or weeks of launching that new
product in the market to help them understand who were the people who
were buying, was it the target audience that they thought it was going
to be, or was it a different demographic or lifestyle profile than they
were expecting. If so, they might need to change their positioning or
marketing tactics and targeting accordingly.
How are
these people discovering the products? We're able to trigger surveys to
them in the moment, right after they've made that purchase, and then
flow that data back through to our clients to help them understand how
these people are discovering it. Was it a TV advertisement? Was it
discovered on the shelf or display in the store? Did a friend tell them
about it? Was their social media marketing campaign working?
We're also able to
figure out what these people were buying before. Were they new to this
category of product? Or did they not use this kind of product before and
were just giving it a try? Were they buying a different brand and have
now switched over from that competitor? And, if so, how did they like it
by comparison, and will they repeat purchase? Is this brand going to be
successful? Is this meeting needs?
These are enormous
decisions. Often, hundreds of millions of dollars are spent by major
consumer goods companies on new brand launches, so they need this quick
feedback in terms of what’s working and what’s not, who to target with
what kind of messaging, and what it’s doing to the marketplace in terms
of stealing share from competitors.
Driving new people
to the product category can influence major investment decisions along
the lines of whether we need to build the new manufacturing facility, do
we need to change our marketing campaigns, or should we go ahead and
invest in that TV Super Bowl ad, because this really has a chance to go
big?
These are massive decisions that these companies
can now make in a timely manner, based on this new approach of capturing
and making use of the data, instead of waiting six months on a new
product launch. They're now waiting just weeks and are able to make the
same kinds of decisions as a result.
Gardner: So, in a word it’s unprecedented. You really just haven’t been able to do this before.
Schrieber: It’s not been possible before at all, and I think that’s really what’s fueling the growth in our business.
Look to the future
Gardner:
Let’s look to the future quickly. We hear a lot about the Internet of
Things. We know that mobile is only partially through its evolution.
We're going to see more smart phones in more hands doing more types of
transactions around the globe. People will be using their phones for
more of what we have thought of as traditional business in commerce. So
that opens up a lot more information that’s generated and that we therefore need
to gather and then analyze.
So where do we go next?
How does this generate additional novel capabilities, and then where do
we go perhaps in terms of verticals? We haven’t even talked about food
or groceries, hospitality, or even health care.
So
without going too far -- this could be another hour conversation in
itself -- maybe we could just tease the listener and the reader with
where the potential for this going forward is.
Schrieber: If you think about
Internet of Things
as it relates to our business, there are a couple of exciting
developments. One is the use of things like beacons inside of stores.
Now we can know exactly which aisle people have walked down and what
shelf they’ve stood in front of, and what product they've interacted
with. That beacon is communicating with their smartphone and that
smartphone is tied to our user account in a way that we're surveying
these individuals or triggering surveys to them, in-the-moment, as they
shop.
That’s
not something that’s been doable before. It’s something that the
Internet of Things, and very specifically beacons linking with
smartphones, will allow us to do going forward. That will open up
entirely new fields of research and consumer understanding about how
people shop and make decisions at the shelf.
The same
is true inside the home. We talk about the Internet of Things as it
relates to smart refrigerators or smart laundry machines, etc.
Understanding daily lifestyle activities and how people make the choice
of which products to use and how to use them inside their home is a field
of research that is under-served today. It is one that the Internet of
Things is really going to open up in the years to come.
Gardner: Just quickly, what are other retail sectors or vertical industries where this would make a great deal of sense?
Schrieber: I have a friend who runs an amazing business called
Wavemark,
which is basically an Internet of Things for medical devices and
medical consumables inside of hospitals and care facilities, with the
ability to track inventory in real time, tying it to patients and
procedures, tying it back to billing and consumption.
Making
all of that data available to the medical device manufacturers, so that
they can understand how and when their products are being used in the
real world in practice, is revolutionizing that industry. We're seeing
it in healthcare, and I think we're going to see it across every
industry.
Engineering perspective
Gardner:
Last word to you, Tibor. Given what Jared just told us about the
greater applicability, the model, the architecture comes back to mind
for me: the cloud, the mobile device, the data, the engine, the ability
to deal with that velocity, volume, and variability at a cost point that
is doable and scales up and down. Are there any thoughts about this
from an engineering perspective and where we go next?
Mozes:
We see that with all these opportunities bubbling up, the amount of
data that we have to process on a daily basis is just going to
continually grow at an exponential rate. We continue to get additional
information on shopping behavior and more data from external data
sources. Our data is just going to grow. We will need to engineer
everything to be as scalable as possible.
Gardner:
Very good. I'm afraid we will have to leave it there. We've been
learning about how InfoScout in San Francisco gleans new levels of
accurate insights into consumer behavior by collecting data directly
from sales receipts.
In order to better analyze that
data and use it, we have seen how they have used an architecture based
on the AWS public cloud, the infrastructure as a service and data as a
service capability, but built on HP Vertica as the engine for analytics
and for delivery of the analysis.
InfoScout is faced
with the daunting task of managing and cleansing this data and they've
been able to scale very impressively over the past six months using
Vertica in the cloud.
To
learn more, we've been here with our two guests, and I’d really like to
thank them. Tibor Mozes, Senior Vice President of Data Engineering
at InfoScout. Thank you so much, Tibor.
Mozes: Thank you.
Gardner: And also Jared Schrieber, Co-founder and CEO at InfoScout. Thank you so much, Jared.
Schrieber: Pleasure, Dana. Thank you.
Gardner: And a big thank you as well to our audience for joining us for this special new style of big data discussion.
I'm
Dana Gardner, Principal Analyst at Interarbor Solutions, your host for
this ongoing series of HP-sponsored discussions. Thanks again for
joining, and don’t forget to come back next time.
Transcript
of a BriefingsDirect discussion on how a consumer research and data
analysis firm gleans rich marketing data from customers' shared sales receipts. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.