Showing posts with label HP. Show all posts

Wednesday, August 05, 2015

How Localytics Uses Big Data to Improve Mobile App Development and Marketing

Transcript of a BriefingsDirect discussion on how big data helps an analytics company improve data-driven marketing on a variety of platforms.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Our next big data case study interview highlights how Localytics uses data and associated analytics to help providers of mobile applications improve their applications -- and also allow them to better understand the uses for their apps and dynamic customer demands.
To learn more about how big data helps mobile application developers better their products and services, please join me in welcoming our guest, Andrew Rollins, Founder and Chief Software Architect at Localytics, based in Boston. Welcome, Andrew.

Andrew Rollins: Thank you for having me.

Gardner: Tell us about your organization. You founded it to do what?

Rollins: We founded in 2008, two other guys and I. We set out initially to make mobile apps. If you remember back in 2008, this is when the iPhone App Store launched. So there was a lot of excitement around mobile apps at that time.

We initially started looking at different concepts for apps, but then, over a period of a couple months, discovered that there really weren't a whole lot of services out there for mobile apps. It was basically a very bare ecosystem, kind of like the Wild, Wild West.

We ended up focusing on whether there was a services play in this industry and we settled on analytics, which we then called Localytics. The analogy we like to use is, at the time it was a little bit of a gold rush, and we wanted to sell the pickaxes. So that’s what we did.

Gardner: That makes a great deal of sense, and it has certainly turned into a gold rush. For those folks who do the mining, creating applications, what is it that they need to know?

Analytics and marketing

Rollins: That’s a good question. Here's a little back story on what we do. We do analytics, but we also do marketing. We're a full-service solution, where you can measure how your application is performing out in the wild. You can see what your users are doing. You can do anything from funnel analysis to engagement analysis, things like that.

From there, we also transition into the marketing side of things, where you can manage your push notifications and your in-app messaging.

For people who are making mobile apps, often they want to look at key metrics and then how to drive those metrics. That means a lot of A/B testing, funnel analysis, and engagement analysis.

It means not only analyzing these things, but making meaningful interactions, reaching out to customers via push notifications, getting them back in the app when they are not using the app, identifying points of drop-off, and messaging them at the right time to get them back in.

An example would be an e-commerce app. You've abandoned the shopping cart. Let’s get you back in the application via some sort of messaging. Doing all of that, measuring the return on investment (ROI) on that, measuring your acquisition channels, measuring what your users are doing, and creating that feedback loop is what we advocate mobile app developers do.

Gardner: You're able to do data-driven marketing in a way that may not have been very accessible before, because everything that’s done with the app is digital and measurable. There are logs, servers -- and so somewhere there's going to be a trail. It’s not so much marketing as it is science. We've always thought of marketing as perhaps an art and less of a science. How do you see this changing the very nature of marketing?

Rollins: Everything ultimately that you are doing really does need to be data-driven. It's very hard to work off of just intuition alone. So that's the art and science. You come out with your initial hypothesis, and that’s a little bit more on the craft or art side, where you're using your intuition to guide you on where to start.

From there, you have to use the data to iterate. I'm going to try this, this, and this, and then see which works out. That would be like a typical multivariate kind of testing.

Determine what works out of all these concepts that you're trying, and then you iterate on that. Measuring anything you do, any kind of interaction you have with your user, and then using that as feedback to inform the next interaction is what you have to be doing.
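The try-several-things-and-see-which-works loop Rollins describes can be sketched in a few lines. This is only an illustration, not Localytics' method: the variant names and counts below are made up, and the two-proportion z-test is a common textbook way to check whether a difference in conversion rate is likely to be more than noise.

```python
from math import erf, sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion
    rate between variant B and variant A, using a pooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical results: variant name -> (conversions, users exposed).
variants = {
    "control": (120, 2000),
    "message_a": (150, 2000),
    "message_b": (131, 2000),
}

# Pick the variant with the highest raw conversion rate...
best = max(variants, key=lambda name: variants[name][0] / variants[name][1])

# ...then check how strong the evidence is against the control.
z, p_value = two_proportion_z(*variants["control"], *variants[best])
print(best, round(z, 2), round(p_value, 3))
```

With more than two variants, the same comparison would be repeated against the control for each one, and the winner becomes the baseline for the next round.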

Gardner: And this is also a bit revolutionary when it comes to software development. It wasn't that long ago that the waterfall approach to development might leave years between iterations. Now, we're thinking about constantly updating, iterating, getting a feedback loop, and condensing the latency of that feedback loop so that we really can react as close to real-time as possible.

What is it about mobile apps that's allowed for a whole different approach to this notion of connectedness and feedback loops to an app audience?

Mobile apps are different

Rollins: This brings up a good point. A lot of people ask why we have a mobile app analytics company. Why did we do that? Why is typical web analytics not good enough? It kind of speaks to something that you're talking about. Mobile apps are a little bit different than the regular web, in the sense that you do have a cycle that you can push apps out on.

You release to, let’s say, the iPhone App Store. It might take a couple of weeks before your app goes out there. So you have to be really careful about what you're publishing, because your turnaround time is not that of the web.

However, there are certain interactions you can have, like on the messaging side, where you have an ability to instantly go back and forth. Mobile apps are a different kind of market. It requires a little different understanding than the traditional approach.

... We consume the data in a real-time pipeline. We're not doing background batch processing that you might see in something like Hadoop. We're doing a lot of real-time pipeline stuff, such that you can see results within a minute or two of it being uploaded from a device. That's largely where HP Vertica comes in, and why we ended up using Vertica, because of its real-time nature. It’s about the scale.
Gardner: If I understand correctly, you have access to the data from all these devices, you are crunching that, and you're offering reports and services back to your customers. Do they look to you as also a platform provider or just a data-service provider? How do the actual hosting and support services for these marketing capabilities come about?

Rollins: We tend to cater more toward the high end. A lot of our customers are large app publishers that have an ongoing application, let’s say a shopping application or news application.

In that sense, when we bring people on board, oftentimes they tend to be larger companies that aren’t necessarily technically savvy yet about mobile, because it's still new for some people. We do offer a lot of onboarding services to make sure they integrate their application correctly, measure it correctly, and are looking at the right metrics for their industry, as compared to other apps in that industry.

Then, we keep that relationship open as they go along and as they see data. We iterate on that with them. Because of the newness of the industry it does require education.

Gardner: And where is HP Vertica running for you? Do you run it on your own data center? Are you using cloud? Is there a hybrid? Do you have some other model?

Running in the cloud

Rollins: We run it in the cloud. We are running on Amazon Web Services (AWS). We've thought a lot about whether we should run it in a separate data center, so that we can dictate the hardware, but presently we are running it in AWS.

Gardner: Let’s talk about what you can do when you do this correctly. Because you have a capacity to handle scale, you've developed speed, and you understand the requirements in the market, what are your customers getting from the ability to do all this?

Rollins: It really depends on the customer. Something like an e-commerce app is going to look heavily at things like where users are dropping off and what's preventing them from making that purchase.

Another application, like news, which I mentioned, will look at something different, usually something more along the lines of engagement. How long are they reading an article for? That matters to them, so that they can give those numbers to advertisers.

So the answer to that largely depends on who you are and what your app is.

Gardner: I suppose another benefit of developing these insights, as specific and germane as they might be to each client, is the ability to draw different types of data in. Clearly, there's the data from the App Store and from the app itself, but if we could join that data with some other external datasets, we might be able to determine something more about why they drop off, or why they are spending more money or more time doing certain things.

So is there an opportunity, and do you have any examples of where you've been able to go after more datasets and then be able to scale to that?

Rollins: This is something that's come up a lot recently. In the past year, we have our own products that we're launching in this space, but the idea of integrating different data types is really big right now.

You have all these different silos -- mobile, web, and even your internal server infrastructure. If you're a retail company that has a mobile app, you might even have physical stores. So you're trying to get all this data in some collective view of your customer.

You want to know that Sally came to your store and purchased a particular kind of item. Then, you want to be able to know that in your mobile app. Maybe you have a loyalty card that you can tie across the media and then use that to engage with her meaningfully about stuff that might interest her in the mobile app as well.

"We noticed that you bought this a month ago. Maybe you need another one. Here is a coupon for it."

Other datasets

That's a big thing, and we're looking at a lot of different ways of doing that by bringing in other datasets that might not be from just a mobile app itself.

We're not even focused on mobile apps any more. We're really just an app analytics company, and that means the web and desktop. We ship in Windows, for example. We deal with a lot of Microsoft applications. Tying together all of that stuff is kind of the future.

Gardner: For those organizations that are embarking on more of a data-driven business model, that are looking for analytics and platforms and requirements, is there anything that you could offer in hindsight, having traveled this path and worked with HP Vertica? What should they keep in mind when they're looking to move into a capability, whether it's on-prem or cloud? What advice could you offer them?

Rollins: The journey that we went through was with various platforms. At the end of the day, be aware of what the vendor of the big-data platform is pitching, versus the reality of it.

A lot of times, prototyping is very easy, but actually going to large scale is fairly difficult. At scale, you have to know what each technology is good at, and how you bring together multiple technologies to accomplish what you want.

That means a lot of prototyping, a lot of stress testing and benchmarking. You really don’t know until you try it with a lot of these things. There are a lot of promises, but the reality might be different.

Gardner: Any thoughts about Vertica’s track record, given your length of experience?

Rollins: They're really good. I'm impressed both with the speed of it, as compared to other things we have looked at, and with the features that they release. Vertica 7 has a bunch of great stuff in it. Vertica 6, when it came out, had a bunch of great stuff in it. I'm pretty happy with it.
Gardner: I'm afraid we will have to leave it there. We've been learning about how Localytics uses big data to improve data-driven marketing for a variety of mobile application creators and distributors.

I'd like to thank our guest, Andrew Rollins, Founder and Chief Software Architect at Localytics, based in Boston. Thank you, Andrew.

Rollins: Thank you very much for having me.

Gardner: And thanks to you, our audience, for joining as well. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for joining, and do come back next time.


Transcript of a BriefingsDirect discussion on how big data helps an analytics company improve data-driven marketing on a variety of platforms. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.


Thursday, July 30, 2015

Full 360 Takes Big Data Analysis Cloud Services to New Business Levels

Transcript of a BriefingsDirect discussion on the benefits of joining data analysis and the cloud.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation.

Our next cloud case study interview highlights how Full 360 uses big data and analytics to improve their financial operations. To learn how, we're joined by Eric Valenzuela, Director of Business Development at Full 360, based in New York. Welcome, Eric.

Eric Valenzuela: Good morning. Thank you for having me.

Gardner: Tell us about Full 360 and the role it plays in the financial sector.

Valenzuela: Full 360 is a consulting and services firm, and we purely focus on data warehousing, business intelligence (BI), and hosted solutions. We build and consult and then we do managed services for hosting those complex, sophisticated solutions in the cloud, in the Amazon cloud specifically.
Gardner: And why is cloud a big differentiator for this type of service in the financial sector?

Valenzuela: It’s not necessarily just for finance. It seems to be beneficial for any company that has a large initiative around data warehouse and BI. For us, specifically, the cloud is a platform that we can develop our scripts and processes around. That way, we can guarantee 100 percent that we're providing the same exact service to all of our customers.

We have quite a bit of intellectual property (IP) that’s wrapped up inside our scripts and processes. The cloud platform itself is a good starting point for a lot of people, but it also has elasticity for those companies that continue to grow and add to their data warehousing and BI solutions.

Gardner: Eric, it sounds as if you've built your own platform as a service (PaaS) for your specific activities and development and analytics on top of a public cloud infrastructure. Is that fair to say?

Valenzuela: That’s a fair assumption.

Primary requirements

Gardner: So as you are doing this cloud-based analytic service, what is it that your customers are demanding of you? What are the primary requirements you fulfill for them with this technology and approach?

Valenzuela: With data warehousing being rather new, Vertica specifically, there is a lack of knowledge out there in terms of how to manage it, keep it up and running, tune it, analyze queries and make sure that they're returning information efficiently, that kind of thing. What we try to do is to supplement that lack of expertise.

Gardner: Leave the driving to us, more or less. You're the plumbers and you let them deal with the proper running water and other application-level intelligence?

Valenzuela: We're like an insurance policy. We do all the heavy lifting, the maintenance, and the management. We ensure that your solution is going to run the way that you expect it to run. We take the mundane out, and then give the companies the time to focus on building intelligent applications, as opposed to worrying about how to keep the thing up and running, tuned, and efficient.

Gardner: Given that Wall Street has been crunching numbers for an awfully long time, and I know that they have, in many ways, almost unlimited resources to go at things like BI -- what’s different now than say 5 or 10 years ago? Is there more of a benefit to speed and agility versus just raw power? How has the economics or dynamics of Wall Street analytics changed over the past few years?

Valenzuela: First, it’s definitely the level of data. Just 5 or 10 years ago, either you had disparate pieces of data or you didn’t have a whole lot of data. Now it seems like we are just managing massive amounts of data from different feeds, different sources. As that grows, there has to be a vehicle to carry all of that, where it’s limitless in a sense.

Early on, it was really just a lack of the volume that we have today. In addition to that, 8 or 10 years ago BI was still rather new in what it could actually do for a company in terms of making agile decisions and informed decisions, decisions with intent.

So fast forward, and it’s widely accepted and adopted now. It’s like the cloud. When cloud first came out, everybody was concerned about security. How are we going to get the data in there? How are we going to stand this thing up? How are we going to manage it? Those questions come up a lot less now than they did even two years ago.

Gardner: While you may have cut your teeth on Wall Street, you seem to be branching out into other verticals -- gaming, travel, logistics. What are some of the other areas now to which you're taking your services, your data warehouse, and your BI tools?

Following the trends

Valenzuela: It seems like we're following the trends. Recently it's been gaming. We have quite a few gaming customers that are just producing massive amounts of data.

There's also the airline industry. The customers that we have in airlines, now that they have a way to -- I hate this term -- slice and dice their data, are building really informed, intelligent applications to service their customers, customer appreciation. It’s built for that kind of thing. Airlines are now starting to see what their competition is doing. So they're getting on board and starting to build similar applications so they are not left behind.

Banking was pretty much the first to go full force and adopt BI as a basis for their practice. Finance has always been there. They've been doing it for quite a long time.

Gardner: So as the director of business development, I imagine you're out there saying, "We can do things that couldn’t have been done before at prices that weren’t available before." That must give you almost an unlimited addressable market. How do you know where to go next to sell this?

Valenzuela: It’s kind of an open field. From my perspective, I look at the different companies out there that come to me. At first, we were doing a lot of education. Now, it’s just, "Yes, we can do this," because these things are proven. We're not proving any concepts anymore. Everything has already been done, and we know that we can do it.

It is an open field, but we focus purely on the cloud. We expect all of our customers will be in the Amazon cloud. It seems that now I am teaching people a little bit more -- just because it’s cloud, it’s not magic. You still have to do a lot of work. It’s still an infrastructure.
But we come from that approach and we make sure that the customer is properly aligned with the vision that this is not just a one- or two-month type commitment. We're not just going to build a solution, put it in our pocket, and walk away. We want to know that they're fully committed for 6-12 months.

Otherwise, you're not going to get the benefits of it. You're just going to spend the money and the effort, and you're not really going to get any benefits out of it if you're not going to be committed for the longer period of time. There still are some challenges with the sales and business development.

Gardner: Given this emphasis on selling the cloud model as much as the BI value, you needed to choose an analytics platform that was cloud-friendly and that was also Amazon AWS cloud-friendly. Tell me how Vertica and Amazon -- and your requirements -- came together.

Good timing

Valenzuela: I think it was purely a timing thing. Our CTO, Rohit Amarnath, attended a session at MIT, where Vertica was first announced. So he developed a relationship there.

This was right around the time when Amazon announced that it was offering its public cloud platform, EC2. So it made a lot of sense to look at the cloud as being a vision, looking at the cloud as a platform, looking at column databases as a future way of managing BI and analytics, and then putting the two together.

It was more or less a timing thing. Amazon was there. It was new technology, and we saw the future in that. Analytics was newly adopted. So now you have the column database that we can leverage as well. So blend the two together and start building some platform that hadn’t been done yet.

Gardner: What about lessons learned along the way? Are there some areas to avoid or places that you think are more valuable that people might appreciate? If someone were to begin a journey toward a combination of BI, cloud, and vertical industry tool function, what might you tell them to be wary of, or to double-down on?

Valenzuela: We forged our own way. We couldn’t learn from our competitors’ mistakes because we were the ones that were creating the mistakes. We had to clear those up and learn from our own mistakes as we moved forward.

Gardner: So perhaps a lesson is to be bold and not to be confined by the old models of IT?

Valenzuela: Definitely that. Definitely thinking outside the box and seeing what the cloud can do, forgetting about old IT and looking at cloud as a new form of IT. Understand what it cannot do as a basis, but really open up your mind and think about what it can actually do, from an elasticity perspective.

There are a lot of Vertica customers out there that are going to reach a limitation. That may require procuring more hardware, more IT staff. The cloud aspect removes all of that.

Gardner: I suppose it allows you as a director of business development to go downstream. You can find smaller companies, medium-sized enterprises, and say, "Listen, you don’t have to build a data warehouse at your own expense. You can start doing BI based on a warehouse-as-a-service model, pay as you go, grow as you learn, and so forth."

Money concept

Valenzuela: Exactly. Small or large, those IT departments are spending that money anyway. They're spending it on servers. If they are on-premises, the cost of that server in the cloud should be equal or less. That’s the concept.

If you're already spending the money, why not just migrate it and then partner with a firm like us that knows how to operate that. Then, we become your augmented experts, or that insurance policy, to make sure that those things are going to be running the way you want them to, as if it were your own IT department.

Gardner: What are the types of applications that people have been building and that you've been helping them with at Full 360? We're talking about not just financial, but enterprise performance management. What are the other kinds of BI apps? What are some of the killer apps that people have been using your services to do?

Valenzuela: Specifically, with one of our large airlines, it's customer appreciation. The level of detail on their customers that they're able to bring to the plane, to the flight attendants, in a handheld device is powerful. It’s powerful to the point where you remember that treatment that you got on the plane. So that’s one thing.

That’s something that you don’t get if you fly a lot, if you fly other airlines. That’s just kind of some detail and some treatment that you just don’t get. I don’t know how that could be driven if it weren’t for analytics and if it weren’t for technology like Vertica to be able to provide that information.

Gardner: I'm afraid we'll have to leave it there. You've been learning about how Full 360 uses HP Vertica in the Amazon cloud to provide data warehouse and BI applications and services to its customers from Wall Street to the local airport.

So join me in thanking Eric Valenzuela, Director of Business Development at Full 360 in New York. Thanks so much, Eric.
Valenzuela: Thank you.

Gardner: And I'd like to thank our audience as well for joining us for this IT innovation discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for listening, and come back next time.



Transcript of a BriefingsDirect discussion on the benefits of joining data analysis and the cloud. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.


Tuesday, July 28, 2015

How Big Data Technologies Hadoop and Vertica Drive Business Results at Snagajob

Transcript of a BriefingsDirect discussion on how an employment search company uses data analysis to bring better matches for job seekers and employers.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Our next innovation case study interview highlights how Snagajob in Richmond, Virginia -- one of the largest hourly employment networks for job seekers and employers -- uses big data to improve their performance and to better understand how their systems provide rapid services to their users.

Snagajob recently delivered nearly 500,000 new jobs in a single month through their systems. To learn how they're managing such impressive scale, we welcome Robert Fehrmann, Data Architect at Snagajob in Richmond, Virginia.

Robert Fehrmann: Thank you for the introduction.

Gardner: First, tell us about your organization. You’ve been doing this successfully since 2000. How are hourly workers different from regular employment? What type of employment are we talking about? Let's understand the role you play in the employment market.

Fehrmann: Snagajob, as you mentioned, is America's largest hourly network for employees and employers. The hourly market means we have, relatively speaking, high turnover.
Another aspect, in comparison to some of our competitors, is that we provide an inexpensive service. So our subscriptions are on the low end, compared to our competitors.

Gardner: Tell us how you use big data to improve your operations. I believe that among the first ways that you’ve done that is to try to better analyze your performance metrics. What were you facing as a problem when it came to performance?

Signs of stress

Fehrmann: A couple of years ago, we started looking at our environment, and it became obvious that our traditional technology was showing some signs of stress. As you mentioned, we really have data at scale here. We have 20,000 to 25,000 postings per day, and we have about 700,000 unique visitors on a daily basis. So data is coming in very, very quickly.

We also realized that we're sitting on a gold mine and we were able to ingest data pretty well. But we had a problem getting information and innovation out of our big data lake.

Gardner: And of course, near real time is important. You want to catch degradation in any fashion from your systems right away. How do you then go about getting this in real time? How do you do the analysis?

Fehrmann: We started using Hadoop. I'll use a lot of technical terms here. From our website, we're getting events. Events are routed via Flume directly into Hadoop. We're collecting about 600 million key-value pairs on a daily basis. It's a massive amount of data, 25 gigabytes on a daily basis.

The second piece in this journey to big data was analyzing these events, and that’s where we're using HP Vertica. Our original use case was to analyze a funnel. A funnel is where people come to our site. They're searching for jobs, maybe by keyword, maybe by zip code. A subset of that is an interest in a job, and they click on a posting. A subset of that is applying for the job via an application. A subset is interest in an employer, and so on. We had never been able to analyze this funnel.
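To make the funnel idea concrete, here is a minimal sketch. It is not Snagajob's pipeline (which, per the interview, routes events through Flume into Hadoop and analyzes them in HP Vertica). The stage names and the sample events are assumptions for illustration, and the sketch ignores event ordering, counting a user at a stage only if that user also reached every earlier stage.

```python
from collections import defaultdict

# Hypothetical stage names, ordered from widest to narrowest.
FUNNEL_STAGES = ["search", "view_posting", "apply"]

def funnel_counts(events, stages=FUNNEL_STAGES):
    """Count unique users at each stage, where reaching a stage
    requires having also reached every earlier stage."""
    seen = defaultdict(set)
    for user_id, event in events:
        seen[user_id].add(event)
    counts = []
    for i in range(len(stages)):
        required = set(stages[: i + 1])
        counts.append(sum(1 for evs in seen.values() if required <= evs))
    return counts

def step_conversion(counts):
    """Fraction of users who move from each stage to the next."""
    return [
        counts[i] / counts[i - 1] if counts[i - 1] else 0.0
        for i in range(1, len(counts))
    ]

# Made-up (user, event) pairs standing in for website events.
events = [
    ("u1", "search"), ("u1", "view_posting"), ("u1", "apply"),
    ("u2", "search"), ("u2", "view_posting"),
    ("u3", "search"),
    ("u4", "view_posting"),  # arrived without searching; excluded after stage 1
]

counts = funnel_counts(events)
rates = step_conversion(counts)
print(dict(zip(FUNNEL_STAGES, counts)), rates)
```

The step with the lowest rate is the drop-off point worth investigating first.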

The dataset is about 300 to 400 million rows, and 30 to 40 gigabytes. We wanted to make this data available, not just to our internal users, but all external users. Therefore, we set ourselves a goal of a five-second response time. No query on this dataset should run for more than five seconds -- and Vertica and Hadoop gave us a solution for this.

Gardner: How have you been able to increase your performance and reach your key performance indicators (KPIs) and service-level agreements (SLAs)? How has this benefited you?

Fehrmann: Another application that we were able to implement is a recommendation engine. A recommendation engine addresses the case where jobseekers who apply for a specific job may not know about all the other jobs that are very similar to this job or that other people have applied to.
We started analyzing the search results that we were getting and implemented a recommendation engine. Sometimes it’s very difficult to have real comparison between before and after. Here, we were able to see that we got an 11 percent increase in application flow. Application flow is how many applications a customer is getting from us. By implementing this recommendation engine, we saw an immediate 11 percent increase in application flow, one of our key metrics.
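The "jobs that other people have applied to" idea can be sketched as a simple co-occurrence count. This is only one possible approach, shown under stated assumptions: the (seeker, job) record shape and the function name are hypothetical, and the transcript does not say how Snagajob's engine actually works.

```python
from collections import Counter, defaultdict

def recommend(applications, job_id, top_n=3):
    """Suggest jobs most often applied to by people who applied to `job_id`.

    `applications` is an iterable of (seeker_id, job_id) pairs.
    """
    jobs_by_seeker = defaultdict(set)
    for seeker, job in applications:
        jobs_by_seeker[seeker].add(job)

    co_counts = Counter()
    for jobs in jobs_by_seeker.values():
        if job_id in jobs:
            # Count every other job this seeker applied to.
            co_counts.update(jobs - {job_id})

    # Sort by count (descending), then by job id so ties are deterministic.
    ranked = sorted(co_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [job for job, _ in ranked[:top_n]]
```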

Gardner: So you took the success from your big-data implementation and analysis capabilities from this performance task to some other areas. Are there other business areas, search yield, for example, where you can apply this to get other benefits?

Brand-new applications

Fehrmann: When we started, we had the idea that we were looking for a solution for migrating our existing environment to a better-performing new environment. But what we've seen is that most of the applications we've developed so far are brand-new applications that we hadn't been able to do before.

You mentioned search yield. Search yield is a very interesting aspect. It’s a massive dataset. It's about 2.5 billion rows and about 100 gigabytes of data as of right now and it's continuously increasing. So for all of the applications, as well as all of the search requests that we have collected since we have started this environment, we're able to analyze the search yield.

For example, that's how many applications we get for a specific search keyword in real time. By real time, I mean that somebody can run a query against this massive dataset and get a result in a couple of seconds. We can analyze specific jobs in specific areas, specific keywords that are searched in a specific time period or in a specific location of the country.

Gardner: And once again, now that you've been able to do something you couldn't do before, what have been the results? How has that changed your business?

Fehrmann: It really allows our salespeople to provide great information during the prospecting phase. If we're prospecting with a new client, we can tell them very specifically that if they're in this industry, in this area, they can expect an application flow, depending on how big the company is, of, let’s say, a hundred applications per day.

Gardner: How has this been a benefit to your end users, those people seeking jobs and those people seeking to fill jobs?

Fehrmann: There are certainly some jobs that people are more interested in than others. On the flip side, if a particular job gets 100 or 500 applications, it's just a fact that only a small number are going to get that particular job. Now if you apply for a job that isn't as interesting, you have a much, much higher probability of getting the job.
Gardner: I'm afraid we will have to leave it there. We've been talking with Snagajob about how they use big data on multiple levels to improve their business performance, their system’s performance, and ultimately how they go about understanding their new challenges and opportunities.

With that, I'd like to thank our guest, Robert Fehrmann, Data Architect at Snagajob in Richmond, Virginia. Thank you.

Fehrmann: Thank you, Dana.

Gardner: And I’d like to thank our audience as well for joining us for this special new style of IT discussion.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for listening, and do come back next time.


Transcript of a BriefingsDirect discussion on how an employment search company uses data analysis to bring better matches for job seekers and employers. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.


Wednesday, July 22, 2015

Zynga Builds Big Data Innovation Culture by Making Analytics Open to All

Transcript of a BriefingsDirect discussion on how data-driven companies can gain a competitive advantage in making as much analysis available to as many people in their organizations as possible.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Our next big data case study interview highlights how Zynga in San Francisco depends on big-data analytics to improve its business via a culture of pervasive analytics and experimentation.

To learn more about how big data impacts Zynga in the fast-changing and highly competitive mobile gaming industry, please welcome Joanne Ho, a Senior Engineering Manager at Zynga. Welcome, Joanne.

Joanne Ho: Hi.

Gardner: And also, Yuko Yamazaki, Head of Analytics at Zynga. Welcome, Yuko.
Yuko Yamazaki: Thank you.

Gardner: How important is big data analytics to you as an organization?

Ho: To Zynga, big data is very important. It's a main piece of the company and, as a part of the analytics department, big data is serving the entire company as a source of understanding our users' behavior, our players, what they like, and what they don’t like about games. We are using this data to analyze users' behavior, and we also personalize a lot of different game models to fit each user’s play pattern.

Gardner: What’s interesting to me about games is that people will not only download them, but that they're upgradable, changeable. People can easily move. So the feedback loop between the inferences, information, and analysis you gain by your users' actions is rather compressed, compared to many other industries.

What is it that you're able to do in this rapid-fire development-and-release process? How is that responsiveness important to you?

Real-time analysis

Ho: Real-time analysis, of course, is critical, and we have our streaming system that can do it. We have our monitoring and alerting system that can alert us whenever we see any drops in users' install rate or in daily active users (DAU). The game studio will be alerted and they will take appropriate action on that.
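To make the alerting idea concrete, here is a minimal Python sketch of a drop detector for a daily metric such as DAU. The trailing-window baseline, the 20 percent threshold, and the function name are all assumptions chosen for illustration; the transcript does not describe the rule Zynga's monitoring system uses.

```python
def dau_drop_alerts(daily_active_users, window=7, threshold=0.2):
    """Return the indexes of days that should trigger an alert.

    A day triggers an alert when its value is more than `threshold`
    (as a fraction) below the mean of the preceding `window` days.
    """
    alerts = []
    for i in range(window, len(daily_active_users)):
        baseline = sum(daily_active_users[i - window:i]) / window
        if baseline > 0 and daily_active_users[i] < baseline * (1 - threshold):
            alerts.append(i)
    return alerts
```

A streaming system would evaluate the same kind of rule continuously rather than over a finished list, but the comparison against a recent baseline is the same.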

Gardner: Yuko, what sort of datasets are we talking about? If we're going to the social realm, we can get some very large datasets. What's the volume and scale we're talking about here?

Yamazaki: We get data on everything that happens in our games. Almost every single play gets tracked into our system. We're talking about 40 billion to 60 billion rows a day, and that's the data that our game product managers and development engineers have decided they want to analyze later. So it’s already being structured and compressed as it comes into our database.

Gardner: That’s very impressive scale. It’s one thing to have a lot of data, but it’s another to be able to make that actionable. What do you do once that data is assembled?

Yamazaki: The biggest success story that I will normally tell about Zynga is that we make data available to all employees. From day one, as soon as you join Zynga, you get to see all the data through our visualization to whatever we have. Even if you're a FarmVille product manager, you get to see what Poker is doing, making it more transparent. There is an account report that you can just click and see how many people have done this particular game action, for example. That’s how we were able to create this data-driven culture for Zynga.

Gardner: And Zynga is not all that old. Is this data capability something that you’ve had right from the start, or did you come into it over time? 

Yamazaki: Since we began Poker and Words With Friends, our cluster scaled 70 times.

Ho: It started off with three nodes, and we've grown to a 230-node cluster.

Gardner: So you're performing the gathering of the data and analysis in your own data centers?

Yamazaki: Yes.

Gardner: When you realized the scale and the nature of your task, what were some of the top requirements you had for your cluster, your database, and your analytics engine? How did you make some technology choices?

Biggest points

Yamazaki: When Zynga was growing, our main focus was to build something that was going to be able to scale and provide the data as fast as possible. Those were the two biggest points that we had in mind when we decided to create our analytics infrastructure.

Gardner: And any other more detailed requirements in terms of the type of database or the type of analytics engine?
Yamazaki: Those are two big ones. As I mentioned, we wanted to have everyone be able to access the data. So SQL would have been a great technology to have. It’s easier to train PMs in SQL than in engineering-side tools such as, for example, MapReduce for Hadoop. Those were the three key points as we selected our database.

Gardner: What are the future directions and requirements that you have? Are there things that you’d like to see from HP, for example, in order to continue to be able do what you do at increasing scale?

Ho: We're interested in real-time analytics. There's a function, aggregated projection, that we're interested in trying. Also, Flex Tables [in HP Vertica] sounds like a very interesting feature that we will also try. And cloud analytics is the third one that we're also interested in. We hope HP will get it matured, so that we can also test it out in the future.

Gardner: While your analytics has been with you right from the start, you were early in using Vertica?

Ho: Yes.

Gardner: So now we've determined how important it is, do you have any metrics of what this is able to do for you? Other organizations might be saying they don't have as much of a data-driven culture as Zynga, but would like to, and they realize that the technology can now ramp up to such incredible volume and velocity. What do you get back? How do you measure the success when you do big-data analytics correctly?

Yamazaki: Internally, we look at adoption of systems. We have 2,000 employees, and at least 1,000 are using our visualization tool on a daily basis. This is the way to measure adoption of our systems internally.

Externally, the biggest metric is retention. Are players coming back and, if so, was that through the data that we collect? Were we able to do personalization so that they're coming back because of the experience they've had?

Gardner: These are very important to your business, obviously, and I’m curious about that buy-in. As the saying goes, you can lead a horse to water, but you can't make him drink. You can provide data analysis and visualization to the employees, but if they don’t find it useful and impactful, they won’t use it. So it’s interesting to have that as a key performance indicator for you.

Any words of advice for other organizations who are trying to become more data-driven, to use analytics more strategically? Is this about people, process, culture, technology, all the above? What advice might you have for those seeking to better avail themselves of big data analytics?

Visualization

Yamazaki: A couple of things. One is to provide end-to-end. So not just data storage, but also visualization. We also have an experimentation system, where I think we have about 400-600 experiments running as we speak. We have a report that shows that when you ran this experiment, all these metrics moved because of your experiment, and A is better than B.

We run this other experiment, and there's a visualization you can use to see that data. So providing that end-to-end data and analytics to all employees is one of the biggest pieces of advice I would provide to any companies.
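A report that says "A is better than B" usually rests on some comparison of the two variants' metrics. The Python sketch below uses a standard two-proportion z-test as one common way to make that comparison. It is a swapped-in textbook technique, not a description of Zynga's experimentation system, and the function and field names are hypothetical.

```python
import math

def compare_variants(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    conv_*: number of users who converted; n_*: users exposed.
    Returns both rates, the z statistic, and a two-sided p-value.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that A and B do not differ.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / se if se > 0 else 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"rate_a": rate_a, "rate_b": rate_b, "z": z, "p_value": p_value}
```

A small p-value suggests the difference between A and B is unlikely to be noise; with hundreds of experiments running at once, a real system would also need to account for multiple comparisons.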

One more thing is to try to get one good win. If you focus too much on technology or scalability, you might be building a battleship when you actually don’t need it yet. Incremental improvement is probably going to take you to the place that you need to get to. Just try to get a good big win of increasing installs or active users in one particular game or product and see where it goes.

Gardner: And just to revisit the idea that you've got so many employees and so many innovations going on, how do you encourage your employees to interact with the data? Do you give them total flexibility in terms of experiments? How do they start the process of some of those proof-of-concept type of activities?

Yamazaki: It's all freestyle. They can log whatever they want. They can see whatever they want, except revenue type of data, and they can create any experiments they want. Her team owns this part, but we also make the data available. Some of the games can hit real time. We can do that real-time personalization using that data that you logged. It’s almost 360-degree data availability for our product teams.

Gardner: It’s really impressive that there's so much of this data mentality ingrained in the company, from the start and also across all the employees, so that’s very interesting. How do you see that in terms of your competitive edge? Do you think the other gaming companies are doing the same thing? Do you have an advantage that you've created a data culture?

Yamazaki: Definitely, in online gaming you have to have big data to succeed. A lot of companies, though, are just getting whatever they can, then structure it, and make it analyzable. One of the things that we've done well was to create a structure to start with. So the data is already structured.

Product managers are already thinking about what they want to analyze beforehand. It's not like they just get everything in and then see what happens. They think right away about, "Is this analyzable? Is this something we want to store?" We're a lot smarter about what we want to store. Cost-wise, it's a lot more optimized.

Gardner: We'll have to leave it there. We have been hearing about how Zynga in San Francisco has, right from its inception, created a very strong culture around big data as it grabs as much data as it can from the massive volumes created by its games. It then goes further, using HP Vertica, to make the results of that data acquisition available to its employees.
I'd like to thank our guests, Joanne Ho, Senior Engineering Manager at Zynga, and Yuko Yamazaki, Head of Analytics at Zynga. And a big thank you to our audience as well, for joining us for this special new style of IT discussion.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for listening, and come back next time.


Transcript of a BriefingsDirect discussion on how data-driven companies can gain a competitive advantage in making as much analysis available to as many people in their organizations as possible. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.


Monday, July 20, 2015

How Big Data Powers GameStop to Gain Retail Advantage and Deep Insights into its Markets

Transcript of a BriefingsDirect discussion on how a gaming retailer uses big data to gather insights into sales trends and customer wants and needs.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Download the transcript. Sponsor: HP.

Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing sponsored discussion on IT innovation and how it’s making an impact on people’s lives.

Once again, we're focusing on how companies are adapting to the new style of IT to improve IT performance and deliver better user experiences, as well as better business results.

Our next innovation case study interview highlights how GameStop, based in Grapevine, Texas, uses big data to improve how it conducts its business and serves its customers. To learn more about how they deploy big data and use the resulting analytics, we are joined by John Crossen, Data Warehouse Lead at GameStop. Welcome, John.

John Crossen: Thank you for having me.
Gardner: Tell us a little bit about GameStop. Most people are probably familiar with the retail outlets that they see, where you can buy, rent, trade games, and learn more about games. Why is big data important to your organization?

Crossen: We wanted to get a better idea of who our customers are, how we can better serve our customers and what types of needs they may have. With prior reporting, we would get good overall views of here’s how the company is doing or here’s how a particular game series is selling, but we weren’t able to tie that to activities of individual customers and possible future activity of future customers, using more of a traditional SQL-based platform that would just deliver flat reports.

So, our goal was to get a more 360-degree view of our customer and we realized pretty quickly that, using our existing toolsets and methodologies, that wasn’t going to be possible. That’s where Vertica ended up coming into play to drive us in that direction.

Gardner: Just so we have a sense of this scale here, how many retail outlets does GameStop support and where are you located?

Crossen: We're international. There are approximately 4,200 stores in the US and another 2,200 international.

Gardner: And in terms of the type of data that you are acquiring, is this all internal data or do you go to external data sources, and how do you bring that together?

Internal data

Crossen: It's primarily internal data. We get data from our website. We have the PowerUp Rewards program that customers can choose to join, and we have data from individual cash registers in all those stores.

Gardner: I know from experience in my own family that gaming is a very fast-moving industry. We’ve quickly gone from different platforms to different game types and different technologies when we're interacting with the games.

It's a very dynamic changeable landscape for the users, as well as, of course, the providers of games. You are sort of in the middle. You're right between the users and the vendors. You must be very important to the whole ecosystem.

Crossen: Most definitely, and there aren’t really many game retailers left anymore. GameStop is certainly the preeminent one. So a lot of customers come not just to purchase a game, but get information from store associates. We have Game Informer Magazine that people like to read and we have content on the website as well.

Gardner: Now that you know where to get the data and you have the data, how big is it? How difficult is it to manage? Are you looking for real-time or batch? How do you then move forward from that data to some business outcome?

Crossen: It’s primarily batch at this point. The registers close at night, and we get data from registers and load that into HP Vertica. When we started approximately two years ago, we didn't have a single byte in Vertica. Now, we have pretty close to 24 terabytes of data. It's primarily customer data on individual customers, as well as web logs or mobile application data.
Gardner: I should think that when you analyze which games are being bought, which ones are being traded, which ones are price-sensitive and move at a certain price or not, you're really at the vanguard of knowing the trends in the gaming industry -- even perhaps before anyone else. How has that worked for you, and what are you finding?

Crossen: A lot of it is just based on determining who is likely to buy which series of games. So you won't market the next Call of Duty 3 or something like that to somebody who's buying your children's games. We are not going to ask people to buy Call of Duty 3, rather than My Little Pony 6.

The interesting thing, at least with games and video game systems, is that when we sell them new, there's no price movement. Every game is the same price in any store. So we have to rely on other things like customer service and getting information to the customer to drive game sales. Used games are a bit of a different story.

Gardner: Now back to Vertica. Given that you've been using this for a few years and you have such a substantial data lake, what is it about Vertica that works for you? What are you learning here at the conference that intrigues you about the future?

Quick reports

Crossen: The initial push with HP Vertica was just to get reports fast. We had processes that literally took a day to run to accumulate data. Now, in Vertica, we can pull that same data out in five minutes. I think that if we spend a little bit more time, we could probably get it faster than half of that.

The first big push was just speed. The second wave after that was bringing in data sources that were unattainable before, like web-click data, a tremendous amount of data, loading that into SQL, and then being able to query it out of SQL. This wasn't doable before, and Vertica has made it doable. At first, it was faster data, then acquiring new data and finding different ways to tie different data elements together that we haven’t done before.

Gardner: How about visualization of these reports? How do you serve up those reports and do you make your inference and analytics outputs available to all your employees? How do you distribute it? Is there sort of an innovation curve that you're following in terms of what they do with that data?

Crossen: As far as a platform, we use Tableau as our visualization tool. We’ve used a kind of ad-hoc environment to write direct SQL queries to pull data out, but Tableau serves as the primary tool.

Gardner: In that data input area, what integration technologies are you interested in? What would you like to see HP do differently? Are you happy with the way SQL, Vertica, Hadoop, and other technologies are coming together? Where would you like to see that go?

Crossen: A lot of our source systems are either SQL Server-based or just flat files. For flat files, we use the Copy Command to bring data in, and that’s very fast. With Vertica 7, they released the Microsoft SQL Connector.

So we're able to use our existing SQL Server Integration Services (SSIS) data flows and change the output from another SQL table to go directly into Vertica. It uses the Copy Command under the covers and that’s been a major improvement. Before that, we had to stage the data somewhere else and then use the Copy Command to bring it in or try to use Open Database Connectivity (ODBC) to bring it in, which wasn’t very efficient.

20/20 hindsight

Gardner: How about words of wisdom from your 20/20 hindsight? Others are also thinking about moving from a standard relational database environment towards big data stores for analytics and speed and velocity of their reports. Any advice you might offer organizations as they're making that transition, now that you’ve done it?

Crossen: Just to better understand how a column-store database works, and how that's different from a traditional row-based database. It's a different mindset for everything, starting with how you are going to lay out your data model.
For example, in a row database you would tend to freak out if you had a 700-column table. In the column stores, that doesn’t really matter. So just to get in the right mindset of here’s how a column-store database works, and not try to duplicate a row-based system in the column-store system.
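The difference in mindset can be illustrated with a small in-memory Python sketch. This is a simplified picture of row versus column layout under assumed, made-up field names; it is not how Vertica stores data on disk.

```python
# Row layout: each record keeps all of its columns together.
rows = [
    {"customer_id": 1, "store": "TX-01", "amount": 59.99},
    {"customer_id": 2, "store": "TX-01", "amount": 19.99},
    {"customer_id": 3, "store": "CA-07", "amount": 39.99},
]

def to_columns(rows):
    """Pivot a list of row dicts into one list per column.

    Assumes every row has the same keys.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

# Column layout: each column's values are stored together.
columns = to_columns(rows)

# An aggregate over one column only touches that column's values,
# no matter how many other columns the table has.
total_amount = sum(columns["amount"])
```

In the row layout, the same sum would have to walk every record and skip past the columns it does not need, which is why a very wide table hurts a row store much more than a column store.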

Gardner: Great. I am afraid we’ll have to leave it there. I’d like to thank our guest, John Crossen, the Data Warehouse Lead at GameStop in Grapevine, Texas. I appreciate your input.

Crossen: Thank you.

Gardner: And also thanks to our audience for joining us for this special new style of IT discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for listening, and come back next time.


Transcript of a BriefingsDirect discussion on how a gaming retailer uses big data to gather insights into sales trends and customer wants and needs. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.
