Tuesday, January 17, 2017

Fast Acquisition of Diverse Unstructured Data Sources Makes IDOL API Tools a Star at LogitBot

Transcript of a discussion on how high-performing big-data analysis powers an innovative artificial intelligence-based investment tool.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the Hewlett Packard Enterprise (HPE) Voice of the Customer podcast series. I’m Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on digital transformation. Stay with us now to learn how agile businesses are fending off disruption -- in favor of innovation.

Our next case study highlights how high-performing big-data analysis powers an innovative artificial intelligence (AI)-based investment opportunity and evaluation tool. We'll learn how LogitBot in New York identifies, manages, and contextually categorizes truly massive and diverse data sources.

By leveraging entity recognition APIs, LogitBot not only provides investment evaluations from across these data sets, it delivers the analysis as natural-language information directly into spreadsheets as the delivery endpoint. This is a prime example of how complex cloud-to-core-to-edge processes and benefits can be managed and exploited using the most responsive big-data APIs and services.

To describe how a virtual assistant for targeting investment opportunities is being supported by cloud-based big-data services, we're joined by Mutisya Ndunda, Founder and CEO of LogitBot in New York. Welcome.

Mutisya Ndunda: Thank you so much for having us.

Gardner: We're also here with Michael Bishop, CTO of LogitBot. Welcome, Michael.

Michael Bishop: Thank you for having us. It’s good to be here.
Gardner: Let’s look at some of the trends driving your need to do what you're doing with AI and bots, bringing together data, and then delivering it in the format that people want most. What’s the driver in the market for doing this?

Ndunda: LogitBot is all about trying to eliminate friction between people who have very high-value jobs and some of the more mundane things that could be automated by AI.

Today, in finance, the industry, in general, searches for investment opportunities using techniques that have been around for over 30 years. What tends to happen is that the people who are doing this should be spending more time on strategic thinking, ideation, and managing risk. But without AI tools, they tend to get bogged down in the data and in the day-to-day. So, we've decided to help them tackle that problem.

Gardner: Let the machines do what the machines do best. But how do we decide where the demarcation is between what the machines do well and what the people do well, Michael?

Bishop: We believe in empowering the user and not replacing the user. So, the machine is able to go in-depth and do what a high-performing analyst or researcher would do at scale, and it does that every day, instead of once a quarter, for instance, when research analysts would revisit an equity or a sector. We can do that constantly, react to events as they happen, and replicate what a high-performing analyst is able to do.

Gardner: It’s interesting to me that you're not only taking a vast amount of data and putting it into a useful format and qualitative type, but you're delivering it in a way that’s demanded in the market, that people want and use. Tell me about this core value and then the edge value and how you came to decide on doing it the way you do?

Evolutionary process

Ndunda: It’s an evolutionary process that we've embarked on or are going through. The industry is very used to doing things in a very specific way, and AI isn't something that a lot of people in financial services are necessarily familiar with. We decided to wrap it around things that are extremely intuitive to an end user who doesn't have the time to learn technology.

So, we said that we'll try to leverage as many things as possible in the back via APIs and all kinds of other things, but the delivery mechanism in the front needs to be as simple or as friction-less as possible to the end-user. That’s our core principle.

Bishop: Finance professionals generally don't like black boxes and mystery, and obviously, when you're dealing with money, you don’t want to get an answer out of a machine you can’t understand. Even though we're crunching a lot of information and making a lot of inferences, at the end of the day, they could unwind it themselves if they wanted to verify the inferences that we have made.

We're wrapping up an incredibly complicated amount of information, but it still makes sense at the end of the day. It’s still intuitive to someone. There's not a sense that this is voodoo under the covers.

Gardner: Well, let’s pause there. We'll go back to the data issues and the user-experience issues, but tell us about LogitBot. You're a startup, you're in New York, and you're focused on Wall Street. Tell us how you came to be and what you do, in a more general sense.

Ndunda: Our professional background has always been in financial services. Personally, I've spent over 15 years in financial services, and my career led me to what I'm doing today.

In the 2006-2007 timeframe, I left Merrill Lynch to join a large proprietary market-making business called Susquehanna International Group. They're one of the largest providers of liquidity around the world. Chances are whenever you buy or sell a stock, you're buying from or selling to Susquehanna or one of its competitors.

What had happened in that industry was that people were embracing technology, but it was algorithmic trading, what has become known today as high-frequency trading. At Susquehanna, we resisted that notion, because we said machines don't necessarily make decisions well, and this was before AI had been born.

Internally, we went through this period where we had a lot of discussions around, are we losing out to the competition, should we really go pure bot, more or less? Then, 2008 hit and our intuition of allowing our traders to focus on the risky things and then setting up machines to trade riskless or small orders paid off a lot for the firm; it was the best year the firm ever had, when everyone else was falling apart.

That was the first piece that got me to understand or to start thinking about how you can empower people and financial professionals to do what they really do well and then not get bogged down in the details.

Then, I joined Bloomberg and I spent five years there as the head of strategy and business development. The company has an amazing business, but it's built around the notion of static data. What had happened in that business was that, over a period of time, we began to see the marketplace valuing analytics more and more.

Make a distinction

Part of the role that I was brought in to do was to help them unwind that and decouple the two things -- to make a distinction within the company about static information versus analytical or valuable information. The trend that we saw was that hedge funds, especially the ones that were employing systematic investment strategies, were beginning to do two things: embrace AI or technology to empower their traders, and look deeper into analytics versus static data.

That was what brought me to LogitBot. I thought we could do it really well, because the players themselves don't have the time to do it and some of the vendors are very stuck in their traditional business models.

Bishop: We're seeing a kind of renaissance here, or we're at a pivotal moment, where we're moving away from analytics in the sense of business reporting tools or understanding yesterday. We're now able to mine data, get insightful, actionable information out of it, and then move into predictive analytics. And it's not just statistical correlations. I don’t want to offend any quants, but a lot of technology [to further analyze information] has come online recently, and more is coming online every day.

For us, Google had released TensorFlow, and that made a substantial difference in our ability to reason about natural language. Had it not been for that, it would have been very difficult one year ago.

At the moment, technology is really taking off in a lot of areas at once. That enabled us to move from static analysis of what's happened in the past and move to insightful and actionable information.

Ndunda: What Michael kind of touched on there is really important. A lot of traditional ways of looking at financial investment opportunities amount to saying that historically, this has happened, so history should repeat itself. We're in markets where nothing that's happening today has really happened in the past. So, relying on a backward-looking mechanism of trying to interpret the future is kind of really dangerous, versus having a more grounded approach that can actually incorporate things that are nontraditional in many different ways.

So, unstructured data, what investors are thinking, what central bankers are saying, all of those are really important inputs that were not part of any model 10 or 20 years ago. Without machine learning and some of the things that we are doing today, it’s very difficult to incorporate any of that and make sense of it in a structured way.

Gardner: So, if the goal is to make outlier events your friend and not your enemy, what data do you go to to close the gap between what's happened and what the reaction should be, and how do you best get that data and make it manageable for your AI and machine-learning capabilities to exploit?

Ndunda: Michael can probably add to this as well. We do not discriminate as far as data goes. What we like to do is have no opinion on data ahead of time. We want to get as much information as possible and then let a scientific process lead us to decide what data is actually useful for the task that we want to deploy it on.

As an example, we're very opportunistic about acquiring information about who the most important people at companies are and how they're connected to each other. Does this guy sit on a board with that one, or how do they know each other? It may not have any application at that very moment, but over the course of time, you end up building models that are actually really interesting.

We scan over 70,000 financial news sources. We capture news information across the world. We don't necessarily use all of that information on a day-to-day basis, but at least we have it and we can decide how to use it in the future.

We also monitor anything that companies file and what management teams talk about at investor conferences or on phone conversations with investors.

Bishop: Conference calls, videos, interviews.

Audio to text

Ndunda: HPE has a really interesting technology that they have recently put out. You can transcribe audio to text, and then we can apply our text processing on top of that to understand what management is saying in a structured, machine-based way. Instead of 50 people listening to 50 conference calls, you could just have a machine do it for you.

Gardner: Something we can do there that we couldn't have done before is that you can also apply something like sentiment analysis, which you couldn’t have done if it was a document, and that can be very valuable.

Bishop: Yes, even tonal analysis. There are a few theories on that, that may or may not pan out, but there are studies around tone and cadence. We're looking at it and we will see if it actually pans out.

Gardner: And so do you put this all into your own on-premises data-center warehouse or do you take advantage of cloud in a variety of different means by which to corral and then analyze this data? How do you take this fire hose and make it manageable?

Bishop: We do take advantage of the cloud quite aggressively. We're split between SoftLayer and Google. At SoftLayer we have bare-metal hardware machines and some power machines with high-power GPUs.
On the Google side, we take advantage of Bigtable and BigQuery and some of their infrastructure tools. And we have good, old PostgreSQL in there, as well as DataStax, Cassandra, and their Graph as the graph engine. We make liberal use of HPE Haven APIs as well and TensorFlow, as I mentioned before. So, it’s a smorgasbord of things you need to corral in order to get the job done. We found it very hard to find all of that wrapped in a bow with one provider.

We're big proponents of Kubernetes and Docker as well, and we leverage that to avoid lock-in where we can. Our workload can migrate between Google and the SoftLayer Kubernetes cluster. So, we can migrate between hardware or virtual machines (VMs), depending on the horsepower that’s needed at the moment. That's how we handle it.

Gardner: So, maybe 10 years ago you would have been in a systems-integration capacity, but now you're in a services-integration capacity. You're doing some very powerful things at a clip and probably at a cost that would have been impossible before.

Bishop: I certainly remember placing an order for a server, waiting six months, and then setting up the RAID drives. It's amazing that you can just flick a switch and you get a very high-powered machine that would have taken six months to order previously. In Google, you spin up a VM in seconds. Again, that's of a horsepower that would have taken six months to get.

Gardner: So, unprecedented innovation is now at our fingertips when it comes to the IT side of things, unprecedented machine intelligence, now that the algorithms and APIs are driving the opportunity to take advantage of that data.

Let's go back to thinking about what you're outputting and who uses that. Is the investment result that you're generating something that goes to a retail type of investor? Is this something you're selling to investment houses or a still undetermined market? How do you bring this to market?

Natural language interface

Ndunda: Roboto, which is the natural-language interface into our analytical tools, can be custom tailored to respond, based on the user's level of financial sophistication.

At present, we're trying them out on a semiprofessional investment platform, where people are professional traders, but not part of a major brokerage house. They obviously want to get trade ideas, they want to do analytics, and they're a little bit more sophisticated than people who are looking at investments for their retirement account. Rob can be tailored for that specific use case.

He can also respond to somebody who is managing a portfolio at a hedge fund. The level of depth that he needs to consider is the only differential between those two things.

In the back, he may do an extra five steps if the person asking the question worked at a hedge fund, versus if the person was just asking about why is Apple up today. If you're a retail investor, you don’t want to do a lot of in-depth analysis.

Bishop: You couldn’t take the app and do anything with it or understand it.

Ndunda: Rob is an interface, but the analytics are available via multiple venues. So, you can access the same analytics via an API, a chat interface, the web, or a feed that streams into you. It just depends on how your systems are set up within your organization. But, the data always will be available to you.

Gardner: Going out to that edge equation, that user experience, we've talked about how you deliver this to the endpoints, customary spreadsheets, cells, pivots, whatever. But it also sounds like you are going toward more natural language, so that you could query, rather than a deep SQL environment, like what we get with a Siri or the Amazon Echo. Is that where we're heading?

Bishop: When we started this, trying to parameterize everything that you could ask into enough checkboxes and forms polluted the screen. The system has access to an enormous amount of data that you can't create a parameterized screen for. We found it was a bit of a breakthrough when we were able to start using natural language.

TensorFlow made a huge difference here in natural language understanding, understanding the intent of the questioner, and being able to parameterize a query from that. If our initial findings here pan out or continue to pan out, it's going to be a very powerful interface.

I can't imagine having to go back to a SQL query if you're able to do it in natural language, and it really pans out this time, because we’ve had a few turns of the handle of alleged natural-language querying.
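
[Editor's illustration] The step Bishop describes, turning a free-form question into a parameterized query, can be sketched with a tiny rule-based parser. This is not LogitBot's method, which according to Bishop relies on TensorFlow-based natural language understanding. The `ParsedQuery` type, the ticker table, the intent labels, and the regular expressions below are all hypothetical stand-ins chosen only to show the shape of the idea.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table; a real system would use entity recognition.
TICKERS = {"apple": "AAPL", "google": "GOOGL", "microsoft": "MSFT"}

# Words for small numbers, so "six months" parses like "6 months".
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "six": 6, "twelve": 12}


@dataclass
class ParsedQuery:
    intent: str                    # e.g. "forward_return" or "explain_move"
    ticker: Optional[str]          # resolved symbol, if a known company was named
    horizon_months: Optional[int]  # time horizon, if one was stated


def parse_question(question: str) -> ParsedQuery:
    text = question.lower()

    # Intent: crude keyword matching stands in for a learned classifier.
    if "return" in text:
        intent = "forward_return"
    elif re.search(r"\bwhy\b", text):
        intent = "explain_move"
    else:
        intent = "unknown"

    # Entity: first known company name mentioned in the question.
    ticker = next((sym for name, sym in TICKERS.items() if name in text), None)

    # Horizon: "<number> month(s)", with the number as digits or a word.
    horizon = None
    match = re.search(r"(\d+|[a-z]+)\s+months?", text)
    if match:
        token = match.group(1)
        horizon = int(token) if token.isdigit() else NUMBER_WORDS.get(token)

    return ParsedQuery(intent, ticker, horizon)
```

Under these assumptions, a question such as "What is the likely return of Apple over the next six months?" resolves to an intent, a ticker, and a horizon that a downstream query layer could consume, with no checkboxes or forms on screen.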

Gardner: And always a moving target. Tell us specifically about SentryWatch and Precog. How do these shake out in terms of your go-to-market strategy?

How everything relates

Ndunda: One of the things that we have to do to be able to answer a lot of questions that our customers may have is to monitor financial markets and what's impacting them on a continuous basis. SentryWatch is literally a byproduct of that process where, because we're monitoring over 70,000 financial news sources, we're analyzing the sentiment, we're doing deep text analysis on it, we're identifying entities and how they're related to each other, in all of these news events, and we're sticking that into a knowledge graph of how everything relates to everything else.

It ends up being a really valuable tool, not only for us, but for other people, because while we're building models, there are also a lot of hedge funds that have proprietary models or proprietary processes that could benefit from that very same organized relational data store of news. That's what SentryWatch is and that's how it's evolved. It started off with something that we were doing as an input and it's actually now a valuable output or a standalone product.
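
[Editor's illustration] The process Ndunda outlines (identify entities in each news item, score sentiment, and record how entities relate) can be sketched as a simple co-occurrence graph. This is a minimal illustration and not SentryWatch's implementation. It assumes each `NewsItem` arrives with its entities and a sentiment score already attached by upstream extraction steps, and every name in the code is hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, List


@dataclass
class NewsItem:
    headline: str
    entities: List[str]   # assumed output of an entity-extraction step
    sentiment: float      # assumed score in [-1.0, 1.0]


@dataclass
class Edge:
    mentions: int = 0            # number of items naming both entities
    sentiment_total: float = 0.0

    @property
    def mean_sentiment(self) -> float:
        return self.sentiment_total / self.mentions if self.mentions else 0.0


def build_graph(items: List[NewsItem]) -> Dict[FrozenSet[str], Edge]:
    """Link every pair of entities that co-occur in the same news item."""
    graph: Dict[FrozenSet[str], Edge] = defaultdict(Edge)
    for item in items:
        # sorted(set(...)) removes duplicates and makes pair order stable.
        for a, b in combinations(sorted(set(item.entities)), 2):
            edge = graph[frozenset((a, b))]
            edge.mentions += 1
            edge.sentiment_total += item.sentiment
    return graph


def neighbors(graph: Dict[FrozenSet[str], Edge], entity: str) -> List[str]:
    """Entities that have appeared in at least one item with `entity`."""
    return sorted(
        other for pair in graph if entity in pair for other in pair - {entity}
    )
```

Keying edges by an unordered pair keeps the graph undirected, which fits the "how everything relates to everything else" framing. A production knowledge graph would likely carry typed relationships and timestamps as well.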

Precog is a way for us to showcase the ability of a machine to be predictive and not be backward looking. Again, when people are making investment decisions or allocation of capital across different investment opportunities, you really care about your forward return on your investments. If I invested a dollar today, am I likely to make 20 cents in profit tomorrow or 30 cents in profit tomorrow?

We're using pretty sophisticated machine-learning models that can take into account unstructured data sources as part of the modeling process. That will give you these forward expectations about stock returns in a very easy-to-use format, where you don't need to have a PhD in physics or mathematics.

You just ask, "What is the likely return of Apple over the next six months," taking into account what's going on in the economy. Apple was fined $14 billion. That can be quickly added into a model and reflect a new view in a matter of seconds versus sitting down in a spreadsheet and trying to figure out how it all works out.

Gardner: Even for Apple, that's a chunk of change.

Bishop: It's a lot of money, and you can imagine that there were quite a few analysts on Wall Street in Excel, updating their models around this so that they could have an answer by the end of the day, where we already had an answer.

Gardner: How do the HPE Haven OnDemand APIs help Precog when it comes to deciding on those sources and getting them in the right format, so that you can exploit them?

Ndunda: The beauty of the platform is that it simplifies a lot of development processes that an organization of our size would have to take on themselves.

The nice thing about it is that a drag-and-drop interface is really intuitive; you don't need to be specialized in Java, Python, or whatever it is. You can set up your intent in a graphical way, and then test it out, build it, and expand it as you go along. The Lego-block structure is really useful, because if you want to try things out, it's drag and drop, connect the dots, and then see what you get on the other end.

For us, that's an innovation that we haven't seen with anybody else in the marketplace and it cuts development time for us significantly.

Gardner: Michael, anything more to add on how this makes your life a little easier?

Lowering cost

Bishop: For us, lowering the cost in time to run an experiment is very important when you're running a lot of experiments, and the Combinations product enables us to run a lot of varied experiments using a variety of the HPE Haven APIs in different combinations very quickly. You're able to get your development time down from a week, two weeks, whatever it is to wire up an API to assist them.

In the same amount of time, you're able to wire the initial connection and then you have access to pretty much everything in Haven. You turn it over to either a business user, a data scientist, or a machine-learning person, and they can drag and drop the connectors themselves. It makes my life easier and it makes the developers’ lives easier because it gets back time for us.

Gardner: So, not only have we been able to democratize the querying, moving from SQL to natural language, for example, but we’re also democratizing the choice on sources and combinations of sources in real time, more or less for different types of analyses, not just the query, but the actual source of the data.

Bishop: Correct.

Ndunda: Again, the power of a lot of this stuff is in the unstructured world, because valuable information typically tends to be hidden in documents. In the past, you'd have to have a team of people to scour through text, extract what they thought was valuable, and summarize it for you. You could miss out on 90 percent of the other valuable stuff that's in the document.

This ability now to drag and drop and then go through a document in five different iterations, just by tweaking a parameter, is really useful.

Gardner: So those will be IDOL-backed APIs that you are referring to.

Ndunda: Exactly.

Bishop: It’s something that would be hard for an investment bank, even a few years ago, to process. Everyone is on the same playing field here or starting from the same base, but dealing with unstructured data has been traditionally a very difficult problem. You have a lot of technologies coming online as APIs; at the same time, they're also coming out as traditional on-premises [software and appliance] solutions.

We're all starting from the same gate here. Some folks are a little ahead, but I'd say that Facebook is further ahead than an investment bank in their ability to reason over unstructured data. In our world, I feel like we're starting basically at the same place that Goldman or Morgan would be.

Gardner: It's a very interesting reset that we’re going through. It's also interesting that we talked earlier about the divide between where the machine and the individual knowledge worker begins or ends, and that's going to be a moving target. Do you have any sense of how that changes its characterization of what the right combination is of machine intelligence and the best of human intelligence?

Empowering humans

Ndunda: I don’t foresee machines replacing humans, per se. I see them empowering humans, and to the extent that your role is not completely based on a task, if it's based on something where you actually manage a process that goes from one end to another, those particular positions will be there, and the machines will free our people to focus on that.

But, in the case where you have somebody who is really responsible for something that can be automated, then obviously that will go away. Machines don't eat, they don’t need to take vacation, and if it’s a task where you don't need to reason about it, obviously you can have a computer do it.

What we're seeing now is that if you have a machine sitting side by side with a human, and the machine can pick up on how the human reasons with some of the new technologies, then the machine can do a lot of the grunt work, and I think that’s the future of all of this stuff.

Bishop: What we're delivering is that we distill a lot of information, so that a knowledge worker or decision-maker can make an informed decision, instead of watching CNBC and being a single-source reader. We can go out and scour the best of all the information, distill it down, and present it, and they can choose to act on it.

Our goal here is not to make the next jump and make the decision. Our job is to present the information to a decision-maker.

Gardner: It certainly seems to me that the organization, big or small, retail or commercial, can make the best use of this technology. Machine learning, in the end, will win.

Ndunda: Absolutely. It is a transformational technology, because for the first time in a really long time, the reasoning piece of it is within grasp of machines. These machines can operate in the gray area, which is where the world lives.

Gardner: And that gray area can almost have unlimited variables applied to it.

Ndunda: Exactly. Correct.
Gardner: I'm afraid we'll have to leave it there. We've been exploring how high-performing big-data analysis powers an innovative artificial intelligence-based investment opportunity and evaluation tool, and we've learned how LogitBot in New York identifies, manages, and contextually categorizes truly massive and diverse data sources.

So please join me in thanking our guests, Mutisya Ndunda, Founder and CEO of LogitBot in New York. Thank you, sir.

Ndunda: It was a pleasure. Thank you so much.

Gardner: We've also been here with Michael Bishop, CTO of LogitBot. Thank you, Michael.

Bishop: Thank you, Dana.

Gardner: And a big thank you as well to our audience for joining us for this Hewlett Packard Enterprise Voice of the Customer digital transformation discussion.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored interviews. Thanks again for listening, and do come back next time.


Transcript of a discussion on how high-performing big-data analysis powers an innovative artificial intelligence-based investment opportunity. Copyright Interarbor Solutions, LLC, 2005-2016. All rights reserved.

Friday, January 06, 2017

How Lastminute.com Uses Machine Learning to Improve Real-Time Travel Bookings

Transcript of a discussion on how lastminute.com manages massive volumes of data to support a cutting-edge machine-learning algorithmic approach to matching the best experience in travel with end user requirements.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the Hewlett Packard Enterprise (HPE) Voice of the Customer podcast series. I’m Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on digital transformation. Stay with us now to learn how agile businesses are fending off disruption -- in favor of innovation.

Our next case study highlights how online travel and events pioneer lastminute.com leverages big-data analytics with speed at scale to provide business advantages to travel bookings. We'll explore how lastminute.com manages massive volumes of data to support cutting-edge machine-learning algorithms to allow for speed and automation while buying such services online.

So join us as we learn how a culture of IT innovation helps make highly dynamic customer interactions for online travel a major differentiator for lastminute.com. To describe how machine learning is improving travel services fulfillment, we're joined by Filippo Onorato, Chief Information Officer at lastminute.com group in Chiasso, Switzerland. Welcome, Filippo.

Filippo Onorato: Thank you very much.

Gardner: Most people these days are trying to do more things more quickly amid higher complexity. What is it that you're trying to accomplish in terms of moving beyond disruption and being competitive in a highly complex area?
Onorato: The travel market -- and in particular the online travel market -- is a very fast-moving market, and the habits and behaviors of the customers are changing so rapidly that we have to move fast.

Disruption is coming every day from different actors ... [requiring] a different way of constructing the customer experience. In order to do that, you have to rely on very big amounts of data -- just to study the evolution of the customer and their behaviors.

Gardner: And customers are more savvy; they really know how to use data and look for deals. They're expecting real-time advantages. How is the sophistication of the end user impacting how you work at the core, in your data center, and in your data analysis, to improve your competitive position?

Onorato: Once again, customers are normally looking for information, and providing the right information at the right time is key to our success. The brand we came from was called Bravofly and Volagratis in Italy; that means "free flight." The competitive advantage we have is to provide a comparison among all the different airline tickets, where the market is changing rapidly from the standard airline behavior to the low-cost ones. Customers are eager to find the best deal, the best price for their travel requirements.

So, the ability to construct their customer experience in order to find the right information at the right time, comparing hundreds of different airlines, was the competitive advantage we made our fortune on.

Gardner: Let’s edify our listeners and readers a bit about lastminute.com. You're global. Tell us about the company and perhaps your size, employees, and the number of customers you deal with each day.

Most famous brand

Onorato: We are 1,200 employees worldwide. Lastminute.com, the most famous brand worldwide, was acquired by the Bravofly Rumbo Group two years ago from Sabre. We own Bravofly; that was the original brand. We own Rumbo; that is very popular in Spanish-speaking markets. We own Volagratis in Italy; that was the original brand. And we own Jetcost; that is very popular in France. That is actually a metasearch, a combination of search and competitive comparison between all the online travel agencies (OTAs) in the market.

We span across 40 countries, we support 17 languages, and we help almost 10 million people fly every year.

Gardner: Let’s dig into the data issues here, because this is a really compelling use-case. There's so much data changing so quickly, and sifting through it is an immense task, but you want to bring the best information to the right end user at the right time. Tell us a little about your big-data architecture, and then we'll talk a little bit about bots, algorithms, and artificial intelligence.

Onorato: The architecture of our system is pretty complex. On one side, we have to react almost instantly to the search that the customers are doing. We have a real-time platform that's grabbing information from all the providers, airlines, other OTAs, hotel providers, bed banks, or whatever.

We concentrate all this information in a huge real-time database, using a lot of caching mechanisms, because the speed of the search, the speed of giving results to the customer, is a competitive advantage. That's the real-time part of our development that constitutes the core business of our industry.
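
[Editor's illustration] One common way to combine fast responses with volatile provider data is a time-to-live (TTL) cache, which serves repeated searches from memory and goes back to the provider only when an entry is missing or stale. The sketch below illustrates that general technique and is not a description of lastminute.com's platform. The `TTLCache` class, its `get_or_load` method, and the 60-second default are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache values for `ttl_seconds`, then fall back to the slow source."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so tests can control time
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh entry: skip the provider call
        value = loader()             # missing or stale: query the provider
        self._store[key] = (now, value)
        return value
```

With a search key such as `("MXP", "LHR", "2017-02-01")`, the first lookup would call the provider and later lookups within the TTL would be answered from memory. The TTL is the trade-off knob: shorter values track price changes more closely, longer values shed more provider traffic.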

Gardner: And this core of yours, these are your own data centers? How have you constructed them and how do you manage them in terms of on-premises, cloud, or hybrid?

Onorato: It's all on-premises, and this is our core infrastructure. On the other hand, all that data that is gathered from the interaction with the customer is partially captured. This is the big challenge for the future -- having all that data stored in a data warehouse. That data is captured in order to build our internal knowledge. That would be the sales funnel.

So, that covers the behavior of the customer and the percentage of conversion at each and every step that the customer takes, from the search to the actual booking. That data is gathered together in a data warehouse that is based on HPE Vertica, and then analyzed in order to find the best places to optimize the conversion. That’s the main usage of the data warehouse.
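
The funnel analysis mentioned here can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the step names and session counts are invented, and in practice the counts would come from a query against the data warehouse rather than a hard-coded list.

```python
def funnel_conversion(step_counts):
    """Return step-to-step conversion rates for an ordered funnel.

    step_counts is a list of (step_name, sessions_reaching_step) pairs,
    ordered from the first step (search) to the last (booking).
    """
    rates = []
    for (prev_name, prev_n), (name, n) in zip(step_counts, step_counts[1:]):
        rate = n / prev_n if prev_n else 0.0   # guard against empty steps
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


def weakest_step(step_counts):
    """Return the transition with the lowest conversion rate."""
    return min(funnel_conversion(step_counts), key=lambda r: r[1])


# Hypothetical session counts for one day.
example = [("search", 1000), ("select", 400), ("checkout", 100), ("booking", 50)]
print(weakest_step(example))
```

The transition with the lowest rate is the natural first candidate for optimization, which is the kind of question the speaker describes asking of the session data.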

On the other hand, what we're implementing on top of all this enormous amount of data is session-related data. You can imagine how much data a single interaction of a customer can generate. Right now, we're storing a short history of that data, but the goal is to have two years' worth of session data. That would be an enormous amount of data.

Gardner: And when we talk about data, often we're concerned about velocity and volume. You've just addressed volume, but velocity must be a real issue, because any change in a weather issue in Europe, for example, or a glitch in a computer system at one airline in North America changes all of these travel data points instantly.

Unpredictable events

Onorato: That’s also pretty typical in the tourism industry. It's a very delicate business, because we have to react to unpredictable events that are happening all over the world. In order to better optimize margins, search results, etc., we're also applying machine-learning algorithms, because a human can't react fast enough to the ever-changing market or situation.

In those cases, we use optimization algorithms to fine-tune our search results, to better deal with a customer request, and to propose the best deal at the right time. In very simple terms, that's our core business right now.

Gardner: And Filippo, only your organization can do this, because the people with the data on the back side can’t apply the algorithm; they have only their own data. It’s not something the end user can do on the edge, because they need to receive the results of the analysis and the machine learning. So you're in a unique, important position. You're the only one who can really apply the intelligence, the AI, and the bots to make this happen. Tell us a little bit about how you approached that problem and solved it.
Onorato: I perfectly agree. We are the collector of an enormous amount of product-related information on one side. On the other side, what we're collecting are the customer behaviors. Matching the two is unique for our industry. It's definitely a competitive advantage to have that data.

Then, what you do with all that data is something that is pushing us to do continuous innovation and continuous analysis. By the way, I don't think something can be implemented without a lot of training and a lot of understanding of the data.

Just to give you an example, what we're implementing, the machine-learning algorithm that is called multi-armed bandit, is a kind of parallel testing of different configurations of parameters that are presented to the final user. This algorithm reacts to a specific set of conditions and proposes the best combination of order, visibility, pricing, and whatever to the customer in order to satisfy their search.

What we really do in that case is to grab information, build our experience into the algorithm, and then optimize this algorithm every day, by changing parameters, by also changing the type of data that we're inputting into the algorithm itself.
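
To make the multi-armed bandit idea concrete, here is a minimal Python sketch using the epsilon-greedy strategy. The interview does not say which bandit strategy lastminute.com uses, so epsilon-greedy is an assumption chosen for simplicity, and the class name, arm names, and reward definition (1 for a booking, 0 otherwise) are hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit over a fixed set of arms.

    Each arm stands for one configuration shown to users (for example,
    one ordering of search results). Rewards are numeric, e.g. 1 for a
    booking and 0 for no booking.
    """

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def choose(self):
        """Explore a random arm with probability epsilon, else exploit the best."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        """Fold one observed reward into the arm's running mean."""
        self.pulls[arm] += 1
        n = self.pulls[arm]
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / n
```

In use, each incoming search would call `choose()` to pick a configuration, and the outcome of that session would be passed back through `update()`. Unlike a fixed A/B test, traffic shifts toward the better-performing configuration while the experiment is still running, which fits the daily re-tuning the speaker describes.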

So, it’s an ongoing experience; it’s an ongoing study. It's endless, because the market conditions are changing and the actors in the market are changing as well, coming from the two operators in the past, the airline and now the OTA. We're also a metasearch, aggregating products from different OTAs. So, there are new players coming in and they're always coming closer and closer to the customer in order to grab information on customer behavior.

Gardner: It sounds like you have a really intense culture of innovation, and that's super important these days, of course. As we were hearing at the HPE Big Data Conference 2016, the feedback loop element of big data is now really taking precedence. We have the ability to manage the data, to find the data, to put the data in a useful form, but we're finding new ways. It seems to me that the more people use our websites, the better that algorithm gets, the better the insight to the end user, therefore the better the result and user experience. And it never ends; it always improves.

How does this extend? Do you take it to now beyond hotels, to events or transportation? It seems to me that this would be highly extensible and the data and insights would be very valuable.

Core business

Onorato: Correct. The core business was initially the flight business. We were born by selling flight tickets. Hotels and pre-packaged holidays were the second step. Then, we provided information about lifestyle. For example, in London we have an extensive offer of theater, events, shows, whatever, that are aggregated.

Also, we have a smaller brand regarding restaurants. We're offering car rental. We're also giving value-added services to the customer, because the journey of the customer doesn't end with the booking. It continues throughout the trip, and we're providing information regarding the check-in; web check-in is a service that we provide. There are a lot of ancillary businesses that are making the overall travel experience better, and that’s the goal for the future.

Gardner: I can even envision where you play a real-time concierge, where you're able to follow the person through the trip and be available to them as a bot or a chat. This edge-to-core capability is so important, and that big data feedback, analysis, and algorithms are all coming together very powerfully.

Tell us a bit about metrics of success. How can you measure this? Obviously a lot of it is going to be qualitative. If I'm a traveler and I get what I want, when I want it, at the right price, that's a success story, but you're also filling every seat on the aircraft or you're filling more rooms in the hotels. How do we measure the success of this across your ecosystem?

Onorato: In that sense, we're probably a little bit farther away from the real product, because we're an aggregator. We don’t have the risk of running a physical hotel, and that's where we're actually very flexible. We can jump from one location to another very easily, and that's one of the competitive advantages of being an OTA.

But the success overall right now is giving the best information at the right time to the final customer. What we're measuring right now is definitely the voice of the customer, the voice of the final customer, who is asking for more and more information, more and more flexibility, and the ability to live an experience in the best way possible.

So, we're also providing a brand that is associated with wonderful holidays, having fun, etc. 

Gardner: The last question is for those who are still working on building out their big data infrastructure, trying to attain this cutting-edge capability and start to take advantage of machine learning, artificial intelligence, and so forth. If you could do it all over again, what would you tell them? What would be your advice to somebody who is still in the early stages of their big data journey?

Onorato: It is definitely based on two factors -- having the best technology and not always trying to build your own technology, because there are a lot of products in the market that can speed up your development.

And also, it's having the best people. Having the best people is one of the competitive advantages of any company that is running this kind of business. You have to rely on fast learners, because market conditions are changing, technology is changing, and people need to train themselves very fast. So, you have to invest in people and invest in the best technology available.

Gardner: I'm afraid we will have to leave it there. We've been exploring how online travel and events pioneer lastminute.com group leverages big-data analytics with incredible speed and at huge scale to provide business advantages in travel and other bookings.

And we've learned how the OTA manages these massive volumes of data to support a cutting-edge machine-learning algorithmic approach to matching the best experience in travel with end user requirements.

So, please join me in thanking our guest, Filippo Onorato, Chief Information Officer at lastminute.com group in Chiasso, Switzerland. Thank you, sir.
Onorato: Thank you very much.

Gardner: And thanks to our audience as well for joining us for this Hewlett Packard Enterprise Voice of the Customer digital transformation discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored interviews. Thanks again for listening, and please do come back next time.


Transcript of a discussion on how lastminute.com manages massive volumes of data to support a cutting-edge machine-learning algorithmic approach to matching the best experience in travel with end user requirements. Copyright Interarbor Solutions, LLC, 2005-2017. All rights reserved.


Monday, December 19, 2016

Veikkaus Digitally Transforms as it Emerges as New Combined Finnish National Gaming Company

Transcript of a discussion on how a culture of IT innovation is helping to establish a single wholly nationally owned company to operate gaming in Finland.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Welcome to the next edition to the Hewlett Packard Enterprise (HPE) Voice of the Customer podcast series. I’m Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on digital transformation. Stay with us now to learn how agile businesses are fending off disruption in favor of innovation.

Our next case study highlights how combined Finnish national gaming company, Veikkaus, manages a complex merger process while bringing a digital advantage to both its operations and business model. We'll explore how Veikkaus uses a powerful big-data analytics platform to respond rapidly to the challenges of digitization.

Learn how a culture of IT innovation has established a single nationally owned company to operate gaming and gambling in Finland. To describe how Veikkaus is transforming itself for better customer experience from the computing core to the edge, we're joined by Harri Räsänen, Information and Communications Technology Architect at Veikkaus in Helsinki.

Welcome, Harri.

Harri Räsänen: Thank you. It’s nice to be here.
Gardner: Why has Veikkaus reinvented its data infrastructure technology?

Räsänen: Our data warehouse solution was a traditional data warehouse, and had been around for 10 years. Different things had gone wrong. One of the key issues we faced was that our data wasn’t real-time. It was far from real time -- it was data that was one or two days old.

We decided that we needed to get data quicker and in more detail, because what we had at that point was aggregate data.

Gardner: What were some of your top requirements technically in order to accomplish that?

Real-time data

Räsänen: As I said, we had quite an old-fashioned data warehouse. Initially, we needed our game-service provider to feed us data more in real-time. They needed to build up a mechanism to complete data, and we needed to build out capabilities to gather it. We needed to rethink the information structure -- totally from scratch.

Gardner: When we think about gambling, gaming, or lotteries, in many cases, this is an awful lot of data, a very big undertaking. Give us a sense of the size of the data and the disparity of the three organizations that came together including the Finnish national football gaming reorganization.

Räsänen: I'll talk about our current-situation records, for the new combined company we are starting up in 2017.

We have a big company from a customer point of view. We have 1.8 million consumers. Finland has a population of 5.5 million. So, we have quite a lot of Finnish consumers. When it comes to transactions, we get one to three million transactions per day. So it’s quite large, if you think about the transactional data.

In addition to that, we gather different kinds of information about our web store; it’s one of the biggest retail web stores in Finland.

Gardner: It’s one thing to put in a new platform, but it’s another to then change the culture and the organization -- and transform into a digital business. How is the implementation of your new data environment aiding in your cultural shift?

Räsänen: Luckily, Veikkaus has a background of doing things quite analytically. If you think about a web store, there is a culture that we need to be able to monitor what we're doing if we're running some changes in our web store -- whether it works or not. That’s a good thing.

But, we are redoing our whole data technology. We added the Apache Kafka integration point and then, Cloudera, the Hadoop system. Then, we added a new ETL tool for us, Pentaho, and last but not least, HPE Vertica. It's been really challenging for us, with lots of different things to consider and learn.

Luckily, we've been able to use good external consultants to help us out, but as you said, we can always make the technology work better. In transforming the culture of doing things, we're still definitely in the middle of our journey.

Gardner: I imagine you'll want to better analyze what takes place within your organization so it’s not just serving the data and managing the transactions. There's an opportunity to have a secondary benefit, which is more control of your data. The more insight you have allows you to adapt and improve your customer experience and customer service. Have you been able to start down that path of that secondary analysis of what goes on internally?

New level of data

Räsänen: Some of our key data was even out of our hands in our service-provider environments. We wanted to get all the relevant data with us, and now we've been working on that new level of data access. We have analysts working on that, both IT and business people, browsing the data. They already have some findings on things that previously they couldn't have asked or even thought about. So, we have been getting our information up-to-date.

Gardner: Can you give us more specific examples of how you've been able to benefit from this new digital environment?

Räsänen: Yeah, consumer communication on CRM is one of the key successes, things we needed to have in place. We've been able to constantly improve on that. Before, we had data that was too old, but now, we have near real-time data. We get one-minute-old data, so we can communicate with the consumers better. We know whether they've been playing their lotteries or betting on football matches.

We can say, "It’s time for football today, and you haven’t yet placed a bet." We can communicate, and on the other hand, we can avoid disturbing customers by sending out e-mails or SMS messages about things they've already done.
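
The reminder logic described here can be sketched as a simple filter over recent transactions. This Python example is only an illustration: the function name, the customer ids, and the 24-hour window are hypothetical, and a real system would read near real-time bet events from the data platform rather than from an in-memory list.

```python
from datetime import datetime, timedelta


def customers_to_remind(customers, recent_bets, now, window=timedelta(hours=24)):
    """Return ids of customers who have not placed a bet within the window.

    customers is an iterable of customer ids; recent_bets is an iterable
    of (customer_id, placed_at) pairs. Customers who already played are
    skipped so they do not receive a redundant e-mail or SMS.
    """
    cutoff = now - window
    already_played = {cid for cid, placed_at in recent_bets if placed_at >= cutoff}
    return [cid for cid in customers if cid not in already_played]
```

The fresher the bet data, the fewer people receive a reminder for something they did a few minutes ago, which is why one-minute-old data matters more here than day-old data.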
Gardner: Yes, less spam, but more help. It’s important, of course, with any organization like this in a government environment, for trust and safety to be involved. I should think that there's some analysis to help keep people from overdoing it and managing the gaming for a positive outcome.

Räsänen: Definitely. That’s one of the key metrics we're measuring with our consumers so that gaming is responsible. We need to see that all the things they do can be thought of as good, because as you said, we're a national company, it’s a very regulated market, and that kind of thing.

Gardner: But a great deal of good comes from this. I understand that more than 1 billion euros a year go to the common good of people living in Finland. So, there are a lot of benefits when this is done properly.

Now that you've gone quite a ways into this, and you're going to need to be going to the new form and new organization the first of 2017, what advice would you be able to give to someone who is beginning a big data consolidation and modernization journey? What lessons have you learned that you might share?

Out of the box

Räsänen: If you're experimenting, you need to start to think a little bit out of the box. Integration is one of the crucial parts, and you should avoid direct integration as much as possible.

We're utilizing Apache Kafka as an integration point, and that’s one of the crucial things, because then you can "appify" everything. You're going to provide an application interface for integrating systems and that will help those of us in gaming companies.

Gardner: A lot of services orientation?

Räsänen: That’s one of the components of our data architecture. We have been using our Cloudera Hadoop system for archiving and we are building our capabilities on top of that. In addition, of course, we have HPE Vertica. It’s one of the most crucial things in our data ecosystem because it’s a traditional enterprise data warehouse in the sense that it is a SQL database. Users can benefit from that, and it’s lightning-fast. You need to design all the components and make each one work in the role it is best suited for.

Gardner: And of course SQL is very commonly understood as the query language. There's no great change there, but it's really putting it into the hands of more people.

Räsänen: I've been writing or talking in SQL since the beginning of the ’90s, and it’s actually a pretty easy language to communicate, even between business and IT, because at least, at some level, it’s self-explanatory. That’s where the communication matters.

Gardner: Just a much better engine under the hood, right?

Räsänen: Yeah, exactly.

Gardner: I am afraid we'll have to leave it there. We've been exploring how combined Finnish state gaming company, Veikkaus, is managing a complex merger process, while also bringing more of a digital advantage to its operations. And we've learned how a culture of IT innovation is helping to establish a state company to operate gaming in Finland.

Please join me in thanking our guest, Harri Räsänen, Information and Communications Technology Architect at Veikkaus in Helsinki. Thank you, Harri.
Räsänen: Thank you.

Gardner: And a big thank you as well to our audience for joining us for this Hewlett Packard Enterprise Voice of the Customer digital transformation discussion.

I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored interviews. Thanks again for listening, and please do come back next time.


Transcript of a discussion on how a culture of IT innovation is helping to establish a single wholly owned state company to operate gaming in Finland. Copyright Interarbor Solutions, LLC, 2005-2016. All rights reserved.

You may also be interested in: