Tuesday, December 01, 2015

Nottingham Trent University Elevates Big Data’s Role to Improving Student Retention in Higher Education

Transcript of a BriefingsDirect discussion on how one university uses big data analysis to track student performance and quickly identify at-risk students.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the HPE Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Gardner
Our next big-data case-study interview examines how Nottingham Trent University in England has devised and implemented an information-driven way to encourage higher student retention.

By gathering diverse data and information and making rapid analysis, Nottingham Trent is able to quickly identify those students having difficulties. They can thereby provide significant reductions in dropout rates while learning more about what works best to usher students into successful academic careers.

What’s more, the analysis of student metrics is also setting up the ability to measure more aspects of university life and quality of teaching, and to make valuable evidence-based correlations that may well describe what the next decades of successful higher education will look like.

To learn more about taking a new course in the use of data science in education, we're pleased to welcome Mike Day, Director of Information Systems at Nottingham Trent University in Nottingham, UK. Welcome, Mike.

Mike Day: Thank you.

Gardner: First tell us about Nottingham Trent University. It’s a unique institute, a very large student body and many of them attending university for the first time in their families.

Day: That’s right. We've had around 28,000 students over the last few years, and that’s probably going to increase this year to around 30,000 students. We have, as you say, many, many students who come from poor backgrounds -- what we call "widening participation" students. Many of them are first generation in their family to go to university.

Sometimes, those students are a little bit under-confident about going to university. We’ve come to call them "doubter students," and those doubters are the kinds of people that when they struggle, they believe it’s their fault, and so they typically don't ask for help.

Gardner: So it's incumbent upon you to help them know better where to look for help and not internalize that. What do you use to measure the means by which you can identify students that are struggling?

Low dropout rate

Day: We’ve always done very well in Nottingham Trent. We had a relatively low dropout rate, about seven percent or so, which is better than sector average. Nevertheless, it was really hard for us to keep students on track throughout their studies, especially those who were struggling early in their university career. We tended to find that we have to put a lot of effort into supporting students when they had failed exams, which for us, was too late.

Day
We needed to try to find a way where we could support our students as early as possible. To do that, we had to identify those students who were finding it a bit harder than the average student and were finding it quite difficult to put their hand up and say so.

So we started to look at the data footprint that a student left across the university, whether that was a smart card swipe to get them in and out of buildings or to use printers, or their use of the library, in particular taking library books out, or accessing learning materials through our learning management system. We wanted to see whether those things would give us some indication as to how well students were engaged in their studies and therefore, whether they're struggling or not.

Gardner: So this is not really structured information, not something you would go to a relational database for, part of a structured packaged application, for example. It's information that we might think of as breadcrumbs around the organization that you need to gather. So what was the challenge for dealing with such a wide diversity of information types?
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
Day: We had a very wide variety of information types. Some of it was structured, and we put a lot effort into getting some good quality data over the years, but some of it was unstructured. Trying to bring those different and disparate datasets together was proving very difficult to do in very traditional business intelligence (BI) ways.

We needed to know, in about 600 terabytes of data, what really mattered, what were the factors that in combination told us something about how successful students behave, and therefore something about comparing those that were not having such an easy time at the university how to compare that to those who were succeeding in it.
We needed ultimately to get to a position where we could create great relationships between people, particularly between tutors or academic counselors and individual students.

Gardner: It sounds as if the challenges were not only in the gathering of good information but in how to then use that effectively in drawing correlations that would point out where students rapidly were struggling. Tell us about both the technology side and then also the methodologies that you then use to actually create those correlations?

Day: You're absolutely right. It was very difficult to find out what matters and to get the right data for that. We needed ultimately to get to a position where we could create great relationships between people, particularly between tutors or academic counselors and individual students.

On the technology side, we engaged with a partner, that was a company called DTP Solutionpath, who brought with them the HPE IDOL engine. That allowed us to submit about five years worth of back data into the IDOL engine to try to create a model of engagement, in other words, to pick out what factors within that data in combination gave as a high confidence around student engagement.

Our partners did that. They worked very closely with us in a very collaborative way, with our academic staff, with our students, importantly -- because we have to be really clear and transparent about what we are doing in all of this, from an ethical point of view -- and with my IT technical team. And that collaboration really helped us to boil down what sorts of things really mattered.

Anonymizing details

Gardner: When you look at this ethically you have to anonymize a great deal of this data in order to adhere to privacy and other compliance issues. Is that the case?

Day: Actually, we needed to be able to identify individual students in all of this, and so there were very real privacy issues in all of this. We had to check quite carefully our legal position to make sure that we did comply with UK Data Protection Act, but that’s only a part of it.

What’s acceptable to the organization and ultimately to individual students is perhaps even more important than the strict legal position in all of this. We worked very hard to explain to students and staff what we were trying to do and to get them on board early, at the beginning of this project, before we had gone too far down the track, to understand what would be acceptable and what wouldn’t.

Gardner: I suppose it’s important to come off as a big brother and not the Big Brother in this?

Day: Absolutely. Friendly big brother is exactly what we needed to be. In fact, we found that how we engage with our student body was really important in all of this. If we try to explain this in a technical way. then it was very much Big Brother. But when we started to say, "We're trying to give you the very best possible support, such that you are most likely to succeed in your time in higher education and reap the rewards of your investment in higher education," then it became a very different story.
We worked very hard to explain to students and staff what we were trying to do and to get them on board early.

Particularly, when we were able to demonstrate the kind of visualizations of engagement to students, that shifted completely, and we've had very little, if any, problems with ethical concerns among students.

Gardner: It also seems to me that the stakes here are rather high. It's hard to put a number on it, but for a student who might struggle and drop out in their first months at university, it means perhaps a diminished potential for them over their lifetime of career, monetization of income, and contribution to society, and so forth.

So for thousands of students, this could impact them over the course of a generation. This could be a substantial return on investment (ROI), to be a bit crass and commercial about it.

Day: If you take all of this from the student’s perspective, clearly students are investing significant amounts of money in their education.

In the UK, that’s £9,000 (USD $13,760) a year at the moment, plus the accommodation costs, and the cost of not getting a job early, and all of those sorts of things that those students put into to invest in their early university career. To lose that means that they come out of the university experience being less positive than it could have been, with much, much lower earning potential over their lifetime.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
That also has an impact on UK PLC, in that it isn’t perhaps generating as many skilled individuals as it might. That has implications for tax returns and also from a university point of view. Clearly if our students dropout, they aren’t paying their fees, and those slots are now empty. In terms of university efficiency, there was also a problem. So everybody wins if we can keep students on course.

On the journey

Gardner: Certainly a worthy goal. Tell us a little bit about where you are now? I think we have the vision. I think we understand the stakes and we understand some of the technologies we’ve employed. Where are you on this journey? Then, we can talk about so far what some of the results have been.

Day: It was very quick to get to a point where the technology was giving us the right kinds of answers. In about two to three months, we got into a position where the technology was pretty much done, but that was only a really part of the story. We really needed to look at how that impacted our practice in the university.

So we started to run a series of pilots into the series of courses. We did that over the course of a year about 18 months ago and we looked at every aspect of academic support for students and how this might change all of this. If we see that a student is disengaging from their studies, and we can see that now about a month or two before it otherwise would have been able to do that, we can have a very early conversation about what the problem might be.

In more than 90 percent of the cases that we have seen so far, those early conversations result in an immediate upturn in student engagement. We’ve seen some very real tangible results and we saw those very early on.
We've started to see students competing with each other to be the best engaged in their course. That’s got to be a good thing.

We expected that it would take as a considerable amount of time to demonstrate the system would give us a value at an institutional level, but actually it didn't. It took about six months or so into that pilot period that would set a year aside for to get to a position where we were convinced, as an institution ,that we roll out across the whole university. We did that at the beginning of this academic year and we rolled out about six months earlier than we thought. So we might even start thinking about that.

We now have had another year thinking about what good practice is, seeing that academic tutors are starting to share good practice among themselves. So there is a good conversation going on there. There is a much, much more positive relationship between those academic tutors and the students being reported from both the students and the tutors, we see that being very positive.

Importantly, there is also a dialogue going on between students themselves. We've started to see students competing with each other to be the best engaged in their course. That’s got to be a good thing.

Gardner: And how would they measure that? Is there some sort of a dashboard or visualization that you can provide to the students, as well as perhaps other vested interests in the ecosystem, so that they can better know where they are, where they stand?

Day: There absolutely is. The system provides a dashboard that gives a very simple visualization. It’s two lines on a chart. One of those lines is the average engagements of the cohort on a course by course basis. The other line is the individual student’s engagement compared to that average engagement in the course; in other words, comparing them with some of their peers on that.

We worked very hard to make that visualization simple, because we wanted that to be consistent. It needed to be something that prompted a conversation between tutors and students, and tutors sharing best practice with other tutors. It's a very simple visualization.

Sharing the vision

Gardner: Mike, it strikes me that other institutions of higher learning might want to take a page from what you've done. Is there some way of you sharing this or packaging it in some way, maybe even putting your stamp and name and brand on it? Have you been in discussions with other universities or higher education organizations that might want to replicate what you’ve done?

Day: Yes, we have. We're working with our supplier SolutionPath who have created now a model that is used to replicate in other universities. It starts with a readiness exercise because this is not about technology mostly. It's about how ready you are, as an organization, to address things like privacy and ethics in all of this. We've worked very closely with that.

We’ve spoken to two dozen universities already about how they might adopt something similar not necessarily exactly the same solution. We've done some work across the sector in the UK with a thing called the Joint Information Systems Committee, which looks at technology across all 150 universities in UK.

Gardner: Before we close out, I'm curious.When you’ve got the apparatus and the culture in the organization to look more discretely at data and draw correlations about things like student attainment and activities, it seems to me that we're only in the opening stages of what could be a much more data-driven approach to higher education. Where might this go next?
Research is another area where we might be able to think about how data helps us, what kind of research might we best be able to engage in.

Day: There’s no doubt at all that this solution has worked in its own right, but what it actually formed is a kind of bridgehead, which will allow us to take the principles and the approach that we have taken around the specific solution and apply to other aspects of the universities business.

For example, we might be able to start to look at which students might succeed on different courses across the university, perhaps regardless of traditional ways of recruiting students through their secondary school education qualification. It's looking at what other information might be a good indicator of success in a course.

We could start looking at the other end of the spectrum. How do students make their way into the world of work? What kinds of jobs do they get? And is this something about linking right at the beginning of a student’s university career, perhaps even at application stage, to the kinds of careers they might succeed in, and to try and advise early on those sorts of things that student might want to get involved with and engaged with. It’s a whole raft of things that we can start to think about.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
Research is another area where we might be able to think about how data helps us, what kind of research might we best be able to engage in, and so on and so forth.

Gardner: Very interesting. Clearly educating a student is an art as well as a science, but it certainly sounds as if you're blending them in a creative and constructive way. So congratulations on that.

I'm afraid we’ll have to leave it there. We’ve been exploring how Nottingham Trent University in England has devised and implemented an information-driven way to encourage higher student retention, and we’ve heard how by gathering diverse data and information, and by making rapid analysis from that, Nottingham Trent has been able to quickly identify students with difficulties.

So please join me in thanking Mike Day, Director of Information Systems at Nottingham Trent University. Thank you, sir.

Day: Thank you.

Gardner: I’d like to thank our audience as well for joining us for this big data innovation case study discussion.

I'm Dana Gardner; Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored discussions. Thanks again for listening, and come back next time.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Transcript of a BriefingsDirect discussion on how one university uses big data analysis to track student performance and identify at-risk students. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.

You may also be interested in:

Monday, November 30, 2015

Forrester Analyst Kurt Bittner on the Inevitability of DevOps

Transcript of a BriefingsDirect discussion on what’s making DevOps such a hot topic and steps that organizations are taking to make it successful.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the HPE Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Gardner
Our next thought leadership discussion explores the building interest in DevOps -- of making the development, test, and ongoing improvement in software creation a coordinated, lean, and proficient process for enterprises.

We're here with a prominent IT industry analyst from Forrester Research to explore why DevOps is such a hot topic, and to identify steps that successful organizations are taking to make advanced applications development a major force for business success.

Please join me in welcoming Kurt Bittner, Principal Analyst, Application Development and Delivery at Forrester Research. Welcome, Kurt.

Kurt Bittner: Thanks, Dana. Great to be here.

Gardner: Let’s start by looking at the building interest in DevOps. What’s driving that?
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
Bittner: It’s essentially the end-user or client organizations as they face increasing pressure from competition and increasing expectations from customers to delivering functionality faster.

I was at a dinner the other night, and there were half a dozen or so large banks there. They were all saying, to my surprise, that they didn’t feel like they were competing with one another, but that they felt like they were competing with companies like Apple, Google, PayPal, and increasingly startup companies. Square is a good example, too.

They're getting into the payment mechanism, and that’s siphoning our business from the banks. The banks are beginning to see drops in their own bottom lines because of the competition from ... software companies. You see companies like Uber having a big impact on traditional taxi companies and transportation.

Increasing competition

So it’s essentially increasing competition, driven by increasing customer expectations. We're all part of that as consumers where we've gravitating toward our mobile smartphones. We're increasingly interacting with companies through mobile devices.

Bittner
Delivering new functionality through mobile experiences, through cloud experiences, through the web, through various kinds of payment mechanisms -- all of these things contribute to the need to deliver services much faster.

Startup companies get this and they're already adopting these techniques in large numbers. What we're finding is that traditional companies are increasingly saying, "We have to do this. This a competitive threat to us." Like Blockbuster Video, they may cease to exist if they don’t.

Gardner: Companies like Apple or Uber probably define themselves as being technology companies. That’s what they do. Software is a huge part of what makes them a successful company. It defines them. What is it that DevOps brings to the table for them and others?

Bittner: DevOps optimizes the software delivery pipeline, all the steps that you have to go through between when you have an idea and when a customer starts benefiting from that idea. In the traditional delivery processes, you have lots of hand-offs, lots of stops and starts. You have relatively inefficient processes, and it can take months -- and sometimes years -- to go from idea to having somebody get a benefit.

With DevOps, we're reducing the size of the things you're delivering, so you can deliver more frequently. Then, you can eliminate hand-offs and inefficiencies in the delivery process, so that you can deliver it as fast as possible with higher quality.

Gardner: And what was broken? What needs to be fixed? Wasn’t Agile suppose to fix this?

Bittner: Agile is part of the solution, but many Agile teams find that they'd like to be more agile. They're held back by lack of testing environments. They're held back by lack of testing automation. They're held back by lack of deployment automation. They, themselves, have lots of barriers.

So, Agile is part of the solution in the sense of involving the business more on a day-to-day basis in the project decision-making. It also provides the ability to break a problem down into smaller increments, and at least demonstrate in smaller increments, but it doesn’t actually deliver into production in smaller increments.

Other capabilities

You need to have other capabilities to do that. One illustration of how DevOps helps to accelerate Agile came in talking to a large manufacturing organization that was making the transition to Agile.

They had a problem in that they weren't able to get to development or test environments for months. IT operations processes had been set up in a very siloed way. Development and testing environments got low priority when other things were going on.

So, as much as the team wanted to work in an Agile way, they couldn’t get a rapid test environment. In effect, they were completely stopped from any forward progress. There's only so much you can do on a developer workstation.

These DevOps practices benefit Agile as well, by enabling Agile to really fully realize the promise that it’s had.
These DevOps practices benefit Agile as well, enabling Agile to really fully realize the promise that it’s had.

Gardner: Is there a change in philosophy, too, Kurt, where software is released before it's really cooked and let the environment, the real world, be their test bed, their simulation if you will? And then they do rapid iterations? Are we going to begin seeing that now, as DevOps gains ground in established traditional enterprises?

Bittner: You're right. There is a tendency toward getting functionality out there, seeing what the market says about it, and then improving. That works in certain areas. For example, Google has an internal motto that says if you're not somewhat embarrassed by your first release, you didn’t move fast enough.

But we also have to realize that we have software in our automobiles and in our aircraft, and you don’t want to put something out there into those environments that’s basically not functional.

I separate the measures of quality from measures of aesthetic qualities. The software that gets delivered early has to be high-quality. It can’t be buggy. It has to work and satisfy a certain set of needs. But there's a wide variety of variability on whether people will like it or not or whether people will use it or not.

So when organizations are delivering quickly and getting feedback from the market, they're really getting feedback on things like usability and aesthetics and not necessarily on some critical business-processing capability. Or let’s say the software in your anti-lock braking system (ABS) system in your car. You don’t want that to fail, but you might be very interested in how the climate-control system works.

That may be subject to wide variations. To get better fuel efficiency, you may be willing to sacrifice something in the air conditioner to provide better efficiency. So, it’s largely driving feedback on non-safety-critical features. That's where most organizations are focused. 

More feedback

Gardner: You mentioned feedback. That seems to be a core aspect of DevOps, more feedback between operations, the real world, the use of software, and the development  and test process. How do we compress that feedback loop -- not only for user experience, but also data coming out of an embedded system, for example -- so that we can improve? Let’s address feedback and compressing the feedback-loop.

Bittner: If you think about what traditional application releases do, they tend to bundle a lot of different features into a single release. If you think about this from a statistical perspective, that means you have a lot of independent variables. You can’t tell when something improves. You can’t tell why it improved, because you have so many variables in there.

In the feedback loop with DevOps, you want to make the increment of releases as small as possible, basically one thing at a time, and then measure the result from that, so you know that your results improve because of that one single feature.

The other thing is that we start to shift toward a more outcome-oriented software release. You're not releasing features, but you're doing things that will change a customer’s outcome. If it doesn’t change a customer’s outcome, the customer doesn’t really care.
You optimize the delivery cycle, removing waste and hand-offs to make that as fast as possible with a high degrees of automation.

So by having the increment of a release be one outcome at a time, and then measuring the result from that, you get the capabilities out there as quickly as possible. Then you can tell whether you actually improved because of what you just did. If you didn’t improve, then you stop doing that and do something else.

Gardner: Is that what you mean by continuous delivery, these iterative small parts, rather than the whole big dump every six to 12 months?

Bittner: That’s a big part of it. Continuous delivery is also, more precisely, a process by which you make small changes. You optimize the delivery cycle, removing waste and hand-offs to make that as fast as possible with a high degrees of automation, so that you can get out there and get the feedback as quickly as possible.

So, it’s a combination. It needs not just fast delivery, but a number of techniques that are used to improve that delivery.

Gardner: Folks listening and reading this might very well like the idea of DevOps: "I'd like to do DevOps; where do I buy it?" DevOps, though, isn't really a product, a box, or a download. It’s a way of thinking in a methodological approach. How people go about implementing DevOps? Where do you start?
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
Bittner: You’re right. It's more of a philosophy than a product. It’s not even really a product category, but a bunch of different products, and processes, and to some degree, a philosophy behind that. When we talk to organizations that implemented this successfully, there are a couple of patterns.

First of all, you don't implement DevOps across an entire organization all at once. It tends to happen product by product, team by team. It happens first in the applications that are very customer-facing, because that's where the most pressure is right now. That’s where the biggest benefit is. So on the team-by-team basis, first of all you have to have some executive mandate to make a change. Somebody has to feel like this is important enough to the company.

While developers, engineers, and IT Ops people can be passionate about this, it typically requires executive leadership to get this to happen, because these changes cut across traditional organizational silos. Without some executive sponsorship, these initiatives tend not to go very far.
There's too much wait time when people are assigned to multiple projects or multiple applications.

The first step – and this is sort of very mundane area -- tends to be changing the way that environments are provisioned. That includes getting environments provisioned on-demand, using techniques like infrastructure-as-code to automatically generate environments based on configuration settings so that you can have an environment anytime you need it. That removes a lot of friction and a lot of delays.

The second thing that tends to be implemented are techniques like continuous integration and then, after that, test automation, based on APIs. There's a shift to APIs on an integrated architecture for the applications, and then usually deployment automation comes after that. Once you have environments provisioned in code that you can put into those environments, you need a way to move that code between environments.

As you make those changes, you start to run into organizational barriers, silos in the organization, that prevent effectively working together. There's too much wait-time when people are assigned to multiple projects or multiple applications.

There's a shift in team structure to become more product-oriented with dedicated resources to a product, so that you can release, and do release after release most effectively. That tends to break the organization silos down and start shifting to a more product-centric organization and away from a functionally oriented organization.

All of those changes together typically take years, but it usually starts with some sort of executive mandate, then environment provisioning, and so on.

Management capability

Gardner: It sounds, too, that it's important to have better management capabilities across these silos -- with metrics, dashboards, validating efforts, of being able to measure discretely what's going on, and then reinforce the good and discard the bad.

Are there any particular existing ways of doing that? I'm thinking about the long-term application lifecycle management (ALM) marketplace. Does that lend itself to DevOps? Should we start from scratch and create a new management layer, if you will, across the whole continuum of software design, test, and delivery?

Bittner: It’s a little bit of both. DevOps is really an outgrowth of ALM, and all of the aspects of ALM are there. You need to be able to manage the work, track the work, and to determine what work got done. In addition to that, you’re adding automation in the areas that I was just describing; environment provisioning, continuous integration, test automation, and deployment automation.

There's another component that becomes really important, because out of those applications, you want to start gathering customer experience data. So things like operational and application analytics are important to start measuring the customer experience.
You don’t find one DevOps suite from one company that provides everything.

Combining all of those into a single view, single dashboard is evolving now. The ALM tools are evolving in that direction, and there are ways of visualizing that. But right now it tends to be a multi-vendor ecosystem. You don’t find one DevOps suite from one company that provides everything.

But the good news is that the same thing that’s been happening in the rest of the industry around services and interoperability has happened in applications. We have a high degree of interoperability between tools from different vendors today that allows you to customize this delivery pipeline to give you the DevOps capability.

Gardner: It seems that, in some ways, the prominence of hybrid cloud models, mobile, and mobile-first thinking, when it comes to development, are accelerants to DevOps. If you have that multiple cloud goal, you're going to want to standardize on your production environment. Hence, also the interest in containers these days. And, of course, mobile-first forces you to think about user experience, small iterations apps, rather than applications. Do you see an acceleration from these other trends reinforcing DevOps?

Bittner: It’s both reinforcing it and, to some degree, causing it, because it's mobile that’s triggered this explosion and the need for DevOps -- the need for faster delivery. To a large degree, the mobile application is the proverbial tip of the iceberg. Very few mobile applications stand alone. They all have very rich services running behind them. They have systems of record providing the data. Virtually every mobile application is really a composite application with some parts in the cloud and some parts in traditional data centers.

The development across all of those different code lines and the coordination of releases across all those different code lines really requires the DevOps approach to be able to do that successfully.

Demand and complexity

So it's both demand created by higher customer expectations from mobile customers, but also the complexity of delivering these applications in a really rapid way across all those different platforms. You made an interesting point about cloud and containers being both drivers for demand and also enablers, but they're also changing the nature of the work.

As containers and microservices become more prevalent -- we’re seeing growth in those areas -- it's increasing the complexity of application delivery. It simplifies the deployment, but it increases the complexity. Now, instead of having to coordinate dozens of moving parts, you have to coordinate hundreds and, we think, in the future, thousands of moving parts. That's well beyond what somebody can do with spreadsheets and manual management techniques.

The other thing is that cloud simplifies environment provisioning tremendously and it provides this great elastic infrastructure for deploying applications. But it also simplifies it by standardizing environments, making it all software configurable. It's a tremendous benefit to delivering applications faster and it gives you much more flexibility than traditional data-center applications. There's definitely movement toward those kind of applications, especially for DevOps.
Cloud simplifies environment provisioning tremendously and it provides this great elastic infrastructure for deploying applications.

Gardner: When I heard you mention the complexity, it certainly sounds like automating and moving away from manual processes, standardizing processes across your development test-to-deploy continuum, would be really important steps to take.

Bittner: Absolutely. I would say more than important. It’s absolutely essential that, without automation and that data-driven visibility into what's happening in the applications, there's almost no way to deliver these applications at speed. We find that many organizations are releasing quarterly now, not necessarily the same app every quarter, but they have a quarterly release cycle. At quarterly rates of speed, through seat of the pants and sort of brute force, you can manage to get that release out. It’s pretty painful, but you can survive.

If you turn up the clock rate faster than that and try to get down to monthly, those manual processes completely fall apart. We have organizations today that want to be delivering at weekly and daily intervals, especially in SaaS-based environments or cloud-based environments. Those kinds of delivery speeds are inconceivable with any kind of manual processes. As organizations move away from quarterly releases to faster releases, they have to adopt these techniques.

Gardner: Listening to you Kurt, it sounds like DevOps isn't another buzzword or another flashy marketing term. It really sounds inevitable, if you're going to succeed in software.

Bittner: It is inevitable, and over the next five years, what we’ll see is that the word itself will probably fade, because it will simply become the way that organizations work.

Gardner: I'm afraid we'll have to leave it there. We've been exploring the popularity of DevOps for making sure that development, test, deployment and ongoing improvement in software creation are coordinated, lean, and proficient process. We've heard from a prominent industry analyst at Forrester Research about what’s making DevOps such a hot topic and steps that organizations are taking to make it successful.

Please join me in thanking Kurt Bittner, Principal Analyst Applications Development and Delivery at Forrester Research.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
And a big thank you also to our audience as well for joining us for this DevOps thought leadership discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored discussions. Thanks again for listening, and come back next time.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Transcript of a BriefingsDirect discussion on what’s making DevOps such a hot topic and steps that organizations are taking to make it successful. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.

You may also be interested in:

Thursday, November 19, 2015

Agile on Fire: IT Enters the New Era of 'Continuous' Everything

Transcript of a BriefingsDirect discussion on the concept of continuous processes around development and deployment of applications.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the HPE Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Gardner
Our next DevOps Thought Leadership Discussion explores the concept of continuous processes around the development and deployment of applications and systems. Put the word continuous in front of many things and we help define DevOps: continuous delivery, continuous testing, continuous assessment, and there is more.

To help better understand the continuous nature of DevOps, we're joined by two guests, James Governor, Founder and Principal Analyst at RedMonk, and Ashish Kuthiala, Senior Director of Marketing and Strategy for Hewlett Packard Enterprise (HPE) DevOps. Welcome to you both.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
Ashish Kuthiala: Hi, glad to be here, Dana.

Gardner: Ashish, we hear a lot about feedback loops in DevOps between production and development, test and production. Why is the word "continuous" now cropping up so much? What do we need to do differently in IT in order to compress those feedback loops and make them impactful?

Kuthiala: Gone are the days where you would see the next version 2.0 coming out in six months and 2.1 coming out three months after that.

Kuthiala
If you use some of the modern applications today, you never see Facebook 2.0 is coming out tomorrow or Google 3.1 is being released. They are continuously and always making improvements from the back-end onto the platforms of the users -- without the users even realizing that they're getting improvements, a better user experience, etc.

In order to achieve that, you have to continuously be building those new innovations into your product. And, of course, as soon as you change something you need to test it and roll it all the way into production.

In fact, we joke a lot about how if everything is continuous, why don’t we drop the word continuous and just call it planning, testing, or development, like we do today, and just say that you continuously do this. But we tend to keep using this word "continuous" before everything.

I think a lot of it is to drive home the point across the IT teams and organizations that you can no longer do this in chunks of three, six, or nine months -- but you always have to keep doing this.

Governor: How do you do the continuous assessment of your continuous marketing?

Continuous assessment

Kuthiala: We joke about the continuous marketing of everything. The continuous assessment term, despite my objections to the word continuous all the time, is a term that we've been talking about at HPE.

The idea here is that for most software development teams and production teams, when they start to collaborate well, take the user experience, the bugs, and what’s not working on the production end at the users’ hands -- where the software is being used -- and feed those bugs and the user experience back to the development teams.

When companies actually get to that stage, it’s a significant improvement. It’s not the support teams telling you that five users were screaming at us today about this feature or that feature. It’s the idea that you start to have this feedback directly from the users’ hands.

We should stretch this assessment piece a little further. Why assess the application or the software when it’s at the hands of the end users? The developer, the enterprise architects, and the planners design an application and they know best how it should function.

Whether it’s monitoring tools or it’s the health and availability of the application, start to shift left, as we call it. I'd like James to comment more about this, because he knows a lot about the development space. The developer knows his code best; let him experience what the user is starting to experience.

Governor: My favorite example of this is that, as an analyst, you're always looking for those nice metaphors and ways to talk about the world -- one notion of quality I was very taken with was when I was reading about the history if ship-building and the roles and responsibilities involved in building a ship.

Governor
One of the things they found was that if you have a team doing the riveting separate from doing the quality assurance (QA) on the riveting, the results are not as good. Someone will happily just go along -- rivet, rivet, rivet, rivet -- and not really care if they're doing a great job, because somebody else is going to have to worry about the quality.

As they moved forward with this, they realized that you needed to have the person doing the riveting also doing the QA. That’s a powerful notion of how things have changed.

Certainly the notion of shifting left and doing more testing earlier in the process, whether that be in terms of integration, load testing, whatever, all the testing needs to happen up front and it needs to be something that the developers are doing.

The new suite of tools we have makes it easier for developers to have better experiences around that, and we should take advantage.

Lean manufacturing

One of the other things about continuous is that we're making reference to manufacturing modes and models. Lean manufacturing is something that led to fewer defects, apart from one catastrophic example to the contrary. And we're looking at that and asking how we can learn from that.

So lean manufacturing ties into lean startups, which ties into lean and continuous assessment.

What’s interesting is that now we're beginning to see some interplay between the two and paying that forward. If you look at GM, they just announced a team explicitly looking at Twitter to find user complaints very, very early in the process, rather than waiting until you had 10,000 people that were affected before you did the recall.

Last year was the worst year ever for recalls in American car manufacturing, which is interesting, because if we have continuous improvement and everything, why did that happen? They're actually using social tooling to try to identify early, so that they can recall 100 cars or 1,000 cars, rather than 50,000.

It’s that monitoring really early in the process, testing early in the process, and most importantly, garnering user feedback early in the process. If GM can improve and we can improve, yes.

Gardner: I remember in the late '80s, when the Japanese car makers were really kicking the pants out of Detroit, that we started to hear a lot about simultaneous engineering. You wouldn’t just design something, but you designed for its manufacturability at the same time. So it’s a similar concept.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
But going back to the software process, Ashish, we see a level of functionality in software that needs to be rigorous with security and performance, but we're also seeing more and more the need for that user experience for features and functions that we can’t even guess at, that we need to put into place in the field and see what happens.

How does an enterprise get to that point, where they can so rapidly do software that they're willing to take a chance and put something out to the users, perhaps a mobile app, and learn from its actual behavior? We can get the data, but we have to change our processes before we can utilize it. 

Kuthiala: Absolutely. Let me be a little provocative here, but I think it’s a well-known fact that the era of the three-year, forward-looking roadmaps is gone. It’s good to have a vision of where you're headed, but what feature, function and which month will you release so that the users will find it useful? I think that’s just gone, with this concept of the minimum viable product (MVP) that more startups take off with and try to build a product and fund themselves as they gain success.

It’s an approach even that bigger enterprises need to take. You don't know what the end users’ tastes are.

I change my taste on the applications I use and the user experience I get, the features and functionality. I'm always looking at different products, and I switch my mind quite often. But if I like something and they're always delivering the right user experience for me, I stick with them.

Capture the experience

The way for an enterprise to figure out what to build next is to capture this experience, whether it’s through social media channels or engineering your codes so that you can figure out what the user behavior actually is.

The days of business planners and developers sitting in cubicles and thinking this is the coolest thing I'm going to invent and roll out is not going to work anymore. You definitely need that for innovation, but you need to test that fairly quickly.

Also gone are the days of rolling back something when something doesn’t work. If something doesn’t work, if you can deliver software really quickly at the hands of end users, you just roll forward. You don’t roll back anymore.

It could be a feature that’s buggy. So go and fix it, because you can fix it in two days or two hours, versus the three- to six-month cycle. If you release a feature and you see that most users -- 80 percent of the users -- don’t even bother about it, turn it off, and introduce the new feature that you were thinking about.

This assessment from the development, testing, and production that you're always doing starts to benefit you. When you're standing up for that daily sprint and wondering what are the three features I'm going to work on as a team, whether it’s the two things that your CEO told you you have to absolutely do it, because "I think it’s the greatest thing since sliced bread," or it’s the developer saying, "I think we should build this feature," or some use case is coming out of the business analyst or enterprise architects.
We have wonderful new platforms that enable us to store a lot more data than we could before at a reasonable cost.

Now you have data. You have data across all these teams. You can start to make smarter decisions and you can choose what to build and not build. To me, that's the value of continuous assessment. You can invest your $100 for that day in the two things you want to do. None of us has unlimited budgets.

Gardner: For organizations that grok this, that say, "I want continuous delivery. I want continuous assessment," what do we need to put in place to actually execute on it to make it happen?

Governor: We've spoken a lot about cultural change, and that’s going to be important. One of the things, frankly, that is an underpinning, if we're talking about data and being data-driven, is just that we have wonderful new platforms that enable us to store a lot more data than we could before at a reasonable cost.

There were many business problems that were stymied by the fact that you would have to spend the GDP of a country in order to do the kind of processing that you wanted to, in order to truly understand how something was working. If we're going to model the experiences, if we are going to collect all this data, some of the thinking about what's infrastructure for that so that you can analyze the data is going to be super important. There's no point talking in being data-driven if you don’t have architecture for delivering on that.

Gardner: Ashish, how about loosely integrated capabilities across these domains, tests, build, requirements, configuration management, and deployment? It seems that HPE is really at the center of a number of these technologies. Is there a new layer or level of integration that can help accelerate this continuous assessment capability?

Rich portfolio

Kuthiala: You're right. We have a very rich portfolio across the entire software development cycle. You've heard about our Big Data Platform. What can it really do, if you think about it? James just referred to this. It’s cheaper and easier to store data with the new technologies, whether it’s structured, unstructured, video, social, etc., and you can start to make sense out of it when you put it all together.

There is a lot of rich data in the planning and testing process, and all the different lifecycles. A simple example is a technology that we've worked on internally, where when you start to deliver software faster and you change one line of code and you want this to go out. You really can’t afford to do the 20,000 tests that you think you need to do, because you're not sure what's going to happen.

We've actually had data scientists working internally in our labs, studying the patterns, looking at the data, and testing concepts such as intelligent testing. If I change this one line of code, even before I check it in, what parts of the code is it really affecting, what functionality? If you are doing this intelligently, does it affect all the regions of the world, the demographics? What feature function does it affect?
We've actually had data scientists working internally in our labs, studying the patterns, looking at the data, and testing concepts such as intelligent testing.

It's helping you narrow down whether will it break the code, whether it will actually affect certain features and functions of this software application that’s out there. It's narrowing it down and helping you say, "Okay, I only need to run these 50 tests and I don't need to go into these 10,000 tests, because I need to run through this test cycle fast and have the confidence that it will not break something else."

So it's a cultural thing, like James said, but the technologies are also helping make it easier.

Gardner: It’s interesting. We're borrowing concepts from other domains in the past as well -- just-in-time testing or fit-for-purpose testing, or lean testing?

Kuthiala: We were talking about Lean Functional Testing (LeanFT) at HP Discover. I won't talk about that here in terms of product, but the idea is exactly that. The idea is that the developer, like James said, knows his code well. He can test it well before and he doesn’t throw it over the wall and let the other team take a shot at it. It’s his responsibility. If he writes a line of code, he should be responsible for the quality of it.

Gardner: And it also seems that the integration across this continuum can really be the currency of analysis. When we have data and information made available, that's what binds these processes together, and we're starting to elevate and abstract that analysis up and it make it into a continuum, rather than a waterfall or a hand-off type of process.

Before we close out, any other words that we should put in front of continuous as we get closer to DevOps -- continuous security perhaps?

Security is important

Kuthiala: Security is a very important topic and James and I have talked about it a lot with some other thought leaders. Security is just like testing. Anything that you catch early on in the process is a lot easier and cheaper to fix than if you catch it in the hands of the end users, where now it’s deployed to tens and thousands of people.

It’s a cultural shift. The technology has always been there. There's a lot of technology within and outside of HP that you need to incorporate the security testing and the discipline right into the development and planning process and not leave it towards the end.

In terms of another continuous word, I mean I can come up with continuous Dana Gardner podcast.

Governor: There you go.

Gardner: Continuous discussions about DevOps.
One of the things that RedMonk is very interested in, and it's really our view in the world, is that, increasingly, developers are making the choices, and then we're going to find ways to support the choices they are making.

Governor: One of the things that RedMonk is very interested in, and it's really our view in the world, is that, increasingly, developers are making the choices, and then we're going to find ways to support the choices they are making.

It was very interesting to me that the term continuous integration began as a developer term, and then the next wave of that began to be called continuous deployment. That's quite scary for a lot of organizations. They say, "These developers are talking about continuous deployment. How is that going to work?"

The circle was squared when I had somebody come in and say what we're talking to customers about is continuous improvement, which of course is a term again that we saw in manufacturing and so on.

But the developer aesthetic is tremendously influential here, and this change has been driven by them. My favorite "continuous" is a great phrase, continuous partial attention, which is the world we all live in now.

Gardner: I'm afraid we will have to leave it there. We've been exploring the concept of continuous processes around development and deployment of applications.

I'd like to thank our guests, James Governor, Founder and Principal Analyst at RedMonk, and Ashish Kuthiala, Senior Director of Marketing and Strategy for HPE DevOps.
Solutions That Unify Development and Operations
To Accelerate Business Innovation
Get More Information
I'd also like to thank our audience for joining this special DevOps Thought Leadership Discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored discussions. Thanks again for listening, and come back next time.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Transcript of a BriefingsDirect discussion on the concept of continuous processes around development and deployment of applications. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.

You may also be interested in:

Wednesday, November 18, 2015

Big Data Enables Top User Experiences and Extreme Personalization for Intuit TurboTax

Transcript of a BriefingsDirect discussion on how TurboTax uses big data analytics to improve performance despite high data volumes during peak usage.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Dana Gardner: Hello, and welcome to the next edition of the HPE Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing discussion on IT innovation and how it’s making an impact on people’s lives.

Gardner
Our next big data innovation case study highlights how Intuit uses deep-data analytics to gain a 360-degree view of its TurboTax application's users’ behavior and preferences. Such visibility allows for rapid applications improvements and enables the TurboTax user experience to be tailored to a highly detailed degree.

Here to share how analytics paves the way to better understanding of end-user needs and wants, we're joined by Joel Minton, Director of Data Science and Engineering for TurboTax at Intuit in San Diego. Welcome to Briefings Direct, Joel.

Joel Minton: Thanks, Dana, it’s great to be here.
HP Vertica Community Edition
Start Your Free Trial Now
Gain Access to All Features and Functionality
Gardner: Let’s start at a high-level, Joel, and understand what’s driving the need for greater analytics, greater understanding of your end-users. What is the big deal about big-data capabilities for your TurboTax applications?

Minton: There were several things, Dana. We were looking to see a full end-to-end view of our customers. We wanted to see what our customers were doing across our application and across all the various touch points that they have with us to make sure that we could fully understand where they were and how we can make their lives better.

Minton
We also wanted to be able to take that data and then give more personalized experiences, so we could understand where they were, how they were leveraging our offerings, but then also give them a much more personalized application that would allow them to get through the application even faster than they already could with TurboTax.

And last but not least, there was the explosion of available technologies to ingest, store, and gain insights that was not even possible two or three years ago. All of those things have made leaps and bounds over the last several years. We’ve been able to put all of these technologies together to garner those business benefits that I spoke about earlier.

Gardner: So many of our listeners might be aware of TurboTax, but it’s a very complex tax return preparation application that has a great deal of variability across regions, states, localities. That must be quite a daunting task to be able to make it granular and address all the variables in such a complex application.

Minton: Our goal is to remove all of that complexity for our users and for us to do all of that hard work behind the scenes. Data is absolutely central to our understanding that full end-to-end process, and leveraging our great knowledge of the tax code and other financial situations to make all of those hard things easier for our customers, and to do all of those things for our customers behind the scenes, so our customers do not have to worry about it.

Gardner: In the process of tax preparation, how do you actually get context within the process?

Always looking

Minton: We're always looking at all of those customer touch points, as I mentioned earlier. Those things all feed into where our customer is and what their state of mind might be as they are going through the application.

To give you an example, as a customer goes though our application, they may ask us a question about a certain tax situation.

When they ask that question, we know a lot more later on down the line about whether that specific issue is causing them grief. If we can bring all of those data sets together so that we know that they asked the question three screens back, and then they're spending a more time on a later screen, we can try to make that experience better, especially in the context of those specific questions that they have.

As I said earlier, it's all about bringing all the data together and making sure that we leverage that when we're making the application as easy as we can.

Gardner: And that's what you mean by a 360-degree view of the user: where they are in time, where they are in a process, where they are in their particular individual tax requirements?

Minton: And all the touch points that they have with not only things on our website, but also things across the Internet and also with our customer-care employees and all the other touch points that we use try to solve our customers’ needs.
During our peak times of the year during tax season, we have billions and billions of transactions.

Gardner: This might be a difficult question, but how much data are we talking about? Obviously you're in sort of a peak-use scenario where many people are in a tax-preparation mode in the weeks and months leading up to April 15 in the United States. How much data and how rapidly is that coming into you?

Minton: We have a tremendous amount of data. I'm not going to go into the specifics of the complete size of our database because it is proprietary, but during our peak times of the year during tax season, we have billions and billions of transactions.

We have all of those touch points being logged in real-time, and we basically have all of that data flowing through to our applications that we then use to get insights and to be able to help our customers even more than we could before. So we're talking about billions of events over a small number of days.

Gardner: So clearly for those of us that define big data by velocity, by volume, and by variety, you certainly meet the criteria and then some.

Unique challenges

Minton: The challenges are unique for TurboTax because we're such a peaky business. We have two peaks that drive a majority of our experiences: the first peak when people get their W-2s and they're looking to get their refunds, and then tax day on April 15th. At both of those times, we're ingesting a tremendous amount of data and trying to get insights as quickly as we can so we can help our customers as quickly as we can.

Gardner: Let’s go back to this concept of user experience improvement process. It's not just something for tax preparation applications but really in retail, healthcare, and many other aspects where the user expectations are getting higher and higher. People expect more. They expect anticipation of their needs and then delivery of that.

This is probably only going to increase over time, Joel. Tell me a little but about how you're solving this issue of getting to know your user and then being able to be responsive to an entire user experience and perception.

Minton: Every customer is unique, Dana. We have millions of customers who have slightly different needs based on their unique situations. What we do is try to give them a unique experience that closely matches their background and preferences, and we try to use all of that information that we have to create a streamlined interaction where they can feel like the experience itself is tailored for them.
So the most important thing is taking all of that data and then providing super-personalized experience based on the experience we see for that user and for other users like them.

It’s very easy to say, “We can’t personalize the product because there are so many touch points and there are so many different variables.” But we can, in fact, make the product much more simplified and easy to use for each one of those customers. Data is a huge part of that.

Specifically, our customers, at times, may be having problems in the product, finding the right place to enter a certain tax situation. They get stuck and don't know what to enter. When they get in those situations, they will frequently ask us for help and they will ask how they do a certain task. We can then build code and algorithms to handle all those situations proactively and be able to solve that for our customers in the future as well.

So the most important thing is taking all of that data and then providing super-personalized experience based on the experience we see for that user and for other users like them.

Gardner: In a sense, you're a poster child for a lot of elements of what you're dealing with, but really on a significant scale above the norm, the peaky nature, around tax preparation. You desire to be highly personalized down to the granular level for each user, the vast amount of data and velocity of that data.

What were some of your chief requirements at your architecture level to be able to accommodate some of this? Tell us a little bit, Joel, about the journey you’ve been on to improve that architecture over the past couple of years?

Lot of detail

Minton: There's a lot of detail behind the scenes here, and I'll start by saying it's not an easy journey. It’s a journey that you have to be on for a long time and you really have to understand where you want to place your investment to make sure that you can do this well.

One area where we've invested in heavily is our big-data infrastructure, being able to ingest all of the data in order to be able to track it all. We've also invested a lot in being able to get insights out of the data, using Hewlett Packard Enterprise (HPE) Vertica as our big data platform and being able to query that data in close to real time as possible to actually get those insights. I see those as the meat and potatoes that you have to have in order to be successful in this area.

On top of that, you then need to have an infrastructure that allows you to build personalization on the fly. You need to be able to make decisions in real time for the customers and you need to be able to do that in a very streamlined way where you can continuously improve.

We use a lot of tactics using machine learning and other predictive models to build that personalization on-the-fly as people are going through the application. That is some of our secret sauce and I will not go into in more detail, but that’s what we're doing at a high level.

Gardner: It might be off the track of our discussion a bit, but being able to glean information through analytics and then create a feedback loop into development can be very challenging for a lot of organizations. Is DevOps a cultural parallel path along with your data-science architecture?
HP Vertica Community Edition
Start Your Free Trial Now
Gain Access to All Features and Functionality
I don’t want to go down the development path too much, but it sounds like you're already there in terms of understanding the importance of applying big-data analytics to the compression of the cycle between development and production.

Minton: There are two different aspects there, Dana. Number one is making sure that we understand the traffic patterns of our customer and making sure that, from an operations perspective, we have the understanding of how our users are traversing our application to make sure that we are able to serve them and that their performance is just amazing every single time they come to our website. That’s number one.

Number two, and I believe more important, is the need to actually put the data in the hands of all of our employees across the board. We need to be able to tell our employees the areas where users are getting stuck in our application. This is high-level information. This isn't anybody's financial information at all, but just a high-level, quick stream of data saying that these people went through this application and got stuck on this specific area of the product.

We want to be able to put that type of information in our developer’s hands so as the developer is actually building a part of the product, she could say that I am seeing that these types of users get stuck at this part of the product. How can I actually improve the experience as I am developing it to take all of that data into account?

We have an analyst team that does great work around doing the analytics, but in addition to that, we want to be able to give that data to the product managers and to the developers as well, so they can improve the application as they are building it. To me, a 360-degree view of the customer is number one. Number two is getting that data out to as broad of an audience as possible to make sure that they can act on it so they can help our customers.

Major areas

Gardner: Joel, I speak with HPE Vertica users quite often and there are two major areas that I hear them talk rather highly of the product. First, has to do with the ability to assimilate, so that dealing with the variety issue would bring data into an environment where it can be used for analytics. Then, there are some performance issues around doing queries, amid great complexity of many parameters and its speed and scale.

Your applications for TurboTax are across a variety or platforms. There is a shrink-wrap product from the legacy perspective. Then you're more along the mobile lines, as well as web and SaaS. So is Vertica something that you're using to help bring the data from a variety of different application environments together and/or across different networks or environments?

Minton: I don't see different devices that someone might use as a different solution in the customer journey. To me, every device that somebody uses is a touch point into Intuit and into TurboTax. We need to make sure that all of those touch points have the same level of understanding, the same level of tracking, and the same ability to help our customers.

Whether somebody is using TurboTax on their computer or they're using TurboTax on their mobile device, we need to be able to track all of those things as first-class citizens in the ecosystem. We have a fully-functional mobile application that’s just amazing on the phone, if you haven’t used it. It's just a great experience for our customers.

From all those devices, we bring all of that data back to our big data platform. All of that data can then be queried, because you want to understand, many questions, such as when do users flow across different devices and what is the experience that they're getting on each device? When are they able to just snap a picture of their W-2 and be able to import it really quickly on their phone and then jump right back into their computer and finish their taxes with great ease?
You need to be able to have a system that can handle that concurrency and can handle the performance that’s going to be required by that many more people doing queries against the system.

We need to be able to have that level of tracking across all of those devices. The key there, from a technology perspective, is creating APIs that are generic across all of those devices, and then allowing those APIs to feed all of that data back to our massive infrastructure in the back-end so we can get those insights through reporting and other methods as well.

Gardner: We've talked quite a bit about what's working for you: a database column store, the ability to get a volume variety and velocity managed in your massive data environment. But what didn't work? Where were you before and what needed to change in order for you to accommodate your ongoing requirements in your architecture?

Minton: Previously we were using a different data platform, and it was good for getting insights for a small number of users. We had an analyst team of 8 to 10 people, and they were able to do reports and get insights as a small group.

But when you talk about moving to what we just discussed, a huge view of the customer end-to-end, hundreds of users accessing the data, you need to be able to have a system that can handle that concurrency and can handle the performance that’s going to be required by that many more people doing queries against the system.

Concurrency problems

So we moved away from our previous vendor that had some concurrency problems and we moved to HPE Vertica, because it does handle concurrency much better, handles workload management much better, and it allows us to pull all this data.

The other thing that we've done is that we have expanded our use of Tableau, which is a great platform for pulling data out of Vertica and then being able to use those extracts in multiple front-end reports that can serve our business needs as well.

So in terms of using technology to be able to get data into the hands of hundreds of users, we use a multi-pronged approach that allows us to disseminate that information to all of these employees as quickly as possible and to do it at scale, which we were not able to do before.
There's always going to be more data that you want to track than you have hardware or software licenses to support.

Gardner: Of course, getting all your performance requirements met is super important, but also in any business environment, we need to be concerned about costs.

Is there anything about the way that you were able to deploy Vertica, perhaps using commodity hardware, perhaps a different approach to storage, that allowed you to both accomplish your requirements, goals in performance, and capabilities, but also at a price point that may have been even better than your previous approach?

Minton: From a price perspective, we've been able to really make the numbers work and get great insights for the level of investment that we've made.

How do we handle just the massive cost of the data? That's a huge challenge that every company is going to have in this space, because there's always going to be more data that you want to track than you have hardware or software licenses to support.

So we've been very aggressive in looking at each and every piece of data that we want to ingest. We want to make sure that we ingest it at the right granularity.

Vertica is a high-performance system, but you don't need absolutely every detail that you’ve ever had from a logging mechanism for every customer in that platform. We do a lot of detail information in Vertica, but we're also really smart about what we move into there from a storage perspective and what we keep outside in our Hadoop cluster.

Hadoop cluster

We have a Hadoop cluster that stores all of our data and we consider that our data lake that basically takes all of our customer interactions top to bottom at the granular detail level.

We then take data out of there and move things over to Vertica, in both an aggregate as well as a detail form, where it makes sense. We've been able to spend the right amount of money for each of our solutions to be able to get the insights we need, but to not overwhelm both the licensing cost and the hardware cost on our Vertica cluster.

The combination of those things has really allowed us to be successful to match the business benefit with the investment level for both Hadoop and with Vertica.

Gardner: Measuring success, as you have been talking about quantitatively at the platform level, is important, but there's also a qualitative benefit that needs to be examined and even measured when you're talking about things like process improvements, eliminating bottlenecks in user experience, or eliminating anomalies for certain types of individual personalized activities, a bit more quantitative than qualitative.
We're actually performing much better and we're able to delight our internal customers to make sure that they're getting the answers they need as quickly as possible.

Do you have any insight, either anecdotal or examples, where being able to apply this data analytics architecture and capability has delivered some positive benefits, some value to your business?

Minton: We basically use data to try to measure ourselves as much as possible. So we do have qualitative, but we also have quantitative.

Just to give you a few examples, our total aggregate number of insights that we've been able to garner from the new system versus the old system is a 271 percent increase. We're able to run a lot more queries and get a lot more insights out of the platform now than we ever could on the old system. We have also had a 41 percent decrease in query time. So employees who were previously pulling data and waiting twice as long had a really frustrating experience.

Now, we're actually performing much better and we're able to delight our internal customers to make sure that they're getting the answers they need as quickly as possible.

We've also increased the size of our data mart in general by 400 percent. We've massively grown the platform while decreasing performance. So all of those quantitative numbers are just a great story about the success that we have had.

From a qualitative perspective, I've talked to a lot of our analysts and I've talked to a lot of our employees, and they've all said that the solution that we have now is head and shoulders over what we had previously. Mostly that’s because during those peak times, when we're running a lot of traffic through our systems, it’s very easy for all the users to hit the platform at the same time, instead of nobody getting any work done because of the concurrency issues.

Better tracking

Because we have much better tracking of that now with Vertica and our new platform, we're actually able to handle that concurrency and get the highest priority workloads out quickly, allow them to happen, and then be able to follow along with the lower-priority workloads and be able to run them all in parallel.

The key is being able to run, especially at those peak loads, and be able to get a lot more insights than we were ever able to get last year.
HP Vertica Community Edition
Start Your Free Trial Now
Gain Access to All Features and Functionality
Gardner: And that peak load issue is so prominent for you. Another quick aside, are you using cloud or hybrid cloud to support any of these workloads, given the peak nature of this, rather than keep all that infrastructure running 365, 24×7? Is that something that you've been doing, or is that something you're considering?

Minton: Sure. On a lot of our data warehousing solutions, we do use cloud in points for our systems. A lot of our large-scale serving activities, as well as our large scale ingestion, does leverage cloud technologies.

We don't have it for our core data warehouse. We want to make that we have all of that data in-house in our own data centers, but we do ingest a lot of the data just as pass-throughs in the cloud, just to allow us to have more of that peak scalability that we wouldn’t have otherwise.
The faster than we can get data into our systems, the faster we're going to be able to report on that data and be able to get insights that are going to be able to help our customers.

Gardner: We're coming up toward the end of our discussion time. Let’s look at what comes next, Joel, in terms of where you can take this. You mentioned some really impressive qualitative and quantitative returns and improvements. We can always expect more data, more need for feedback loops, and a higher level of user expectation and experience. Where would you like to go next? How do you go to an extreme focus even more on this issue of personalization?

Minton: There are a few things that we're doing. We built the infrastructure that we need to really be able to knock it out of the park over the next couple of years. Some of the things that are just the next level of innovation for us are going to be, number one, increasing our use of personalization and making it much easier for our customers to get what they need when they need it.

So doubling down on that and increasing the number of use cases where our data scientists are actually building models that serve our customers throughout the entire experience is going to be one huge area of focus.

Another big area of focus is getting the data even more real time. As I discussed earlier, Dana, we're a very peaky business and the faster than we can get data into our systems, the faster we're going to be able to report on that data and be able to get insights that are going to be able to help our customers.

Our goal is to have even more real-time streams of that data and be able to get that data in so we can get insights from it and act on it as quickly as possible.

The other side is just continuing to invest in our multi-platform approach to allow the customer to do their taxes and to manage their finances on whatever platform they are on, so that it continues to be mobile, web, TVs, or whatever device they might use. We need to make sure that we can serve those data needs and give the users the ability to get great personalized experiences no matter what platform they are on. Those are some of the big areas where we're going to be focused over the coming years.

Recommendations

Gardner: Now you've had some 20/20 hindsight into moving from one data environment to another, which I suppose is equivalent of keeping the airplane flying and changing the wings at the same time. Do you have any words of wisdom for those who might be having concurrency issues or scale, velocity, variety type issues with their big data, when it comes to moving from one architecture platform to another? Any recommendations you can make to help them perhaps in ways that you didn't necessarily get the benefit of?

Minton: To start, focus on the real business needs and competitive advantage that your business is trying to build and invest in data to enable those things. It’s very easy to say you're going to replace your entire data platform and build everything soup to nuts all in one year, but I have seen those types of projects be tried and fail over and over again. I find that you put the platform in place at a high-level and you look for a few key business-use cases where you can actually leverage that platform to gain real business benefit.

When you're able to do that two, three, or four times on a smaller scale, then it makes it a lot easier to make that bigger investment to revamp the whole platform top to bottom. My number one suggestion is start small and focus on the business capabilities.

Number two, be really smart about where your biggest pain points are. Don’t try to solve world hunger when it comes to data. If you're having a concurrency issue, look at the platform you're using. Is there a way in my current platform to solve these without going big?

Frequently, what I find in data is that it’s not always the platform's fault that things are not performing. It could be the way that things are implemented and so it could be a software problem as opposed to a hardware or a platform problem.
HP Vertica Community Edition
Start Your Free Trial Now
Gain Access to All Features and Functionality
So again, I would have folks focus on the real problem and the different methods that you could use to actually solve those problems. It’s kind of making sure that you're solving the right problem with the right technology and not just assuming that your platform is the problem. That’s on the hardware front.

As I mentioned earlier, looking at the business use cases and making sure that you're solving those first is the other big area of focus I would have.

Gardner: I'm afraid we will have to leave it there. We've been learning about how Intuit uses deep-data analytics to gain a 360-degree view of its TurboTax applications user behavior and preferences. And we have heard about how such visibility allows for rapid applications improvements, providing an extreme personalization level and enabling the user of TurboTax to experience a higher degree of customization, something tailored directly for their situation.

So join me in thanking Joel Minton, Director of Data Science and Engineering for TurboTax at Intuit in San Diego. Thanks so much, Joel.

Minton: Thank you, Dana. I really enjoyed it.

Gardner: And I'd also like to thank our audience for joining this big-data innovation case study discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HPE-sponsored discussions. Thanks again for listening, and come back next time.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: Hewlett Packard Enterprise.

Transcript of a discussion on how TurboTax uses big data analytics to improve performance despite high data volumes during peak usage. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.

You may also be interested in: