Efficient Development of Predictive Cloud Applications

March 27, 2013

Introduction

Basketball great Michael Jordan said, “Talent wins games, but teamwork and intelligence wins championships.” I fundamentally believe that the way a company thinks about its way of working and the structure of its teams can become either a millstone around its neck or a competitive advantage. I have received quite a few questions about how we develop software, so in this blog post I will talk about how we think about engineering teams at AgilOne for building marketing-focused big data predictive applications.

As illustrated in the “marchitecture” diagram below, the depth of AgilOne technology is extraordinary. As a consequence, delivering features in our applications requires a tremendous variety and depth of skills. Each of our features requires domain expertise in the specific predictive techniques we are developing. We have to make sure our calculated results are relevant and better than a user’s intuition. The predictions need to be actionable and valid. Development also requires skills in data science and machine learning and an understanding of how to translate an idea into real algorithms. Implementing an algorithm requires very particular skills in understanding the limitations of the available data as well as the dependencies within it. In addition, we need to address architectural objectives such as making the solution scalable, robust, and extensible. All features are exposed as functions in a highly graphical user interface, requiring skilled front-end designers and developers. We also make features and predictions available through our APIs.


Taken together, all these things make it quite challenging to iteratively deliver enterprise-grade, easy-to-use, accurate predictive features on a big data platform. But we have found a way.

How do we execute?

Like many other companies, we use an agile methodology based on scrum. In fact, we use scrum for all teams in our technology organization, including technical operations and IT. Each scrum team has the skills to deliver complete features end to end. As in many other engineering organizations, this implies each team has an associated UI designer, front-end developers, and back-end developers. For us it also means each team has knowledge of technologies such as Hadoop, HBase, R, Cascading, and OLAP. Last but not least, our teams need to have the skill set of a data scientist. They need to understand the data, detect the patterns, and apply the right algorithms.

Within AgilOne we have data scientists working closely with clients (all of whom are marketers with large data volumes), exploring what can be learned by analyzing their data. Their work is tremendously valuable for our customers and ultimately also for us. It allows us to learn what customers are interested in, and to generalize patterns and analyses across our client base. This benefits both AgilOne and our customers.

In the engineering team we have taken the data scientist role one step further by defining a role we call Product Data Scientist. A product data scientist is a data scientist who is also part of an engineering scrum team. They perform generalized analysis that can be applied across organizations in a multi-tenant cloud application. The product data scientist is a very rare individual. They must have a specific skill set, particular interests, and a unique personality. An AgilOne product data scientist is a world-class mathematician with an analytical mind. But they are also a developer who can help turn an algorithm into a scalable working implementation. Some of our product data scientists are mathematicians who have learned the art of software engineering, while others are experienced software engineers who have had a lifelong interest in math and modeling. This role has enabled us to develop new analyses and models -- and implement them within the same team -- avoiding the “throw over the wall” problem.

To lead the delivery of these applications we have product managers who are extraordinary drivers. They are customer-oriented, having worked directly with clients, and have an analytical background and mindset.

I believe that our team setup has enabled us to deliver our predictive applications in a more iterative fashion than a more traditional analyst vs. engineering structure would allow. The reason is that teams have the analytical competency to understand where trade-offs can be made while keeping features useful and relevant. Since we removed the wall between data science and engineering, we improve features and algorithms, and ship them, very efficiently.

What we have learned so far

I have been amazed at how our team composition has multiplied the competency of each individual. We have teams that can iteratively deliver advanced, machine learning-based, scalable, multi-tenant predictive applications with great UI. The product data scientist role has worked out well. We have been experimenting with having product data scientists work alongside client data scientists. That cross-pollination has been great, as these individuals speak the same language, which makes sharing ideas and knowledge essentially effortless.

I believe we have turned the difficult problem of delivering functionality of extreme depth and breadth into a competitive advantage.