Drivetrain Method

Aidan Coco
Mar 7, 2021

I started watching the Deep Learning lecture series for insights on how to get PyTorch up and running and the ins and outs of neural nets. In the process, I have learned much more than that. Sylvain Gugger and Jeremy Howard make a tangible effort to teach machine learning in a way that will produce great products. In the second lesson, they show a simple way to get a GUI up and running for prototyping an image-classifier application. Obviously, not every data scientist needs to be a web developer, but having the power to make something that works from start to finish is an important, practical skill that reinforces the way machine learning is used in a real-world setting.

Often when you are learning about complicated modeling processes and data cleaning, it’s easy to lose sight of your overall goal: producing meaningful insights that help people with a problem. The most accurate model is completely pointless if it never gets deployed or is not structured in an intelligent, grounded way. From the very start, Howard wants you to ask: why am I making this? How can it solve a problem? In doing so, he introduces a method he originated called the drivetrain approach.

Machine learning is only a small part of this approach. A model should serve a larger goal: changing the inputs to produce a better output.

Howard recommends a process with the following steps:

  1. Define a business problem (models should be made to address a problem; problems should not be made to fit a model).
  2. Identify the inputs (levers) that affect this problem. How can the company change its inputs? Should it produce more of x or y? Should it focus on providing for its current customer base or expand? These are examples of inputs that can be changed based on a model’s predictions, and it’s important to know what your model can actually influence.
  3. Find data, or develop methods to collect data, for this problem and construct your model (kind of crazy that what I used to think of as my whole job is collated into one step).
  4. Optimize based on the model’s predictions and change the inputs.

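To make the four steps concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in: the lever values, the `predicted_revenue` function, and its toy response curve are invented for illustration — in practice, the prediction at step three would come from a trained model, not a hand-written formula.

```python
# Hypothetical example: the four drivetrain steps as a tiny optimization loop.

# Step 1: define the objective -- maximize predicted revenue.
# Step 2: levers -- discount rates the company can actually change.
levers = [0.0, 0.05, 0.10, 0.20]

# Step 3: a model that predicts the outcome of each lever setting.
# (A hand-written toy curve here; in practice, a trained model.)
def predicted_revenue(discount):
    demand = 1000 * (1 + 5 * discount)   # discounts lift demand...
    margin = 50 * (1 - discount) - 30    # ...but cut per-unit margin
    return demand * margin

# Step 4: optimize -- pick the lever setting with the best prediction.
best = max(levers, key=predicted_revenue)
# With this toy curve, a 10% discount gives the best predicted revenue.
```

The point of the sketch is where the model sits: it is just the function inside the loop, while the levers and the objective come from the business problem, not from the data.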
Two things stood out to me about this process. First, there is no point where data is generated just to generate data. During NFL games they always show some advanced stat about how fast a certain player was going, but is top-end speed what’s important, or is it acceleration? Does any of this correlate with production, or are we just generating statistics? A campaign might have trouble targeting a certain group of voters, but does that insight allow them to optimize anything about their strategy? How? The drivetrain method emphasizes a big-picture approach: each step is part of a larger groundwork for addressing a tangible problem.

The second major takeaway for me is the last step, optimization. How can your model be used to optimize the inputs defined in step two and produce a better result for the company? Okay, you can predict a dog’s breed with 90% accuracy? Do vets or pet owners have trouble knowing the breed of their dog? Are there health implications? A good model can provide insight, but a good model plus an optimization plan can provide actionable insight. You should always have a goal in mind for what your model can change or improve if it works. Data science is about the results that predictions enable, not the predictions themselves.

Another benefit of this approach is that it considers implementation from the start. Even if a machine learning model provides meaningful insight that leads to a more optimal process, it’s pointless if it cannot be used successfully. Taking the long view at the start of creating a model allows you to code with a use case in mind. For example, one of my colleagues wanted to make a clickbait detector that could work in real time. Knowing that it needed to be a browser add-on, he focused his approach from start to finish on simplicity over maximum accuracy, which led him to prefer a naive Bayes classifier. By asking the right questions at the start, you can avoid a product rollout where implementation is impossible.
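As a rough illustration of that trade-off, here is a from-scratch multinomial naive Bayes classifier in plain Python — a hedged sketch, not my colleague’s actual code, with a handful of toy headlines standing in for real training data:

```python
import math
from collections import Counter

def train_nb(examples):
    """Count words per class; examples is a list of (text, label) pairs."""
    word_counts, class_counts, vocab = {}, Counter(), set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Pick the class with the highest log prior + log likelihood,
    using Laplace (add-one) smoothing so unseen words don't zero out."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training headlines (hypothetical data, for illustration only).
headlines = [
    ("you won't believe what happened next", "clickbait"),
    ("10 tricks doctors don't want you to know", "clickbait"),
    ("this one weird trick will change your life", "clickbait"),
    ("federal reserve raises interest rates again", "news"),
    ("senate passes annual budget bill", "news"),
    ("researchers publish new climate study", "news"),
]

model = train_nb(headlines)
label = predict_nb(model, "you won't believe this one weird trick")
```

Training is a single counting pass and prediction is a handful of log-additions over precomputed counts, which is what makes a model like this cheap enough to run on every page load — exactly the simplicity-over-accuracy call described above.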

One thing I have realized about myself in applying this method is that it can make it harder to get started. Making sure your model is tailored to a very specific business case can lead to analysis paralysis. In the long term, if you’re trying to learn something, it’s better to do it imperfectly many times than perfectly once. Additionally, experimentation is at the heart of a lot of important discoveries. I recommend completing some projects where your goal is just to train a model, and then periodically emphasizing a rigorous orientation toward a business problem. Think of some of your work as sketches, and your ultra-focused projects as a painting carried from concept to finish. If you are just starting out in machine learning, the best thing you can do is make something; as you grow, evolve a process centered on something useful and doable, and figure out how your predictions will ultimately lead to an optimized solution.