William Lum

Data Science Method Framework - How to navigate the dizzying plethora of solutions to pick the best

Updated: Jun 24

In data science, many methods solve similar types of problems, but how do you choose the right one?

I must admit I love tools. As a hobby, I like building and fixing things, and I have acquired many tools over the years. Some might wonder: how many hammers do you need? How are they different? For some jobs, a specialized tool reduces the effort tremendously and is much more effective than a standard tool. The same can be said for algorithms and methods in data science.


If you have been educated as a data scientist, understand the underlying math, and have experience using the methods, you will know which situations call for which methods. But if you are just starting your journey in data science or are a citizen data scientist (e.g. your day job is as a marketer), the large toolbox of methods can be overwhelming. I'm an engineer by education and wanted to put some structure around my toolbox so that I (and hopefully others) have a place to start. I found a couple of excellent resources that helped frame my understanding: the Udacity Nanodegree "Predictive Analytics for Business" and the book "The Field Guide to Data Science" by Booz Allen Hamilton are two among the many useful resources I ran across. Building on those and other sources, I created this Methodology Framework spreadsheet.


How to use the Methodology Framework

I've simplified this to make it easier to use and don't pretend to understand all the math behind the data science algorithms. This is a pragmatic approach to understanding what tools to use for the business problem you are trying to tackle.



Read the Data Science Methodology Framework from left to right.

  1. Goal - Start with the type of business problem you are trying to tackle and your goal for the analysis. Note that sometimes your problem will have multiple goals; you should break those apart when looking at this Methodology Framework.

    1. Describe - typical goals data analysts tackle: pulling data together to describe the status of the business and to draw some insights from it.

    2. Discover - dives a little deeper into the patterns in the data to define groups (cohorts) and understand what fields are important for differentiating groups

    3. Predict - uses the data to predict outcomes, such as how something will be classified or what future values will be.

    4. Advise - recommends a course of action

  2. Problem Characteristic - Next we look at characteristics of the problem. Do we need a certain type of output? Does the data type limit our options?

  3. Method / Process - Finally we have the recommended method or process to use for the business problem / goal

  4. Notes - Links and details to give you more info about the suggested method or process
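
To make the left-to-right reading concrete, here is a minimal Python sketch of the four columns as a lookup table. It is only an illustration: the rows, the characteristic wording, and the method names below are my own placeholder assumptions and are not copied from the spreadsheet, and `FrameworkRow` and `suggest_methods` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameworkRow:
    """One row of the framework, read left to right."""
    goal: str            # Describe, Discover, Predict, or Advise
    characteristic: str  # what the problem looks like
    method: str          # suggested method or process
    notes: str           # links and details (left empty in this sketch)


# Placeholder rows for illustration only; the real spreadsheet may differ.
ROWS = [
    FrameworkRow("Describe", "summarize current state", "aggregation and summary statistics", ""),
    FrameworkRow("Discover", "find groups (cohorts)", "clustering", ""),
    FrameworkRow("Predict", "numeric output", "linear regression", ""),
    FrameworkRow("Predict", "categorical output", "logistic regression", ""),
    FrameworkRow("Advise", "choose among possible actions", "optimization or simulation", ""),
]


def suggest_methods(goal, characteristic=None):
    """Filter by goal first, then optionally by a problem characteristic."""
    matches = [row for row in ROWS if row.goal.lower() == goal.lower()]
    if characteristic is not None:
        needle = characteristic.lower()
        matches = [row for row in matches if needle in row.characteristic.lower()]
    return [row.method for row in matches]


if __name__ == "__main__":
    # A problem with one goal (Predict) and a numeric output requirement.
    print(suggest_methods("Predict", "numeric"))
```

If a business problem has several goals, the idea would be to call the lookup once per goal rather than trying to match them all at once, which mirrors the advice above to break multiple goals apart.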

As mentioned, this method framework is built on the work of others much smarter than me and is meant to help people on a learning journey similar to mine. Please help keep this Methodology Framework up to date by adding to it and updating it.

(In the comments please share suggestions on content updates and ideas on how to improve this)


