Data Science Project Checklist To Use Before You Start A Project To Convey You Can Actually Get Work Done

You're relatively new to data science, you're about to start a project, and you want to make sure you're on the right track so you can nip bad habits and methods in the bud. More than that, your main objective with this project is to show future data science interviewers that you have a track record of doing data science. You know that many people apply for data science positions with limited ability to make an impact, so you want to make sure you stand out.

Your project should convey to others that you can actually get some data science work done in the real world

To do this, you have to think about your project from a hiring manager's point of view. Your project should match some of the keywords in the data science job listings you are looking at, and it should show that you have practical abilities in addition to a great educational background. You also need to show that you can write production-quality code. Lastly, you need to show that you understand the business side of data science and how it can help the company.

Checklist breakdown - Before Project, During Project, After Project

To properly convey that you can actually get work done with your data science project, you have to do some thinking before, during, and then after the project. In each step you want to focus on and think about slightly different things, so rather than have one big checklist, the checklists have been split up into three different sections. This way you can refer to them and use them at different times. This article will focus on the checklist of things you should think about before you start your project.

Before Starting The Data Science Project Checklist

The checklist to go through before starting the project is broken down into five sections.

  1. What question are you asking/answering and for whom?
  2. What data are you using?
  3. What techniques are you going to try?
  4. How will you evaluate your methods and results?
  5. What do you expect the result to be?

For each section, there will be additional questions that you should think about and answer before you get started with your data science project.

What question are you asking/answering and for whom?

The first section, "what question are you asking/answering and for whom?", focuses on making sure you've thought about what you are going to ask as well as why someone should care about this. The questions you should ask yourself are as follows:

  • Do you have one question you are looking to answer?
  • Do you have one example organization / team that would be interested in this answer?
  • Do you have three reasons why this organization / team would be interested in this answer?
  • Do you have three reasons why you are personally interested in this answer?

What data are you using?

The second section, "what data are you using?", focuses on making sure you've thought about what data you think you are going to need to answer the question you are going to ask. The questions you should ask yourself are as follows:

  • What data am I going to need?
  • Where am I going to get the data?
  • What is the potential size of the data?
  • Is it enough data?
  • What data cleaning, munging, scraping, massaging, etc. will I need to do to this data?
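
To make the size and "is it enough?" questions concrete, a quick summary of a small sample can help before you commit to a data set. The sketch below is only an illustration: the function name `summarize_dataset`, the column names, and the sample rows are all hypothetical, and what counts as "enough" depends on your question and methods.

```python
from collections import Counter

def summarize_dataset(rows, label_key):
    """Return rough size and quality figures for a list of dict records."""
    n_rows = len(rows)
    # Count every field that is missing (None) across all records.
    n_missing = sum(1 for r in rows for v in r.values() if v is None)
    # How many examples exist for each value of the target column.
    label_counts = Counter(r[label_key] for r in rows)
    smallest_class = min(label_counts.values()) if label_counts else 0
    return {
        "rows": n_rows,
        "missing_values": n_missing,
        "label_counts": dict(label_counts),
        "smallest_class": smallest_class,
    }

# Hypothetical sample records, made up for illustration only.
rows = [
    {"tenure": 12, "plan": "basic", "churned": "no"},
    {"tenure": 3, "plan": None, "churned": "yes"},
    {"tenure": 40, "plan": "pro", "churned": "no"},
    {"tenure": None, "plan": "basic", "churned": "no"},
]

summary = summarize_dataset(rows, label_key="churned")
print(summary)
```

A very small `smallest_class` or a large `missing_values` count is an early hint that you may need more data or more cleaning than you planned for.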

What techniques are you going to try?

The third section, "what techniques are you going to try?", focuses on making sure you've thought about the machine learning, data science, programming, and statistical methods and techniques you are going to use to answer the question you are going to ask. The questions you should ask yourself are as follows:

  • What methods/techniques should I use?
  • Why do I think these are the correct methods/techniques to use for this type of problem and data set?
  • Are there similar projects / papers that have already done this that I can learn from before I get started?
  • Are these techniques ones that I would want to use/do in a data science job?

How will you evaluate your methods and results?

The fourth section, "how will you evaluate your methods and results?", focuses on making sure you've thought about how you will evaluate what you've done and what results you achieved when you finish your project. The questions you should ask yourself are as follows:

  • How will I know I did the analysis and project correctly?
  • What are key parts of the project that will tell me that I am doing things incorrectly?
  • What numbers / results / insights will I sense check?
  • What simple, logical, chronological checkpoints can I put into my project to check whether what I am doing is working?
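
One possible way to build such checkpoints into a project is to write each sense check as a small assertion that runs right after the step it guards. This is a minimal sketch, not a prescribed method: the helper `checkpoint`, the toy records, and the thresholds are all assumptions chosen for illustration.

```python
def checkpoint(name, condition, detail=""):
    """Stop early with a clear message if a sanity check fails."""
    if not condition:
        raise AssertionError(f"Checkpoint failed: {name}. {detail}")
    print(f"Checkpoint passed: {name}")
    return True

# Hypothetical toy records, made up for illustration only.
raw_rows = [
    {"age": 34, "income": 52000},
    {"age": 29, "income": 48000},
    {"age": None, "income": 61000},
]

# Checkpoint 1: the data actually loaded.
checkpoint("data loaded", len(raw_rows) > 0)

# Checkpoint 2: cleaning did not silently throw away most of the data.
clean_rows = [r for r in raw_rows if r["age"] is not None]
checkpoint(
    "cleaning kept at least half the rows",
    len(clean_rows) >= 0.5 * len(raw_rows),
    f"kept {len(clean_rows)} of {len(raw_rows)}",
)

# Checkpoint 3: a summary number falls in a range you wrote down beforehand.
mean_age = sum(r["age"] for r in clean_rows) / len(clean_rows)
checkpoint("mean age is plausible", 18 <= mean_age <= 90, f"got {mean_age}")
```

Placing the checks in the order the project runs means a failure points to the first step that went wrong, rather than surfacing later as a strange final result.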

What do you expect the result to be?

The fifth section, "What do you expect the result to be?", focuses on making sure you've thought about what you expect the result to be and why. The questions you should ask yourself are as follows:

  • What do I expect the result to be?
  • Why do I expect the result to be this?
  • Does this result match the results / experiences other people have had with similar methods and techniques on similar data?
  • What do I expect the results to be at the simple chronological checkpoints I put into my project?

Ask, think, and answer these questions and your project will convey to others that you can actually get some data science work done in the real world.

By going through the five sections and asking the questions in each one, you'll be thinking about your project from a hiring manager's point of view. This will help your project match keywords from potential data science job listings and show your practical ability to get data science projects done. It will also show the thorough thinking that demonstrates you understand the business standpoint of data science and how it can help an organization.

Good luck with your project and start thinking today!
