When you read online descriptions of data analyst jobs, do you feel like you’re not qualified? Do you feel like a cycle repair technician because you aren’t working with terabyte databases, solving PhD-level math equations, and implementing your own machine learning algorithms? Do you feel like you are missing out on the sexy parts of data analysis, while all you do is the grunt work?
A reader sent us an email asking whether what they were doing counts as data analysis. That is, could they say they know a bit of data analysis, even though the work they do involves no heavy mathematics?
What they are currently doing
Here is the reader's word-for-word description of what they are doing: “recently, the boss wanted to know how our overnight tests were faring. The results of the last few months were spread over text files, csv files, and a Mysql database. I had to extract all of them, munge them together, and produce graphs based on them. The boss was pleased.”
What do you think - is that data analysis? Does it matter that the reader wasn’t doing heavy mathematics? Does it matter that the boss was pleased?
According to Wikipedia, here’s how data analysis is defined: “Data Analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data.” Notice the “and/or” in the definition. While statistical methods can involve heavy mathematics, logical techniques can be applied without it.
Analyst or not?
The reader is extracting data, munging it, and visualizing it to support decisions and draw out insight. Those are all things a data analyst does, so by this definition the reader can certainly say they know a bit of data analysis. True, no heavy mathematics is involved, but the logic is there.
So no math ever?
Machine learning, terabyte databases, and PhD-level math equations are also part of what a data analyst does. They just belong to a different set of problems.
The dirty secret of data science / data analysis right now is that what everyone talks about is machine learning, Kaggle competitions, "deep learning", and the like. That's because those things are sexy and sound good.
The truth is that roughly 80% of what people actually do is data munging and data visualization. Sure, a few companies like Google, Facebook, and Baidu use the high-level math, but those are generally anomalies. There are thousands of businesses that would kill to have the reader do what they did for their boss.
Seriously, being directionally right is more important for 99% of businesses than the 1% improvement that a new machine learning technique can give you.
Data Analyst, Data Engineer, Data Scientist, ...
Here's an interesting take on it from a "data science bootcamp" company: Insight Data Science - "Data Science vs Data Engineering"
The key paragraph for this conversation is the following:
A good data engineer has extensive knowledge on databases and best engineering practices. These include handling and logging errors, monitoring the system, building human-fault-tolerant pipelines, understanding what is necessary to scale up, addressing continuous integration, knowledge of database administration, maintaining data cleaning, and ensuring a deterministic pipeline.
So given what our reader described, most of what they did (the extracting, the munging, making the process replicable) would fall under the title "Data Engineering". The analysis and visualization parts would fall under "Data Science" / "Data Analyst".
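To make that split concrete, here is a minimal Python sketch of the kind of job the reader described. Everything specific in it is an assumption for illustration: the log formats, the column names, the test_runs table, and the use of an in-memory SQLite database as a stand-in for MySQL. The first three functions are the "data engineering" half (extract and munge); the last function and the text chart are the "data analyst" half.

```python
import csv
import io
import sqlite3
from collections import defaultdict


def rows_from_text(text):
    """Extract (date, result) pairs from lines like '2024-01-02 PASS' (hypothetical format)."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            yield parts[0], parts[1].upper()


def rows_from_csv(text):
    """Extract (date, result) pairs from CSV with hypothetical 'date' and 'result' columns."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["date"], row["result"].strip().upper()


def rows_from_db(conn):
    """Extract (date, result) pairs from a hypothetical 'test_runs' table."""
    for date, result in conn.execute("SELECT date, result FROM test_runs"):
        yield date, result.upper()


def pass_rate_by_month(rows):
    """Munge the combined rows into a pass rate per 'YYYY-MM' month."""
    totals = defaultdict(lambda: [0, 0])  # month -> [passed, total]
    for date, result in rows:
        month = date[:7]
        totals[month][1] += 1
        if result == "PASS":
            totals[month][0] += 1
    return {month: passed / total for month, (passed, total) in sorted(totals.items())}


# Made-up sample data standing in for the three real sources.
text_log = "2024-01-02 PASS\n2024-01-03 FAIL\n"
csv_log = "date,result\n2024-02-01,pass\n2024-02-02,pass\n"
conn = sqlite3.connect(":memory:")  # SQLite used here only as a stand-in for MySQL
conn.execute("CREATE TABLE test_runs (date TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO test_runs VALUES (?, ?)",
    [("2024-01-04", "PASS"), ("2024-02-03", "fail")],
)

all_rows = (
    list(rows_from_text(text_log))
    + list(rows_from_csv(csv_log))
    + list(rows_from_db(conn))
)
rates = pass_rate_by_month(all_rows)

# A crude text "graph": one bar per month, 20 characters wide at 100%.
for month, rate in rates.items():
    print(f"{month} {'#' * round(rate * 20)} {rate:.0%}")
```

Notice that there is no heavy math anywhere in it, only counting and dividing. In practice the bars would probably come from a plotting library rather than print, but the shape of the work is the same.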
The boss was pleased
The sexy parts (which only a handful of people may do) get talked about, while the grunt work, which is actually more commercially useful, is ignored. The key thing the reader mentioned was that the boss was pleased. That is, after all, what matters: not whether heavy math, logic, or a combination was used, but that a problem was solved.
To that end, we told the reader that yes, someone like them can say they know a bit of data analysis, even though the work they do involves no heavy mathematics.
So the next time you feel math envy, ask yourself what problems you really want to be solving, and whether more and more math is really the way to go.