Step 2 is to collect the correct data. A lot of work goes into this. I don’t like lots of work, so at least I will make it simple. Between lots of simple work and lots of complex work, I will take lots of simple work, if someone twists my arm and forces me to pick one.
In order to collect CORRECT data, we first have to know how the data is composed. The composition of a data object is usually one of the below, or a combination of them:
Yes, but this is an important and fundamental concept.
Let’s say you want to predict next year’s revenue for your business. One year of revenue is the sum of monthly revenue, monthly revenue is the sum of daily revenue, and so on. So now you know how your annual revenue data is composed, and how granular you should go to make your prediction.
And if you want to predict profit, that is composed of your revenue minus the business operating cost.
And if you want to predict how much output each team member needs to produce to hit your revenue goal, that’s the total amount of output divided by the number of team members.
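Here is a minimal Python sketch of those three compositions (addition, subtraction, division). All the numbers and variable names are made up for illustration; they are not real business data.

```python
# Hypothetical daily revenue for a few days in two months (illustration only).
daily_revenue = {
    "2023-01": [100.0, 120.0, 90.0],
    "2023-02": [110.0, 95.0, 130.0],
}

# Addition: monthly revenue is the sum of daily revenue,
# and annual revenue is the sum of monthly revenue.
monthly_revenue = {month: sum(days) for month, days in daily_revenue.items()}
annual_revenue = sum(monthly_revenue.values())

# Subtraction: profit is revenue minus operating cost.
operating_cost = 400.0  # assumed figure
profit = annual_revenue - operating_cost

# Division: output per team member is total output divided by head count.
total_output = 240  # assumed figure
team_members = 8    # assumed figure
output_per_member = total_output / team_members
```

Writing the composition down like this also tells you which raw fields you need to collect: daily revenue, operating cost, total output, and head count.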
So now you get it.
Make sure you spend enough time thinking thoroughly about getting FULL and COMPLETE coverage of your data compositions. If you are missing one composition of your data, your analysis is A LOT LESS convincing, NOT trustworthy, and guessy (I don’t think there is such a word). Check the MECE principle (mutually exclusive, collectively exhaustive; https://en.wikipedia.org/wiki/MECE_principle) and use it as a good guideline.
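One rough way to sanity-check a breakdown against MECE is to confirm that every item lands in exactly one group. The sketch below assumes you can list your items (here, hypothetical revenue streams) and the groups you split them into; `check_mece` is a helper name I made up for this example.

```python
def check_mece(items, groups):
    """Return (overlapping, uncovered) items for a proposed breakdown.

    items:  iterable of item labels that must be covered
    groups: dict mapping a group name to the set of labels it contains
    An empty result on both sides means the breakdown looks MECE.
    """
    overlapping = set()  # items in more than one group (not mutually exclusive)
    uncovered = set()    # items in no group (not collectively exhaustive)
    for item in items:
        hits = [name for name, members in groups.items() if item in members]
        if len(hits) > 1:
            overlapping.add(item)
        elif not hits:
            uncovered.add(item)
    return overlapping, uncovered


# Hypothetical revenue streams and a flawed breakdown of them.
streams = ["online", "retail", "wholesale", "subscription"]
breakdown = {
    "direct": {"online", "retail", "subscription"},
    "recurring": {"subscription"},
}
overlapping, uncovered = check_mece(streams, breakdown)
```

In this made-up breakdown, "subscription" is counted twice and "wholesale" is missing, which is exactly the kind of hole that makes an analysis guessy.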
Once you know your data composition, let’s think about how to collect that data.
Now is also a good time to know that there are quantitative and qualitative data. Quantitative data is something like actual revenue figures, and qualitative data is something like news articles.
There are also primary and secondary data sets. Primary data is data that you have to create new yourself, and secondary data is data that already exists elsewhere.
If you have all quantitative and secondary data, you are almost all set and ready to go to the next step. BUT if you don’t, then you have to be creative (but not too far off and crazy) about collecting and creating data.
When you are working with secondary data from elsewhere, you have to make sure it is the data you want. A quick guideline is to use 6W1H: if your answers to the questions below fit well enough, the data is good enough to be your data.
- Who collected the data?
- For whom did they collect it?
- Why did they collect it?
- When did they collect it?
- Where did they collect it?
- What fields of data did they collect, and is that enough? Is it what you want?
- How did they collect the data, and with which method?
This relates to 6W1H, and here is another VERY important thing to know when you are collecting data: you MUST collect data in groups. Back to the example: if you want to predict next year’s revenue for your business, you must have the data for this year, last year, and the year before. And if you want to (and you should) dive into monthly revenue granularity, then you need the revenue for the same month last year to compare against.
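To show why the matching month from last year matters, here is a small sketch of a year-over-year comparison. The revenue figures and the `year_over_year` helper are hypothetical; the point is only that the comparison is impossible when the earlier data point was never collected.

```python
# Hypothetical monthly revenue keyed by (year, month).
monthly_revenue = {
    (2022, 1): 1000.0,
    (2022, 2): 1100.0,
    (2023, 1): 1200.0,
    (2023, 2): 1045.0,
}


def year_over_year(revenue, year, month):
    """Return the fractional change versus the same month one year earlier.

    Returns None when either month is missing (or the earlier value is zero),
    because there is nothing valid to compare against.
    """
    current = revenue.get((year, month))
    previous = revenue.get((year - 1, month))
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) / previous


jan_change = year_over_year(monthly_revenue, 2023, 1)    # growth vs Jan 2022
feb_change = year_over_year(monthly_revenue, 2023, 2)    # decline vs Feb 2022
no_baseline = year_over_year(monthly_revenue, 2022, 1)   # no 2021 data, so None
```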
Another important thing: when you read data, make sure only one set of fields differs at a time. If you are reading revenue data for year 1 and year 2, both years must be made of the same compositions. You can’t have year 1 with products a and b and year 2 with products c and d, and come up with insights (in general...).
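A quick check before comparing two periods is to confirm they are built from the same fields. This sketch uses the product example above with invented revenue numbers; `composition_mismatch` is just a name I picked for illustration.

```python
# Hypothetical revenue by product for two years (numbers invented).
revenue_by_product = {
    "year 1": {"product a": 500.0, "product b": 300.0},
    "year 2": {"product c": 450.0, "product d": 400.0},
}


def composition_mismatch(first, second):
    """Return the fields that appear in only one of the two periods."""
    only_first = set(first) - set(second)
    only_second = set(second) - set(first)
    return only_first, only_second


only_year_1, only_year_2 = composition_mismatch(
    revenue_by_product["year 1"], revenue_by_product["year 2"]
)
# Both sets being empty would mean the two years share the same composition.
comparable = not only_year_1 and not only_year_2
```

Here `comparable` comes out false, so any difference in total revenue between the two years could come from the product mix rather than from the year itself.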
OK, next let’s talk about how to make data and how to make use of qualitative data.