… you’re working on a startup
❖ ❖ ❖
You can’t anticipate what problems you will need to solve. When you do need to solve one, you will need to learn how people have behaved and how those behaviors correlate with what’s important – and once you identify a problem to solve, you don’t want to wait a long time to gather the relevant data.
Therefore:
Instrument everything and store it locally
We capture user actions within pages (e.g. button clicks) and between pages (link clicks)
We originally used AJAX to send the click data back, but it was unreliable due to race conditions in the browser – so now we pass the data as parameters in the HTTP request
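One way to sketch the request-parameter approach: decorate each outbound link’s href with tracking parameters, so the server records the click from the incoming request itself rather than relying on an AJAX beacon racing page unload. The parameter names here (`ev`, `src`) are illustrative assumptions, not the actual scheme:

```typescript
// Sketch: append tracking parameters to an outbound link instead of
// firing an AJAX beacon. Parameter names ("ev", "src") are hypothetical.
function trackedHref(href: string, event: string, source: string): string {
  const url = new URL(href);
  url.searchParams.set("ev", event);   // event name, e.g. "nav_click"
  url.searchParams.set("src", source); // page/element that was clicked
  return url.toString();
}
```

The server logs these parameters off the incoming request before serving the destination page, so the click is captured even when the originating page is torn down immediately.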
We capture “semantic” actions as well (e.g. service sent a welcome email)
We are constantly adding events – not because we know we need them, but because for this aspect of our system, we drive the work in anticipation of its use.
We store all the data locally. At one point we looked at CouchDB (and other key/value stores) but decided to stick with MySQL – it works well.
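A minimal sketch of what an append-only local event log might look like, with schemaless properties so that constantly adding new events needs no migration. The table and column names are assumptions, not the actual schema:

```typescript
// Sketch: one row per event in a local MySQL "events" table.
// Table/column names are illustrative, not the author's real schema.
interface EventRow {
  name: string;          // "button_click", "welcome_email_sent", ...
  userId: number | null; // null for anonymous visitors
  at: string;            // ISO-8601 timestamp
  props: string;         // free-form JSON, so new event types need no migration
}

// Build a parameterized INSERT for any MySQL client library.
function toInsert(e: EventRow): { sql: string; params: unknown[] } {
  return {
    sql: "INSERT INTO events (name, user_id, at, props) VALUES (?, ?, ?, ?)",
    params: [e.name, e.userId, e.at, e.props],
  };
}
```

Keeping the log append-only and the properties schemaless is what makes “instrument everything” cheap: recording a new event is just another row, and the slicing happens later at query time.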
For a brief period, we had some data which existed only in 3rd party services (e.g. KissMetrics, Google Analytics, MixPanel). However, when we used this data for problem solving, we ran into several problems:
- We suspected the accuracy of 3rd party storage or 3rd party rendering – and we could not validate accuracy once the data was “over there”
- In more than one instance, we found major discrepancies between our data and the 3rd party’s data
- The APIs of 3rd party services never let us be 100% confident that they had our data (i.e. no transactions, no receipts, no after-the-fact validation)
- We want to slice & dice the data differently later, and 3rd party services would limit our ability to do this – i.e. we don’t know what problem we will be solving later, so optimizing the data for today’s rendering loses too much, and we can’t recover it
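Since 3rd party APIs offer no receipts, the best available check is a periodic reconciliation: compare local counts against the 3rd party’s counts for the same event and window, and flag anything outside a tolerance. This is a sketch under assumed names and an assumed 5% threshold, not the actual process:

```typescript
// Sketch: flag discrepancies between local and 3rd-party event counts.
// The 5% tolerance is an assumption, not a recommendation.
function discrepancy(localCount: number, remoteCount: number): number {
  if (localCount === 0) return remoteCount === 0 ? 0 : 1;
  return Math.abs(localCount - remoteCount) / localCount;
}

function needsInvestigation(
  localCount: number,
  remoteCount: number,
  tolerance = 0.05,
): boolean {
  return discrepancy(localCount, remoteCount) > tolerance;
}
```

The local store is treated as the source of truth in the ratio’s denominator; the 3rd party number is only ever checked against it, never the other way around.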
Now, we continue to use 3rd party services (e.g. Google Analytics), but everything over there is a slave copy of what we have locally