What is data?


Simply put, data is a collection of measured facts about something.  Rainfall, population counts, weight, height, dollars spent, days elapsed.  It is raw, it is copious, and each data point alone doesn’t tell us more than a single fact.

In more depth, data is a representation of reality.  Something happens, and we make a record of that event with symbols.  The symbols themselves are just an abstract representation of what occurred.

Collections of data can help us find patterns in those facts, allowing us to begin to understand the systems being measured and the causes driving those systems.  A symbolic representation of a single data point can convey a fact; identifying trends in collections of those symbols and facts can convey information.  That is where knowledge and wisdom dwell.
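
To make the difference between a single fact and a trend concrete, here is a minimal sketch in Python.  The rainfall readings are made-up values used only for illustration, and the variable names are hypothetical; the point is simply that one reading states a fact, while the whole collection can show a pattern.

```python
from statistics import mean

# Hypothetical monthly rainfall readings in millimetres (made-up values).
rainfall_mm = [62, 58, 55, 49, 41, 37]

# A single data point conveys one fact.
first_reading = rainfall_mm[0]

# The collection conveys information: an average and the month-to-month changes.
average = mean(rainfall_mm)
changes = [later - earlier for earlier, later in zip(rainfall_mm, rainfall_mm[1:])]

# A simple pattern check: did rainfall drop in every consecutive month?
is_declining = all(change < 0 for change in changes)

print(f"First reading: {first_reading} mm")
print(f"Average: {average:.1f} mm")
print(f"Declining every month: {is_declining}")
```

No single reading says that rainfall is falling.  Only the comparison across the collection does.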

When we need to make decisions, we can go with our guts, relying on what Nobel Prize winner Daniel Kahneman calls the brain’s System 1.  Prone to oversimplifying and to missing important details, it is a great way to make quick, though often inaccurate, decisions.  This has its place in life, though we need to be careful not to rely too heavily on this system’s initial determinations, or to become too attached to them.

In order to make more accurate choices, we need to engage the analytical System 2, the slower, consciously thinking part of the brain.  That system takes the data we have at hand and looks for patterns in it to find answers.  We can use this part of the brain to find more complex patterns than may be initially obvious when just looking at raw collections of symbols and facts.  This takes time, and System 2 can also make mistakes (biases will be its own topic!).

Today, a melding of both systems is available to us thanks to advances in computing.  By taking the raw data and analyzing it, we can give ourselves pre-processed results with patterns highlighted.  Visualizations of processed data allow us to make good decisions quickly, by moving the heavy lifting away from manual human-powered processes and into automated ones.  We can save time and money, and avoid mistakes.

Over the past 20 years, I have worked with some of the largest international companies in the world, helping them build good data structures to track the facts they rely on to make their decisions, and analyzing that data to help guide them.  I’ve worked with federal, state and local governments within the US to track their law case data, public requests, budgets and files, building computing systems that help them do their jobs more efficiently.

It all comes back to data.  Collecting data effectively, identifying and cleaning up bad data, visualizing the results, and at the end of the day, making the right choices.