We are constantly being told that we are living in the Information Age. The Industrial Revolution marked the beginning of the Industrial Age and similarly, the Digital Revolution started the Information Age.
It’s true that at no other time in history has the human race had instant access to such vast quantities of information. The amount of information generated every day continues to grow at an exponential rate, adding to the mountain of data we already have. We like things to come in mega sizes – it’s gluttony on a technological scale!* But do we really need more data? Or do we need to be better at analysing it?
Is Big Data Just a Buzzword?
The disastrous Mars Climate Orbiter mission in 1998 is an interesting example. NASA was attempting to put an unmanned space probe into orbit around Mars to monitor the planet’s climate. Like most space missions it was a fairly expensive exercise, costing around US$125m. By September of the following year, the probe had successfully made it to Mars and was ready to go into orbit. As the probe travelled around the far side of the planet, the team at mission control waited with bated breath for it to reappear on the other side. But the spacecraft was never seen again.
A unit conversion error meant the trajectory was around 100km off course, causing the probe to enter Mars’ atmosphere too low and be destroyed. This embarrassing and terribly expensive catastrophe could have been easily prevented. All the data required to put the probe into orbit was correct. There was no problem with the data – the problem lay with the interpretation of the data. One piece of software reported thruster data in imperial units, while the other expected metric. They didn’t need MORE data – they just needed to be able to interpret it correctly.
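The arithmetic behind the failure is simple enough to sketch. The real mission mixed pound-force seconds (lbf·s) with newton-seconds (N·s); the figures below are illustrative only, not actual mission data.

```python
# A minimal sketch of how a unit mismatch corrupts otherwise correct data.
# The value below is made up; only the conversion factor is real.

LBF_S_TO_N_S = 4.448222  # 1 pound-force second in newton-seconds

def thruster_impulse_lbf_s():
    """One system reports thruster impulse in imperial units (lbf·s)."""
    return 100.0  # illustrative value

def apply_correction(impulse):
    """The other system assumes the figure is already metric (N·s)."""
    return impulse  # used as-is: no conversion applied

reported = thruster_impulse_lbf_s()
assumed = apply_correction(reported)   # wrong: imperial figure treated as metric
actual = reported * LBF_S_TO_N_S       # what the thrusters really delivered

error_factor = actual / assumed
print(f"Trajectory model underestimates impulse by a factor of {error_factor:.2f}")
```

The data itself was never wrong – each system was internally consistent. The error only appears at the interface between the two interpretations, which is exactly why more data could not have helped.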
According to Viktor Mayer-Schonberger and Kenneth Cukier**, we’re in the middle of a huge infrastructure change in our world, bigger than those of the past such as the Roman aqueducts and the printing press. The aqueducts facilitated the flow of water into cities, the printing press enabled the flow of ideas that fuelled the Enlightenment, and technology now enables the flow of data – and with it, communication.
Ninety-eight percent of all information is now stored digitally. The world’s information has become “datafied”. There are 1.3 billion Facebook users – that’s over 10% of the world’s population – datafied. A recent analysis of Twitter and Facebook updates showed that people’s moods follow a daily or weekly pattern all over the world. Moods have become datafied! Everything can be boiled down to binary code: ones and zeros.
Debunking the Big Data Myth
A few hundred years ago, capturing reams of data with a ledger book, ink and a quill was unthinkable. Now it’s so easy to capture information, and the cost of data storage is constantly decreasing, that we store it instead of deleting it. Look at your phone: how many photos do you take and then forget to delete? Maybe you’ll try to cut down on recording video as you run out of storage, but soon the capacity of your phone will have increased so much it won’t matter. Over the past 50 years, the cost of digital storage has halved every two years. Just because we now have the technology to collect more data does not mean that we should. We could collect information on the eye colour and breakfast-eating habits of every single person we come across each day, but it’s highly unlikely to help us in any way whatsoever. Until you have figured out how to use the data you already have, getting more data will only distract you from the task at hand.
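It’s worth pausing on what “halving every two years” actually compounds to. Taking the paragraph’s figures at face value, a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the claim above: if storage cost halves
# every two years, how much cheaper is storage after 50 years?
years = 50
halvings = years // 2        # 25 halvings in 50 years
factor = 2 ** halvings       # cost falls by this factor
print(f"After {years} years, storage is roughly {factor:,}x cheaper")
```

A fall of over 33 million times explains why deleting anything now feels pointless – which is precisely how data hoarding became the default.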
What we actually DO with the data is far more important than getting more of it. We don’t need big data; we need to understand and use the data we already have – regardless of size.
Police forces investigate and review millions of pieces of “metadata” – fragmented and encrypted information – for the purposes of national security. Time after time, however, it is the intelligence analyst’s ability to connect the dots and put two and two together that prevents a terrorist attack or other crime. Again, it’s the interpretive skill of the analyst which matters more than the quantity or “bigness” of the data. It’s that tiny little piece of information which can really make the difference. It’s “little data”, not “big data”, that prevents a terrorist attack.
Take LinkedIn for example – it’s great for networking, recruiting, meeting the right people. LIONs (LinkedIn Open Networkers) accept all invitations and have thousands of connections. To be an effective networker though, you need to be able to make that right connection between people. Having thousands of people on LinkedIn can be beneficial but actually remembering that you know someone who can help someone else in their particular situation is priceless. Again, it’s your ability to “join the dots” which adds value, not simply the quantity or even quality of the data.
Using Excel for Big Data
So how does the use of Excel fit into all this? Excel, the “Swiss army knife” of software, can’t cope with more than about a million rows (worksheets are capped at 1,048,576). This is where Power Pivot takes over and provides a form of data warehouse which can handle “bigger data”. Power Pivot costs nothing extra: it’s a free add-in for Excel 2010 and is built into some Excel 2013 licenses, and advanced Excel users already comfortable with pivot tables and aggregation functions find it reasonably easy to grasp. We have to remember that a lot of the Big Data hype has been generated by software vendors who want to “go where no one has gone before” and dream up new ways of collecting and analysing data. In a lot of cases, though, what actually needs to be done is not quite so glamorous! We simply need to do a better job of analysing the data we have. We can’t solve our problems by collecting and storing more data. Analysing it in a meaningful way – that’s the difficult bit. Using the Excel software we already have, and getting better at using it to perform meaningful analysis, is much more productive than believing the hype and looking for more and bigger data. Analytical errors are more often due to flawed logic than to insufficient data. We don’t need bigger data; we need better analysis, which in many cases can be performed quite competently in plain Excel or Power Pivot.
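The pivot-table aggregation described above is conceptually very simple: group rows by a label and sum a value. Here’s a sketch of the same idea outside Excel, for illustration – the sales figures are made up:

```python
# A sketch of the kind of pivot-table aggregation the article describes:
# group rows by a label ("row labels") and sum a field ("values").
# The data below is illustrative, not from any real workbook.
from collections import defaultdict

rows = [
    {"region": "North", "product": "Widget", "sales": 120.0},
    {"region": "North", "product": "Gadget", "sales": 80.0},
    {"region": "South", "product": "Widget", "sales": 200.0},
    {"region": "South", "product": "Widget", "sales": 50.0},
]

# Group by region and sum sales – the core of what a pivot table does
pivot = defaultdict(float)
for row in rows:
    pivot[row["region"]] += row["sales"]

for region, total in sorted(pivot.items()):
    print(f"{region}: {total:.2f}")
```

The point is that the analytical operation itself is modest; the value lies in choosing the right grouping and knowing what the totals mean, not in the tool or the volume of rows.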
So How Can we do a Better Job of Analysis?
Summarise the data, find the relationships and look at outliers. Use charts to identify trends and correlations. Add alerts to save time and give us quick and easy exception reporting. Start with what you’ve already got. Go for the quick wins. You’re probably already collecting data that you could turn into a dashboard tomorrow.
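The three steps above – summarise, flag outliers, alert on exceptions – can be sketched in a few lines. The daily sales figures here are illustrative, and the two-standard-deviation threshold is just one common rule of thumb:

```python
# A minimal sketch of summarise / flag outliers / alert on exceptions.
# Figures and the 2-sigma threshold are illustrative assumptions.
import statistics

daily_sales = [102, 98, 105, 99, 101, 100, 350, 97]  # 350 looks anomalous

# Summarise
mean = statistics.mean(daily_sales)
stdev = statistics.stdev(daily_sales)

# Flag anything more than two standard deviations from the mean
outliers = [x for x in daily_sales if abs(x - mean) > 2 * stdev]

# Simple exception report: alert only when something needs attention
alerts = [f"Unusual value: {x}" for x in outliers]
print(f"mean={mean:.1f}, stdev={stdev:.1f}, alerts={alerts}")
```

In Excel the same logic is a handful of AVERAGE, STDEV and conditional-formatting rules – no new data required, just better use of what’s already there.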
Of course there are some fantastic business intelligence and data warehousing systems out there. They’ve got iPad-enabled dashboards which update at the click of a button – all for a nice fee, of course. These are great if you’ve got the budget. Consider a simple Excel solution though. You’ve already got the software, and your analysts already have the skills.
Be solutions-orientated. Data investigations need to be top down, not bottom up. Apply the data to the business issue: find the problem you need to solve, then find the data to support it, rather than wasting time and resources mining data from the bottom up – that’s like looking for a needle in a haystack.
Remember that information or data has no value in itself. It is only valuable when we use it to do something worthwhile!
* “Big Data, Big Ruse”, Stephen Few
** “Big Data: A Revolution That Will Transform How We Live, Work and Think”, Viktor Mayer-Schonberger and Kenneth Cukier, 2013