Big Data – it’s everywhere.
Is data getting bigger? – no question. Do we need new ways of storing, processing and analysing that data? – no question. Is this a revolution? Not necessarily.
Data has been getting bigger since the dawn of time. At the tender age of 35 I’ve seen data storage problems grow and evolve over my career, not just my lifetime. Hadoop and its brethren are the latest soldiers in this war of data incarceration, and Hadoop in particular seems to be winning the hearts and minds of the C-level. In response, data technology providers have clamoured for a piece of the action, from Cloudera and Hortonworks through Microsoft to Jaspersoft and Tableau – if you’re serious about data you must have Hadoop on your radar, right? Why then do Hadoop and other Big Data technologies remain the preserve of the privileged few? And as for analysing the data stored therein…
It’s all about Data Science…
I’m a data professional, so you’ll have to forgive me if I’m getting fed up of hearing about Data Scientists. Headlines like “Data Scientist – the sexiest job of the 21st century” and “Big Data skills bring Big Dough” don’t help because, to be honest, I’m not sure I have what it takes to be a data scientist. I have the background: a first degree in mathematics (BSc), ten years of technical data development experience at a leading data provider, and a good knowledge of product. But there’s more to data science than that. I love this visualisation below from Swami Chandrasekaran; it describes the skills needed for today’s Data Scientist and it hits the nail on the head for me – I mean WOW. I’ll let you soak it all up.
It’s no wonder Data Science is the new sexy; there’s a lot going on here: maths and stats, R programming, Hadoop (setting up a Hadoop cluster is not for the faint-hearted), machine learning, ETL, Java, Python, SQL and so on.
Where does that leave the Data Analyst?
So where does that leave me, the data analyst? Well, let’s take a step back: how many people really have these skills? There are the privileged few Data Scientists working for the Big Data giants of this world, and I don’t begrudge them their hard-earned money. But the rest? There’s already a predicted shortage of Data Scientists, so who will be filling the gaps? The humble Data Analyst (or Business Analyst, call them what you will). In other words, me.
That’s why today’s announcement from Alteryx and Revolution Analytics is so important, in my opinion. Not for what it promises (I’m a big Alteryx advocate, and from what Revolution R says it can deliver I’m more than sure they will make a great partnership) but for what it heralds: a new dawn for Big Data. This is the start of the revolution (if you’ll excuse the pun). It’s time to end this data incarceration and bring the power back to the data analyst. I need instant access to my Big Data stack, not access whenever a data scientist or an IT engineer can pass me a feed. The questions I have, or my C-level exec has, aren’t going to wait while we recruit a data scientist. Data is real time, and answers need to be too – and that means democratising data.
That’s why I love Alteryx. I can build connectors out to disparate datasets (including Hadoop) using a simple toolset and then, in the same module, use some “data-munging” tools to combine and parse the data. Still in the same module and interface, I can use predictive tools based on R (and soon Revolution Analytics on the desktop) to analyse and gain insight into the data, before exporting and visualising the results (in Tableau if I want), again via a simple tool. Finally, I can publish the finished module as an “analytical app” for my less tech-savvy colleagues, who simply fill in a few parameters and run it via a web interface on a server install – and yes, you guessed it, still using the same interface.
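For anyone who thinks better in code, here is a rough sketch of that connect, munge, predict, export and publish flow in plain Python. To be clear, this is not Alteryx’s API and says nothing about how Alteryx works internally: the sample data, the function names (`read_rows`, `join_on_store`, `fit_line`, `run_app`) and the straight-line forecast are all made up purely for illustration.

```python
import csv
import io

# Hypothetical sample inputs standing in for two disparate sources.
SALES_CSV = """store_id,month,sales
A,1,100
A,2,120
A,3,140
B,1,80
B,2,85
B,3,90
"""

STORES_CSV = """store_id,region
A,North
B,South
"""


def read_rows(text):
    """Connect: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_store(sales, stores):
    """Munge: attach each store's region to its sales rows."""
    region_by_store = {row["store_id"]: row["region"] for row in stores}
    return [
        {**row, "region": region_by_store[row["store_id"]]}
        for row in sales
        if row["store_id"] in region_by_store
    ]


def fit_line(xs, ys):
    """Predict: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def run_app(region, forecast_month):
    """The 'analytical app': a colleague supplies only the two parameters."""
    rows = join_on_store(read_rows(SALES_CSV), read_rows(STORES_CSV))
    rows = [r for r in rows if r["region"] == region]
    xs = [float(r["month"]) for r in rows]
    ys = [float(r["sales"]) for r in rows]
    slope, intercept = fit_line(xs, ys)
    forecast = slope * forecast_month + intercept

    # Export: write the result as CSV text that a visualisation tool could read.
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["region", "month", "forecast_sales"])
    writer.writerow([region, forecast_month, round(forecast, 2)])
    return out.getvalue()


if __name__ == "__main__":
    print(run_app("North", 4))
```

The point of the sketch is the shape rather than the maths: every step lives in one place, and the person running it only ever touches the two parameters passed to `run_app`.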
I’m not saying that Alteryx removes the need for all of the skills we said a data scientist needed in the metro map, but boy does it flatten that learning curve for us mere mortals! Do I still need to be intelligent to understand how to piece it together? Obviously. But I can focus that intelligence where it needs to be: on my analysis, not on programming or on understanding several different tools.
The Alteryx / Revolution Analytics combo isn’t the only tool aiming to simplify Big Data Analytics, and it won’t be the last, but if it saves a Data Scientist’s salary then I’m sure it’ll be winning a few more fans before the year is out.
And finally, if anyone has the artistic talent to redraw the metro map data science image from an Alteryx perspective then please do so. I’d love to see Alteryx as a hub in the middle covering a lot of this, with a few fundamentals feeding out. How do I become an Alteryx scientist? Get on the train 🙂
See below for a video of Alteryx and Revolution R working together: