Using Inspect / JavaScript to scrape data from visualisations online

My last post talked about making over this visualisation from The Guardian:

[Screenshot: the original Guardian visualisation]

What I haven’t explained is how I found the data, which is what I intend to outline in this post. These skills are very useful whenever you need the underlying data to re-visualise a visualisation or table you’ve found online.

The first step in trying to download data for any online visualisation is to check how it is made. It may simply be a static graphic (in which case extracting the data may be hard, unless it is a chart you can unplot using WebPlotDigitizer), but interactive visualisations are typically built with JavaScript, unless they use a bespoke product such as Tableau.

Assuming it is interactive, you can start to explore by right-clicking on the image and choosing Inspect (in Chrome; other browsers have similar developer tools).

[Screenshot: the right-click menu showing the Inspect option]

I was treated to this view:

[Screenshot: the Elements panel showing the chart built from a series of paths]

I don’t know much about coding, but this looks like the view is being built from a series of paths. How might it be doing this? We can find out by digging deeper; let’s visit the Sources tab:

[Screenshot: the Sources tab]

Our job on this tab is to look for anything unusual outside the typical JavaScript libraries (you learn to recognise these by being curious and looking at lots of sites). The first file, gay-rights-united-states, looks suspect, but as can be seen from the image above it is empty.

Scrolling down (see below), we find there is an embedded file / folder (flat.html), and inside it something new: all.js and main.js…

[Screenshot: the flat.html folder containing all.js and main.js]

Investigating all.js reveals nothing much, but main.js shows us something very interesting on line 8. JACKPOT! A Google Sheet containing the full dataset.

[Screenshot: line 8 of main.js, referencing a Google Sheet]
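If you’ve never seen one, a Google Sheet being loaded in code is easy to spot. Interactives of this era often pulled a published sheet in with Tabletop.js, along these lines (illustrative only – this is my sketch of the pattern, not the Guardian’s actual line):

var publicSpreadsheetUrl = 'https://docs.google.com/spreadsheets/d/<sheet-key>/pubhtml'; // the giveaway

Tabletop.init({
  key: publicSpreadsheetUrl,          // the published Google Sheet
  callback: function (data) {
    stateData = data;                 // the rows become the stateData array used elsewhere
  },
  simpleSheet: true                   // one worksheet, returned as an array of objects
});

Spot a docs.google.com/spreadsheets URL like that and you can usually open it directly in your browser.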

And we can start vizzing! (btw I transposed this for my visualisation to get a column per right).

Advanced Interrogation using JavaScript

Now, part way through my visualisation, I realised I needed to show the text items the Guardian had on their site, but these weren’t included in the dataset.

[Screenshot: the Guardian’s hover box with the text items I needed]

I decided to check the JavaScript to see where this text was created and whether I could decipher it. Looking through main.js I found this snippet:

function populateHoverBox (type, position) {

  var overviewObj = {
    'state' : stateData[position].state
  }
  // ...
  if (stateData[position]['marriage'] != '') {
    overviewObj.marriage = 'key-marriage'
    overviewObj.marriagetext = 'Allows same-sex marriage.'
  } else if (stateData[position]['union'] != '' && stateData[position]['marriageban'] != '') {
    overviewObj.marriage = 'key-marriage-ban'
    overviewObj.marriagetext = 'Allows civil unions; does not allow same-sex marriage.'
  } else if (stateData[position]['union'] != '') {
    overviewObj.marriage = 'key-union'
    overviewObj.marriagetext = 'Allows civil unions.'
  } else if (stateData[position]['dpartnership'] != '' && stateData[position]['marriageban'] != '') {
    overviewObj.marriage = 'key-marriage-ban'
    overviewObj.marriagetext = 'Allows domestic partnerships; does not allow same-sex marriage.'
  } else if (stateData[position]['dpartnership'] != '') {
    overviewObj.marriage = 'key-union'
    overviewObj.marriagetext = 'Allows domestic partnerships.'
  } else if (stateData[position]['marriageban'] != '') {
    overviewObj.marriage = 'key-ban'
    overviewObj.marriagetext = 'Same-sex marriage is illegal or banned.'
  } else {
    overviewObj.marriagetext = 'No action taken.'
    overviewObj.marriage = 'key-none'
  }

…and it continued for another 100-odd lines of code. This wasn’t going to be as easy as I had hoped. Any other options? Well, what if I could extract the contents of the overviewObj? Could I write it out to a file?

I tried a “Watch” using the developer tools, but the variable went out of scope each time I hovered, so that wouldn’t be useful. I’d therefore try saving flat.html locally and outputting a file with the contents to my local drive…

As I say, I’m no coder (but perhaps more comfortable than some), so I googled (and googled) and eventually stumbled on this post:

http://stackoverflow.com/questions/16376161/javascript-set-file-in-download

I therefore added the function to my local main.js and added a line to the populateHoverBox function… okay, so maybe I can code a tiny bit…

var str = JSON.stringify(overviewObj);
download(str, stateData[position].state + '.txt', 'text/plain');

In theory this should serialise the overviewObj to a string (according to Google!) and then download the resulting data to a file called <State>.txt.
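For reference, the helper from that post works along these lines – a minimal sketch for modern browsers (the actual Stack Overflow answer includes fallbacks for older ones): wrap the string in a Blob, then click a temporary link with a download attribute.

function download(data, filename, type) {
  var blob = new Blob([data], { type: type });   // wrap the string in a Blob
  var url = URL.createObjectURL(blob);
  var a = document.createElement('a');
  a.href = url;
  a.download = filename;                         // suggested filename for the save
  document.body.appendChild(a);
  a.click();                                     // triggers the download
  document.body.removeChild(a);
  URL.revokeObjectURL(url);                      // tidy up the object URL
}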

Now for the test…..

[Animation: files downloading as I hover over each state]

BOOM, BOOM and BOOM again!

Each file is a JSON file:

[Screenshot: the contents of one of the downloaded JSON files]

Now to copy the files out from the downloads folder, remove any duplicates, and combine using Alteryx.

[Screenshot: the Alteryx workflow combining the files]

As you can see, using a wildcard input on the resulting JSON files and a Transpose tool made this simple.

[Screenshot: the transposed output]

Finally, to combine with the Google Sheet (called “Extract” below) and the hexmap data (Sheet 1) in Tableau…

[Screenshot: the Tableau data source combining “Extract” and Sheet 1]

Not the most straightforward data extract I’ve done, but I thought it was worth blogging about so others could see that extracting data from online visualisations is possible.

You can see the resulting visualisation in my previous post.

Conclusion

No one taught me this method, and I have never been taught how to code. The techniques described here are simply the result of continuous curiosity and exploration of how interactive tables and visualisations are built.

I have used similar techniques in other places to extract data from visualisations, but no two methods are the same, nor can a generic tutorial be written. Simply have curiosity and patience, and explore everything.

 


“Fitted” Gantts in Tableau

The Challenge

During Makeover Monday this week (week 22) I came across a problem: I needed to produce a Gantt chart for a huge number of overlapping date ranges. A Gantt was really the only way to go with start and end dates in the data (in the back of my head I’m thinking Mr Cotgreave will be loving this data, given his fascination with the Chart of Biography by Priestley), and I was fixated on showing the data that way (I blame Andy), but everything I tried in Tableau left me frustrated.

Jittering left wide areas of open space and no room for labels, and even zooming into one area would leave lots of the data hidden.

[Screenshot: the jittered Gantt attempt]

I knew what I wanted to do… I wanted to neatly stack / fit the bars in an arrangement that optimised the space and showed as much data as possible at the top of the viz. The original author, in the link for the makeover, had done just this:

Now, Makeover Monday usually has a self-imposed “rule” that I tend to adhere to: spend an hour or less (if I didn’t stick to this I could spend hours). But here I was, half an hour in, without any real inspiration except something I knew wasn’t natively possible in Tableau. It was a challenge, and to hell with the rules – I do like a challenge, especially as the public holiday in the UK meant I had a little time.

The Algorithm

So I turned to Alteryx. But how to stack the bars neatly?

Firstly I needed a clean dataset, so I fixed some of the problems in the data (blank “To” dates and negative dates) using a few formulas, and then I summarised the data to give me just a Name and the From and To dates of each life.

Algorithm-wise, I wanted to create a bunch of discrete bins, or slots, for the data. Each slot would be filled as follows:

  1. Grab the earliest born person who hasn’t been assigned a slot
  2. Assign them to a slot
  3. Find the next person born after they die, and assign them to the same slot
  4. Repeat until present day

In theory this would fill up one line of the Gantt; then I could start again with the remaining people (the sketch below shows the idea in code).
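In code, the idea is a greedy pass over people sorted by birth year. Here’s a sketch in JavaScript of the logic I was aiming for (the real thing was built in Alteryx; years BC are just negative numbers):

function assignSlots(people) {
  // people: [{ name, from, to }] – work on a copy sorted by birth year
  var unassigned = people.slice().sort(function (a, b) { return a.from - b.from; });
  var slot = 0;
  while (unassigned.length > 0) {
    slot++;
    var lastDeath = -Infinity;
    var leftover = [];
    unassigned.forEach(function (p) {
      if (p.from > lastDeath) {  // born after the slot's current occupant died
        p.slot = slot;
        lastDeath = p.to;
      } else {
        leftover.push(p);        // wait for the next slot
      }
    });
    unassigned = leftover;       // repeat with whoever is left
  }
  return people;
}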

An iterative macro would be needed, because I would step through the data, then perform a loop on the remainder. First, though, I realised I needed a scaffold dataset, with all the years from the first person (3100 BC) to the present day.

I used the Generate Rows tool to create a row per year, and then joined it to my Name / From / To data to create a module that looked like this:

[Screenshot: the scaffold module]

Data:

[Screenshot: the scaffold data, with an empty “slot” column]

I’d fill the “slot” variable in my iterative process. So, next up: my iterative macro.

Translating the above algorithm, I came up with a series of multi-row formulas:

[Screenshot: the iterative macro with the multi-row formula tools]

The first multi-row formula would assign the first person in the dataset a counter, which would count down from their age. Once it hit zero it would stay at zero until a new person was born, at which time it would start counting down from their age.

The second multi-row formula would then look for counters that had just started, to work out who had been “assigned” in this slot, and give them the iteration number of the macro – i.e. the first run would see people going into slot 1, the second into slot 2, etc.
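To make the mechanics concrete, here is the row-by-row logic of those two formulas sketched in JavaScript (my reconstruction of the logic, not the actual Alteryx expressions; rows is the scaffold, one row per year in order, and iteration is the macro’s iteration number):

function assignOneSlot(rows, iteration) {
  var counter = 0;
  rows.forEach(function (row) {
    if (counter === 0 && row.person && row.person.slot == null) {
      counter = row.person.to - row.person.from;  // formula 1: start counting down their lifespan
      row.person.slot = iteration;                // formula 2: they're assigned to this slot
    } else if (counter > 0) {
      counter--;                                  // formula 1: slot still occupied, keep counting
    }
  });
}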

Perfect! Now to run it and attach the results to the original data:

[Screenshot: the macro results attached to the original data]

Easy peasy Alteryx-squeezy. That took me 30 mins or so, really not a long time (but then I have been using Alteryx longer than you….practice makes perfect my friend).

The Viz

So now back to Tableau:

[Screenshot: the fitted Gantt in Tableau]

Neat, progress! Look at how cool those fitted Gantt bars look. Now what…

Well, I need to label each Gantt bar with the individual’s name, but to do that I really have to make my viz wide to give each one enough space…

[Screenshot: labelling attempt on a 4,000-pixel-wide dashboard]

The labelling above is on a dashboard at the maximum 4,000 pixels wide… we need wider! But how? Tableau won’t let me…

Let’s break out the XML (kids don’t try this at home). Opening up the .twb in Notepad and….

[Screenshot: the .twb XML with the width attributes highlighted]

I changed the highlighted widths and, lo and behold, back in Tableau – super wide!
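For reference, the element I edited sits inside the dashboard section of the .twb and looks something like this (a sketch from memory – exact attribute names and values may vary by Tableau version):

<size maxheight='800' maxwidth='20000' minheight='800' minwidth='20000' />

Setting the min and max widths to the same oversized number forces a fixed, extra-wide dashboard that the UI alone won’t allow.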

Now I can label the points but what do I want to show – those Domain colours look garish….

So I highlighted Gender and… pop! Out came the women from history – a nice story, I thought to myself. I decided not to add a commentary; what the viewer takes from it is up to them (for me, I see very few women in comparison to men).

Other decisions

  • I decided to reverse the axis to show the latest data first and make the reader scroll right for the past, mainly because the later data is more interesting.
  • I decided to zoom in at the top of the viz; generally I expect viewers won’t scroll down to the data below, but while I toyed with removing it I decided that leaving it was the slightly better option. The top “slots” I’m showing are arbitrarily chosen, but I feel this doesn’t spoil the story.
  • I decided to add a parameter to highlight anything the user chose (Gender or Occupation) – tying it into the title too.
  • I fixed AD / BC on the axis using a custom format (see below).
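For anyone wanting to replicate the AD / BC trick: Tableau’s custom number formats use positive;negative sections with quoted literals, so something along these lines does it (my best recollection of the exact string):

0 "AD";0 "BC"

Positive years get the AD suffix; negative years are shown unsigned with the BC suffix.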

[Screenshot: the axis with the AD / BC custom format]

Conclusion

So I spent a couple of hours in total on this, way more than I planned, but that’s what I love about Makeover Monday – it sets me challenges I’d never have had if I hadn’t been playing with the data. I’ve not seen this done in Tableau before, so it was a fun challenge to set myself.

Click on the image below for the final viz

[Screenshot: the final viz – click through for the interactive version]

Best of Alteryx on the Web – November 2014

Another busy month in the Alteryx blogosphere and so here are some links to some of the best content you may have missed.

Tips and Tricks

3danim8’s Blog – How and Why Alteryx and Tableau allow me to innovate  – Part 1 and Part 2

The Information Lab – 7 Alteryx Tips you need to start using today

Inspiring Ingenuity – Alteryx – Optimising  modules for Speed

The Information Lab – Bite-sized Tips, Tricks and Tutorial Videos for Alteryx

Commentary

Alteryx.com – Data Blending for Dummies – Special Edition

Schiolistic Ramblings – The Business User and BI: Analytics, Visualisation and Testing

Alteryx.com – 5 Myths of Data Blending

Antivia – From raw data to interactive dashboard in minutes

Tool Guides and Macros

The Information Lab – What Time is it Alteryx – Part 1

Human Data Associates – Visualize all Dutch Cities and neighbourhoods in Tableau (nothing good happens without Alteryx)

Alteryx Gallery – X-Ray Browse Macro

Think I’ve missed something, or got something worthy of next month’s roundup? Please reach out on Twitter (@ChrisLuv) or in the comments below.

The Business User and BI: Analytics, Visualisation and Testing

Excel is much maligned. One reason is that it’s so easy to make mistakes; take the recent news that Tibco investors “lost” $100 million due to a spreadsheet mistake, or the story last year of an error in the Excel calculations of a 2010 paper that has been widely cited to justify budget-cutting and austerity measures across the US and Europe.

Spreadsheet Hell

The reason these mistakes were even possible is that spreadsheets are so easy to use that they fall outside the usual testing procedures applied to other business-critical applications. The people using Excel are not skilled developers; they have no development background, and so checks are not simply missed, they are not even considered.

Now, as I see the take-up of BI analytics tools like Tableau, Alteryx and their brethren accelerate and start to become locally ubiquitous, I feel it is important to step back and issue a warning:

Excel, Tableau, Alteryx and other similar tools are development tools, and as such should be held to the same standards of quality assurance and documentation as any other development project.

Now, when I write “development tool” I mean that when you are writing an Alteryx workflow, or a Tableau visualisation, or an Excel spreadsheet, you are writing a program – a set of instructions for the software to interpret. In C# you do this by writing lines of code, in Tableau by dragging pills and placing them on panes, in Alteryx by dragging and configuring tools on the canvas, and in Excel by using formulae in cells. Regardless of how you get there, this set of instructions needs documenting and checking.

In days gone by, all “programming” was done by skilled developers trained in unit tests, user acceptance testing and other disciplines, with Quality Assurance teams who had dedicated resource to find bugs and issues. Step forward five years and those same routines and reports are now being written and promoted by business users with no formal development testing.

Could we be in a position in 5 or 10 years from now where we see headlines such as:

“Tableau mistake costs manufacturer $500 million”

I doubt it – neither Alteryx nor Tableau suffers from the horrible obfuscation that a nested VLOOKUP can bring – but I am making a serious point that needs considering during any BI deployment driven by the business.

Solutions

As a BI consultant I’ve been asked about this testing problem many times, and so below I set out the common practices I’ve employed as I’ve led BI deployments as a consumer, and later as a consultant.

Purple People

IT has already learned the lessons of testing and has many best practices and methodologies laid out to catch and deal with these issues. Most of these best practices can be applied directly to BI in the business universe, but they need experienced leaders from the IT world, who understand the business problems, to help implement and maintain them.

Purple people are the solution – ideal if they come dressed as this guy – but regardless, they should have skills from IT combined with an understanding of the business world, bringing the red of IT and mashing it with the blue of business. Employing a few purple people when recruiting a BI team can bring many advantages, and the rigour of process they bring from the IT world can really help drive testing and quality assurance.

Peer Review

My first manager and mentor, Simon Hamm, instilled this practice in me and I will never forget it; over the years it has caught many mistakes in teams and projects I have run. It is simple and can be applied to any BI workflow or dashboard.

The premise:

Nothing leaves the team until it has been checked by a peer; that person is responsible for finding the issues you will inevitably have made.

Peer review should be a cornerstone of the business; it should apply to everyone, and the assumption during checking should be that there are errors to find. Senior members of the team are not exempt from errors (often they are the worst culprits, as they take on pressure work and publish it quickly).

In the corporate world, implementing a peer review policy can also be backed up by personal objectives, instilling an ethos of checking rather than a culture of blame – e.g. replace an objective saying “In the next 6 months your reports should produce no errors” with “Everything you produce should be reviewed by a peer”. People make mistakes; removing the blame increases efficiency and morale, and shifts the onus onto ensuring the policy designed to catch those mistakes works.

Smoke Testing

Fact checking and other simple functional tests should be part and parcel of the early part of any report / data testing process. Producing a dashboard on the number of people in the UK? Then check the top-line numbers look right. It’s often said the devil is in the detail, but it’s important that the broad numbers are checked first – I’ve seen situations where reports that had been “checked extensively” contained glaring mistakes in basic headline figures, missed because of the focus on the detail. I imagine this could have been the issue with the Tibco numbers.

Replicate

This sits firmly alongside peer review: replicating the results of the module / workbook should be the mainstay of the checking process and is the responsibility of the peer “checker”. Thankfully, tools like Tableau and Alteryx (and other rapid development BI tools) make this easy. Checking a Tableau report? Use Alteryx to do some ad hoc analysis and check the numbers. Checking an Alteryx workflow? Drop the data into Tableau and do some visual checks.

Hand crank a few rows of data, say for an individual or product, through the entire process – are the results what you’d expect? Checking a few rows is much simpler than checking 10 million.

Trend Analysis

Ensure processes track month-on-month (MoM) and year-on-year (YoY) trends; small data quality issues can be difficult to pick up and will only manifest themselves over time. Keeping headline QA (Quality Assurance) figures for key datasets can help track these trends and pick out issues with data processing.

Unit Testing

Modular workflows like Alteryx’s are easy to build, but when building them people need to ensure that checks and balances are built into the logic; tools like Message and Test can be used to build simple checks – e.g. are joins 100%? Are there duplicate records? Build in outputs at each of these key stages and ensure these row-level error logs are checked if they contain data. Without these checks, modules can run unattended for a long time before anyone notices that key lookup tables haven’t been updated and data has been dropped during the process.

User Acceptance Testing

With rapid development BI comes a whole new paradigm: UAT can and should be done in an agile and flexible way. Often business users are building their own reports, but even if not, co-locating individuals can lead to a much better experience for both parties.

Documentation, Documentation, Documentation 

Just do it! Documentation doesn’t have to be dry and Word / Visio based though.

Annotate requirements in the tool itself (both Alteryx and Tableau provide a rich set of tools to allow users to do this as they build workflows and dashboards, and other tools have similar features). Comment formulas and use visual workflows to provide commentary on the analysis and decisions. Hide and disable dead ends / investigations but don’t delete them – they are as useful as the finished result, as they show the development process.

Also document the checking processes: released a report with an error? Learn from it. Keep a diary of checks for each dataset to refer back to; there’s nothing worse than a mistake that’s repeated needlessly later.


Conclusion

Spiderman was once told “With great power comes great responsibility”, and that’s never been more relevant than to the new users picking up the BI tools of tomorrow. Throwing away 30+ years of software development lessons would be a shame; it’s important those lessons grow and change along with the tools.

Comparing against Next Generation – it’s Tough

I want to take you back in time in my time machine, back to the 1900s and the new age of the automobile. Henry Ford has not yet perfected the mass production of the motor car; cars are still the preserve of the rich, and the average car is expensive – putting it out of the reach of the average family. Though the car market is booming like never before, it is still very small.

We land the time machine and I give you a simple job: help me sell the modern car to the people of the 1900s. Easy, right? Let’s see how things might pan out for you…

Look what you’re up against: it’s archaic, a relic from a bygone age. You set up a stand advertising a new way – a cheaper, modern alternative to the old way of doing things, effectively democratising the automobile for everyone. It will allow longer, faster journeys, and with the effects of (de)inflation your cars are cheap enough for anyone to afford. Surely this will be a piece of cake.

A portly gentleman in a bowler hat, clearly of means, pulls up and climbs down from his motor car.

“Sir”, you say, “would you like to take a trip with me in the car of the future? I feel confident it will revolutionise how you think about driving”.

“I don’t think we need to do that”, he counters, “A motor car will take you from A to B, they’re all the same really aren’t they? I don’t need to see it to believe it will work, I’ve seen hundreds of cars.”

Okay so a test drive would have helped you show him what he was missing, but it’s not really necessary as you have a compelling argument.

“Well Sir, my modern car is slightly different. Mine will take you from 0 – 60mph in just a few seconds, and will go considerably faster if you want it to, and what’s more everyone is driving them where I come from.”

 

1918 Oakland Tribune – click to read the full article

“?!” A look of absolute horror crosses your new friend’s face. “I don’t think we want that now, do we? They’ll kill themselves. Anyway, we have a 20 mph speed limit in the 1900s. Whatever next? Ha, you’ll be telling me you let women drive the blasted things!”

This last statement takes you aback – you’d forgotten the prejudice of this bygone age – but you try not to let it show. You give a nervous laugh and carry on unfazed.

“Sir, my car is easily afforded by even an average family; everyone should be able to go from A to B no matter what their social standing.”

Another harrumph. “I doubt it can be as well made as Mr Benz’s machines in that case; his are expensive for a reason. They are quality machines, not just for anyone.” He’s not convinced by your arguments.

“I assure you there’s no difference in quality, sir, and in fact mine is easier to use. I mean, take a look at the enormous hand-crank you need to use to get yours started; it doesn’t look easy to get her going.”

The gentleman smiles, clearly proud; he leans back and pulls on his braces, now in his element. “It isn’t, but I’ve become quite the master, I can tell you; on a cold morning I can start her in under 15 minutes.” He looks for your approval, but you frown. His smile wavers when you say, “But I can start mine immediately, with a tiny key…”, though only fleetingly.

“Well, that tiny thing won’t work, old chap.” He’s enjoying himself now, clearly starting to think you’re a bit of a nutcase. “You’ll never get the engine turned over with that, will you?!”

Your patience is wearing thin. “Well, with our way of doing things in the future, we don’t physically turn over the engine, we simply…”, but your friend is clearly not listening. He interrupts: “Listen, my friend, no car’s going to get started unless you turn over the engine. I’m an expert on these things; that’s the way we do things here.”

You make to continue the conversation, but the gentleman stops you. “Listen, sonny, I’ve heard it all before and it’s poppycock. Look at what I was offered last week.” He shows you a picture:

As you look over the picture he continues, “That thing looked more like a car than that monstrosity you’re touting, and that didn’t work. Why should yours be any different?” …and he turns on his heel and walks away.

Okay, you get the picture, analogy over…

Clearly it’s difficult to imagine the next generation, especially when you compare it to the standard today. It’s also only too easy to translate the message above back into software and “Next Generation” Data and Analytics. The market is still only just beginning and we don’t know what’s around the corner, but here are some thoughts on how to open your eyes to the potential that might be there:

1. Make sure the first thing you do is take a demo. Comparing features, particularly against the “standard”, can only get you so far, and your list of features will undoubtedly miss the point – you can’t add features you don’t know about. You’ll be unduly biased towards the status quo.

2. Be willing to change. Democratising data isn’t easy; no one said it would be. It involves turning things on their head and perhaps getting a little bit uncomfortable. You might not be ready to drive at 80 mph yet, but you might want to get out of first gear (or even let the women drive!).

3. Take it for a test drive yourself. That propeller-driven car might look great in the demo, but can you – in fact, everyone who has access to data – take it for a spin? Again, don’t expect it to be all plain sailing; you might hit a few bumps, but it should be a considerably smoother ride than you’re used to.

4. Be prepared to accept something that looks a bit different from what you’re used to.

With that in mind I’m going to sign off, and hopefully see you at Tableau #data14. Make sure you check out Alteryx while you’re there…you might just be seeing the future…

 


For data’s sake have some fun

We live in times when data analysis as a career is very much in the limelight; Nate Silver, Data Science and Big Data have all helped glamorise it. However, data analysis has the perception of being a dull, rather lifeless job – hours poring over spreadsheets looking at numbers, or creating complex models using lines and lines of code; whether or not this is true will largely depend on which tools you’re using.

I recently heard Tom Brown of The Information Lab, for whom I work, talk about his career and how he ended up using Tableau. Tom described his life before Tableau, using other BI products, as “dull”, and this echoes what I hear from a lot of the people I speak to who have started using the new breed of data analysis tools and are having fun with data for the first time. My own career started in SAS and SQL; I enjoyed my job, but I don’t remember ever calling it fun. For me, my career only started becoming fun when I picked up Alteryx.

So what makes these tools fun?

Tableau and Alteryx aren’t the only fun data analysis tools, I’m sure, but they’re the ones I’m most familiar with, so from there I can speak more generally about the characteristics they share, and where other BI software manufacturers should look if they want to emulate some of the success of Tableau (and, increasingly, Alteryx) at capturing users’ imagination and creativity.

1. Ease of use

First and foremost, to be fun software has to be easy to learn and intuitive; it has to have a level of ease of use that means users can dive right in and start using the product immediately. It has to have a clean, fresh interface that removes the complexity from the data analysis and breaks the analysis down into a set of simple, repeatable steps. Tableau achieves this by giving the user just one screen to build visualisations on and a simple drag-and-drop interface. Alteryx, on the other hand, takes a modular approach, providing tools – all configured in the same way – that are dragged onto a canvas and joined together to form a data flow. Neither tool has any complex code for users to write, again increasing simplicity.

2. Remove the mundane

No one likes repetitive or mundane actions, and they can quickly take any fun away from using BI tools. I think everyone has experienced the frustration of using Excel and having to copy/paste cells to move them around, or having to write multiple formulas repeating the same thing for several files. Alteryx and Tableau both contain several neat shortcuts that remove the mundaneness; simple things like using wildcards in an Alteryx input tool to bring in multiple files with the same structure are a real blessing when needed.

3. Enable creativity

To really become fun, though, tools must go further than being easy to use; they must give their users the freedom to create something. Tableau and Alteryx have this in spades; I could ask 10 Alteryx experts to solve a problem and they would all use different tools and approaches – no two modules would look the same. This is part of the appeal for me: solving a problem isn’t about finding the right way, it’s about finding a way. Similarly with Tableau, as the recent Iron Viz challenges have shown, a subject can be tackled in many different ways, leading to some informative and visually stunning visualisations. User communities that share their work and grow together as they collaborate are also key to having fun, and the Alteryx Gallery and Tableau Public both enable this. You only need to look at some of the apps and dashboards on there to know that users are really having fun with these products.

4. Mass appeal

Data analysis and BI have mass appeal; Excel is the most widely used BI tool and shows what mass appeal can provide. So to truly become fun for everyone, tools must go beyond the niche of Data Scientists / Data Analysts and appeal to everyone with a data background. Usability plays a part in this, but they must also solve a range of problems across a wide range of industries. As people use a tool to solve a diverse set of problems, their enjoyment grows.

Why is having fun with data important?

I’m talking about fun for a reason: not only because I think it’s important for people to feel a sense of worth in what they do and to go to work with a smile, but also because having fun leads to innovation and growth. If people are having fun with data we’ll learn more about what it can offer, and build richer models and better insights. Part-time data journalists, working at the weekend, will explore public datasets and produce insight and intelligence to improve policy and inform the wider public about key issues. The universe of data is growing exponentially – every gadget and piece of tech now includes an array of data tracking – but the skills to interpret and work with data are still catching up. BI and analytics companies have a responsibility to provide fun tools so that children don’t just experience Excel at school; the world is more fun than Excel. Believe me, I’ve seen and used the tools of the future, and they’re fun.

Full disclosure: my love of data and analytics, and in particular Alteryx and Tableau, has led me to work for The Information Lab, an Alteryx and Tableau partner and reseller.

 

The Art(isan) of Data Analysis

Firstly, an announcement – I’m moving jobs. From the start of January I’m very pleased to say I’ll be working at The Information Lab, one of the longest-standing Tableau Partners in the UK and Tableau’s EMEA Partner of the Year; they also very recently became Alteryx partners. I approached Tom, Craig and the team because they have clearly demonstrated a passion for Tableau that mirrors my own passion for Alteryx, and, having got to know the ethos of the company and their values, I’m very excited for what the future holds – for me, my new colleagues, and also for Tableau and Alteryx.

All this has got me thinking about our role and how we describe what we do. For their part, Alteryx coined the term Data Artisan to describe the people using their software: often people without “analyst” in their job title, but who find themselves needing to solve problems without coding or IT departments. To be honest I never really got it, but with my new role I started considering the name again, and, considering my own situation with Alteryx and Tableau, it started to make sense.

For starts let’s look at what those words mean and their origin:

Data, “facts and statistics collected together for reference or analysis”, is the nominative plural of datum, originally a Latin noun meaning “that which is given”.

Artisan (according to www.oxforddictionaries.com) is a worker in a skilled trade, especially one that involves making things by hand. It has its origins in the mid 16th century: “from French, from Italian artigiano, based on Latin artitus, past participle of artire ‘instruct in the arts’, from ars, art- ‘art'”.

Okay, so technically, yes: being in a skilled trade working on facts and statistics for analysis or reference, I can call myself a Data Artisan. More specifically, my new role will involve instructing others in “the arts”, so this also rings true.

[Image: an artisan from the 15th century]

So, I’m a Data Artisan technically – what about practically? Well let’s consider the tools of my trade:

Data – the raw materials / elements I work with

Alteryx – the tool of choice for data munging / data reshaping / data blending

Tableau – the tool of choice for data visualisation

The Dashboard –  a representation of how the analysis looks that helps people understand the overall story

What about an Artisan’s tools of choice? Let’s consider a painter:

Paint – the raw materials / elements (s)he works with

Palette – the tool of choice for paint blending

Canvas/Brushes – the tool of choice for paint visualisation

The Painting – a representation of how the scene looked that helps people understand the overall story

…and like an artist, a “Data Artisan’s” skill in telling the story means the result becomes greater than the sum of its parts – and they can represent analysis in very different ways by skewing their visualisation towards their own view or political bias.

So looking at it this way then I’m left to think perhaps I am a Data Artisan after all…

As a final, perhaps fatal, push on the metaphor I’d like to ask…would an artist mix his paints directly on the canvas? Would an artist paint his picture on his palette? If you’re a Tableau or Alteryx user then there’s no need to compromise on the end result – make sure you’re being true to your art because Alteryx and Tableau used together are the only way to true masterpieces. [okay I got a tiny bit cheesy there but you get the idea!]

Having said all that, I don’t think I’ll be calling myself a Data Artisan too often. I think Paul Banoub (The VizNinja!) said it best:

“… call yourself whatever you want. Call yourself a Ninja, or a Jedi or a Yeti or a data rockstar. I don’t care. Just keep on pushing the boundaries and discovering. You should be proud of yourself for trying.” – Paul Banoub

In future my blogging efforts will be mainly on The Information Lab Blog but I will continue to add things to this blog on a less frequent basis, and will be reviewing the best of the Alteryx and Tableau community in regular posts here.

Thanks for reading.

Addendum

As a tease, here’s the kind of thing you can create in Tableau if you mix your data in Alteryx first. Check in with the Info Lab in the New Year to find out how.

[Image: a preview of the kind of visualisation you can create]