The Data Lab
The numbers begin to add up…
PROFILE: The Data Lab
WHERE: Edinburgh, Aberdeen, Glasgow
FUNDING: £11.3 million from the Scottish Funding Council
The amount of data being created by users on smartphones, computers and corporate systems is increasing exponentially year after year, but our ability to make sense of this digital explosion is also improving, as data scientists help business and government use information to strategic advantage – to save resources and improve our way of life. And The Data Lab is leading the way in Scotland by taking on new challenges which call for the most innovative use of data science in everything from analysing crop yields to how children travel to school...
Data analytics is not a new science – for example, major retailers have used it for years to study customer behaviour so they can target products and promotions more effectively. The main advantage of data science today is that the hardware and software have improved so much in recent years that analysing vast amounts of data is now much more affordable and practical. “It's worth doing jobs which were simply not worth doing before,” says Gillian Docherty, the CEO of The Data Lab.
Unlike data science, however, The Data Lab is something new, and is also adopting a novel approach to the science. One of the eight innovation centres recently set up in Scotland, The Data Lab is driven by industry needs and is also now headed by someone with over 20 years’ experience in industry, working for computer giant IBM.
Six months after she left IBM to be part of the new innovation centre, Docherty stresses that The Data Lab only supports projects which themselves are innovative and could not be handled by commercial data science firms. “If the project isn't innovative,” she explains, “it will not be funded by us.”
The Data Lab has three criteria for projects:
1 It must have some commercial impact in terms of creating new jobs and driving
2 The Data Lab and its academic partners in Scotland must have the ability to execute the project, attract commercial sponsorship and provide follow-on value.
3 It must be innovative in terms of the science involved or novel in the way it is applied – for example, the technique may have already been used in the airline industry, but not in oil and gas or engineering.
One of the most difficult challenges is how to measure the impact of any given initiative, but Docherty explains that every client is asked about impact right from the start of the project and is expected to provide extra details at various stages thereafter.
The economic contribution of The Data Lab is key to its long-term success. It was set up to help Scotland “capitalise on the growing market in analytics and ‘big data’ technology,” and its initial target was to create about 250 jobs and generate over £100 million in terms of economic value, over the first five years – for example, boosting oil recovery in the North Sea or saving millions in the healthcare sector, as well as other areas including climate change and manufacturing. The 17 projects planned so far are expected to create about 140 jobs and generate economic value of about £35.5 million; but the impact will go far beyond these headline figures – the benefits of optimising data also have a knock-on effect by freeing up more capital from savings in costs and enabling redeployment of resources. Meanwhile, The Data Lab itself has been busy creating new jobs, with staff numbers reaching 20 people this year, based in the Edinburgh headquarters and in hubs in Aberdeen and Glasgow.
“We're motoring,” says Docherty, “and collaborative projects are driving us forward. The concept is becoming reality now.”
The Data Lab was designed to “transform the nature of collaboration between industry, public sector and academic partners,” and collaborative innovation will continue to be central to its mission, working on projects where “the outcome is greater than the sum of the parts.” For example, it is working with CENSIS and Alexander Dennis, “the world's leading supplier of double-decker buses,” to investigate the wider use of vehicle sensors to monitor air quality and engine performance, as well as live passenger movements, in order to reduce pollution and improve traffic management.
Another initiative, working with DHI (the Digital Health & Care Institute) and SMS-IC (the Stratified Medicine Innovation Centre), is designed to improve cancer care, while Docherty is currently in talks with CENSIS and the Scottish Aquaculture Innovation Centre to measure the environmental impact of fish farms, not just to improve productivity and prevent pollution but also to shape future government policy.
Skills and training are another priority, including online learning, summer schools, the funding of new university courses and industry placements for students. The Data Lab is sponsoring 40 students at the University of Stirling, Robert Gordon University and Dundee University. As well as paying their fees, it brings them together at regular workshops with industry partners for training and networking sessions. Speakers have included representatives from Skyscanner and IT recruitment specialists MBN Solutions, who talked about the opportunities in industry, and Dr Malcolm Fairweather from The Institute of Sport in Stirling, who described the role of data science in elite sports. Docherty also reveals that 60 companies have come forward to ask about employing the first wave of graduates, and the first “Data Talent Scotland” event is also underway, bringing industry partners together with up to 250 MSc students.
There are now 14 MSc degree courses in data science in Scotland, and international demand will increase. A future brain drain would be a nice problem to have, “but we want to make sure we keep all this talent in Scotland,” says Docherty.
Community building is also a major activity. For example, The Data Lab recently sponsored eight Scottish companies to go on a Scottish Enterprise/SDI (Scottish Development International) mission to a data science conference in New York (Big Data and Analytics in Finance); the trip also included visits to companies including Verisk Analytics, Dataminr and The Data Incubator. “There are fantastic opportunities for business there,” says Docherty, “and three US companies have also shown interest in coming to Scotland to set up data science operations, knowing that we have the talent they need. This also helps put Scotland on the map from a data science perspective.”
Community building has also led to a new partnership with DataKind, a network of volunteers who help third-sector organisations take advantage of data science. “Actionable insights” are the key assets of data science, but DataKind believes that “the same algorithms and techniques that companies use to boost profits can be leveraged by mission-driven organisations to improve the world; from battling hunger to advocating for child wellbeing and more.” DataKind has already taken part in Data Lab events, and the two organisations will launch a new project in 2016 that brings together leading volunteer data scientists and charities to help the charities gain new insights.
The key to success for all the innovation centres was to “put industry people in at the core of the organisation,” and Docherty certainly fits this description. She spent 22 years with IBM in London and Scotland, returning north in 2005. Her last two years were spent as Software Leader of IBM's Scotland Enterprise Business Unit, after three years as Scotland Territory Leader. Before then, she was a technical and sales specialist, and sales manager for the Systems and Technology Group. Her experience with IBM's “client-facing team” was also a good preparation for her current job, working with clients in both public and private sectors.
At IBM, Docherty also developed her knowledge of data science – her former employer is the world's biggest provider of data analytics solutions. Among the projects handled by her team in Scotland was working with Edinburgh City Council to develop a “visual dashboard” to monitor activities in different departments and provide “a single view of the citizen” to improve services and prevent fraud. And this kind of initiative could soon be launched in every industry in Scotland.
The Data Lab has had ambitious targets from the start, and as the other innovation centres become more established, it will play an increasingly critical role at the centre of various ground-breaking projects – not just crunching numbers, but turning raw data into valuable assets, for government and business organisations in Scotland.
More power to Aggreko
Glasgow-based Aggreko is a leading provider of modular, mobile power and temperature control solutions to various industries, including events, mining, oil and gas, petrochemical and refining, construction and manufacturing. The company has two business units: Rental Solutions, which operates in developed markets and hires its equipment for customers to operate; and Power Solutions, which supports emerging markets by operating as a power producer – for example, by supporting a country’s national grid or providing short-term power for a global sporting event or remote mining operation. In 2015, the company, which employs about 7,300 people globally, supported 55,000 customers and had 10,000 assets in the USA alone. Aggreko designs and manufactures its equipment in Dumbarton and is working with data scientists from the University of Strathclyde, using the latest machine-learning techniques and telemetry technologies, to develop a system able to predict equipment failure based on analysing historical data and comparing it with current symptoms. This technology will help its engineers proactively resolve issues, to help maintain the reliability of equipment and improve customer service.
Global Surface Intelligence
Edinburgh-based Global Surface Intelligence (GSI) has developed a powerful software solution which analyses satellite and other data to predict future conditions on the ground, so clients can make better environmental and business decisions. Working with a team of geoscientists from the University of Edinburgh, the company is studying satellite data, then taking account of the weather, to predict future crop yields – not just to help farmers but also commodity brokers, whose decisions can make or break investors.
Amiqus, the latest technology start-up from Edinburgh, supported by the RSE Enterprise Fellowship programme, is developing automated legal tools which use open data to help predict the outcome of claims and disputes, so legal issues can be resolved without necessarily going to court. Using natural language processing, the software searches through case histories and court information and compares them against current disputes, so all parties involved can make informed decisions over how to proceed – potentially saving significant time, money and court expenses.
The Data Explosion
> Facebook users share nearly 2.5 million pieces of content.
> Twitter users tweet nearly 300,000 times.
> Instagram users post nearly 220,000 new photos.
> YouTube users upload 72 hours of new video content.
> Apple users download nearly 50,000 apps.
> Email users send over 200 million messages.
> Amazon generates over $80,000 in online sales.
Before the latest Star Wars premiere, data scientists Richard Carter and Roman Popat from The Data Lab analysed the names of several hundred characters from the popular films to determine which language is the most likely source of each name.
After collecting over 500 names from Wikipedia, the data scientists used artificial intelligence and natural language processing to analyse the names, breaking them down into a sequence of character strings – e.g. “Luke” decomposes into “l”, “u”, “k”, “e”, “lu”, “uk”, “ke”, “luk” and “uke.” Using software called textcat (text categorisation), they compared the strings with dozens of languages and worked out the probability of any given name coming from any of the languages. Even though the methods used are highly sophisticated, the data scientists are keen to point out that they did it for fun and that the results are not meant to be taken too seriously.
The research did throw up some interesting conclusions, however – including an apparent connection between Scots and the names of Jabba the Hutt and the rest of the Hutt “clan”.
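The first step described above, breaking a name down into short character strings, can be sketched in a few lines of Python. This is only an illustration of the decomposition, not the code the data scientists used: the function name `char_ngrams` and the `max_n` parameter are invented for this example, and the later step of comparing the strings against language profiles with textcat is not shown.

```python
def char_ngrams(name, max_n=3):
    """Return every run of 1 to max_n consecutive characters in a name.

    Hypothetical helper for illustration; the name is lower-cased first,
    and shorter strings are listed before longer ones.
    """
    text = name.lower()
    grams = []
    for n in range(1, max_n + 1):
        # Slide a window of width n across the name.
        for start in range(len(text) - n + 1):
            grams.append(text[start:start + n])
    return grams


print(char_ngrams("Luke"))
# ['l', 'u', 'k', 'e', 'lu', 'uk', 'ke', 'luk', 'uke']
```

The output for “Luke” matches the nine strings listed in the article. A name shorter than three letters simply yields fewer strings.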
The Rise of the Scooter
According to Roman Popat of The Data Lab, “every child’s journey to school is an adventure of sorts,” and one of his recent projects was to analyse how children in Scotland travel to school – and in the process demonstrate how data science can help us to understand trends in social behaviour.
Using data from the “Hands Up Scotland” survey, Popat discovered that children from different types of school tend to use different modes of transport. For example, more children in primary and secondary schools walk to school than children who attend independent schools. Most children at Special Educational Need schools arrive by bus or taxi, whereas driving and Park & Stride schemes are more popular in primary and independent schools. The number of children arriving by scooters, skates and bicycles is also increasing, and when Popat drilled into the details, he found that an unusually large proportion of children go to school by roller skates or scooter – in East Lothian and Midlothian, over 10% of children travel by scooter.
Did you know?
> The Centre for Economics and Business Research (CEBR) estimates that the Big Data marketplace could create 265,000 new jobs within the UK alone, generating economic benefits of over £322 billion between 2015 and 2020.
> The increased use of sensors will lead to an explosion in the volume of data – according to IBM, 2.5 quintillion (2.5 × 10^18) bytes of data are added every day.