Finding value among the clutter: how to harness unstructured data


By James Adie, Vice President EMEA Sales, Ephesoft. 

Unstructured data, the term used for information that doesn’t sit neatly (or at all) in conventional databases, is a looming shadow for many businesses today. Rapid growth of technology – and the effect these innovations have had on our working lives – has meant that our ability to create information has vastly outpaced our ability to store and use it effectively.

We have valuable, but untamed, nuggets of data coming at us in all shapes and sizes: PDF files downloaded onto our desktops; images shared via email and social channels; duplicated files saved on employees’ laptops; even web pages of note saved in our bookmarks tab. There are so many avenues of communication, but no standard method for handling useful information when it comes our way.

The result is what we are seeing in businesses all over the world today: organisations paralysed in the face of unrelenting unstructured data. Organising your data is a huge process that will affect all corners of your business. And because of the size of the task, it can be difficult for organisations to invest in the right technology and commit to a strategy. This hesitancy, however, only lets the clutter and inconvenience of unstructured data keep growing, like an attic accruing more and more boxes each year.

Tackling transformation

We produce 2.5 quintillion bytes of data every day. And with new, hyper-connected technology constantly coming to market, this figure is only heading in one direction (90 per cent of the Internet’s data has been produced since 2016). These mind-boggling stats make it easy to dismiss unstructured data as terminally useless – to throw our hands in the air and disregard the filing of digital data as an impossible task. But armed with the tools for transformation, your data can – and should – be hugely valuable.  

Cleaning your data starts with understanding what data you have. Once you have identified all the data sources you come into contact with, capturing the important pieces of information becomes a simple process with the right technology.

There are sophisticated data capture and extraction tools that use AI-based or machine learning algorithms to automatically process documents in any format – from mortgage applications to invoices to insurance claims – in order to extract their value. And if recognition or extraction fails, the technology flags the error to the user. Once the user makes the correction, the data capture system learns from it, so that it recognises different document types and layouts more reliably over time.

With your data cleaned and converted into a structured format, it can be centrally stored and easily accessed – transforming previously unworkable data into a lucrative business asset. Organisations will be equipped with faster access to more accurate information about their customers, paving the way for accelerated business growth.

GDPR and KYC compliance

Beyond the satisfaction of clearing out the mess and putting data to good use, organising unstructured data repositories is essential for an organisation’s compliance with GDPR, as well as a necessary step in maintaining Know Your Customer (KYC) standards to prevent fraud and reduce risk.

The siloed system of storing data – where structured and unstructured data sets sit in isolation from one another – has developed by default in many businesses. Without a plan in place for dealing with unstructured data sources, the volume of data has steadily risen over time. Organisations know that continuously pouring more data into these silos isn’t the right tactic, but until now they simply haven’t needed to address the rising levels.

GDPR has changed this, requiring that an individual’s personal data can be located and erased without undue delay when they make a valid request. If that data is locked or hidden in unstructured formats, this becomes almost impossible, and it leaves organisations at a higher risk of data breaches.

Don’t miss the boat

A recent Information Age article speculates that as much as 80 per cent of businesses’ data is unstructured. With the rate of data production as it is today, this figure will continue to rise if organisations don’t act to curb the growth.

To take full advantage of new data capture technology, businesses must get their data in order quickly – or run the risk of missing the boat and losing out on competitive advantage.

“Data is the new oil,” said mathematician Clive Humby. “It’s valuable, but if unrefined, it cannot really be used.” From a business perspective, data has fast become the leading global currency of the twenty-first century. Failure to organise this resource – to take the ladder to the cluttered attic – will result in missed financial opportunity, restricted customer insight and increased compliance risk.
