Why ‘human in the loop’ and specialised AI can help solve the problem of unstructured data.
What do we do about unstructured data?
If Messrs Rodgers and Hammerstein were data scientists and decided to make a musical about their job, they might ask: “How do we solve a problem like unstructured data?”
Because it is a problem.
They say 80 per cent of all data floating around your business is unstructured, although there is a problem here: unstructured data is so unstructured that we don’t really know how much there is. We do know that there is a lot. There is data from emails, invoices, letters, contracts, social media, images, video… well, the list goes on and on.
When data is unstructured, it is of limited use. Now, if data really is the new oil, then it is oil that is as yet untapped, as are sunlight, wind and water, but at least we're doing something about those. Unstructured data is perhaps more like sunlight falling on a barren desert, without a human, let alone a solar panel, anywhere in sight.
So what can we do with all this unstructured data? We can get rows of unfortunate humans transcribing and typing, taking unstructured data from emails, videos, documents and wherever else, and manually keying it into databases, giving it order and structure.
But that is boring.
And people lose….
Sorry, what was that? Err, where was I?
Oh yes, people lose concentration.
They then smake mishtakes.
And structured data is all very well, but it ain’t much use if it is wrong.
Does AI hold the answer? RPA has done the job with structured data for the past seven years, but none of the RPA providers has managed to crack the unstructured part of the data opportunity. So how do you teach AI to recognise data from a myriad, or even a super myriad, of different formats?
Take invoices. You will struggle to find two invoices, each from a different company, that are alike. The date is in different places; the address is laid out differently; some have watermarks, some don’t; some have the invoice amount on the right-hand side, and some in the middle.
Super.AI, a Silicon Valley start-up, has an interesting solution.
The team at super.ai talk about humans in the loop and different AI engines with very specific specialisations. An AI that specialises in watermarks, another in recognising dates, another in identifying the total, and so on.
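To make the idea concrete, here is a minimal sketch of how specialised extractors and a human in the loop might fit together. It is not super.ai's implementation: the names `extract_date`, `extract_total`, `process_invoice` and `ask_human`, the regular expressions and the 0.7 confidence threshold are all assumptions made up for illustration, and simple pattern matching stands in for real AI models.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Extraction:
    value: Optional[str]
    confidence: float  # 0.0 (no idea) to 1.0 (certain)


def extract_date(text: str) -> Extraction:
    # Hypothetical date specialist: only recognises DD/MM/YYYY.
    match = re.search(r"\b\d{2}/\d{2}/\d{4}\b", text)
    return Extraction(match.group(0), 0.9) if match else Extraction(None, 0.0)


def extract_total(text: str) -> Extraction:
    # Hypothetical total specialist: first number after the word "total".
    match = re.search(r"total[^\d]*(\d+(?:\.\d{2})?)", text, re.IGNORECASE)
    return Extraction(match.group(1), 0.8) if match else Extraction(None, 0.0)


# One specialist per field, each responsible for a single narrow task.
SPECIALISTS: Dict[str, Callable[[str], Extraction]] = {
    "date": extract_date,
    "total": extract_total,
}


def process_invoice(
    text: str,
    ask_human: Callable[[str, str], str],
    threshold: float = 0.7,
) -> Dict[str, str]:
    """Ask each specialist for its field; fall back to a person when unsure."""
    result: Dict[str, str] = {}
    for field, specialist in SPECIALISTS.items():
        extraction = specialist(text)
        if extraction.value is not None and extraction.confidence >= threshold:
            result[field] = extraction.value
        else:
            # Human in the loop: a person supplies the field the machine missed.
            result[field] = ask_human(field, text)
    return result
```

Calling `process_invoice(invoice_text, ask_human=my_review_queue)` (where `my_review_queue` is whatever function hands a field to a person and returns their answer) would fill in the easy fields automatically and send only the awkward ones to a human.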
Maybe this is analogous to how the human brain works — different neurons and networks of neurons linked by synapses responsible for specific tasks. We do know that humans work best in teams and that 'two brains are better than one'.
Super.ai's approach is to use the best brains for a specific task, irrespective of where those brains are or whether they are human or artificial, and it is not only unique but very exciting. Combining human brain power with AI is not new. What is new is being able to access whichever combination does the job best, and the best tech, through a single platform.
How do we solve a problem like unstructured data? Join us to find out as I talk to Manish Rai and Brad Cordova of super.ai.
Brad is the founder of super.AI, a man who began working on a PhD in machine learning at MIT but instead created a new company which he took to Unicorn status — so he knows a thing or two about this stuff.
Click here to join us on LinkedIn Live: on May 4th, 16.30 BST, 11.30 ET, 8.30 PST.