How can organisations use “unstructured data”?

You could argue that all data is, at some level, unstructured. If you take the classic DIKW (data, information, knowledge, wisdom) model from information science, then “data” may be thought of as the chaotic thronging of a swarm of bees. Patterns, or structure, in their noisy dance emerge as data “converts” to information, then knowledge and finally wisdom. The closer we look at the swarm, the more we see.

What is true for the swarming of insects is also true of the flocking of birds or the shoaling of fish. What at first seem to be chaotic, random (albeit beautiful) flurries are underpinned by very simple rules.

In mathematics, or complexity theory, cellular automata replicate these seemingly organic, impossibly complex behaviours. Simple rules given to an algorithm can produce a wide variety of seemingly unstructured patterns.
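To make that concrete, here is a minimal sketch of an elementary cellular automaton in Python. The choice of Rule 30, the wrap-around edges, the grid width and the function names `step` and `run` are all assumptions made for illustration. Each cell looks only at itself and its two neighbours, yet the printed rows quickly stop looking orderly.

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton (edges wrap around)."""
    n = len(cells)
    new = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        new.append((rule >> index) & 1)  # that bit of the rule number is the new state
    return new


def run(width=31, generations=15, rule=30):
    """Start from a single live cell in the middle and collect every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in run():
        print("".join("#" if cell else "." for cell in row))
```

Changing the `rule` argument to another number between 0 and 255 swaps in a different rule table, and many of them produce strikingly different patterns from the same single starting cell.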

The data found in social spaces across The Network seems similarly unstructured and chaotic. It’s early days when it comes to finding the patterns in the data. It’s a Fuzzy Science. Natural Language Processing (NLP) is still operating well below the level of Artificial Intelligence, so using tools like Sysomos and Radian6 to parse large volumes of conversation has its limitations. Commercial NLP software for social networks can’t reliably differentiate mainstream use of language from slang. Is “hectic” busy or good? Is “mashing” something you do to potatoes or your significant other?
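The slang problem is easy to reproduce with a toy example. The sketch below is a deliberately naive word-list scorer in Python. It is not how Sysomos or Radian6 work, and the lexicon, the scores and the sample sentences are all invented for illustration. It shows why a fixed word list gives “hectic” the same score whichever sense the speaker intended.

```python
# Toy sentiment lexicon: every word and score here is a made-up illustration.
LEXICON = {"hectic": -1, "stressful": -1, "great": 1, "love": 1}


def score(text, lexicon=LEXICON):
    """Sum the per-word scores; words missing from the lexicon count as zero."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(lexicon.get(word, 0) for word in words)


mainstream = "Work was hectic, really stressful."  # "hectic" meaning busy
slang = "That gig was hectic."                     # "hectic" meaning good

if __name__ == "__main__":
    print(score(mainstream))  # negative, as intended
    print(score(slang))       # also negative, although the speaker meant praise
```

Telling the two senses apart needs context, such as who is speaking, where, and about what, which is exactly what a flat word list throws away.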

That’s not to say organisations can’t use “unstructured” data to their advantage.

Have a look at Google Ripples to see how social endorsements spread through a network. Stress levels in the voice may be monitored over IVR systems and telephone calls routed appropriately. Facial recognition technology is cheap enough now to be deployed across retail estates, recognising repeat customers or simply customising screen content by age and gender. Google famously track the spread of flu epidemics by monitoring regional search volumes for related keywords. A trick for just-in-time manufacturers, perhaps. Target’s systems can reportedly tell whether you’re pregnant by monitoring patterns in your purchase behaviour. Netflix commissioned House of Cards after observing patterns of viewing behaviour and preference on its platform.

The quantified self movement is built entirely on taking unstructured or ambient data about your mood, movement and so on, and mining it for intelligence. For organisations in the health and fitness industries especially, this can be a source of true competitive advantage.

At the very least, unstructured social data can add depth and richness to an organisation’s understanding of a customer, allowing it to be personally relevant and even to anticipate some of that customer’s needs.
