Humans and Their Unstructured Data: What to Do?

Human beings are messy—not physically messy (necessarily), but they leave behind a trail of messy data. We don’t navigate websites in a logical order; we type whatever we want into search engines, and use colloquial terms or inside jokes in our profiles. “Qualitative feedback” is every customer service representative’s worst nightmare. And those darn self-reported fields on forms, like “referral source,” are always full of junk. As marketers, though, we have to take a deep breath and dive into this data.

We were facing this problem ourselves at Socedo, as almost all information that a user inputs on Twitter is unstructured, including the location field in the profile—a huge problem that became a huge opportunity. In fact, by standardizing locations alone on the Socedo platform, prospect discovery increased by an average of 226%.

The Human Element

Unstructured data is messy stuff. Basically, it’s anything that does not have a predetermined format, field type, or organization. While text is the most common form of unstructured data, not all text is unstructured. “Last Name” is a structured field. However, a self-reported “How did you hear about us?” is unstructured.

When a human is given the autonomy to enter information into a field, things start to get messy. In the Socedo world, we deal with massive amounts of unstructured data on social media.

The Problem

Unstructured data can be difficult to use. It does not fit easily into a CRM system, nor is it easy to analyze quantitatively. It also poses a challenge for automation, because it is hard to translate into a prescribed workflow. And since this data is not easily searchable, a query-based system, such as one that uses search keywords to find prospects, has a difficult time with it. The burden falls on the human marketer to anticipate every possible way a value might be entered.

The Need

However, standardizing this unstructured data can yield significant benefits. In fact, Salesforce CEO Marc Benioff points out that, across marketing, sales, and service departments, unstructured data now outweighs structured data by 5 to 1, and a marketing department’s ability to tap into this data will separate one business from its competitors.

Additionally, this is where human behaviour is really taking place and real-time marketing can shine. Humans are messy. We don’t follow a predefined workflow, so a marketing automation system that only maps to an easily quantifiable process will miss out on the nuances of what people are really thinking, talking about, or interested in. However, a machine that can continue to adjust with this new data will yield more accurate results—and save the human marketer a headache along the way.


In order to use unstructured data, you have to be able to organize it in some way. Andrew Davies suggests three different ways to organize: aggregate, analyze, and standardize.

Aggregation and analysis involve pulling your data together and sifting through it for patterns, trends, and other metrics, which can yield useful qualitative insights about the marketing landscape. Unstructured data can even be converted into structured data, for example by counting how often a keyword is repeated.
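
As a rough illustration of turning free text into something countable, here is a minimal Python sketch. The sample answers, the keyword list, and the function name keyword_counts are all made up for this example; it shows the general idea only, not how any particular platform does it.

```python
import re

def keyword_counts(responses, keywords):
    """Count how many free-text responses mention each keyword (case-insensitive)."""
    counts = {keyword: 0 for keyword in keywords}
    for text in responses:
        # Break the response into lowercase words so "Twitter!" matches "twitter".
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        for keyword in keywords:
            if keyword.lower() in words:
                counts[keyword] += 1
    return counts

# Hypothetical answers to "How did you hear about us?"
responses = [
    "Saw you guys on Twitter!",
    "a friend told me",
    "twitter ad, I think",
    "Google search",
]

print(keyword_counts(responses, ["twitter", "friend", "google"]))
# {'twitter': 2, 'friend': 1, 'google': 1}
```

The messy answers stay exactly as people typed them; the counts are a structured view layered on top, which is the part a CRM or report can actually use.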

Standardization allows your data to be used by a machine—creating a predefined method for using the data, regardless of its structure. You can’t change the way humans enter data, but you can change the way your machine reads it. Therefore, when you need to use this data, you can work within a manageable format.


Unlike other social media platforms, like LinkedIn, Twitter allows users to input any value into the location field of their profile. Someone could list their location as the extrasolar planet “Kepler-22b” if they really wanted, but even Earth-grounded users have plenty of flexibility. Some may list their city and state, or just their city, or just their neighbourhood. They may write “CA” or “California” or “Cali.”

This is all unstructured data, and for the marketer wanting to search for leads based on a specific location, it takes a lot of creativity (and crossed fingers) to cover all the bases. Even then, many potential prospects will be left unaccounted for. However, we recently standardized the location field on the Socedo platform, changing the way the system uses the unstructured data on Twitter. This has produced an average 226% increase in prospect volume for accounts. Now the user only needs to enter “California,” and the platform smartly detects the rest.
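
To make the idea concrete, here is a minimal Python sketch of one way a free-text location could be mapped to a standard value. It is an assumption-laden illustration, not a description of how the Socedo platform works: the alias table STATE_ALIASES and the function names normalize_state and matches_search are hypothetical, and a real system would need a far larger lookup plus handling for cities, neighbourhoods, and ambiguous names.

```python
import re

# Hypothetical alias table: lowercase variants mapped to one canonical name.
STATE_ALIASES = {
    "ca": "California",
    "cali": "California",
    "california": "California",
    "tx": "Texas",
    "texas": "Texas",
}

def normalize_state(raw_location):
    """Return the canonical state name found in a free-text location, or None."""
    # Split on common separators, then tidy each piece ("Cali." becomes "cali").
    parts = [
        part.strip().lower().strip(".")
        for part in re.split(r"[,/|]", raw_location)
    ]
    # The state usually follows the city, so check the last piece first.
    for part in reversed(parts):
        if part in STATE_ALIASES:
            return STATE_ALIASES[part]
    return None

def matches_search(raw_location, wanted_state):
    """True if the profile's free-text location resolves to the wanted state."""
    return normalize_state(raw_location) == wanted_state

for location in ["San Francisco, CA", "Cali.", "Los Angeles, California", "Kepler-22b"]:
    print(location, "->", normalize_state(location))
```

With a lookup like this, a single search for “California” would match the first three profiles, while “Kepler-22b” resolves to nothing and is simply skipped. The profile text is never changed; only the way the machine reads it is.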

We didn’t stop with locations, and we’re still working on artificial intelligence around keywords and bios as well. Marketing will always be filled with unstructured data because as marketers, we deal with messy human interactions, thoughts, and needs. This data does not have to be an obstacle to automation. Rather, automation can make the data easier to use for the human.
