Machines are thinking more like us every day. Search engines now have ‘natural language processing’, which is a type of artificial intelligence that lets machines read, understand, and derive meaning from human language. It uses the same kind of word associations that humans use. This means web searches are more likely to result in content that’s concise and relevant to what you’ve searched for. The bots weed out ‘filler’ content, web-waffle, confusing business speak, and jargon.
Natural language processing means:
- search results that are concise and relevant to what you actually searched for
- plain-language content ranking above jargon-heavy content
- easier access to information for everyone, including people with learning disabilities and people who use screen readers.
Natural language processing is a godsend from the deities of data. Here’s why.
Plain language puts the reader first. Natural language processing ‘reads’ the pieces of content like a human. So, it has the same needs as a human reader. It wants the relevant information quickly and in an easy-to-use format.
If you’re writing content for the web, it will rank better in searches if the first couple of paragraphs clearly state your main information. Don’t drown it in jargon, strings of nouns, or irrelevant content.
Basically, the search engine will favour your content if a human reader can use it quickly and easily. The search engine will ignore content that’s harder to process. Content stuffed with confusing business-speak or jargon will languish at the back of the internet.
You’ve encountered natural language processing if:
- you’ve typed a question into a search engine
- you’ve asked Alexa, Siri, or another digital assistant to do something for you.
Natural language processing lets you use words with a machine as you would with a person. And you get a person-like response. Alexa, Siri, and other digital assistants have been designed to respond to commands as a human would. This makes it easier for people to feel comfortable giving verbal commands to a machine. Eventually, digital assistants may react enough like humans to pass the Turing Test: we won’t even be able to tell they’re not human.
Improved search results are good for everyone, but especially people with learning disabilities. They’ll now swiftly get the information they need to do the task at hand. They’ll no longer have to deal with the stress of filtering useless information.
And if sighted people find it annoying to scroll through screeds of confusing, irrelevant waffle, imagine what it’s like for someone using a screen reader. Few experiences can prepare you for an automated voice talking about ‘getting ahead of the curve ball’ to ‘incentivise’ its ‘collateral assets’.
Natural language processing uses the same linguistic tools that humans use to recognise words and how they’re meant to be used. Once the program understands what you’re asking, it uses different filters to test the relevance of the data it’s crawling through.
Let’s say someone wants to find a pattern to make a catnip mouse for their cat. They open their search engine and type in ‘easy catnip mouse sewing pattern’, or they verbally ask their digital assistant using the same words.
The search engine looks at the words easy, catnip, mouse, sewing, and pattern. It runs these words through the same kind of processes we use to figure out what they mean.
These processes are:
- breaking the query into individual words (tokenisation)
- identifying each word’s grammatical role, such as noun or adjective
- working out how the words relate to each other in the sentence.
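To make those steps concrete, here’s a toy sketch of how a program might break down the catnip query. The word list and grammatical tags are invented for illustration; real search engines use trained models, not hand-written lookups.

```python
# Toy query analysis for 'easy catnip mouse sewing pattern'.
# The mini-lexicon below is a made-up example, not a real model.

def tokenise(query):
    """Split the query into lowercase word tokens."""
    return query.lower().split()

# Hypothetical lexicon mapping words to rough grammatical roles.
LEXICON = {
    "easy": "adjective",
    "catnip": "noun",
    "mouse": "noun",
    "sewing": "noun",
    "pattern": "noun",
}

def tag(tokens):
    """Attach a part-of-speech guess to each token."""
    return [(t, LEXICON.get(t, "unknown")) for t in tokens]

tokens = tokenise("easy catnip mouse sewing pattern")
print(tag(tokens))
```

A real system would go further and parse how the words relate, which is how it knows ‘catnip mouse’ is a mouse made with catnip rather than a mouse that owns catnip.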
Once the program understands what you’re asking, it sends out crawlers to look for data. The data must match the keywords (catnip, mouse, sewing, pattern, easy) plus what the sentence means. The program knows to look for a sewing pattern of a catnip mouse that someone who’s still learning to sew can easily make. It won’t look for an easy sewing pattern that could be made by a sentient catnip mouse, or a catnip pattern for an easy sewing mouse. Even though the keywords are all there, they’re in the wrong order for that to make sense.
Now the program heads out into the web to collect sewing patterns. It uses the following analysis tools to decide whether the piece of content it’s looking at fits the description it was given.
The program looks at the tone of a piece of content to see if it has a positive or a negative attitude. For example, the program would read the sentence ‘This sewing pattern is fantastic’ and register the content as positive. It would register ‘This sewing pattern is the absolute worst’ as negative. This process is known as ‘sentiment analysis’.
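A crude version of sentiment analysis can be sketched with word counts. The positive and negative word lists below are invented for illustration; production systems use trained models rather than fixed lists.

```python
# Minimal sentiment-analysis sketch: count positive and negative
# words and compare. Word lists are made up for this example.

POSITIVE = {"fantastic", "great", "lovely", "clear", "simple"}
NEGATIVE = {"worst", "terrible", "confusing", "useless"}

def sentiment(text):
    """Return 'positive', 'negative', or 'neutral' from word counts."""
    words = {w.strip(".,!?") for w in text.lower().split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("This sewing pattern is fantastic"))           # positive
print(sentiment("This sewing pattern is the absolute worst"))  # negative
```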
The program judges how relevant the content is to what you searched (‘easy catnip mouse sewing pattern’). It looks at what the content is about, and whether it’s a good example of the topic. This process is known as ‘salience’.
To do well on a salience score, the relevant information must be close to the beginning of the page. A massive webpage that avoids the sewing pattern until the very end will score badly on salience.
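The ‘earlier is better’ idea can be sketched as a score: keywords that first appear near the top of the page score close to 1, and ones buried at the bottom score close to 0. The formula is invented for illustration and isn’t how any real engine weights position.

```python
# Toy salience-style score: the earlier a keyword first appears
# on the page, the higher it scores. Invented formula, for
# illustration only.

def salience(page_words, keywords):
    """Average earliness (0..1) of each keyword's first appearance."""
    n = len(page_words)
    scores = []
    for kw in keywords:
        if kw in page_words:
            first = page_words.index(kw)
            scores.append(1 - first / n)  # earlier = closer to 1
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)

good = "easy catnip mouse sewing pattern free download".split()
bad = ("welcome to my blog about cats and many other things "
       "scroll down for the easy catnip mouse sewing pattern").split()

keywords = ["catnip", "mouse", "sewing", "pattern"]
print(salience(good, keywords))  # high: keywords up front
print(salience(bad, keywords))   # low: keywords buried at the end
```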
The program looks at a piece of data, defines its subject matter, and groups it accordingly. It differentiates a picture of an easy catnip mouse sewing pattern that’s purely decorative from the real sewing pattern that can be downloaded and used. This process is known as ‘content categorisation’.
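A toy version of content categorisation might assign each piece of content to the category whose keyword set it overlaps most. The categories and keyword sets below are made up for this example.

```python
# Toy content categorisation: pick the category whose keyword
# set shares the most words with the content. Categories and
# keywords are invented for illustration.

CATEGORIES = {
    "sewing pattern": {"pattern", "seam", "fabric", "stitch", "cut"},
    "decorative image": {"photo", "image", "gallery", "wallpaper"},
}

def categorise(text):
    """Return the best-matching category for a piece of content."""
    words = set(text.lower().split())
    return max(CATEGORIES, key=lambda c: len(words & CATEGORIES[c]))

print(categorise("cut the fabric and stitch along the seam"))
print(categorise("a cute photo gallery of catnip mice"))
```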
The program sorts through all the available data and ranks it according to how relevant and concise it is. Next, it displays this data for the user to see on the page of the search engine. The user chooses which pattern they like and, more importantly, their cat gets a new toy.
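The final ranking step can be sketched by combining the earlier ideas into one score per page and sorting best-first. The relevance and conciseness numbers, and the weights, are invented for illustration.

```python
# Toy ranking: combine per-page relevance and conciseness
# scores (each 0..1) into one number and sort best-first.
# Weights are made up for this example.

def rank(pages):
    """pages: list of (title, relevance, conciseness) tuples."""
    scored = [(title, 0.7 * rel + 0.3 * con)
              for title, rel, con in pages]
    return sorted(scored, key=lambda p: p[1], reverse=True)

pages = [
    ("Rambling cat blog with pattern at the end", 0.6, 0.2),
    ("Easy catnip mouse pattern (free PDF)", 0.9, 0.9),
    ("History of sewing", 0.3, 0.7),
]

for title, score in rank(pages):
    print(f"{score:.2f}  {title}")
```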
And there you have it: You ask the bots a human-like question, and they’ll give you a human-like answer. Natural language processing means you get exactly what you’re looking for first time.