A quick and easy introduction to natural language processing
Natural language processing has become one of the hottest technological topics.
From Alexa, Amazon’s voice assistant that interacts with users, to the predictive text on smartphones that autocompletes and autocorrects what you type, there is a world of opportunities for natural language processing applications today.
NLP, as it is abbreviated, is increasingly used in healthcare, social media, information extraction, and sentiment analysis. It can help lawyers and governments, and it can read historical texts as well as your email.
But the field where the impact of this new technology is most significant is that of chatbots, voice bots, and call bots.
In short, we face a tech revolution that, although silent, could bring about major social, industrial, and economic changes.
In this short guide, I will try to outline how natural language processing works and where it is applied, so that the reader can learn the ropes of this new technology and be ready to face it.
What is natural language processing?
With this expression, we mean the technology that is used to get computers and other artificial systems to understand human language. It is no coincidence that natural language processing is often mentioned together with speech recognition and natural language understanding, two closely related fields.
To reach this goal, NLP relies on specific tools such as machine learning and, more broadly, artificial intelligence, of which it is considered a branch.
Making machines able to understand natural language is just a basic step toward building devices that can speak to humans, which is called natural language generation.
When natural language generation and speech recognition are combined, they take human-machine interaction (HMI) and human-computer interaction to a higher level, as voice interaction is enabled between users, machines, computers, and interfaces.
How natural language processing works
Understanding natural language is a challenging task for both humans and machines.
Grammar and vocabulary are just some of the aspects to take into consideration. Morphology (the internal structure of words) and pragmatics (how the context affects the meaning of a word in a sentence) add layers of complexity.
However, the most difficult problem that a machine has to face is the ambiguity of natural language. For example, “a” in English can be both the indefinite article (as in “a car”) and the first letter of the alphabet.
Another problem is the figurative meaning of many sentences and expressions, such as “John is like a lion”.
Now, while it is clear to humans that John does not roar or tear people apart, this is not clear to computers, as they understand only denotative language, which means they take the literal meaning of words.
When a machine faces natural language, the first step is parsing a text or an utterance (this operation is called natural language parsing) to detect and analyze its constituent words and the syntactic structure of each individual statement.
The process ends with the conversion of natural language into a form that a computer can process, usually a vector representation of the text or speech concerned.
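To make the idea of a vector representation more concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, a bag-of-words count, and it skips syntactic parsing altogether. The function names (tokenize, build_vocabulary, vectorize) are invented for this illustration and do not belong to any particular library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Give every distinct token a fixed position in the vector."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    # The vocabulary dict keeps its (sorted) insertion order,
    # so position i of the vector always refers to the same word.
    return [counts[tok] for tok in vocabulary]


documents = ["John is like a lion", "A lion is a big cat"]
vocabulary = build_vocabulary(documents)
vectors = [vectorize(doc, vocabulary) for doc in documents]
```

Each sentence becomes a list of numbers of the same length, which is something a computer can compare and compute with. Real systems usually rely on richer representations, but the principle of turning words into numbers is the same.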
The second step is trying to understand. The possible approaches that are used in natural language processing applications are:
- Symbolic. The application utilizes a set of rules, partly hand-written by the developer and partly learned by the computer through machine learning. This is an old and tried approach that is still in use, with due upgrades.
- Statistical. The application utilizes a set of machine learning algorithms to learn the language. The model must be trained on example data to fit its parameters, as is typical of machine learning, and this requires time.
- Neural networks. This is like the statistical approach, but with a difference: the most important features are learned by the artificial neural network itself, without the need for human help.
In all three approaches, artificial intelligence is a fundamental component. Neuroscience could pave the way to new approaches. However, these studies are still in their early stages and far from yielding practical applications.
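The difference between the symbolic and the statistical approach can be sketched with a toy example: deciding the grammatical category of a word. The rules, the tag names, and the tiny training set below are all invented for illustration, and the neural approach is left out because it would not fit in a few lines.

```python
from collections import Counter, defaultdict

ARTICLES = {"a", "an", "the"}


def rule_based_tag(word, previous_word=None):
    """Symbolic approach: hand-written rules, checked in order."""
    w = word.lower()
    if w in ARTICLES:
        return "DET"
    if w.endswith("ly"):
        return "ADV"
    if previous_word is not None and previous_word.lower() in ARTICLES:
        return "NOUN"
    return "UNKNOWN"


def train_most_frequent_tag(tagged_words):
    """Statistical approach: learn the most frequent tag of each word."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


# Hypothetical hand-labelled examples; a real system needs far more data.
training_data = [
    ("the", "DET"), ("lion", "NOUN"), ("roars", "VERB"),
    ("a", "DET"), ("car", "NOUN"), ("stops", "VERB"),
    ("roars", "VERB"), ("roars", "NOUN"),
]
model = train_most_frequent_tag(training_data)
```

The rule-based tagger knows only what the developer wrote down, while the statistical one knows only what appeared in its training examples. This is why the second approach needs data and time, as noted above.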
What can natural language processing really do?
Parsing, understanding, and translating natural language are tasks that today’s natural language processing software can perform, even though it suffers from some limitations.
We stress the point that the hardest task to solve in any natural language processing application is to find a way to understand the context and the pragmatics.
Other serious issues are related to nuances, as the same word in the same context can assume different implications just by changing the tone of voice.
Ambiguities and nuances are still an open problem, and this is the reason why it is important to utilize a cutting-edge solution that incorporates the most recent discoveries and developments in this industry.
Not surprisingly, in some cases, the attempt to understand natural language can fail. A widely told anecdote, possibly apocryphal, dates from the 1950s, when the first steps to develop this technology were made. The sentence:
“The spirit is willing, but the flesh is weak.”
was reportedly translated into Russian and back into English by a machine as:
“The vodka is good, but the meat is rotten.”
What are the most common natural language processing tasks?
Natural language processing can be utilized in many tasks.
- Text classification. This allows software to perform tasks such as deciding whether the latest email is spam or understanding the sentiment of a tweet or a post on Facebook.
- Text meaning. The software builds a representation of the meaning of a sentence, which is a basic step toward answering your clients’ questions.
- Encoding and decoding. The software encodes and decodes a text to discover hidden meanings or sequences in it.
- Dialog systems. The software can have conversations with your clients.
- Information retrieval and text mining (text analytics). This means the analysis of a text with natural language processing to transform it into a set of ordered and classified data. This is useful to discover new information in a scientific article.
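As an illustration of the first task in the list, text classification, here is a minimal sketch of a spam filter in Python. It assumes a Naive Bayes classifier with add-one smoothing, which is one common technique among many. The class name and the four training messages are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocabulary = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            counter = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counter[tok] += 1
                self.vocabulary.add(tok)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.word_counts.items():
            # Log prior: how common this label is in the training set.
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counter.values()) + len(self.vocabulary)
            for tok in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counter[tok] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training messages, far too few for real use.
classifier = NaiveBayesClassifier()
classifier.train([
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
])
```

Calling classifier.predict("free prize money") labels the message as spam, because those words appeared only in the spam examples. A production filter would be trained on thousands of messages and evaluated on held-out data.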
Why is natural language processing used?
Because interactions between computers and humans are so frequent in our digitalized society that an increasing number of people regard the old buttons as inadequate, and they are right.
In other words, the increasing use of natural language processing simply reflects the growing need for more sophisticated user interfaces.
The growth of the industry confirms the success of natural language processing applications. According to Research and Markets, the NLP market should reach $35.1 billion by 2026.
There is another reason behind the growth of natural language processing: it makes it possible to perform operations that are impossible for humans, as in the case of its application to big data. Let us take a closer look at this point.
Natural language processing and big data
Big data are huge sets of structured or unstructured data, like the emails, letters, documents, and briefs that an organization has been piling up for years.
Natural language processing can be effective in dealing with big data. Consider that a computer can read in a few seconds a number of documents that a team of humans could not read in a century, however big the team.
Once the software has read and understood the documentation, many further tasks can be performed, such as finding new insights on your customers, or classifying and summarizing the contents of a specific set of documents.
The most interesting application of this new field is data mining: bills, letters, emails, and posts often hide valuable information on the preferences of your clients that humans cannot detect, at least not as easily as this kind of software does.
Are natural language processing solutions expensive?
Yes, a solution for processing natural language can be expensive. The problem is not only the technology and the data banks and libraries to be utilized but also the time that is necessary for machine learning.
Consider that a machine needs training to understand a specific area of natural language for specific tasks (for example, answering a phone call). This often requires the dedication of a team of technicians for over two years.
Luckily, in many cases, complex analyses are not necessary as some tasks are relatively easy or there are enough software libraries and data available.
Therefore, the mark of a skillful team is the ability to deliver a natural language processing solution that wisely mixes the available technologies for the assigned task, without over-engineering it and without overstretching the client’s limited budget.
Where natural language processing is used
Natural language processing has a long history. The idea stems from the early attempts at machine translation of foreign languages in the years around the Second World War.
However, the pioneers of this new discipline were not lucky enough to find adequate financial support. During the 1960s, research in this field was languishing, if not dead.
The change came at the beginning of the 1980s, when natural language processing was revived by the idea of using it to build artificial intelligence.
The 1990s can be seen as the golden age of this discipline. The pace of research in natural language processing increased to the point that in the following decades it was possible to deliver products like Alexa and the first voice bots able to have a conversation with humans.
Today, natural language processing is widely applied in several industries, and not only in the traditional field of machine translation. Let us have a quick look at them.
Sentiment analysis
The goal of sentiment analysis is to understand the emotional implication of a name, expression or word, or the feelings that a targeted audience associates with a brand, a product, or a service at a certain time/date.
For example, if you post something about your latest leather jacket for men on Instagram, you will probably ask yourself whether readers like your post or not.
It is possible to discover your readers’ sentiments by analyzing the words, expressions, and emoticons that they use to comment on your post and other similar posts. However, this is quite a complicated and long task for a human being.
By contrast, this task is quite easy and fast for a machine that is equipped with natural language processing software. And, if the goal is to understand the polarity of a statement (negative, positive, neutral), without the help of this kind of software the task becomes so difficult that it is hardly feasible.
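The polarity check described above can be sketched in a few lines of Python. This version assumes the simplest technique, a hand-made word list (lexicon), and the words and emoji in it are invented for the example. It knows nothing about negation or sarcasm, which is exactly where real sentiment analysis gets difficult.

```python
import re

# Hypothetical mini-lexicons; real ones contain thousands of entries.
POSITIVE = {"love", "great", "nice", "beautiful", "perfect", "👍", "😍"}
NEGATIVE = {"hate", "ugly", "awful", "bad", "terrible", "👎", "😡"}


def polarity(comment):
    """Classify a comment as positive, negative, or neutral."""
    # Words become tokens; any other visible symbol (emoji, punctuation)
    # becomes a one-character token.
    tokens = re.findall(r"[a-z']+|[^\w\s]", comment.lower())
    score = sum(tok in POSITIVE for tok in tokens)
    score -= sum(tok in NEGATIVE for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "I love this jacket 😍",
    "Ugly and awful.",
    "Is it available in brown?",
]
labels = [polarity(comment) for comment in comments]
```

Run over thousands of comments, even a crude scorer like this gives a rough picture of how a post was received, which is the kind of task that is slow for a person and fast for a machine.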
Document management
Documents are a real pain in the neck for many organizations. A public library has thousands of texts to catalog, and every moment state bodies like the foreign office receive lots of sensitive documents that need to be read for the extraction of important information.
Luckily, software comes to help, as automatic summarization and text classification are two other tasks that natural language processing applications can easily perform. Your documents will be tagged according to a set of assigned rules, and the relevant information extracted or summarized in reports.
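Both operations, tagging by assigned rules and summarizing, can be sketched as follows. This is only an illustration under simple assumptions: tags come from keyword rules, and the summary is extractive, which means it keeps the sentences whose words are most frequent in the document. The rule set, the stop-word list, and the function names are invented for the example.

```python
import re
from collections import Counter

# Hypothetical tagging rules: a tag applies if any of its keywords occurs.
TAG_RULES = {
    "finance": {"invoice", "payment", "budget"},
    "legal": {"contract", "clause", "court"},
}
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def content_words(text):
    """Lowercase word tokens with very common words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tag_document(text, rules=TAG_RULES):
    """Return the sorted list of tags whose keywords occur in the text."""
    words = set(content_words(text))
    return sorted(tag for tag, keywords in rules.items() if words & keywords)


def summarize(text, max_sentences=1):
    """Keep the sentences with the highest average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    frequency = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(frequency[w] for w in words) / max(len(words), 1)

    best = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Return the selected sentences in their original order.
    return [s for s in sentences if s in best]


letter = "The contract covers the payment schedule. The payment is due in March. Thanks."
tags = tag_document(letter)
summary = summarize(letter)
```

Commercial products use far more refined methods, but the workflow is the same: read the document, attach tags, and keep only the sentences that matter.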
However, the real advantage of using natural language processing applications to perform these kinds of tasks is, as in any branch of digital technology, the saving of time and money.
If we consider how many people and how much time an organization like a government needs to spend only to read and classify the stream of email that it receives every day, we immediately grasp the importance of this new technology.
Conversational agents
Conversational agents are the new frontier of natural language processing.
Hal, the nefarious computer that led the spacecraft in 2001: A Space Odyssey to ruin in the depths of space around Jupiter, remains the most iconic example of how a computer can speak, play, and be empathetic when equipped with robust natural language processing software.
The good news is that the applications of this kind in the real world are far more reassuring and helpful than the malicious Hal. Today, natural language processing applications are mostly utilized in the field of automated online assistance.
Voice bots, call bots, and chatbots answer thousands and thousands of questions every day. Even though bots are still far from perfection and empathy, they already help millions of people everywhere in the world.
At Ideta, we made it really easy to add NLP to your chatbot. You just have to type in the training sentences and link them to Dialogflow. Try our AI chatbot builder!
Final thoughts
It was a long time ago (1964) that Eliza, a computer program that simulated a conversation with a Rogerian psychotherapist, was introduced to the public.
In half a century, natural language processing has made giant leaps forward, to the point that today it is possible to automate even creative tasks, such as writing a post (Forbes did that in 2012) or a novel.
This is evidence of the potential of the technology and of how disruptive it can be, and not only in the field of chatbots and conversational agents.
Will this new decade be that of natural language processing? It seems to me that this new technology has what it takes to be.
If so, let us hope that the devices with the ability to process natural language will be more responsible and lenient than Hal and will make a real contribution to steering the spacecraft of planet Earth towards great results and a good, happy end.