Machine learning applications: How they work (and when they fail in text analytics)
Some specialists believe that machine learning applications are either magic boxes capable of doing whatever we want or, conversely, alien-like solutions that are useless in everyday life. As often happens, especially with new technologies, the truth lies somewhere in the middle.
How machine learning applications work
Prisma is a photo editing app that transforms users’ photos into works of art by applying the styles of famous artists or other original patterns. Prisma doesn’t simply apply a filter (like Instagram does) but creates new photos following a model and, as the official description states, “a unique combination of neural networks and artificial intelligence helps you turn memorable moments into timeless art.”
How can Prisma turn a normal picture into a masterpiece?
All machine learning applications (and Prisma follows the same logic) learn from information, parameters and schemes and use them to improve their algorithms independently, without being explicitly programmed.
Machine learning is more pervasive than we think: numerous real-world applications – self-driving cars, speech and image recognition, text classification, web search, smart robots, etc. – belong to this subset of artificial intelligence. They need specific training (Prisma learns the features of works of art by Picasso or of a mosaic model) and use these examples to make a system better (Prisma completely changes the style of the picture by applying a different one).
The limits of machine learning applications in text analytics
In business, we can say that machine learning offers a sophisticated approach, but there is a limit to the level of improvement possible in analyzing unstructured information.
In fact, machine learning applications:
– Need data or models that have been prepared manually by people. And even then the process is not completely automatic. Machine learning applications do not learn on their own; someone has to teach them the differences between topics, words, concepts, etc.
– Require a large set of data and examples for training related to the field or the topic. Machine learning can tell different kinds of information apart only if documents covering those different topics are uploaded during the training process.
– Obtain good results only if the training is frequent (and if the data set grows). Machine learning can improve its knowledge only by adding – over and over again – more information.
– Need different patterns. Too much data of the same genre makes the system less accurate. Machine learning can distinguish between the different meanings of the same word, or politics from ecology for example, only if these meanings, or other topics like history, medicine, math, etc. are known by the system.
– Do not learn in real time. You can’t add a new concept on the fly to the options that a trained machine learning system offers.
So, you can’t hope to train machine learning to identify many different words and different pieces of information without sufficient models and training. Even an extensive knowledge base cannot help you deal with a new word if machine learning has never seen it during training.
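To make these limits concrete, here is a minimal sketch of a topic classifier in plain Python. It is an illustration only: the class name `TinyTopicClassifier`, the four training sentences and the choice of a Naive Bayes word-count model are all assumptions made for this example, not a description of how any particular product works.

```python
from collections import Counter, defaultdict
import math


class TinyTopicClassifier:
    """Toy Naive Bayes classifier over whitespace tokens (hypothetical example)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocabulary = set()                  # every word seen during training

    def train(self, labeled_docs):
        # Every example has to be labeled by a person beforehand.
        for text, label in labeled_docs:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def known_words(self, text):
        # Words never seen in training carry no evidence and are dropped.
        return [w for w in text.lower().split() if w in self.vocabulary]

    def predict(self, text):
        words = self.known_words(text)
        if not words:
            return None  # nothing in the text was seen during training
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / denom)  # add-one smoothing
            scores[label] = score
        # The answer is always one of the labels supplied during training.
        return max(scores, key=scores.get)


# Hypothetical, manually labeled training set.
training_data = [
    ("the parliament voted on the new election law", "politics"),
    ("the minister announced a coalition government", "politics"),
    ("the forest ecosystem depends on clean rivers", "ecology"),
    ("pollution threatens wildlife and the ecosystem", "ecology"),
]

model = TinyTopicClassifier()
model.train(training_data)

print(model.predict("the election and the coalition"))      # politics
print(model.predict("wildlife in the forest"))               # ecology
print(model.predict("blockchain cryptocurrency"))            # None
print(model.predict("the surgeon prescribed antibiotics"))   # politics
```

The last two calls show the limits described above. A text made only of unseen words gives the model nothing to work with, and a medical sentence is forced into "politics" because the only word the model recognizes is "the" and "medicine" was never among the trained topics. Fixing either case means adding labeled examples and training again.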
That said, going beyond text analytics, machine learning offers many opportunities in a variety of fields, and makes learning complex information really possible.
In general, we could say that machine learning applications are neither magic boxes nor useless solutions. They cover a very broad range of fields, some very critical (for example life science and health applications)… but how far can they go? Will machines ever learn everything, and do so automatically?