Knowledge Mining is one of the hottest topics in AI in 2020. Microsoft is actively developing it, and we offer a solution based on this technology. It’s time to see Knowledge Mining in practice!
You have already had a chance to discover the theoretical foundations and use cases presented by Dawid. As a next step, Daniel has explained a more technical approach to the problem, describing all the necessary Microsoft Azure services.
Today, I want to close the series by telling you a bit more about our solution and how we used it as part of the implementation for one of our clients.
We have been working with this technology for some time now. It’s important to note that because we rely heavily on Microsoft Azure services, this is not a fixed, out-of-the-box product. We can easily change and adapt it to suit individual business needs.
For this case, based on our previous projects, we have composed a solution consisting of two modules: Intelligent Search and Content Extractor.
You can modify and enhance both parts. They work seamlessly together as they are based on the same architecture, but they can also be separated if needed.
Predica Knowledge Mining Solution consists of two key elements
Intelligent Search is the main pillar of our solution, widely described in the previous articles. It is suitable for companies which hold huge amounts of data, where most of it is not stored in a structured way or easily searchable.
For example, if an employee wants to find relevant information about an urgent or important topic, they have to manually go through documents, images and videos, even if these are available in a digital format!
Intelligent Search uses an AI-based set of tools to efficiently store, index and inspect files in every form.
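To give a feel for what "indexing" means here, below is a minimal sketch of an inverted index in plain Python. It is only an illustration of the general idea, not our implementation or the way Azure Cognitive Search works internally. The file names and texts are made up, and we assume the text has already been extracted from the files (for example by OCR).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def lookup(index, query):
    """Return the IDs of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample data: text already extracted from three files.
documents = {
    "req-001.pdf": "Hydraulic leak found near left landing gear",
    "req-002.pdf": "Cracked panel on cargo door",
    "req-003.pdf": "Landing gear sensor fault, hydraulic pressure low",
}
index = build_index(documents)
matches = lookup(index, "hydraulic landing gear")
```

Here `matches` contains the two requests that mention all three query words. A real search service adds much more on top of this, such as relevance scoring, language analysis and enrichment of images and handwriting.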
The second component is the Content Extractor. It is intended for business units processing large amounts of documents from various internal and external sources, which contain an excess of irrelevant content.
This is useful when employees want to quickly extract information of interest and use it in a different format or as input for a separate analysis. Most importantly, this process should not be manual but automatic, with the appropriate parameters defined.
Content Extractor uses text mining and machine learning algorithms to locate the specified, relevant elements within the entire content.
Enough of the theory – let me tell you how we used these tools to solve real business problems. We worked with the engineering department of one of the largest airlines in the world. They had two very specific challenges to tackle, for which our solution seemed to be a perfect fit.
Let’s start with the first problem. The engineering department receives requests or queries, which include a description of a defect to be fixed by the airline’s engineers. The inquiries come as both printed and handwritten text, which makes understanding the content challenging.
Engineers then respond to these requests. They give specific recommendations and answers, often using handwriting again, but also adding images, e.g. to indicate the part of the aircraft they are referring to. All these conversations are kept in the same file and contain vast amounts of knowledge and information, as they result in the problem being fixed.
Now, the engineering department has more than 10 years of history, which amounts to thousands of requests in the form of PDF files. So, several engineers thought: when a new request or defect report comes in, why don’t we look for similar cases and their solutions among them? After all, not all engineers have enough experience to solve the issue on the spot, and time is money, especially in the airline industry.
There were, however, some challenges related to solving this problem. The airline’s data included loads of documents kept in multiple locations, content stored in various formats (printed text, images and handwritten notes), and many documents not relevant to end users at all. This is where Intelligent Search came to the rescue!
A screenshot from our demo showing our Knowledge Mining solution in practice
We have used the architecture which is very well described in Daniel’s article to perform the following steps:
This was, however, only part of the successful solution. We have also proposed an intuitive, tailor-made web app interface which employees can operate with ease.
This service has multiple functionalities. It allows the users to:
By using Knowledge Mining in practice, our client can take advantage of all their historical knowledge, without having to look for the same answer every time. This, in turn, reduces the time needed to find accurate and efficient fixes, meaning that problems can be solved faster.
Documents are retrieved in order of relevance to the user, thus further speeding up the process. They can also be categorized and filtered by specific fix types, making finding the right solution simpler.
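As a rough illustration of relevance ordering combined with filtering, here is a minimal sketch in plain Python. It is not the scoring used by Azure Cognitive Search or by our web app: the scoring function simply counts query terms, and the record fields and fix type values ("repair", "replacement") are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, text):
    """Count how often the distinct query terms occur in the text."""
    counts = Counter(tokenize(text))
    return sum(counts[term] for term in set(tokenize(query)))


def search(records, query, fix_type=None):
    """Return IDs of matching records, most relevant first.

    If fix_type is given, only records of that type are considered.
    """
    candidates = [
        r for r in records
        if fix_type is None or r["fix_type"] == fix_type
    ]
    scored = [(score(query, r["text"]), r) for r in candidates]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r["id"] for _, r in hits]


# Hypothetical sample data standing in for indexed requests.
records = [
    {"id": "req-001", "fix_type": "repair",
     "text": "Hydraulic leak, hydraulic line repaired"},
    {"id": "req-002", "fix_type": "replacement",
     "text": "Hydraulic pump replaced"},
    {"id": "req-003", "fix_type": "repair",
     "text": "Cargo door panel repaired"},
]

all_hits = search(records, "hydraulic leak")
repairs_only = search(records, "hydraulic leak", fix_type="repair")
```

In this toy data set, `all_hits` lists req-001 before req-002 because it mentions the query terms more often, and `repairs_only` narrows the result to req-001.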
Let’s move on to the second business challenge, where we have used Content Extractor.
Our client periodically receives new aircraft manuals and documentation. These can sometimes amount to more than a thousand pages per document, as they include recommendations for every single aircraft configuration – and, as you can imagine, there are loads of them.
However, our client needs data only regarding the fleet they own. They would take this information, place it in their internal document format and redistribute it to engineers. The engineers, in turn, would implement the manufacturer’s specific recommendations on the aircraft they own.
This information is crucial, sometimes concerning vital parts of an airplane, so it was essential to extract and transform the relevant content quickly but without any loss of information.
Previously, engineers had to extract and transfer the relevant information entirely manually. Since they first had to find only the configuration and groups which concerned their aircraft and then extract text, tables and images, the whole operation could sometimes take hours, if not days. We wanted to minimize this tedious effort, so they could spend their time on more important tasks.
Luckily, there are tools for that!
Microsoft Form Recognizer demo (source)
We used different techniques to extract parts of the documents:
The operations above prove that extracting parts of a document is not such a simple task after all. Nevertheless, thanks to Azure Cognitive Services and Azure Functions we were able to implement an efficient solution.
In this case, we targeted and achieved a number of results. We managed to automate mundane tasks of extracting and copying relevant content, saving precious time. We also converted the information coming from different sources to a unified format, making it easier to search through.
The solution also makes it possible to extract a section of text identified by specific keywords, for more accurate results. Finally, we were able to transfer content 1:1 without losing any valuable information.
Consequently, the service boosted engineers’ productivity, reducing the time needed for extracting the information from hours to minutes.
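The keyword-based extraction mentioned above can be pictured with a short sketch. This is a simplified illustration in plain Python, assuming the document has already been converted to lines of text. It is not the Form Recognizer API or our actual pipeline, and the manual content and keywords are invented for the example.

```python
def extract_section(lines, start_keyword, end_keyword):
    """Return the lines from the first one containing start_keyword up to,
    but not including, the next line containing end_keyword.

    Matching is case-insensitive. An empty list is returned when the
    start keyword never appears.
    """
    start = start_keyword.lower()
    end = end_keyword.lower()
    section = []
    inside = False
    for line in lines:
        lowered = line.lower()
        if not inside and start in lowered:
            inside = True
        elif inside and end in lowered:
            break
        if inside:
            section.append(line)
    return section


# Hypothetical manual content, one entry per extracted line.
manual = [
    "1. General",
    "Applies to all configurations.",
    "2. Configuration A",
    "Inspect the actuator every 500 flight hours.",
    "Replace the seal if worn.",
    "3. Configuration B",
    "Inspect the actuator every 300 flight hours.",
]
section = extract_section(manual, "Configuration A", "Configuration B")
```

In this example `section` holds the "Configuration A" heading and the two lines that follow it, and nothing from the neighbouring sections. Real manuals also contain tables and images, which need dedicated extraction steps beyond simple keyword matching.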
Using Knowledge Mining in practice gives you a range of benefits, but most importantly, you can save valuable time. No more trawling through endless files looking (sometimes repeatedly!) for a single piece of information. Whatever you need is ready to access within minutes.
And if that wasn’t enough, here are additional features of the solution which make it a great choice for business file search:
1. Cloud-based
The solution is based on the Microsoft Azure cloud. Therefore, it doesn’t require investment in hardware and is easily manageable.
2. Secure
It is fully secured by Azure AD and the data is stored in highly protected data centers. You can integrate additional solutions, such as Okta, for another layer of security.
3. Scalable
Because it uses Azure Data Lake for storage and tiered Cognitive Search, the mechanism is very efficient and scalable. Storage and computing power can be increased with a single click.
4. Fueled by AI
Predica Knowledge Mining Solution is based on the newest, state-of-the-art services such as Cognitive Services, Form Recognizer and OCR mechanisms. Microsoft maintains and constantly enhances this suite of tools.
5. Customizable
Our reference architecture addresses the general problems in knowledge management. However, we can adjust the web app to your needs. We can also include additional mechanisms such as role-based access and managerial Power BI dashboards.
We are already seeing huge interest in our Knowledge Mining solution. Note that even if your case differs from those mentioned in our articles, we can easily adjust the solution to your needs.
Here are some other cases which we are aiming to tackle:
Would you like to make more of your information? Just leave us your contact details and we’ll be in touch to schedule a demonstration for you!