
How To Improve The Search Experience For Your Users?

At long last, organizations have a really powerful tool that boosts employees’ efficiency by speeding up their content search experience.

In the previous article in this series, Dawid discussed a great number of benefits of implementing a Knowledge Mining solution. This time, I would like to explain how Azure services make it work wonders.

KEY TAKEAWAYS:

  1. What are the three main stages of intelligent search?
  2. How to use Azure services to build a Knowledge Mining solution?
  3. How can AI-powered search help an organization in everyday work?

Before we discuss the architecture specifics, we need to be clear on how the process works. Without further ado, let’s explain its three basic steps.

Knowledge Mining orchestration steps

In the previous article on AI-driven search, we saw that Knowledge Mining lets you extract information from structured and unstructured data stored in different sources. To do this, the solution uses a range of pre-trained and custom AI services.

In a nutshell, Knowledge Mining works by orchestrating the entire enrichment pipeline. There are three main phases to the process. We call them:

  1. Ingestion
  2. Enrichment
  3. Exploration & Analysis

Here is a visualization of this structure:


Knowledge Mining orchestration steps (source: Microsoft)

What does each of these phases do? Let’s go over them one by one.

Ingestion

Ingestion is the first phase of Knowledge Mining. Here, both structured and unstructured data is processed.

What is the difference between these two types of data?

Structured data has a defined data model, and it typically resides in a relational database like Azure SQL. Unstructured data, on the other hand, does not have a predefined data model. It can come from sources such as NoSQL databases or file stores. File types include PDFs, images, Word documents, PowerPoint presentations, and more.

What about data ingestion? It is a process in which raw data, structured or unstructured, coming from various sources, is aggregated into a persistent, centralized data store.

At the end of the ingestion phase, documents are cracked. In other words, the solution extracts or creates text content from non-text sources. In this process, optical character recognition (OCR) is very useful. It proves especially helpful when we want to extract data from images or PDF files.

Enrichment

Enrichment is the second phase of Knowledge Mining, and this is where AI comes in. In this process, the solution identifies patterns, obtains information, and gains understanding from texts coming from images, PDF files, and other unstructured data sources.

Knowledge Mining performs enrichment on individual documents as a sequence of calls to AI models. Interestingly, you are not limited to the cloud-based Azure AI services: you can also build your own custom models and use them in this phase.

Exploration & Analysis

In the final stage, Knowledge Mining exposes the newly enriched, structured documents. They are now ready for exploration and analysis.

During exploration, the solution reviews the added enrichment to learn more about the collected data. The results are then available via search indexes or end-user and line-of-business applications, such as customer relationship management (CRM) or enterprise resource planning (ERP) systems.

After exploration, it’s time for analysis. Typically, this process involves applying analytics tools, such as Power BI. They serve to explore and gain a deeper understanding of the gathered information.
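To make the three phases more concrete, here is a toy sketch in plain Python. It is only an in-memory illustration of the ingestion, enrichment, and exploration flow, not how Azure implements it. The sample documents, the function names, and the naive keyword "enrichment" are all assumptions made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical raw inputs standing in for already-cracked documents.
# In a real solution this text would come from OCR or file parsers.
RAW_SOURCES = {
    "manual.pdf": "Replace the hydraulic pump after 500 flight hours.",
    "note.docx": "Inspect the fuel pump seal during every service.",
}


def ingest(sources):
    """Ingestion: gather raw text from every source into one central store."""
    return [{"id": name, "content": text} for name, text in sources.items()]


def enrich(document):
    """Enrichment: attach extra structure (here, naive lowercase keywords)."""
    words = re.findall(r"[a-z]+", document["content"].lower())
    keywords = sorted({word for word in words if len(word) > 3})
    return {**document, "keywords": keywords}


def build_index(documents):
    """Exploration: build an inverted index mapping keyword -> document ids."""
    index = defaultdict(set)
    for doc in documents:
        for keyword in doc["keywords"]:
            index[keyword].add(doc["id"])
    return index


enriched = [enrich(doc) for doc in ingest(RAW_SOURCES)]
index = build_index(enriched)
print(sorted(index["pump"]))  # ids of all documents that mention "pump"
```

In a real solution, the enrichment step would call AI models instead of a regular expression, and the index would live in a search service rather than in a dictionary.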


The building blocks of an Azure-based Knowledge Mining solution

Let’s talk about the role the Azure cloud plays in building an intelligent search solution.

Microsoft Azure provides a number of useful services which make the solution work smoothly and effectively. Here are some examples.

Azure Cognitive Search

Azure Cognitive Search is a search-as-a-service solution. It gives developers the tools to provide a rich search experience.

Let me give you an example. Imagine a mobile app that you use to shop. You are looking for a specific product, so you go straight to the search box. You get a list of similar products, so you decide to filter it by price. Still, the list seems endless, and your search experience leaves a lot to be desired.

You know very well that implementing your own search engine can be time-consuming. In such cases, Azure Cognitive Search speeds up the whole process and increases the quality of the search experience.
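As a rough illustration of the shopping scenario, the sketch below builds a full-text query with a price filter for the Azure Cognitive Search REST API. The service name, index name, and the `price` field are hypothetical, and the API version is an assumption. The code only prepares the URL and JSON body; it does not send anything.

```python
import json


def build_search_request(service, index, text, max_price,
                         api_version="2020-06-30"):
    """Return the URL and JSON body for a 'search documents' POST call."""
    url = (
        f"https://{service}.search.windows.net"
        f"/indexes/{index}/docs/search?api-version={api_version}"
    )
    body = {
        "search": text,                      # full-text query
        "filter": f"price le {max_price}",   # OData filter on a hypothetical field
        "orderby": "price asc",              # cheapest products first
        "top": 10,                           # limit the result list
    }
    return url, body


# "my-search-service" and "products" are placeholder names.
url, body = build_search_request("my-search-service", "products",
                                 "running shoes", 100)
print(url)
print(json.dumps(body, indent=2))
```

To actually run the query, you would POST this body to the URL with an `api-key` header that holds a query key for your service.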

In general, there are two basic approaches that Azure Cognitive Search uses to ingest data and populate an index. We call them pull data and push data. Dawid has already touched upon them in the previous post. Let me just remind you briefly how they work.


Pull Data

In this case, Azure Cognitive Search itself pulls data into the index from supported Azure data sources.
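In Azure Cognitive Search, the pull model is configured through a data source and an indexer. The sketch below shows what the two JSON definitions might look like for a blob container. All names, the placeholder connection string, and the two-hour schedule are assumptions chosen for illustration.

```python
import json


def blob_data_source(name, container, connection_string):
    """Definition of a data source that points at an Azure Blob container."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


def indexer(name, data_source_name, index_name, interval="PT2H"):
    """Definition of an indexer that pulls from the data source on a schedule."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "schedule": {"interval": interval},  # ISO 8601 duration
    }


# Placeholder values; a real connection string should come from secure config.
data_source = blob_data_source("docs-blob", "documents",
                               "<storage-connection-string>")
indexer_def = indexer("docs-indexer", data_source["name"], "docs-index")
print(json.dumps([data_source, indexer_def], indent=2))
```

Both definitions would then be sent to the search service, for example through its REST API, after which the indexer runs on its own schedule.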

Push Data

The second approach is quite different. The push model relies on custom applications to send documents directly into a search index. This is done programmatically. Applications can use either the Azure Cognitive Search REST API or the Azure Search SDK for .NET to send data into the index.
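Here is a minimal sketch of the push model using the REST API and only the Python standard library. The service name, index name, API key, and document fields are hypothetical, and the API version is an assumption. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_push_request(service, index, api_key, documents,
                       api_version="2020-06-30"):
    """Build (but do not send) a request that pushes documents to an index."""
    url = (
        f"https://{service}.search.windows.net"
        f"/indexes/{index}/docs/index?api-version={api_version}"
    )
    # Each document carries an action; mergeOrUpload inserts or updates by key.
    payload = {
        "value": [{"@search.action": "mergeOrUpload", **doc}
                  for doc in documents]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Placeholder service, index, key, and document fields.
request = build_push_request(
    "my-search-service", "docs-index", "<admin-api-key>",
    [{"id": "1", "title": "Pump maintenance manual"}],
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload. In production code you would also handle HTTP errors and send documents in batches.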


Azure Cognitive Services

When you use Azure Cognitive Search to build a Knowledge Mining solution, you can integrate a wide range of pre-trained services available in the Azure cloud. We call them Microsoft Cognitive Services. They can, for example, scan PDF files using OCR and extract the relevant content.

Here are some examples of pre-defined Azure Cognitive Services that a Knowledge Mining solution can use during the enrichment phase:

  • Face API – detects faces in images
  • Computer Vision API – scans PDF files with OCR
  • Form Recognizer API – extracts data from documents
  • Text Analytics API – analyzes the sentiment of a text
  • Translator Text API – translates document content

To see how these services work, you can check out a number of demos on the Microsoft website.

For example, you can try the Face API for yourself: you simply submit an image and the service detects the faces in it.

If you would like to check how the Form Recognizer API extracts data from documents, a useful visualization is also available there.

The Azure Cognitive Search architecture is extensible, so it allows you to assemble an enrichment pipeline from both predefined and custom cognitive skills.
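As an illustration, a skillset that combines one predefined skill with one custom skill could be defined roughly as follows. This is a sketch rather than a complete definition: the skillset name, the custom skill URL, and the `ocrText` and `partNumbers` outputs are hypothetical.

```python
import json

skillset = {
    "name": "knowledge-mining-skillset",  # hypothetical name
    "description": "OCR followed by a custom extraction step",
    "skills": [
        {
            # Predefined skill: OCR over images extracted from documents.
            # It assumes the indexer is configured to generate normalized images.
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [
                {"name": "image", "source": "/document/normalized_images/*"}
            ],
            "outputs": [{"name": "text", "targetName": "ocrText"}],
        },
        {
            # Custom skill: any HTTP endpoint that follows the skill contract.
            # The URL and the output name below are placeholders.
            "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
            "uri": "https://example.azurewebsites.net/api/extract-parts",
            "httpMethod": "POST",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "partNumbers", "targetName": "partNumbers"}],
        },
    ],
}

print(json.dumps(skillset, indent=2))
```

The skills run as part of an indexer, and each skill reads from and writes to paths in the enriched document tree, such as `/document/content`.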

The custom skills I’ve just mentioned may prove useful for many different organizations, as they provide a way to insert transformations that are unique to your content.

A custom skill executes independently. It applies any enrichment step we require. A good example would be data extraction from Word document tables.
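A custom skill is essentially a web endpoint that receives a batch of records and returns one result per record, matched by `recordId`. The sketch below shows the core logic as a plain Python function. The part-number pattern and the `text` and `partNumbers` field names are assumptions for illustration, and a simple regular expression stands in for real table extraction.

```python
import re

# Hypothetical part-number format, e.g. "HP-1234".
PART_NUMBER = re.compile(r"\b[A-Z]{2}-\d{4}\b")


def run_custom_skill(request_body):
    """Process a custom skill request and return a response of the same shape."""
    results = []
    for record in request_body.get("values", []):
        output = {
            "recordId": record["recordId"],  # must echo the incoming id
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            output["data"]["partNumbers"] = PART_NUMBER.findall(text)
        else:
            output["errors"].append({"message": "Missing 'text' input."})
        results.append(output)
    return {"values": results}


sample_request = {
    "values": [
        {"recordId": "1",
         "data": {"text": "Replace pump HP-1234 and seal SL-0042."}},
        {"recordId": "2", "data": {}},
    ]
}
response = run_custom_skill(sample_request)
print(response["values"][0]["data"])
```

Hosted behind an HTTP endpoint, this function would parse the request JSON, call `run_custom_skill`, and return the result as the JSON response.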

Azure Functions

Should you decide to apply custom skills for the AI enrichment phase, Azure Functions are ideal for implementing them.

In the enrichment process, the solution can call Azure Functions, which in turn call other Cognitive Services, for example Form Recognizer. This service will analyze document content, and the solution will then pass the results to Azure Cognitive Search.

Below, you’ll find a sample architecture which presents custom skills in use.


Intelligent search using Azure Functions

The image shows an Azure Function that calls the Form Recognizer API to perform two tasks: first, analyzing form document content, and then inserting the results into the Azure Cognitive Search index.

Time for a real-life scenario!

Together with my team at Predica, we delivered a Knowledge Mining solution to one of our clients. It was quite a big project related to extracting content from aircraft technical documentation to help users respond to technical queries and service requests.

Our goal was to develop a fast and effective tool that would allow users to quickly find potential answers and solutions through a web app user interface.

How did we do this?

As we already had some positive experience with Knowledge Mining and Azure services, the answer was clear from the very start. We used the services I described earlier to create an intelligent search service.

First, we implemented Azure Cognitive Search, combined with Azure Cognitive Services. To meet the project requirements, we also added Form Recognizer, which would locate different parts of the forms.

Using Azure Functions, we developed custom skills. We also used Azure Cosmos DB to keep the additional configuration values.

Finally, all source files were uploaded to Azure Blob Storage, which provides a single source of data for the search engine.

Below you’ll find the architecture diagram.


An example Knowledge Mining solution architecture

If you would like to find out more about the solution we implemented, then keep an eye on our blog. Soon, there will be a dedicated article where I will provide more details.

Improve your users’ search experience with the Azure cloud

Creating a Knowledge Mining solution certainly poses some challenges. It may be especially tricky if there are a lot of documents in many different formats.

This is why I recommend using the Azure cloud when implementing the solution. With the help of Azure Cognitive Search and other Microsoft Cognitive Services, you can build the required service faster and more efficiently.

Begin with the three process stages I outlined at the beginning of this article. Then, add custom skills and required cloud services to process and analyze your data. With a single solution, you can fight off inefficient work practices for good.


Key takeaways

  1. Knowledge Mining encompasses the complete data enrichment pipeline. It consists of three main phases: ingestion, enrichment, and exploration & analysis.
  2. Azure Cognitive Search is a search-as-a-service cloud solution that gives developers the tools to provide a richer search experience.
  3. There is a wide range of pre-trained services in the Azure cloud. We call them Microsoft Cognitive Services. They can be used to extract data and insights from many different types of documents.
