How to boost documents with a scoring profile in Azure Search

Have you ever wanted to order the documents in your search results so that the most valuable ones reach your users first?

Whether you have thought of it or not 😊, scoring profiles make this possible. We will see how to implement them later in the article.

Here, we will learn how to boost documents using weighted fields, along with a few key notes to keep in mind while working with scoring profiles.

Familiarity with the basics of Azure Search is a prerequisite for this blog post. If you are not familiar with Azure Search, please refer to this Microsoft post to learn more about it.

How are scores computed?

When searching, the Azure Search engine calculates a search score for each matching document and ranks the documents in descending order of score.

Azure Cognitive Search uses a default scoring algorithm to compute an initial score, but you can change the calculation through a scoring profile. Refer to the Cognitive Search article to learn more about the relevance score.

The overall score for each document is an aggregation of the individual scores for each field, where the individual score of each field is computed from the term frequency and document frequency of the searched terms within that field (known as TF-IDF, or term frequency-inverse document frequency).
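To make the aggregation concrete, here is a minimal sketch that scores a toy corpus with a textbook TF-IDF formula and sums the per-field scores. The corpus, the function names, and the exact formula are assumptions for illustration. The real engine applies its own similarity function and normalization, so actual scores will differ.

```python
import math

# Toy corpus (hypothetical data): each document maps field name -> text.
docs = [
    {"EventName": "Delhi vs Mumbai", "Venue": "Delhi"},
    {"EventName": "Kolkatta vs Delhi", "Venue": "Mumbai"},
    {"EventName": "Banglore vs Chennai", "Venue": "Delhi"},
]

def field_score(term, field, doc, corpus):
    """Textbook TF-IDF of `term` within one field of one document."""
    tokens = doc[field].lower().split()
    tf = tokens.count(term)
    # Document frequency: number of documents containing the term in this field.
    df = sum(1 for d in corpus if term in d[field].lower().split())
    if tf == 0 or df == 0:
        return 0.0
    idf = math.log(1 + len(corpus) / df)
    return tf * idf

def document_score(term, doc, corpus):
    """Overall score: the sum of the per-field scores."""
    return sum(field_score(term, field, doc, corpus) for field in doc)

scores = [document_score("mumbai", d, docs) for d in docs]
for doc, score in zip(docs, scores):
    print(doc["EventName"], round(score, 4))
```

In this sketch a match in EventName and a match in Venue count equally, which is the behavior a scoring profile lets you change.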

What does boost mean in Azure Search?

In everyday usage, a boost is a push from below. It means the same thing in Azure Search, where a scoring profile can push documents up in the result set.

What are scoring profiles?

A scoring profile is a way to define how documents are ranked, based on criteria you specify. It is part of the search index definition and is composed of weighted fields, functions, and parameters.

The intention of a scoring profile is to boost matching documents based on the criteria you provide.

What are weighted fields?

A weighted field is a field (property) of an index that has a numeric value assigned to it as a weight. When we assign a weight to a field, the default score of that field is multiplied by the assigned weight.

This simply means that if the default field score of EventName is 3 and its weight is 2, as in the profile below, the boosted score for that field becomes 6, which contributes to the overall score of the document.

"scoringProfiles": [  
{  
  "name": "boostKeywords",  
  "text": {  
    "weights": {  
      "EventName": 2
    }  
  }  
}]
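The arithmetic behind the profile above can be sketched in a few lines. The per-field scores are hypothetical, and summing the weighted field scores is a simplification of how the engine aggregates them.

```python
def boosted_document_score(field_scores, weights):
    """Multiply each field's score by its weight (default 1) and sum the results."""
    return sum(score * weights.get(field, 1) for field, score in field_scores.items())

# Hypothetical per-field scores for a single document.
field_scores = {"EventName": 3.0, "EventDesc": 1.5}
weights = {"EventName": 2}  # mirrors the "boostKeywords" profile above

default_total = boosted_document_score(field_scores, {})
boosted_total = boosted_document_score(field_scores, weights)
print(default_total, boosted_total)
```

Only the EventName contribution doubles (3 becomes 6), while EventDesc keeps its default score, so the document total moves from 4.5 to 7.5.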

Sample data populated in the Azure Search index

Let's say we have an index, eventsearchindex, that stores events with the properties listed below.

 public class EventSearchIndexItem
 {
     public string Id { get; set; }
     public string EventName { get; set; }
     public string EventDesc { get; set; }
     public string Venue { get; set; }
     public DateTimeOffset EventDate { get; set; }
 }

Query to get a count of documents in the index

The eventsearchindex has been populated with a total of 6 documents, with the following details:

"Output": 
{
    "@odata.count": 6,
    "value": [
        {
            "@search.score": 1,
            "id": "3",
            "EventName": "Mumbai vs Chennai",
            "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
            "EventDate": "2022-04-03T13:56:45.556Z",
            "Venue": "Chennai"
        },
        {
            "@search.score": 1,
            "id": "2",
            "EventName": "Banglore vs Chennai",
            "EventDesc": "Another great cricket game, where both teams are preferred.",
            "EventDate": "2022-04-02T13:56:45.556Z",
            "Venue": "Delhi"
        },
        {
            "@search.score": 1,
            "id": "4",
            "EventName": "Mumbai vs Kolkatta",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
            "EventDate": "2022-04-05T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 1,
            "id": "5",
            "EventName": "Kolkatta vs Delhi",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
            "EventDate": "2022-04-06T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 1,
            "id": "6",
            "EventName": "Chennai vs Hydrebad",
            "EventDesc": "Both teams look good in their latest appearance against Mumbai.  Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
            "EventDate": "2022-04-04T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 1,
            "id": "1",
            "EventName": "Delhi vs Mumbai",
            "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
            "EventDate": "2022-04-01T13:56:45.556Z",
            "Venue": "Delhi"
        }
    ]
}

How do I add a scoring profile?

We can add a scoring profile with the help of the Azure Search Client Library. The code snippet below shows how to instantiate a scoring object and specify the weight of the fields while creating the index. Refer to the "create an index" article to learn more about search indexes.

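// Create a scoring profile named "eventkeywordprofile" over the searchable event fields.
var sc = new Scoring("eventkeywordprofile", SearchableEvent.GetSearchableEventFields())
{
    // Combine the results of scoring functions by summing them.
    FunctionAggregation = FunctionAggregationTypes.Sum
};
// Multiply the score of matches in EventName by 5.
sc.Text.Weights["EventName"] = 5;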
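For readers who work with the index definition as JSON rather than through the client library, the same profile can be sketched as follows. It follows the shape of the scoringProfiles fragment shown earlier. The helper name scoring_profile is hypothetical, and a full index definition would also need its fields collection, which is omitted here.

```python
import json

def scoring_profile(name, weights):
    """Build one scoring profile entry in the shape used by the index definition."""
    return {"name": name, "text": {"weights": dict(weights)}}

# Partial index definition: only the parts relevant to scoring are shown.
index_fragment = {
    "name": "eventsearchindex",
    "scoringProfiles": [scoring_profile("eventkeywordprofile", {"EventName": 5})],
}

print(json.dumps(index_fragment, indent=2))
```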

Alternatively, if you would like to update an existing search index, you can add a scoring profile directly in the Azure portal, which updates the index definition for you.

Adding a scoring profile in the Azure portal

How do I boost results based on specific fields?

The most common way to boost results is probably by matching specific keywords in specific fields. For instance, if the end user searches for events with a specific keyword, documents whose EventName field contains that keyword should probably be boosted compared to events where only the EventDesc or Venue contains it.

Here's an example without a scoring profile

If we search for mumbai, we expect the documents with mumbai in EventName to be at the top of the result set.

Here, we see that the 4th and 5th documents contain the mumbai keyword in EventName but still rank below the 2nd and 3rd documents, which do not have mumbai in EventName.

That is because Azure Search has applied the default algorithm, found mumbai in EventDesc, Venue, or both, and scored the documents accordingly. But this is not the result set we expected. To overcome this, scoring profiles come into the picture and help boost the documents according to your needs.

Query to get documents having mumbai keyword

"Output": 
{
    "@odata.count": 5,
    "value": [
        {
            "@search.score": 1.3331333,
            "id": "4",
            "EventName": "Mumbai vs Kolkatta",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
            "EventDate": "2022-04-05T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 1.1071612,
            "id": "5",
            "EventName": "Kolkatta vs Delhi",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
            "EventDate": "2022-04-06T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 0.650463,
            "id": "6",
            "EventName": "Chennai vs Hydrebad",
            "EventDesc": "Both teams look good in their latest appearance against Mumbai.  Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
            "EventDate": "2022-04-04T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 0.5063205,
            "id": "1",
            "EventName": "Delhi vs Mumbai",
            "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
            "EventDate": "2022-04-01T13:56:45.556Z",
            "Venue": "Delhi"
        },
        {
            "@search.score": 0.25316024,
            "id": "3",
            "EventName": "Mumbai vs Chennai",
            "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
            "EventDate": "2022-04-03T13:56:45.556Z",
            "Venue": "Chennai"
        }
    ]
}

Here's an example with a scoring profile

Now, if we pass the defined scoring profile in the query, the documents that have the mumbai keyword in EventName are boosted compared to those where the keyword is present only in EventDesc or Venue.
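As a rough sketch, a query like this could be assembled for the Search Documents REST endpoint as follows. The service name and API version are placeholders, and the profile name assumes the eventkeywordprofile defined earlier. The sketch only builds the request. Sending it would also require an api-key header for your service.

```python
import json

# Hypothetical service name and API version; substitute your own values.
SERVICE = "my-search-service"
INDEX = "eventsearchindex"
API_VERSION = "2020-06-30"

def build_search_request(term, scoring_profile=None):
    """Return (url, body) for a POST to the Search Documents endpoint."""
    url = (f"https://{SERVICE}.search.windows.net/indexes/{INDEX}"
           f"/docs/search?api-version={API_VERSION}")
    body = {"search": term, "count": True}
    if scoring_profile:
        # Name of the scoring profile to apply to this query.
        body["scoringProfile"] = scoring_profile
    return url, body

url, body = build_search_request("mumbai", "eventkeywordprofile")
print(url)
print(json.dumps(body))
```

Leaving scoring_profile unset produces the unboosted query from the previous example.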

"Output": 
{
    "@odata.count": 5,
    "value": [
        {
            "@search.score": 4.785652,
            "id": "4",
            "EventName": "Mumbai vs Kolkatta",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
            "EventDate": "2022-04-05T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 1.5189614,
            "id": "1",
            "EventName": "Delhi vs Mumbai",
            "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
            "EventDate": "2022-04-01T13:56:45.556Z",
            "Venue": "Delhi"
        },
        {
            "@search.score": 1.2658012,
            "id": "3",
            "EventName": "Mumbai vs Chennai",
            "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
            "EventDate": "2022-04-03T13:56:45.556Z",
            "Venue": "Chennai"
        },
        {
            "@search.score": 1.1071612,
            "id": "5",
            "EventName": "Kolkatta vs Delhi",
            "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
            "EventDate": "2022-04-06T13:56:45.556Z",
            "Venue": "Mumbai"
        },
        {
            "@search.score": 0.650463,
            "id": "6",
            "EventName": "Chennai vs Hydrebad",
            "EventDesc": "Both teams look good in their latest appearance against Mumbai.  Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
            "EventDate": "2022-04-04T13:56:45.556Z",
            "Venue": "Mumbai"
        }
    ]
}
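The two outputs can be cross-checked with a little arithmetic. The sketch below assumes the boosted query used a weight of 5 on EventName, as in the profile created earlier, and that the boosted score equals the default score plus (weight - 1) times the EventName contribution. Under those assumptions it infers how much of each default score came from EventName.

```python
WEIGHT = 5  # assumed weight on EventName for the boosted query

# (default score, boosted score) per document id, copied from the two outputs.
scores = {
    "4": (1.3331333, 4.785652),
    "1": (0.5063205, 1.5189614),
    "3": (0.25316024, 1.2658012),
    "5": (1.1071612, 1.1071612),
    "6": (0.650463, 0.650463),
}

def event_name_share(default, boosted, weight=WEIGHT):
    """Infer the EventName part of the default score.

    Assumes boosted = default + (weight - 1) * event_name_part.
    """
    return (boosted - default) / (weight - 1)

for doc_id, (default, boosted) in scores.items():
    share = event_name_share(default, boosted)
    print(doc_id, round(share, 4), round(share / default, 2))
```

For document 3 the inferred EventName share equals its entire default score, which fits a document that matches mumbai only in EventName. Documents 5 and 6 have a share of zero, which is why their scores did not move.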

Key Takeaways

Scoring profiles are optional. Create one or more scoring profiles when the default ranking behavior doesn't go far enough in meeting your business objectives.
You can define up to 100 scoring profiles per index, but a query can use only one profile at a time.
You can mark one of the scoring profiles as the default profile, so that you don't need to pass it in every query.
You can define more than one field in a scoring profile, with the same or different weights.
If no search keyword is passed in the query, the default score will be 1 for each document.
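A few of these points can be expressed as a small sanity check over an index definition held as a dictionary. This is only a sketch: the function name and the Venue weight are hypothetical, defaultScoringProfile is the index property that names the default profile, and the service performs its own validation regardless.

```python
MAX_PROFILES = 100  # limit quoted in the takeaways above

def check_scoring_profiles(index_definition):
    """Return a list of problems found in the scoring part of an index definition."""
    problems = []
    profiles = index_definition.get("scoringProfiles", [])
    names = [p["name"] for p in profiles]
    if len(profiles) > MAX_PROFILES:
        problems.append(f"too many scoring profiles: {len(profiles)}")
    if len(set(names)) != len(names):
        problems.append("duplicate scoring profile names")
    default = index_definition.get("defaultScoringProfile")
    if default is not None and default not in names:
        problems.append(f"default profile {default!r} is not defined")
    return problems

index_definition = {
    "name": "eventsearchindex",
    # Applied automatically when a query names no profile.
    "defaultScoringProfile": "eventkeywordprofile",
    "scoringProfiles": [
        # Two fields with different weights in a single profile.
        {"name": "eventkeywordprofile",
         "text": {"weights": {"EventName": 5, "Venue": 2}}},
    ],
}
problems = check_scoring_profiles(index_definition)
print(problems)
```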

I hope you enjoyed reading this post and learned something new about scoring profiles. In our next article in this series, we'll look at how the freshness function helps us to boost the documents.
