Have you ever wanted to organize the documents in your search results so that they are more valuable to your users?
Whether you have thought of it or not 😊, it is possible by using scoring profiles to get the desired ordering. We will see how to implement them later in the article.
Here, we will learn how to boost documents using weighted fields, along with a few key notes to keep in mind while working with scoring profiles.
The basics of Azure Search are a prerequisite for this blog post. If you are not familiar with Azure Search, please refer to this Microsoft post to learn more about it.
How are scores computed?
When searching, the Azure Search engine calculates a search score for each matching document and ranks documents in descending order by their score.
Azure Cognitive Search uses a default scoring algorithm to compute an initial score, but you can change the calculation through a scoring profile. Refer to the Cognitive Search article to learn more about the relevance score.
The overall score for each document is an aggregation of the individual scores for each field, where the individual score of each field is computed from the term frequency and document frequency of the searched terms within that field (known as TF-IDF, or term frequency-inverse document frequency).
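To make the aggregation concrete, here is a minimal Python sketch of a toy TF-IDF score that is computed per field and then summed per document. The function names, the tiny corpus, and the exact formula are assumptions for illustration only; the real Azure Search algorithm is more involved.

```python
import math
from collections import Counter


def field_score(term, field_text, all_field_texts):
    """Toy TF-IDF score of one term in one field (not Azure's exact formula)."""
    tokens = field_text.lower().split()
    tf = Counter(tokens)[term]
    # Number of documents whose field contains the term at least once.
    df = sum(1 for text in all_field_texts if term in text.lower().split())
    if tf == 0 or df == 0:
        return 0.0
    idf = math.log(1 + len(all_field_texts) / df)
    return tf * idf


def document_score(term, doc, corpus, fields):
    """Overall score: the sum of the per-field scores."""
    return sum(
        field_score(term, doc[f], [d[f] for d in corpus])
        for f in fields
    )


# A reduced, hypothetical corpus with only two searchable fields.
corpus = [
    {"EventName": "Delhi vs Mumbai", "Venue": "Delhi"},
    {"EventName": "Mumbai vs Kolkatta", "Venue": "Mumbai"},
    {"EventName": "Banglore vs Chennai", "Venue": "Delhi"},
]
scores = [
    document_score("mumbai", d, corpus, ["EventName", "Venue"])
    for d in corpus
]
for doc, score in zip(corpus, scores):
    print(doc["EventName"], round(score, 4))
```

In this sketch, the document that matches in both fields ends up with the highest score, and the document with no match scores zero.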
What does boost mean in Azure Search?
Boost is commonly defined as a push from below. This is also what boost means in Azure Search, where documents can be pushed up in the result set using a scoring profile.
What are scoring profiles?
A scoring profile is a way for you to set up how documents are ranked, based on defined criteria. It is part of the search index definition and is composed of weighted fields, functions, and parameters.
The intention of a scoring profile is to boost matching documents based on the criteria you provide.
What are weighted fields?
A weighted field is simply a field (property) of an index that has a positive numeric value assigned to it as a weight. When we assign a weight to a field, the default score of that field is multiplied by the explicitly assigned weight.
This simply means that if the default field score of EventName is 3 and its weight is 2, the boosted score for that field becomes 6, which contributes to the overall score of the document.
"scoringProfiles": [
  {
    "name": "boostKeywords",
    "text": {
      "weights": {
        "EventName": 2
      }
    }
  }
]
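The multiplication described above can be sketched in a few lines of Python. This is only an illustration: the `apply_weights` helper and the per-field scores are hypothetical, and fields without an explicit weight are assumed to keep a weight of 1.

```python
def apply_weights(field_scores, weights):
    """Multiply each field's default score by its weight (assumed default: 1)."""
    return {
        field: score * weights.get(field, 1)
        for field, score in field_scores.items()
    }


# Same shape as the scoring profile definition shown above.
profile = {"name": "boostKeywords", "text": {"weights": {"EventName": 2}}}

# Hypothetical default per-field scores for a single document.
default_scores = {"EventName": 3, "EventDesc": 1}

boosted = apply_weights(default_scores, profile["text"]["weights"])
overall = sum(boosted.values())
print(boosted, overall)
```

Here the EventName score goes from 3 to 6, while the unweighted EventDesc score is left as it is.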
Sample data populated in the Azure Search index
Let's say we have an index, eventsearchindex, for storing events, which contains the properties shown below.
using System;

// Shape of a document stored in eventsearchindex.
public class EventSearchIndexItem
{
    public string Id { get; set; }
    public string EventName { get; set; }
    public string EventDesc { get; set; }
    public string Venue { get; set; }
    public DateTimeOffset EventDate { get; set; }
}
A total of 6 documents have been populated in eventsearchindex, with the following details:
"Output":
{
  "@odata.count": 6,
  "value": [
    {
      "@search.score": 1,
      "id": "3",
      "EventName": "Mumbai vs Chennai",
      "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
      "EventDate": "2022-04-03T13:56:45.556Z",
      "Venue": "Chennai"
    },
    {
      "@search.score": 1,
      "id": "2",
      "EventName": "Banglore vs Chennai",
      "EventDesc": "Another great cricket game, where both teams are preferred.",
      "EventDate": "2022-04-02T13:56:45.556Z",
      "Venue": "Delhi"
    },
    {
      "@search.score": 1,
      "id": "4",
      "EventName": "Mumbai vs Kolkatta",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
      "EventDate": "2022-04-05T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 1,
      "id": "5",
      "EventName": "Kolkatta vs Delhi",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
      "EventDate": "2022-04-06T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 1,
      "id": "6",
      "EventName": "Chennai vs Hydrebad",
      "EventDesc": "Both teams look good in their latest appearance against Mumbai. Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
      "EventDate": "2022-04-04T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 1,
      "id": "1",
      "EventName": "Delhi vs Mumbai",
      "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
      "EventDate": "2022-04-01T13:56:45.556Z",
      "Venue": "Delhi"
    }
  ]
}
How do I add a scoring profile?
We can add a scoring profile with the help of the Azure Search client library. The code snippet below shows how to instantiate a scoring profile object and specify the weight of the fields while creating the index. Refer to the create an index article to learn more about search indexes.
var sc = new Scoring("eventkeywordprofile", SearchableEvent.GetSearchableEventFields())
{
    FunctionAggregation = FunctionAggregationTypes.Sum
};
// Boost matches in EventName by a factor of 5.
sc.Text.Weights["EventName"] = 5;
Or, if you would like to update an existing search index, you can add a scoring profile directly in the Azure portal, which updates the index definition for you.
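If you prefer to update the index definition yourself, the change amounts to adding an entry to the `scoringProfiles` array of the index JSON. The Python sketch below only builds that JSON and does not call the service. The helper name, the placeholder service name, the reduced field list, and the `2020-06-30` API version are assumptions for illustration; check them against your own index before sending anything.

```python
import json


def with_scoring_profile(index_definition, profile, make_default=False):
    """Return a copy of an index definition with `profile` added or replaced."""
    updated = dict(index_definition)
    # Drop any existing profile with the same name, then append the new one.
    profiles = [
        p for p in updated.get("scoringProfiles", [])
        if p["name"] != profile["name"]
    ]
    profiles.append(profile)
    updated["scoringProfiles"] = profiles
    if make_default:
        updated["defaultScoringProfile"] = profile["name"]
    return updated


# Reduced, hypothetical index definition (a real one lists every field).
index_definition = {
    "name": "eventsearchindex",
    "fields": [
        {"name": "Id", "type": "Edm.String", "key": True},
        {"name": "EventName", "type": "Edm.String", "searchable": True},
    ],
}
profile = {
    "name": "eventkeywordprofile",
    "text": {"weights": {"EventName": 5}},
    "functionAggregation": "sum",
}

updated = with_scoring_profile(index_definition, profile, make_default=True)

# Placeholder service name; the body would be sent with an HTTP PUT.
url = (
    "https://<your-service>.search.windows.net/indexes/"
    + updated["name"]
    + "?api-version=2020-06-30"
)
print(url)
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary instead of modifying the original, so the definition you started from stays available for comparison.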
How do I boost results based on specific fields?
The most common way you'd probably boost results is by matching specific keywords in specific fields. For instance, if the end user is searching for events with a specific keyword, documents whose EventName field contains that keyword should probably be boosted compared to events where only the EventDesc or Venue contains that keyword.
Here's an example without a scoring profile
If we search for mumbai, we expect the documents with mumbai in EventName to be at the top of the result set.
Here, we see that the 4th and 5th results (ids 1 and 3) contain the mumbai keyword in EventName but still rank below the 2nd and 3rd results (ids 5 and 6), which do not have mumbai in EventName.
That is because Azure Search has applied the default algorithm, found mumbai in EventDesc, Venue, or both, and scored the documents accordingly. But this is not the result set we expected. To overcome this, scoring profiles come into the picture, which help to boost documents according to your needs.
"Output":
{
  "@odata.count": 5,
  "value": [
    {
      "@search.score": 1.3331333,
      "id": "4",
      "EventName": "Mumbai vs Kolkatta",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
      "EventDate": "2022-04-05T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 1.1071612,
      "id": "5",
      "EventName": "Kolkatta vs Delhi",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
      "EventDate": "2022-04-06T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 0.650463,
      "id": "6",
      "EventName": "Chennai vs Hydrebad",
      "EventDesc": "Both teams look good in their latest appearance against Mumbai. Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
      "EventDate": "2022-04-04T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 0.5063205,
      "id": "1",
      "EventName": "Delhi vs Mumbai",
      "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
      "EventDate": "2022-04-01T13:56:45.556Z",
      "Venue": "Delhi"
    },
    {
      "@search.score": 0.25316024,
      "id": "3",
      "EventName": "Mumbai vs Chennai",
      "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
      "EventDate": "2022-04-03T13:56:45.556Z",
      "Venue": "Chennai"
    }
  ]
}
Here's an example with a scoring profile
Now, if we pass the defined scoring profile in the query, the documents that have the mumbai keyword in EventName are boosted compared to those where the keyword is present only in EventDesc and Venue.
"Output":
{
  "@odata.count": 5,
  "value": [
    {
      "@search.score": 4.785652,
      "id": "4",
      "EventName": "Mumbai vs Kolkatta",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape.",
      "EventDate": "2022-04-05T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 1.5189614,
      "id": "1",
      "EventName": "Delhi vs Mumbai",
      "EventDesc": "Another exciting game of cricket where mumbai seems to be a favorite.",
      "EventDate": "2022-04-01T13:56:45.556Z",
      "Venue": "Delhi"
    },
    {
      "@search.score": 1.2658012,
      "id": "3",
      "EventName": "Mumbai vs Chennai",
      "EventDesc": "Another exciting cricketing game where Chennai has already lost 2 games in the season.",
      "EventDate": "2022-04-03T13:56:45.556Z",
      "Venue": "Chennai"
    },
    {
      "@search.score": 1.1071612,
      "id": "5",
      "EventName": "Kolkatta vs Delhi",
      "EventDesc": "Another thriller arrives on your way, the two teams are in great shape, but Delhi seems to be favored as they won the last game against a heavy line up Mumbai.",
      "EventDate": "2022-04-06T13:56:45.556Z",
      "Venue": "Mumbai"
    },
    {
      "@search.score": 0.650463,
      "id": "6",
      "EventName": "Chennai vs Hydrebad",
      "EventDesc": "Both teams look good in their latest appearance against Mumbai. Chennai showed great character against Mumbai in Delhi, but Hyderabad seems to be much preferred with the deep strike line up.",
      "EventDate": "2022-04-04T13:56:45.556Z",
      "Venue": "Mumbai"
    }
  ]
}
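We can sanity-check these numbers against the weight of 5 that was assigned to EventName when the profile was created. The short Python sketch below redoes the arithmetic for ids 3 and 1 using the scores from the two outputs. For id 1, the even split of the unboosted score between EventName and EventDesc is an assumption, since the service does not report per-field scores here.

```python
import math

EVENT_NAME_WEIGHT = 5  # weight assigned to EventName in the scoring profile

# id 3 ("Mumbai vs Chennai"): only EventName contains "mumbai",
# so the whole unboosted score is multiplied by the weight.
unboosted_id3 = 0.25316024
boosted_id3 = unboosted_id3 * EVENT_NAME_WEIGHT

# id 1 ("Delhi vs Mumbai"): EventName and EventDesc both match.
# Assumption: the unboosted 0.5063205 splits evenly between the two fields.
name_part = desc_part = 0.5063205 / 2
boosted_id1 = name_part * EVENT_NAME_WEIGHT + desc_part

# Compare with the scores reported in the boosted output.
print(math.isclose(boosted_id3, 1.2658012, abs_tol=1e-6))
print(math.isclose(boosted_id1, 1.5189614, abs_tol=1e-6))
```

Ids 5 and 6 keep their scores of 1.1071612 and 0.650463 in both outputs, which is consistent with the keyword not appearing in their EventName.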
Key Takeaways
- Scoring profiles are optional; create one or more of them when the default ranking behavior doesn’t go far enough in meeting your business objectives.
- You can define up to 100 scoring profiles per index, but a query can use only one profile at a time.
- One of the profiles can be marked as the default profile, which means you don't need to pass it in every query.
- You can assign weights to more than one field in a scoring profile, with the same or different weights.
- If no search keyword is passed in the query, the default score will be 1 for each document.
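The points about using one profile per query and falling back to a default profile can be sketched as follows. This Python snippet only builds a search request body and works out which profile would apply; it does not call the service. The helper names are hypothetical, and the fallback logic is an illustration of the behavior described in the list above.

```python
def build_search_request(search_text, scoring_profile=None):
    """Build a search request body; a query names at most one profile."""
    body = {"search": search_text, "count": True}
    if scoring_profile is not None:
        body["scoringProfile"] = scoring_profile
    return body


def effective_profile(request_body, index_definition):
    """Profile that applies: the query's, else the index default, else None."""
    return (
        request_body.get("scoringProfile")
        or index_definition.get("defaultScoringProfile")
    )


index_with_default = {
    "name": "eventsearchindex",
    "defaultScoringProfile": "eventkeywordprofile",
}
index_without_default = {"name": "eventsearchindex"}

plain = build_search_request("mumbai")
explicit = build_search_request("mumbai", scoring_profile="boostKeywords")

print(effective_profile(plain, index_with_default))
print(effective_profile(explicit, index_with_default))
print(effective_profile(plain, index_without_default))
```

A query that names a profile uses it, a query that names none falls back to the index default, and without either the default ranking applies.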
I hope you enjoyed reading this post and learned something new about scoring profiles. In our next article in this series, we'll look at how the freshness function helps us boost documents.