Efficient and robust search is a necessity for any software application, whether web, mobile, or desktop. It is a must for improving user experience (UX) and adoption. Search may seem simple to a user, but it is complex and challenging to implement behind the scenes, especially deep search.
For one of our web applications, we had a challenging requirement to add several nested filters to the search result pages. The primary data on these pages was sourced through Azure Search. In this article, I share what we learned about Azure Search and the power of cognitive-based search.
How can we cover complex search scenarios using Azure Cognitive Search?
This article assumes familiarity with the basics of Azure Search. If you are new to it, please check the helpful links in the references section at the bottom of this article.
Earlier versions of Azure Search offered only a limited set of data types for defining index properties. There were a few basic types and collections, but no way to model complex types such as an object, e.g.
"name": {
  "first": "Sagar",
  "last": "Pathak"
}
The external datasets used to populate an Azure Search index can come in many forms. Sometimes they include hierarchical or nested substructures, for example multiple addresses for a single customer or many order items for a single customer.
In terms of modeling, you might see these structures referred to as complex, compound, composite, or aggregate data types. The term Azure Search uses for this concept is complex type.
In Azure Cognitive Search, complex types are modeled using complex fields. A complex field is a field that contains child fields (sub-fields), which can be of any data type. A complex field therefore works like a structured data type in a programming language.
A complex field represents either a single object in the document or an array of objects, depending on its data type. Fields of type Edm.ComplexType represent single objects, while fields of type Collection(Edm.ComplexType) represent arrays of objects.
Example
Let's take a look at a customer list and a product list, where the customers have bought multiple product items in different carts (like an Amazon shopping cart).
ProductId | Product Name | Cost |
---|---|---|
P01 | Product 1 | 300 |
P02 | Product 2 | 450 |
P03 | Product 3 | 650 |
P04 | Product 4 | 700 |
P05 | Product 5 | 1000 |
Id | Name | City | State | PostalCode |
---|---|---|---|---|
111 | Harrison Reed | New York | New York | 10021 |
211 | Lizzie Richardson | Chicago | Illinois | 40003 |
311 | Angelica Bennett | Houston | Texas | 50021 |
411 | James Smith | Los Angeles | California | 60075 |
511 | Zanna Clifforde | San Diego | California | 40006 |
To understand this better, please have a look at the diagram.
Diagram 1.1 above represents a list of customers with their purchase details, where each cart is associated with a customer and contains a CartId and product info. For example, customer Zanna Clifforde bought two product items, P03 and P04, in two different carts: P03 in C51101 and P04 in C51102.
Diagram 1.2 shows the search index structure that we created on the Azure portal to support the customer purchase details. It is composed of simple fields and complex fields.
Complex fields, such as Address and OrderedCarts, have sub-fields. Address has a single set of values for its sub-fields, since it is a single object in the document. In contrast, OrderedCarts has multiple sets of values for its sub-fields, and OrderItems under OrderedCarts likewise has multiple sets of values for its sub-fields.
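As a rough sketch of how such an index could be declared, the Python snippet below builds a definition in the shape of the index JSON that the Azure Cognitive Search REST API accepts, where a complex field carries its sub-fields in a nested `fields` array. The field names CustomerName, CustomerAddress/State, OrderedCarts, OrderItems, ProductId, ProductName, Cost, and DeliverStatus follow the filter expressions used later in this article. The index name, CustomerId, City, PostalCode, the Edm.Int32 type for Cost, and the attribute choices are assumptions made for illustration, not our actual index.

```python
import json


def simple_field(name, edm_type="Edm.String", **attributes):
    """A simple (leaf) field; extra keyword arguments become field attributes."""
    return {"name": name, "type": edm_type, **attributes}


def complex_field(name, sub_fields, collection=False):
    """A complex field; collection=True models an array of objects."""
    edm_type = "Collection(Edm.ComplexType)" if collection else "Edm.ComplexType"
    return {"name": name, "type": edm_type, "fields": sub_fields}


index_definition = {
    "name": "customer-purchases",  # hypothetical index name
    "fields": [
        simple_field("CustomerId", key=True),  # assumed key field
        simple_field("CustomerName", searchable=True),
        # Single object per document: Edm.ComplexType
        complex_field("CustomerAddress", [
            simple_field("City", filterable=True),        # assumed sub-field
            simple_field("State", filterable=True),
            simple_field("PostalCode", filterable=True),  # assumed sub-field
        ]),
        # Array of carts, each holding an array of items: nested collections
        complex_field("OrderedCarts", [
            simple_field("CartId", filterable=True),
            complex_field("OrderItems", [
                simple_field("ProductId", filterable=True),
                simple_field("ProductName", filterable=True),
                simple_field("Cost", "Edm.Int32", filterable=True),
                simple_field("DeliverStatus", filterable=True),
            ], collection=True),
        ], collection=True),
    ],
}

print(json.dumps(index_definition, indent=2))
```

Sub-fields that appear in a `$filter` expression need to be marked filterable, which is why the sketch sets that attribute on every leaf used in the queries.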
The actual data was posted to the search index via Postman. Find the body content of the API call here.
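The actual request body is not reproduced in this text. Purely as an illustration of its general shape, the sketch below builds an upload payload for one customer, Zanna Clifforde, using only the values given in the tables and the cart example above. The sub-field names (CustomerId, City, PostalCode) are assumptions, and DeliverStatus is left out because its values are not stated in this article.

```python
import json

# One document in the shape the "index documents" REST call expects.
# Values come from the customer and product tables; field names are partly assumed.
zanna = {
    "@search.action": "upload",
    "CustomerId": "511",
    "CustomerName": "Zanna Clifforde",
    "CustomerAddress": {  # single object (Edm.ComplexType)
        "City": "San Diego",
        "State": "California",
        "PostalCode": "40006",
    },
    "OrderedCarts": [  # array of objects (Collection(Edm.ComplexType))
        {
            "CartId": "C51101",
            "OrderItems": [
                {"ProductId": "P03", "ProductName": "Product 3", "Cost": 650},
            ],
        },
        {
            "CartId": "C51102",
            "OrderItems": [
                {"ProductId": "P04", "ProductName": "Product 4", "Cost": 700},
            ],
        },
    ],
}

# Documents are sent as a batch under the "value" key.
request_body = json.dumps({"value": [zanna]}, indent=2)
print(request_body)
```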
Azure Cognitive Search uses full-text search to find matching documents. In addition, it supports OData filter expressions, which apply further criteria to a search query beyond the full-text search terms. To learn more about OData filters, see the references below.
How to filter the documents?
We will run the queries directly on the Azure portal, but the same queries can be issued from languages such as C# or Java, using their respective syntax for connecting to the search index.
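For readers who want to issue a query from code rather than the portal, here is a minimal Python sketch that only builds the request URL for the REST "search documents" endpoint. The service name, index name, and the `2020-06-30` API version are placeholders I picked for illustration. Actually sending the request would also need an `api-key` header, which is left out here.

```python
from urllib.parse import quote, urlencode


def build_query_url(service_name, index_name, odata_filter, select,
                    api_version="2020-06-30"):
    """Build a GET URL for a filtered query; no network call is made."""
    params = {
        "api-version": api_version,
        "search": "*",            # match everything, let $filter narrow it down
        "$filter": odata_filter,
        "$select": select,
        "$count": "true",
    }
    # Percent-encode values (spaces, quotes, slashes) but keep "$" in the keys.
    query = urlencode(params, quote_via=quote, safe="$")
    return (f"https://{service_name}.search.windows.net"
            f"/indexes/{index_name}/docs?{query}")


# Hypothetical service and index names.
url = build_query_url(
    "my-service",
    "customer-purchases",
    "CustomerAddress/State eq 'California'",
    "CustomerName",
)
print(url)
```

Encoding the filter matters because OData expressions contain spaces, slashes, and single quotes, all of which are unsafe in a raw query string.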
1. Get the list of customers who live in California.
$filter= CustomerAddress/State eq 'California'&$select=CustomerName&$count=true
Output :
{
"@odata.count": 2,
"value": [{
"CustomerName": "James Smith"
},
{
"CustomerName": "Zanna Clifforde"
}]
}
2. Get the list of customers who have bought Product 1 (ProductId: P01).
$filter= OrderedCarts/any(c: c/OrderItems/any(t: t/ProductId eq 'P01'))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 2,
"value": [{
"CustomerName": "Angelica Bennett"
},
{
"CustomerName": "Harrison Reed"
}]
}
3. Get the list of customers who have bought Product 1 (ProductId: P01) OR Product 5 (ProductId: P05).
$filter= OrderedCarts/any(c: c/OrderItems/any(t: t/ProductId eq 'P01' or t/ProductId eq 'P05'))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 4,
"value": [{
"CustomerName": "Angelica Bennett"
},
{
"CustomerName": "Lizzie Richardson"
},
{
"CustomerName": "James Smith"
},
{
"CustomerName": "Harrison Reed"
}]
}
4. Get the list of customers who have bought both Product 1 (ProductId: P01) AND Product 2 (ProductId: P02).
$filter= (OrderedCarts/any(c: c/OrderItems/any(t: t/ProductId eq 'P01')) and OrderedCarts/any(c: c/OrderItems/any(t: t/ProductId eq 'P02')))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 2,
"value": [{
"CustomerName": "Angelica Bennett"
},
{
"CustomerName": "Harrison Reed"
}]
}
5. Get customers for whom every product item in every cart has been delivered.
$filter= OrderedCarts/all(c: c/OrderItems/all(t: t/DeliverStatus eq 'Delivered'))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 1,
"value": [{
"CustomerName": "Harrison Reed"
}]
}
6. Get customers who have bought a product that costs 700 or more.
$filter= OrderedCarts/any(c: c/OrderItems/any(t: t/Cost ge 700))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 4,
"value": [{
"CustomerName": "Angelica Bennett"
},
{
"CustomerName": "Lizzie Richardson"
},
{
"CustomerName": "James Smith"
},
{
"CustomerName": "Zanna Clifforde"
}]
}
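To reason about what the nested any and all expressions above match, it can help to mimic them locally. The sketch below is a plain Python emulation, not how Azure evaluates filters. It uses Zanna Clifforde's purchases from the cart example (P03 at 650 in C51101, P04 at 700 in C51102), plus a hypothetical customer without carts. The dictionary layout and helper names are my own assumptions.

```python
# Zanna's purchases as described earlier; the dict layout is an assumption.
zanna = {
    "CustomerName": "Zanna Clifforde",
    "OrderedCarts": [
        {"CartId": "C51101", "OrderItems": [{"ProductId": "P03", "Cost": 650}]},
        {"CartId": "C51102", "OrderItems": [{"ProductId": "P04", "Cost": 700}]},
    ],
}
# Hypothetical customer with no carts, only to show the empty-collection case.
no_carts = {"CustomerName": "(hypothetical)", "OrderedCarts": []}


def any_item(customer, predicate):
    """Mimics OrderedCarts/any(c: c/OrderItems/any(t: <predicate on t>))."""
    return any(
        any(predicate(t) for t in c["OrderItems"])
        for c in customer["OrderedCarts"]
    )


def all_items(customer, predicate):
    """Mimics OrderedCarts/all(c: c/OrderItems/all(t: <predicate on t>))."""
    return all(
        all(predicate(t) for t in c["OrderItems"])
        for c in customer["OrderedCarts"]
    )


# 'ge' is "greater than or equal", so an item costing exactly 700 matches.
cost_ge_700 = any_item(zanna, lambda t: t["Cost"] >= 700)   # True
cost_gt_700 = any_item(zanna, lambda t: t["Cost"] > 700)    # False

# "Bought both" needs two separate any(...) clauses joined with 'and' ...
bought_both = (any_item(zanna, lambda t: t["ProductId"] == "P03")
               and any_item(zanna, lambda t: t["ProductId"] == "P04"))  # True
# ... because a single item can never carry two different ProductIds.
same_item_both = any_item(
    zanna, lambda t: t["ProductId"] == "P03" and t["ProductId"] == "P04"
)  # False

# all(...) over an empty collection is true, so cart-less customers match.
vacuous_all = all_items(no_carts, lambda t: False)  # True

print(cost_ge_700, cost_gt_700, bought_both, same_item_both, vacuous_all)
```

The last case is worth keeping in mind for filters built on all: as far as I know, OData treats all over an empty collection as true, so a customer without any carts would also satisfy such a filter.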
Using the search.in function
7. Get customers who have bought Product 1 (ProductId: P01) OR Product 5 (ProductId: P05), by product name.
$filter= OrderedCarts/any(c: c/OrderItems/any(t: search.in(t/ProductName, 'Product 1,Product 5',',')))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 4,
"value": [{
"CustomerName": "Angelica Bennett"
},
{
"CustomerName": "Lizzie Richardson"
},
{
"CustomerName": "James Smith"
},
{
"CustomerName": "Harrison Reed"
}]
}
8. Get customers who have bought Product 5 (ProductId: P05) but not Product 2 (ProductId: P02).
$filter= OrderedCarts/any(c: c/OrderItems/any(t: search.in(t/ProductName, 'Product 5',','))) and OrderedCarts/all(c: c/OrderItems/all(t: not search.in(t/ProductName, 'Product 2',',')))&$select=CustomerName&$count=true
Output :
{
"@odata.count": 1,
"value": [
{
"CustomerName": "Lizzie Richardson"
}
]
}
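When the list of product names comes from user input, the filter string has to be assembled in code. The helpers below are a small sketch of that idea in Python. The function names are hypothetical, and the field paths are the ones used in the queries above. Note the explicit ',' delimiter: to my knowledge search.in splits on both spaces and commas by default, which would break names such as 'Product 1'.

```python
def odata_string(value):
    """Quote a string literal for OData; single quotes are escaped by doubling."""
    return "'" + value.replace("'", "''") + "'"


def _search_in(product_names, delimiter):
    """Build search.in(t/ProductName, '<joined names>', '<delimiter>')."""
    for name in product_names:
        if delimiter in name:
            raise ValueError(f"{name!r} contains the delimiter {delimiter!r}")
    value_list = delimiter.join(product_names)
    return (f"search.in(t/ProductName, {odata_string(value_list)}, "
            f"{odata_string(delimiter)})")


def bought_any_of(product_names, delimiter=","):
    """Customers with at least one item whose name is in the list."""
    return ("OrderedCarts/any(c: c/OrderItems/any(t: "
            f"{_search_in(product_names, delimiter)}))")


def bought_none_of(product_names, delimiter=","):
    """Customers with no item whose name is in the list."""
    return ("OrderedCarts/all(c: c/OrderItems/all(t: not "
            f"{_search_in(product_names, delimiter)}))")


# Same idea as query 8: bought Product 5 but not Product 2.
filter_expression = (
    f"{bought_any_of(['Product 5'])} and {bought_none_of(['Product 2'])}"
)
print(filter_expression)
```

If a product name could itself contain a comma, a different delimiter character such as '|' can be passed instead.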
I hope you enjoyed reading this article and that it helps you implement deep search with Azure Search complex fields.