Recently we had a bug in one of our apps where search could not find anything based on partial keywords or partial text.
Let's say we have an article that has the keyword "dedicated" in its subject, body, or tags. If we searched using partial keywords such as | cat | cate | ted | dedi | dicat |, the outcome was zero search results.
This issue was due to the way indexing was done, along with missing query configuration in Solr. We fixed the indexing and searching configuration in the solr/conf/schema.xml file by adding/editing the following code snippet.
<pre>
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
</pre>
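This configuration snippet instructs Solr to index each token as n-grams of size 1 to 20, which makes partial-word search possible: the query analyzer does not split the search term into grams, so a term like "ted" is matched directly against the grams stored for "dedicated". NGramFilterFactory produces grams from every position in the token (the side attribute is only meaningful for solr.EdgeNGramFilterFactory), which is what lets mid-word fragments such as "dicat" match. Note that our index size will grow now, as we have set a large gram size span from 1 to 20.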
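To make the matching behaviour concrete, here is a minimal Python sketch of what an n-gram filter with minGramSize 1 and maxGramSize 20 does to a single token. It is only an illustration, not Solr's implementation: the function name `ngrams` is made up for this example, and stemming is left out for simplicity.

```python
def ngrams(token, min_gram=1, max_gram=20):
    """Return every substring of `token` whose length is between
    min_gram and max_gram (hypothetical helper, not a Solr API)."""
    token = token.lower()  # mirrors the LowerCaseFilterFactory step
    grams = set()
    longest = min(max_gram, len(token))
    for size in range(min_gram, longest + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


# Grams that would be stored in the index for the word "dedicated".
indexed = ngrams("dedicated")

# The query side is not n-grammed, so each search term must equal
# one of the indexed grams to produce a hit.
for query in ["cat", "cate", "ted", "dedi", "dicat"]:
    print(query, query in indexed)
```

Every query term in the loop is a substring of "dedicated", so each one is found among the indexed grams. The sketch also shows why the index grows: a single nine-letter word already produces dozens of grams.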
Check this article to know more about NGrams and index tokens.
We also used the "Solr Admin page" to test the queries and results.
On a local machine, the Solr Admin page is available at the following URL: http://localhost:8982/solr/admin/
To check the port on which Solr is running on your local machine, issue this command: ps -ef | grep solr
Then use that port in the URL accordingly.
Refer to the attached screenshot. This is how we queried the Solr index to test search results based on partial text / keywords. The Solr Admin page proved to be very useful for this kind of index and search testing on a local machine.
This article helped us with the query language syntax used in the Solr Admin page.
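The same kind of partial-text query can also be built as a plain URL, which is handy for repeating a test without the admin form. The Python sketch below only constructs the URL and sends nothing. Treat the details as assumptions: the field name `text` is hypothetical and should be replaced with a field from your schema.xml, the `/select` path assumes a single-core setup, and the port is the one used in this post.

```python
from urllib.parse import urlencode


def build_select_url(term, field="text",
                     base="http://localhost:8982/solr", rows=10):
    """Build a Solr select URL for a field:term query.

    `field` and `base` are assumptions for this example; adjust them
    to match your own schema and the port Solr is running on.
    """
    params = {"q": f"{field}:{term}", "wt": "json", "rows": rows}
    return f"{base}/select?{urlencode(params)}"


print(build_select_url("dicat"))
# prints http://localhost:8982/solr/select?q=text%3Adicat&wt=json&rows=10
```

Opening the printed URL in a browser, or passing it to curl, should return the matching documents if the field is indexed with the n-gram analyzer.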