Recommendations for efficient prefix search


For our use case we have to index about 100K documents, each containing one text field, which most of the time consists of a single word (e.g. a nickname), and several numeric fields.
We also need to be able to run prefix queries on the text field efficiently.

We have gone through the documentation and it looks like using prefix search could lead to severe performance issues since a query like “fan*” would be expanded potentially into thousands of terms (we expect the dictionary to contain more than 10K items).

As an alternative and hopefully more efficient approach, we were thinking of using the RediSearch index just for the tag and numeric fields, and building our own index for the text field (using sets).

The idea would be to first retrieve from our own index all the document IDs matching the prefix query on the text field, then pass these IDs as arguments for the INKEYS parameter of the FT.SEARCH command.
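To make the idea concrete, here is a minimal Python sketch of the two steps. It is only an illustration under assumptions: `PrefixIndex` is a hypothetical in-memory stand-in for our own index (in Redis it could be something like a sorted set queried lexicographically), and the index name `users`, the field `score`, and the `doc:N` keys are made up. The only RediSearch detail it relies on is that INKEYS takes a count followed by the keys.

```python
from bisect import bisect_left


class PrefixIndex:
    """Toy in-memory prefix index: a sorted list of (term, doc_key) pairs."""

    def __init__(self):
        self._entries = []

    def add(self, term, doc_key):
        # Keep the list sorted so all terms sharing a prefix are contiguous.
        entry = (term.lower(), doc_key)
        pos = bisect_left(self._entries, entry)
        if pos == len(self._entries) or self._entries[pos] != entry:
            self._entries.insert(pos, entry)

    def keys_with_prefix(self, prefix):
        prefix = prefix.lower()
        # (prefix,) sorts before every (term, key) whose term starts with prefix.
        start = bisect_left(self._entries, (prefix,))
        keys = []
        for term, doc_key in self._entries[start:]:
            if not term.startswith(prefix):
                break
            keys.append(doc_key)
        return keys


def build_search_args(index_name, query, keys):
    """Build the FT.SEARCH argument list, restricting results via INKEYS."""
    # The caller should skip the search entirely when `keys` is empty.
    return ["FT.SEARCH", index_name, query, "INKEYS", str(len(keys)), *keys]


index = PrefixIndex()
index.add("fanatic", "doc:1")
index.add("fantom", "doc:2")
index.add("falcon", "doc:3")

matching_keys = index.keys_with_prefix("fan")
search_args = build_search_args("users", "@score:[10 +inf]", matching_keys)
print(" ".join(search_args))
```

The resulting argument list could then be sent with any Redis client that accepts raw commands. The number of keys after INKEYS grows with the number of prefix matches, which is exactly the part we are unsure about.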

However, we’re wondering whether using INKEYS with hundreds or thousands of values would essentially lead to the same performance issues.

You can use the MAXPREFIXEXPANSIONS parameter, since the prefix is well known, to limit the number of iterations required to match the query prefix.

Based on your brief explanation, I can’t see any limitation in using RediSearch for all your query needs, with TEXT, TAG and NUMERIC fields.

Hi @adriano_redis, actually the prefix is not well known; it can be anything (fan* was just an example).
I thought of using MAXPREFIXEXPANSIONS, but that would leave out some relevant results (some prefix searches could match several thousand items).
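To illustrate the concern, here is a toy Python model of a capped prefix expansion. It is not how RediSearch is implemented: `expand_prefix` is a hypothetical helper, the term list is made up, and which terms a real engine keeps once the cap is reached is an assumption here (the sketch simply keeps the first ones in lexical order).

```python
def expand_prefix(dictionary, prefix, max_expansions=None):
    """Return the terms a prefix expands to, optionally capped."""
    matches = sorted(t for t in dictionary if t.startswith(prefix))
    if max_expansions is not None:
        # Assumption: the cap keeps the first terms in lexical order.
        return matches[:max_expansions]
    return matches


terms = {"fan", "fanatic", "fancy", "fang", "fantasy", "far"}

all_terms = expand_prefix(terms, "fan")
capped_terms = expand_prefix(terms, "fan", max_expansions=2)

# Terms that match the prefix but are never searched because of the cap.
missed_terms = sorted(set(all_terms) - set(capped_terms))
print(missed_terms)
```

Any document whose nickname is one of the missed terms would not be returned, even though it matches the prefix.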