I noticed that a prefix search is matched against each term of the field value rather than against the entire value. Here’s an example:
FT.CREATE idx:names ON hash PREFIX 1 "name:" SCHEMA name TEXT SORTABLE
HSET name:1 name "moon"
HSET name:2 name "Miz mooz"
FT.SEARCH idx:names "@name:moo*"
The last search returns “Miz mooz”, which was unexpected to me.
Is there a way to run prefix searches that take the entire field into account?
I saw in the docs that this behavior is due to tokenization on TEXT fields. An alternative is to use TAG fields but they require a separator.
Hi @clovis1122,
You are right about TAG fields and it seems to be a good solution for you.
Why are you worried about the separator? You don’t want whitespace to be a separator/tokenizer.
By default, the separator for TAG fields is ','.
In my dataset I have names that contain commas. I wouldn’t want to pick a random separator and hope that it isn’t present in my data. Is there any way to create a TAG field without a separator?
Found an answer to this. You can escape the whitespace in your data to prevent tokenization. I ended up doing this and removing stop words. So something like “Miz mooz” would be inserted as “Miz\ mooz”.
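For anyone doing the same thing from application code, here is a minimal sketch of that escaping step in Python. It assumes the value is escaped on the client before it is sent to HSET. The helper name `escape_whitespace` is hypothetical and not part of any Redis client library, and it does not handle values that already contain backslashes.

```python
import re


def escape_whitespace(value: str) -> str:
    """Put a backslash in front of every whitespace character.

    Hypothetical helper: the intent is that the tokenizer then treats
    the whole value as one term instead of splitting it on spaces.
    """
    return re.sub(r"\s", lambda match: "\\" + match.group(0), value)


# The stored field value for "Miz mooz" becomes: Miz\ mooz
escaped_name = escape_whitespace("Miz mooz")
```

Values without whitespace, such as "moon", pass through unchanged.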
As the docs say, some tools, including redis-cli, also require escaping of their own, so this record would be inserted as: