I am using lettusearch to execute queries against a RediSearch instance.
Recently, I noticed that if you pass a tag value that includes non-alphanumeric characters such as “:” or “_”, then the query string passed to the lettusearch client raises a syntax error.
After spending some time working out how to make such queries succeed, I found that “special” characters must be escaped with a backslash: “:” becomes “\:”.
This leads me to ask two questions:
Why is escaping not done at the client level? Are there any utility functions that could be provided?
Which characters require escaping? Is there a standard Java method that could be used?
Currently, I pass each tag value through a custom Java escape function, but I am probably not escaping every character I should:
    private static String escape(String value) {
        // Prefix each of ':', '_', '-' and '@' with a backslash.
        // In the replacement, "\\\\" yields one literal backslash and $0 is the matched character.
        return value.replaceAll("[:_\\-@]", "\\\\$0");
    }
The documentation for the RediSearch query syntax doesn’t provide a list of characters but states:
Punctuation marks in tags should be escaped with a backslash ( \ ). It is also recommended (but not mandatory) to escape spaces; The reason is that if a multi-word tag includes stopwords, it will create a syntax error.
So it sounds like anything that isn’t a letter or a digit should be escaped.
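Taking that reading literally, here is a minimal sketch of a helper that escapes every character that is not a letter or digit, spaces included. The class and method names (TagEscaper, escapeTag) are made up for illustration and are not part of lettusearch. The "letter or digit" rule is an assumption drawn from the documentation quote above, not a confirmed list of characters.

```java
final class TagEscaper {

    private TagEscaper() {
    }

    // Hypothetical helper, not a lettusearch API.
    // Assumption: every character that is not a Unicode letter or digit
    // (punctuation, spaces, the backslash itself) gets a backslash prefix.
    static String escapeTag(String value) {
        StringBuilder escaped = new StringBuilder(value.length() * 2);
        value.codePoints().forEach(cp -> {
            if (!Character.isLetterOrDigit(cp)) {
                escaped.append('\\');
            }
            escaped.appendCodePoint(cp);
        });
        return escaped.toString();
    }
}
```

For example, TagEscaper.escapeTag("user:john_doe-1@example") returns user\:john\_doe\-1\@example. This may escape more than RediSearch strictly requires, but a backslash in front of a character that didn't need one should be harmless inside a tag value.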
As to why lettusearch doesn’t escape them for you, I can’t say. But @jruaux might have a better idea as he’s one of the primary contributors.
Hi Laurent, thank you for reporting this. Escaping at the client level could be done, but it would probably need to be part of a query builder or “prepared statement” of some sort to avoid unintended escaping. In the meantime, I’ll add a utility function like the one you have there so that it can be called explicitly.
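To illustrate why a builder avoids unintended escaping, here is a rough sketch in which only the tag values are escaped and the query syntax around them is added afterwards. TagQuery and anyOf are hypothetical names, not lettusearch APIs. The sketch assumes the field name needs no escaping and that escaping every non-letter, non-digit character is acceptable.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class TagQuery {

    private TagQuery() {
    }

    // Assumption: anything that is not a Unicode letter or number is escaped.
    // "\\\\$0" in the replacement means one literal backslash plus the matched character.
    private static String escapeValue(String value) {
        return value.replaceAll("[^\\p{L}\\p{N}]", "\\\\$0");
    }

    // Hypothetical builder method: produces a tag filter of the form @field:{v1 | v2}.
    // Values are escaped one by one; '@', ':', '{', '|' and '}' are appended
    // afterwards as syntax, so they are never escaped by accident.
    static String anyOf(String field, String... values) {
        return Arrays.stream(values)
                .map(TagQuery::escapeValue)
                .collect(Collectors.joining(" | ", "@" + field + ":{", "}"));
    }
}
```

With this sketch, TagQuery.anyOf("category", "sci-fi", "young_adult") produces @category:{sci\-fi | young\_adult}. Escaping the whole query string instead would also have escaped the braces and the pipe.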