I am trying to find an optimal way of reading around 70-80k different keys from Azure Redis Cache. All of this data adds up to ~55 MB.
Currently I use the following:
- To get all keys matching a pattern
await RedisCacheClient.Db0.SearchKeysAsync(pattern)
- To read all the keys
await RedisCacheClient.Db0.GetAllAsync<T>(keys)
If I am the only user reading the data, I can read it all in ~11.5 seconds.
However, if 2 API calls run in parallel, each response takes ~20 seconds.
With 3 parallel API calls, it goes up to ~35 seconds.
Can you suggest a way to read the data in parallel without delaying the response for other users?
Language: .NET 5
DI setup:
services.AddSingleton<IRedisCacheConnectionPoolManager, RedisCacheConnectionPoolManager>();
services.AddSingleton<ISerializer, MsgPackObjectSerializer>();
var redisConfiguration = new RedisConfiguration()
{
    ConnectionString = Configuration.GetSection("Redis:ConnectionString").Value
};
services.AddSingleton(redisConfiguration);
services.AddSingleton<IRedisCacheClient, RedisCacheClient>();
Redis is single-threaded at its core. While this means it can only do one thing at a time, that rarely matters because everything is in memory and fast. However, some commands and libraries can hog the main thread if they are particularly intensive, and getting key names is a common one. This is why the KEYS command is so dangerous to use in production: the main thread is blocked while it walks the entire keyspace and returns all the keys. Applying a pattern to filter the keys does not help, because each key still has to be evaluated against the pattern.
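For contrast, the non-blocking alternative is SCAN, which walks the keyspace in pages rather than in one blocking pass. A minimal sketch using StackExchange.Redis's IServer.KeysAsync, which issues SCAN under the hood; the multiplexer and pattern variables here are assumptions, not taken from your code:

var server = multiplexer.GetServer(multiplexer.GetEndPoints()[0]);
var keys = new List<RedisKey>();
// KeysAsync pages through the keyspace with SCAN, so no single command blocks the server.
await foreach (var key in server.KeysAsync(pattern: pattern, pageSize: 1000))
{
    keys.Add(key);
}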
I looked at the source code of StackExchange.Redis.Extensions for the methods you are calling. It looks like it uses SCAN instead of KEYS, which is good. However, it does so in a tight loop with an await. Normally, StackExchange.Redis would pipeline multiple requests like this to improve performance, but SCAN needs the cursor returned by the previous call in order to paginate, so the requests can't be pipelined. This makes the call chatty.

Adding multiple users to the mix means that the parallel SCAN calls are interspersed, which increases the chattiness.
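To see why this gets chatty, here is roughly the shape a SCAN loop has to take (a sketch against the raw command, assuming a StackExchange.Redis IDatabase named db; this is not the library's actual code). Each page needs the cursor returned by the previous page, so the round trips are strictly sequential and cannot be pipelined:

long cursor = 0;
var keys = new List<RedisKey>();
do
{
    // One full round trip per page; the next SCAN cannot be sent until this one returns its cursor.
    var result = (RedisResult[])await db.ExecuteAsync("SCAN", cursor, "MATCH", pattern, "COUNT", 1000);
    cursor = long.Parse((string)result[0]);
    keys.AddRange((RedisKey[])result[1]);
} while (cursor != 0);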
Based on the same source, the GetAllAsync function looks like it is being pipelined just fine. Have you measured the time spent in SearchKeysAsync separately from GetAllAsync? I could be wrong, but I expect the call to SearchKeysAsync is what is taking most of the time.
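Either way, it is worth timing the two calls separately. A rough sketch against your existing client, reusing your pattern and T placeholders:

var sw = System.Diagnostics.Stopwatch.StartNew();
var keys = await RedisCacheClient.Db0.SearchKeysAsync(pattern);
var searchTime = sw.Elapsed;   // time spent in the SCAN-backed key search

sw.Restart();
var values = await RedisCacheClient.Db0.GetAllAsync<T>(keys);
var getAllTime = sw.Elapsed;   // time spent fetching the values

// Compare searchTime with getAllTime to confirm where the bottleneck actually is.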