Hi,
I have a use case in which I execute an FT.AGGREGATE query that returns a set of records, each holding a "foreign key" that points to a different hash. What would be the best way to retrieve those referenced records so that I can join them in my code? For 5 records? What about for 1,000?
Should I use a loop and execute HGETALL for each foreign key?
Or should I create a TAG on the ID field and use FT.SEARCH to retrieve the records by that tag?
There is currently no way in Redis to retrieve multiple hashes in a single command, is there? Something like MGETALL hash1 hash2, etc.
An example would be a collection of Flights, which reference Airports and Aircraft. In my flight hash I am only storing an AirportID and an AircraftID, and once I have retrieved the flights I need, I also have to attach the Airports to them.
So if your foreign key is the key name itself, then you can run HGETALL on each key in a pipeline, which should make it very fast. Another option is to use a Lua script and do the FT.AGGREGATE + HGETALL inside the script, returning only the airports; the downside of this approach is that it will not work on a cluster. If you want to do something similar on a cluster, you can use RedisGears, a programmable engine for data processing in Redis.
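To make the Lua option a bit more concrete, here is a rough, untested sketch, wrapped in a client call via StackExchange.Redis's ScriptEvaluate (the AirportID field and the airport:{id} key pattern are assumptions about your schema; as noted, this only works on a standalone deployment):

```csharp
using StackExchange.Redis;

public static class LuaJoinExample
{
    // FT.AGGREGATE + HGETALL in a single round trip. Standalone Redis only:
    // the script touches keys it does not declare, so it won't run on a cluster.
    private const string Script = @"
        local res = redis.call('FT.AGGREGATE', ARGV[1], ARGV[2], 'LOAD', '1', '@AirportID')
        local airports = {}
        for i = 2, #res do            -- res[1] is the total result count
            local row = res[i]        -- row = { 'AirportID', '<id>' }
            for j = 1, #row, 2 do
                if row[j] == 'AirportID' then
                    airports[#airports + 1] =
                        redis.call('HGETALL', 'airport:' .. row[j + 1])
                end
            end
        end
        return airports";

    public static RedisResult GetAirports(IDatabase db, string indexName, string query) =>
        db.ScriptEvaluate(Script, values: new RedisValue[] { indexName, query });
}
```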
Let me know which option you want to go with and I can help you further.
I am using C# (StackExchange.Redis) and I went ahead and implemented pipelining/batching, since it seemed to be the easiest of the options provided, and it seems to work much better! So this should be more efficient than using FT.SEARCH with tags, right?
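For reference, this is roughly the shape of what I implemented (a minimal sketch; the airport:{id} key pattern and the field names are placeholders for my actual schema):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class FlightJoiner
{
    // Fetches the airport hash for each AirportID in one batch.
    public static async Task<Dictionary<string, HashEntry[]>> GetAirportsAsync(
        IDatabase db, IEnumerable<string> airportIds)
    {
        var batch = db.CreateBatch();

        // Queue one HGETALL per foreign key; nothing is sent yet.
        var pending = airportIds.Distinct().ToDictionary(
            id => id,
            id => batch.HashGetAllAsync($"airport:{id}"));

        // Flush all queued commands to the server contiguously.
        batch.Execute();

        await Task.WhenAll(pending.Values);

        return pending.ToDictionary(p => p.Key, p => p.Value.Result);
    }
}
```

The Distinct() call avoids fetching the same airport hash twice when many flights share an airport, and nothing goes over the wire until Execute() is called.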
My dev environment is running in the cloud as well, so it is a bit difficult to test performance at the moment. Is there a recommended number of records per batch that I should not go over? I assume sending something like 100k at the same time would be a bad idea?
Yes, it should be the fastest; it is a direct access to a hash table instead of a search inside an inverted index.
Regarding pipeline size: definitely not 100K. It also depends on the network, the hardware, and the size of the results (i.e., how many fields are in the hashes and the size of each field). You will have to test and see what gives the best results.
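If it helps, one way to run that test is to sweep over a few candidate batch sizes and time each one (a rough sketch; the sizes below are arbitrary starting points, not recommendations):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class BatchSizeProbe
{
    // Times the same workload at several batch sizes so you can pick
    // the sweet spot for your network, hardware, and hash sizes.
    public static async Task ProbeAsync(IDatabase db, IReadOnlyList<string> keys)
    {
        foreach (var size in new[] { 100, 500, 1_000, 5_000 })
        {
            var sw = Stopwatch.StartNew();
            for (var offset = 0; offset < keys.Count; offset += size)
            {
                var chunk = keys.Skip(offset).Take(size);
                var batch = db.CreateBatch();
                var tasks = chunk.Select(k => batch.HashGetAllAsync((RedisKey)k)).ToList();
                batch.Execute();
                await Task.WhenAll(tasks); // finish this chunk before sending the next
            }
            sw.Stop();
            Console.WriteLine($"batch size {size}: {sw.ElapsedMilliseconds} ms");
        }
    }
}
```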