Hi all. I wrote and operate a BitTorrent tracker (which is basically an HTTP server with a single endpoint: /announce
) that uses Redis as its database. I am trying to determine if Sorted Sets are the most efficient way to store and operate on the data (in terms of memory usage). First a very dumbed down version of what the tracker does:
Remote clients will “announce” (to the tracker - an HTTP GET request) that they are downloading a particular torrent (each torrent has an “ID”). This gives me three pieces of information:
- Client’s address (IP address)
- Torrent ID
- Timestamp (current unix timestamp)
I then need to reply to the client with a list of other “peers” which have “announced” that particular torrent in the last 30 minutes. My sorted set design is as follows:
- The key name is the torrent ID
- For every “announce”, I
ZADD
the IP address, with a score as the current UNIX timestamp - For the reply, I get a list of peers using
ZRANGEBYSCORE
with min=(current_time-30min)
, max=current_time
.
Basically, the redis features I use sorted sets for are:
- Able to query by score - only get “peers” active in last 30 min
- Able to delete by score - I can
ZREMRANGEBYSCORE
“peers” stale for more than 30 min to free memory
However, sorted sets have another property, which is they keys are store lexographically. I don’t need this optimization, so I wonder if there might be a more memory efficient solution to my problem?
Additionally sorted set scores are 64bit longs, whereas technically for my use case even 32bit should be enough (or even 16 bit with some additional logic!)
Some numbers: In production, these are my memory stats:
used_memory:753239352
used_memory_human:718.35M
used_memory_rss:1037946880
used_memory_rss_human:989.86M
used_memory_peak:1283459224
used_memory_peak_human:1.20G
used_memory_peak_perc:58.69%
used_memory_overhead:216436768
used_memory_startup:810024
used_memory_dataset:536802584
used_memory_dataset_perc:71.34%
# smem -krt
PID User Command Swap USS PSS RSS
2432504 root /usr/local/bin/redis-server 346.1M 724.9M 724.9M 725.4M
Edit: FYI: For popular torrents a single sorted set may contain between 1000~10000 active entries.