Loading RDB causes high CPU usage

We had a RedisGraph database with 7.5 million nodes and 22 million relationships running on the official RedisGraph image on Kubernetes. It uses 68 GiB of RAM, and all of this data was originally loaded with a 4000 mCPU limit. We then recycled the pod and tried to load the data from persistent storage (a 13 GiB RDB file). CPU usage topped the pod's limit and the Redis process was killed. I later increased the CPU limit to 15000 mCPU, but no matter what the limit is, loading hits it and the process gets killed. The interesting part is that it gets killed after loading 42 GiB of data into memory (the memory limit is 100 GiB).
Here are the logs and the memory/CPU graphs during data loading (the orange dotted line is the request, the blue dotted line is the limit):

1:C 29 Mar 2022 14:53:00.061 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 29 Mar 2022 14:53:00.061 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 29 Mar 2022 14:53:00.061 # Configuration loaded
1:M 29 Mar 2022 14:53:00.061 * monotonic clock: POSIX clock_gettime
1:M 29 Mar 2022 14:53:00.063 * Node configuration loaded, I'm aea04d86b2f1d058b2531daf8ff1123c6370bc99
1:M 29 Mar 2022 14:53:00.063 * Running mode=cluster, port=6379.
1:M 29 Mar 2022 14:53:00.063 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 29 Mar 2022 14:53:00.063 # Server initialized
1:M 29 Mar 2022 14:53:00.066 * <graph> Starting up RedisGraph version 2.8.9.
1:M 29 Mar 2022 14:53:00.069 * <graph> Thread pool created, using 72 threads.
1:M 29 Mar 2022 14:53:00.069 * <graph> Maximum number of OpenMP threads set to 72
1:M 29 Mar 2022 14:53:00.069 * Module 'graph' loaded from /usr/lib/redis/modules/redisgraph.so
1:M 29 Mar 2022 14:53:00.070 * Loading RDB produced by version 6.2.6
1:M 29 Mar 2022 14:53:00.070 * RDB age 51823 seconds
1:M 29 Mar 2022 14:53:00.070 * RDB memory usage when created 68107.15 Mb
1:signal-handler (1648565812) Received shutdown signal during loading, exiting now.

The majority of the data loading process in RedisGraph is performed on a single thread. It is only once the entire RDB file has been processed that we “commit” our internal data structures (GraphBLAS matrices), at which point GraphBLAS may use multiple threads. You might be able to limit the number of threads used by GraphBLAS by capping the number of OpenMP threads the system is allowed to use.

GraphBLAS may use multiple threads while loading the RDB file as well.

The maximum number of threads it uses can be configured with OMP_THREAD_COUNT.
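For reference, this is a module load-time argument, so a sketch of how it might look in redis.conf would be the following (the module path is taken from the log above; check the RedisGraph configuration docs for your version before relying on this exact form):

```
# Load RedisGraph with GraphBLAS/OpenMP limited to a single thread
loadmodule /usr/lib/redis/modules/redisgraph.so OMP_THREAD_COUNT 1
```

The same argument can be passed on the server command line, e.g. `redis-server --loadmodule /usr/lib/redis/modules/redisgraph.so OMP_THREAD_COUNT 1`.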

Setting OMP_THREAD_COUNT to 1 did the trick for the data loading part. The data loads in 50 minutes with CPU usage below 1 core.
But I have another question, how will this configuration affect the query performance or concurrent clients running queries? What’s the sweet spot for the thread count? Number of cores, or in my case the CPU limit of the pod?

This will not affect concurrent clients running queries; they use a thread pool whose size is determined by THREAD_COUNT.

Your individual queries might perform worse, as GraphBLAS won’t use its standard parallel strategies (this effect will be felt most strongly in Conditional Traverse operations).

By default, both of these configs are set to the number of cores. It sounds like in your case the pod’s CPU limit would be a better starting point, though you still run the risk of oversubscription from the combination of GraphBLAS OpenMP threads and RedisGraph thread-pool threads. Half the CPU limit would be safest, but I think you’d be unnecessarily sacrificing performance at that level. You’ll probably have to fine-tune this a bit manually.
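As one hedged starting point for a 15000 mCPU pod (the exact split is an assumption to tune, not a recommendation from the docs): divide the ~15 cores between the two settings so their sum stays at or below the limit, e.g.:

```
# Sketch: split ~15 cores between the RedisGraph query thread pool
# and GraphBLAS/OpenMP so their sum stays within the pod's CPU limit
loadmodule /usr/lib/redis/modules/redisgraph.so THREAD_COUNT 8 OMP_THREAD_COUNT 7
```

You can then verify the active values at runtime with `GRAPH.CONFIG GET THREAD_COUNT` (and likewise for OMP_THREAD_COUNT), and adjust the split based on whether your workload is dominated by many concurrent queries or by a few heavy traversals.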