Performance problems

Hi @fcosta_oliveira, I have been testing RedisGraph with your branch in several scenarios. I sent 1K 6-hop requests per client from 22 clients (redis-server is set to 22 OMP threads and 22 thread pool threads) with the following command:

memtier_benchmark -c 1 -t 22 -n 1000 --hide-histogram --command="graph.query graph500_22 \"MATCH (s:graph500_22_unique_node_out)-[*6]->(t) WHERE ID(s)=__key__ RETURN count(t)\"" --key-maximum=100000 --distinct-client-seed --command-key-pattern G --key-prefix=""

The ALL STATS table I got looks like this:

22        Threads
1         Connections per thread
1000      Requests per client


ALL STATS
=======================================================================================================
Type              Ops/sec    Avg. Latency     p50 Latency     p99 Latency    p100 Latency       KB/sec 
-------------------------------------------------------------------------------------------------------
Graph.querys        12.66       252.50357        38.65500      1036.28700      1048.57500         3.19 
Totals              12.66       252.50357        38.65500      1036.28700      1048.57500         3.19 

The average latency looks too low compared with the 3-hop case, which was run with otherwise identical parameters:

22        Threads
1         Connections per thread
1000      Requests per client


ALL STATS
=======================================================================================================
Type              Ops/sec    Avg. Latency     p50 Latency     p99 Latency    p100 Latency       KB/sec 
-------------------------------------------------------------------------------------------------------
Graph.querys        48.23       379.90358       299.00700      1032.19100      1048.57500        12.09 
Totals              48.23       379.90358       299.00700      1032.19100      1048.57500        12.09 

Looking at the numbers, I noticed that Ops/sec * Avg. Latency / 1000 does not equal 22 (22 clients on 22 threads). This happens in the 3-hop and 6-hop cases, but not in the 1-hop and 2-hop cases. However, the progress output printed during the run appears to be right:

[RUN #1 100%, 1770 secs]  1 threads:       21998 ops,       5 (avg:      12) ops/sec, 1.52KB/sec (avg: 3.13KB/sec)
[RUN #1 100%, 1770 secs]  0 threads:       22000 ops,       5 (avg:      12) ops/sec, 1.52KB/sec (avg: 3.13KB/sec), 1569.84 (avg: 1770.28) msec latency
# 1770.28 * 12.66 / 1000 = 22.4
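
For reference, the consistency check I am applying can be sketched in a few lines of Python. It assumes a closed-loop benchmark in which each of the 22 clients has exactly one request in flight, so throughput times average latency should come out near the client count. The function and variable names are only illustrative; the numbers are copied from the outputs in this post.

```python
# Consistency check: in-flight requests ~= throughput (ops/sec) * latency (sec).
# Assumption: 22 closed-loop clients (1 connection x 22 threads), one request
# in flight per client, so the product should be close to 22.

CLIENTS = 22


def in_flight(ops_per_sec, avg_latency_ms):
    """Estimate the number of concurrent requests from throughput and latency."""
    return ops_per_sec * avg_latency_ms / 1000.0


# (Ops/sec, Avg. Latency in ms) taken from the outputs quoted in this post.
runs = {
    "6-hop, ALL STATS table": (12.66, 252.50357),
    "3-hop, ALL STATS table": (48.23, 379.90358),
    "6-hop, progress line": (12.66, 1770.28),
}

results = {name: in_flight(ops, lat) for name, (ops, lat) in runs.items()}

for name, value in results.items():
    print(f"{name}: {value:.2f} (expected about {CLIENTS})")
```

This gives roughly 3.20 and 18.32 for the two ALL STATS tables, but 22.41 for the latency reported in the progress line, which is the only one of the three that is consistent with 22 clients.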

Which latency should I refer to, and why do these two values turn out to be different?