Summary or problem description
When many elements are cached in the FIFOSet, its Add and ExceptWith operations become noticeably slower. This is very obvious during performance testing. In the TaskManager's OnRestartTasks (neo/src/neo/Network/P2P/TaskManager.cs, lines 140 to 142 in 44e7f3b):

private void OnRestartTasks(InvPayload payload)
{
    knownHashes.ExceptWith(payload.Hashes);
    ...

the ExceptWith method is used to exclude 16,000 of the 62,800 knownHashes. Each removal takes about 0.015 s, roughly 240 s in total, and the process gets stuck.
That means that under heavy load, if a node falls out of sync because it is missing a large number of transactions, fetching those transactions becomes extremely slow.
The root cause is the computational inefficiency of the underlying OrderedDictionary. The unit test below shows its performance with a large number of elements.
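To see why per-item removal is so expensive, consider how OrderedDictionary is built: it keeps an index list alongside its hash table, so every Remove must also delete the entry from that list and shift all later entries down, which is O(n) per removal. The standalone sketch below demonstrates this cost directly on OrderedDictionary; it is illustrative only and is not the FIFOSet code itself:

```csharp
using System;
using System.Collections.Specialized;
using System.Diagnostics;

// Illustrative sketch: OrderedDictionary pairs a hash table with an internal
// index list, so Remove(key) also deletes from that list and shifts every
// later entry down -- O(n) per removal in the number of cached elements.
class OrderedDictionaryRemovalCost
{
    static void Main()
    {
        var dict = new OrderedDictionary();
        for (int i = 0; i < 100_000; i++)
            dict.Add(i, null);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 1_000; i++)
            dict.Remove(i); // each call shifts the ~100,000 remaining entries
        sw.Stop();

        Console.WriteLine($"1,000 removals from 100,000 entries: {sw.Elapsed.TotalSeconds}s");
        Console.WriteLine(dict.Count); // 99000
    }
}
```

This is the same shape of work ExceptWith performs once per excluded hash, which is why the cost grows with both the cache size and the number of hashes removed.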
[TestMethod]
public void TestFIFOSet()
{
    Stopwatch stopwatch = new Stopwatch();
    var bucket = new FIFOSet<int>(150_000);
    stopwatch.Start();
    for (int i = 1; i <= 550_000; i++)
    {
        bucket.Add(i);
    }
    stopwatch.Stop();
    Console.WriteLine($"Add timespan: {stopwatch.Elapsed.TotalSeconds}s");
    stopwatch.Reset();
    var items = new int[10000];
    var value = 550_000;
    for (int i = 0; i <= 9999; i++)
    {
        items[i] = value;
        value -= 50;
    }
    stopwatch.Start();
    bucket.ExceptWith(items);
    stopwatch.Stop();
    Console.WriteLine($"except with timespan: {stopwatch.Elapsed.TotalSeconds}s");
    stopwatch.Reset();
    stopwatch.Start();
    var ret = bucket.Contains(140_000);
    stopwatch.Stop();
    Console.WriteLine($"contains with timespan: {stopwatch.Elapsed.TotalSeconds}s result: {ret}");
    stopwatch.Reset();
    stopwatch.Start();
    ret = bucket.Contains(545_001);
    stopwatch.Stop();
    Console.WriteLine($"contains with timespan: {stopwatch.Elapsed.TotalSeconds}s result: {ret}");
    stopwatch.Reset();
}
Console prints
Add timespan: 21.3015352s
except with timespan: 10.8901768s
contains with timespan: 0.0001552s result: False
contains with timespan: 5E-07s result: True
Do you have any solution you want to propose?
Currently, we only use FIFOSet to cache UInt256 hashes, and entries are evicted in insertion order. In this case the exact eviction order is not critical, so we can replace the OrderedDictionary with a list of HashSets: create a List containing multiple HashSets and store elements in them in chronological order. When the current HashSet is full, start filling the next one. When all HashSets are full, delete the first HashSet in the List and append a new empty HashSet at the end to hold the newly added elements.
The design of this class is as follows
using System;
using System.Collections;
using System.Collections.Generic;

public class HashSetCache<T> : IEnumerable<T> where T : IEquatable<T>
{
    private readonly int hashSetCapacity; // maximum number of elements in each HashSet
    private readonly List<HashSet<T>> sets = new List<HashSet<T>>();

    public int Size
    {
        get
        {
            int size = 0;
            foreach (var set in sets)
            {
                size += set.Count;
            }
            return size;
        }
    }

    public HashSetCache(int hashSetCapacity, int hashSetCount = 10)
    {
        if (hashSetCapacity <= 0) throw new ArgumentOutOfRangeException(nameof(hashSetCapacity));
        if (hashSetCount <= 0 || hashSetCount > 20) throw new ArgumentOutOfRangeException($"{nameof(hashSetCount)} should be between 1 and 20");
        this.hashSetCapacity = hashSetCapacity;
        for (int i = 0; i < hashSetCount; i++)
        {
            sets.Add(new HashSet<T>());
        }
    }

    public bool Add(T item)
    {
        if (Contains(item)) return false;
        foreach (var set in sets)
        {
            if (set.Count < hashSetCapacity)
            {
                return set.Add(item);
            }
        }
        // All sets are full: drop the oldest set and start a new one.
        sets.RemoveAt(0);
        var newSet = new HashSet<T>
        {
            item
        };
        sets.Add(newSet);
        return true;
    }

    public bool Contains(T item)
    {
        foreach (var set in sets)
        {
            if (set.Contains(item)) return true;
        }
        return false;
    }

    public void ExceptWith(IEnumerable<T> items)
    {
        foreach (var item in items)
        {
            foreach (var set in sets)
            {
                if (set.Remove(item)) break;
            }
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        foreach (var set in sets)
        {
            foreach (var item in set)
            {
                yield return item;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
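As a quick sanity check of the eviction semantics, here is a small usage sketch. It assumes the HashSetCache<T> class above is in scope; the tiny capacities are chosen only to make eviction visible:

```csharp
// Assumes the HashSetCache<T> class defined above is in scope.
var cache = new HashSetCache<int>(2, 3); // 2 items per set, 3 sets => at most 6 items
for (int i = 1; i <= 6; i++)
    cache.Add(i);        // fills the three sets as {1,2}, {3,4}, {5,6}

cache.Add(7);            // all sets full: the oldest set {1,2} is dropped

Console.WriteLine(cache.Contains(1)); // False -- evicted with the oldest set
Console.WriteLine(cache.Contains(7)); // True
Console.WriteLine(cache.Size);        // 5 -- {3,4}, {5,6}, {7}
```

Note that a full rotation evicts a whole bucket at once rather than a single oldest element, which is acceptable here because, as stated above, the exact eviction order of cached hashes is not critical.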
The unit test shows this design is much faster than OrderedDictionary under the same input condition.
[TestMethod]
public void TestHashSetCache()
{
    Stopwatch stopwatch = new Stopwatch();
    var bucket = new HashSetCache<int>(15_000);
    stopwatch.Start();
    for (int i = 1; i <= 550_000; i++)
    {
        bucket.Add(i);
    }
    stopwatch.Stop();
    Console.WriteLine($"Add timespan: {stopwatch.Elapsed.TotalSeconds}s");
    stopwatch.Reset();
    var items = new int[10000];
    var value = 550_000;
    for (int i = 0; i <= 9999; i++)
    {
        items[i] = value;
        value -= 50;
    }
    stopwatch.Start();
    bucket.ExceptWith(items);
    stopwatch.Stop();
    Console.WriteLine($"except with timespan: {stopwatch.Elapsed.TotalSeconds}s");
    stopwatch.Reset();
    stopwatch.Start();
    var ret = bucket.Contains(140_000);
    stopwatch.Stop();
    Console.WriteLine($"contains with timespan: {stopwatch.Elapsed.TotalSeconds}s result: {ret}");
    stopwatch.Reset();
    stopwatch.Start();
    ret = bucket.Contains(545_001);
    stopwatch.Stop();
    Console.WriteLine($"contains with timespan: {stopwatch.Elapsed.TotalSeconds}s result: {ret}");
    stopwatch.Reset();
}
Here is the console output
Add timespan: 0.2357612s
except with timespan: 0.0035483s
contains with timespan: 2.1E-06s result: False
contains with timespan: 4.6E-06s result: True
Neo Version
Where in the software does this update apply to?