Multi Value Dictionary

I recently was asked to implement a dictionary-like data structure that can store multiple values for a given key. The interface I was given is as follows.

    public interface IMultiValueDictionary<K, V> : IEnumerable<KeyValuePair<K, V>>
    {
        bool Add(K key, V value);
        IEnumerable<V> Get(K key);
        IEnumerable<V> GetOrDefault(K key);
        void Remove(K key, V value);
        void Clear(K key);
        IEnumerable<KeyValuePair<K, V>> Flatten();
    }

Since the precondition was that the values are unique for any given key, I decided to use a Dictionary where key is of type K and value is a HashSet of type V, as follows.

private readonly Dictionary<K, HashSet<V>> _data = [];

Let’s go down the list and implement all the methods.

public bool Add(K key, V value)
{
    if (_data.TryGetValue(key, out var values))
    {
        return values.Add(value);
    }
    else
    {
        _data.Add(key, [value]);
        return true;
    }
}

The add key/value method is pretty straightforward. If the values for a given key exist, attempt to add new value to that hash set, which returns a boolean indicating success, otherwise add the key with a new hash set containing only the new value.
Time Complexity: O(1) or O(n) if underlying array runs out of empty spots and larger array has to be reallocated (Dictionary is implemented using an array under the hood, and HashSet is implemented using a Dictionary where key and value are same).

public IEnumerable<V> Get(K key)
{
    if (!_data.TryGetValue(key, out var values))
        throw new KeyNotFoundException();

    return values;
}

The get by key method should throw an exception if the key doesn’t exist, otherwise return the values at key.
Time Complexity: O(1)

public IEnumerable<V> GetOrDefault(K key) => _data.TryGetValue(key, out var values) ? values : [];

The get or default is a safe version of the get method, which will never throw an exception and is useful for method-chaining.
Time Complexity: O(1)

public void Remove(K key, V value)
{
    if (!_data.TryGetValue(key, out var values))
        throw new KeyNotFoundException();

    values.Remove(value);
}

The remove key/value method will throw an exception if the key doesn’t exist, otherwise will try to remove the value.
Time Complexity: O(1)

public void Clear(K key) => _data.Remove(key);

The clear method just removes that key if it exists, and all the values associated with it.
Time Complexity: O(1)

public IEnumerable<KeyValuePair<K, V>> Flatten()
{
    var data = new List<KeyValuePair<K, V>>();

    foreach (var item in _data)
        foreach (var value in item.Value)
            data.Add(new KeyValuePair<K, V>(item.Key, value));

    return data;
}

The flatten method just needs to return a collection of all key/value pairs, where key is repeated for each of its values. The above implementation does the job, however, it creates a copy of all the data in another list. Luckily, C# offers a better solution.

public IEnumerable<KeyValuePair<K, V>> Flatten()
{
    foreach (var item in _data)
        foreach (var value in item.Value)
            yield return new KeyValuePair<K, V>(item.Key, value);
}

The solution is to use yield return, which is an iterator to provide the next value. It avoids having to create another copy of data. Or use shorthand LINQ method syntax as shown below.
Time Complexity: O(n)

public IEnumerable<KeyValuePair<K, V>> Flatten()
    => _data.SelectMany(x => x.Value.Select(value => new KeyValuePair<K, V>(x.Key, value)));

Great! Are we done?

Not exactly. If you notice above, the IMultiValueDictionary interface inherits IEnumerable interface, so if we try to compile as is, we will get two errors.

'MultiValueDictionary<K, V>' does not implement interface member 'IEnumerable<KeyValuePair<K, V>>.GetEnumerator()'
'MultiValueDictionary<K, V>' does not implement interface member 'IEnumerable.GetEnumerator()'

We have to implement GetEnumerator methods, as instructed by the compiler, to define how our data structure will iterate through its data. As it turns out, flatten already does exactly that (or defers that action) by returning an IEnumerable, which has GetEnumerator.

public IEnumerator<KeyValuePair<K, V>> GetEnumerator() => Flatten().GetEnumerator();

And lastly IEnumerable.GetEnumerator is implemented with the following one-liner, which you don’t usually have to change.

IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

And that does it! This compiles nicely, passes all unit tests and is efficient at storing/retrieving data.



Leave a Reply

Your email address will not be published. Required fields are marked *