Software Design Blog

Simple solutions to solve complex problems

Yield IEnumerable vs List Building

This post describes the use of yield and compares it to building and returning a list behind an IEnumerable<T> interface.


Setup

The example consists of a contact store that will allow the client to retrieve a collection of contacts.

The IStore<T>.GetEnumerator method must return IEnumerable<T>, a strongly typed generic interface that exposes an enumerator for fetching the next item in the collection.

Each concrete store decides how the collection is actually produced. For example, the collection could be an array, a generic list or a sequence of yielded items.

       
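    using System;
    using System.Collections.Generic;
    using System.Linq;

    // A store exposes its items as a sequence; T is covariant (out) because it is only returned.
    public interface IStore<out T>
    {
        IEnumerable<T> GetEnumerator();
    }

    // Simple contact record used by the example stores.
    public class ContactModel
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }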
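
To make the point about interchangeable implementations concrete, here is a minimal sketch of three stores that satisfy the same interface. The class names ArrayStore, ListStore and YieldStore are made up for illustration, and plain strings stand in for ContactModel to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public interface IStore<out T>
{
    IEnumerable<T> GetEnumerator();
}

// Hypothetical store backed by a fixed array (eager: the data already exists).
public class ArrayStore : IStore<string>
{
    private readonly string[] _names = { "Bob", "Jim", "Susan" };

    public IEnumerable<string> GetEnumerator() => _names;
}

// Hypothetical store backed by a generic list (eager: built before returning).
public class ListStore : IStore<string>
{
    public IEnumerable<string> GetEnumerator() => new List<string> { "Bob", "Jim", "Susan" };
}

// Hypothetical store backed by an iterator (lazy: items are produced on demand).
public class YieldStore : IStore<string>
{
    public IEnumerable<string> GetEnumerator()
    {
        yield return "Bob";
        yield return "Jim";
        yield return "Susan";
    }
}

public static class Program
{
    public static void Main()
    {
        // The client code is identical for all three stores.
        var stores = new IStore<string>[] { new ArrayStore(), new ListStore(), new YieldStore() };
        foreach (var store in stores)
        {
            Console.WriteLine("{0}: {1}", store.GetType().Name, string.Join(", ", store.GetEnumerator()));
        }
    }
}
```

The client only sees IEnumerable<string>, so it cannot tell from the type alone whether the items already exist or will be produced on demand.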

Calling GetEnumerator

Let's create two different stores, call GetEnumerator on each one and compare the console output to see whether the List Store and the Yield Store behave differently.

List Store

The code below is a common pattern I've observed during code reviews, where a list is instantiated, populated and returned once ALL of the records have been constructed.
    
    public class ContactListStore : IStore<ContactModel>
    {
        public IEnumerable<ContactModel> GetEnumerator()
        {
            var contacts = new List<ContactModel>();
            Console.WriteLine("ContactListStore: Creating contact 1");
            contacts.Add(new ContactModel() { FirstName = "Bob", LastName = "Blue" });
            Console.WriteLine("ContactListStore: Creating contact 2");
            contacts.Add(new ContactModel() { FirstName = "Jim", LastName = "Green" });
            Console.WriteLine("ContactListStore: Creating contact 3");
            contacts.Add(new ContactModel() { FirstName = "Susan", LastName = "Orange" });
            return contacts;
        }
    }

    static void Main(string[] args)
    {
        var store = new ContactListStore();
        var contacts = store.GetEnumerator();

        Console.WriteLine("Ready to iterate through the collection.");
        Console.ReadLine();
    }

Output:

    ContactListStore: Creating contact 1
    ContactListStore: Creating contact 2
    ContactListStore: Creating contact 3
    Ready to iterate through the collection.

Yield Store

The yield alternative is shown below, where each instance is returned as soon as it is produced.
    
    public class ContactYieldStore : IStore<ContactModel>
    {
        public IEnumerable<ContactModel> GetEnumerator()
        {
            Console.WriteLine("ContactYieldStore: Creating contact 1");
            yield return new ContactModel() { FirstName = "Bob", LastName = "Blue" };
            Console.WriteLine("ContactYieldStore: Creating contact 2");
            yield return new ContactModel() { FirstName = "Jim", LastName = "Green" };
            Console.WriteLine("ContactYieldStore: Creating contact 3");
            yield return new ContactModel() { FirstName = "Susan", LastName = "Orange" };
        }
    }

    static void Main(string[] args)
    {
        var store = new ContactYieldStore();
        var contacts = store.GetEnumerator();

        Console.WriteLine("Ready to iterate through the collection.");
        Console.ReadLine();
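    }

Output:

    Ready to iterate through the collection.

No contact was created: calling GetEnumerator only builds the iterator, and its body does not run until the sequence is enumerated.

Let's call the store again and observe the behaviour when we fetch the first contact in the collection.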
  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator();
            Console.WriteLine("Ready to iterate through the collection");
            Console.WriteLine("Hello {0}", contacts.First().FirstName);
            Console.ReadLine();
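        }

Output:

    Ready to iterate through the collection
    ContactYieldStore: Creating contact 1
    Hello Bob

Only the first contact was created, because First() stops pulling from the iterator as soon as it has one item.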
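
The deferred behaviour can also be seen by driving the enumerator by hand. The sketch below is a simplified stand-in for ContactYieldStore (the GetNames method and its string items are assumptions made for brevity). Note that names.GetEnumerator() here is the framework's IEnumerable<T>.GetEnumerator, not the store method of the same name.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // Simplified stand-in for ContactYieldStore that yields first names only.
    public static IEnumerable<string> GetNames()
    {
        Console.WriteLine("Creating contact 1");
        yield return "Bob";
        Console.WriteLine("Creating contact 2");
        yield return "Jim";
    }

    public static void Main()
    {
        // Calling the iterator method prints nothing: its body has not started yet.
        IEnumerable<string> names = GetNames();

        using (IEnumerator<string> cursor = names.GetEnumerator())
        {
            // Still nothing printed by GetNames at this point.
            Console.WriteLine("Enumerator created");

            // MoveNext runs the body up to the first yield return.
            cursor.MoveNext();
            Console.WriteLine("Hello {0}", cursor.Current);

            // Leaving the using block disposes the enumerator,
            // so the code that creates contact 2 never runs.
        }
    }
}
```

Each call to MoveNext resumes the method body where the previous yield return left off, which is why work only happens when the caller asks for the next item.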

Possible multiple enumeration of IEnumerable

Have you ever noticed the "possible multiple enumeration of IEnumerable" warning from ReSharper? ReSharper is warning us that the same sequence may be evaluated more than once, which matters for deferred execution sources such as yield iterators and LINQ queries. Have a look at the results produced by the code below.
  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator();
            Console.WriteLine("Ready to iterate through the collection");

            if (contacts.Any())
            {
                foreach (var contact in contacts)
                {
                    Console.WriteLine("Hello {0}", contact.FirstName);
                }
            }
            
            Console.ReadLine();
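        }

Output:

    Ready to iterate through the collection
    ContactYieldStore: Creating contact 1
    ContactYieldStore: Creating contact 1
    Hello Bob
    ContactYieldStore: Creating contact 2
    Hello Jim
    ContactYieldStore: Creating contact 3
    Hello Susan

Contact 1 is created twice because Any() starts one enumeration and foreach starts a second, and each one runs the iterator body from the top.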
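
Two common ways to avoid the repeated work are sketched below: materialize the sequence once, or stay lazy and drop the Any() check. This is only an illustration. GetNames is a simplified stand-in for ContactYieldStore.GetEnumerator(), and the FirstContactCreations counter is a made-up helper for counting how often the iterator body starts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Counts how many times the iterator body creates the first contact.
    public static int FirstContactCreations;

    // Simplified stand-in for ContactYieldStore.GetEnumerator().
    public static IEnumerable<string> GetNames()
    {
        FirstContactCreations++;
        yield return "Bob";
        yield return "Jim";
        yield return "Susan";
    }

    public static void Main()
    {
        // Option 1: materialize once, then query the list as often as needed.
        List<string> names = GetNames().ToList();
        if (names.Any())
        {
            foreach (var name in names)
            {
                Console.WriteLine("Hello {0}", name);
            }
        }
        Console.WriteLine("Contact 1 created {0} time(s)", FirstContactCreations);

        // Option 2: stay lazy and drop the Any() check;
        // foreach over an empty sequence simply does nothing.
        FirstContactCreations = 0;
        foreach (var name in GetNames())
        {
            Console.WriteLine("Hello {0}", name);
        }
        Console.WriteLine("Contact 1 created {0} time(s)", FirstContactCreations);
    }
}
```

Option 1 trades memory for repeatable access, while option 2 keeps the pull-based behaviour but allows only a single pass.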

IEnumerable.ToList()

What if we have a requirement to materialize (build) the entire collection immediately? The answer is shown below.

  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator().ToList();
            Console.WriteLine("Ready to iterate through the collection");
            Console.ReadLine();
        }

Output:

    ContactYieldStore: Creating contact 1
    ContactYieldStore: Creating contact 2
    ContactYieldStore: Creating contact 3
    Ready to iterate through the collection

Calling .ToList() on an IEnumerable<T> enumerates the sequence once and builds the entire collection up front.

Comparison

The list implementation loaded all of the contacts immediately whereas the yield implementation provided a deferred execution solution.

In the list example, the caller doesn't have the option to defer execution. The yield approach provides greater flexibility since the caller can decide to pre-load the data or pull each record as required. A common trap to avoid is enumerating the same sequence more than once, since yield iterators and LINQ queries repeat their work on every enumeration.

In practice, it is often desirable to perform the minimum amount of work needed in order to reduce the resource consumption of an application.

For example, we may have an application that processes millions of records from a database. The following benefits can be achieved when we use IEnumerable in a deferred execution pull-based model:

  • Scalability, reliability and predictability are likely to improve since the number of records does not significantly affect the application’s resource requirements.
  • Performance and responsiveness are likely to improve since processing can start immediately instead of waiting for the entire collection to be loaded first.
  • Recoverability and utilisation are likely to improve since the application can be stopped, restarted or interrupted, or can fail, with little waste. Only the items in progress are lost, whereas pre-fetching all of the data wastes the work spent on results that were never used.
  • Continuous processing is possible in environments where new work arrives as a constant stream.
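
The pull-based model described above can be sketched with a lazily generated record source. ReadRecords is a hypothetical stand-in for a database reader, and the RecordsRead counter exists only to show how much work was actually done.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Counts how many records the source has actually produced.
    public static int RecordsRead;

    // Hypothetical record source standing in for a database reader:
    // each record is produced only when the caller asks for it.
    public static IEnumerable<int> ReadRecords(int total)
    {
        for (var id = 1; id <= total; id++)
        {
            RecordsRead++;
            yield return id;
        }
    }

    public static void Main()
    {
        // A million records are available, but the caller pulls only five.
        foreach (var id in ReadRecords(1000000).Take(5))
        {
            Console.WriteLine("Processing record {0}", id);
        }

        Console.WriteLine("Records read: {0}", RecordsRead);
    }
}
```

Processing starts with the first record, and the source stops producing records once the caller stops asking. A list-building version would have constructed all one million entries before the loop began.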