Software Design Blog

Simple solutions to solve complex problems

A step-by-step guide to detect errors and retry operations


We are liable for the reliability of our apps, which work when tested but fail in production environments due to unexpected temporary conditions. Errors can occur due to an intermittent service, infrastructure fault, network issue or explicit throttling. If the operation is retried a short time later (maybe a few milliseconds later) the operation may succeed.

A lazy brain can quickly solve this problem without a lot of planning, flexibility or configuration in mind as shown in this MSDN sample that performs a retry when the SMTP mail client fails.

  
            try
            {
                client.Send(message);
            }
            catch (SmtpFailedRecipientsException ex)
            {
                for (int i = 0; i < ex.InnerExceptions.Length; i++)
                {
                    SmtpStatusCode status = ex.InnerExceptions[i].StatusCode;
                    if (status == SmtpStatusCode.MailboxBusy ||
                        status == SmtpStatusCode.MailboxUnavailable)
                    {
                        System.Threading.Thread.Sleep(5000);
                        client.Send(message);
                    }
                }
            }

The problem is how do you retry a failed operation correctly?

The solution is to create an error detection strategy to determine if the error can be retried and use a retry policy to define the number of retries and frequency between retries.

This post extends the previous example by adding retry logic to the SMTP Mail Client. The aim is to illustrate how to improve the reliability of an app by making it more robust.

Download Source Code

Retry Solution - Done Properly

Let’s get started by implementing clean retry code in 4 easy steps.

Step 1 – Add the transient fault handling NuGet package

The Microsoft transient fault handling NuGet package contains generic, reusable and testable components that we will use for solving the SMTP mail client retry problem.

Add the EnterpriseLibrary.TransientFaultHandling NuGet package to the Orders.Implementation project.

Step 2 – Create a transient error detection strategy

Transient (temporarily) faults are errors that can be retried. Here are a few examples of transient errors that may work if retried:

  • The host is not responding or is unavailable
  • The server is busy or the connection was dropped

Here are a few examples of permanent errors that won’t work even if retried:

  • Authentication failed (invalid username or password)
  • Bad request (invalid email address, attachment not found)

Let’s create a detection strategy that will detect SMTP Mail errors that can be retried.

  
    public class SmtpMailErrorDetectionStrategy : ITransientErrorDetectionStrategy
    {
        public bool IsTransient(Exception ex)
        {
            var smtpFailedException = ex as SmtpFailedRecipientsException;
            if (smtpFailedException == null) return false;

            return smtpFailedException.InnerExceptions
                    .Any(mailEx => mailEx.StatusCode == SmtpStatusCode.MailboxBusy ||
                                   mailEx.StatusCode == SmtpStatusCode.MailboxUnavailable);
        }
    }
Step 3 – Create the SMTP Retry Decorator

Based on to the single responsibility principle (SRP), the mail client should only be responsible for one thing, which is sending mail. Adding error detection and retry logic violates the SRP since it shouldn't be it's problem.

The decorator pattern is great for extending an existing class. Just imagine how large the class will become if we continually add cross-cutting functionality to it such as retries, logging and performance monitoring.

Here is the existing SMTP mail client used in the previous post:


We are going to decorate the SMTP mail client with a retry mail client decorator as shown below:

  
    public class SmtpMailClientRetryDecorator : IMailClient
    {
        private readonly IMailClient _next;
        private readonly RetryPolicy _retryPolicy;

        public SmtpMailClientRetryDecorator(IMailClient next, RetryPolicy retryPolicy)
        {
            if (next == null) throw new ArgumentNullException("next");
            if (retryPolicy == null) throw new ArgumentNullException("retryPolicy");
            _next = next;
            _retryPolicy = retryPolicy;
        }

        public void Send(string to, string subject, string body)
        {
            _retryPolicy.ExecuteAction(() => _next.Send(to, subject, body));
        }
    }
Step 4 – Compose the Solution

The transient fault handling library provides the following retry strategies:

Retry strategy

Example (intervals between retries in seconds)

Fixed interval

2,2,2,2,2,2

Incremental intervals

2,4,6,8,10,12

Random exponential back-off intervals

2, 3.755, 9.176, 14.306, 31.895

Let’s register the fixed interval retry strategy in the Unity IoC container as shown below. If you are new to Dependency Injection (DI), read this post.

  
            // This should be defined in app settings
            const int maxRetries = 5;
            var retryInterval = TimeSpan.FromSeconds(2);

            _container.RegisterType<FixedInterval>(
                        new InjectionConstructor(maxRetries, retryInterval));
            _container.RegisterType<RetryPolicy<SmtpMailErrorDetectionStrategy>>(
                        new InjectionConstructor(new ResolvedParameter<FixedInterval>()));
            
            _container.RegisterType<IMailClient, SmtpMailClient>(typeof(SmtpMailClient).FullName);
            _container.RegisterType<IMailClient, SmtpMailClientRetryDecorator>(
                        new InjectionConstructor(
                               new ResolvedParameter<IMailClient>(typeof(SmtpMailClient).FullName),

And for those who may think this is over-engineered and too complicated then this is how you can construct the retry policy in the SMTP mailer class – but of course you will run in to the problems discussed in earlier posts!

  
const int maxRetries = 5;
var retryInterval = TimeSpan.FromSeconds(2);
var policy = new RetryPolicy<SmtpMailErrorDetectionStrategy>(
                    new FixedInterval(maxRetries, retryInterval));
policy.ExecuteAction(() => client.Send(mesage));

Summary

This post showed how to implement a flexible, configurable solution using the Microsoft Transient Fault Handling Application Block to retry transient failures in order to improve app reliability.

If we refer to the original MSDN code sample, we can see a similar pattern by evaluating specific transient error codes (transient error detection strategy), pausing for 5 seconds (retry policy - retry Interval) and retrying the send operation once (retry policy - retry count).

Comments are closed