Processing everything in real time is easy, but it often makes an application tightly coupled, prone to failure and difficult to scale.
Let’s assume a user just clicked on the purchase button to kick off the following workflow:
- Save the order
- Send a confirmation email (mask the credit card number)
- Update the user’s loyalty points in an external system
The problem is that non-critical workloads (steps 2 and 3) can negatively impact an application's performance and recoverability. What happens when the mail server is down and the email confirmation fails? How do we monitor failure? Do we roll the transaction back or replay the workflow by taking a checkpoint after each successful step?
The solution is to ensure that the application flow isn't impeded by waiting on non-critical workloads to complete. Queue-based systems are effective at solving this problem.
This post will use a purchase order application that accepts XML requests. The application will sanitise the credit card details by masking most of the digits before sending the request to a mailbox.
Setup
The following XML document will be used as the purchase order. It is about 512 KB in size to simulate a reasonably large payload.
Test User
123 Abc Road, Sydney, Australia
Visa
4111111111111111
Gambardella, Matthew
XML Developer's Guide
...
The PurchaseOrderProcessor class was intentionally kept small to focus on solving the main problem, which is to minimise the performance impact of the mailing system.
public interface IMailer
{
void Send(string message);
}
public class PurchaseOrderProcessor
{
private readonly IMailer _mailer;
public PurchaseOrderProcessor(IMailer mailer)
{
if (mailer == null) throw new ArgumentNullException("mailer");
_mailer = mailer;
}
public void Process(string xmlRequest)
{
var doc = new XmlDocument();
doc.LoadXml(xmlRequest);
// Process the order
_mailer.Send(xmlRequest);
}
}
The SMTP mailer will be configured to write the mail to disk to simplify the illustration.
public class FileSmtpMailer : IMailer
{
private readonly string _path;
private readonly string _from;
private readonly string _recipients;
private readonly string _subject;
public FileSmtpMailer(string path, string from, string recipients, string subject)
{
if (path == null) throw new ArgumentNullException("path");
_path = path;
_from = @from;
_recipients = recipients;
_subject = subject;
}
public void Send(string message)
{
using (var client = new SmtpClient())
{
// This can be configured in the app.config
client.DeliveryMethod = SmtpDeliveryMethod.SpecifiedPickupDirectory;
client.PickupDirectoryLocation = _path;
using (var mailMessage =
new MailMessage(_from, _recipients, _subject, message))
{
mailMessage.IsBodyHtml = true;
client.Send(mailMessage);
}
}
}
}
The MaskedMailerDecorator will be used for masking the credit card details.
public class MaskedMailerDecorator : IMailer
{
private readonly Regex _validationRegex;
private readonly IMailer _next;
private const char MaskCharacter = '*';
private const int MaskDigits = 4;
public MaskedMailerDecorator(Regex regex, IMailer next)
{
if (regex == null) throw new ArgumentNullException("regex");
if (next == null) throw new ArgumentNullException("next");
_validationRegex = regex;
_next = next;
}
public void Send(string message)
{
if (_validationRegex.IsMatch(message))
{
message = _validationRegex.Replace(message,
match => MaskNumber(match.Value));
}
_next.Send(message);
}
private static string MaskNumber(string value)
{
return value.Length <= MaskDigits ?
new string(MaskCharacter, value.Length) :
string.Format("{0}{1}",
new string(MaskCharacter, value.Length - MaskDigits),
value.Substring(value.Length - MaskDigits, MaskDigits));
}
}
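To make the masking rule concrete, here is a minimal, standalone sketch of the same logic as MaskNumber above, applied to a sample string. The MaskingDemo class, the Mask helper and the sample message are hypothetical names used only for illustration, and the Visa-only pattern is a simplifying assumption rather than the full expression used later in this post.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; it mirrors the MaskNumber rule from MaskedMailerDecorator.
public static class MaskingDemo
{
    private const char MaskCharacter = '*';
    private const int MaskDigits = 4;

    // Keep the last four characters and mask everything before them.
    // Values of four characters or fewer are masked completely.
    public static string MaskNumber(string value)
    {
        return value.Length <= MaskDigits
            ? new string(MaskCharacter, value.Length)
            : new string(MaskCharacter, value.Length - MaskDigits)
              + value.Substring(value.Length - MaskDigits);
    }

    // Replace every match of the regex with its masked form.
    public static string Mask(Regex regex, string message)
    {
        return regex.Replace(message, match => MaskNumber(match.Value));
    }

    public static void Main()
    {
        // Assumption: Visa-only pattern (13 or 16 digits starting with 4).
        var visa = new Regex(@"\b4[0-9]{12}(?:[0-9]{3})?\b");

        Console.WriteLine(Mask(visa, "Card 4111111111111111 charged"));
        // prints: Card ************1111 charged
    }
}
```

Text that does not match the pattern passes through unchanged, which is why the decorator can safely be applied to the whole XML request.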
Baseline
Let's establish a performance baseline by running the application with a null mailer that doesn't do anything. Refer to the "Reject the Null Checked Object" post if you are new to the null object pattern.
public class NullMailer : IMailer
{
public void Send(string message)
{
// intentionally blank
}
}
static void Main(string[] args)
{
var path = Path.Combine(Directory.GetCurrentDirectory(), "PurchaseOrder.xml");
var request = File.ReadAllText(path);
var nullMailer = new NullMailer();
var orderProcessor = new PurchaseOrderProcessor(nullMailer);
var stopWatch = Stopwatch.StartNew();
Parallel.For(0, 1000, i => orderProcessor.Process(request));
stopWatch.Stop();
Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
Console.ReadLine();
}
Seconds: 6.3404086
Real-Time Processing
Let’s measure the performance when MaskedMailerDecorator and FileSmtpMailer are used.
Directory.CreateDirectory(@"C:\temp");
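// Matches Visa (13 or 16 digits starting with 4) and MasterCard (16 digits starting with 51-55).
// The pattern must stay on one line: a line break inside a verbatim string becomes part of the pattern.
var ccRegEx = new Regex(@"(?:\b4[0-9]{12}(?:[0-9]{3})?\b|\b5[1-5][0-9]{14}\b)", RegexOptions.Compiled);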
var path = Path.Combine(Directory.GetCurrentDirectory(), "PurchaseOrder.xml");
var request = File.ReadAllText(path);
// Use Unity for doing the wiring
// Placeholder addresses; the original addresses were lost to email obfuscation.
var fileMailer = new FileSmtpMailer(@"C:\temp", "from@example.com", "to@example.com", "Order");
var maskedMailer = new MaskedMailerDecorator(ccRegEx, fileMailer);
var orderProcessor = new PurchaseOrderProcessor(maskedMailer);
var stopWatch = Stopwatch.StartNew();
Parallel.For(0, 1000, i => orderProcessor.Process(request));
stopWatch.Stop();
Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
Console.ReadLine();
Seconds: 32.0430142
Ouch! The application took about five times as long to process the same number of requests.
Background Processing
Let's extend the solution by adding a memory queue to buffer the results. The queue effectively acts as an outbox for sending mail without overwhelming the mail server with parallel requests.
public class QueuedMailerDecorator : IMailer, IDisposable
{
private readonly IMailer _next;
private BlockingCollection<string> _messages;
public QueuedMailerDecorator(IMailer next)
{
if (next == null) throw new ArgumentNullException("next");
_next = next;
_messages = new BlockingCollection<string>();
Task.Factory.StartNew(() =>
{
try
{
// Block the thread until a message becomes available
foreach (var message in _messages.GetConsumingEnumerable())
{
_next.Send(message);
}
}
finally
{
_messages.Dispose();
}
}, TaskCreationOptions.LongRunning);
}
public void Send(string message)
{
if (_messages == null || _messages.IsAddingCompleted)
{
return;
}
try
{
_messages.TryAdd(message);
}
catch (ObjectDisposedException)
{
Trace.WriteLine("Add failed since the queue was disposed.");
}
}
public void Dispose()
{
if (_messages != null && !_messages.IsAddingCompleted)
{
_messages.CompleteAdding();
}
GC.SuppressFinalize(this);
}
}
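The diagram for this section depicts the chain of decorated mailers that intercepts the communication between the PurchaseOrderProcessor and the FileSmtpMailer: the PurchaseOrderProcessor calls the QueuedMailerDecorator, which hands each message to the MaskedMailerDecorator, which in turn calls the FileSmtpMailer.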
Let's run the code below to evaluate if the queue made a performance difference.
// Use Unity for doing the wiring
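// Placeholder addresses; the original addresses were lost to email obfuscation.
var fileMailer = new FileSmtpMailer(@"C:\temp", "from@example.com", "to@example.com", "Order");
// ccRegEx is the credit card expression defined in the real-time processing example
var maskedMailer = new MaskedMailerDecorator(ccRegEx, fileMailer);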
var queuedMailer = new QueuedMailerDecorator(maskedMailer);
var orderProcessor = new PurchaseOrderProcessor(queuedMailer);
var stopWatch = Stopwatch.StartNew();
Parallel.For(0, 1000, i => orderProcessor.Process(request));
stopWatch.Stop();
Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
Console.ReadLine();
Seconds: 6.3908034
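Success! The time taken to process the orders was nearly identical to the baseline without a mailer, because Process now only adds the message to the queue and the mail is sent on a background thread.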
The drawback of the in-memory queue used in this post is that it requires memory to store messages temporarily before they are processed. It is possible to lose messages if the application crashes or stops unexpectedly before all of the requests have been processed.
These issues can be addressed with a locally persisted queue such as MSMQ or a cloud-based queue such as Azure Service Bus. Queues provide many benefits that will be covered in the next post.
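Short of a persisted queue, an orderly shutdown can at least drain the in-memory queue before the process exits. The sketch below is one possible way to do that, assuming the consumer task is kept in a field so that Dispose can wait for it. DrainingQueue and DrainDemo are hypothetical names and are not part of the post's source code. This does not help when the application crashes; it only covers a normal shutdown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical variant of the in-memory queue that waits for pending messages on Dispose.
public sealed class DrainingQueue : IDisposable
{
    private readonly BlockingCollection<string> _messages = new BlockingCollection<string>();
    private readonly Task _consumer;
    private bool _disposed;

    public DrainingQueue(Action<string> handler)
    {
        if (handler == null) throw new ArgumentNullException("handler");

        // A single background consumer processes messages one at a time.
        _consumer = Task.Factory.StartNew(() =>
        {
            // Blocks until a message arrives; ends once adding is completed and the queue is empty.
            foreach (var message in _messages.GetConsumingEnumerable())
            {
                handler(message);
            }
        }, TaskCreationOptions.LongRunning);
    }

    public void Enqueue(string message)
    {
        _messages.Add(message);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        _messages.CompleteAdding(); // no new messages are accepted from here on
        _consumer.Wait();           // block until every queued message has been handled
        _messages.Dispose();
    }
}

public static class DrainDemo
{
    public static void Main()
    {
        var delivered = new ConcurrentQueue<string>();

        using (var queue = new DrainingQueue(delivered.Enqueue))
        {
            for (var i = 0; i < 100; i++)
            {
                queue.Enqueue("order " + i);
            }
        } // Dispose returns only after the consumer has drained the queue

        Console.WriteLine(delivered.Count); // prints: 100
    }
}
```

Note that Wait rethrows any exception raised by the handler as an AggregateException, so a real implementation would need to decide how failed messages are logged or retried.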
Summary
This post provided evidence of the performance degradation caused by processing non-critical workloads in real time. Using queues can be an effective technique to keep an application responsive by buffering tasks and processing them in the background.