Workflow, Part I

In the next series of posts I’m going to dig into a specific workflow example, showing how continuations can be used to support a process that moves forward by fits and starts, and how this can be considerably easier than you expect.  I’d note again that I am not arguing against the use of Windows Workflow Foundation with these examples; instead, I’m trying to pull back the curtain on the fundamental concepts so you can build a clear mental model of the process and the applicable practices.

 

Presenting this information in context is going to take a series of long-winded posts.  While the technique is simple, it is subtle, and there are complexities under the surface.  To give some sense of the value of struggling through these less than adequately edited posts, I’ll start with my conclusion.

 

Continuations are architecturally significant because the range of problems they apply to is very large, and the solution domain they are realized in is almost trivial, assuming you are developing in the .NET environment.  The problem range is large because continuations address the cross-cutting concerns of creating reliable processes that communicate with unreliable, often poorly performing entities external to the main system.  In most systems, the logic required to guarantee reliability and stability across a distributed system, where the nodes and links do not all share the same non-functional characteristics of performance and reliability, is large, complex, and expensive.  This technique reduces the complexity of that logic by several orders of magnitude, and potentially raises the overall reliability of the system by several orders of magnitude.

 

Bear with me through the following posts, and I’ll give you all the code and information you need to ascertain this for yourself and implement it in your own projects.

 

All of the examples in this case study are drawn from the world of the HelpMatch system.  My reasons for using this system and some links to other relevant data are here.

 

One of the central goals of HelpMatch is to allow individuals with items to donate to get them to people with needs.  Consider this basic series of steps:

 

  1. A Request is entered into HelpMatch for a specific set of items
  2. The Request is approved (or not) by someone associated with HelpMatch
  3. If the Request is approved, then requests for each needed item are sent to all possible donors of that item.
  4. Once one or more donors have filled the request for a specific item, then the rest of the donors are notified that the request has been filled.

 

Given the disconnected nature of HelpMatch, and the early stage of the design, let’s assume (for the moment) that everybody actually understands XML, and that the primary form of communication between participants is email.  Returning to the above example, the act of requesting approval involves sending an email message to a HelpMatch associate, where the email message contains XML that defines the request and provides the ability for the associate to approve (or not) the request.

 

We’ll make one further assumption at this point, that the subject line of the email contains the unique key used to identify the continuation, so that our general purpose mail handler can easily associate a continuation with a piece of email.  Recall from some of my previous posts that I observed it was necessary to be able to uniquely identify a continuation from outside the scope of the running system, because there would be no guarantee the system would be running at the time the continuation was activated.

 

Consider a real workflow.  A request was (magically) entered into the database, and another utility (magically) discovered this new request and started the HelpMatch Request workflow, which goes through the steps defined above.  The first step this workflow takes is to get a HelpMatch associate to approve the request.  Since she started this, let’s say Ruth Malan is at the top of the list, so an email request, containing an XML message and having a unique key in the subject line, is sent to her.  At this point there isn’t anything else the system can do until Ruth responds, at least with respect to this specific workflow instance.  So it creates a continuation, associates it with the same unique key that was in the email subject line, saves the continuation to durable storage, and exits.

 

Our workflow can be defined as a collection of methods in the class RequestWorkflow.  This class is marked as serializable, and all of our continuations are bound to instance member functions, so that whenever we serialize a continuation, the instance data associated with the class is bound to the continuation.

 

Let’s look at a potential constructor for this class.

 

public RequestWorkflow(Request request, Agent agent)
{
   _agent = agent;
   _request = request;

   // get the process started by generating an approval request
   string continID = String.Format("approvalRequest:{0}", _request.ID);
   // the continuation key doubles as the email subject line
   GlobalEmailManager.SendMessage(continID, _agent.Email, generateApprovalRequest());
   // register where the workflow should pick up once a reply arrives
   GlobalContinuationManager.Add(new Continuation(continID, RequestApproved));
}

 

We have two member variables in the class, the Agent (a HelpMatch associate) and the Request.  Since we’re placing references to instances of these two classes in our serializable RequestWorkflow class, we must remember that these classes also need to be serializable.  In general, the rule of thumb you should use is that the instance that represents any ‘continuable’ workflow must be serializable, and this also applies to all of the references it contains.  There are of course exceptions to this, but for the moment just assume it’s serializable all the way down.
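
The post does not show its persistence code, so the following is only a minimal sketch of the "serializable all the way down" rule using a swapped-in technique: instead of serializing the delegate itself, it saves the continuation key, the name of the method to re-enter, and the workflow's state as JSON, then rebinds the delegate by reflection after a simulated restart. DemoWorkflow, SavedContinuation, and ContinuationStore are hypothetical names, and the workflow state is reduced to plain values on the assumption that business objects can be reloaded by ID.

```csharp
using System;
using System.Text.Json;

// Hypothetical workflow whose state is plain data, so it can be persisted.
public class DemoWorkflow
{
    public int RequestId { get; set; }
    public string AgentEmail { get; set; } = "";
    public string Outcome { get; set; } = "pending";

    // The method the continuation re-enters when a reply arrives.
    public void RequestApproved(object data)
    {
        Outcome = (bool)data ? "approved" : "rejected";
    }
}

// What is written to durable storage (assumed shape, not from the post):
// the key, the method to re-enter, and the workflow instance's state.
public class SavedContinuation
{
    public string Identity { get; set; } = "";
    public string MethodName { get; set; } = "";
    public DemoWorkflow Workflow { get; set; } = new DemoWorkflow();
}

public static class ContinuationStore
{
    // Capture the delegate's target instance and method name as JSON.
    public static string Save(string identity, Action<object> target)
    {
        var saved = new SavedContinuation
        {
            Identity = identity,
            MethodName = target.Method.Name,
            Workflow = (DemoWorkflow)target.Target
        };
        return JsonSerializer.Serialize(saved);
    }

    // Rebuild the workflow instance and bind a fresh delegate to it.
    public static Action<object> Load(string json, out DemoWorkflow workflow)
    {
        SavedContinuation saved = JsonSerializer.Deserialize<SavedContinuation>(json);
        workflow = saved.Workflow;
        return (Action<object>)Delegate.CreateDelegate(
            typeof(Action<object>), workflow, saved.MethodName);
    }
}
```

Saving with `ContinuationStore.Save("approvalRequest:42", workflow.RequestApproved)` and later calling the delegate returned by `Load` runs RequestApproved against a restored copy of the instance, with RequestId and AgentEmail intact. Anything the workflow references that cannot be written out this way would break the round trip, which is the practical reason for the rule of thumb above.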

 

So our constructor initializes the two member variables, and then it creates a unique identifier for the continuation, defined as the text approvalRequest:, followed by the unique identifier for that request.  For the moment, look at the request and agent instances as arbitrary business objects, where they each have an ID property which allows them to be recovered from, or persisted to, durable storage.

 

The GlobalEmailManager and GlobalContinuationManager are two external properties, defined in the HelpMatchWorkflow base class, from which RequestWorkflow has been derived.  For the moment, simply assume the first is a magical mechanism that handles all sending and receiving of email, and the second handles everything related to continuations.

 

The GlobalEmailManager is used to send and receive email messages.  In this case, to send a message, it takes three parameters: the unique key we have assigned the continuation, the email address, and the text that is to be placed in the message.  In this example we use a private function, generateApprovalRequest(), that is assumed to generate the proper XML for the message body.

 

After the message has been sent, the continuation is generated.  The continuation is bound to the RequestApproved instance member function of the RequestWorkflow, i.e. a function with the definition public void RequestApproved(Continuation continuation).  It is assigned the same identity as was used to identify the email, which means that we have a means for associating any email returned by Ruth in response to the approval request with this continuation.

 

Let’s look at this very closely.  We sent an email, and now we’ve established a continuation which indicates where we want to go next.  We are not saying what we want to do next; we are saying what we want to do next in this workflow.  Where we want to go next in the workflow is contingent on having received a response to our email. The difference is subtle, but critical.  Imagine threads, memory, and everything else were free, so that we could simply make a synchronous email transmission, blocking further activity by the workflow until someone responded to the message.  It’s a remarkably simple programming model; it’s just not very realistic.  Our HelpMatch system would never scale, and the whole thing would fall apart any time the application stopped, because so much of its state would be defined by execution context.

 

Look at what will happen in our managed workflow system.  We can have a fairly unsophisticated mail handler, one that just strips the fluff (e.g. ‘re:’) from the subject line and uses the remainder as a key to locate a continuation.  We do not need to keep this flow of execution running, we just need to be able to call the continuation function (RequestApproved) bound to the same instance of the object that originally sent the email.
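
As a minimal sketch of the subject-line handling just described, assuming the key is whatever remains once reply and forward prefixes are removed: SubjectKey and its prefix list are hypothetical, and real mail clients add other prefixes that this does not cover.

```csharp
using System;

public static class SubjectKey
{
    // Prefixes treated as "fluff"; this list is an assumption, not exhaustive.
    private static readonly string[] Prefixes = { "re:", "fw:", "fwd:" };

    // Strips any number of leading reply/forward prefixes and surrounding
    // whitespace, returning the remainder as the continuation key.
    public static string Extract(string subject)
    {
        string key = subject.Trim();
        bool stripped = true;
        while (stripped)
        {
            stripped = false;
            foreach (string prefix in Prefixes)
            {
                if (key.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
                {
                    key = key.Substring(prefix.Length).TrimStart();
                    stripped = true;
                }
            }
        }
        return key;
    }
}
```

For example, `SubjectKey.Extract("RE: Re: approvalRequest:42")` returns `"approvalRequest:42"`, which is the same string the workflow used when it registered the continuation.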

 

Assume for the moment we don’t stop the host program.  Here’s the code for the ContinuationManager Add() function, which was called in the code above to add the new continuation to our dictionary of continuations.

    private Dictionary<string, Continuation> _continuations =
        new Dictionary<string, Continuation>();

    public void Add(Continuation theContinuation)
    {
        _continuations.Add(theContinuation.Identity, theContinuation);
    }

 

The Identity value of the continuation is the name we assigned to it, generated by this statement in the RequestWorkflow constructor.

 

string continID = String.Format("approvalRequest:{0}", _request.ID);

 

We placed exactly the same value in the subject line of the email, so if we assume there’s some externally triggered logic that gets handed an email message, it’s not hard to see how we could recover this key from the subject line.  Given that we have recovered the key from the subject line, the following code could resume the workflow logic by calling through to the continued function.

 

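    public void Resume(string continuationID, object data)
    {
        // look up the continuation registered under this key
        Continuation theContinuation = _continuations[continuationID];
        // hand over the reply data; the continuation stores it and then
        // invokes the workflow method it is bound to
        theContinuation.Resume(data);
    }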

 

The first argument is the continuation key we extracted from the email’s subject line; the second is the data we extracted from the email.  For the moment, since we’ve assumed XML, assume we’ve located some standard element and extracted and parsed the inner text.  The last line of the Resume() function calls the Resume method on the continuation itself, which is defined as:

        public void Resume(object data)
        {
            _data = data;
            _continuation(this);
        }

 

The first line is simply binding the received data value to the continuation, where the resumed function will be able to access it.  The second line is calling the delegate function, defined as:

 

public delegate void BookmarkLocation(Continuation resumed);

private BookmarkLocation _continuation;
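
The manager's Resume() shown earlier indexes the dictionary directly, which throws a KeyNotFoundException if a reply arrives with a subject that matches no continuation. Here is a hedged sketch of a more defensive variant. TryResume is a hypothetical name, and removing the continuation once it has been resumed is an assumption (one reply per key) rather than something the post specifies. The Continuation class is assembled from the fragments above so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

public delegate void BookmarkLocation(Continuation resumed);

// Assembled from the fragments in the post; constructor shape is assumed.
public class Continuation
{
    private readonly BookmarkLocation _continuation;
    public string Identity { get; }
    public object Data { get; private set; }

    public Continuation(string identity, BookmarkLocation continuation)
    {
        Identity = identity;
        _continuation = continuation;
    }

    public void Resume(object data)
    {
        Data = data;          // bind the reply data to the continuation
        _continuation(this);  // re-enter the workflow
    }
}

public class ContinuationManager
{
    private readonly Dictionary<string, Continuation> _continuations =
        new Dictionary<string, Continuation>();

    public void Add(Continuation theContinuation)
    {
        _continuations.Add(theContinuation.Identity, theContinuation);
    }

    // Returns false instead of throwing when the key is unknown.
    public bool TryResume(string continuationID, object data)
    {
        if (!_continuations.TryGetValue(continuationID, out Continuation theContinuation))
        {
            return false;
        }
        // Assumption: each continuation is resumed at most once.
        _continuations.Remove(continuationID);
        theContinuation.Resume(data);
        return true;
    }
}
```

With this variant a mail handler can log or ignore unrecognized subjects, and a duplicate reply to the same approval request is reported as unknown rather than re-running the workflow step.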

 

I won’t get into the details of the resumed workflow in this post; however, it is the RequestApproved member function that will be called, because that was the function we used when we created the continuation in the first place.  The instance of the RequestWorkflow that was active when the continuation was defined is associated with the delegate we created, so the same object instance that we had in the constructor when we created the continuation is the one we have when the workflow resumes.  I’ve shown a fragment of this function below, and you can see how it’s getting the data bound to the continuation (what came in when the response was delivered) to control its internal flow.

 

public void RequestApproved(Continuation continuation)
{
     if((bool) continuation.Data)
     {
         // request was approved, so requests to all of the donors need to be spawned off
     }
     else
     {
         // request was not approved
     }
}
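
To see the pieces working together, here is a self-contained sketch that stays within the "host keeps running" assumption. The Managers class is a hypothetical in-memory stand-in for the two global managers, the workflow carries only a request ID rather than Request and Agent objects, and the Status property is added purely so the result can be observed.

```csharp
using System;
using System.Collections.Generic;

public delegate void BookmarkLocation(Continuation resumed);

public class Continuation
{
    private readonly BookmarkLocation _continuation;
    public string Identity { get; }
    public object Data { get; private set; }

    public Continuation(string identity, BookmarkLocation continuation)
    {
        Identity = identity;
        _continuation = continuation;
    }

    public void Resume(object data)
    {
        Data = data;
        _continuation(this);
    }
}

// Hypothetical stand-ins for the email and continuation managers.
public static class Managers
{
    public static readonly List<string> SentSubjects = new List<string>();
    public static readonly Dictionary<string, Continuation> Continuations =
        new Dictionary<string, Continuation>();
}

public class RequestWorkflow
{
    private readonly int _requestId;
    public string Status { get; private set; } = "awaiting approval";

    public RequestWorkflow(int requestId)
    {
        _requestId = requestId;
        string continID = String.Format("approvalRequest:{0}", _requestId);
        Managers.SentSubjects.Add(continID); // "send" the approval email
        Managers.Continuations.Add(continID, new Continuation(continID, RequestApproved));
    }

    public void RequestApproved(Continuation continuation)
    {
        Status = (bool)continuation.Data ? "approved" : "rejected";
    }
}

public static class Demo
{
    public static string Run()
    {
        var workflow = new RequestWorkflow(42);
        // Later, a reply arrives whose subject carries the same key.
        string key = Managers.SentSubjects[0];
        Managers.Continuations[key].Resume(true);
        return workflow.Status; // "approved"
    }
}
```

The caller never touches the workflow object between construction and the reply; the delegate stored in the continuation is what routes the approval back to that same instance.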

 

 

Summarizing, we can see how a continuation allows us to establish a point in the flow of a program’s execution that we can resume at an arbitrary later time.  We could accomplish exactly the same effect with a state machine, or even a case statement, but those would both be fairly invasive implementation patterns, i.e. we’d lose a lot of design flexibility because of the need to use a specific logical structure to be able to leave and re-enter a logical sequence of operations.  A continuation, created by applying a standardized naming convention to instance method delegates, allows us to gain the same advantage with a much smaller effect on our operating code.

 

 

 

Published Wednesday, April 25, 2007 5:56 AM by MarkMMullin