Thursday, April 24, 2014

RedPrairie Implementation of Sampling in Workflows

Sometimes you come across an algorithm that is so simple yet elegant that you have to appreciate it.  It is a common requirement to apply some operation to a percentage of tasks for audit reasons.  For example, you may want to divert 10% of picked cartons to a special lane for audit.  Similarly, you may want to apply certain workflows to only a percentage of items.

When I first considered the problem, I concluded that I would have to somehow mark the task record, e.g. in pckwrk, to indicate which cartons need to be audited.  RedPrairie, however, has the concept of Workflow Rates:


Looking at this setup I was intrigued, because it was independent of any particular workflow and spoke in terms of probability.  Going deeper into the implementation, I was amazed at its elegance and simplicity: rather than marking every nth row and dealing with the associated complexities, it simply rolls a die.  If it comes up "1", apply the workflow.  Genius!

Consider a workstation next to a conveyor on which picked cartons are traveling, where it is the user's job to audit 50% of the cartons that pass by.  He could try to pick every other carton, or he could simply flip a coin each time he sees one: heads, audit it; tails, let it go.  The Law of Large Numbers guarantees that the results will approach the desired rate.

How do we toss a coin in software?  We generate random numbers.  For random numbers generated by software, each number in a given sequence has an equal chance of occurring, and other patterns in the data should also be distributed equiprobably.  So all we have to do is generate a number and check it against the defined probability to decide whether to perform the operation.
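In Python terms, the check reduces to a single comparison against a uniform random draw.  This is only a sketch of the idea, not RedPrairie's actual implementation; the function name is my own:

```python
import random

def should_apply_workflow(prob_qty, max_prob_qty):
    """Return True with probability prob_qty / max_prob_qty.

    Sketch of the idea behind a probability check: draw a uniform
    random integer in [0, max_prob_qty) and see whether it falls
    below the configured rate.  (Not RedPrairie's actual code.)
    """
    return random.randrange(max_prob_qty) < prob_qty

# Apply a workflow to roughly 10% of cartons:
# if should_apply_workflow(10, 100): audit the carton
```

Because each draw is independent, no state needs to be kept between cartons, which is exactly what makes the approach so simple.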

To see it working, the command responsible for this is "get probability calculation result".  You pass in prob_qty and max_prob_qty; for example, for a 10% probability you would pass in 10 and 100 respectively.

To test it, you can run it for large numbers of iterations and see the results.  So let's run it for 10, 100, 1000, and 10000 iterations and see how many workflows we end up doing.  The results were, as expected:

When you run the test the exact counts will differ, but they will approach the desired 10% result.  The simplicity of this solution is truly elegant, and the result should be acceptable for any practical purpose.

For reference, the code to generate the above table is:

do loop where iterations = 4
|
{
   /* Compute 10^(i+1), giving 10, 100, 1000, 10000 iterations */
   sl_power
   where num = 10
   and power = @i+1
   |
   publish data
   where max_iterations = @o_value
   |
   {
      do loop where iterations = @max_iterations
      |
      {
         /* 10 out of 100 = 10% probability */
         get probability calculation result
         where prob_qty = 10
         and max_prob_qty = 100
         and invtid_num = 1
         |
         /* Keep only the iterations where the workflow fired */
         if ( @result = 1 )
            filter data where moca_filter_level = '1'
      }
   }
   >> res
   |
   /* Count how many times the workflow was applied */
   publish data
   where max_iterations = @max_iterations
   and cnt_work_done = rowcount(@res)
}
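For comparison, here is a rough Python equivalent of the MOCA loop above.  RedPrairie's probability command is replaced by a plain uniform random draw, which is an assumption about its behavior, not its actual code:

```python
import random

# Simulate a 10% sampling rate at increasing iteration counts and
# watch the observed count approach the expected 10%.
for max_iterations in (10, 100, 1000, 10000):
    cnt_work_done = sum(
        1 for _ in range(max_iterations)
        if random.randrange(100) < 10   # 10 out of 100 = 10% probability
    )
    print(f"{max_iterations:>6} iterations: {cnt_work_done} workflows done")
```

Each run prints different counts, but the ratio tightens around 10% as the iteration count grows, just as in the MOCA test.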