Overview
Every large implementation needs some tasks to run periodically. Some of these tasks exist by design; others clean up after data corruption. Often we cannot get a fix from the vendor, or the fix requires an upgrade that is not feasible.
In previous versions, Schedule Agents could handle this. Newer versions provide a slightly more elaborate construct called jobs.
Our Approach
At Oracular we take a slightly more sophisticated approach to this problem. We implement an application monitor that detects abnormalities and fixes them. It still executes through the provided constructs (schedule agents or jobs), but it presents a more abstract view of the problem, so the overall solution is easier to manage.
We view the problem of an application monitor as follows:
- Detect issues of a certain type and return one row per issue. The detection can be a MOCA command or a SQL statement.
- For each row returned above, execute MOCA code to fix that issue.
- Optionally log to dlytrn.
- Optionally raise EMS alerts.
- Provide a front-end so that advanced end users can manipulate these monitors or add new ones.
- Run all monitors as a single job so that the overall job schedule is not adversely impacted.
- Publish progress to the job monitoring framework developed by Oracular.
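The detect-then-fix flow in the list above can be sketched in a few lines. This is a minimal illustration in Python rather than MOCA. The names `run_monitor` and `MonitorResult`, and the in-memory lists standing in for dlytrn logging and EMS alerts, are all assumptions made for the example, not part of the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class MonitorResult:
    """Outcome of one monitor run (hypothetical structure)."""
    detected: int = 0
    fixed: int = 0
    failed: int = 0
    log: List[str] = field(default_factory=list)     # stand-in for dlytrn entries
    alerts: List[str] = field(default_factory=list)  # stand-in for EMS alerts


def run_monitor(
    monitor_id: str,
    detect: Callable[[], List[Row]],
    fix: Callable[[Row], None],
    log_each_row: bool = False,
    alert_on_fail: bool = False,
) -> MonitorResult:
    """Run detect once, then run fix once per detected row."""
    result = MonitorResult()
    rows = list(detect())
    result.detected = len(rows)
    for row in rows:
        if log_each_row:
            result.log.append(f"{monitor_id}: detected {row}")
        try:
            fix(row)
            result.fixed += 1
        except Exception as exc:
            # One failing row must not stop the remaining rows.
            result.failed += 1
            if alert_on_fail:
                result.alerts.append(f"{monitor_id}: fix failed for {row}: {exc}")
    return result


# Usage: three detected issues, one of which cannot be fixed.
def detect_demo() -> List[Row]:
    return [{"id": 1}, {"id": 2}, {"id": 3}]


def fix_demo(row: Row) -> None:
    if row["id"] == 2:
        raise RuntimeError("record is locked")


demo = run_monitor("DEMO", detect_demo, fix_demo, log_each_row=True, alert_on_fail=True)
print(demo.detected, demo.fixed, demo.failed)
```

Isolating each fix in its own try/except is a design choice in this sketch: a single bad row produces an alert but does not block the rest of the rows.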
Our Solution
We define these monitors in a new table:
Column | Description |
Monitor Id | A unique name for each monitor |
Description | Detailed information about it |
Sort# | To control the sequence in which the monitors run |
Time Since Last Run | Seconds that must elapse since the monitor's last run before it runs again. The monitor job could be running every minute, but an individual monitor may be defined to run only every 10 minutes. |
Enabled? | 1/0 to enable or disable the monitor |
Detect Command | 4000-byte field holding the MOCA or SQL that detects an abnormality. Returns n rows. |
Fix Command | 4000-byte field holding the MOCA that fixes one row. Executes once for each row returned by the detect command. |
Log Dlytrn? | 1/0 to log dlytrn for each row returned by the detect command |
Raise EMS on Fail? | 1/0. If enabled, raise an EMS alert when the fix command fails. |
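To make the table concrete, here is a rough sketch of one row as a Python data structure. The field names are hypothetical translations of the column labels above (the actual column names are not given here), and the command strings are placeholders rather than real MOCA.

```python
from dataclasses import dataclass
from typing import Iterable, List

MAX_COMMAND_BYTES = 4000  # field size stated in the table


@dataclass(frozen=True)
class MonitorDefinition:
    """One row of the monitor table (field names are assumptions)."""
    monitor_id: str
    description: str
    sort_seq: int
    time_since_last_run: int  # seconds
    enabled: bool
    detect_command: str
    fix_command: str
    log_dlytrn: bool
    raise_ems_on_fail: bool

    def __post_init__(self) -> None:
        # Enforce the 4000-byte limit on both command fields.
        for name in ("detect_command", "fix_command"):
            size = len(getattr(self, name).encode("utf-8"))
            if size > MAX_COMMAND_BYTES:
                raise ValueError(f"{name} is {size} bytes; limit is {MAX_COMMAND_BYTES}")


def runnable_in_order(monitors: Iterable[MonitorDefinition]) -> List[MonitorDefinition]:
    """Keep enabled monitors only, ordered by their sort number."""
    return sorted((m for m in monitors if m.enabled), key=lambda m: m.sort_seq)


sample = [
    MonitorDefinition("STUCK-RESCOD", "Clear stale rescod", 20, 600, True,
                      "<detect command text>", "<fix command text>", True, True),
    MonitorDefinition("ORPHAN-INVMOV", "Remove orphan invmov", 10, 60, True,
                      "<detect command text>", "<fix command text>", True, False),
    MonitorDefinition("OLD-MONITOR", "Disabled example", 5, 60, False,
                      "<detect command text>", "<fix command text>", False, False),
]

print([m.monitor_id for m in runnable_in_order(sample)])
```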
The solution takes the form of a single MOCA command called "run ossi monitor" that is scheduled using MOCA jobs. We define this job to execute every minute, and we use the "Time Since Last Run" concept to control individual monitors that we want to run less frequently. Users can maintain this information themselves using a simple front-end application.
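The interplay between the every-minute job and "Time Since Last Run" can be illustrated with a small simulation. This is a sketch under assumptions: the helper `is_due`, and the rule that a monitor which has never run is always due, are illustrative choices rather than documented behavior of "run ossi monitor".

```python
from datetime import datetime, timedelta
from typing import Optional


def is_due(last_run: Optional[datetime], min_seconds: int, now: datetime) -> bool:
    """A monitor is due if it never ran or enough seconds have elapsed."""
    if last_run is None:
        return True
    return (now - last_run).total_seconds() >= min_seconds


def count_runs(min_seconds: int, ticks: int, start: datetime) -> int:
    """Simulate the job firing once per minute for `ticks` minutes."""
    last_run: Optional[datetime] = None
    runs = 0
    for minute in range(ticks):
        now = start + timedelta(minutes=minute)
        if is_due(last_run, min_seconds, now):
            runs += 1
            last_run = now
    return runs


start = datetime(2024, 1, 1, 8, 0, 0)
every_tick = count_runs(min_seconds=0, ticks=30, start=start)
every_ten_minutes = count_runs(min_seconds=600, ticks=30, start=start)
print(every_tick, every_ten_minutes)
```

Over 30 one-minute ticks, the monitor with a 600-second setting runs at minutes 0, 10, and 20, while the job itself still fires every minute.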
Every WMS implementation accumulates jobs like these over time to handle typical issues. For example:
- We often have orphan invmov records. We can detect such records and then delete them.
- Sometimes we have pckmov records that do not have corresponding pckwrk records. We can detect the condition and kill such pckmov records.
- It is common to have situations where rescod is not cleared. We can detect the condition and run "deallocate resource location" for these locations.
- It is often a requirement to reprocess downloads in EERR status.
- We can create a monitor to detect some abnormal condition and send an email to a set of users with results of the query. This solution is easier than creating an EMS alert since it does not require a rollout process and the email can include detailed results.
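As an illustration of the orphan pckmov example above, the following sketch runs a detect query and a per-row fix against an in-memory SQLite database. The two-column table layouts and the join on `cmbcod` are simplified assumptions for the demo and do not reflect the full WMS schema. A real monitor would hold the detect SQL and the fix as MOCA in the monitor table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the two tables.
conn.executescript("""
    create table pckwrk (wrkref text primary key, cmbcod text);
    create table pckmov (cmbcod text, seqnum integer);
    insert into pckwrk values ('W1', 'C1');
    insert into pckmov values ('C1', 0);
    insert into pckmov values ('C2', 0);
    insert into pckmov values ('C3', 0);
""")

# Detect: one row per pckmov record with no matching pckwrk record.
DETECT_SQL = """
    select pm.cmbcod, pm.seqnum
      from pckmov pm
     where not exists (select 1 from pckwrk pw where pw.cmbcod = pm.cmbcod)
     order by pm.cmbcod, pm.seqnum
"""


def detect(db: sqlite3.Connection) -> list:
    return db.execute(DETECT_SQL).fetchall()


def fix(db: sqlite3.Connection, row: tuple) -> None:
    # Fix exactly one detected row.
    db.execute("delete from pckmov where cmbcod = ? and seqnum = ?", row)


orphans = detect(conn)
for orphan in orphans:
    fix(conn, orphan)
conn.commit()

remaining = conn.execute("select cmbcod from pckmov").fetchall()
print(orphans, remaining)
```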
Conclusion
Some sort of application monitor is a common requirement in any large system, and RedPrairie/JDA WMS is no exception. The design pattern above can be implemented for any large system. It looks at the problem abstractly and provides a solution that is simple to implement, maintain, and manage.