Tuesday, September 20, 2005

BizTalk Rules engine scalability

When exploring different rule engine vendors, you will probably encounter references to the Rete algorithm by Charles Forgy. This algorithm has proven itself to scale well for very large rulesets.
In all my years of experience, I've never encountered a client with more than 500 rules in a single rule policy. The reason is not a scalability limit in the inference engines. The reason is simple: it is difficult for a business expert to
  • manage a very large rule policy
  • maintain a very large rule policy
  • verify and validate a large rule policy

As with any complexity, the divide and conquer strategy works very well. Split the large policy into smaller parts.
But if you like to push the limits and want to see how well the BizTalk inference engine scales, you might like to read the article Microsoft's Rule Engine Scalability Results - A comparison with Jess and Drools, by Charles Young.


James Taylor said...

Have to disagree with you - see my post on larger rule bases on my blog - http://edmblog.fairisaac.com/weblog/2006/01/how_many_rules_.html

Marco Ensing said...

James, thanks for the response. I feel obligated to clarify my statement further after reading your post. There are probably two misinterpretations:

First of all: I’m not stating that a company does not have thousands of business rules. They often do. However, these rules are grouped together in functional modules.

Within Microsoft BizTalk, such a group of rules is referred to as a Rule Policy. CA’s Aion refers to a group of rules as a Ruleset. ILOG and Blaze probably use the term Ruleset as well.

There can be rule policies (rulesets) for Eligibility, Product pricing, Marketing Campaigns etc.

And a category such as ‘Eligibility’ can be divided further into ‘General eligibility rules’, ‘Product-specific eligibility rules’, ‘Region-specific eligibility rules’, etc.
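As a rough illustration of grouping rules into small, named policies, here is a minimal Python sketch. It is not BizTalk's API: the `Rule` class, the policy names, the facts dictionary and the `run` helper are all hypothetical, and each policy is evaluated in a single pass rather than by an inference engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """One rule: a name, a condition over the facts, and an action on the facts."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


def set_fact(key, value):
    """Build an action that writes a single value into the facts."""
    def apply(facts):
        facts[key] = value
    return apply


# Hypothetical policies: each one is a small, separately maintainable group of rules.
POLICIES = {
    "GeneralEligibility": [
        Rule("AdultApplicant", lambda f: f["age"] >= 18, set_fact("eligible", True)),
        Rule("MinorApplicant", lambda f: f["age"] < 18, set_fact("eligible", False)),
    ],
    "RegionEligibility": [
        Rule("UnsupportedRegion",
             lambda f: f["region"] not in {"US", "EU"},
             set_fact("eligible", False)),
    ],
}


def execute_policy(policy_name, facts):
    """Evaluate one policy in a single pass and return the names of the rules that fired."""
    fired = []
    for rule in POLICIES[policy_name]:
        if rule.condition(facts):
            rule.action(facts)
            fired.append(rule.name)
    return fired


def run(facts, policy_names):
    """Execute the named policies in order against the same facts."""
    return {name: execute_policy(name, facts) for name in policy_names}
```

Calling `run({"age": 30, "region": "APAC"}, ["GeneralEligibility", "RegionEligibility"])` executes two small policies in sequence. Each policy can be reviewed and validated on its own, which is the point of splitting them.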

Interestingly enough, Domain-Driven Design (Eric Evans) can assist in identifying proper modules (a.k.a. policies, a.k.a. rulesets):

Everyone uses modules, but few treat them as a full-fledged part of the model. Code gets broken down into all sorts of categories, from aspects of the technical architecture to developers' work assignments. Even developers who refactor a lot tend to content themselves with modules conceived early in the project.
It is a truism that there should be low coupling between modules and high cohesion within them. Explanations of coupling and cohesion tend to make them sound like technical metrics, to be judged mechanically based on the distributions of associations and interactions. Yet it isn't just code being divided into modules, but concepts. There is a limit to how many things a person can think about at once (hence low coupling). Incoherent fragments of ideas are as hard to understand as an undifferentiated soup of ideas (hence high cohesion).

Choose modules that tell the story of the system and contain a cohesive set of concepts. This often yields low coupling between modules, but if it doesn't, look for a way to change the model to disentangle the concepts, or an overlooked concept that might be the basis of a module that would bring the elements together in a meaningful way. Seek low coupling in the sense of concepts that can be understood and reasoned about independently of each other. Refine the model until it partitions according to high-level domain concepts and the corresponding code is decoupled as well.
Give the modules names that become part of the ubiquitous language. Modules and their names should reflect insight into the domain.

Secondly, we might have a different interpretation of how we count rules. I count a decision table, like a decision tree, as a single atomic rule. And I only count rules that are actually executed by an inference engine.
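How much the counting convention matters can be sketched in a few lines of Python. The tree below and the helper names are hypothetical. The code only contrasts counting a decision tree as one rule with counting every root-to-leaf path as a rule of its own.

```python
# Hypothetical decision tree: an inner node maps a branch label to a subtree,
# and a leaf is simply an outcome string.
TREE = {
    "age < 18": "reject",
    "age >= 18": {
        "income < 20000": "refer",
        "income >= 20000": {
            "existing customer": "approve",
            "new customer": "approve with limit",
        },
    },
}


def count_paths(node):
    """Count the root-to-leaf paths below a node."""
    if not isinstance(node, dict):
        return 1  # a leaf ends exactly one path
    return sum(count_paths(child) for child in node.values())


def count_rules(artifacts, per_path):
    """Count rules across artifacts: one per artifact, or one per path through it."""
    return sum(count_paths(a) if per_path else 1 for a in artifacts)
```

With these definitions, a policy made of two such trees counts as 2 rules under the first convention and as 8 under the second, so two people can describe the same policy with very different numbers.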

Putting all this in the analogy of software design: yes, over time we have created software programs with millions of lines of code. But the divide and conquer strategy of component-based development has led to maintainable and understandable code. Take down the mirrors and smoke screens on business rules, and you will see it is all not that different.

The power of business rules are in the declarative nature.
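To make "declarative" a little more concrete, here is a minimal Python sketch of rules stated as data, with a naive forward-chaining loop deciding when each one fires. This is an assumption-laden toy: it is not the Rete algorithm and not the BizTalk engine, and the rule names, facts and `infer` function are hypothetical.

```python
# Each rule is declared as (name, facts required, fact asserted).
# The list order is deliberately "wrong" to show that the author of the rules
# does not have to think about execution order.
RULES = [
    ("GrantDiscount", {"gold customer", "order over 100"}, "discount"),
    ("GoldStatus", {"five orders"}, "gold customer"),
]


def infer(initial_facts, rules):
    """Naive forward chaining: keep firing applicable rules until nothing new is asserted."""
    facts = set(initial_facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, required, asserted in rules:
            # A rule fires when all its required facts hold and it would add something new.
            if required <= facts and asserted not in facts:
                facts.add(asserted)
                fired.append(name)
                changed = True
    return facts, fired
```

Starting from the facts `{"five orders", "order over 100"}`, the loop fires `GoldStatus` first and `GrantDiscount` second, even though they are listed the other way around. A production engine does this matching far more efficiently, but the rule author's view is the same: state what must hold, not when to check it.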

Ask Rolo said...


Since a decision tree has many paths, I like to include each one in the rule count.

Re "Take down the mirrors and smoke screens on business rules, and you will see it is all not that different."

Well, you're right, it's just a little different from the status quo procedural programming.

But I think you'll agree that that slight difference results in lots of benefits (time to market, agility, externalizing rules, etc.)

See What's the big deal with the business rules approach?

As you said, "The power of business rules are in the declarative nature."

Aaron Banks said...

How does the business rules engine work? I'm still confused about how it's applied and integrated into the company. I love reading this though; you did a great job of explaining the product.

Aaron Banks said...

I am looking for places where I can find information on a business rules engine and this was perfect. Thanks so much for posting this. You helped me out a bunch.