Domain specific languages: are they worth it?

Yet another interesting discussion of Domain Specific Languages popped up:

http://www.artima.com/forums/flat.jsp?forum=106&thread=155752

Domain specific languages (DSLs) are (usually) scripting languages built on top of a traditional environment. For programmers who have written code only in a single language model, moving to a DSL model can be a bit of a shock. The fundamental idea is that when your base programming language is becoming constraining in some way, you can build a DSL on top of it to remove some of the limitations. (Another option is to migrate to a more powerful language, but sometimes that is too drastic a step to take.)

A project I worked heavily on uses a code generator. Code generators are actually very similar to domain specific languages in the set of problems they solve. In this specific case, the code was being written in VBScript for use in an Active Server Pages application. The application was created long before .NET 1.0 was released. Over time, it became clear that the same fundamental ideas were being used over and over in some parts of our system.

Instead of repeating ourselves, we built a data driven code generator. We provided non-programmers with a UI to edit the system components without actually writing code. The data those non-programmers stored in the database was in effect a DSL that specified the generated code. Our DSL allowed us to define the structure of the required code, with the generated code being fleshed out on demand from the specifics supplied by our non-programming staff. The final product allowed us to reintroduce inheritance into VBScript via our DSL, reducing our actual source code size while increasing productivity.
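To make the idea concrete, here is a minimal sketch in Python of a data driven generator with inheritance. It is not the system described above: the component records, the `parent` field, and the `RenderField` call in the emitted text are all hypothetical, chosen only to show how rows of data can stand in for hand-written code.

```python
# Hypothetical component records, standing in for rows that
# non-programmers would edit through a UI and store in a database.
COMPONENTS = {
    "base_form": {"parent": None, "fields": {"id": "int", "created": "date"}},
    "customer_form": {"parent": "base_form", "fields": {"name": "string"}},
}


def resolve_fields(name, components):
    """Merge a component's own fields with those inherited from its ancestors."""
    record = components[name]
    inherited = {}
    if record["parent"] is not None:
        inherited = resolve_fields(record["parent"], components)
    # A child definition overrides an inherited field with the same name.
    return {**inherited, **record["fields"]}


def generate_source(name, components):
    """Emit source text for one component (the target syntax is made up)."""
    lines = [f"' generated from component: {name}", f"Sub Render_{name}()"]
    for field, kind in resolve_fields(name, components).items():
        lines.append(f'    RenderField "{field}", "{kind}"')
    lines.append("End Sub")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_source("customer_form", COMPONENTS))
```

The inheritance lives entirely in the data and in `resolve_fields`, so the target language never needs to support it.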

This kind of solution works well in some cases, but can also create debugging problems. Your crashes will probably only point to generated code, which you must then map back to the original source of the problem. On the other hand, over time the DSL can become very robust as edge cases are dealt with and debugging routines are implemented.
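One way to ease that mapping, sketched here in Python under my own assumptions rather than taken from the system above, is to have the generator record which source record produced each generated line. The record ids and the `<generated>` filename are hypothetical.

```python
import traceback


def generate_with_map(records):
    """Concatenate generated statements and remember where each line came from.

    `records` is a list of (record_id, [statement, ...]) pairs.
    """
    lines, line_map = [], {}
    for record_id, statements in records:
        for statement in statements:
            lines.append(statement)
            # Generated line numbers are 1-based, matching traceback output.
            line_map[len(lines)] = record_id
    return "\n".join(lines), line_map


def run_and_trace(code, line_map):
    """Run generated code; on failure, return the id of the offending record."""
    try:
        exec(compile(code, "<generated>", "exec"), {})
    except Exception as exc:
        # Walk the traceback from the innermost frame outwards and stop at
        # the first frame that belongs to the generated code.
        for frame in reversed(traceback.extract_tb(exc.__traceback__)):
            if frame.filename == "<generated>":
                return line_map.get(frame.lineno)
        raise
    return None
```

With a map like this, a crash report can name the record a non-programmer edited instead of a line in a generated file.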

We are currently porting the application from VBScript to C#. We are seeing great improvements simply by building interpreters of the original DSL. The old system generated code that was saved to ASP pages on disk, which end users then invoked. With the C# based interpreters, response times are improving because many of the ugly VBScript hacks have given way to a more elegant C# based class structure and a more direct implementation of our ideas.
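The difference between generating and interpreting can be sketched in a few lines of Python. This is an illustration under assumed names, not the C# code from the port: the component records and the `render` function are hypothetical. The point is that the interpreter reads the definition at request time and produces output directly, with no generated page in between.

```python
# Hypothetical component definitions, as they might be loaded from a database.
COMPONENTS = {
    "base_form": {"parent": None, "fields": {"id": "int"}},
    "customer_form": {"parent": "base_form", "fields": {"name": "string"}},
}


def resolve_fields(name, components):
    """Collect a component's fields, inherited ones first."""
    record = components[name]
    inherited = {}
    if record["parent"] is not None:
        inherited = resolve_fields(record["parent"], components)
    return {**inherited, **record["fields"]}


def render(name, components, values):
    """Interpret the definition directly instead of emitting code for it."""
    rows = []
    for field, kind in resolve_fields(name, components).items():
        rows.append(f"{field} ({kind}): {values.get(field, '')}")
    return rows


if __name__ == "__main__":
    for row in render("customer_form", COMPONENTS, {"id": 7, "name": "Ada"}):
        print(row)
```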

All of this brings up the question: are DSLs worth the effort and potential liabilities? One limitation is that DSLs are not usually strongly typed while languages such as C# and Java are. This means that with a DSL more errors can slip through and be found late in the process. Another limitation is that a DSL rarely has the robust debugging tools and large library that your core language can call upon. On the other hand, C# and Java can be very cumbersome to use if you need to invoke many variants of nearly identical code. You could subclass for each variant, but in many cases DSL driven invocations of a more flexible object are more efficient than writing reams of nearly identical code. As long as the work to build and maintain the DSL is lower than the work of subclassing, it is a winning choice.
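As a rough illustration of "DSL driven invocations of a more flexible object", here is a Python sketch in which one configurable class replaces what would otherwise be a subclass per variant. The `Report` class and its definitions are hypothetical examples, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One flexible object; each variant is a row of data, not a subclass."""
    title: str
    columns: list
    sort_by: str

    def render(self, rows):
        ordered = sorted(rows, key=lambda row: row[self.sort_by])
        header = " | ".join(self.columns)
        body = [" | ".join(str(row[c]) for c in self.columns) for row in ordered]
        return [self.title, header, *body]


# Hypothetical definitions; in a data driven system these would be stored
# and edited outside the code.
REPORT_DEFINITIONS = [
    {"title": "Customers by name", "columns": ["name", "city"], "sort_by": "name"},
    {"title": "Customers by city", "columns": ["city", "name"], "sort_by": "city"},
]

REPORTS = [Report(**definition) for definition in REPORT_DEFINITIONS]
```

Adding a third report means adding a third definition. The trade-off is the one noted above: a typo in `sort_by` surfaces only at run time, where a compiler would have caught a misspelled method in a subclass.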

After many years of using these types of solutions, I am finding that the best way to use DSLs is as a guide for what your core system really needs to be able to do. The dividing line between DSL and core code will evolve as the system grows. Early on, you will probably have all functionality in the core system. As your requirements demand more flexibility, you will find yourself repeating yourself. Where possible, use the core system’s language to eliminate that repetition: you will retain the advantages of strong typing and a robust debugging environment. In many cases you can avoid creating a DSL, and the work to maintain it, by clever analysis of the core problem.

However, if you find that you are repeating yourself in ways that your core language cannot easily remove, consider moving to a DSL or data driven architecture. Over time, this flexibility will lead to the discovery of features that probably should be submerged back into the core system. You will find these wherever you find yourself repeating yourself in the DSL (which defeats the purpose of the DSL) or creating large blocks of code in the DSL. When these things happen, it indicates that your core system needs to gain more power, potentially as an enhancement to the DSL you created. Remember, the goal is to write as little code as possible on the DSL side: verbose DSL code is an anti-pattern to be avoided. Too much dependence on a DSL can also create a version of the second-system effect, where too much work is being put into the DSL and not enough into solving the original problem.
