Algorithmic Performance and Functional Programming
I have been doing more and more functional style programming, thanks mostly to the support for the style that C# 3.5 provides. There are some performance hits to be taken by using such a style, but for the most part they have been acceptable “percentage increases” over the non-functional code. However, there is one thing that I have been noticing as I adopt a more and more functional style and that is that I tend to write more and more “generic” algorithms that take a large class of types and work with them in a more abstract way.
I think this is probably a good thing, in the Don’t Repeat Yourself style. However, I am becoming concerned with something that was less of a problem with more traditional object oriented programming: algorithmic performance.
When the objects you are using hide their data, one of the advantages was that the internal implementation could ensure the data was in a particular arrangement (say, sorted or some specific tree such as red-black trees) and optimize the algorithms based on that arrangement. Especially for objects that live for a long time it could pay off handsomely to accept the up front cost of arranging the data loaded according to your internal preferences.
With the functional style I find that the data tends to live outside the functions that manipulate it. This means that the algorithm must be more robust against various types of inputs. The classic example is sort algorithms: random data, presorted data and reverse sorted data may result in radically different performance. With a class I could pre-digest the data to my preferred format, while with a generic function I find myself having to deal with “what I get” and then handing it off. There are algorithms that do deal with this kind of situation, Introspective Sort for example attempts to avoid worse case data by switching algorithms.
I have to wonder if it is just the phase in functional programming I have reached (in other words, I’m still not good enough at it), or if it is something more deep rooted that makes traditional objects more “in control” and thus more easily able to manage algorithmic performance by “stacking the deck” by tightly algorithms (methods) and data structures.