Complexity and Software Design

(Why Large IT Projects Fail)

From the literature and the cases above it is apparent that large IT system deployments fail more often than not, and are almost never completely delivered. Is there a fundamental reason for this?

Large IT projects are certainly complex in the conventional sense. Bar-Yam (2003; see Reference) argues that they are also “complex” in the mathematical sense: there may be several independent large systems interacting in unpredictable ways. This property makes them difficult to analyse and therefore difficult, if not impossible, to predict. It is therefore practically impossible to specify the outcome of a complex system design at the outset, and the traditional “waterfall” specification process (see below) is likely, or even bound, to fail.

Conventional large-system engineering has been modelled on historic projects such as the Manhattan Project or the Apollo project to land a man on the moon. While these appear “complex”, the outcome sought was relatively discrete and simple in an engineering sense. Health IT systems, by contrast, have a complex output: many records which are used in various ways for individual, management and population purposes. The Health IT environment is inherently complex, with large amounts of data of different types.

From Bar-Yam: “The complexity of a task can be quantified as the number of possible wrong ways to perform it for every right way. The more likely a wrong choice, the more complex the task. In order for a system to perform a task it must be able to perform the right action. As a rule, this also means that the number of possible actions that the system can perform (and select between) must be at least this number. This is the Law of requisite variety that relates the complexity of a task to the complexity of a system that can perform the task effectively.”

Clearly in the case of Health records there are many “wrong ways” and probably many “right ways” to record a health encounter!
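As a rough illustration of the quoted definition, the sketch below counts the wrong ways per right way for a toy record-entry task. The field names and option counts are invented for illustration and do not come from Bar-Yam; the assumption that exactly one combination is "right" is also a simplification.

```python
import math

# Hypothetical form fields and the number of values each could take.
# These figures are illustrative assumptions, not real data.
FIELD_OPTIONS = {"diagnosis_code": 1000, "medication": 500, "dose_unit": 10}


def total_ways(field_options):
    """Number of distinct ways the form could be filled in."""
    return math.prod(field_options.values())


def wrong_ways_per_right_way(total, right=1):
    """Task complexity in the quoted sense: wrong ways for every right way."""
    if not 1 <= right <= total:
        raise ValueError("right must be between 1 and total")
    return (total - right) / right


def has_requisite_variety(system_actions, task_complexity):
    """A system needs at least as many selectable actions as the task's complexity."""
    return system_actions >= task_complexity
```

With these invented counts there are 5,000,000 possible records, so 4,999,999 wrong ways for each right one. A system able to select between only a few hundred actions would not have the requisite variety for the task.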

The governance of large projects in government is itself complex – there are many stakeholders with various agendas interacting in various ways, not necessarily under the control of the design team.

Bar-Yam argues that a fundamentally different approach should be taken to the design of complex systems. Rather than attempting to specify a single solution at the outset, the design process should focus on the process of evolution towards a solution.

Even agile software design processes (see below) are not “evolutionary” in this sense. They do not have many possible solutions competing, with only the “fittest” surviving; instead a single design is refined through small iterative changes. The Open Source software development process is perhaps closer: there may be several projects developing similar solutions to a particular problem.

There are design processes which are a combination of these techniques. Toyota are regarded as world leaders in design and their vehicles are market leaders. They use a traditional systems engineering approach, but have several teams working in parallel on the same problem. Each team is allowed to progress their work to an advanced stage before a single solution is chosen. This redundancy would appear to be wasteful but in fact produces superior designs.
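The idea of several teams working in parallel, with one solution chosen late, can be sketched in code. In this minimal sketch three hypothetical "teams" each supply a function for the same small problem (tidying a patient name), and the candidate passing the most shared acceptance cases is selected. The team names, functions and test cases are all assumptions made for illustration; this is not a description of Toyota's actual process.

```python
# Three competing candidate solutions to the same problem.
def team_a(name):
    return name.strip().title()


def team_b(name):
    # Collapses repeated internal whitespace as well as trimming the ends.
    return " ".join(name.split()).title()


def team_c(name):
    return name.upper()


# Shared acceptance cases: (raw input, expected output).
ACCEPTANCE_CASES = [
    ("  jane   doe ", "Jane Doe"),
    ("JOHN SMITH", "John Smith"),
    ("ann lee", "Ann Lee"),
]


def score(candidate, cases):
    """Count how many acceptance cases a candidate satisfies."""
    return sum(1 for raw, expected in cases if candidate(raw) == expected)


def select_fittest(candidates, cases):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: score(fn, cases) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


CANDIDATES = {"team_a": team_a, "team_b": team_b, "team_c": team_c}
```

The work of the losing candidates looks wasted, but the selection is made against evidence rather than against a specification written before any solution existed.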

Where change is being made to a large complex system, Bar-Yam advocates small changes at various places in the system. These gradually replace the current components over time, with the old system being maintained for a period for redundancy and safety.
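One way to picture this incremental replacement is a router that sends each function to its new component once one exists, and falls back to the old system if the new component fails. The sketch below is a minimal illustration under that assumption; the class name, function names and handlers are hypothetical and are not taken from Bar-Yam's paper.

```python
class IncrementalRouter:
    """Routes each function to its replacement, keeping the legacy handler as a fallback."""

    def __init__(self, legacy_handlers):
        self._legacy = dict(legacy_handlers)
        self._replacements = {}
        self.fallbacks = 0  # how often the old system had to step in

    def replace(self, function_name, new_handler):
        """Swap in a new component for one function; the old one stays available."""
        if function_name not in self._legacy:
            raise KeyError(function_name)
        self._replacements[function_name] = new_handler

    def handle(self, function_name, request):
        new_handler = self._replacements.get(function_name)
        if new_handler is not None:
            try:
                return new_handler(request)
            except Exception:
                # The new component failed: record it and use the old system.
                self.fallbacks += 1
        return self._legacy[function_name](request)


def failing_billing(request):
    """Stand-in for a new component that is not yet reliable."""
    raise RuntimeError("new billing component unavailable")
```

A possible usage: start with legacy handlers for booking and billing, replace only booking, and later trial a new billing component. If it fails, the request is still served by the old billing handler and the failure is counted.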

The internet can be regarded as a large system that was developed using this “complex systems engineering” approach, being created and revised incrementally over decades. The internet has brought change to the world equivalent to the industrial revolution. It is a Complex Systems Engineering success story.

Software design approaches – “Waterfall” vs “Agile”

The traditional approach to commissioning or designing a software system is termed the “Waterfall” process.

Software designers or vendors attempt to capture a detailed set of specifications from users which completely describe the functions of the new system. The customer assesses the new system against the specifications. When the two parties agree that the specifications have been met to an acceptable degree, the system is delivered. Further evolution of the system is a separate issue – often not considered at the initial design phase. So the design effort is essentially a one-off “waterfall” process using a traditional systems engineering approach. In practice, there are a number of problems with this approach in large Health IT systems.

Firstly, the vast majority of new Health IT systems are not designed from scratch; they are usually created by modifying a system already held by the software vendor. Vendors claim that their system can be “customized” to fit the customer’s requirements. In practice, the customer usually accepts significant compromises, with some specifications left unaddressed. Contracts generally limit the writing of new software to address specifications, because this is expensive and uncertain. Indeed, if the product being considered is not well designed and maintained, it may be impossible to modify it to a significant extent (see “Legacy Software” in a further article).

Secondly, the process of capturing accurate specifications is difficult, and misunderstandings may remain between customer and software engineer. Formal processes and even dedicated specification languages have been developed to make the process more reliable.

Thirdly, the customer’s needs will always change and evolve. There is often “Specification Creep”, particularly if there are many “stakeholders” and if the process is not tightly managed. In large Government projects there may be hundreds of stakeholders with various agendas. Invariably the delivery of specifications is incomplete. If the process of updating and improving the software is not explicitly planned and funded, changes after the startup date will be difficult. Bureaucrats generally want to deliver a specific outcome on a certain date, since funding is always contested and limited. They do not generally expect to fund an ongoing process of evolution and improvement, and a system which is developed incrementally over time is less politically attractive than a showpiece startup announcement.

Finally, the project seeks to replace a current system with a “big bang” approach. The old system is not necessarily retained for safety and redundancy.

Thus, if we accept that these systems are “complex” as discussed above, then this approach is almost certain to fail.

An alternative design process has emerged in recent years, termed “agile” or “extreme” programming. Here the designer starts with a partial but quickly produced solution to the customer’s requirements. The customer assesses the solution and proposes changes and extensions. The designer produces these, and the customer assesses the system again and proposes improvements. Through an iterative series of such cycles, the system evolves towards a solution. This approach would overcome several of the problems above; in particular, some of the issues of complexity and difficulty in specification would be better addressed. However, this approach is not generally adopted in large Government-funded systems.

Next – Open Source and Software Design

Reference

Bar-Yam, Yaneer. “When Systems Engineering Fails - Toward Complex Systems Engineering.” SMC’03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme – System Security and Assurance (Cat. No.03CH37483), 2003.