Open Source Software – A Paradox

The nuts and bolts of Software – Intellectual Property

The code running in the myriad computers of the world is called object code.

This is a series of low-level commands, read sequentially from memory, that tell the computer's processor what to do. These commands are in machine code, which is all but unintelligible to humans (except perhaps for a very few nerds who like this stuff!).

Depending on the computer language used, this object code is generated from source code by another program called a compiler (or the source code is executed on the fly by an interpreter). Source code is human readable and describes the function of the program. Intellectual property resides in the source code. Large enterprise-level programs may have millions of lines of such code. These are usually proprietary – i.e. the source code is copyright and secret. The user of the program buys a licence which allows them to use the code for a limited time, but most other rights are strictly limited. In particular, they must use the original vendor for support, including bug fixes and upgrades, as the source is not available to anyone else. This lack of alternatives allows the vendor to charge more for support than would otherwise be the case – this situation is termed “vendor lock-in”.
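
As a minimal sketch of this distinction, here is a small Python example (the function is a made-up illustration). Python's interpreter compiles source into low-level bytecode instructions, which gives a flavour of the gap between human-readable source code and the commands a machine actually executes:

```python
import dis

# Human-readable source code: describes *what* the program does.
def add_tax(price, rate):
    """Return the price with tax added."""
    return price * (1 + rate)

# The low-level instructions actually executed are far less intelligible.
# dis.dis() prints the compiled bytecode for the function above; the exact
# opcode names vary between Python versions.
dis.dis(add_tax)
# Typical (abridged) output:
#   LOAD_FAST   price
#   LOAD_CONST  1
#   LOAD_FAST   rate
#   ... RETURN_VALUE
```

A compiler for a language such as C performs the same kind of translation but emits native machine code for the processor itself.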

Data and Intellectual Property

Useful and substantial programs operate on data such as names, addresses, text and measurements. These data are stored in a repository generally called a database. In the computer world, data can be stored in different ways – for example, a number can be binary, integer or decimal, signed or unsigned – and all of these quantities are handled and stored differently. Data can also be modelled in different ways, with constraints and associations with other data. Databases have structures called schemas which allow them to store and retrieve data reliably. These models and schemas are also generally proprietary. A vendor may charge a fee to convert data to a format suitable for another database, or may even regard the customer’s data as proprietary and refuse to do so altogether. The customer is truly “locked in” in this situation, and the barriers to changing to another program are substantial.
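
As a minimal sketch of what a schema is, the following uses Python's built-in sqlite3 module (the table and column names are hypothetical). The schema fixes the type and structure of each item of data; moving to another system means converting the data out of this structure into whatever the new system expects:

```python
import sqlite3

# An in-memory database with a simple, hypothetical patient schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        id     INTEGER PRIMARY KEY,  -- stored as a signed integer
        name   TEXT NOT NULL,        -- stored as text
        weight REAL                  -- stored as floating point
    )
""")
conn.execute("INSERT INTO patient (name, weight) VALUES (?, ?)",
             ("Jane Citizen", 72.5))
conn.commit()

# Migrating to another program means pulling the data out of this schema
# and converting it to the structure the new system expects.
for row in conn.execute("SELECT id, name, weight FROM patient"):
    print(row)  # (1, 'Jane Citizen', 72.5)
```

A proprietary database rests on the same idea, but the schema (and often the on-disk format) is secret, which is what makes conversion depend on the vendor.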

The Open Source Paradox

Yet another software development paradigm is termed “Open Source”. The design process is not necessarily different from those discussed above – rather, the difference is in how the development is funded and how the intellectual property created is treated. The software is “Open” – the source code is public and free for anyone to use as they see fit. However, under “copyleft” licences such as the GPL there is one important caveat – software developed from the codebase must also remain “Open”. Much of this software has been developed by volunteers for free, though commercial programmers may also use this model, and there is no reason why a commercial entity cannot charge for installing or supporting an Open Source program. But the source code must remain publicly available.

Commercial developers argue that an Open Source model cannot deliver the resources required for large software developments or for ongoing support. They argue that Open Source cannot deliver the quality that commercial offerings can. But is this really true?

If you are browsing the internet you are likely to be using Open Source software. The majority of web servers are based on an Open Source stack – typically LAMP (the Linux operating system, Apache web server, MySQL database and PHP scripting language). Certainly the internet would not function without open standards such as HTTP. The Linux desktop now rivals commercial alternatives such as Windows or macOS in functionality and stability.

But how can “free” software be equal to if not better than proprietary software? You get what you pay for, right?

This apparent paradox can be explained by several factors.

The passion of volunteers – “nerds” will always want to show how clever they are. They may also want the functionality of a particular program that is not otherwise available.

Corporate memory – the efforts of the “nerds” and others are not lost. The code is available to others to extend and improve. Open Source version control systems such as Subversion and Git allow tracking of changes and cooperation between developers, and hosting platforms built on them have flourished – GitHub now has more than 50 million users worldwide.

Programmers who have no connection with each other apart from an interest in a particular project can cooperate via these systems and their work is automatically saved and curated. Over time this is powerful. In the proprietary world programmers may work for years on a project, producing high quality software, but if their work does not gain market acceptance it is locked up in a proprietary licence and forgotten.

Development techniques are more suited to “complex” systems engineering. Open Source software is developed incrementally and with many competing solutions. As discussed previously, this is likely to produce a better outcome in a complex environment.

Complexity and Software Design 

(Why Large IT Projects Fail) 

From the literature and the cases above it is apparent that large IT system deployments fail more often than not, and are almost never completely delivered. Is there a fundamental reason for this?

Large IT projects are certainly complex in the conventional sense. Bar-Yam (see Reference below) argues that they are also “complex” in the mathematical sense. There may be several independent large systems interacting in unpredictable ways. This complexity makes them difficult to analyse and therefore difficult, if not impossible, to predict. It is therefore practically impossible to specify the outcome of a complex system design at the outset. Thus the traditional “waterfall” specification process (see below) is likely, or even bound, to fail.

Conventional large-systems engineering has been based on historic designs such as the Manhattan Project or the Apollo project to land a man on the moon. While these appear “complex”, the outcome sought was relatively discrete and simple in an engineering sense. Health IT systems, by contrast, produce complex output: many records used in various ways for individual, management and population purposes. The Health IT environment is inherently complex, with large amounts of data of different types.

From Bar-Yam: “The complexity of a task can be quantified as the number of possible wrong ways to perform it for every right way. The more likely a wrong choice, the more complex the task. In order for a system to perform a task it must be able to perform the right action. As a rule, this also means that the number of possible actions that the system can perform (and select between) must be at least this number. This is the Law of Requisite Variety that relates the complexity of a task to the complexity of a system that can perform the task effectively.”
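
Expressed loosely in the language of Ashby’s Law of Requisite Variety (the formalisation below is an illustrative paraphrase, not Bar-Yam’s own notation):

$$ V_{\text{system}} \;\ge\; V_{\text{task}}, \qquad V = \log_2 N $$

where $N$ is the number of distinct actions or states that must be distinguished. A task with many possible wrong actions for every right one has a large $V_{\text{task}}$, and only a system whose repertoire of actions has at least that much variety can reliably perform it.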

Clearly in the case of Health records there are many “wrong ways” and probably many “right ways” to record a health encounter!

The governance of large projects in government is itself complex – there are many stakeholders with various agendas interacting in various ways, not necessarily under the control of the design team.

Bar-Yam argues that a fundamentally different approach should be taken to the design of complex systems. Rather than attempting to specify a single solution at the outset, the design process should focus on the process of evolution towards a solution.

Even agile software design processes (see below) are not “evolutionary” in this sense. They do not have many possible solutions competing, with only the “fittest” surviving – they are a single design evolving towards a solution by small iterative changes. The Open Source software development process is perhaps closer – here there may be several projects developing a similar solution to a particular problem.

There are design processes which are a combination of these techniques. Toyota are regarded as world leaders in design and their vehicles are market leaders. They use a traditional systems engineering approach, but have several teams working in parallel on the same problem. Each team is allowed to progress their work to an advanced stage before a single solution is chosen. This redundancy would appear to be wasteful but in fact produces superior designs.

Where change is being made to a large complex system, Bar-Yam advocates small changes at various places in the system. These gradually replace the current components over time, with the old system being maintained for a period for redundancy and safety.

The internet can be regarded as a large system that was developed using this “complex systems engineering” approach, being created and revised incrementally over decades. The internet has brought change to the world equivalent to the industrial revolution. It is a Complex Systems Engineering success story.

Software design approaches – “Waterfall” vs “Agile” 

The traditional approach to commissioning or designing a software system is termed the “Waterfall” process.

Software designers or vendors attempt to capture a detailed set of specifications from users which completely describe the functions of the new system. The customer assesses the new system against the specifications. When the two parties agree that the specifications have been met to an acceptable degree, the system is delivered. Further evolution of the system is a separate issue – often not considered at the initial design phase. So the design effort is essentially a one-off “waterfall” process using a traditional systems engineering approach. In practice, there are a number of problems with this approach in large Health IT systems.

Firstly, the vast majority of new Health IT systems are not designed from scratch – they are usually created by modifying a system already held by the software vendor. The vendor claims that this system can be “customized” to fit the customer’s requirements. In practice, the customer usually accepts significant compromises as to which specifications are not addressed. Contracts generally limit the writing of new software to address specifications, because this is expensive and uncertain. Indeed, if the product being considered is not well designed and maintained, it may be impossible to modify it to a significant extent (see “Legacy Software” in a further article).

Secondly, the process of capturing accurate specifications is difficult – misunderstandings may remain between customer and software engineer. To make the process more reliable, formal processes and even specification languages have been developed for this purpose.

Thirdly, the customer’s needs will always change and evolve. There is often “specification creep”, particularly if there are many “stakeholders” and the process is not tightly managed. In large Government projects there may be hundreds of stakeholders with various agendas. Invariably the delivery of specifications is incomplete. If the process of updating and improving the software is not explicitly planned and funded, changes after the start-up date will be difficult. Bureaucrats generally want to deliver a specific outcome on a certain date – funding is always contested and limited. A system which is developed incrementally over time is less politically attractive than a showpiece start-up announcement, and Governments do not generally expect to fund an ongoing process of evolution and improvement.

Finally, the project seeks to replace a current system with a “big bang” approach. The old system is not necessarily retained for safety and redundancy.

Thus, if we accept that these systems are “complex” as discussed above, then this approach is almost certain to fail.

An alternative design process has emerged in recent years, termed “agile” or “extreme” programming. Here the designer starts with a partial but quickly produced solution to the customer’s requirements. The customer assesses the solution and proposes changes and extensions. The designer implements these, and the customer assesses the system again and proposes improvements. Through an iterative series of such cycles, the system evolves towards a solution. This approach would overcome several of the problems above – in particular, some of the issues of complexity and difficulty in specification would be better addressed. However, this approach is not generally adopted in large Government-funded systems.

Next – Open Source and Software Design 

Reference

Bar-Yam, Y. (2003). “When Systems Engineering Fails — Toward Complex Systems Engineering.” In SMC’03 Conference Proceedings: 2003 IEEE International Conference on Systems, Man and Cybernetics (Conference Theme – System Security and Assurance, Cat. No.03CH37483).

Why is Health IT so hard?

The History of large Health IT Projects

Only a minority of large Health Information Technology (IT) projects are delivered on time, on budget and with functionality approximating that specified by the customer. There is a significant rate of what can only be regarded as complete failures. Given the money spent (usually in the hundreds of millions of dollars), this is a surprising result.

To give a few examples: (1) 

UK NHS

The number one project disaster of all time is probably the massive £12 billion plan to create the world’s largest civilian database linking all parts of the UK’s National Health Service. Initially launched in 2002, the project was in disarray from the beginning, missing its first deadlines in 2007 and eventually being scrapped by the UK government in September 2011.

The nine-year debacle, under the National Programme for IT, ran far over budget and years behind schedule due to technical issues, problems with vendors and constantly changing system specifications.

In early 2012, one of the primary suppliers, CSC, made a $1.49 billion write-off against the botched project. One report claimed the failed project had cost UK taxpayers £10bn, with the final bill expected to be “several hundreds of millions of pounds higher”.

South Australian EPAS system (2) 

This system was initially set up in two country hospitals and the SA Repatriation Hospital in 2012.

This was a pilot for a statewide electronic medical records system.

It was to be deployed in the new Royal Adelaide Hospital, but when the hospital opened the system was unusable and paper records were used instead. At that time the total published cost was $422M. This can only be regarded as a comprehensive failure, though the SA government sought to “reset” the project using the same software.

HealthSMART modernisation program

In mid-2008, the Victorian government unveiled its HealthSMART program to modernise and replace IT systems across the Victorian public health sector.

Implementation costs for the HealthSMART clinical ICT system rollout blew out to $145.3 million, or 150 per cent more than the original budget of $58.3 million, according to an Auditor-General’s report.

The Auditor-General’s audit report also suggested that the absence of appropriate controls and effective mitigations at certain sites could pose serious safety risks to patients.

My Health Record (3)

The Federal Government has been engaged in the development of a national e-health record, in various forms, for more than 20 years.

In spite of the investment of some $1.97B as at January 2020, it has not achieved a useful universal health record for all Australians.

Some 23M records had been created, with half of these holding no data. Of those with data, many are incomplete or not useful.

The uptake by the public has been limited by concerns about privacy. It appeared that many government agencies were expecting to gain access to the data – this has now been limited. GPs are used to managing privacy, and they are reluctant to allow their patients’ data to be uploaded to a system which appeared not to have the same privacy protections as their own systems.

GPs are also expected to manage and curate the data – however, there is little provision by Government to acknowledge the cost and legal risk, or to reward them for doing so.

Early in the process there was the rollout of the Public Key Infrastructure (PKI) system. This was intended to allow health providers such as doctors to be identified electronically so that billing and other functions could be carried out online. Unfortunately it did not achieve widespread acceptance because it was cumbersome to use and an onerous contract placed most of the legal risk on providers. It appeared that the vendors of the system were able to essentially absolve themselves of risk. The provider password was created by the vendor, and as the system had the legal force of a signature, some providers were concerned that having a password held elsewhere by an unknown entity was an unacceptable risk to them.

At one point there was an attempt to agree on interface standards between systems. This was never successful because of the commercial disincentives to standardization and the fact that vendors of software systems were not adequately remunerated. Much of the money appeared to be spent on conferences, strategic plans and administration.

Summary of reasons 

In my view there are three main reasons why the majority of these large projects fail either partially or completely.

(1) Complexity

These projects are genuinely difficult. They are large, enterprise level systems with many users, inputs and outputs. This means they are complex in both the mathematical and the lay sense.

Complex systems are difficult or impossible to analyse mathematically. The outcome of a design process is therefore difficult, if not impossible, to predict. Ideally a designer will adopt an “evolutionary” approach, replacing parts of the system progressively while maintaining current systems for redundancy. But these large Health IT systems are designed with a “waterfall” approach – i.e. specifications are gathered and the system is delivered as a “one-off” on a specific “go-live” date.

(2) Commercial reasons

These projects are commissioned by Governments. They are exclusively delivered by large corporations, usually for eye-watering sums of money. The source code is invariably proprietary and secret. Interoperability between systems appears to be discouraged by vendors. In any case it seems impossible to achieve – there are strong commercial reasons for this.

The tendering and contracting process is invariably “commercial in confidence” – this does not allow external scrutiny. The important decisions which will determine the direction of the project are made at the start of the process and in secret. In practice these systems are rarely designed from scratch – they are usually created by adapting an existing system held by the vendor. Inevitably this involves some compromise on the part of the customer as to which specifications are met.

(3) Government/Political factors 

These projects are politically risky. 

But the large corporation delivering the software can be used as a “Risk Partner” by the bureaucrats and politicians commissioning them. 

There is an imperative to deliver a discrete project at a specific time – an evolutionary design process is not politically attractive. 

These large projects require specific skills to manage. Government decision-making and management systems perform poorly at the best of times – they are generally not up to the task.

I will explore these issues further in a series of posts.  

References

(1) Spectacular IT Failures

https://www.bbc.com/news/uk-politics-24130684

(2) SA EPAS system

https://www.abc.net.au/news/2016-11-08/what-is-south-austrailas-epas-patient-record-system/8005334

(3) MyHR

https://www.news.com.au/technology/online/security/australian-governments-controversial-my-health-record-system-slammed-as-failure/news-story/9f45df2927fe0b85c28a461343dd4704

https://www.theguardian.com/australia-news/2020/jan/23/my-health-record-almost-2bn-spent-but-half-the-23m-records-created-are-empty