a fragmented CD symbolizing software segregation

Software Design for Segregation

“Divide et Impera”

Breaking a (complex) problem into smaller challenges that can be handled much more easily has been a well-established approach for centuries. In software design, the corresponding concept is called segregation. Divide and rule (Latin: divide et impera) was used by Julius Caesar and later rulers such as Napoleon to gain and maintain power by breaking up large groups into smaller factions. Each faction has less power than the large group and – importantly – less power than the entity implementing the division, i.e. the emperor.

Another very important prerequisite: the emperor segregates these factions so that the smaller groups cannot collaborate and reunite (i.e. adversely influence each other)!

To efficiently handle the complexity of a large software system, decompose the system into smaller parts. You will find the corresponding concept in the international standard IEC 62304: system > item > unit.

To ensure that the individual parts do not (adversely) influence each other, ensure proper segregation. Segregation plays a particularly important role in risk management!

This article will show you how to do this at the level of software design.

Foundations

Segregation can be understood as the characteristic of a software system that ensures software items are protected from unintended influence by other software items. Consequently, segregation or separation is the act of isolating software items in order to support

  • the safe, secure and reliable functioning (as intended)
  • the effectiveness and reliability of risk control measures.

Reducing Complexity

Software architecture aims at reducing the complexity of software by statically decomposing a given software system into (less complex) software items. Decomposition continues until a level of refinement is reached at which a software item is not decomposed any further (a “software unit”) – a decision made by you, the manufacturer!

Note that segregation is always expected whenever a software system is decomposed into software items. Otherwise, the items are not independent and the system is not truly decomposed!

In other words: the reduction of complexity (by breaking down a software system into software items) can only be achieved effectively if each software item accomplishes its intended function without unintended / unwanted influence from other software items, i.e. by establishing segregation between different software items.

Segregation is achieved through technical, constructive measures in the design and implementation of software.

Improving Risk Control

As shown above, in the most general sense, the goal of segregation is to make even complex systems manageable and to avoid flaws or any adverse impact on performance.

But – even more importantly – segregation between software items supports effective risk control with respect to safety or security, while keeping mutual interference, which may result from intended coupling or from shared resources, at a controlled level.

Segregation Goals

How can you achieve effective segregation? 

If we rephrase the definition from the international standard IEC 62304, then segregation is effective if the individual software items as well as the integrated system maintain their functionality and quality attributes related to risk control even under adverse conditions. But: what is an adverse condition in the context of effective segregation?

  • Faults: We cannot prevent faults from happening. For segregation to be effective in the presence of faults, we must handle the resulting errors and prevent them from turning into failures (see the sketch after this list).
  • Limited Resources, Resource Competition: The set of available resources is always limited. Segregation is effective with respect to limited resources if we can guarantee that all resources needed to perform the tasks related to risk control can always be provided when needed, regardless of the load on the system, the load in the system environment, and the cooperative or non-cooperative behavior of any software item competing for the limited resources.
  • Performance Dependencies, Limited Performance, Real-Time Behavior: Risk control may require a guaranteed performance and/or a guaranteed reaction time for critical tasks. Segregation is effective with respect to performance if we can guarantee that all performance criteria related to risk control are always met, regardless of the load on the system, the load in the system environment, and the cooperative or non-cooperative behavior of any software item competing for processing time and contributing to a risk-related process.
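
To illustrate handling faults close to their origin, the following minimal C++ sketch validates a hypothetical sensor reading, retries once, and falls back to a conservative default, so that a transient fault is processed as an error instead of propagating into a failure. The sensor interface, the plausibility range and the default value are assumptions for illustration only.

    #include <cstdlib>
    #include <iostream>
    #include <optional>

    // Hypothetical sensor read; a real system would talk to hardware here.
    // The stub occasionally simulates a transient fault by returning no value.
    std::optional<double> read_temperature_raw() {
        if (std::rand() % 10 == 0) return std::nullopt;  // simulated fault
        return 36.5;
    }

    // Process the error close to its origin: validate, retry once, fall back
    // to a conservative value, so the fault cannot turn into a failure.
    double read_temperature_checked() {
        for (int attempt = 0; attempt < 2; ++attempt) {
            auto value = read_temperature_raw();
            if (value && *value >= -40.0 && *value <= 125.0) return *value;
        }
        std::cerr << "temperature fault handled, using safe default\n";
        return 125.0;  // conservative default, chosen for illustration only
    }

    int main() {
        std::cout << read_temperature_checked() << '\n';
    }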

Even if the design and implementation of a critical software unit are correct in themselves, the unit still requires segregation as soon as it is part of a software system executed on a shared computing platform. For software, the regulatory requirements of safety and performance therefore translate – beyond functional correctness – into:

  • non-blocking execution of each critical unit (NB),
  • performant computation of each critical unit (PC), and
  • integrity of data into, within, and from each critical unit (ID).

For these goals, we will examine the relevant supporting constructive measures in the following section.

Segregation Strategies

From prior work and practical experience with dependable software designs, five strategies supporting the above-mentioned goals are of particular importance.

Note that we herein use the general term “CPU” for the platform’s processing unit – be it virtual, discrete, integrated, or just a single core within an integrated processing unit.

a) Cooperative Sharing of Resources

  • Critical items acquire resources early or statically, i.e. the platform pre-assigns resources at start-up or at least before a critical operation is started.
  • Other items (which are not relevant for critical operations) only acquire shared resources “late” i.e. right when they need them.
  • Schedulers lock optimistically, i.e. they lock late, but check and resolve locking conflicts and release resources early.
  • Platforms (operating system, scheduler, hypervisor) detect maximum load levels and limit system use.
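
The first two tactics can be sketched in a few lines of C++: the critical item reserves its working memory statically at start-up, while a non-critical item acquires scratch memory late and releases it immediately. The class names and buffer sizes are assumptions for illustration, not a prescribed design.

    #include <cstddef>
    #include <memory>
    #include <vector>

    // Critical item: working memory is pre-assigned at start-up (construction),
    // so the critical operation never has to compete for it later.
    class CriticalItem {
    public:
        CriticalItem() : buffer_(4096) {}   // acquire early / statically
        void run() { buffer_[0] = std::byte{0}; /* ... uses only buffer_ ... */ }
    private:
        std::vector<std::byte> buffer_;
    };

    // Non-critical item: scratch memory is acquired "late", right when needed,
    // and released immediately afterwards.
    class NonCriticalItem {
    public:
        void run() {
            auto scratch = std::make_unique<std::vector<std::byte>>(1024);
            (*scratch)[0] = std::byte{1};   // temporary use only
        }                                   // released here
    };

    int main() {
        CriticalItem critical;       // resources reserved before any operation starts
        NonCriticalItem reporting;
        critical.run();
        reporting.run();
    }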

b) Cooperative Tactics For Performance

  • The CPU scheduler uses preemptive scheduling, i.e. it can withdraw the CPU from a long-running task,
  • Items use multithreading, which allows them to act responsively and avoid being blocked by earlier requests
  • Non-critical unit computations are limited via time-outs, such that they do not use all the resources that would be needed for critical items,
  • Designs and implementations of items use algorithms with linear performance scaling, i.e. O(n) algorithms, so that their resource appetite does not “explode” with larger input.
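
The time-out tactic can be sketched with std::async: the non-critical computation gets a fixed time budget, and if that budget is exceeded, the critical path simply continues without the result. The workload and the 100 ms budget are placeholders.

    #include <chrono>
    #include <future>
    #include <iostream>

    // Placeholder for a non-critical computation (e.g. statistics, trending).
    long non_critical_statistics() {
        long sum = 0;
        for (long i = 0; i < 50000000; ++i) sum += i % 7;
        return sum;
    }

    int main() {
        auto task = std::async(std::launch::async, non_critical_statistics);

        // Wait at most 100 ms; if the result is not ready, go on without it.
        if (task.wait_for(std::chrono::milliseconds(100)) == std::future_status::ready) {
            std::cout << "statistics: " << task.get() << '\n';
        } else {
            std::cout << "statistics skipped, time budget exceeded\n";
            // Note: the future's destructor still waits for the thread; a real
            // design would additionally signal the task to stop cooperatively.
        }
    }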

c) Component Isolation

  • Items avoid unintended interaction through data shared between components, i.e. shared data is not modified, or any modification is done in a coordinated, synchronized way
  • Items detect and correct each error “near its origin”, instead of letting it propagate through the whole application
  • Items prevent faults from blocking computation, data or resources, such that other, critical items will not be blocked or disturbed by those faults
  • The platform/framework wraps non-critical software in order to catch all non-local procedure exits (e.g. exceptions).
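
A minimal C++ sketch of such a wrapper: every call into a non-critical item goes through a function that catches all exceptions near their origin, so a fault in the non-critical part cannot propagate into the critical processing. The example item (a "logging backend") is hypothetical.

    #include <exception>
    #include <functional>
    #include <iostream>
    #include <stdexcept>

    // Framework-side wrapper: errors in non-critical steps are handled here
    // and reported, instead of propagating through the whole application.
    bool run_isolated(const std::function<void()>& non_critical_step) {
        try {
            non_critical_step();
            return true;
        } catch (const std::exception& e) {
            std::cerr << "non-critical step failed: " << e.what() << '\n';
            return false;
        } catch (...) {
            std::cerr << "non-critical step failed with an unknown error\n";
            return false;
        }
    }

    int main() {
        run_isolated([] { throw std::runtime_error("logging backend unavailable"); });
        std::cout << "critical processing continues\n";
    }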

d) Graceful Degradation

  • Platforms/frameworks detect situations in which other software units consume resources or generate faults beyond controlled limits,
  • Frameworks/applications limit or disable resource/time use of non-essential functions,
  • Applications move the system into a fail-safe state,
  • Platforms/frameworks prioritize the allocation of control and resources to the critical software units.
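
A minimal C++ sketch of such a degradation ladder: depending on measured load and fault counters, the platform sheds non-essential functions or moves to a fail-safe state. The thresholds and the functions that get disabled are assumptions for illustration.

    #include <iostream>

    enum class Mode { Normal, Degraded, FailSafe };

    // Select the operating mode from (assumed) load and fault measurements.
    Mode select_mode(double cpu_load, int fault_count) {
        if (fault_count > 3 || cpu_load > 0.95) return Mode::FailSafe;  // beyond controlled limits
        if (cpu_load > 0.80)                    return Mode::Degraded;  // shed non-essential work
        return Mode::Normal;
    }

    void apply(Mode mode) {
        switch (mode) {
            case Mode::Normal:
                break;  // all functions enabled
            case Mode::Degraded:
                std::cout << "disabling trend display and UI animations\n";
                break;  // resources are reserved for critical units
            case Mode::FailSafe:
                std::cout << "entering fail-safe state, critical alarms prioritized\n";
                break;
        }
    }

    int main() {
        apply(select_mode(0.85, 0));  // degraded operation
        apply(select_mode(0.99, 5));  // fail-safe state
    }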

e) Guaranteed Allocation of CPU

  • The scheduler guarantees allocation of CPU time within a defined interval. E.g., the software unit is assigned to a process/thread with an elevated priority class in the operating system,
  • The total CPU time is sufficient for the critical software unit to perform its computation within a defined maximum time. E.g., the overall load of all other units has a sufficient upper bound.
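
On a POSIX platform, the first point can be sketched by giving the critical unit's thread a real-time scheduling policy. This usually requires elevated privileges, and the priority value of 80 is an assumption for illustration.

    #include <iostream>
    #include <pthread.h>
    #include <sched.h>
    #include <thread>

    // Placeholder for the critical unit's periodic risk-control computation.
    void critical_loop() { /* bounded worst-case execution time assumed */ }

    int main() {
        std::thread critical(critical_loop);

        // Request a real-time policy so the scheduler allocates CPU time to the
        // critical thread ahead of ordinary threads.
        sched_param param{};
        param.sched_priority = 80;  // illustrative value only
        if (pthread_setschedparam(critical.native_handle(), SCHED_FIFO, &param) != 0) {
            std::cerr << "could not raise priority (insufficient privileges?)\n";
        }

        critical.join();
    }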

Mapping Towards Goals of Software Segregation

The following table explains how strategies support segregation goals:

Strategy                        Non-blocking Execution    Performance    Integrity
Cooperative Sharing                       X                                   X
Cooperative Performance Use               X                    X
Graceful Degradation                      X                    X              X
Guaranteed Allocation of CPU              X                    X
Component Isolation                       X                                   X

Our Conclusion: You Need (You Want) Segregation!

As shown above: beyond its use in high-risk applications, a certain level of segregation is needed in any software architecture to effectively reduce the complexity of the software system. As a result, the goals of

  • Reducing software complexity,
  • Constructing a suitable software architecture, and
  • Increasing software dependability

highly depend on each other.
