Future Electronics – Predicting power-supply reliability: an art or a science?

By Paul Baker
Business Development Manager (UK)
Future Power Solutions (a division of Future Electronics)

The datasheet is a wonderful repository of sound, tested and verified information about the performance of a component, module or system. In the case of a Power Supply Unit (PSU), the datasheet tells engineers about a huge variety of performance parameters, including ripple and noise, efficiency, accuracy of regulation, isolation voltage and electromagnetic emissions. The range and detail of the information on offer enable the user to characterise with great confidence the expected behaviour of the unit in any given application.

But what about one other important performance parameter: the reliability of the power supply? In truth, today’s PSUs from reputable manufacturers offer extremely long lifetimes. The lifetime is precisely predictable when the unit is operated in the test conditions specified by reliability standards such as MIL-HDBK-217 or Telcordia. What is more, experience shows that high-quality PSUs also offer long lifetimes outside these strictly defined parameters.

A question remains for system designers, however: how confidently can they predict the average lifetime when operating the PSU outside these test conditions? A wide variety of common factors can break these conditions: heat, shock and vibration, transient fluctuations in the supply voltage, and the ageing of electrolytic capacitors can all give rise to premature failure. The datasheet’s standard lifetime rating, then, is rarely exactly applicable to a real-world product.

At the same time, failure to manage the end-product’s reliability is hardly acceptable. The brand’s reputation is a valuable asset. The environmental and financial costs of disposal and repair are also damaging.

So how can a system design engineer confidently estimate the reliability of a Commercial Off-The-Shelf (COTS) PSU? And which are the most effective ways to maximise this level of confidence?

The limitations of manufacturers’ reliability data
The most commonly provided value expressing the lifetime of a new COTS PSU is the Mean Time To Failure (MTTF) or Mean Time Between Failures (MTBF) value. MTTF is normally specified in thousands of hours at a constant operating (ambient) temperature.

Of course, MTTF gives no indication about the time at which any single unit, chosen at random from a large population of units, will fail: MTTF is an average value. Some units will last longer than the specified MTTF value, and some will fail prematurely. In fact, assuming a constant failure rate, which might be an unrealistic assumption in the context of the operation of electronic equipment, the probability that an individual unit will last as long as the MTTF value is just 37%. Put another way, half of the units will have failed after 0.69 of the MTTF has elapsed, as shown in Figure 1.

Fig. 1: A curve showing the probability that a unit with a given MTTF is still operational after a given multiple of the MTTF. (Source: CUI ‘Reliability Considerations in Power Supplies’)

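This is because failures occurring at a constant failure rate follow an exponential distribution. The probability R(t) that a component has not failed after a given time t is:

$$R(t) = e^{-t/\mathrm{MTTF}}$$

Setting t equal to the MTTF gives R = e^{-1}, or approximately 0.37, and R(t) falls to 0.5 when t = ln(2) × MTTF, or approximately 0.69 of the MTTF.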
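The 37% and 0.69 figures quoted above can be checked numerically. The following is a minimal Python sketch, assuming a constant failure rate (the exponential model); the function names and the 500,000-hour MTTF are hypothetical values chosen only for illustration.

```python
import math


def survival_probability(t_hours: float, mttf_hours: float) -> float:
    """Probability that a unit is still working at time t.

    Assumes a constant failure rate, i.e. the exponential model.
    """
    return math.exp(-t_hours / mttf_hours)


def time_to_fraction_failed(fraction_failed: float, mttf_hours: float) -> float:
    """Time at which the given fraction of a large population has failed."""
    return -mttf_hours * math.log(1.0 - fraction_failed)


mttf = 500_000.0  # hypothetical datasheet MTTF, in hours

# Chance that a single unit survives as long as the MTTF itself
p_at_mttf = survival_probability(mttf, mttf)

# Time at which half of the population has failed, as a multiple of MTTF
median_multiple = time_to_fraction_failed(0.5, mttf) / mttf

print(f"Survival probability at t = MTTF: {p_at_mttf:.2f}")
print(f"Half the units have failed at {median_multiple:.2f} x MTTF")
```

Because both results depend only on the ratio of elapsed time to MTTF, the same percentages apply whatever MTTF value the datasheet quotes.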

PSU manufacturers employ models based on highly accelerated tests in order to predict the failure rate of their products. They cannot run a test population of PSUs under normal operating conditions and wait to observe failures, because it would take many years to gather statistically significant data. So they subject their products to excessive temperature, vibration, current and voltage stresses in order to rapidly impair them.

Clearly a sound methodology should underlie the models that convert the results of accelerated tests into a datasheet’s MTTF value; reputable PSU manufacturers carefully verify and refine their methodology to ensure it reflects users’ experience in the real world.

In so far as it goes, then, a datasheet MTTF value specified by a trusted manufacturer may be relied on. But because it applies only to narrowly specified operating conditions, it is best used as a comparison tool when choosing from among a range of competing products. In other words, MTTF is good for exposing the relative longevity of different PSUs tested under similar conditions.

But the realised value of MTTF in any given application is highly dependent on the operating conditions in that application. Temperature has the greatest effect on lifespan, but lifespan is also affected by absolute levels of input and output current and voltage, by the rate of change in these parameters, by mechanical stress and by other factors.
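To give a feel for how strongly temperature can shift the realised MTTF, the sketch below applies the Arrhenius relationship, a common textbook model for temperature-driven failure mechanisms. The article does not state which model any manufacturer uses, so the choice of model, the 0.7 eV activation energy, the temperatures and the datasheet MTTF are all assumptions for illustration only.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_ref_c: float, t_use_c: float,
                           activation_energy_ev: float = 0.7) -> float:
    """Factor by which the failure rate changes between two temperatures.

    A value above 1 means failures occur faster at t_use_c than at t_ref_c.
    The default 0.7 eV activation energy is an assumed, illustrative value.
    """
    t_ref_k = t_ref_c + 273.15
    t_use_k = t_use_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_ref_k - 1.0 / t_use_k
    )
    return math.exp(exponent)


datasheet_mttf = 500_000.0  # hypothetical MTTF at 25 degC ambient, in hours

# Failure-rate multiplier when the unit runs at 50 degC instead of 25 degC
factor = arrhenius_acceleration(t_ref_c=25.0, t_use_c=50.0)

# Under this model the MTTF scales inversely with the failure rate
scaled_mttf = datasheet_mttf / factor

print(f"Acceleration factor: {factor:.1f}")
print(f"Estimated MTTF at 50 degC: {scaled_mttf:,.0f} hours")
```

The result is only as good as the assumed activation energy, which differs between failure mechanisms; that is one reason to request an application-specific figure from the manufacturer rather than rely on a calculation of this kind.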

So while the MTTF figure is calculated based on a set of ‘typical’ and constant operating conditions, many users’ applications will operate in conditions which:
• are highly variable
• differ from the ‘typical’ values

Even if the application has constant conditions, they will almost certainly not be the same as those of the datasheet’s typical application.

The datasheet information about failure rates and reliability is, then, of limited utility when estimating likely failure rates in any given real-world application. And yet the power-system designer must design for a maximum acceptable failure rate appropriate to his or her end product. Whether this target failure rate is almost zero, in a mission-critical application, or one failure every 10,000 hours in the case of a low-cost consumer product, the designer must attain a high level of confidence that the actual failure rate in the field will not exceed the target.
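A target failure rate can be translated into quantities that are easier to reason about. The sketch below is an illustration only: it assumes a constant failure rate, reads "one failure every 10,000 hours" as a per-unit MTTF of 10,000 hours, and uses a hypothetical fleet size, operating period and 99% survival target.

```python
import math


def expected_failures(units: int, hours: float, mttf_hours: float) -> float:
    """Expected number of failed units in a fleet after a given time.

    Assumes a constant failure rate and no replacement of failed units.
    """
    return units * (1.0 - math.exp(-hours / mttf_hours))


def required_mttf(mission_hours: float, target_survival: float) -> float:
    """MTTF needed for a unit to survive the mission with the target probability."""
    if not 0.0 < target_survival < 1.0:
        raise ValueError("target_survival must be between 0 and 1 (exclusive)")
    return -mission_hours / math.log(target_survival)


HOURS_PER_YEAR = 8_760

# Hypothetical low-cost product: 1,000 units running continuously for a year
failed_in_year = expected_failures(1_000, HOURS_PER_YEAR, mttf_hours=10_000.0)

# Hypothetical demanding product: 99% of units must survive one year
mttf_needed = required_mttf(HOURS_PER_YEAR, target_survival=0.99)

print(f"Expected failures in the first year: {failed_in_year:.0f} of 1,000")
print(f"MTTF needed for 99% one-year survival: {mttf_needed:,.0f} hours")
```

Working backwards from a survival target in this way shows why the MTTF required for a demanding application is far larger than the intended service life.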

As described above, the MTTF in the datasheet does not provide this high level of confidence other than in the stated constant operating conditions. So how can the power-system designer predict the real-world failure rate more confidently? The answer is part art, part science.

The science is in the additional data sets that will be available from reputable PSU suppliers. Manufacturers such as Murata Power Solutions, Vicor and CUI, for instance, will provide field data: a statement about the observed failure rate of PSUs returned to the manufacturer for repair or replacement. The statement is based on examination of each failed unit, and provides an analysis of the cause of failures.

This statement can help potential users of a particular model of PSU to:
• verify the MTTF calculation by observing the correlation between it and the observed field failure rate, as shown in Figure 2
• identify specific operating conditions, stresses or component parts that appear to cause most failures
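The first check in the list above can be sketched as a simple comparison between the field-derived MTTF and the datasheet value. All numbers below are hypothetical, and the point estimate (total operating hours divided by confirmed failures) assumes a constant failure rate and that every field failure is actually returned and counted.

```python
import math


def observed_mttf(total_unit_hours: float, failures: int) -> float:
    """Point estimate of MTTF from field data.

    Returns infinity when no failures have been recorded, since the data
    then only shows that the MTTF is long, not how long.
    """
    if failures == 0:
        return math.inf
    return total_unit_hours / failures


# Hypothetical field statement for one PSU model
units_shipped = 20_000
average_hours_per_unit = 15_000
confirmed_failures = 400

datasheet_mttf = 500_000.0  # hypothetical datasheet value, in hours

field_mttf = observed_mttf(units_shipped * average_hours_per_unit,
                           confirmed_failures)
ratio = field_mttf / datasheet_mttf

print(f"Field MTTF estimate: {field_mttf:,.0f} hours")
print(f"Field / datasheet ratio: {ratio:.2f}")
```

A ratio well below 1 would suggest that field conditions are harsher than the datasheet's test conditions, or that the calculated MTTF is optimistic; unreported failures would push the ratio in the flattering direction.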

Fig. 2: A PSU’s lifespan has three phases. ‘Infant mortality’ is high in the first phase, lasting around 24 hours. Pre-shipment burn-in weeds out these infant mortality failures. (Source: CUI ‘Reliability Considerations in Power Supplies’)

Reputable manufacturers also provide detailed application notes which design engineers can study to learn how to optimise their implementation of a PSU. Application notes from suppliers such as SL Power provide useful guidelines on thermal and mechanical design, for instance, and reflect the depth of detail to which its design optimisation process drills down. Following the manufacturer’s guidelines will help to maximise the PSU’s lifetime.

The second additional data point is an application-specific MTTF rating, customised for the operating conditions typical of the user’s application; Vicor, for example, provides this to users of its power supplies on request. Some uncertainty remains, both in the accelerated testing methodology and in the user’s own specification of the application’s operating conditions. Even so, this customised MTTF figure gives a more reliable estimate of the average failure rate across a population of Vicor PSUs in the user’s application than the standard MTTF value based on typical operating conditions.

The third data point, again available from every reputable PSU manufacturer, is a thermal plot, showing the unit’s safe operating curve, and the way that this is affected by changes in the application such as the addition of a cooling airflow.

Even this expanded range of data, however, cannot provide an average failure rate that can be calculated with absolute certainty in any given application: the range of variables affecting the operation of a PSU, and the uncertainty inherent in the manufacturer’s testing methods, are simply too great. Indeed, the mind-boggling nature of the uncertainty inherent in random real-world events has exercised some of science’s greatest minds: Alan Turing is said to have expressed the problem to a colleague thus:
‘How best could you estimate the number of taxi cabs in a town, having seen a random selection of their licence plates?’
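One classical answer to this kind of question, often discussed under the name ‘German tank problem’, estimates the population size from the largest number seen and the sample size. The sketch below assumes that plates are numbered consecutively from 1 and that each cab is seen at most once; it is not claimed to be the method Turing himself had in mind, and the plate numbers are invented.

```python
def estimate_population(observed_plates: list[int]) -> float:
    """Estimate the total number of cabs from a sample of plate numbers.

    Uses the classical estimator m + m/k - 1, where m is the largest plate
    number seen and k is the number of distinct plates observed. Assumes
    plates run consecutively from 1 and are sampled without replacement.
    """
    distinct = set(observed_plates)
    if not distinct:
        raise ValueError("at least one plate number is required")
    k = len(distinct)
    m = max(distinct)
    return m + m / k - 1


plates_seen = [19, 40, 42, 60]  # invented sample of licence plates
estimate = estimate_population(plates_seen)

print(f"Estimated number of cabs: {estimate:.0f}")
```

The estimate improves as more plates are seen, which mirrors the way a growing body of field data sharpens an estimate of a PSU's failure rate.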

Science, then, only provides part of the answer; the power-system designer must also apply the art of the engineer. Experience will give the designer a feel for the reliability of each manufacturer’s data. By examining their own products’ field failures, OEM designers can build up a picture of the actual failure rate, and the causes of failures, and compare it with the expectation they had formed based only on the manufacturer’s data. Is there a close correlation, or is actual performance better or worse than expected? And how far does it deviate from the predicted performance?

The engineer’s intuitions about these questions help to reinforce the confidence he or she has in any estimate of failure rates derived from measurement and statistical calculations.

Predicting with confidence
When a datasheet expresses information about a PSU’s reliability, or unreliability, it does so with apparent mathematical certainty. The data on their own, however, only give a limited level of confidence in the predicted MTTF in any given application.

But design engineers can enjoy a high level of confidence in the lifetime performance of their chosen PSU when it is supplied by a known, reputable manufacturer, and when they can draw on their own experience of the manufacturer’s data, or that of a trusted third party such as a power-supply distributor. Collectively, many years of knowledge are built into these products and, perhaps surprisingly, not all of them come at a premium cost.

In other words, it is neither art nor science alone that helps the engineer to make good judgements about the lifespan of a PSU; it is art and science combined.