When Do You Stop?
Dave Jesse on July 22, 2019

It sounds like a simple question, but sometimes the task seems to grow while the target moves further from your grasp. Here is a tale of one such case which illustrates how hard it can be to decide when to stop working on a problem.

An Undefined Error

One of our customers raised a support ticket to have an “Undefined Error” investigated. The flight had failed to process and they asked us to look into the cause. A quick view showed a strange spike at the start of the flight, and here is a close look at the pressure altitude and airspeed data.

The lump of data before the gap is taken from the top of descent, but lies between the start of a takeoff run and a point early in the descent. Clearly something very strange has gone on here.

This flight was one part of a larger download, with 26 flights of various durations recorded. By looking at this longer period we can see if the error is a “one off” or repeated. Here is what the original download looked like before splitting into separate flights.

Careful review found these two other errors:

At this stage we have three corrupted flights out of 26. In all these cases the data disjoints appear at the start of a flight, although with so few examples this may not always be the case.

Tracing the error

Whenever there are jumps or blocks of data that look wrong, I always turn to the frame counter to see whether the flight data acquisition system supplied the data with sequential frame counts, or whether the frame counter jumps as well. A jump in the frame counter points to an error in the recording system rather than the acquisition system.

Taking the last case above, here is the same data with the frame counter in view.

Clearly the frame counter jumps at the same point as the data, implying that there is something amiss with the recording system, or data transfer, and the data source on the aircraft is probably working correctly.
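For readers who want to try the same check on their own data, here is a minimal sketch of the frame counter test in Python. It is not the tool we used; the function name, the sample values and the rollover value of 4096 (a 12-bit counter) are all assumptions made for illustration, so check the rollover against your own data frame documentation.

```python
def find_frame_counter_jumps(counters, modulus=4096):
    """Return (index, previous, current) for every frame whose counter
    is not exactly one more than the previous frame's counter.

    `modulus` is an assumed rollover value (a 12-bit counter), so a step
    from 4095 back to 0 is treated as normal and not reported.
    """
    jumps = []
    for i in range(1, len(counters)):
        expected = (counters[i - 1] + 1) % modulus
        if counters[i] != expected:
            jumps.append((i, counters[i - 1], counters[i]))
    return jumps


# Hypothetical frame counts: a block from elsewhere sits inside the flight.
frames = [100, 101, 102, 2050, 2051, 2052, 103, 104]
for index, previous, current in find_frame_counter_jumps(frames):
    print(f"frame {index}: counter went from {previous} to {current}")
```

Each reported index marks an edge of a misplaced block, so a single inserted block normally shows up as a pair of jumps.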

When Do You Stop?

The title of this blog is “When Do You Stop?”, and this is the first place we could do so. We could tell the customer that we think the error is in the recording system and not the data acquisition system, and leave them to investigate further.

Alternatively, we can carry the investigation further to see if we can provide more information that will aid trouble-shooting, and to check it was not caused by the data transfer system.

We know that the data frame counter, which should increment linearly, jumped, so we can compare other time-related parameters which should also be predictable. I’ll skip a couple of steps here and show you the key result in a table, again for the data range of the case plotted above.

The corrupt data was from 21st April, and was embedded in a recording made on 9th June. The entire recording covered flights from 3rd to 10th June, so the corrupt section came from a flight that happened about six weeks before the start of the data.
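The date comparison can be sketched in a few lines of Python. This assumes the recorded date of each frame has already been decoded into a date value; the function name and the sample frames are hypothetical, and the year 2019 is an assumption taken from the date of this post.

```python
from datetime import date


def frames_outside_window(frame_dates, start, end):
    """Return (index, recorded_date) for frames dated outside [start, end]."""
    return [
        (i, d) for i, d in enumerate(frame_dates)
        if not (start <= d <= end)
    ]


# Recording window from the download; the year is assumed.
window_start = date(2019, 6, 3)
window_end = date(2019, 6, 10)

# Hypothetical decoded dates for a few consecutive frames.
frame_dates = [
    date(2019, 6, 9),
    date(2019, 4, 21),
    date(2019, 4, 21),
    date(2019, 6, 9),
]

for index, recorded in frames_outside_window(frame_dates, window_start, window_end):
    days_early = (window_start - recorded).days
    print(f"frame {index}: dated {recorded}, {days_early} days before the window starts")
```

A check like this is cheap to run across a whole download, and it flags stale blocks even when the altitude and airspeed traces happen to look plausible.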

Checking the original download for the flight on 21st April, we can show that the data is exactly the same. That is, it’s not just the hour and minute values that have somehow been altered; the entire data for that period is an exact copy of the data originally recorded on 21st April.
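An "exact copy" test of this kind can be sketched as a simple search for the suspect block inside the earlier download. The sketch below is a deliberately naive illustration, not our processing code; it assumes each frame has been reduced to a tuple of decoded values, and the parameter values shown are made up.

```python
def find_block(earlier_frames, suspect_block):
    """Return the index in `earlier_frames` where `suspect_block` occurs
    as an exact, contiguous copy, or -1 if it does not occur."""
    n, m = len(earlier_frames), len(suspect_block)
    if m == 0 or m > n:
        return -1
    for start in range(n - m + 1):
        if earlier_frames[start:start + m] == suspect_block:
            return start
    return -1


# Hypothetical frames as (hour, minute, pressure altitude) tuples.
april_download = [
    (10, 0, 1200), (10, 1, 1850), (10, 2, 2400), (10, 3, 3100), (10, 4, 3700),
]
suspect = [(10, 2, 2400), (10, 3, 3100)]

position = find_block(april_download, suspect)
print("exact copy found at frame", position if position >= 0 else "none")
```

Comparing whole frames rather than just the time words is what separates a true copy from data whose clock values alone have been altered.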

Should We Go Further?

We could now go on to try to recreate the data processing and extraction stages to see if this error could have been created by the data transfer and processing. If this data was transferred from a new type of recorder or for a new customer, we might do this, but in this case it was for a well-known aircraft type and recorder configuration and a long-established customer.

Another pointer is that since June, there have been no more cases of data corruption for this aircraft. We therefore run the risk of creating more work for ourselves and for our customer than is justified to identify the cause of three corrupt flights on one download.

What I have tried to illustrate here is that investigating data corruption can be complex, and sometimes surprising. We always need to balance the effort of the investigation, and the possible costs to the airline had we asked for components to be changed, against the benefits of quality data.