Software Failures in Space
On January 28, 1986, the Space Shuttle Challenger and her seven-member crew were lost when a ruptured O-ring in the right Solid Rocket Booster caused an explosion soon after launch. © by NASA
The satellite sits in the clean room. Safely cradled as it goes through detailed testing. Doors open with cables snaking from its heart, across the floor, to a bank of equipment. Power carefully conditioned, every signal recorded, behavioral models constantly verified. Technicians and engineers peer into their screens, hunting. Hunting for anything unexpected. Fixing anything unexpected. All must perform only and exactly as expected. No exceptions.
Launch day comes. The warm serenity of an equatorial launch site. The cataclysmic fire of a perfect launch. The first stage separates cleanly, falling away as the kick motor engages. Near its final orbit, the satellite awakens. First job, take bearings and establish attitude control. Second job, well, doesn’t matter. The satellite starts tumbling, faster and faster, thrusters firing, fuel draining quickly. The last thing ever heard from the satellite is an emergency broadcast: all circuits functioning perfectly, attitude control system unresponsive, fuel nearly depleted, battery nearly drained, solar cells unresponsive.
WISE Project Scientist Peter Eisenhardt stands next to the fully assembled WISE satellite at Ball Aerospace & Technologies Corp., in Boulder, Colorado. © by NASA/JPL-Caltech/Ball
What could have happened? People turn to the software, since the electrical system was working perfectly. The change log is pulled, and some fixes put in during testing show up. They are the usual math errors often found in guidance systems: a missing or misplaced negative sign in an attitude control matrix, and an axis misused in the matrix. Let’s look at these. Consider that a 3-axis accelerometer chip reports readings along its own x, y, and z axes. But these aren’t actually the x/y/z of the craft. The chip gets mounted on a circuit board, moved around to minimize signal interference, and rotated on the board to keep its data lines away from the edge. Thus the x/y/z axes in the accelerometer documentation are no longer the craft’s x/y/z, and likely not even the x/y/z the software developer was told they would be. These things happen as each set of engineers works its specialty, and they are easily seen and fixed during testing.
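To make the axis problem concrete, here is a minimal Python sketch of how a sensor reading might be mapped into the craft frame. Everything in it is an assumption for illustration: the names `SENSOR_TO_CRAFT`, `sensor_to_craft`, and `is_proper_rotation` are hypothetical, and the 90 degree rotation about z is an invented mounting, not the one from any real satellite. It also shows why a single wrong sign is detectable: a valid mounting matrix is a pure rotation with determinant +1, and flipping one sign turns it into a mirror image with determinant -1.

```python
# Hypothetical mounting matrix: rows are craft axes, columns are sensor axes.
# Assumed example: the chip sits rotated 90 degrees about its z axis, so
# craft x = -sensor y, craft y = +sensor x, craft z = +sensor z.
SENSOR_TO_CRAFT = (
    (0, -1, 0),
    (1, 0, 0),
    (0, 0, 1),
)


def sensor_to_craft(reading, mounting=SENSOR_TO_CRAFT):
    """Rotate a sensor-frame (x, y, z) reading into the craft frame."""
    return tuple(sum(m * r for m, r in zip(row, reading)) for row in mounting)


def determinant(m):
    """Determinant of a 3x3 matrix given as a tuple of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_proper_rotation(m, tol=1e-9):
    """True if the rows are orthonormal and the determinant is +1."""
    for r in range(3):
        for s in range(3):
            dot = sum(x * y for x, y in zip(m[r], m[s]))
            expected = 1.0 if r == s else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(determinant(m) - 1.0) <= tol


# The same matrix with one sign "fixed" the wrong way (assumed example).
SIGN_ERROR = (
    (0, 1, 0),
    (1, 0, 0),
    (0, 0, 1),
)

if __name__ == "__main__":
    print(sensor_to_craft((1.0, 2.0, 3.0)))
    print(is_proper_rotation(SENSOR_TO_CRAFT))
    print(is_proper_rotation(SIGN_ERROR))
```

A check like `is_proper_rotation` only catches mirrored axes. A matrix for the wrong but still valid rotation would pass it, which is why the mounting itself still has to be confirmed against the hardware.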
To test all the instruments in the satellite, an access panel had to be opened and some instruments moved so the test cables could reach into the core. Doing so moved the accelerometer board. The engineers, watching their screens, saw what they assumed was an error in the attitude control system, so they just altered the software and kept testing.
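One way the moved board might have been caught, sketched in Python under stated assumptions: while the satellite sits still on its stand, the accelerometer should report roughly 1 g along a known craft axis. Comparing the measured direction with the expected one tests the mounting itself instead of patching the software until the screens look right. The function names and the 2 degree tolerance are hypothetical, chosen only for this illustration.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two 3-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot take the angle of a zero-length vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


def mounting_matches(measured_craft, expected_craft, tolerance_deg=2.0):
    """True if the measured at-rest direction agrees with the expected one.

    measured_craft: accelerometer reading already mapped into craft axes.
    expected_craft: direction the reading should have on the test stand.
    tolerance_deg: assumed acceptance threshold, not a real mission value.
    """
    return angle_between_deg(measured_craft, expected_craft) <= tolerance_deg


if __name__ == "__main__":
    expected = (0.0, 0.0, 1.0)  # assumed: craft z points up on the stand
    print(mounting_matches((0.05, -0.02, 9.81), expected))  # board in place
    print(mounting_matches((9.81, 0.0, 0.1), expected))     # board moved
```

If the check fails right after a panel has been opened, the hardware is the first suspect, not the matrix in the software.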
Was there really an issue in the software? Perhaps with the axis settings, but that was never actually tested, just changed. Was this really an “error” in the software or was it just out of the loop on current hardware specs? Regardless of how you classify the incident, it is the large number of these tiny details that makes it all Rocket Science. Hell Yeah!
read this on www.part-time-scientists.com






