|
Here are some of the horrors I've endured in my career as a programmer.
More often than I care to remember, I was the one inflicting the horror
on my customers or employers. Considering that I've been in this line
of work for 30 years, my record is damn good.

Fort Leavenworth, CATTS.

The Fulda Gap digital map data generation always failed just before the end of my data run. I think the data covered 100 by 50 kilometers, sampled every 100 meters, and I had to build five zoom levels and generate the terrain (contours), movement, vegetation, and roads at all five of them. I'd spent hundreds of hours making sure the design was right, the code was bug-free, and the result would be usable. The first run, I fired it up at 10 p.m., waited around a couple of hours, and then drove 40 miles home in the icy midwinter night. The run died about 6 a.m., just before it finished. I tightened the code so it wouldn't run as long. Same result. It always died just before the end. Did I have a bug? After a week, I split the run into pieces. Still the same result. My time was up - I'd resigned a month or so before, and now I had to leave and go out West.

It turned out that there was no need for me to feel guilty - the Army had run the machine, an ancient CPU that was no longer manufactured, at 90 degrees the week before I started running my code. The machine was crumbling, literally. It took them months to get it fixed. They had to fly someone in from Arizona to rebuild it by hand. I'm one of the few guys around who can claim, however implausibly, that his software caused mainframe hardware to die.

The TSO/DMS switch on the GA 18/30 in Portland.

I'd ordered the machine with a TSO/DMS switch, because we ran two operating systems on the machine, and they required different interrupt structures. It arrived with a "TSS/DMS" switch. The machine wouldn't boot. I was suffering that sneaking unease so familiar to all programmers. The solution: write a program in binary using the front panel, and make damn sure to eliminate the possibility that the interrupts were wrong. Oops. They were wrong. By this time, it was the middle of the day. I called G.A. and complained. "We thought you meant TSS/DMS. We haven't made a TSO/DMS switch in a couple of years."
Idiots, don't you know how "assume" is spelled? I arranged everything: booked the flight to L.A., gathered all the stuff I needed to completely rewrite the system, and so on. Got on the plane, flew to L.A. that afternoon, drove to Anaheim. The only hangup was when the guard wouldn't let me in the gate. Finally got through that and spent the entire night rewriting everything that used I/O on a special machine that had all three interrupt structures on it. Finished just about dawn, got back in the car, flew back to Portland with all my stuff piled in the extra seat I'd bought for it. It worked! By evening I was essentially done. Went out to eat, but couldn't stay awake. No food for over a day, but the hell with it. Went back to the apartment to sleep the sleep of the righteous, knowing that I'd saved the company. This was probably the most demanding bit of work I've ever done: total concentration, and no room for error.

The Board of Trade.

It's little known that the Kansas City Board of Trade was the first to trade stock index futures. Walt Vernon, one of the most admirable men I've ever met, had the idea and pushed it through. The company I worked for was hired to implement it. We were using Series/1s, and when they were turned off, the programmable keys lost their settings. Shortly before the Board was due to open, on the first day of trading, I went to the floor to program the keys. My finger quite literally slipped as I was doing this and hit a data-entry key, spraying data across the boards above the trading floor, in plain sight of everyone, and entering meaningless data for the opening trades in the files. This was a big deal because the national TV networks were there, with film ready to roll in 20 minutes. I had no idea how to fix my blunder. Went and got my boss, who freaked. He wrapped his arms around himself and thought deeply for a few minutes, made a few entries, and fixed the problem.
The weird thing is that I think this is the only time in my entire career that my finger has slipped, except on cosmetic text entries. And it was the worst possible moment.

PFS.

Volumes sinking without a trace because they overlapped. Data would grow from one volume into the next. Luckily, I had a rigorous backup policy. I don't remember who configured those volumes originally - me, or someone else. But it was nerve-wracking while it lasted. Like the Board of Trade, this was the other time someone else solved the problem. Thanks, Pete.

The Data Garbage box.

The DG machine that Val and I worked on. The hardware would just freeze up for no reason. The operating system was obviously written by a group of madmen who never spoke to each other. And the application software, if possible, was worse. I decided to quit. They quickly transferred me to other work. Poor Val stuck with it a while longer.

CompuServe's X.25 bug.

We moved our application from a weird CP/M LAN-in-a-box (what a nightmare piece of crap that was; I don't even want to put it on this page of horrors, it was so bad) to Xenix. Bad as it was, Xenix was an improvement, except for one thing: the pseudo-ttys started locking up. It turned out that occasionally CompuServe would deliver 2 bytes of data (say, the letter "D" and a carriage return) split across two packets, with the "More" bit set. This is illegal, according to the blue book (then the current standard). I had to write code to prevent the deadlocks. A real productive use of time, modifying gettys and I/O instead of enhancing the application. Finally had to get CompuServe and Altos in at the same time, and they sat around and waited until it happened. The reaction of the CompuServe guy was that "it works okay on the VAX". Morons. It took them only minutes to reconfigure their boxes so they wouldn't do this any more. As Janeane Garofalo would put it, "I'm thinking, like, an AK-47 and the nearest rooftop" next to CompuServe headquarters.
No wonder CompuServe failed to adapt and went down the tubes. Good riddance.

Hide and seek with the hardware.

At 2901 Grand, every once in a while the machine would throw a few bytes of garbage into the data being read from the disk. It took me weeks to figure out that the interrupts from our homemade tape reader were the source of the problem. Roger went back to his blueprints, made our homemade I/O hardware smarter, and the problem went away.

He shall remain nameless.

This story and the next are about a guy I used to work with. He was brilliant, but erratic and careless. I'm not sure of all the details, but I have to put these stories here anyway... In the early days of a certain project I joined - before I was even employed at that company, in fact - there were some very serious bugs in the product I was later doomed to work on. At one of the beta sites, a bug caused the customer to lose weeks of data. Two of our people flew there and spent - oh, who knows? let's make up a number and say 36 hours - re-keying all the data. (Remember the first three rules of life in the electronic lane: backup, backup, backup.) Exhausted, they went back to their hotel for a well-deserved rest. The Blameful One (call him Yul) showed up, installed the software to fix the problem, and left without bothering to check that it worked. Rested once again, the two who had been keying their fingers to the bone returned, fired up the machine, and immediately lost all their data again. When she got home, one of them confronted Yul and told him that if he ever did that again, she was going to rip his lips off. Go, girl.

Yul, number 2.

This one I saw in person. Yul and another guy were replacing a burned-out power supply. This work wasn't warrantied, I think; customers weren't supposed to do this, the vendor was. Turned out that this was a special model of the machine, and took a different power supply. Meltdown. The vendor wouldn't pay to fix it, either. We had to foot the bill. |