Malicious code written into DNA infects the computer that reads it
By Devin Coldewey
Aug 10 2017
In a mind-boggling world first, a team of biologists and security researchers have successfully infected a computer with a malicious program coded into a strand of DNA.
It sounds like science fiction, but I assure you it’s quite real — although you probably don’t have to worry about this particular threat vector any time soon. That said, the possibilities suggested by this project are equally fascinating and terrifying to contemplate.
The multidisciplinary team at the University of Washington isn’t out to make outlandish headlines, although it’s certainly done that. They were concerned that the security infrastructure around DNA transcription and analysis was inadequate, having found elementary vulnerabilities in open-source software used in labs around the world. Given the nature of the data usually being handled, this could be a serious problem going forward.
Sure, they could demonstrate the weakness of the systems with the usual malware and remote access tools. That’s how any competent attacker would come at such a system. But the discriminating security professional prefers to stay ahead of the game.
“One of the big things we try to do in the computer security community is to avoid a situation where we say, ‘Oh shoot, adversaries are here and knocking on our door and we’re not prepared,’” said professor Tadayoshi Kohno, who has a history of pursuing unusual attack vectors for embedded and niche electronics like pacemakers.
“As these molecular and electronic worlds get closer together, there are potential interactions that we haven’t really had to contemplate before,” added Luis Ceze, one co-author of the study.
Accordingly, they made the leap plenty of sci-fi writers have made in the past, and that we are currently exploring via tools like CRISPR: DNA is basically life’s file system. The analysis programs are reading a DNA strand’s bases (cytosine, thymine etc, the A, T, G, and C we all know) and turning them into binary data. Suppose those nucleotides were encoding binary data in the first place? After all, it’s been done before — right down the hall.
Here comes the mad science
Here’s how they did it. All you really need to know about the transcription application is that it reads the raw data coming from the transcription process and sorts through it, looking for patterns and converting the base sequences it finds into binary code.
“The conversion from ASCII As, Ts, Gs, and Cs into a stream of bits is done in a fixed-size buffer that assumes a reasonable maximum read length,” explained co-author Karl Koscher in response to my requests for more technical information.
That makes it ripe for a basic buffer overflow attack in which programs execute arbitrary code because it falls outside expected parameters. (They cheated a little by introducing a particular vulnerability into the software themselves, but they also point out that similar ones are present elsewhere, just not as conveniently for purposes of demonstration.)
After developing a way to include executable code in the base sequence, they set about making the exploit itself. Ironically, it’s inaccurate to call it a virus, although it’s closer to a “real” virus than perhaps any malicious code ever written.
“The exploit was 176 bases long,” Koscher wrote. “The compression program translates each base into two bits, which are packed together, resulting in a 44 byte exploit when translated.”
Given that there are 4 bases, it would make sense to have each represent a binary pair. Koscher confirmed this was the case. (If you’re curious, as I was: A=00, C=01, G=10, T=11.)