Read/Write DNA
Nova Spivack, of Twine fame, has written an interesting blog post asking whether our ‘junk’ DNA (the 97% of our DNA that doesn’t code for amino acids) could be a more effective storage mechanism for communal knowledge than Wikipedia:
There is of course one other place to store knowledge which may be even better than the Wikipedia — and that is DNA. By storing knowledge in human DNA of living humans, or of common bacteria for that matter, it could then potentially be passed down and spread through generations into the far future. However the mutability of DNA over time might gradually introduce errors that would degrade the information within particular lines of DNA over long periods of time.
Perhaps this could however be mitigated by comparing DNA samples from a large cross-section of individuals within the population of descendants of original holders of DNA-knowledge-archives in the future — this would effectively enable statistical error cancellation. The farther in the future from the date at which the knowledge is “written” to the DNA of some number of humans, the more people’s DNA would be needed to eliminate the errors statistically. This would however in principle counteract mutations and enable the reliable recovery of messages in DNA even very far in the future.
Interestingly, the problem he poses and the solution he proposes mirror the wiki process itself: gathering data from everyone makes errors likely, but comparing across a large sample should minimize those errors, if not eliminate them.
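To make the "statistical error cancellation" idea concrete, here is a minimal sketch in Python of one way it could work: a per-position majority vote across many mutated copies of the same message. This is my own illustration, not Spivack's method. The function names, the 10% mutation rate, and the 101 copies are all assumptions, and the model is deliberately crude: it allows only single-letter substitutions (no insertions or deletions) and treats every copy as mutating independently, which real descendants sharing a family tree would not.

```python
import random
from collections import Counter

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Copy seq, swapping each base for a different one with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(copies):
    """Take the most common base at each position across equal-length copies."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )

# Hypothetical experiment: one 200-base "message", 101 independently mutated copies.
rng = random.Random(42)
original = "".join(rng.choice(BASES) for _ in range(200))
copies = [mutate(original, 0.10, rng) for _ in range(101)]
recovered = consensus(copies)

# Count how many positions the vote got wrong.
errors = sum(a != b for a, b in zip(original, recovered))
print("positions wrong after voting:", errors)
```

Under these assumptions each individual copy is riddled with errors, yet the vote should recover the message, because the wrong letters at any given position rarely agree with each other. It also shows why Spivack expects to need more people as time passes: the higher the mutation rate, the more copies the vote needs before the correct letter reliably wins.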
Spivack goes on to cite an article by Karl Kruszelnicki about a language that possibly already exists in our DNA:
According to the linguists, all human languages obey Zipf’s Law. It’s a really weird law, but it’s not that hard to understand. Start off by getting a big fat book. Then, count the number of times each word appears in that book. You might find that the number one most popular word is “the” (which appears 2,000 times), followed by the second most popular word “a” (which appears 1,800 times), and so on. Right down at the bottom of the list, you have the least popular word, which might be “elephant”, and which appears just once.
Set up two columns of numbers. One column is the order of popularity of the words, running from “1” for “the”, and “2” for “a”, right down to “1,000” for “elephant”. The other column counts how many times each word appeared, starting off with 2,000 appearances of “the”, then 1,800 appearances of “a”, down to one appearance of “elephant”.
If you then plot on the right kind of graph paper, the order of popularity of the words, against the number of times each word appears you get a straight line! Even more amazingly, this straight line appears for every human language - whether it’s English or Egyptian, Eskimo or Chinese! Now the DNA is just one continuous ladder of squillions of rungs, and is not neatly broken up into individual words (like a book).
So the scientists looked at a very long bit of DNA, and made artificial words by breaking up the DNA into “words” each 3 rungs long. And then they tried it again for “words” 4 rungs long, 5 rungs long, and so on up to 8 rungs long. They then analysed all these words, and to their surprise, they got the same sort of Zipf Law/straight-line-graph for the human DNA (which is mostly introns), as they did for the human languages!
There seems to be some sort of language buried in the so-called junk DNA! Certainly, the next few years will be a very good time to make a career change into the field of genetics.
Incidentally, this type of analysis is what generates most great discoveries: somebody looking at two things that have never before been connected to each other and saying, “Hey, there’s a pattern here!”
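If you want to try the analysis Kruszelnicki describes, it can be sketched in a few lines of Python: chop a sequence into fixed-length "words", rank them by how often they appear, and measure how straight the result is on log-log axes (which I take to be the "right kind of graph paper"). This is a rough sketch, not the researchers' actual procedure. The function names are mine, and cutting the sequence into non-overlapping words starting at the first letter is an assumption, since the quote doesn't say how the cuts were made.

```python
import math
from collections import Counter

def split_words(seq, k):
    """Cut seq into consecutive non-overlapping words of k letters, dropping a short tail."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]

def rank_frequency(words):
    """Return (rank, count) pairs, with rank 1 for the most frequent word."""
    counts = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(pairs):
    """Least-squares slope of log(count) against log(rank).

    Needs at least two distinct ranks; a textbook Zipf distribution gives about -1.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy input only; a real test would need a very long stretch of actual DNA.
toy = "ACGTACGGACGTTTACGTACGAACGTACGT" * 20
for k in range(3, 9):  # words 3 to 8 letters long, as in the quote
    pairs = rank_frequency(split_words(toy, k))
    if len(pairs) > 1:
        print(k, "letters:", len(pairs), "distinct words, slope", round(loglog_slope(pairs), 2))
```

A slope near -1 with points hugging the line is the Zipf signature. The toy string above is far too short and repetitive to show anything meaningful, so treat the loop as a template for real data rather than a result.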
Spivack goes on to suggest that all we need is a way of writing to the DNA and we’re sweet (assuming we also have a way to read it).
Wouldn’t it be great? Imagine you’re the first person encoded—you’d be unstoppable at pub quizzes. You’d make millions on Jeopardy! and 1 vs. 100. You’d be totally insufferable (nobody likes a literal know-it-all), but at least you’d be rich.
Unfortunately, there’s an issue. Not with the idea that societal knowledge can be carried within us—that already exists. How else do salmon know where to go? No, it’s more the idea of our ability to mechanically control this process that pulls me up short.
Mainly, the problem is that there’s no single-source option for DNA. If somebody updates Wikipedia, we all see the updated version, but with DNA, you’d have to have an intimidatingly active sex life to make sure new information is properly distributed.
And how do you handle the question of version control? It would be worse than figuring out whether you qualify as a Native American. “Well, my great-great-grandmother was first infected with knowledge in 2014, so my batch is more recent than yours…” What a mess.
Sorry, Nova, I think we’ve got a ways to go before your idea can be made a reality. I will say this, though: if you can make the semantic web happen, I’ll back you for wikiDNA as well.
(hat tip: Brian Hayes)




