Bioinformatics: Gerhard Schimpf interviews Sebastian Schultheiss
Sebastian J. Schultheiss is Managing Director of Computomics Molecular Data Analysis in Tübingen, Germany. His company specializes in applied bioinformatics, specifically for agricultural applications like plant breeding to improve crops.
Gerhard Schimpf (GS): Sebastian, thank you for agreeing to an interview about our topic Being Human with Algorithms. Could you introduce yourself and give us an overview of your current responsibilities?
Sebastian J. Schultheiss (SJS): I’m a co-founder and Managing Director of Computomics, a German company focused on analyzing genomics data for plant breeders and seed companies. We started Computomics in 2012 and wanted to apply machine learning and bioinformatics to this space, because datasets are growing exponentially and require a lot of interpretation, which can no longer be done manually.
GS: Which developments led you to pursue Computer Science and which trends of the currently observable digital transformation should we pay the most attention to?
SJS: I’m a native of Tübingen and was very lucky early on to be exposed to computer networks and Linux servers while still in high school, through “Schulen ans Netz”, a German government initiative of the 1990s. I’ve always been fascinated by computer science, specifically machine learning, and then became interested in genomics. In my opinion, those are the fields where we can expect the most progress in the next decade. The most interesting field is therefore the intersection of the two, which is bioinformatics. I studied this subject at the Universities of Tübingen and Michigan, and then did my PhD at the Max Planck Institute for Developmental Biology and the Friedrich Miescher Laboratory of the Max Planck Society in the “Machine Learning in Biology” group.
With the advent of high-throughput DNA sequencing at low cost around 2006 (so-called next-generation sequencing), the amount of genomic data started to accumulate exponentially. We saw this in our scientific work and thought: let’s start a service company that offers to interpret this data. Since the relationship between an organism’s observable characteristics (its phenotype) and the genetic potential it carries (its genotype) is extraordinarily complex and influenced by the organism’s environment, bioinformaticians started applying machine learning and other pattern recognition algorithms to these datasets. This is commonplace in human genetics today and has allowed us to understand several hereditary diseases, but at the time it was still novel for plant breeding. In an effort to focus on a specific market with our emerging company, we applied machine learning and other bioinformatics methods to plant breeding research.
In terms of attention: we are starting to see other molecular biology methods produce large amounts of data as well. This is an area of major developments, and most people born today will probably have their genomes sequenced in their lifetime. The digitalization of our genomes and those of other organisms will have unfathomable consequences, and many ethical questions haven’t been answered. The next area to focus on is the microorganisms that live within us, on us and around us. They are as important as our own genome, but that data is not at all protected by Germany’s Human Genome Diagnostics Act, for instance.
GS: How does that influence your work and how do you contribute to the digital transformation yourself?
SJS: The deluge of data has led to an increased demand for interpretation. We cannot sift through terabytes of DNA reads manually, so we have to use computer algorithms to do that. At Computomics, we continue to develop our machine learning-based algorithms with the aim of making their classifications more interpretable. We have seen instances where this effort, e.g. through visualizations, has led to further biological insight. So the “pattern” that an algorithm recognizes can indicate the way the biological process actually works. To me, that’s a fascinating prospect and something we are actively pushing forward.
GS: Would you consider the technology that you are applying in your company to be Artificial Intelligence? Do you see an overall chance for a positive change and improved standards of living in our society?
SJS: Based on the way the expression Artificial Intelligence is used today, I would agree that we are using that at Computomics. Personally, I don’t believe anyone has created an actual “intelligence” yet. So I tend to prefer the term “machine learning” instead. Our algorithms are certainly learning something, and with a bit of coaxing, we are able to get insights from the way the algorithms did that and learn something ourselves. AI obviously has tremendous potential to improve our lives, if we make that happen ethically and in an unbiased way. We mustn’t use AI to arrive at or excuse decisions that are unethical or inhumane. Ethical results can and should be hard-coded in a suitable way; see, for instance, Isaac Asimov’s Laws of Robotics.
GS: Can you empathize with the general public’s fears and ethically rooted concerns that learning systems can either be abused, or could develop a life of their own and might go out of control?
SJS: Yes, absolutely. There are a lot of powerful, seemingly convincing scenarios in science fiction films, books and other media that might further these concerns. The first scenario is very real in my opinion, as systems have always been used by the ones in power to their advantage. But that’s the same with any technology. In this scenario, humans are clearly responsible for the abuse and the powerful have to be kept in check, as people have done in the past.
For the second scenario, there might be inadvertent biases or neglect that could be introduced by human-generated data. It is therefore important that we understand more about how these systems evolve and monitor them closely. That’s where our own research into the interpretability of the machine’s decisions is headed.
One idea to mitigate that is to apply “Ethics by Design” principles already during the development of these systems. As for the often-portrayed “Skynet” scenario of machines taking over: I’m not convinced that this could happen any time soon, or that we wouldn’t be able to stop it if it did.
I think a convincing counter-example is provided by non-computer-based forms of artificial intelligence. We have already created artificial systems that make decisions on a larger scale: companies and other bureaucracies. They do define our world today, but it doesn’t feel like they are, collectively, outside of our control.
GS: Who, in your opinion, is responsible for alleviating these concerns and mitigating the risks? Do you see an immediate call to action?
SJS: Everyone involved in building these systems and selling them has a responsibility to do the right thing. I would like to see an international effort to define the ethical limits in which machines should be allowed to operate. Then we can start to put them into our software’s source code and programming libraries, so they become automatic.
The UN Declaration of Human Rights, for example, is a lot more specific than many people probably realize. It contains some very detailed descriptions for situations that are somewhat technology-dependent, and which a hundred years ago no one would have included. As a member company of the UN Global Compact, Computomics displays the Human Rights in its offices, which is why we are aware of this. Humanity could conceivably write a few amendment articles that pertain directly to the new ethical challenges posed by artificial intelligence.
In my opinion, this should be driven by computer scientists and engineers, who have a technical understanding of what’s possible today and where dangers come from. That could be an effort that ACM undertakes with its members, engaging with philosophers or the UN Human Rights Council directly, to explain the computer science and understand the ethical perspective.
GS: Thank you, Sebastian, for these impressive remarks. We at ACM are looking forward to remaining in touch with you.