Biotechnology and Machine Learning with SVM and LSS: Implementation (The Biotechnology and Machine learning Problem and Proposed Solution)

The Problem

Drug discovery still involves many real experiments on live cells. Automated, simulated models of cells could speed up this process. A simulation requires a model that represents the real structure closely enough to predict its behaviour, so that fewer real experiments, and fewer of the resources they consume, are needed.

If we treat a cell as an object to be simulated as precisely as possible, we might need to include the physics, the chemistry and even the quantum mechanics of every atom. Fully simulating even a single cell is still beyond what today’s computers can do effectively. That is why experts in biology, physics, chemistry and computing look for the best balance: how far the problem can be abstracted while predictions stay acceptably precise and results still arrive within resource constraints. Simulating only a certain function would shrink the problem, but could lose precision because of undiscovered relations to other functions. Machine learning offers examples of quite abstract solutions. Neural networks, for instance, imitate the brain’s ability to recognise patterns without simulating a brain down to the atomic level, and a basic one need not take more than a couple of paragraphs of code. This shows that large abstractions can work, but that might not be the case in drug development. Drugs are made of molecules, which have to be defined exactly before they can be produced, and that makes it hard to abstract molecular processes and still get answers about precise molecules. It is difficult to abstract something and still get precise answers out, but precision can be increased by gradually abstracting less and less.

“Computational modeling of the structure of protein-peptide interactions is usually divided into two stages: prediction of the binding site at a protein receptor surface, and then docking (and modeling) the peptide structure into the known binding site” (M. Blaszczyk et al., 2015, p1). The “gap between identified hits and the many criteria that must be fulfilled, can be bridged by investigating the interactions between the ligands and their receptors” (V. Lounnas et al., 2013, p1). Finding the level of abstraction at which a simulated cell, or a cell’s function, behaves acceptably close to reality is another interesting and complicated problem. Experimenting at different levels of abstraction could help find the optimal one for further development of these simulations. “With faster and more powerful computers larger and more complex systems may be explored using computer modelling or computer simulations” (J. Meller, 2001, p1). Work on cell receptor and ligand interactions has examples both in real biochemistry experiments and in machine learning. The most unambiguous and precise language for modelling is mathematics, but “mathematical modelling typically requires deep mathematical or computing knowledge, and this limits the spread of modelling tools among biologists” (R. Gostner et al., 2014, p16).

The knowledge gap between experts and beginners keeps widening, because experts’ time spent on actual research is more valuable than time spent explaining their work. As science advances, ever narrower fields arise, which widens the knowledge gap even between experts. This creates a need for experts and for guides to the technologies they create, which could be met by gradually educating people about the general topic, or parts of it, and its problems. There are experts doing actual experiments and searching for real solutions, but there are few intermediate tools or resources that introduce bioinformatics by showing how it works and what its tools are.

Proposed solution

A partial solution could be to advance research on these kinds of problems and to educate more people about bioinformatics. The proposed solution is to educate about the main topics of bioinformatics and about one of the best machine learning techniques, the support vector machine (SVM). The implementation extends this guide into a Java implementation of an interpretation of SVM called Least Similar Spheres (LSS).
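To give a rough feel for what an SVM does, here is a minimal sketch of an ordinary linear SVM in Java, trained by sub-gradient descent on the hinge loss. It is not the LSS implementation this guide refers to. The class name LinearSvmSketch, the toy two-dimensional data and the parameter values (lambda, learningRate, epochs, seed) are all assumptions chosen for illustration.

```java
import java.util.Random;

// Illustrative linear SVM only; NOT the LSS method from this guide.
class LinearSvmSketch {
    // Weights and bias of the separating hyperplane w.x + b = 0.
    final double[] w;
    double b;

    LinearSvmSketch(int dims) {
        w = new double[dims];
    }

    // Stochastic sub-gradient descent on the L2-regularised hinge loss.
    // Labels in y must be +1 or -1.
    void train(double[][] x, int[] y, double lambda,
               double learningRate, int epochs, long seed) {
        Random rnd = new Random(seed);
        for (int e = 0; e < epochs; e++) {
            for (int k = 0; k < x.length; k++) {
                int i = rnd.nextInt(x.length);      // pick a random sample
                double margin = y[i] * score(x[i]);
                boolean violated = margin < 1;      // inside or beyond the margin
                for (int d = 0; d < w.length; d++) {
                    double grad = lambda * w[d];    // regularisation term
                    if (violated) {
                        grad -= y[i] * x[i][d];     // hinge-loss term
                    }
                    w[d] -= learningRate * grad;
                }
                if (violated) {
                    b += learningRate * y[i];       // bias is not regularised
                }
            }
        }
    }

    // Signed score; its sign is the predicted class.
    double score(double[] p) {
        double s = b;
        for (int d = 0; d < w.length; d++) {
            s += w[d] * p[d];
        }
        return s;
    }

    int predict(double[] p) {
        return score(p) >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: two well-separated clusters.
        double[][] x = {{1, 1}, {2, 1}, {1, 2}, {5, 5}, {6, 5}, {5, 6}};
        int[] y = {-1, -1, -1, 1, 1, 1};
        LinearSvmSketch svm = new LinearSvmSketch(2);
        svm.train(x, y, 0.01, 0.01, 2000, 42L);
        System.out.println("Class of (1, 1): " + svm.predict(new double[]{1, 1}));
        System.out.println("Class of (6, 6): " + svm.predict(new double[]{6, 6}));
    }
}
```

The sketch learns a straight line that separates the two clusters with as wide a margin as the regularisation allows. A real receptor and ligand data set would need many more features, and probably a kernel, which this example leaves out.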
