
12: "Peptide Diversity in Cone Snail Venom: Integrating Mass Spectrometry and Next Generation Sequencing in the Analysis of Natural Peptide Libraries" by Professor P. Balaram, Molecular Biophysics Unit, Indian Institute of Science, Bangalore. 12 January 2017. 20 January 2016


Professor Padmanabhan Balaram received his Bachelor's degree in Chemistry from Fergusson College, University of Pune, followed by a Master's degree from the Indian Institute of Technology, Kanpur. He received his Ph.D. from Carnegie-Mellon University, Pittsburgh (1972) and underwent post-doctoral training at the Department of Chemistry at Harvard University, U.S.A. After his postdoctoral training, he returned to the Indian Institute of Science, where he has been a faculty member in the Molecular Biophysics Unit ever since. Prof. Balaram's main area of research has been the investigation of the structure, conformation, and biological activity of designed and natural peptides. He has extensively used techniques such as Nuclear Magnetic Resonance spectroscopy, Infrared spectroscopy, and Circular Dichroism, along with X-ray crystallography. He has been a major contributor to the evaluation of factors influencing the folding and conformations of designed peptides, and has investigated structural elements playing a key role in the formation of secondary structural motifs such as helices, beta turns, and sheets. Along with Isabella Karle, a frequent collaborator, he has also pioneered the use of alpha-aminoisobutyric acid to induce and retain helicity and constrain peptide conformations. Balaram has authored more than 400 research papers, and is a fellow of the Indian National Science Academy. He was the editor of the journal Current Science till June 2013. He is a recipient of the Padma Bhushan award (2014) as well as the TWAS (The World Academy of Sciences) Prize for the advancement of science in developing countries (1994).

My thanks to Prof. Anil Tyagi1 and Prof. Raghuram2 for inviting me to give this lecture.

It's a very special privilege to give a lecture named in memory of Yellapragada SubbaRow, especially when Sri S. P. K. Gupta3 is in the audience, because I think Sri Gupta has single-handedly brought back to us how much we owe to Yellapragada SubbaRow. I have watched this for almost 15 years, and it turns out that in this time there is a much greater awareness of Dr. SubbaRow's work in India than there was maybe 20 or 25 years ago. And I think the students of biochemistry, and I suspect biochemistry is taught in the biotechnology department, will realize how fundamental Dr. SubbaRow's contributions really are.

I am by training a chemist, and I have worked my entire career in a Biological Sciences Division at the Indian Institute of Science. Therefore, I have always been thinking about the relationship between chemistry and biology.


I came across this article, which I thought I would start with, where Arthur Kornberg, the discoverer of DNA polymerases, called chemistry "the lingua franca of the medical and biological sciences." Over the course of the last 50 years or so, as molecular biology and biotechnology have grown, biochemistry and chemistry in the biological sciences have sort of faded into the background. Even when students study biology, they are not often taught how important it is to have a strong grounding in chemistry. Arthur Kornberg4 suggested that the rift between the two cultures of chemistry and biology might derive from the apparently more right-brain-dominated character of biologists and the left-brain-dominated character of chemists.


What does this mean? This means that if you are working in physics or mathematics, you are logical and you think analytically. This of course requires the left brain. On the other hand, if you work in the humanities, social sciences and so on, the agronomist, the poet, the author, you are more creative and you use the right brain. What Kornberg suggested was that chemists use the left brain while biologists use the right brain, and those people who work right in the middle of chemistry and biology must be in between somewhere. Sometimes they must use one side of the brain; sometimes they must use the other side of the brain.

At this point, we might ask, where do we place Dr. SubbaRow? Dr. SubbaRow, as we have seen in this video, actually studied medicine. But he studied medicine in Madras Medical College at a time when the disciplines were not that compartmentalized, and in the study of medicine you would have a great deal of biochemistry and microbiology, all of this taught, just as in the study of engineering you would be taught a lot of mathematics and physics and chemistry. The result is that if you look back at Dr. SubbaRow's career and the kind of work that he did, it turns out that much of his work could really be classified as chemistry, because he discovered molecules in nature.


I am going to show you two slides. The only thing I want to tell you is that all the slides that come after these two I made for this talk, but these two slides I did not make for this talk at all. I made them many years ago, after I saw Mr. Gupta's book IN QUEST OF PANACEA5, and I put it on my slide.


Why did I put it on my slide? I was giving a talk or I was asked to give a talk. I was then the editor of Current Science, and writing a great deal about the growth of journals, the problem of citations, impact factor and so forth.


I came to give a talk here in Delhi. So, I prepared a list. This list is of the most highly cited publications in the literature of science. This is like asking: who are the batsmen who have scored the maximum number of runs in test matches, or who scored the maximum number of goals in the Football World Cup? This is the kind of question one might ask in science: which paper is ranked 1, which paper is ranked 10, and which paper is ranked 11. Rank No. 1 in the literature still belongs to Oliver Lowry's paper on the estimation of proteins6. The methods of biochemistry do well among the most highly cited papers. There is Bradford's protein estimation, which comes at No. 3. And there is Fiske and SubbaRow's paper7, which came in at 23 at that time. The colorimetric estimation of phosphorus was published almost a century ago. We should celebrate the centenary of this paper when it does happen. It should not be forgotten just because both Sri Gupta and I would have disappeared from the scene. You must still remember it, students! This is a marvellous paper. When I began my career, I actually used Fiske-SubbaRow reagents in Bangalore in the early 1970s. At that time, I came to know who SubbaRow really was.

Sri Gupta has made many films, and you have seen the video that was just screened a few minutes ago.


There have been remarkable discoveries, but I will mention just one: ATP. Fritz Lipmann8 is credited in the literature of biochemistry with having been the discoverer of ATP. It turns out that SubbaRow also discovered ATP. There is no molecule more central to the biochemistry and energetics of the cell than ATP. Look at the molecules he discovered. He died in 1948, but subsequently organic chemists used each one of these molecules as a target for synthesis. I myself was one of the 99 or 100 post-doctoral fellows who worked in Robert Woodward's lab at Harvard on the total synthesis of vitamin B12, which took about 15 years to do and ended in 1976. So these are molecules which figure in the literature of chemistry and biochemistry. They are historic events in chemistry and biochemistry. So it is remarkable that your university really recognizes and honours Dr. SubbaRow with an annual lecture.


What am I going to talk about? I thought I would talk to you again about the search for molecules, because after all Dr. SubbaRow's work consisted largely of searching for molecules which would be useful.


What I am going to talk to you about are molecules which are found in the venom of marine cone snails. I will explain the problem. These molecules, which are essentially peptides, target membrane receptors and channels, and they contain multiple disulphide bonds. Now, I am a laboratory scientist. I don't go to the sea, I don't know how to swim, I don't eat fish, I have no knowledge of seafood, yet I am working on snails. How did this work on snails come about? This project is really the brainchild of my late colleague, Prof. K. S. Krishnan9, who passed away a couple of years ago rather suddenly. Krishnan was a neurobiologist. He was also a naturalist. So, he went out into the field a great deal, and he arrived in my laboratory some years ago -- maybe about 12 or 13 years ago -- with a backpack. He got out the contents of this backpack and emptied them on my desk.


The first thing I noticed was a very strong and nasty smell coming from the contents that he had emptied. There were all these beautiful shells, but they smelled absolutely terrible because he had brought them straight from the sea. I asked him, "Please put them back." I helped him put them back. Then I asked him, "What do you want me to do with this?" He said, "Don't they look beautiful?" I said, "They look beautiful but they smell awful, and what do you want me to do?"


He explained to me about the marine cone snail venom and I read a little bit about it. The venom consists of hundreds of molecules which target every membrane and receptor in the central nervous system. The problem which bothers much of humanity is pain. If you want to have pain relief, you must have appropriate molecules which target the right receptors. So, Krishnan suggested that he collect shells and have a good time on the beach while I take the venom and extract molecules from this venom. It is largely a laboratory task, one which involves purification and characterization. He said, "Look here, you are a chemist, and this is what we should be doing." So I began this work.


The question you might ask is this: why do snails have so many molecules in the venom which are so potent in paralysing other creatures into which they inject the venom? This is because snails are very slow moving. They can't chase and catch their prey; they must immobilise their prey before they eat it, and they do this by shooting a cocktail of toxins which immobilise or paralyse the prey, and afterwards they can ingest it. So, there are snails which eat worms, there are snails which eat fishes, and there are snails which eat other molluscs. This is a classic problem in biology, the problem of predators and prey. Predators need to get the prey, and the prey need to avoid the predators.


So, one can collect snails. Their shells are all marvellous. If you go along the south-eastern coast of India, south of Chennai, all the way to Rameshwaram, you will find lots of shops selling these shells. In fact it is from one of those areas, where there are lots of shops devoted completely to putting shells out onto the market, that Dr. A. P. J. Abdul Kalam, the Rocket Man and former President of India, really comes.


The venom apparatus of a cone snail looks like this:


It is a marvellous apparatus. There is a bulb, then there is a long duct, and then there is a harpoon-like structure which is coated with venom. The snail flings it at the prey. This is rather like Moby Dick, using a harpoon to attack a whale. The molecules are synthesized here, and they are pushed out and coated onto the harpoon. This is a marvel of biology. It is a wonder what evolution has done. It has discovered a large number of molecules, molecules which have multiple disulphide bonds.


You can summarise the problem: there are many peptides, as there are many snails, and all of these are heavily post-translationally modified. Since they are heavily post-translationally modified, there must be many enzymes which are actually post-translationally modifying them.


So, there are lots of things to study here. Olivera10, who documented it at the beginning of the century, called this conotoxinomics, because today if you want to get anybody interested in a subject, you have to add the suffix "omics" to it. Today, if you ask students what they want to do, they say genomics or proteomics or metabolomics or whatever it is. Just remember that at the heart of "omics" of any kind there is biochemistry, the kind of biochemistry that Dr. SubbaRow actually practised. Conus venom has been developed as a painkiller largely because neurons communicate with one another across the synaptic cleft, which is a small bit of water. Molecules have to swim across this water. One cell will release the neurotransmitter, it will bind to the receptor on the other cell and be recognized, and then this chemical signal will be transduced into an electrical signal. It is a succession of chemical and electrical signals which is involved in propagation across the synapse. This is something that neurobiologists are very much interested in. Therefore a molecule which swims across this cleft is like one crossing the Palk Strait between India and Sri Lanka or the English Channel between Britain and the European continent. The molecules just come across a little bit of water, but they are recognized very specifically, whether it is acetylcholine or serotonin or any of the neurotransmitters, glutamate, glycine, all of which you will find in your textbook.


But structural biology has now gone very far and very fast. In the nicotinic acetylcholine receptor, the little molecule acetylcholine is recognized very specifically by this enormous pentameric structure.


Now, of course, I will show you in the next slide where one might imagine the conus peptides actually bind, because some of the conus peptides are very specific antagonists of the nicotinic acetylcholine receptors, which you want to block in case of pain. There are other molecules which block the glutamate receptor. Some molecules block the N-methyl-D-aspartate receptor, and so forth.


Now, these are nicotinic acetylcholine receptors, and this is a molecule which has in fact been used in clinical study as a pain reliever. It has been discontinued in clinical trials now because of efficacy concerns and low affinity for human receptors as compared to mouse receptors. But you can see the molecule. It is a short peptide with multiple cysteine residues and two disulphide bonds. This is the molecule which we investigated. In my own research, we are now finding sequences which are closely related to this but differ significantly in the sequence. I would now like to establish their biological efficacies, their targets, and so forth.


What I am going to tell you about in the remaining part of my talk is really how one actually starts with a venom and ends up with the sequences of molecules. The alpha-conotoxins are potential ligands for the nicotinic acetylcholine receptors. Of course, if you want to find molecules, you find that nature produces molecules only in the presence of a lot of other molecules.


For example, every Ayurvedic preparation is a large collection of molecules and you don't know which one really is responsible for the efficacy. Western medicine, the kind of medicine that Dr. SubbaRow practised, really worked through single pure molecules: tetracycline, the folates and so forth. Now the problem is separating mixtures, which are usually large libraries of molecules.


Why do I call them libraries of molecules? Of course the students, most of the students, don't know now what a library looks like and what the librarian actually does, because they sit in front of computers. But in the old days, libraries were the places where journals and books were. They were just thrown there by the postman or the supplier. It was the librarian's task to catalogue those books and put them in their right places, catalogue journals and so forth, and they had a system of classification.


We are effectively doing the librarian's job now. We have a large mixture of molecules. We have to find out what they are. We have to find out the context in which they work, and then we have to actually classify them.


You might have a mixture of substances. They might sometimes be the product of laboratory synthesis, and most often they are natural extracts of animal, plant, or microbial origin. What do we have to do with the mixture? We have to separate it. Today, if we were to separate it, we would use all the advanced methods of chromatography. Once you have got the pure molecules, you would use all the advanced methods of structure determination -- X-ray diffraction, nuclear magnetic resonance.


I am going to talk today about mass spectrometry.


I do not know how many of you are research students. A few of you are. How many of you have studied chemistry? A few of you. About 50 years ago, when I studied chemistry at Fergusson College in Pune, we used to be given for an examination a mixture of two substances, which we would have to separate and then analyse. The first thing you have to do is to separate the two substances. Then you have to find out what they are by doing some simple chemical tests. Students quickly found out -- and I was among them -- that the best way of finding out what the constituents of the mixture were was to befriend the attender who had actually mixed the substances together before the examination. The only way to do this was to cultivate him over the course of the academic year. You take him to the canteen, you give him a loan when he asks for it, and so forth. In this way, he would whisper in your ear during the examination what the substances were, and then you would go backward and write all the steps which were necessary, because the examiner would never come and actually find out whether you really separated the substances or not. I realized in later years that my education taught me more about human relations than it actually taught me analytical chemistry. It is sort of ironic that at the end of my career the work that I do could largely be classified as analytical chemistry or analytical biochemistry.


The nature of the molecules being investigated depends on how you got them. If you extracted the molecules with water, you would get all the water-soluble molecules, proteins, peptides, nucleic acids, sugars, and so forth, and you are a biochemist. If you extracted with an organic solvent, you would get lipids, terpenoids, steroids, alkaloids and all, and then you are an organic chemist. You will notice that in institutes, very old and very modern, all are mixed up in biotechnology, but in the regular university the Biochemistry Department and the Organic Chemistry Department will not talk to one another. In fact, in my own institution, which is over 100 years old, they are actually in two separate buildings. This is to ensure that they don't converse at all. The one individual who actually straddled the barrier between biochemistry and organic chemistry with great facility was Dr. SubbaRow, and that is how he made all the discoveries that he did in the 1930s and 1940s.


We now have to sequence peptides from natural mixtures. We purify them and then we sequence. You can sequence them in two ways. You can sequence them by Edman11 sequencing, which is the classical way. You react the molecule with chemical reagents, and that results in fragmentation. You isolate the amino acid derivative by chromatography and identify it. Or you sequence by mass spectrometry. It turns out that you hardly ever sequence peptides and proteins nowadays; you actually sequence genes. And most of the time it turns out that nucleic acid sequencing has been so automated that no student actually knows how to sequence DNA. What he does know is that you need some money to send it away to some company, and then it will be sequenced there. On the other hand, it turns out that peptide sequencing still needs to be done in the laboratory, and mass spectrometry is probably one way in which it can be done by an individual.


Conus venom is particularly suitable for mass spectrometry because it is heavily post-translationally modified. The amino acids are modified in some way, and you need mass spectrometry because Edman sequencing really fails. Here I would really like to tell you something about how science progresses. You need methods. Mikhail Tsvet12, the man who really invented chromatography, said a long time ago that an essential condition for all fruitful research is to have at one's disposal a satisfactory technique. He then quoted a French scientist and said somebody once remarked that unfortunately the methodology is frequently the biggest aspect of scientific investigation. Who was he quoting? He was quoting Rene Descartes, who said all scientific progress is a progress in method. Today it unfortunately turns out that Indian laboratories are flush with money. As Indian laboratories are flush with money, they are also flush with equipment, and it turns out that in laboratory after laboratory in India one of the saddest things about the scientific situation is how poorly the equipment is used and how few people are really technically competent to use these methods. So, it is very important for students to recognize that one must master one or two methods, and one must have sound methods which are one's own. It is not often that ideas come. Freeman Dyson13, the British theoretical physicist, said it very well. He said science is often driven by new technology rather than by new concepts. And I bring this to your attention because there is one word which I have really been worried about for the last 25 years. It is the word biotechnology, because it is not a scientific discipline. It is the result; it is, hopefully, the application of many scientific disciplines in solving the problems of biology. So, one needs chemistry; one needs microbiology; one needs biochemistry. You need all these disciplines to be studied individually.
And what is often taught in biotechnology is new technology rather than new concepts, and the new technology is PCR, genome sequencing and so forth. It is not that you never sequence. In the old days you had those long gels, so you had at least to run the gels, make the gels without breaking them, and so on. But today you don't have to do that. So, learning the theory of sequencing doesn't help you in everyday life. I want to use the rest of the time that remains to tell you that it is as important to be a technician as it is to be a professor.


The technique that I am going to talk to you about is mass spectrometry. Mass spectrometry, like every other technique that we use today for the analysis of molecules, has its origin in physics. J. J. Thomson's work on the electron is really the historical starting point of mass spectrometry. Subsequently, Aston discovered all the isotopes that you see using mass spectrometry. But then for a long time mass spectrometry was not used very much or very well in organic chemistry, because of difficulties in instrumentation.


In 1989, I heard a talk in my own institution which described the Nobel Prize in physics for the development of what was called the ion trap technique. I was a professor by then, and I came out of the lecture, which was given by one of my friends, with a conclusion.


I thought this ion trap method could have no possible relationship to anything that I do. Little did I realize that a few years later the ion trap would be at the heart of every mass spectrometer. I thought I would very briefly tell you, without any slide, what the ion trap really is and what the principle of the ion trap really might be.


Just imagine when you filter. When you filter something, you actually pour the stuff. There is filter paper in between. All the big stuff which is solid remains on the top, and all the small things which are in solution, the molecules, come out. Today, of course, biochemists and molecular biologists use more complicated filters. They will have a molecular weight cut-off -- a 3 kDa cut-off, a 10 kDa cut-off, sometimes even a 1 kDa cut-off -- and you pay some money for those filters. The ion trap is a marvellous gadget. What it does is this: if you have a molecule and you put a charge on it, you have a charged particle, an ion. The ion trap is a filter which excludes all the charged particles which are outside the mass range you select. So smaller particles will be excluded, bigger particles will also be excluded, and only what you want will be caught inside the ion trap, and you can measure it. That is, provided you want to measure the mass, because the mass is a biometric for the molecule. If I could measure all your masses very accurately, and if masses never changed, they would serve as a biometric for you too; unfortunately that is not the case. The masses of human beings keep changing. Otherwise, Mr. Nilekani could have put the mass to the 10th decimal place on to the Aadhar card. You can't do that. But for molecules the mass won't change, because an atom has a definite mass. A molecule is made up of atoms. Once you know the masses of the atoms, you know the masses of the molecules.


Now, there are two things about mass spectrometry which are very special. One is that it is a high-sensitivity technique. That is, you need very little material to measure the mass spectrum. The second is that it is a very high-resolution technique. That is, you can measure the mass of a molecule to the third decimal place. There are very few techniques which are both high resolution and high sensitivity. In NMR spectroscopy, for example, resolution and sensitivity go in opposite directions. If you increase the resolution, you will lose sensitivity, and so forth.


Mass spectrometry gives you the best of both worlds. You see here the kind of ion chromatogram you get when you inject a sample into the mass spectrometer. This detects the ions that reach the detector as a function of time; one then deconvolutes this to get the mass spectrum over there, and measures masses very precisely. Here, for example, all these molecules can be measured. You can see how many I have in the sample which has been injected into the mass spectrometer. This is a little bit of conus venom injected into the mass spectrometer, and in principle I can measure lots of molecules.


I told you in the beginning that these molecules have disulphide bonds. If you have disulphide bonds, the first thing you would like to know is how many. One way you can do that is to carry out a chemical reaction, reduction followed by alkylation, which is a known biochemistry trick. It turns out that all the old biochemistry and old organic chemistry of the 1950s have today become methods of processing samples before you put them into a mass spectrometer. Whenever you do a chemical reaction, barring an isomerisation reaction, you either add mass to the molecule or remove mass from the molecule. Therefore, mass spectrometry becomes a diagnostic for any chemical transformation of a molecule. You therefore know how many groups you have if you block them with N-ethylmaleimide. You will see the mass. You know how much mass you have to add per group, and the shift will always be a multiple of that number. So it's a very, very easy experiment. Then we go for the experiment. If you have a matrix-assisted laser desorption machine, this is what you will get when you do the reaction. The peaks will move, you see how much they move, and you immediately know how many disulphide bonds there are.
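
The counting argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monoisotopic masses are standard textbook values rather than numbers from the lecture, the example peptide masses are hypothetical, and the function name `count_disulphides` is invented here. It also assumes every cysteine in the native peptide is tied up in a disulphide bond.

```python
# Monoisotopic masses in daltons (standard values, not taken from the lecture).
H_ATOM = 1.007825
NEM = 125.047679  # N-ethylmaleimide, C6H7NO2

# Reducing one disulphide adds 2 H; alkylating the two freed thiols adds 2 NEM.
SHIFT_PER_DISULPHIDE = 2 * H_ATOM + 2 * NEM


def count_disulphides(native_mass, alkylated_mass, tol=0.05):
    """Estimate the number of disulphide bonds from the observed mass shift.

    Assumes no free thiols in the native peptide (an assumption of this sketch).
    """
    shift = alkylated_mass - native_mass
    n = round(shift / SHIFT_PER_DISULPHIDE)
    if n < 0 or abs(shift - n * SHIFT_PER_DISULPHIDE) > tol:
        raise ValueError("shift is not a multiple of the expected increment")
    return n


# Hypothetical peptide with two disulphide bonds.
native = 1500.000
alkylated = native + 2 * SHIFT_PER_DISULPHIDE
print(count_disulphides(native, alkylated))
```

With a different alkylating agent only the constant for the added group would change; the logic of dividing the shift by a fixed increment stays the same.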


So, this is what you really need to do: you have the sequence of amino acids, in the one-letter code here. It is sort of interesting: if you write P H Y S I C S in the one-letter code, you will get a peptide sequence. If you write C H E M I S T R Y, you get a peptide sequence. But if you write B I O L O G Y, you won't get a peptide sequence, because B is not a letter which is used. Neither is O a letter which is used. As a consequence, it seems that amino acid sequences somehow exclude BIOLOGY from them.
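
The word game is easy to check against the twenty standard one-letter amino acid codes. The sketch below assumes only the standard twenty letters count (extended or ambiguity codes are deliberately left out), and the function name `non_standard_letters` is made up for illustration.

```python
# One-letter codes of the twenty standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def non_standard_letters(word):
    """Return the letters of `word` that are not standard one-letter codes."""
    return sorted(set(word.upper()) - STANDARD_AMINO_ACIDS)


for word in ("PHYSICS", "CHEMISTRY", "BIOLOGY"):
    bad = non_standard_letters(word)
    if bad:
        print(word, "is not a peptide sequence; unused letters:", ", ".join(bad))
    else:
        print(word, "is a valid peptide sequence")
```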


But look at this. Peptides have basic groups and acidic groups: aspartic acid, lysine and so forth. You can protonate them. This is the simplest reaction: add a proton, remove a proton. If you add a proton, you get a plus charge. If you remove a proton, you get a minus charge. You need charged molecules to go into the gas phase, because mass spectrometry is a mass-to-charge measurement. And peptides and proteins are already charged in solution, because depending on the pH at which you have dissolved your sample it will have some charge or other.
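
Because the instrument reports mass divided by charge, the position of a peak depends on how many protons have been added or removed. A minimal sketch of that relation follows; the proton mass is the standard value, while the neutral mass of 1000 Da and the function name `mz` are purely illustrative.

```python
PROTON = 1.007276  # mass of a proton in daltons (standard value)


def mz(neutral_mass, charge):
    """Mass-to-charge ratio after adding (charge > 0) or removing
    (charge < 0) that many protons from a neutral molecule."""
    if charge == 0:
        raise ValueError("a neutral molecule is not detected")
    return (neutral_mass + charge * PROTON) / abs(charge)


# Hypothetical peptide of neutral mass 1000 Da in several charge states.
for z in (1, 2, 3, -1):
    print(z, round(mz(1000.0, z), 4))
```

The same molecule therefore appears at different positions for different charge states, which is one reason spectra are deconvoluted back to neutral masses.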


Remember whatever you have been taught about the isoelectric point and so forth. Then, biological polymers have a repetitive backbone. Only the side chains vary, and you can break the backbone bonds in the mass spectrometer very specifically at certain points and get fragments. Using that, you can actually sequence.


What is sequencing? Sequencing is nothing but a way of taking a molecule into the gas phase, breaking it, measuring the masses of the fragments, and then putting all the fragments back together. Take, for example, this glass. I won't break it. But in principle, I could break it. I could send all of you out of the room, smash it on the ground, call you back into the room and ask you, "What object have I broken?" You will pick up all the pieces and try to join them together. If I had broken it very gently, you would be able to do this very easily, because you would have very few pieces. On the other hand, if I had really smashed it, there would be so many small pieces that you wouldn't be able to put them back together.


Mass spectrometry has an advantage. The energy which is used for fragmenting molecules can be controlled on the spectrometer, so that you can fragment the molecule under different conditions: more fragments, fewer fragments, and so on and so forth.


But in order to sequence a large molecule, you need a large number of fragments. If you have a large number of fragments, you have a large number of peaks in the spectrum, and if you have a large number of peaks in a spectrum, you need a lot of time to interpret it.


If you have a jigsaw puzzle with 20 pieces, anybody can put it together. If you have a jigsaw puzzle with 2,000 pieces, it requires a great deal of concentration and patience to put it together. An adequate number of fragment ions is generally not obtained. So, we would like to have other methods which help us to sequence. Next generation sequencing is the technique that I really want to introduce you to in this particular problem. There are many ways, and this is the technical side. I won't spend too much time on this but will just show you some spectra. Each peak corresponds to a molecule, because it in turn corresponds to a mass. If I just take the crude venom, I can immediately estimate that I must have at least this many molecules. There may be many more molecules which I would not be able to find.


How many molecules are there in cone snail venom? It is estimated that one cone snail species might produce as many as a thousand molecules. How many cone snail species are there in the world? There are probably about 700 to 1,000. So, there are huge numbers of molecules out there with varying degrees of biological specificity. And Prof. Krishnan's idea was that we catch the cone snails which are unique to the Indian coast. We would have a venom which other people do not have, and we would get new molecules.


I was not particularly worried about whether we got new molecules or not, because at this time I was largely interested in trying to learn how to use mass spectrometry in sequencing. One can now analyse molecules. For example, you find peaks like this. The nice thing about mass spectrometry is that you don't need to know anything else. You only have numbers, and you only need to know how to add and subtract. If you know how to add and subtract, you will know the differences between numbers. For example, these two peaks are separated by 58 Dalton. You might ask, what is 58 Dalton? Glycine has a mass of 57, and one Dalton is the mass of a proton. One proton and one glycine have disappeared.


What is that? That is the well known post-translational modification in which a C-terminal glycine is converted into an amide at the end. There is an amidating enzyme which actually does this. Many of your brain peptide hormones are C-terminally amidated. And the same kind of post-translational modification is also used in the snail.


You can do simple things. What I used to do in college with great difficulty, you can do today with a little bit of thionyl chloride and methanol: esterify all the carboxylic acids. When you esterify them and make methyl esters, you add 14 for each one. If you have amino groups and you acetylate them, you add 42 for each one. In this way, by shifts in mass, you can determine how many carboxylic acids, how many lysines and so forth there are. You can determine the mass very accurately, as I show you here.
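
Counting groups from a derivatization shift is simple arithmetic, sketched below. The shift values are the standard monoisotopic increments for a methyl ester (CH2) and an acetyl group (C2H2O); the function name, the tolerance and the example masses in the test are assumptions made for illustration.

```python
METHYL_ESTER_SHIFT = 14.0157  # CH2 added per esterified carboxylic acid
ACETYL_SHIFT = 42.0106        # C2H2O added per acetylated amino group


def count_groups(native_mass, derivatized_mass, shift_per_group, tolerance=0.1):
    """Infer how many groups reacted from the observed mass increase.

    Raises ValueError when the increase is not close to a whole number
    of shifts (the tolerance, in Dalton, is an assumed value).
    """
    delta = derivatized_mass - native_mass
    n = round(delta / shift_per_group)
    if n < 0 or abs(delta - n * shift_per_group) > tolerance:
        raise ValueError("mass change is not a whole number of shifts")
    return n
```

The caller has to know which reagent was used: three methyl esters and one acetyl group both add about 42, so the shift alone does not say which reaction took place.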


These are isotopic patterns. Remember that the natural abundance of the carbon-13 isotope is 1.1 percent. Therefore, if a molecule has a hundred carbon atoms, on average about one of them will be carbon-13. If you have more than 100 carbon atoms, the second isotopic peak becomes larger than the first. This is the kind of isotopic pattern that you will have at smaller masses. As you go to larger and larger masses, the isotopic patterns will actually begin to shift.
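
The carbon-only part of this argument can be checked with a binomial distribution. The sketch below assumes that only carbon contributes to the isotope pattern (hydrogen, nitrogen, oxygen and sulphur are ignored), which is a simplification; the function name is illustrative.

```python
from math import comb

C13_ABUNDANCE = 0.011  # natural abundance of carbon-13, as quoted in the talk


def carbon_isotope_pattern(n_carbons, n_peaks=4, p=C13_ABUNDANCE):
    """Probability of finding 0, 1, 2, ... carbon-13 atoms in a molecule.

    Binomial model over carbon atoms only; other elements are ignored.
    """
    return [
        comb(n_carbons, k) * p**k * (1 - p) ** (n_carbons - k)
        for k in range(n_peaks)
    ]
```

In this model the ratio of the second peak to the first is n × 0.011 / 0.989, so the second peak overtakes the first at roughly 90 carbon atoms, in line with the "more than 100" rule of thumb.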


This is the typical kind of spectrum that one would get. This is what I do by hand. I know, for example, that aspartic acid has a residue mass of 115 and tryptophan 186. So when I find those spacings, I know this is aspartic acid and this is tryptophan, and I can write down the sequence.
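
Reading residues off the spacings between consecutive fragment peaks can be sketched as follows. The residue masses are standard monoisotopic values; the function name, the tolerance and the peak list in the test are assumptions, and real spectra contain many peaks that do not belong to a single ladder.

```python
# Standard monoisotopic residue masses (Dalton).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tolerance=0.3):
    """Name the residue(s) matching each gap between consecutive peaks.

    Ambiguous gaps (for example Leu/Ile) are joined with '/';
    gaps that match nothing are reported as '?'.
    """
    ordered = sorted(peaks)
    residues = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) <= tolerance]
        residues.append("/".join(matches) if matches else "?")
    return residues
```

The output is only a string of residues in order of increasing mass; it does not say whether that order runs from the N-terminus or from the C-terminus.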


When I do it this way, I don't know whether the sequence is being written this way or the other way, because I am not able to interpret every peak in the spectrum. All the students who are being educated in the new millennium will immediately ask the question, "Is there any computer program which would do all of this?" There must be some software sitting out there which will do it. Just think about it a little bit. If software did everything, then of course you would not be using the computer that you have. That will slowly fall into disuse, and by the time you are as old as Prof. Kannan13A and me, you will be like us old people, worrying about neurodegeneration, Alzheimer's and so forth. It is better to keep yourself occupied doing this.


Fortunately, there is no computer programme as yet which can really do de novo sequencing. But we can't get full sequences. So we turn to next generation sequencing and integrate it with mass spectrometry. What should we do now? We would now try to get some sequences. Some of you might ask, "Is there some cone snail genome which has been sequenced?" The cone snail genome has not yet been sequenced. There is a shortage of money in the West after the depression set in, and nobody wants to sequence the cone snail. A lot of money is probably needed, because it is a very large genome, and it will take considerable effort to actually do it.


But today you can do NGS and create what is called a transcriptome library. A transcriptome library is created by taking the mRNA, which effectively represents only the genes which are expressed, converting it into complementary DNA, and then breaking it up into small pieces and sequencing them. Once you sequence them (now it is only Illumina, but previously the work was done with 454 sequencing), you have these millions of nucleic acid bases.


Here I should let you in on a secret. I don't know anything about DNA sequencing. It just turns out that Prof. Krishnan arrived in my lab about five years ago, saying we are going to do a project on deep sequencing. I did not know what deep sequencing was. What worried me was the pronoun that he used. He said "we". "What am I going to do on that?" I asked. He said, "Don't worry." I said, "What is deep sequencing?" He said, "Look, I heard that it is a very powerful technique. What would you worry about? We first write the project." He wrote a project to the Department of Biotechnology and he also managed to get some money because, I think, everybody in Delhi likes the idea of deep sequencing, and this is right.


The question then was what you do with the data which comes, these millions of nucleotides. I didn't know what to do with it. He said he would hire a bioinformatics company. I didn't think much of this because this is outsourcing research. So he got a few guys from a bioinformatics company in Bengaluru, and those guys appeared in my lab and asked me, "What are you going to do?" After a little while, they started asking me, "How are we going to do it?" And I was again worried by this "we" because I did not have a clue what all this was about. So, I told them, "Okay, we will worry about it." And then I got myself a post-doc, who was trained in computer science and who had actually done bioinformatics for his Ph.D. He arrived in my lab and he spoke a language which I did not understand, and I spoke a language which he did not understand. Both of us drank lots of coffee but we did not communicate with one another. Eventually, around the time I retired (you know, it is very useful to retire because when you retire you can actually think of something), I began to think about what this problem is, and I explained it to him.


What is NGS sequencing? I don't know what it is that I am going to find. But I know roughly what it should look like. It should have a signal sequence. It should have the toxin sequence at the C-terminus. I know what toxin sequences look like: they are cysteine-rich. I know it should be bigger than a certain length. Think of a big book. My students sometimes do this: they produce a PhD thesis like this, but with the pages unnumbered. Suppose you drop it on the floor and want to put it back together. You have a thousand pages to read. You read the first page you pick up, and then you have to read it in context. So you read all the pages and decide which one goes after this and which one goes before this. After a little while you go mad and you give up.


That's what sequencing is all about. How is this problem to be solved? This problem of assembly is of course best solved by the computer scientist and the mathematician. You should leave it to them. They produce the algorithm, and then the computer scientists produce the software. There are many assembly programs available. Even to try all of them out, you have to be able to recognize the genes. The gene would look like what I have written down at the bottom here. There is my cysteine-rich sequence at the end, and the signal sequence I can recognize. Once I recognize these cysteines, I know that is the gene. We began slowly developing a procedure.


How would the computer scientists actually do this? They do nothing but use Watson-Crick base pairing and stitch the pieces together. It is a painful process but the computer can do this very well. After all, the triplet code is known. Remember, once I stitch sequences together like this, I get long stretches of assembled sequence. I don't know how to read it. So, what do you do? You read it this way, then you shift by one and read again, shift by one and read it again. You read it in all three reading frames this way. Then you go back the other way and read it in all three reading frames the other way. Six reading frames are now translated. Five-sixths of the data is noise, and one-sixth of the data is signal. If you are able to recognize the signal, you are in business.


And do we know how to recognize those signals? We use multiple assembly programmes. We write our own little things. We take all sequences which lie between two stop codons and which are beyond a certain length, and then examine all of them. Once we have got all these meaningless sequences, we BLAST them against the database of Conus sequences and against the NCBI (National Centre for Biotechnology Information) database, and eventually we pull out the genes.


Once you have the gene, you are ready to go. You are into stamp collecting. Remember that a physicist many years ago, in the 1930s, said that there are two kinds of science: physics and stamp collecting. I don't know what he would have said about today's "omics". But today in biology, sequence collecting is a very fashionable and interesting area because you will find something in the sequences eventually.
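
The six-frame reading and the stop-to-stop filtering described here can be sketched as follows. This is a minimal illustration, not the group's actual pipeline: it uses the standard genetic code, the function names are made up for this example, and the length and cysteine thresholds are assumed parameters.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the Watson-Crick partner strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate codon by codon; '*' marks a stop, 'X' an unknown codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Three frames on the given strand, then three on its reverse complement."""
    dna = dna.upper()
    return [
        translate(strand[offset:])
        for strand in (dna, reverse_complement(dna))
        for offset in range(3)
    ]


def stop_to_stop_segments(protein, min_length):
    """Keep stretches between stop codons that reach an assumed minimum length."""
    return [seg for seg in protein.split("*") if len(seg) >= min_length]


def is_cysteine_rich(segment, min_cysteines=4):
    """Crude toxin-like filter; the threshold is an illustrative assumption."""
    return segment.count("C") >= min_cysteines
```

Segments that survive such filters would then be compared against known sequences, which is the database search step mentioned above and is not shown here.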


We get a lot of full genes and partial genes. We get a lot of toxin sequences, and now we are identifying them by mass spectrometry. Once I had identified a gene, I saw in it a little piece with DW, the DW which I had already recognized in the mass spectrum. What I can now do is calculate what the mass of this mature peptide should be, match it with what I observe, and then go back to the spectrum and try to interpret it in terms of the sequence.
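
Calculating the expected mass of a mature peptide from a translated sequence, and matching it against observed masses, might look like the sketch below. The residue masses and corrections are standard monoisotopic values; the function names, the tolerance and the sequences in the test are assumptions, and only two modifications (disulphide bonds and C-terminal amidation) are handled.

```python
# Standard monoisotopic residue masses (Dalton).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HYDROGEN = 1.00783
AMIDATION_SHIFT = -0.98402  # C-terminal OH replaced by NH2


def peptide_mass(sequence, disulfide_bonds=0, amidated=False):
    """Monoisotopic mass of a neutral peptide with the given modifications."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass -= disulfide_bonds * 2 * HYDROGEN  # each S-S bond loses two hydrogens
    if amidated:
        mass += AMIDATION_SHIFT
    return mass


def matching_masses(predicted, observed_masses, tolerance=0.1):
    """Observed masses within an assumed tolerance (Dalton) of the prediction."""
    return [m for m in observed_masses if abs(m - predicted) <= tolerance]
```

A match in mass is only a first filter; the fragment spectrum still has to be interpreted against the candidate sequence.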


This is what I mean by integrating mass spectrometry and the next generation sequencing.


Of course, the next step, once I have got the sequence, should be to assign every peak in the mass spectrum. But it is not easy to assign every peak in the mass spectrum. If I had an NMR (Nuclear Magnetic Resonance) spectrum, I would have to assign every peak in it. If I have a mass spectrum, I don't have to assign every peak, for nobody can assign every peak in a mass spectrum. Molecules break in somewhat complex ways. So there are lots of sequences. This is what I have been playing with for a while; it is sort of interesting and it keeps you occupied. This is like doing Sudoku. Once you start doing it, you keep doing it regardless. Some of my principals ask what you will do with sequences. My answer is, I don't care. They look interesting. Some of them look nice if you keep looking at them.


I heard the famous molecular biologist Sydney Brenner once in Bangalore. What he said about sequences was that he wouldn't allow them to be handled by computers but would read them. Sometimes when you read them and keep looking at them, you see something, you see some patterns, and that allows you to go back. I can go back and interpret the mass spectrum. If I have peaks which are, for example, 16 Dalton apart, one oxygen atom has been added. Proline has become hydroxyproline. That's a post-translational modification in the cone snail. If I lose 58, a C-terminal glycine has been converted into a C-terminal amide. One can now identify post-translational modifications, one can do all of the assignments and so forth.


I will quickly go over this to tell you that what we use today is electrospray ionization mass spectrometry, which was invented by John Fenn14. John Fenn should be an inspiration to all of us who are beyond a certain age because he invented electrospray ionization at the age of 67. By the time he invented it, his university, Yale University, a very respectable university, said, "You are too old, you better get out." So, he left and went to a small university called Virginia Commonwealth University, which gave him shelter, and he subsequently received the Nobel Prize. This tells you that you might find something if you keep at it long enough.


Electrospray allows you to put things through a liquid chromatography column and take them directly into the mass spectrometer. You are in effect using the mass spectrometer as a detector. All these peaks are mass-detected peaks. You can integrate across these peaks to get the mass spectrum of each of them. For example, I got about 40 peaks that I could count, but each peak might have seven or eight peptides. I might get 300 or 400 masses without any difficulty.


We analyse these peaks, we identify molecules, and then we compare the molecules with the sequences which we have got from NGS sequencing. If we have got them from NGS sequencing, we are sure of the sequence. If we haven't got them from NGS sequencing, we have to de novo sequence them, which is difficult. It requires the molecule to be isolated, derivatized, enzymatically cleaved and so forth. All the old techniques of biochemistry have to be used. It is only the detection with the mass spectrometer which is new. One can do this, and identify many interesting molecules.


I won't show you much of this, but I will tell you about a problem which I think students must know, particularly students of today.


Proteins have disulphide bonds. Sanger's15 problem with insulin was not only to determine its sequence but also to determine the disulphide bonds which are present in it. The disulphide bonds had to be determined. If you ask how many disulphide bond determinations have been done since the early days of protein biochemistry, I would say, not too many. When molecular biology came in the 1970s, most of the proteins whose genes you were looking at were intracellular, which don't have disulphide bonds, and therefore the disulphide bond was forgotten. If you have two cysteines, there is only one possibility. But if you have four cysteines, there are three possibilities. If you have six cysteines, there are 15 possible connections. It keeps on going. If you have 34 cysteines, you get that many possibilities.
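
The counts quoted here (1, 3, 15, ...) follow from pairing up the cysteines: the first cysteine can choose any of the remaining n - 1 partners, the next unpaired one any of the remaining n - 3, and so on. A minimal sketch, assuming every cysteine is paired and with an illustrative function name:

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair all cysteines into disulphide bonds.

    Computes (n - 1) x (n - 3) x ... x 3 x 1 for an even number n.
    """
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need an even, non-negative number of cysteines")
    count = 1
    for partners in range(n_cysteines - 1, 0, -2):
        count *= partners
    return count
```

For 34 cysteines this gives 6,332,659,870,762,850,625 possible connectivities, about 6.3 × 10^18.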


How many of you know which protein has 34 cysteines? It is one of the most abundant proteins in your body, albumin. If you go to Dr. Sarin's Liver Institute16, you will find that they are perpetually infusing albumin into patients whose livers have begun to fail. You might say, "If there are so many disulphide bond possibilities, I hope they are infusing albumin which has the right disulphide connectivity." Human albumin has the right disulphide connectivity, but after it is processed a little bit, you don't know. There is a 35th cysteine, which is a free cysteine. If you have a free cysteine, you can have thiol-disulphide interchange. Therefore, the disulphide bonds can get scrambled. The problem of determining disulphide connectivity is not a trivial one.


What we have been trying to do is to work on methodology, which I will quickly show you, to determine this by mass spectrometry. There are two ways of breaking the disulphide bond: you can lose either the alpha hydrogen or the beta hydrogen. One way you get dehydroalanine and cysteine persulphide. The other way you get cysteine thiol and a thioaldehyde. This is the most favoured pathway, and this way you will also lose H2S2. So, you can recognize these fragmentations by mass spectrometry and hope that if you break the molecules, you will get fragments. Each fragment should contain only one disulphide bond, and then you find out what the fragments are. That's what we are trying to do. In simple cases, this is possible. In more complex cases, it becomes difficult. We have in fact been trying to develop an algorithm. We have the algorithm and we have a working programme, which allows you to interpret mass spectral fragmentations. This is not very easy. I have used this in collaborations in cases where there has been a controversy in the literature, and tried to use mass spectrometry to give the correct answer. The wonderful thing is that when there is a controversy over two possibilities, one side of the controversy will like your result anyway, even if it is wrong. Therefore, you can in fact get your papers published without too much difficulty. I still think that this method needs a considerable amount of effort and refinement even now.


The snail itself generates diversity in many ways: by post-translational modifications and by hypermutation of the toxin genes. They are extremely mutable in the C-terminal region, and the strange thing is that the cysteine codons are very often maintained. So, from one gene you can have five peptides coming out, with unique biological activities.


One question you can ask is, "Does the snail itself have one more way of generating diversity, by having different disulphide connectivities of the same molecule?"


Here is our HPLC separation. We can take the fractions: for example, fraction 11 has this mass spectrum and fraction 14 has this mass spectrum. They are identical in mass but they are separated in their retention time by quite a lot. Now you can go back, linearise them and sequence them, and find that they have the same sequence. This is what the linearised mass spectrum of one fraction looks like (projected on the screen), and this is what the linearised mass spectrum of the other fraction looks like.


If you look at both of them, you will see that they are the same. They are identical in sequence but appear in different places. Therefore, they must be different disulphide isomers, but you need additional work to actually establish which isomer each one is among the different possibilities.


A Conus toxin which acts against a specific target, like the norepinephrine transporter, can be analysed, for example. There are assays available. Unfortunately, these assays are not easily available. They are not available in my laboratory. Anyway, my laboratory is practically closed down. But with Prof. Olivera's help one can analyse some of these sequences for this specific activity. This is what we are doing.


Concluding this talk, what I would like to tell you about is membrane channels and receptors. Membrane channels are diverse; membrane receptors are diverse. You need a diversity of molecules in order to target all of these properly. For matching ligands and their receptors, Emil Fischer gave us a wonderful metaphor, that of the lock and the key. I put the lock on one side and the key on the other side.


Some two and a half years ago, I retired and I shifted my residence. It turned out that I found lots of locks which I had carried through my entire 40 years of existence as a faculty member, and I had lots of keys. With typical south Indian conservativeness, I thought I might still salvage some locks and some keys, which might come in useful somewhere or other. The only way of doing this is to try every key in every lock, and once you have got a pair, you separate them. Try as I did, I could not find any matching pairs of locks and keys.


This is the problem you sometimes find, with the diversity of biological natural products on one hand and the diversity of membrane receptors on the other.


Back to the acetylcholine receptor: here are all the sequences, and you can see where they are supposed to bind. We have all the molecular weights exactly; we have the masses. We can ask, "Are they present in the venom?"


This is the HPLC trace, a small segment of the HPLC trace. If you do what is called an extracted ion chromatogram, you can actually pick up these masses and their positions very precisely. Once you have them, you know which of these molecules are there in your venom, and you can take those fractions. Then you know which receptor you actually need to assay them against. There is lots of information which comes from such extracted ion chromatograms. You have complex chromatograms detected by the mass spectrometer. You put in the mass of an ion with a tolerance of 0.1 Dalton, and it picks out those regions of the chromatogram where ions of that mass appear. You integrate across them and obtain the mass spectrum, and then you try to analyse what molecule this is.
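
An extracted ion chromatogram is simple to compute once the data are in hand. The sketch below assumes a hypothetical in-memory layout, a list of (retention time, peak list) pairs with each peak given as (mass, intensity); real instrument files need a reader library, which is not shown. Only the 0.1 Dalton tolerance comes from the talk.

```python
def extracted_ion_chromatogram(scans, target_mass, tolerance=0.1):
    """Summed intensity near target_mass for every scan.

    scans: list of (retention_time, [(mass, intensity), ...]) pairs,
    an assumed layout chosen for this illustration.
    """
    trace = []
    for retention_time, peaks in scans:
        total = sum(
            (intensity for mass, intensity in peaks
             if abs(mass - target_mass) <= tolerance),
            0.0,
        )
        trace.append((retention_time, total))
    return trace


def apex(trace):
    """Retention time and intensity at the highest point of the trace."""
    return max(trace, key=lambda point: point[1])
```

The apex tells you which stretch of the chromatogram, and hence which fraction, to go back to for the full mass spectrum of that ion.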


Since I was trained in chemistry, I never fail to be amazed, each time, by the kind of technology that there is.


I suspect that most of you would take all the technology that you find in the laboratories for granted. If you take them for granted, you will never become an expert in those technologies.


One word about biology. Biology is really a marvellous field because there are so many wonderful phenomena. You have this diversity of receptors, and you have diversity of ligands, all available in nature.


The biochemistry helped you years ago. I have not been able to find the reference, but I read once that for every enzyme that nature has made, there is somewhere else a ligand which would inhibit it. That's how natural products inhibit enzymes which they never ever see. But in the case of receptors, they are there for a specific reason: they are involved in signalling within the organism. Predators which want to disrupt that kind of signalling will make molecules against them. So, predator molecules would evolve; prey receptors would also evolve. So, there is constant evolution of chemistry. Nature is a far more sophisticated chemist than any chemist who ever lived. And in thinking about this, I would like to tell you that cone snail venom has insulins. These are monomeric insulins, which do not aggregate. The question really is, "Do they bind to the cone snail receptors?" Interestingly, there are two kinds of cone snail insulins: insulin which binds to the cone snail receptors, and insulin which binds to fish receptors, which the snail uses to try to knock out its prey. Hypoglycaemic shock is the first line in knocking out a prey. Since both toxin and receptor sequences evolved under the selective pressures of evolution, we must really have in our minds what is called the Red Queen Hypothesis.


I don't know how many of you have heard of this. The Red Queen Hypothesis refers to "Through the Looking Glass", the sequel to "Alice in Wonderland" written by Lewis Carroll. Alice comes across the Red Queen and they are then running around. Alice notes that there is a large checkerboard which has squares of different colours. It is a vulnerable game to run from square to square. But Alice is very observant. She says to the Red Queen, "You seem to be running a lot, and don't seem to be getting anywhere." Almost like our economy sometimes. The Red Queen tells her, "Now, here, you see, it takes all the running you can do to keep in the same place." That of course is what we sometimes sense in our research also.


What it means is that interaction remains the same. Their interaction cannot really be very much different because their interaction is determined by the physical forces between atoms. One configuration of atoms is altered, and then another configuration of atoms is evolved to actually target it.


When I saw this paper, I was really impressed by how much we don't know about biology. This paper by Lubec17 and his collaborators appeared in PNAS. A peptide which we had sequenced also appears in a butterfly, Hebomoia glaucippe. It turns out that it is only in the wings of the butterfly but not in other parts of the body. What is the toxin doing in the butterfly? It turns out that this is to dissuade the predators, mantis and gecko, from chewing on the wings of the butterfly, immobilizing it and eating it up.


There is a great deal of work to be done in finding relationships between these molecules elsewhere in nature.


I saw once a cartoon about biochemistry versus proteomics. What it all means is that in biochemistry you are looking at a single fish, and in proteomics you are trying to catch multiple fish at the same time.


I am at heart a biochemist, but the pressures of the modern era require that you learn a little bit of the other techniques in order to stay alive.


Last of all, I would like to make an acknowledgement to my institution18, which has been my home for 43 years and which I am about to leave. I don't think that I could have found a better place to work for all these years.


Thank you very much.

References:

(1) Vice Chancellor of GGSIPU.

(2) Dean, School of Biotechnology, GGSIPU

(3) Biographer of Dr Yellapragada SubbaRow.

(4) Arthur Kornberg (1918-2007) discovered mechanisms in the biological synthesis of DNA and jointly won the 1959 Nobel Prize in Physiology or Medicine.

(5) IN QUEST OF PANACEA, Successes and Failures of Yellapragada SubbaRow (New Delhi: Evelyn Publishers, 1988).

(6) Nicole Kresge, Robert D. Simoni and Robert L. Hill The most highly cited paper in publishing history: protein determination by Oliver H. Lowry J. Biol. Chem. 2005, 280:e25.

(7) Fiske-SubbaRow Method (http://www.jbc.org/content/66/2/375.short)

(8) Fritz Lipmann (1899-1986) discovered coenzyme A and jointly won the 1953 Nobel Prize in Physiology or Medicine.

(9) Prof. K. S. Krishnan (1946-2014) the biophysicist was on the faculties of Tata Institute of Fundamental Research, National Centre of Biological Sciences and Indian Institute of Science, and was a passionate watcher of birds, wasps, fruit flies and snails.

(10) Baldomero Olivera (b. 1941, Manila), chemist, discovered many cone snail toxins, a breakthrough in the study of ion channels and neuromuscular synapses.

(11) Pehr Edman and Agnes Henschen Sequence Determination Protein Sequence Determination pp 232-279

(12) Mikhail Tsvet (1872-1919): Russian botanist who invented adsorption chromatography.

(13) Freeman Dyson FRS, theoretical physicist and mathematician known for work in quantum electrodynamics.

(13A) Prof. Krishnamoorthy Kannan, protein chemist, founder Dean and Professor of the School of Biotechnology at the Guru Gobind Singh Indraprastha University in Delhi. An ardent admirer of YSR, he initiated the 'Dr SubbaRow Memorial Lecture' at IPU in 2002.

(14) John Fenn (1917-2010), analytical chemist, shared the 2002 Chemistry Nobel for work in mass spectrometry.

(15) Antony O. W. Stretton The First Sequence: Fred Sanger and Insulin Genetics October 1, 2002 vol. 162 no. 2 527-532

(16) Dr. Sarin's Liver Institute: The Institute of Liver and Biliary Sciences at Vasant Kunj, New Delhi, headed by Dr S K Sarin.

(17) Bae N, Li L, Lödl M, Lubec G. Peptide toxin glacontryphan-M is present in the wings of the butterfly Hebomoia glaucippe (Linnaeus, 1758) (Lepidoptera: Pieridae). Proc Natl Acad Sci U S A. 2012 Oct 30;109(44):17920-4.

(18) My institution: Indian Institute of Science, Bangalore, India

Acknowledgements:

The initial transcription of the presentation was done by S. Prabhakar with inputs from Natasha Jha from the video recording by Chetan Choudhary and Mayank Singh Rajput, students of the School of Biotechnology of Guru Gobind Singh Indraprastha University, New Delhi. It was subsequently edited for accuracy of scientific terms by Dr. Dinesh Kumar Jaiswal, Dr. Neha Gupta and Mr. Pradeep Kumar, members of Prof. N Raghuram's lab, with the help of the video recording by University Photographer Ratnesh.

