What is a virus?
A virus is an infectious agent that can replicate only inside cells. It is usually not regarded as a microorganism, and it is arguably not a living thing. It is essentially genetic information wrapped in a coat of protein molecules that forms a shell called a capsid. Sometimes, as in the case of coronaviruses, it also has an outer membrane made of lipids, with various proteins embedded in it.
How does SARS-CoV-2 work?
A virus can exist in two different forms: either as a virus particle, called a virion, or inside a host cell, where it uses the mechanisms of the cell to replicate. The pictures we are familiar with from the media, and also our visualizations, show the virion.
Virions have no means of moving on their own. They rely solely on the environment to push them around until they happen to bump into a cell they are able to infect. With coronaviruses, this typically happens when an infected person coughs, sneezes, or talks, and tiny droplets of saliva containing virions spread from their mouth or nose. If such droplets end up on the face of another person, the virus might have a chance to get inside their body and infect some of their cells.
The surface of the virion is covered by a lipid bilayer, a fatty membrane that the virus took from the infected cell as it was finding its way out. This membrane is easily broken down by soap. That is why washing hands thoroughly with soap is effective in preventing the spread of COVID-19, the disease caused by the SARS-CoV-2 virus.
Embedded in the membrane are several proteins. In the case of SARS-CoV-2, these are:
- Spike protein: The molecules of this protein form the characteristic spikes seen in many coronavirus images, which also give coronaviruses their name. The virus uses the spike protein to find cells it can infect and to attach to them. In SARS-CoV-2, this protein attaches to ACE2, a receptor found in the membranes of cells in the lungs, arteries, kidneys, and intestines. After attaching to the ACE2 receptor, the spike changes its shape, or conformation, and the viral membrane fuses with the membrane of the attacked cell. As a result, the genetic material of the virion, which is enclosed inside the viral membrane, is injected into the cell.
- Envelope protein: This protein assists in the process of budding as new viruses are formed. In this process, the virus takes part of the lipid membrane of a host cell and uses it to make its own envelope. Only a few of the envelope protein molecules are left in the mature, infectious virus.
- Membrane protein: This protein packages the viral genome inside the virion. This is the most numerous protein in the viral membrane of SARS-CoV-2.
Inside the viral envelope is the genetic information of the virus. It is encoded in single-stranded RNA, which is scaffolded and protected by many copies of the nucleocapsid protein.
A virus cannot replicate on its own. Instead, it hijacks the replication mechanism of the host cell. Once the virus attaches to the cell and injects its RNA into it, the machinery of the cell is forced to produce the individual components from which new copies of the virus are formed. These components are the viral proteins encoded by the RNA of the virus. A single cell can make thousands of virions, which then exit through a transportation system of the cell and drift around until they happen to hit another cell that they can infect.
Making of the SARS-CoV-2 model
The goal of this project was not only to create an accurate atomistic model of the coronavirus particle, including its insides, but also to do it in a way that lets the model be easily updated. The frequency with which new knowledge is discovered made us choose algorithmic approaches instead of traditional manual modeling. As a result, our model is computed rather than handcrafted, and by changing the input parameters, we can update it to reflect the latest theories and discoveries made by biologists.
Input data
What exactly a virion is made of cannot be seen in any microscope. Microscopes show us the overall shape of the virion, but the exact molecular composition has to be determined through various other techniques. Once biologists figure out which molecules the virion is made of, what these molecules look like, and how they are organized, we can use computer algorithms to create a 3D model that corresponds to all of these findings.
Building such a model typically starts with defining the overall shape. Electron microscopes can only show us pictures of cross-sections of individual virus particles. These contours can give us a pretty good idea of what the virion looks like in 3D. Once we build the 3D model of the overall shape using statistical methods based on the electron microscopy images, we can populate it with many copies of models of the individual molecules that make up the virus.
Protein models are normally available from the Protein Data Bank (PDB), a database of atomic models of proteins and other organic molecules. Each model describes the positions of the individual atoms that make up a molecule of the given protein. However, no models of SARS-CoV-2 proteins were available yet, so biologists had to prepare them for us according to their most current research.
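To give a feel for what "positions of individual atoms" means in practice, here is a minimal sketch of reading atom coordinates from text in the PDB file format, where each ATOM record stores x, y, and z in fixed columns. The function names and the three sample atoms are made up for illustration; this is not the code or data used for the model.

```python
def format_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build one ATOM record using the fixed column layout of the PDB format."""
    return (
        f"ATOM  {serial:5d}  {name:<3s} {resname:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
    )


def parse_atoms(pdb_text):
    """Return (atom_name, x, y, z) for every ATOM/HETATM record, in angstroms."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            name = line[12:16].strip()   # columns 13-16: atom name
            x = float(line[30:38])       # columns 31-38: x coordinate
            y = float(line[38:46])       # columns 39-46: y coordinate
            z = float(line[46:54])       # columns 47-54: z coordinate
            atoms.append((name, x, y, z))
    return atoms


# Made-up coordinates for three backbone atoms (illustrative only).
SAMPLE = "\n".join([
    format_atom_line(1, "N", "MET", "A", 1, 11.104, 6.134, -6.504),
    format_atom_line(2, "CA", "MET", "A", 1, 11.639, 6.071, -5.147),
    format_atom_line(3, "C", "MET", "A", 1, 10.500, 5.500, -4.000),
])

for atom in parse_atoms(SAMPLE):
    print(atom)
```

A full protein model is simply many thousands of such records, one per atom.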
Modeling
The first step in creating the model is to cover the surface of the 3D hull of the virion with the lipid bilayer. For this, we developed an algorithm that works automatically for any shape we want. You can read a paper that describes the technique.
Similarly, the RNA stored within the virion is generated with a procedural algorithm, so that we don't have to model it by hand. The technique used here is based on our research on how to generate molecules like RNA very quickly on the GPU. You can read about it here.
Research on the molecular composition of SARS-CoV-2 is still ongoing. In general, when building models like this one, we can only ever incorporate the current state of knowledge in biology, which might change with every new discovery or experiment. Therefore, this model is built using a statistical modeling approach, which we described in this paper. With this approach, a few molecules of each protein are placed in the scene by hand so that they reflect the proteins' mutual relationships and positions. Given this arrangement, together with the number of molecules of each protein that should be present in the virus particle, a rule-based algorithm populates the entire scene with the proteins.
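The idea of populating a surface from a handful of rules can be illustrated with a toy example. The sketch below is not the method from the paper: it treats the virion as a perfect sphere, uses a single rule (no two proteins closer than a minimum spacing), and places proteins by simple rejection sampling. The protein counts, the 50 nm radius, and the 8 nm spacing are hypothetical placeholder values, not measured ones.

```python
import math
import random


def random_point_on_sphere(radius, rng):
    """Uniform random point on a sphere: normalize a 3D Gaussian sample."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return (radius * x / norm, radius * y / norm, radius * z / norm)


def populate(radius_nm, counts, min_spacing_nm, seed=0, max_tries=100_000):
    """Place the requested number of each protein on the sphere surface.

    A candidate position is accepted only if it keeps at least
    min_spacing_nm (straight-line distance) from every protein placed so far.
    """
    rng = random.Random(seed)
    placed = []  # list of (protein_name, (x, y, z))
    for name, count in counts.items():
        accepted, tries = 0, 0
        while accepted < count:
            tries += 1
            if tries > max_tries:
                raise RuntimeError(f"could not place all '{name}' proteins")
            p = random_point_on_sphere(radius_nm, rng)
            if all(math.dist(p, q) >= min_spacing_nm for _, q in placed):
                placed.append((name, p))
                accepted += 1
    return placed


# Hypothetical counts, chosen only to make the example run quickly.
counts = {"spike": 20, "envelope": 5, "membrane": 60}
placed = populate(radius_nm=50.0, counts=counts, min_spacing_nm=8.0)
print(len(placed), "proteins placed")
```

Changing the input counts or the spacing rule and rerunning the function produces a new arrangement, which is the practical benefit of a computed model over a handcrafted one.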
This approach, as opposed to traditional 3D modeling, allows us to easily adapt the model to any new findings biologists may have. Our rendering pipeline is then immediately able to display the virion in high resolution. In this way, we can support scientists in generating hypotheses about the virus, since we can reflect their findings in our model in a very short time.
Colors
A virus particle is very small: a SARS-CoV-2 virion is around 50 to 150 nm in diameter. If you placed 1000 of them side by side, the row would be about as long as a single hair is thick. Or you can imagine it this way: say there is a single virion sitting on your palm. If the virion were the size of a marble, your hand would be about 15 km across.
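These comparisons can be checked with a little arithmetic. The sketch below assumes a 100 nm virion (the middle of the stated range), a 1.5 cm marble, and a hand about 10 cm across; the marble and hand sizes are assumptions made for this illustration.

```python
NM = 1e-9  # one nanometer in meters

virion_m = 100 * NM   # assumed virion diameter: middle of the 50-150 nm range
marble_m = 0.015      # assumed marble diameter: 1.5 cm
hand_m = 0.10         # assumed hand width: 10 cm

# 1000 virions in a row, compared with the thickness of a hair.
row_m = 1000 * virion_m
print(f"1000 virions side by side: {row_m * 1e3:.2f} mm")

# Enlarge everything by the factor that turns a virion into a marble.
scale = marble_m / virion_m
scaled_hand_km = hand_m * scale / 1000
print(f"magnification factor: {scale:,.0f}")
print(f"hand at that magnification: {scaled_hand_km:.1f} km")
```

With these inputs the row of virions comes out at about a tenth of a millimeter, and the enlarged hand at about 15 km.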
A virion is smaller than the wavelength of visible light, and so we can never see it, not even with the most powerful light microscope. We can see it in an electron microscope, because electron microscopes don't work with light, but with electrons, which have much smaller wavelengths. The images we see of the virus are made by a computer, which assigns colors to the individual parts of the image. These colors do not reflect what the virus actually looks like. Since it is smaller than the shortest wavelengths of visible light, we can say that a single virus particle does not have any color.
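To put numbers on "much smaller wavelengths", the sketch below computes the de Broglie wavelength of an electron using the standard relativistic formula. The 100 kV accelerating voltage is an assumed, typical value for a transmission electron microscope, and 380 nm is taken as the approximate short-wavelength edge of visible light.

```python
import math

H = 6.62607015e-34       # Planck constant, J*s
M_E = 9.1093837015e-31   # electron rest mass, kg
Q_E = 1.602176634e-19    # elementary charge, C
C = 299_792_458.0        # speed of light, m/s


def electron_wavelength_m(voltage_v):
    """Relativistic de Broglie wavelength of an electron accelerated through voltage_v."""
    energy = Q_E * voltage_v  # kinetic energy in joules
    momentum = math.sqrt(2 * M_E * energy * (1 + energy / (2 * M_E * C**2)))
    return H / momentum


virion_m = 100e-9          # assumed virion diameter, middle of the 50-150 nm range
violet_light_m = 380e-9    # approximate shortest visible wavelength
electron_m = electron_wavelength_m(100_000)  # assumed 100 kV microscope

print(f"visible light (shortest): {violet_light_m * 1e9:.0f} nm")
print(f"virion diameter:          {virion_m * 1e9:.0f} nm")
print(f"electron at 100 kV:       {electron_m * 1e12:.2f} pm")
```

The electron wavelength comes out at a few picometers, tens of thousands of times smaller than the virion, while even the shortest visible wavelength is several times larger than it.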
Just as electron microscopy software chooses the colors for its micrographs, we have to choose the colors for the individual molecules when creating visualizations. There is no standard way of coloring proteins and other molecules found in viruses, cells, or microorganisms. That is why different visualizations use different color schemes.
While electron microscope software usually uses shades of gray, we decided to use more vivid colors to highlight certain parts of the virion. We chose an overall cold color scheme to suggest that the virus is arguably not a living thing, but rather a molecular machine. Despite the cold color scheme, we made the RNA bright red, since this is the molecule that carries the information necessary to replicate the virus, and thus can be seen as its most important part, the most living part of this parasitic molecular machine. We also made the spike proteins bright and vivid, as those are the molecules that actively attach the virus to the host cells. These are the proteins that could potentially be targeted by vaccines or antiviral drugs. On the other hand, we gave the lipid membrane a cold, muted shade, as this part of the virion is not encoded by its RNA, but rather taken from the host cell.
We also used depth of field. This is an effect of optical lenses where only a portion of the captured image is in focus. In computer graphics, images can be rendered perfectly sharp, but we decided to apply artificial depth of field so that we can guide the viewer's attention to different parts of the virion. As a virus particle modeled at atomic resolution is quite complex, we found this to be a useful storytelling tool.