Unpacking the CRISPR Toolbox
The acronym “CRISPR” is gaining popularity as a stand-in for the more complete CRISPR-Cas9 genetic engineering tool. More precisely yet, CRISPR, or clustered regularly interspaced short palindromic repeats, describes a genetic motif in bacterial and archaeal genomes that encodes a suite of RNA tools used by a specific class of DNA-cutting proteins in the microbial immune system. Because of these proteins’ dependence on CRISPR, they’re called “CRISPR-associated,” or “Cas” for short.
Compared to other gene editing paradigms, CRISPR is inexpensive because it only requires the synthesis of a short segment of guide RNA and addition of a Cas enzyme: Cas9. And it's simple to implement – just introduce a gRNA-Cas9 complex to DNA.
The discovery of CRISPR-Cas9 ushered in a transformative era of biotechnology, but it was just the first tool to be discovered from a larger toolbox of CRISPR-associated proteins. Cas9 is like a flat-head screwdriver, and sometimes a project requires specialized screws. Like any useful toolbox, CRISPR offers tools that suit many specialized purposes. What follows is an explanation of the suite of tools available in CRISPR, including a description of the specialized purpose that each variant suits.
Cas9
The discovery of Cas9 was the final piece of a puzzle revealing bacteria’s adaptive defense system against viral and plasmid DNA invasion.
Cas9 is a DNA-cutting enzyme that couples to an RNA molecule called a guide RNA (gRNA) to break apart DNA wherever there is a nucleotide sequence matching the complement of the gRNA sequence.
The nucleotide sequence of the gRNA template matches the target site, guiding the Cas9 endonuclease to this position. By simply synthesizing a custom-designed gRNA sequence, researchers can accomplish precision gene editing at any site of interest.
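The pairing described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a design tool: sequences are plain strings written 5' to 3', every function name is made up for illustration, and only the variable targeting portion of the gRNA (often called the spacer) is modeled rather than the whole molecule.

```python
# Toy model of gRNA/DNA pairing. All names here are illustrative, not a real API.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def design_spacer(target_site: str) -> str:
    """Return an RNA spacer with the same 5'->3' sequence as the target site (T -> U)."""
    return target_site.upper().replace("T", "U")


def complementary_strand(target_site: str) -> str:
    """Return the opposite DNA strand (reverse complement), which the spacer binds."""
    return target_site.upper().translate(DNA_COMPLEMENT)[::-1]


def base_pairs(rna: str, dna: str) -> bool:
    """True if `rna` pairs with `dna` along its whole length, read antiparallel."""
    if len(rna) != len(dna):
        return False
    return all(RNA_TO_DNA_PAIR.get(r) == d for r, d in zip(rna, reversed(dna)))


site = "GATTACAGATTACAGATTAC"          # hypothetical 20-nt target site
spacer = design_spacer(site)           # "GAUUACAGAUUACAGAUUAC"
print(base_pairs(spacer, complementary_strand(site)))  # True
print(base_pairs(spacer, site))                        # False
```

The two printed lines illustrate why both descriptions above hold at once: the spacer reads the same as the target site on one strand, and it physically base-pairs with the complementary strand.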
Cas9 exists in many bacterial species and differs slightly between species. One of the main differences between species is in the sequence of a specific motif that has to occur in the targeted sequence for Cas9 to make a cut. The form of Cas9 found in the bacterium Streptococcus pyogenes is the most thoroughly characterised variant by virtue of its target motif being “NGG,” making it one of the simplest and most versatile Cas9 target motifs discovered to date.
Cas9 from S. pyogenes, also known as spyCas9, can only target sequences of DNA that are adjacent to a di-guanine target motif. Although the NGG motif requirement may seem limiting, di-guanines are abundant throughout typical genomes, enabling most DNA regions to be accessible by CRISPR.
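To make the motif requirement concrete, here is a minimal sketch that scans a DNA string for candidate sites. It assumes a 20-nucleotide target followed immediately by NGG on the same strand, which is the usual arrangement for S. pyogenes Cas9; the function names and the example sequence are hypothetical, and real guide-design software checks far more than this.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence made of A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_ngg_sites(strand: str, target_len: int = 20) -> list[tuple[int, str, str]]:
    """List (start, target, motif) for every target followed directly by NGG.

    `start` is a 0-based index into `strand`. A lookahead is used so that
    overlapping candidates are all reported.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(strand.upper())]


# Hypothetical sequence: 2 filler bases, a 20-nt target, the motif TGG, 2 filler bases.
dna = "AAGATTACAGATTACAGATTACTGGAA"
print(find_ngg_sites(dna))                      # [(2, 'GATTACAGATTACAGATTAC', 'TGG')]
print(find_ngg_sites(reverse_complement(dna)))  # [] (the given strand contains no CC)
```

Scanning the reverse complement as well matters in practice, because a di-guanine on either strand can serve as the motif.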
Once a cut has been made in a genome with CRISPR, the cell recognises the event as DNA damage and triggers one of two DNA repair pathways: non-homologous end joining (NHEJ) or homology-directed repair (HDR). It is these pathways that researchers manipulate to achieve their desired genome engineering outcome.
In NHEJ, the cell’s DNA damage repair machinery tries to stick the two cut ends of DNA back together. During the process, a phenomenon called DNA resection can occur, where enzymes chew back the ends of the DNA before re-sticking them. If a cut is made in the middle of a gene and the NHEJ machinery repairs it, there is a chance that a few bases will be lost. This will often disrupt the gene's function, creating a genetic knock-out mutant.
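Why does losing a few bases so often break a gene? A short sketch helps. It assumes the familiar picture of a coding sequence read in three-base codons, so a deletion whose length is not a multiple of three shifts every codon after it. The toy gene, the cut position and the function names are all hypothetical.

```python
def nhej_deletion(gene: str, cut: int, lost_left: int, lost_right: int) -> str:
    """Rejoin `gene` after losing bases on each side of the cut.

    `cut` is the 0-based index of the first base to the right of the break.
    """
    if lost_left < 0 or lost_right < 0 or not (lost_left <= cut <= len(gene) - lost_right):
        raise ValueError("deletion runs past the end of the sequence")
    return gene[:cut - lost_left] + gene[cut + lost_right:]


def codons(seq: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons (leftovers dropped)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def is_frameshift(original: str, edited: str) -> bool:
    """A length change that is not a multiple of three shifts the reading frame."""
    return (len(original) - len(edited)) % 3 != 0


gene = "ATGGCTGAAACTGGTTAA"  # hypothetical toy gene: ATG GCT GAA ACT GGT TAA
two_lost = nhej_deletion(gene, cut=9, lost_left=1, lost_right=1)
three_lost = nhej_deletion(gene, cut=9, lost_left=2, lost_right=1)
print(codons(two_lost), is_frameshift(gene, two_lost))
# ['ATG', 'GCT', 'GAC', 'TGG', 'TTA'] True
print(codons(three_lost), is_frameshift(gene, three_lost))
# ['ATG', 'GCT', 'GCT', 'GGT', 'TAA'] False
```

In the first case every codon downstream of the repair changes and the final stop codon is lost; in the second, the downstream codons survive intact, which is why not every NHEJ repair yields a knock-out.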
In HDR, the cell uses its diploidy to repair damage on one chromosome by using the other identical chromosome as a template. This process can be tricked by adding another “donor” strand of DNA that has homology to the sequences on either side of the cut site. The HDR machinery then repairs the cut while incorporating the donor strand, allowing researchers to insert new genetic material and create a targeted knock-in mutant.
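The donor design can be pictured as simple string surgery. The sketch below is a deliberately simplified model: it assumes the donor consists of a left homology arm, the new material, and a right homology arm, and that repair succeeds only when both arms match the sequence flanking the cut exactly. The names, the tiny genome and the three-base arms are hypothetical; real homology arms are far longer.

```python
def make_donor(genome: str, cut: int, insert: str, arm_len: int) -> str:
    """Build a donor: left homology arm + new material + right homology arm."""
    if arm_len < 1 or cut - arm_len < 0 or cut + arm_len > len(genome):
        raise ValueError("homology arms do not fit around the cut site")
    return genome[cut - arm_len:cut] + insert + genome[cut:cut + arm_len]


def hdr_knock_in(genome: str, cut: int, donor: str, arm_len: int) -> str:
    """Copy the donor's payload into the cut site if both arms match the flanks."""
    if arm_len < 1 or len(donor) < 2 * arm_len or cut < arm_len:
        raise ValueError("arm length is not compatible with this donor and cut site")
    left_arm, payload, right_arm = donor[:arm_len], donor[arm_len:-arm_len], donor[-arm_len:]
    if genome[cut - arm_len:cut] != left_arm or genome[cut:cut + arm_len] != right_arm:
        raise ValueError("donor arms are not homologous to the cut site")
    return genome[:cut] + payload + genome[cut:]


genome = "AAACCCGGGTTT"                                    # hypothetical locus
donor = make_donor(genome, cut=6, insert="ATG", arm_len=3)  # "CCCATGGGG"
print(hdr_knock_in(genome, cut=6, donor=donor, arm_len=3))  # AAACCCATGGGGTTT
```

Raising an error when the arms do not match mirrors the point made above: the donor is only used as a template where it is homologous to the sequence around the cut.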
There are many ways by which researchers can nudge a cell towards one repair pathway over the other depending on the application; however the interplay between the two repair pathways is still a subject of considerable ongoing research.
dCas9
Cas9 is an enzyme with multiple functional domains distributed over two lobes. Some of these domains identify the DNA target, while other sites are the scissors, cutting the DNA target. With targeted mutations the nuclease domains can be deactivated, effectively blunting these scissors while maintaining Cas9’s DNA targeting activity. Researchers can then leverage CRISPR’s precision and simple implementation for purposes other than genetic engineering.
This deactivated version of Cas9, known as dead Cas9 or dCas9, can be fused to other DNA-acting proteins that do not have sequence specificity, allowing them to be targeted to any sequence of interest. Some researchers have used dCas9 as a mapping tool. With a fluorescent marker protein attached to the dCas9 complex, any location along a DNA molecule that matches the gRNA template will light up under UV like a beacon. Using this technique, researchers can chart the distribution of genes on a chromosome. Other researchers use dCas9 to target genes with transcription factors in order to study how manipulating the rate of specific protein production affects the wider cell environment, or to better understand epigenetic influences on the genome.
Activating targeted genes with transcription factors bound to dCas9 is known as CRISPRa, and inhibiting the expression of targeted genes using the same method is known as CRISPRi. Together, the complementary information gained from CRISPRa and CRISPRi can be used to investigate chromosome dynamics and the complex gene expression pathways that support life.
Cas9n
dCas9 has both DNA-cleaving domains inactivated. Alternatively, by perturbing only a single domain, Cas9 becomes a highly precise DNA-nicking enzyme. Cas9 that is altered to break only a single strand is known as a Cas9 nickase, or Cas9n.
Cas9n is useful as a tool to control the DNA repair process. The cell has a number of nick-repair pathways that are independent of those activated by double-stranded breaks. However, both nicks and double-stranded breaks have their own HDR pathways. By using Cas9n, researchers can bias their experiment toward HDR, improving the efficiency of genetic knock-ins and helping to avoid unwanted effects from NHEJ.
hfCas9
CRISPR-Cas9’s precise targeting of specific DNA sequences is an appealing feature for gene editing and genetic research. However, even though Cas9 will target only the intended sequence for the majority of gRNA sequences designed, there are occasionally Cas9 targeting mistakes that impact experimental efficiency.
If these targeting errors occur with observable regularity within a particular gene editing protocol, experimental results could yield erroneous data. Hence, the availability of a Cas9 variant with reduced off-target effects will be crucial to the development of CRISPR-based therapeutics in live organisms.
Cas9 variants designed for improved accuracy are generally known as high fidelity Cas9, or hfCas9. Over the last few years, several hfCas9 variants have been described in the literature, some with off-target cutting reduced to levels that are hard to detect. The general strategy for reducing off-target binding is to modify the structure of the Cas9 protein, without compromising its on-target function, to diminish its slight, sequence-independent chemical affinity for DNA so that the Cas9-gRNA complex binds to DNA only on the basis of nucleotide complementarity. In other words: Cas9 can accommodate a couple of non-complementary bases in a sequence match because of small bonding interactions between the protein and the DNA, whereas weakening these interactions in hfCas9 means that, ideally, only a fully complementary gRNA-DNA match is cut.
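The difference between a tolerant and a strict enzyme can be caricatured as a mismatch budget. The sketch below is a toy model, not a description of how any real Cas9 variant scores targets: it assumes standard Cas9 accepts up to a fixed number of mismatched bases and an idealized hfCas9 accepts none. The shortened guide, the DNA string and the function names are hypothetical, and the target motif is ignored for brevity.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(dna: str, guide: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (position, mismatches) for each window within the mismatch budget."""
    n = len(guide)
    sites = []
    for i in range(len(dna) - n + 1):
        mm = count_mismatches(dna[i:i + n], guide)
        if mm <= max_mismatches:
            sites.append((i, mm))
    return sites


guide = "GATTACAGAT"                 # shortened toy guide, written as DNA
dna = "CCGATTACAGATCCGATTACTGATCC"   # perfect site at 2, one-mismatch site at 14
print(candidate_sites(dna, guide, max_mismatches=1))  # [(2, 0), (14, 1)]
print(candidate_sites(dna, guide, max_mismatches=0))  # [(2, 0)]
```

With a budget of one mismatch, the look-alike site at position 14 is also reported; with a budget of zero, only the intended site remains.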
CasX and CasY
Because CRISPR-Cas is crucial for bacterial immunity to phage, the system exists across the bacterial kingdom. UC Berkeley researcher Jillian Banfield discovered two new Cas enzymes, CasX and CasY, by sequencing the genomes of bacteria collected from groundwater and soil and searching for CRISPR-Cas systems therein.
At just 980 amino acids, CasX and CasY are smaller than Cas9: even the compact Cas9 from Staphylococcus aureus comprises 1,053 amino acids, and the widely used S. pyogenes enzyme is larger still at 1,368. And because Banfield, working with Jennifer Doudna, demonstrated that CasX and CasY are functional, the two new enzymes are attractive in biotechnology settings. Smaller proteins are easier both to synthesize and to deliver into cells.
The full utility of CasX and CasY has yet to be shown, but the discovery of CasX and CasY by way of metagenomic search highlights the fact that our current CRISPR tools are drawn from a tiny segment of the bacterial world. The vast diversity of bacteria presents the possibility that there are many more CRISPR genes still undiscovered.
Cpf1
Unlike the Cas9 variants described previously, which were invented by deliberately modifying the Cas9 protein, Cpf1 is a distinct protein with a similar function to Cas9. Cpf1 belongs to the same family of CRISPR-associated endonucleases as Cas9 and, as such, shares many of Cas9’s functions, like its use of guide RNA to target a region of DNA and its double-strand severing ability. But Cpf1 performs these functions slightly differently than Cas9.
Some of Cpf1’s variations make it preferable to Cas9. One advantage is that Cpf1’s gRNA is a simpler, shorter structure, as well as easier to engineer than the gRNA for Cas9. A second advantage is that Cpf1 has a target motif of “TTN.” Whereas the key nucleotide sequence motif that Cas9 recognizes grants access to many regions of a typical genome, there are inevitably some regions that can’t be accessed because of a lack of adjacent di-guanines. Thus, Cpf1 opens up whole new DNA regions to CRISPR that weren’t previously accessible.
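A quick way to see the benefit of a second target motif is to count motif occurrences in a guanine-poor stretch of DNA. The sketch below is only an illustration: it scans a single strand, treats N as "any base", and ignores where the motif must sit relative to the target. The helper name and the example sequence are hypothetical.

```python
import re

# Minimal IUPAC lookup: the four bases plus N for "any base".
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_positions(seq: str, motif: str) -> list[int]:
    """0-based start positions of every (possibly overlapping) motif occurrence."""
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif.upper()) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


at_rich = "ATTATATTTACATTTA"  # hypothetical AT-rich stretch with no guanines
print(motif_positions(at_rich, "NGG"))  # []
print(motif_positions(at_rich, "TTN"))  # [1, 6, 7, 12, 13]
```

In this stretch there is nothing for an NGG-dependent enzyme to anchor on, while a TTN-dependent enzyme has several candidate positions.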
C2c2
C2c2 is a CRISPR endonuclease that operates in a manner like Cas9 and Cpf1, but with a twist. Instead of precisely breaking targeted DNA sequences, C2c2 targets RNA sequences.
Because of how C2c2 ignores DNA in favor of RNA, it is useful in scenarios where transient effects are the goal. One of CRISPR-Cas9’s most appealing qualities is that it can introduce permanent and heritable edits into DNA. However, in many instances, permanent changes are not the desired outcome. With C2c2, researchers can transiently alter gene expression by cleaving RNA transcribed from a target gene.
However, unlike Cas9 and Cpf1, which make a cut and then dissociate from their target, C2c2 demonstrates “collateral effects” by which it cleaves other RNA molecules in its vicinity when activated. If C2c2 is to be developed into a precise RNA editing tool, these collateral effects need to be better understood and reined in. On the other hand, researchers have shown that these collateral effects make C2c2 a highly sensitive biosensor for viruses like Zika.
Future additions to the CRISPR toolbox
Gene editing with CRISPR methods is a highly active field of research. Scientists are continually searching for clever applications of these proteins that interact with DNA in living organisms. Undoubtedly, new tools in the CRISPR toolbox will emerge, presenting solutions to existing problems in biotechnology and leading to new methods that haven’t yet been imagined, just as the implementation of Cas9 lowered barriers to gene editing research. This article will continue to be updated as the CRISPR toolbox evolves.