Defining the effects of mismatches on type I-E CRISPR immunity and phage escape
Phan, Phong Tuan
Sashital, Dipali G.
McIntosh, Gustavo C.
Phillips, Gregory J.
Biochem, Biophysics, and Molecular Biology
To respond rapidly to the constant threat posed by bacteriophages, the most abundant microorganisms, bacteria have evolved various defense mechanisms, one of which is the adaptive RNA-guided CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) system. These diverse systems generally consist of a set of cas genes and a CRISPR array containing a series of repetitive DNA elements interspaced with variable sequences, known as spacers, that match foreign DNA. Upon infection, sequences from viral genomes are acquired by Cas proteins and integrated into the CRISPR array as new spacers. CRISPR transcripts are then generated from the CRISPR locus and processed into short CRISPR RNAs (crRNAs). These crRNAs guide a Cas protein effector complex to complementary DNA and/or RNA target sequences, leading to their degradation, a process named interference. In DNA-targeting CRISPR-Cas systems, point mutations in the protospacer adjacent motif (PAM) or in the PAM-proximal target region, called the seed, often lead to phage escape from the CRISPR system. A process called priming allows the host to restore immunity against rapidly evolving phages by directing spacer acquisition from genomes that are targeted by the interference machinery. Mismatches between the crRNA and target that block interference often lead to priming. However, the effects of mutations on each of these processes vary significantly depending on the type of mismatch, the location of the mutation, and the sequence of the crRNA spacer. Indeed, previous work from our lab showed that some spacer sequences are more tolerant of mismatches than others, suggesting that some crRNAs may provide longer-lasting immunity. The goal of this thesis is to examine how spacer content affects immunity in the type I-E CRISPR-Cas system in Escherichia coli K12. First, in Chapter 2, we establish a convenient tool for measuring CRISPR interference over time.
Using a green fluorescent protein (GFP) reporter plasmid, we have developed a method for monitoring the level of interference within bacterial colonies or in individual cells using a variety of detection methods. We validate this tool as a direct readout of CRISPR interference and demonstrate its utility with a variety of PAM and seed mutant targets, as well as with multiple CRISPR-Cas effectors. In Chapter 3, we use our GFP reporter and a plasmid library screen to systematically investigate how the local G/C content in the seed region correlates with mutational tolerance. We hypothesize that, owing to their greater thermodynamic stability, G/C-rich seed regions may have a higher mutational tolerance than U/A-rich seeds. To investigate this, we created four strains with variable G/C content in the seed region and measured CRISPR interference against individual target plasmids and against a plasmid library containing seed mutations. Using our GFP reporter assay, we find that high G/C content in the first four positions of the seed is sufficient to increase rates of interference, while high U/A content at these positions produces a corresponding increase in the rate of priming. High-throughput plasmid library screening assays reveal that mismatch type strongly affects the rate of interference. In general, mismatches containing dC or dG are more deleterious than mismatches containing dA or dT, and the crRNA sequence contributes to the level of the observed defect. Double mismatches are also tolerated in the seed, although they largely block direct interference from the original spacer and instead promote priming. Overall, our study provides a rich dataset describing the effects of mismatch type and location for seed sequences with different G/C content. Although we established that different mismatch types can attenuate CRISPR interference to different degrees, the effect of this variability on phage escape has not been determined.
In Chapter 4, we examine how different mismatch types between the crRNA and the DNA target at the first position of the seed sequence affect phage escape. We establish which types of mismatches are bona fide escape mutations, leading to a substantial loss of CRISPR interference on par with a non-targeting crRNA. Other mismatches have much weaker effects on CRISPR interference but enable the selection of phage mutants that carry additional seed mismatches. By monitoring the phage genome over time using high-throughput sequencing, we show that different types of mismatches dictate the level of protection against phages. We confirm that the loss of protection is due to CRISPR escapers. Further analysis of the mutated phage sequences suggests that these escapers may have arisen either because of the rate at which further mutations emerge at a specific location in the seed region or because of mismatches pre-existing in the population. Together with data from Chapter 3, our findings highlight how differences in crRNA sequences may enable differential protection against phages by limiting the emergence of escape mutants through mutational tolerance.