Deep Learning Tools for Protein Binder Design
Advisor: David Baker (Biochemistry)
The ability to design protein-binding proteins is broadly useful. I will present our work to develop a deep learning-based pipeline for protein binder design. I will show how we configured AlphaFold2 to distinguish in silico designs that are likely to bind from those that are not. I will then demonstrate how we can use the ProteinMPNN model, in combination with classical Rosetta protocols, to perform efficient sequence design on binder backbones. Finally, I will show how we trained a denoising diffusion model to generate protein backbones and how this can be used to massively accelerate the binder design pipeline. This deep learning-based pipeline is faster, easier to use, and has much higher experimental success rates than the previous Rosetta-based pipeline.
Nate will be staying on as a postdoc in the Baker lab and is cofounding a startup working on the computational design of drugs.
Phenotyping and Immune Signaling of Monocytes and Macrophages in Porous Precision Templated Scaffolds
Advisor: James Bryers (Bioengineering)
Porous precision-templated scaffolds (PTS) are three-dimensional biomaterial constructs where the pore size and pore interconnects can be precisely controlled, allowing for the creation of scaffolds with tunable characteristics and implant outcomes. Regardless of the polymer used in construction and without the use of any signaling/stimulating molecules, PTS with uniform, interconnected, 40 μm pores have shown a remarkable ability to immunomodulate resident cells for tissue regeneration. In contrast, PTS with smaller or larger pores result in a pro-inflammatory and pro-fibrotic foreign body reaction. The mechanism behind the pore-size-mediated phenomenon remains unclear; however, monocyte and macrophage phenotypes have been identified as key mediators in regulating implant outcome within the PTS. Here, we quantify the infiltration kinetics and functional role of circulating monocytes in subcutaneously implanted PTS. We then identify the regulatory roles of MyD88-dependent signaling downstream of Toll-like receptors (TLRs) that drive the regenerative, pro-healing response in 40 μm PTS. Finally, we demonstrate a synergistic relationship between TLR signaling and macrophage receptor with collagenous structure (MARCO) in modulating macrophage phenotype. Overall, these findings further our understanding of the molecular mechanisms underlying cell behavior and tissue regeneration in PTS and implantable biomaterials.
Nathan will be pursuing a career in industry after graduation.
Understanding the Static and Transient Behavior of Organic Electrochemical Transistors
Advisor: David Ginger/Christine Luscombe (Chemistry/Materials Science & Engineering)
An organic electrochemical transistor (OECT) is a type of transistor with both ionic and electronic carriers involved in device operation. Recently, OECTs have emerged as promising candidates for both neuromorphic computing and biosensing applications as they exhibit direct response to biologically relevant ions, neurotransmitters, and metabolites. Moreover, the typically soft and flexible nature of organic semiconductors opens the possibility of implementing OECTs in both brain-machine interfaces and implantable biosensors. Nevertheless, a deeper understanding of the static and transient behavior of OECTs is necessary to unlock their full potential for these real-world applications. Here, we first study the impact of the polymer side chain on the transconductance and the speed of OECT devices. Specifically, we show higher transconductance and faster kinetics if more polar functional groups are on the side chain, or if the polar functional group is farther away from the polymer backbone. Next, we elucidate why accumulation mode organic electrochemical transistors turn off much faster than they turn on, a phenomenon that is prevalent in published studies yet cannot be explained by existing models. We further identify that ion transport is limiting the device’s operation speed and provide guidelines for engineering faster OECTs from both device and materials perspectives. Last, we synthesize new polymers and characterize their OECT performance. We verify that higher polymer crystallinity indeed reduces the OECT electronic carrier mobility. Together, these studies help to expand our understanding of the static and transient behavior of OECTs.
Emerson will be joining Analog Devices in San Jose after graduation.
Modeling the longevity and secretory capacity of engineered human plasma cells
Advisor: Richard James (Seattle Children’s Research Institute’s Center for Immunity and Immunotherapies)
Due to their unique longevity and capacity to secrete high levels of protein, plasma cells have the potential to be used as a cell therapy for protein replacement. Through CITE-seq and bulk RNA-seq experiments, I showed that ex vivo engineered human plasma cells exhibited transcriptional features akin to natural long-lived plasma cells. These engineered cells exhibited altered protein secretion and long-term in vivo engraftment in a humanized IL-6 mouse model. To further study the protein secretion capacity of our engineered plasma cells, we developed a method to associate protein secretion with cell information on a single-cell basis. We were able to analyze thousands of plasma cells, directly linking IgG secretion with transcript profile (SEC-seq) and surface markers. Our work has added to our understanding of ex vivo engineered plasma cells, and the technology we developed enables exploration of links between genome and secretory function, laying the foundation for numerous discoveries in immunology, stem cell biology, and beyond.
De novo protein design as a tool to study protein-inorganic interface interactions
Advisor: David Baker (Biochemistry)
Biomolecules have the ability to regulate the formation of hierarchically structured biominerals through their interactions with inorganic crystals. However, the details of the atomic structure at the organic-inorganic interface that governs this process are not yet known. In this talk, I will present a set of design principles for the creation of molecular templates targeted to interact with calcium carbonate, and demonstrate the effect of our scaffolds on guiding the nucleation and growth of calcium carbonate as well as the effect of mutants with an engineered surface chemistry. These templates achieve a degree of tunability only accessible through protein design and represent one of the most programmable systems for calcium carbonate and broader biomineralization studies. These advances open the possibility to use de novo protein design to program biomineralization, offering a pathway to creating advanced hybrid materials.
Spatial integration of natural scenes in the mammalian retina
Advisor: Fred Rieke (Physiology & Biophysics)
The human visual system relies on neural signals that travel along the optic nerve from the eye to the brain. Signals within the optic nerve are generated by individual neurons (“ganglion cells”) that reside within the retina and produce neural spikes in response to specific visual inputs. In this talk, we discuss recent advances in our understanding of how retinal ganglion cells in non-human primates encode realistic scenes from the natural world.
Using electrophysiological techniques, we show we can simplify the spatial structure of complex natural movies without strongly affecting the spike responses of retinal ganglion cells. Importantly, these simplified stimuli (with 16 wedge-shaped pixels) only require knowledge of the neuron's classical receptive field, enabling relatively accurate predictions of neural responses to natural scenes with minimal inputs. We then show that low-dimensional spaces can be used to build “degenerate” stimuli: different images that evoke similar spike responses. Our efforts highlight how stimulus design can be used to investigate the functional properties of complex neural circuits.
After graduation Julian will be starting a postdoc at a pharmaceutical company, Novartis.
Probing Operando Mixed Ionic/Electronic Transport in Conjugated Polymers: From Molecular Level to Device Kinetics
Advisor: David Ginger (Chemistry)
Mixed ionic-electronic transport in conjugated polymers significantly impacts the performance of organic electronics, ranging from organic electrochemical transistors (OECTs) to next-generation neuromorphic computing architectures. The exceptional performance of conjugated polymers in these applications stems from their remarkable ability to efficiently accommodate counterions throughout the entire device volume during electrochemical redox processes. The dynamic changes in the electrical and chemical environment that occur during this process necessitate the utilization of operando characterization. In this talk, I will present our work on employing operando measurements to comprehend the structural and morphological changes at the molecular level, as well as the operational mechanisms affecting the device kinetics in various applications. Our research highlights the power of operando characterization in unveiling the fundamental processes of mixed ionic-electronic transport and providing valuable insights for application-oriented molecular design.
Toward Practical Multi-Reference Configuration Interaction Methods and Applications
Advisor: Xiaosong Li (Chemistry)
The accurate description of electronic structures in molecular systems has always been the central question in the field of quantum chemistry. For problems such as studying chemical properties of the late-row element components or investigating the energy dissipation pathways in electronically excited/ionized molecules, relativistic effects and electron correlations must be addressed. Here we present a novel relativistic multireference configuration interaction (MRCI) method to treat both electron correlations and relativistic effects variationally. With an efficient distributed implementation, we are able to resolve MRCI wavefunctions with more than 1 billion determinants. Benchmarks on electronic fine-structure splittings and real-world applications, including resolving L-edge X-ray spectra of various iron complexes, will be presented and analyzed.
Model-driven DBTL cycles acceleration with broad-host-range bacterial CRISPRa/i circuits
Advisors: James Carothers and Jesse Zalatan (Chemical Engineering/Chemistry)
CRISPR-Cas gene regulatory tools have revolutionized biological network programming. Recently, we developed CRISPR gene activation (CRISPRa) tools that demonstrate broad applicability across various bacteria. With thorough characterizations of bacterial CRISPRa, we found that design rules of effective CRISPRa are stringent and highly context-dependent. Thus, we developed numerous strategies to overcome existing limitations, including DNA context engineering, use of engineered proteins to bypass target-site requirements, and characterization of multiple bacterial CRISPRa systems for alternative design rules. Implementation of CRISPRa tools in chemical bioproduction enables synthesis of aromatic amines, precursors to various functional polymer materials, which were previously difficult to synthesize through conventional routes. In combination with CRISPR gene interference (CRISPRi), we further explore the capability of the CRISPRa/i platform to regulate expression of both foreign genes and native genes intricately involved in bacterial metabolism. Since the programmability of CRISPRa/i relies on the guide RNA sequence, we found that engineering at the RNA level could provide tunable expression of multiple genes simultaneously. Furthermore, when combined with genome-scale metabolic models, this system accelerated the Design-Build-Test-Learn (DBTL) process for microbial strain optimization, bypassing stepwise genetic reconstruction through trans-acting CRISPRa/i circuits. The resulting constructs could be comprehensively investigated using a multi-omics platform, gathering detailed information to improve subsequent DBTL cycles. By coupling programmable and tunable gene regulatory tools with large metabolic models informed by omics data, our platform established a foundation for non-canonical microbial strain engineering that benefits diverse disciplines from industrial biotechnology to therapeutic discovery.
Following graduation, Ice will be staying with the Carothers/Zalatan labs as a postdoc while preparing for a transition to the next postdoc position.
Bottom-Up Synthesis of Colloidal Systems Using Sequence Defined Molecules
Advisor: Lilo Pozzo (Chemical Engineering)
Self-assembled colloidal nanoparticles can be used to deliver vaccines, sense pathogenic materials, and mark tumors. In response to light, they can catalyze the formation of clean fuels or help transform light directly into usable energy. Quantum dots are used in commercial displays and will be a key step in new forms of computing and information storage. The performance of nanoparticles is dictated by their structure and composition and can be controlled using sequence-defined molecules. The latter include any polymeric or oligomeric materials in which the exact sequence is precisely controlled using enzymatic processes or synthetic chemistry. Their physicochemical diversity and modularity are used to intervene in chemical processes occurring during synthesis, stabilize specific crystal facets, or form templates that guide nanomaterial growth. Yet the multivariate relationship between experimental parameters and intermolecular reactions that govern nanomaterial self-assembly is difficult to study using traditional experimental methods.
In this work, gold nanoparticle synthesis in the presence of peptides is used as a model system for developing and integrating experimental automation with computational approaches to extract information for guiding sequence design. Several peptide variants were selected through systematic variations of a gold binding peptide, and nanoparticles were synthesized using a liquid handling robot in a large design space of reagent concentrations. The plasmonic response of nanoparticles was used as a fast proxy for changes in structures and was analyzed using functional data analysis methods. The analysis resulted in a metric for quantifying how changes in peptide design affect nanoparticle synthesis outcomes, and the conclusions were corroborated with small-angle X-ray scattering and electron microscopy. Next, the relationship between substitution of methionine in a peptide sequence and an increase in particle anisotropy was assessed. A programmed liquid handling robot was used to dynamically intervene in nanoparticle synthesis to control the resulting structure and stability of anisotropic nanoparticles. Finally, highlights of how small-angle X-ray scattering can work in parallel with computational methods to study colloidal self-assembly mechanisms are presented.
Kacper will be working as a scientist at Arzeda in Seattle following graduation.
De Novo Design of Protein Conformational Changes and Protein-Peptide Interactions
Advisor: David Baker (Biochemistry)
This work addresses two difficult challenges in protein design: designing proteins with two distinct, interconvertible structural conformations and designing proteins to bind tightly to biologically active, flexible helical peptides. First, through the design of “hinge” proteins that can switch between two fully structured conformations in response to an effector-binding event, we demonstrate the modular transformation of biochemical information. Through detailed structural and biophysical characterization, we show tight coupling between the conformational change and the effector binding. Second, we present methods for designing proteins to bind helical peptides, which we show bind specifically to therapeutically relevant targets with nanomolar and picomolar affinities. These two approaches go beyond the single-structure paradigm in protein design and enable new possibilities for therapeutics and bioengineering.
Phil is co-founding a new AI therapeutics company with fellow MolES graduate Nate Bennett.
Designing Hierarchical Organic-Inorganic Hybrid Materials with High-Information-Content Building Blocks
Advisor: François Baneyx (Chemical Engineering)
Hierarchical hybrid materials are attractive for catalytic, opto-electronic, and sensing applications due to the collective and emergent properties that originate from the precise organization and interplay of organic and inorganic units. However, intimate integration of disparate components remains difficult. Using sequence-defined peptoids as scaffolding blocks and solid-binding proteins as functional ones, we synthesized a variety of hybrid hierarchical nanostructures that take advantage of the excellent programmability of peptoids, and of the ability of structurally organized solid-binding proteins to control the binding, nucleation, and growth of various inorganic components. These supramolecular architectures include three-dimensional (3D) materials consisting of alternating 2D layers of peptoids, proteins, and silica nanoparticles; protein-conjugated peptoid nanotubes decorated with titania nanocrystals of defined size in the sub-5 nm regime; and more complex titania-gold nanocomposites capable of photocatalysis under both visible and UV light illumination. Considering the outstanding modularity of peptoids, the diversity of natural and de novo designed protein frameworks, and the abundance of solid-binding peptides interacting with a variety of inorganic compounds, this simple but modular strategy should prove useful for the fabrication of a broad range of advanced functional materials.
Generating and Harnessing Learned Embeddings for Protein Design
Advisor: David Baker (Biochemistry)
Proteins consist of a sequence of amino acids that spontaneously fold into unique three-dimensional structures which carry out important biochemical functions. Understanding this sequence-structure-function relationship is integral to designing new proteins that carry out important and specific functions. Through deep learning, we can reliably generate meaningful representations of proteins from sequence and structure to this end. My work has been focused on generating and harnessing learned embeddings of proteins for various downstream prediction tasks. I applied learned structure-based embeddings to the tasks of protein structure refinement and protein ensemble generation. I next explored a data-efficient method for encoding protein information by jointly leveraging sequence and structure information. We used this joint representation to predict the influence of single mutations on protein efficacy. Finally, following the success and release of RoseTTAFold for accurate structure prediction, I used the embeddings learned through this model for mutation effect prediction.
Sanaa has relocated to San Francisco and has started work as an applied machine learning scientist at Google X.
The Role of α-Sheet Structure in Bacterial and Mammalian Amyloidogenesis and its Implication in the Microbial Alzheimer’s Disease Hypothesis
Advisor: Valerie Daggett
Amyloidogenesis involves the production of toxic, soluble α-sheet oligomers prior to the deposition of nontoxic β-sheet fibrils. While mammalian amyloid is associated with over 50 diseases and is considered inherently pathogenic, several bacteria utilize amyloid fibrils to fortify the biofilm and protect cells from the surrounding environment. Here, we employed de novo peptides that stably adopt α-sheet conformation to inhibit mammalian amyloidogenesis, neutralizing the toxicity associated with amyloid-β and islet amyloid polypeptide oligomers. We showed that the same α-sheet peptides inhibited bacterial amyloidogenesis in E. coli and S. aureus and increased bacterial susceptibility to multiple antibiotics. The microbial Alzheimer’s disease hypothesis suggests that Aβ aggregation is triggered in response to microbial infection in the brain. We applied our knowledge of the conserved amyloid inhibition mechanism of the α-sheet peptides to elucidate molecular mechanisms that govern the role of Aβ in the innate immune response. We found that amyloid-forming E. coli increased the production of toxic Aβ oligomers in human neuroblastomas, and that Aβ oligomers specifically inhibited E. coli amyloidogenesis, reduced biofilm cell density, and increased antibiotic susceptibility. Finally, Aβ and CsgA, the primary protein in E. coli amyloid fibrils, inhibited one another’s aggregation and neutralized toxicity via interactions between α-sheet oligomers. These findings suggest that mammalian amyloids can combat pathogens through interactions between toxic, soluble α-sheet oligomers.
Understanding the lineage transition to bypass AR program in castration resistant prostate cancer
Advisor: John Lee (Fred Hutch / UW Department of Medicine)
Prostate tumors harbor substantial biological heterogeneity with highly variable treatment responses. Next-generation sequencing technology has enabled the discovery of previously uncharacterized genetic alterations in cancer genomes. Functionally associating these genetic abnormalities with prostate cancer initiation and progression is crucial. In this work, we describe a combinatorial genetic strategy applied to an organoid transformation assay to rapidly generate diverse, clinically relevant prostate cancer models. By coupling this strategy with single-cell or spatially resolved next-generation sequencing, we are able to resolve the clonal architecture of the resultant tumors to uncover polygenic drivers of cancer phenotypes. Lineage plasticity is recognized as a common mechanism for treatment resistance in cancer. The shift from an androgen receptor (AR)-positive lineage to an AR-negative lineage has become more common over the past decade due to the use of second-generation AR inhibitors. However, the genetic determinants driving the progression of prostate cancer to an AR-null state and the acquisition of neuroendocrine differentiation remain largely unknown. We have delineated critical roles of the pioneer factors ASCL1 and NEUROD1 in neuroendocrine transdifferentiation and uncovered their abilities to silence AR expression and signaling by remodeling chromatin at the somatically acquired AR enhancer and global AR binding sites with enhancer activity. We have also demonstrated that the tuft cell lineage transcription factor POU2F3 is associated with downregulation of AR signaling pathways under low androgen conditions. In summary, our work has contributed to a deeper understanding of the genetic determinants involved in the initiation and progression of prostate cancer.
Photocatalytic Material and Membrane Systems
Advisor: Bruce Hinds (Materials Science & Engineering)
Two distinct types of photocatalytic material systems have been studied for wearable dialysis devices and precise pharmaceutical synthesis: TiO2 nanowire anodes and gold nanoporous membranes. The study on TiO2 involves the development of a novel urea photodecomposition system (POUR) that efficiently and selectively converts urea into N2 and CO2, enabling spent dialysate regeneration for portable kidney dialysis. The long-term stability and regeneration treatments of the TiO2 photocatalyst have been investigated. The oxidative environment generated locally around the TiO2 surface is considered an efficient way to remove the Ti-C and maintain the photocatalytic performance of TiO2. External voltage applied to the TiO2 single crystal nanowires dramatically enhances the collection of photogenerated electrons at the cathode and pushes holes to the reaction surface, thereby minimizing the recombination process and significantly increasing the photocurrent (~14x) and the urea photodecomposition rate. Further mechanistic investigations revealed the requirement of chloride (Cl-) for the complete oxidation of urea to physiologically safe N2 and CO2. Quenching studies proved a mechanism based on TiO2 surface-bound radical intermediates (Ti-Cl·). The high photocurrent-to-reaction efficiency for this 6 e-/h+ process and the selectivity for urea suggest that urea nitrogens are bound to the TiO2 surface (Ti-N bonds) during the complete oxidation process. On the other hand, gold nanoparticles in a plasmonic flow reactor demonstrated an over 200% quantum efficiency for peroxide activation, offering controlled single oxidation reactions for pharmaceutical synthesis. The reactor design, optimized pore diameter, and LED illumination wavelength were crucial factors influencing peroxide activation efficiency. Overall, these findings shed light on the potential applications of TiO2 and gold nanoparticles in advanced photocatalytic systems for wearable dialysis devices and precise pharmaceutical synthesis.
Yeast-based assays for studying the functional impact of missense variants in a rare human disease gene at scale
Advisor: Aimee Dudley (Pacific Northwest Research Institute)
Advancements in high-throughput sequencing technologies have accelerated the discovery of human genetic variation. However, leveraging genomic information for precision medicine is currently limited by the relatively small number of variants for which there is enough supporting evidence to interpret them clinically. One example of clinically actionable diseases for which large-scale functional data can have an enormous impact on patient health is serine biosynthesis defects, a group of rare inherited metabolic disorders caused by pathogenic variants in PHGDH, PSAT1, and PSPH. Because L-serine supplementation, especially if started early, can ameliorate and in some cases even prevent symptoms, knowledge of pathogenic variants is highly actionable. Here, we use a yeast-based complementation assay to measure the functional impact of 1,914 amino acid substitutions in human PSAT, ~88% of all unique SNV-accessible missense variants. Our assay scores agree well with known biological features of the enzyme and existing clinical annotations, supporting its use as functional evidence for variant interpretation. We then extend this approach to assay a subset of pairwise PSAT1 allele combinations in yeast diploids. Results from our diploid assay successfully distinguish patient genotypes from those of healthy carriers and agree well with disease severity. Additionally, we develop a linear model that uses individual allele measurements (in haploid yeast cells) to accurately predict the biallelic function (in diploid yeast cells) of ~1.8 million allele combinations corresponding to potential human genotypes. Finally, we present a method that could be used to experimentally measure large numbers of variant combinations in yeast diploids. Taken together, our work provides an example of how large-scale functional assays in model systems can be powerfully applied in the study of rare disease and to inform future diagnostic efforts.
Learning to build, building to learn: Engineering constitutive promoters in plants
Advisor: Jennifer Nemhauser (Biology)
Synthetic biology offers tools to modify plants in an environment that is changing at a faster pace than can be matched by evolution alone. Some of the indispensable tools in the synthetic biology toolkit involve ways to regulate transcription strength and transcription pattern. By leveraging publicly available RNA-seq atlases, we were able to identify a set of some of the most stably expressed genes in the genome of the reference plant Arabidopsis thaliana. We evaluated these promoter parts in transient assays in Nicotiana benthamiana and in stable transgenic lines of Arabidopsis. To provide additional functionality to these natural promoters, we introduced gRNA-target sites recognized by a dCas9-repressor construct to turn these constitutive promoters into repressible NOR logic gates in N. benthamiana. To explore the fundamental design rules behind constitutive promoters, we did an in silico experiment that expanded the RNA-seq atlas pipeline to multiple angiosperm species. Comparisons between core promoter architectures and gene expression stability revealed potential differences in core promoter usage in monocots and eudicots. Furthermore, evaluating groups of evolutionarily related promoters across species found a lack of strong evolutionary preference for core promoter types for expression stability. To improve upon the repression aspect of transcriptional regulation, we used machine learning models to predict and optimize a short alpha helical repression domain, and identified potential key residues that contribute to repression. Taken together, this work contributes to our ability to engineer transcriptional regulation in plants by providing a new set of tools, as well as revealing design rules behind both gene expression pattern and repression.