Acknowledgments

When I was younger I somehow found myself constantly going back and forth trying to figure out my favorite subject in science class: biology, chemistry, or physics. This dissertation is the culmination of that debate, born out of a desire to integrate all my interests, and is a dream come true.

This thesis would not have been possible without the help of many mentors, family members, friends, and colleagues. There are too many people to thank; if you are reading this, know that I am grateful for your support and contributions.

Primary thanks go to my mentor Gregory Bowman for pushing me to be the best scientist I can be. Greg gave me the freedom to explore the boundaries of science and the mentoring to grow both as a person and as a scientist. It is fun to reminisce about how we first met when the lab was still covered in cardboard, and the wild journey we’ve been on since. My present and future success is in large part because of him.

Science can feel like a solo endeavor, but I have been lucky enough to be surrounded by lab members, both past and present, who made me feel like part of a team. I am grateful for the time I got to spend with Xianqiang (Leos) Sun, with whom I worked closely on the G protein projects and from whom I learned much about computational biophysics. I also want to thank Katie Hart and Chris Ho, whose advice and guidance during their time in the lab were invaluable. A special thanks goes to Thomas Frederick, for being both an incredible colleague and a wonderful friend. Spending time after hours reading papers or arguing about data was a perfect blend of fun and insight. It has also been a joy working with Neha Vithani on subsequent G protein projects. Neha has been a fount of positivity and support for me, providing both personal and scientific advice on all occasions. I am forever grateful for her patience with me pinging her with a tidal wave of questions. Working alongside friends like Matthew Cruz and Catie Knoverek has been an incredibly rewarding experience; their unconditional love and support kept me going when I struggled, and getting to spend time with them, both in the lab and outside, has brought me nothing but joy. It has been a pleasure to grow and learn alongside Maxwell Zimmerman, who joined the lab at the same time I did, and so became my “brother” in the lab. I am grateful to Justin Porter and Mickey Ward for all our wonderful discussions that would range from science to the news to economic systems. I enjoyed spending time with newer members of the lab, Artur Meller, Upasana Mallimadugula, Jonathan Borowsky, and Catherine Kuhn, who are doing incredible work and have taught me much in a short time. Lastly, I wish to thank my two “Bay buddies” from when I first started in the lab, Carrie Sibbald and subsequently Katie Moeder. Every conversation with either of you was an enriching experience and made me a better scientist (and arguably a better person).

I want to thank my thesis committee (Linda Pike, Ken Blumer, Jay Ponder, Rohit Pappu, and Janice Robertson) for their mentorship and advice throughout my graduate career. They, along with other faculty members in my department, served as sources of mentorship and insight into a life in science and research. I particularly want to thank Ken Blumer and Jay Ponder, who over multiple conversations gave me tons of advice about an academic career and my career trajectory. Jay, along with Garland Marshall, has been a long-time supporter of mine since my undergraduate years, and I will forever be grateful for his support and mentorship.

I am grateful to have been a part of the Folding@home (F@h) consortium. Getting to discuss science and F@h logistics with PIs and fellow F@h scientists has been an incredibly enriching experience. I learned so much about managing an organization like F@h thanks to Anton Thynell, and about engineering a network of this scale thanks to Joseph Coffland. A big shout-out goes to all the testers, citizen-scientists, and volunteers who have been a part of Folding@home. Without you, this thesis would literally not have been possible. Discussing science and troubleshooting simulations and projects with testers on Slack and our forums made me feel like they were a part of the lab.

This thesis was possible thanks to support, financial and otherwise, from a variety of sources. I want to thank the National Institutes of Health, the Division of Biological and Biomedical Sciences (DBBS), and Millipore-Sigma for providing much of my graduate funding. Also, a quick shout-out to NVIDIA for providing a GTX Titan X to the lab, which was useful for preliminary simulations. I also want to thank the Biochemistry and Molecular Biophysics (BMB) department for the opportunities to be involved in department events through my time on the Student Liaison Committee, and for all the free coffee (which directly led to this thesis). A big thank-you to the admin folks in the BMB department, the DBBS program, and the OISS office. They helped my graduate school experience go smoothly and without any major hitches. They have been nothing but helpful and provided resources at every turn.

Having a support network is critical to surviving graduate school, and I would not have been able to complete this thesis without the support of my friends. To Alex Bernstein, Daniel Deibler, Scott Haber, Jack Reidy, and Sid Ravishankar: Thank you for all the years of listening to me talk about science, your unending support when I struggled, and all the endless fun we’ve had over the years. Thanks to my St. Louis D&D group (Hannah, Jeffrey, Shawn, Anthony, Charley, and River) for all the fun times over the years. A shout-out goes to Becky Ye for all the fulfilling conversations, and to my friends Jake Lyonfields and Ashley Kuykendall for their support and advocacy; I am a better person because of all that you opened my eyes to. Lastly, a big shout-out goes to friends and colleagues in my graduate program (Tyson, Robb, Kacey, Jim, Josh, and Jessey) and beyond (Joseph H., Matt H., Rafal W.) for the years of great conversations, fun discussions, and incredible memories.

I want to thank the teachers I have had from Singapore American School, American Embassy School (New Delhi), and my undergraduate years at Washington University. I would not be here without the education and direction you provided me. Thank you to Ms. Jain, Ms. Sosa, Mr. Brakenhoff, Mr. Ortiz, and many others who taught me through the years.

I am grateful to my mom and dad for literally making this thesis possible, and for showing me a world of different cultures and opening my mind to new perspectives. A big shout-out goes to my cousins Kshitij (Conny) and Kapun, who have been nothing but loving and supportive since the day I was born. Final thanks go to my partner Sophia Fox-Dichter for her unending love and support. This thesis is as much her effort as it is mine.

Bonus thanks go to Coco the dog. You have put more into this thesis than you will ever comprehend. You are a good girl.

Sukrit Singh
Washington University in St. Louis
January 2021

Dedicated to all the immigrants out there. You’re getting the job done.

ABSTRACT OF THE DISSERTATION
Understanding and exploiting protein allostery and dynamics using molecular simulations
by
Sukrit Singh
Doctor of Philosophy in Computational and Molecular Biophysics
Washington University in St. Louis, 2021
Professor Gregory R. Bowman, Chair

Protein conformational landscapes contain much of the functionally relevant information needed to understand biological processes at the chemical scale. Understanding and mapping out these conformational landscapes can provide valuable insight into protein behaviors and biological phenomena, and is relevant to therapeutic design.

While structural biology methods have been transformative in studying protein dynamics, they are constrained by technical challenges and have inherent resolution limits. Molecular dynamics (MD) simulations are a powerful tool for exploring conformational landscapes, and provide atomic-scale information that is useful in understanding protein behaviors. With recent advances in generating large datasets of long-timescale simulations (using Folding@home) and powerful methods to interpret conformational landscapes, such as Markov State Models (MSMs), it is now possible to study complex biological phenomena and long-timescale processes. However, inferring communication between residues across long distances, referred to as allosteric communication, remains a challenge.

Allostery is a ubiquitous biological phenomenon by which two distant regions of a protein are coupled to one another. Allosteric coupling is the mechanism through which events in one region (such as ligand binding) alter the conformation or dynamics of another region (e.g., large-scale domain motions). For example, allostery plays a critical role in cellular signaling, such as in the transfer of a signal from outside the cell to cytosolic proteins for generating a cellular response.

While many methods have made tremendous progress in inferring and measuring allosteric communication using structures or molecular simulations, they rely on a structural view of allostery and do not account for the role of conformational entropy. Furthermore, it remains a challenge to interpret allosteric coupling in large, complex biomolecules relevant to physiology and disease.

In this thesis, I present a method to measure the Correlation of All Rotameric and Dynamical States (CARDS), which is used to construct and interpret allosteric networks in biological systems. CARDS allows us to infer allostery via both concerted changes in protein structure and correlated changes in conformational entropy (dynamic allostery). CARDS does so by parsing trajectories into dynamical states that reflect whether a residue is locally ordered (i.e., stable in a single rotameric basin) or disordered (i.e., rapidly hopping between rotamers).

Here I explain the CARDS methodology (chapter 2) and demonstrate applications to a variety of disease-relevant systems. In particular, I apply CARDS and other sophisticated computational methods to understand the activation of G proteins (chapter 3), whose mutations are linked to cancers such as uveal melanoma. I further demonstrate the utility of CARDS in the study of a potentially druggable pocket in the ebolavirus protein VP35 (chapter 4). The analyses and models constructed in this work are supported by experimental testing. Lastly, I demonstrate how integrating MD with experiments, sometimes with the help of citizen-scientists around the world, can provide unique insight into biological systems and identify potentially useful targets. In particular, I highlight our recent effort converting Folding@home into an exascale computing platform to hunt for potentially druggable pockets in the proteome of SARS-CoV-2, the cause of the COVID-19 pandemic (chapter 7).