SAPPHIRE-based clustering

F. Cocina; A. Vitalis; A. Caflisch

Journal: J. Chem. Theory Comput.
Year: 2020
Volume: 16
Issue: 10
Pages: 6383–6396
DOI: 10.1021/acs.jctc.0c00604
Type of Publication: Journal Article

Algorithms; clustering; data analysis; Markov state models; molecular dynamics


Molecular dynamics simulations are a popular means to study biomolecules, but it is often difficult to gain insights from the trajectories due to their large size, in both time and the number of features. The SAPPHIRE (States And Pathways Projected with HIgh REsolution) plot allows a direct visual inference of the dominant states visited by high-dimensional systems and how they are interconnected in time. Here, we extend this visual inference into a clustering algorithm. Specifically, the automatic procedure derives from the SAPPHIRE plot states that are kinetically homogeneous, structurally annotated, and of tunable granularity. We provide a relative assessment of the kinetic fidelity of this SAPPHIRE-based partitioning in comparison to popular clustering methods. This assessment is carried out on trajectories of a toy model and two polypeptides. We conclude with an application of our approach to a recent 100-microsecond trajectory of the main protease of SARS-CoV-2.