Combining task- and data parallelism to speed up protein folding on a desktop grid platform

Title: Combining task- and data parallelism to speed up protein folding on a desktop grid platform
Publication Type: Conference Paper
Year of Publication: 2003
Authors: Uk B., Taufer M., Stricker T., Settanni G., Cavalli A., Caflisch A.
Conference Name: Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2003)
Date Published: May 2003
Conference Location: Tokyo
ISBN Number: 0-7695-1919-9
Accession Number: 8064753
Keywords: best-first search, Chemistry, chemistry computing, clustered workers, computational CHARMM chemistry codes, Computational modeling, Costs, data parallelism, desktop grid platform, distributed systems, Grid computing, Internet, molecular dynamics simulations, Optimal scheduling, parallel processing, Parallel programming, Productivity, Protein Folding, Proteins, task-parallelism, tree searching, United Devices MetaProcessor software environment

The steady increase in computing power at ever lower cost enables molecular dynamics simulations to investigate the process of protein folding with an explicit treatment of water molecules. Such simulations are typically run with well-known computational chemistry codes such as CHARMM. Desktop grids such as the United Devices MetaProcessor are highly attractive platforms, since scavenging unused machines on intranets and the Internet delivers compute power that is almost free. However, the predominant programming paradigm for current desktop grids is pure task parallelism, which might not fit the needs of protein folding simulations with explicit water molecules. A short overall turnaround time for a simulation remains highly important for research productivity, but the need for an accurate model and long simulation time scales leads to tasks that are too large for optimal scheduling on a desktop grid. To address this problem, we introduce a combination of task and data parallelism as a well-suited computing paradigm for protein folding investigations on grid platforms. As a proof of concept, we design and implement a simple system for protein folding simulations based on the notion of combined task and data parallelism with clustered workers. Clustered workers are machines that are grouped into small clusters according to network and CPU performance criteria and that act as super-nodes within a desktop grid, permitting the use of data parallelism in addition to task parallelism. We integrate our new paradigm into the existing software environment of the United Devices MetaProcessor. For a test protein, we reach a better quality of the folding calculations than we reached using task parallelism alone on distributed systems.
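
The idea of clustered workers can be illustrated with a minimal sketch. It is not the authors' implementation and does not use any United Devices MetaProcessor API. The `Machine` record, the use of a shared subnet as the network criterion, the `min_mhz` CPU threshold, and the fixed cluster `size` are all assumptions chosen for illustration. Machines that meet the CPU threshold and share a subnet are grouped into fixed-size clusters that could act as super-nodes for data-parallel work; everything else stays available as a single worker for ordinary task-parallel work.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    subnet: str   # hypothetical stand-in for the network criterion
    cpu_mhz: int  # hypothetical stand-in for the CPU criterion


def cluster_workers(machines, size=4, min_mhz=1000):
    """Return (clusters, singles) for a list of machines.

    clusters: lists of exactly `size` machines from the same subnet,
              each machine at or above `min_mhz`.
    singles:  machines left over, usable for plain task parallelism.
    """
    by_subnet = defaultdict(list)
    singles = []
    for m in machines:
        if m.cpu_mhz >= min_mhz:
            by_subnet[m.subnet].append(m)
        else:
            singles.append(m)  # too slow to join a cluster

    clusters = []
    for subnet in sorted(by_subnet):
        # Fastest first, so members of one cluster have similar speed.
        group = sorted(by_subnet[subnet], key=lambda m: (-m.cpu_mhz, m.name))
        full = len(group) - len(group) % size
        for i in range(0, full, size):
            clusters.append(group[i:i + size])
        singles.extend(group[full:])  # remainder cannot fill a cluster
    return clusters, singles


if __name__ == "__main__":
    pool = [
        Machine("a1", "10.0.1", 2000),
        Machine("a2", "10.0.1", 1800),
        Machine("a3", "10.0.1", 2200),
        Machine("b1", "10.0.2", 2400),
        Machine("b2", "10.0.2", 800),
    ]
    clusters, singles = cluster_workers(pool, size=2)
    for c in clusters:
        print("cluster:", [m.name for m in c])
    print("singles:", sorted(m.name for m in singles))
```

In this sketch a scheduler would hand a large, data-parallel simulation task to a cluster and smaller independent tasks to the single workers. A real system would replace the subnet and clock-speed checks with measured network and CPU performance.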