Many-task computing

Many-task computing (MTC)[1][2][3][4][5][6][7] aims to bridge the gap between two computing paradigms: high-throughput computing (HTC)[8] and high-performance computing (HPC).


MTC is reminiscent of HTC, but differs in its emphasis on using many computing resources over short periods of time to accomplish many computational tasks (both dependent and independent), where the primary metrics are measured in seconds (e.g. FLOPS, tasks/s, MB/s I/O rates) rather than in operations (e.g. jobs) per month. MTC denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. MTC includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the standard Message Passing Interface (MPI) commonly found in HPC, drawing attention to the many computations that are heterogeneous but not "happily" parallel.
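As a rough illustration of the tasks/s metric, the following minimal sketch runs many short independent tasks on a local thread pool and reports the achieved throughput. The names short_task and run_many_tasks, the task body, and the default sizes are assumptions made for this example. They are not part of any MTC system, and a real deployment would dispatch tasks across many nodes rather than threads in one process.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def short_task(n):
    """Hypothetical short task: sum of the squares of 0..n-1."""
    return sum(i * i for i in range(n))


def run_many_tasks(num_tasks=1000, workers=8, work_size=1000):
    """Run num_tasks short tasks and return (results, tasks per second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results[i] belongs to task i.
        results = list(pool.map(short_task, [work_size] * num_tasks))
    elapsed = time.perf_counter() - start
    return results, num_tasks / elapsed


if __name__ == "__main__":
    results, rate = run_many_tasks()
    print(f"completed {len(results)} tasks at {rate:.0f} tasks/s")
```

The measured rate depends on the machine and on per-task dispatch overhead. With tasks this short, the dispatch overhead tends to dominate, which is the situation the tasks/s metric is meant to expose.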

There is more to HPC than tightly coupled MPI, and more to HTC than embarrassingly parallel, long-running jobs. Like science itself, applications are becoming increasingly complex, opening many opportunities to apply HPC in new ways if we broaden our perspective. Some applications have so many simple tasks that managing them is hard. Applications that operate on or produce large amounts of data need sophisticated data management in order to scale. Some applications involve many tasks, each composed of tightly coupled MPI tasks. Loosely coupled applications often have dependencies among tasks and typically use files for inter-process communication. Efficient support for these kinds of applications on existing large-scale systems involves substantial technical challenges and will have a big impact on science.
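The file-based coupling described above can be sketched as a small workflow in which each task reads the output files of the tasks it depends on and writes its own output file. This is only an illustrative sketch: run_task, run_workflow, the ".out" file naming, and the integer payloads are all hypothetical choices for this example, not the interface of any of the systems discussed here.

```python
import tempfile
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from pathlib import Path


def run_task(workdir, name, inputs, func):
    """Read each input file, apply func, and write the result to <name>.out."""
    values = [int((workdir / f"{dep}.out").read_text()) for dep in inputs]
    (workdir / f"{name}.out").write_text(str(func(values)))
    return name


def run_workflow(tasks, workdir, workers=4):
    """Run tasks in dependency order; tasks maps name -> (inputs, func)."""
    done, running = set(), {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while len(done) < len(tasks):
            # Submit every task whose dependencies have all finished.
            for name, (inputs, func) in tasks.items():
                ready = all(dep in done for dep in inputs)
                if ready and name not in done and name not in running.values():
                    future = pool.submit(run_task, workdir, name, inputs, func)
                    running[future] = name
            if not running:
                raise ValueError("cyclic or unsatisfiable dependencies")
            finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in finished:
                done.add(future.result())
                del running[future]
    return {name: int((workdir / f"{name}.out").read_text()) for name in tasks}


if __name__ == "__main__":
    # Hypothetical four-task workflow: c depends on a and b, d depends on c.
    demo = {
        "a": ([], lambda values: 2),
        "b": ([], lambda values: 3),
        "c": (["a", "b"], sum),
        "d": (["c"], lambda values: values[0] * 10),
    }
    with tempfile.TemporaryDirectory() as tmp:
        print(run_workflow(demo, Path(tmp)))
```

Independent tasks (a and b here) can run concurrently, while dependent tasks wait until the files they need exist. On a large system the shared file system, rather than the scheduler, often becomes the bottleneck for this style of communication.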

Related Areas

Some related areas are multiple program multiple data (MPMD), high-throughput computing (HTC), workflows, capacity computing, and embarrassingly parallel computing. Some projects that could support MTC workloads are Condor,[9] MapReduce,[10] Hadoop,[11] BOINC,[12] Cobalt HTC-mode,[13] Falkon,[14] and Swift.[15]


  1. IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08), 2008.
  2. ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS09), 2009.
  3. IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS10), 2010.
  4. ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS11), 2011.
  5. IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing, June 2011.
  6. I. Raicu, I. Foster, Y. Zhao. "Many-Task Computing for Grids and Supercomputers," IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08), 2008.
  7. "Many Task Computing: Bridging the performance-throughput gap," International Science Grid This Week (iSGTW), January 28, 2009.
  8. M. Livny, J. Basney, R. Raman, T. Tannenbaum. "Mechanisms for High Throughput Computing," SPEEDUP Journal 1(1), 1997.
  9. D. Thain, T. Tannenbaum, M. Livny. "Distributed Computing in Practice: The Condor Experience," Concurrency and Computation: Practice and Experience 17(2-4), pp. 323-356, 2005.
  10. J. Dean, S. Ghemawat. "MapReduce: Simplified Data Processing on Large Clusters," OSDI, 2004.
  11. A. Bialecki, M. Cafarella, D. Cutting, O. O'Malley. "Hadoop: A Framework for Running Applications on Large Clusters Built of Commodity Hardware," 2005.
  12. D.P. Anderson. "BOINC: A System for Public-Resource Computing and Storage," IEEE/ACM International Workshop on Grid Computing, 2004.
  13. IBM Corporation. "High-Throughput Computing (HTC) Paradigm," IBM System Blue Gene Solution: Blue Gene/P Application Development, IBM RedBooks, 2008.
  14. I. Raicu, Y. Zhao, C. Dumitrescu, I. Foster, M. Wilde. "Falkon: A Fast and Lightweight Task Execution Framework," IEEE/ACM SC, 2007.
  15. Y. Zhao, M. Hategan, B. Clifford, I. Foster, G. Laszewski, I. Raicu, T. Stef-Praun, M. Wilde. "Swift: Fast, Reliable, Loosely Coupled Parallel Computation," IEEE SWF, 2007.

Wikimedia Foundation. 2010.
