GridFTP

GridFTP is an extension of the standard File Transfer Protocol (FTP) for use with Grid computing [Taylor, Ian J. From P2P to Web Services and Grids - Peers in a Client/Server World. Springer, 2005]. It is defined as part of the Globus Toolkit, under the organisation of the Global Grid Forum (specifically, by the GridFTP working group).

The aim of GridFTP is to provide more reliable, higher-performance file transfer for Grid computing applications. Grid computing places heavier demands on data transmission than ordinary use: very large files frequently have to be moved, and moved quickly and reliably.

GridFTP also addresses the problem of incompatibility between storage and access systems. Previously, each data provider would make its data available in its own specific way, providing a library of access functions. This made it difficult to obtain data from multiple sources: each source required a different access method, which divided the total available data into partitions. GridFTP provides a uniform way of accessing the data, encompassing functions from all the different modes of access, building on and extending the universally accepted FTP standard. FTP was chosen as the basis because of its widespread use, and because it has a well-defined architecture for extensions to the protocol (which may be dynamically discovered).
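
As an illustration of dynamic discovery, standard FTP lets a client ask a server which extensions it supports with the FEAT command (RFC 2389); the reply lists one feature per line, each indented by a single space. The sketch below parses such a reply. The sample reply text and the function name are made up for illustration, and a GridFTP server would be expected to advertise its own extensions in the same way.

```python
def parse_feat_reply(reply):
    """Return {feature_name: parameter_string} from a multi-line FEAT reply."""
    features = {}
    for line in reply.splitlines():
        # Feature lines start with a space; the first and last lines carry
        # the 211 reply code and are skipped.
        if not line.startswith(" "):
            continue
        name, _, params = line.strip().partition(" ")
        features[name.upper()] = params
    return features


# Hypothetical reply, not captured from a real server.
SAMPLE_REPLY = "\n".join([
    "211-Extensions supported:",
    " SIZE",
    " MDTM",
    " REST STREAM",
    "211 End",
])

features = parse_feat_reply(SAMPLE_REPLY)
print(sorted(features))
```

A client can then check `"REST" in features` before relying on an optional command, instead of assuming that every server implements it.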

Features of GridFTP

GridFTP is useful for a number of reasons, including faster transfer and built-in security. It achieves this through the following extensions to normal FTP [What is GridFTP? http://it-dep-fio-ds.web.cern.ch/it-dep-fio-ds/Documentation/gridftp.asp].

Security with GSI

GSI (Grid Security Infrastructure) is another part of the Globus Toolkit. It provides authentication and encryption for file transfers, with user-specified levels of confidentiality and data integrity. FTP itself is inherently insecure: it sends data unencrypted and is thus open to packet sniffing and eavesdropping, and it has traditionally relied on external mechanisms such as SSH and SSL for security.

Third party transfers

A useful feature of FTP is that it allows a local client to initiate a transfer between two remote servers. GridFTP builds on this, and adds security and authentication for the local initiator.
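
In plain FTP, a third-party transfer is commonly arranged by putting one server into passive mode, passing the address it reports to the other server in a PORT command, and then issuing STOR and RETR on the two control connections. The following is a minimal sketch of that command sequence only; it opens no connections, the function names are hypothetical, and the GSI authentication that GridFTP adds is not shown.

```python
import re


def pasv_reply_to_port_command(reply):
    """Turn a '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)' reply into a PORT command."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    return "PORT " + ",".join(match.groups())


def plan_third_party_transfer(pasv_reply, source_path, dest_path):
    """List (server, command) pairs the client sends on its two control connections."""
    return [
        ("destination", "PASV"),  # destination listens for data (this produced pasv_reply)
        ("source", pasv_reply_to_port_command(pasv_reply)),  # source connects to it
        ("destination", f"STOR {dest_path}"),
        ("source", f"RETR {source_path}"),
    ]


# Hypothetical reply and paths, for illustration only.
plan = plan_third_party_transfer(
    "227 Entering Passive Mode (192,168,1,2,19,137)", "/data/run1.dat", "/archive/run1.dat"
)
for server, command in plan:
    print(f"{server}: {command}")
```

The file contents travel directly between the two servers; the client only sees the control-channel replies.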

Parallel and striped transfer

GridFTP achieves much greater use of bandwidth by allowing multiple simultaneous TCP streams. Files can be downloaded in pieces simultaneously from multiple sources, or in separate parallel streams from the same source, which still makes better use of the available bandwidth. Striped and interleaved transfers, again from either multiple or single sources, allow further speed increases.
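
The two layouts can be sketched as simple byte-range arithmetic. The code below is an illustration under assumed conventions, not GridFTP's actual wire format: `split_ranges` gives each parallel stream one contiguous piece of the file, while `stripe_blocks` deals fixed-size blocks to the sources in round-robin order. Both function names are hypothetical.

```python
def split_ranges(size, streams):
    """Divide `size` bytes into contiguous (offset, length) pieces, one per stream."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder over the first streams
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def stripe_blocks(size, block_size, sources):
    """Assign fixed-size blocks to sources round-robin; returns one list per source."""
    assignment = [[] for _ in range(sources)]
    for index, offset in enumerate(range(0, size, block_size)):
        length = min(block_size, size - offset)  # the last block may be short
        assignment[index % sources].append((offset, length))
    return assignment


print(split_ranges(10, 3))
print(stripe_blocks(10, 4, 2))
```

In both cases every byte is covered exactly once, so the receiver can write each piece at its offset without waiting for the others.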

Partial file transfer

Although FTP has the ability to resume an interrupted file transfer from a specific point in a file, it does not support the transmission of only a certain portion of a file. GridFTP allows a subset of a file to be sent. Such a feature is useful in applications where only small sections of a very large data file are required for processing (a motivating example being the processing of data from a high energy physics experiment, a traditional use of Grid technology).
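
The difference between resuming and partial transfer can be shown on an in-memory stand-in for a remote file. This is only a local sketch: no FTP or GridFTP commands are issued, and the data layout and function names are invented for the example.

```python
import io


def read_from_restart_point(fileobj, offset):
    """Resume-style access: everything from `offset` to the end of the file."""
    fileobj.seek(offset)
    return fileobj.read()


def read_partial(fileobj, offset, length):
    """Partial-transfer-style access: exactly `length` bytes starting at `offset`."""
    fileobj.seek(offset)
    return fileobj.read(length)


# Hypothetical data file holding a header, two records and a trailer.
remote_file = io.BytesIO(b"HEADER|event-1|event-2|TRAILER")

print(read_from_restart_point(remote_file, 15))
print(read_partial(remote_file, 15, 7))
```

With resume-only semantics the client receives the rest of the file whether or not it needs it; with a bounded range it receives only the one record.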

Fault tolerance and restart

GridFTP provides a fault tolerant implementation of FTP, to handle network unavailability and server problems. Transfers can also be automatically restarted if a problem occurs.
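
One way to picture automatic restart is a loop that records how many bytes have arrived safely and, after a failure, asks again from that offset rather than from the beginning. The sketch below simulates this against a hypothetical in-memory source that drops the connection once; it does not use any GridFTP library, and all names are made up.

```python
class FlakySource:
    """Hypothetical stand-in for a remote file that fails once mid-transfer."""

    def __init__(self, data, fail_at):
        self.data = data
        self.fail_at = fail_at
        self.failed = False

    def read(self, offset, length):
        # Fail the first time a request covers the byte at `fail_at`.
        if not self.failed and offset <= self.fail_at < offset + length:
            self.failed = True
            raise ConnectionError("simulated network drop")
        return self.data[offset:offset + length]


def transfer_with_restart(source, size, chunk=4, max_retries=3):
    """Fetch `size` bytes in chunks, restarting from the last byte received on failure."""
    received = bytearray()
    retries = 0
    while len(received) < size:
        try:
            received += source.read(len(received), min(chunk, size - len(received)))
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            # Loop again: len(received) is the restart offset.
    return bytes(received), retries


payload = b"0123456789abcdef"
data, retries = transfer_with_restart(FlakySource(payload, fail_at=9), len(payload))
print(data == payload, retries)
```

Only the chunk that was in flight when the failure occurred is requested again; the bytes already received are kept.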

Automatic TCP optimisation

The underlying TCP connection in FTP has numerous settings, such as window size and buffer size. GridFTP allows automatic (or manual) negotiation of these settings to provide optimal transfer speeds and reliability (the best settings for a few large files are likely to differ from those for large groups of files).
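
A common starting point for sizing a TCP buffer is the bandwidth-delay product: the number of bytes that must be in flight to keep a link of a given speed and round-trip time full. The sketch below computes it; dividing the total evenly across parallel streams is a rule-of-thumb assumption rather than something GridFTP prescribes, and the function names and example figures are hypothetical.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_ms):
    """Bytes in flight needed to fill a link of `bandwidth_bps` bits/s with `rtt_ms` round trip."""
    # bits/s * ms -> bytes: divide by 8 (bits per byte) and by 1000 (ms per second).
    return bandwidth_bps * rtt_ms // 8000


def per_stream_buffer_bytes(bandwidth_bps, rtt_ms, streams=1):
    """Assumed rule of thumb: split the total window evenly across streams, rounding up."""
    total = bandwidth_delay_product_bytes(bandwidth_bps, rtt_ms)
    return -(-total // streams)  # ceiling division


# Example figures: a 1 Gbit/s path with a 100 ms round-trip time.
print(bandwidth_delay_product_bytes(1_000_000_000, 100))
print(per_stream_buffer_bytes(1_000_000_000, 100, streams=4))
```

On long, fast paths the result is far larger than typical default buffer sizes, which is why tuning these settings matters for wide-area transfers.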

External links

* http://globus.org/toolkit/docs/2.4/datagrid/deliverables/C2WPdraft3.pdf
* http://globus.org/toolkit/docs/3.2/gridftp/


Wikimedia Foundation. 2010.
