The Cambridge eScience Centre

 

Multicast Transport for Grid Computing





Reliable high-speed bulk data delivery using IP multicast

Conference paper: Reliable high-speed Grid data delivery using IP multicast (PDF)

Recent multicast research has largely aimed at scalability to hundreds or thousands of receivers, with audio streaming over the Internet being a typical application. Within a grid environment, however, a typical application might involve bulk data transfer to just ten or twenty sites. Using multicast for this type of application can provide significant benefits, including reduced load on the transmitter, an overall reduction in network traffic, and consequently shorter data transfer times.
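The reduced transmitter load can be illustrated with a back-of-the-envelope calculation. The sketch below is only an idealised model: it ignores packet headers and retransmissions, and the 10 GB dataset size is a hypothetical figure chosen for illustration.

```python
def sender_bytes(payload_bytes: int, receivers: int, multicast: bool) -> int:
    """Bytes the transmitter puts on its first-hop link.

    Idealised: ignores packet headers and retransmissions.
    """
    # With multicast the sender emits one copy and the network replicates it;
    # with unicast the sender emits one copy per receiver.
    copies = 1 if multicast else receivers
    return payload_bytes * copies


payload = 10 * 10**9  # hypothetical 10 GB dataset

for sites in (10, 20):
    unicast = sender_bytes(payload, sites, multicast=False)
    mcast = sender_bytes(payload, sites, multicast=True)
    print(f"{sites} sites: unicast {unicast / 1e9:.0f} GB, "
          f"multicast {mcast / 1e9:.0f} GB")
```

Under these assumptions, the sender's outgoing volume grows linearly with the number of sites for unicast, but stays constant for multicast.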

In this project, we are investigating how multicast can be exploited within such an environment without requiring major changes to applications or underlying networks. We are building a user-space transport protocol that provides reliable multicast data transfers.

Today, applications use TCP for reliable unicast transfers. It is a mature and well-understood protocol. If TCP is modified to deliver data to more than one receiver at a time, and to use multicast when available, an application can transparently send data reliably to multiple recipients. The modified protocol relies on existing TCP mechanisms to ensure that data is delivered reliably. Multicast transmission is attempted for performance reasons, but fallback to unicast preserves backwards compatibility and reliability guarantees, and capitalizes on more than a decade of experience with TCP implementations.
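One way to picture a sender that serves several receivers is sketched below. This is an illustrative assumption, not the project's actual design: the class name, the group address 239.1.2.3, and the rule that the send window only advances to the lowest cumulative ACK across all receivers are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class MultiReceiverSender:
    """Hypothetical sender state for one TCP-like connection with several receivers."""
    receivers: set                 # every receiver address
    multicast_capable: set         # receivers believed reachable via the group
    group: str = "239.1.2.3"       # hypothetical multicast group address
    acked: dict = field(default_factory=dict)  # receiver -> cumulative ACK

    def __post_init__(self):
        # No receiver has acknowledged anything yet.
        self.acked = {r: 0 for r in self.receivers}

    def destinations(self):
        """One multicast send covers capable receivers; the rest get unicast."""
        capable = self.receivers & self.multicast_capable
        unicast = sorted(self.receivers - self.multicast_capable)
        return ([self.group] if capable else []) + unicast

    def fall_back(self, receiver):
        """Stop relying on multicast for a receiver; serve it by unicast."""
        self.multicast_capable.discard(receiver)

    def on_ack(self, receiver, ack_no):
        """Record a cumulative ACK, never moving backwards."""
        self.acked[receiver] = max(self.acked[receiver], ack_no)

    def send_base(self):
        """Data below this sequence number has reached every receiver."""
        return min(self.acked.values())


sender = MultiReceiverSender({"a", "b", "c"}, {"a", "b"})
print(sender.destinations())       # group address plus unicast to "c"
sender.on_ack("a", 1000)
sender.on_ack("b", 500)
sender.on_ack("c", 1000)
print(sender.send_base())          # limited by the slowest receiver
sender.fall_back("b")
print(sender.destinations())
```

In this sketch, reliability follows from treating a segment as delivered only once every receiver has acknowledged it, while the fallback path simply moves a receiver from the multicast set to the unicast list.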

Most previous work on reliable multicast has proposed new APIs and protocols that are not present outside of a lab environment. This has led to much theoretical analysis, but little practical deployment. In order to investigate the benefits that multicast can offer in an existing wide-area network such as a grid environment, a different approach is necessary.

Network protocols are typically implemented in the kernels of hosts and network devices. Any proposals that require modifications to these protocols imply changes to kernel code. This immediately restricts deployment opportunities. By limiting changes to code that runs in user-space on end stations, new protocols can be developed and tested on live networks.

TCP is a reliable end-to-end unicast transfer protocol implemented in host kernels. It is possible to modify TCP behaviour between end stations without any changes to intermediate devices; however, this requires kernel changes. If TCP is moved into user-space, changes can be made without modifying the kernel, but the user-space TCP implementation then needs some form of privileged access to send and receive packets directly on a host's network interfaces. While not as significant a barrier to widespread deployment as kernel changes, this privileged access requirement severely limits the ease with which new code can be widely tested.

One solution to this problem is to implement a modified multicast TCP over UDP. User-space applications can freely send and receive UDP packets, so a small shim layer can be introduced to encapsulate the TCP-like engine's packets in UDP and decapsulate them on receipt. While running in user-space has performance implications, the instant deployment potential of a user-space library, coupled with the scalability of multicast, means that any such limitations are more than acceptable.
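The shim idea can be sketched in a few lines using ordinary, unprivileged UDP sockets. The header layout below (sequence number, ACK number, flags, window) is a hypothetical stand-in for whatever the TCP-like engine emits, and the function names are illustrative only; the project's real packet format is not described here.

```python
import socket
import struct

# Hypothetical header for the TCP-like engine: 32-bit sequence number,
# 32-bit ACK number, 16-bit flags, 16-bit window, in network byte order.
HEADER = struct.Struct("!IIHH")


def encapsulate(seq, ack, flags, window, payload: bytes) -> bytes:
    """Wrap one TCP-like segment as the payload of a UDP datagram."""
    return HEADER.pack(seq, ack, flags, window) + payload


def decapsulate(datagram: bytes):
    """Recover the segment fields and data from a received UDP payload."""
    seq, ack, flags, window = HEADER.unpack_from(datagram)
    return seq, ack, flags, window, datagram[HEADER.size:]


def loopback_demo():
    """Send one encapsulated segment over loopback UDP; no privileges needed."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))   # let the OS pick a free port
        rx.settimeout(2.0)
        segment = encapsulate(1, 0, 0x18, 65535, b"hello")
        tx.sendto(segment, rx.getsockname())
        datagram, _ = rx.recvfrom(2048)
        return decapsulate(datagram)
    finally:
        tx.close()
        rx.close()


if __name__ == "__main__":
    print(loopback_demo())
```

Because only `SOCK_DGRAM` sockets are used, this kind of shim runs as an ordinary user process, which is the deployment advantage described above.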

Contact: Karl.Jeacle@cl.cam.ac.uk & Jon.Crowcroft@cl.cam.ac.uk

University of Cambridge Computer Laboratory


  



Comments to the webmaster. Contact: Tel (+44/0) 1223 764282, Fax (+44/0) 1223 765900