Tuesday, September 29, 2009

Safe and Effective Fine-grained TCP Retransmission for Datacenter Communication


The paper focuses on incast, which is observed for synchronized reads and writes. Incast is a communication pattern that occurs when many senders, upon receiving a request from a client, simultaneously transmit a large amount of data over a bottleneck link. The packets overfill the buffer on the client's switch port, causing many losses, and the client has to wait at least 200 ms (the default minimum TCP retransmission timeout, RTO) before it receives the whole response. This reduces the throughput seen by the application to 1-10% of the link's bandwidth capacity.
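A rough back-of-the-envelope check of that collapse, as a sketch only. It assumes a 1 MB block on a 1 Gbps link (the workload figures quoted in this summary) and a deliberately simplified model in which a single timeout stalls an otherwise ideal transfer. The function name `goodput_fraction` and the one-stall model are illustrative assumptions, not something taken from the paper.

```python
def goodput_fraction(block_bytes, link_bps, stall_s):
    """Fraction of link capacity achieved when one stall of `stall_s`
    seconds is added to an otherwise ideal transfer (simplified model)."""
    ideal_s = block_bytes * 8 / link_bps  # time to send the block at line rate
    return ideal_s / (ideal_s + stall_s)


BLOCK_BYTES = 1_000_000      # 1 MB block requested by the client
LINK_BPS = 1_000_000_000     # 1 Gbps bottleneck link

# One stall at the default 200 ms minimum RTO vs. a hypothetical 1 ms RTO.
frac_200ms = goodput_fraction(BLOCK_BYTES, LINK_BPS, 0.200)
frac_1ms = goodput_fraction(BLOCK_BYTES, LINK_BPS, 0.001)

print(f"200 ms stall: {frac_200ms:.1%} of link capacity")
print(f"  1 ms stall: {frac_1ms:.1%} of link capacity")
```

Under these assumptions a single 200 ms timeout already pushes goodput into the single-digit percent range, while a timeout close to the transfer time costs far less.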



Other work offered application-specific mechanisms, while this paper provides a TCP-level solution, as follows:

· Reducing the minimum RTO: the workload involves a switch with 32 KB of output buffer per port and more than 10 simultaneous servers, with RTT = 100 µs and link capacity = 1 Gbps, where the client requests a 1 MB data block. The authors found that maximum effectiveness is achieved by bringing the minimum RTO close to the network's RTT.
· For a datacenter with a 10 Gbps network and 10 µs port-to-port latency, avoiding throughput collapse and idle links therefore requires an even smaller RTO. The authors analyzed the behavior of a 10 Gbps Ethernet network with a reduced RTT of 20-100 µs and a block size of 80 MB, and scaled the number of servers into the thousands. They found that removing the lower bound on RTO improves performance only up to 512 servers; for larger numbers of servers, application throughput still dropped by 50% because of synchronized retransmissions and successive timeouts.
· Therefore they add some randomization to the RTO timer to desynchronize the retransmitting flows.
· They also examined what fine-grained TCP retransmission means for wide-area flows with different RTTs and RTOs. They verified that eliminating the minimum bound on RTO and enabling retransmission timers with microsecond granularity can avoid incast.
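The points above can be sketched as a timeout calculation. This is a minimal illustration under stated assumptions, not the authors' implementation: the base timeout uses the standard TCP estimate (smoothed RTT plus four times the RTT variance), the jitter is a uniform random fraction of up to half the base timeout, and the function name `retransmission_timeout` and its parameters are hypothetical.

```python
import random


def retransmission_timeout(srtt, rttvar, min_rto=0.0, backoff=0, rng=random):
    """Return a retransmission timeout in seconds.

    srtt, rttvar: smoothed RTT and RTT variance, in seconds.
    min_rto:      lower bound on the timeout (0.0 removes the bound).
    backoff:      number of consecutive timeouts so far (exponential backoff).
    rng:          source of randomness; assumed to expose uniform(a, b).
    """
    # Standard estimate, clamped to the configured minimum.
    base = max(srtt + 4 * rttvar, min_rto)
    # Assumed jitter: up to 50% of the base timeout, so that flows which
    # lost packets at the same instant do not all retransmit together.
    jitter = rng.uniform(0.0, 0.5) * base
    return (base + jitter) * (2 ** backoff)


rng = random.Random(0)  # seeded only to make the sketch repeatable

# Same measured RTT (100 us, variance 10 us), with and without the 200 ms bound.
coarse = retransmission_timeout(100e-6, 10e-6, min_rto=0.200, rng=rng)
fine = retransmission_timeout(100e-6, 10e-6, min_rto=0.0, rng=rng)

# Eight senders that time out together now pick different retransmit times.
spread = [retransmission_timeout(100e-6, 10e-6, rng=rng) for _ in range(8)]

print(f"with 200 ms bound: {coarse * 1e3:.1f} ms")
print(f"without bound:     {fine * 1e6:.1f} us")
print("jittered timeouts (us):", [round(t * 1e6, 1) for t in spread])
```

With the bound removed, the timeout tracks the measured RTT in microseconds instead of sitting at 200 ms, and the jitter spreads out retransmissions that would otherwise collide again at the same switch port.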
