samah mustafa: November 2009

Originally Internet based on point-to0point communication abstraction, but many applications have emerged like multicast, anycast and host mobility that require more general abstraction and a layer of indirection that decouples the sending host from the receiving host mismatched with the original one. however, implementing these general abstraction at IP layer or application layer poses many difficulties. This paper portrays a single overlay network to serve as an indirection infrastructure i3 that combine the generality of IP layer solutions with the deployability of overlay solutions. It offers best effort service without reliability or ordered delivery.
i3 serves as a rendezvous-based communication abstraction where a single node in this network is responsible to store packets sent to all identifiers id's that agree in the first k bits. Receiver sends trigger (idt, addr) to indicate that all packets with longest prefix match id should be forwarded by the i3 infrastructure to the node identified by addr. i3 achieves general communication abstraction of
i) mobility by maintaining end-to-end connectivity even when both end points move simultaneously by simply updating the receiver's trigger to consider the new address
ii) multicast by simply having hosts maintain triggers with the same identifier
iii) Anycast by simply having hosts of anycast group maintain triggers which are identical in a prefix of length k bits. This also is helpful to encode application preferences through the use of m-k bits which can used to balance the load over the servers
Another generalization of i3 is to replace an identifier by identifier stack that can be a list of identifier or address such that a source can send a packet to a series of identifier. These type of generalization is helpful to transform the data by third part server or to pass the data through firewall before its delivering to the destination, to enabling heterogeneous receivers to subscribe to same multicast group.
i3 is implemented depending on Chord lookup protocol using circular identifier space with m=256bits and k=128 bits, such overlay

network is self configuring in that a node can join i3 infrastructure simply. i3 uses soft state approach to be robust against node failures depending on the packet routing, end-host's periodic refreshing to maintain their triggers and stack identifier. However, the routing in i3 overlay network is less efficient than routing via IP. Thus, the clients caches the i3 server's IP address and the receiver has to choose their triggers with an identifier that is located on nearby server.

The paper simulates i3's efficiency with randomly distributed identifier. The big challenge was to achieve efficient routing since the end-host doesn't have good control over the location of it's triggers. A ratio of inter node latency on i3 network to that on IP network is used to enhance the performance of i3.

one of the main challenges in Large P2P systems is the lookup problem without a centralized server. Many algorithms have been developed based on DHT interface while the algorithm must be able to locate the node that is responsible to store the data associated with a specific key. This requires that each node has to maintain IP address of neighbors to form an overlay network. DHT implementation includes that keys and node are identified using a number of m-bits, and each key is associated to data stored in node with ID close to that key. The paper discusses recent scalable algorithms with low latency and balanced distribution of keys among the nodes.

In Chord, each node has a finger table based on power of two data structure such that- a query for a key k is forwarded to the node in finger table with the highest ID not exceeding k and at least half of the remaining ID-space distance to k. the distance function is unidirectional and asymmetric. It depends on the numeric difference between two ID.
Pastry, Tapesry and Kademlia based on tree-like data structure. In Pastry and Tapesry, each node maintains a the location of some node with each prefix such that the distance function is symmetric but not unidirectional. The correctness requires that each node has a leaf set (nodes closest smaller and larger than the node). Pastry optimize the forwarding performance by maintaining a routing table that will be used to find a node has a longer shared prefix (than current node or at least equal) with the queried key.

Kademlia's distance function depends on XOR of two ID that means it is symmetric and unidirectional.

In CAN, each node is responsible for a rectangular zone and it maintains a routing table of the neighbors in coordinate space. The lookup is performed by forwarding the query in straight path roughly to the neighbor to the node storing the key.

Each of the aforementioned algorithms involves a procedure to adapt to node's arrival and departure with a a key difference in the data structure they use as a routing table. However, these P2P lookup algorithms have been analyzed under static conditions only. As it is obvious, they have many common and different aspects and they need further work to get more strength one.

samah mustafa

Monday, November 9, 2009

Internet Indirection Infrastructure

Monday, November 2, 2009

LOOKINGUP DATA in P2P SYSTEMs

Followers

Blog Archive