Superlinear and Bandwidth Friendly Geo-replication for Store-and-forward Systems
To keep internet based services available despite inevitable local internet and power outages, their data must be replicated to one or more other sites. For most systems using the store-and-forward architecture, data loss can also be prevented by using end-to-end acknowledgements. So far we have not found any sufficiently good solutions for replication of data in store-and-forward systems without acknowledgements and with geographically separated system nodes. We therefore designed a new replication protocol, which could take advantage of the lack of a global order between the messages and the acceptance of a slightly higher risk for duplicated deliveries than existing protocols. We tested a proof-of-concept implementation of the protocol for throughput and latency in a controlled experiment using 7 nodes in 4 geographically separated areas, and observed the throughput increasing superlinearly with the number of nodes up to almost 3500 messages per second. It is also, to the best of our knowledge, the first replication protocol with a bandwidth usage that scales according to the number of nodes allowed to fail and not the total number of nodes in the system.