Replicating S3 data between datacenters
For data replication between datacenters, S3 users can use either S3 geo-replication or cross-region replication.
S3 geo-replication
- Geo-replication is designed to improve the distribution of data across geographically distributed data networks.
- With geo-replication, you can replicate data only to other Virtuozzo Hybrid Infrastructure S3 storages, up to 4 sites.
- During geo-replication, a bucket is copied to the destination S3 storage along with its access keys, policies, and properties.
- In a multisite deployment, S3 nodes send their local data asynchronously to other sites, which makes bucket operations eventually consistent. Eventual consistency does not guarantee that a read returns the new state immediately after a write has completed. It hides communication latency on writes, at the cost of readers possibly observing an older state.
- You can enable S3 geo-replication in the admin panel.
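Because geo-replicated sites are eventually consistent, a client that writes to one site and reads from another may need to tolerate stale reads for a short time. The following is a minimal Python sketch of one way to do that: poll the read until the expected state appears or a timeout expires. The names `wait_until_visible` and `LaggingReplica` are hypothetical and are not part of the product. `LaggingReplica` is only a toy stand-in for a remote site, and in practice `read` would wrap an S3 GET against the destination site.

```python
import time
from typing import Callable, Optional


def wait_until_visible(
    read: Callable[[], Optional[bytes]],
    expected: bytes,
    timeout_s: float = 5.0,
    interval_s: float = 0.1,
) -> bool:
    """Poll `read` until it returns `expected` or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class LaggingReplica:
    """Toy remote site (hypothetical): serves old data for the first `lag_reads` reads."""

    def __init__(self, old: bytes, new: bytes, lag_reads: int) -> None:
        self.old = old
        self.new = new
        self.remaining = lag_reads

    def read(self) -> bytes:
        # Simulate replication lag: return the stale state until the lag is used up.
        if self.remaining > 0:
            self.remaining -= 1
            return self.old
        return self.new


if __name__ == "__main__":
    replica = LaggingReplica(b"v1", b"v2", lag_reads=3)
    visible = wait_until_visible(replica.read, b"v2", timeout_s=2.0, interval_s=0.01)
    print("new state visible:", visible)
```

The timeout matters: if replication is delayed or stopped, the caller gets a clear `False` instead of waiting indefinitely.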
Cross-region replication (CRR)
- CRR is an S3 replication mechanism similar to the one in Amazon S3.
- CRR is used to copy objects asynchronously between S3 buckets stored in different clusters or hosted by public cloud providers.
- During CRR, only objects from a bucket are copied to another bucket on the destination S3 storage. The access keys, policies, and bucket properties are not replicated.
- CRR allows more granular replication, for example, at the prefix level.
- With CRR, you can also replicate data to a third-party S3 storage.
- CRR replicates only newly written objects.
- You can enable CRR by using the Amazon S3-compatible API. For details, refer to the Object Storage Orchestration API Reference.
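As an illustration of prefix-level CRR through an S3-compatible API, the sketch below builds a replication configuration in the shape used by the Amazon S3 `PutBucketReplication` operation and applies it with the `boto3` client method `put_bucket_replication`. Everything specific here is an assumption: the endpoint URL, bucket names, credentials, and `Role` value are placeholders, and the helper names `build_replication_config` and `enable_crr` are hypothetical. Which fields your storage actually accepts or requires is described in the API reference, so treat this as a starting point rather than a verified procedure.

```python
from typing import Any, Dict


def build_replication_config(
    dest_bucket_arn: str,
    prefix: str,
    role: str = "placeholder-role",  # assumption: the value your storage expects may differ
    rule_id: str = "replicate-prefix",
) -> Dict[str, Any]:
    """Build a replication configuration that copies only objects under `prefix`."""
    return {
        "Role": role,
        "Rules": [
            {
                "ID": rule_id,
                "Priority": 1,
                "Status": "Enabled",
                # Prefix-level granularity: only keys starting with `prefix` match.
                "Filter": {"Prefix": prefix},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": dest_bucket_arn},
            }
        ],
    }


def enable_crr(
    endpoint_url: str,
    access_key: str,
    secret_key: str,
    source_bucket: str,
    config: Dict[str, Any],
) -> None:
    """Apply the configuration to the source bucket (requires the boto3 package)."""
    import boto3  # imported lazily so the config builder works without boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    client.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=config,
    )


if __name__ == "__main__":
    # Placeholder values only; replace them with your own before calling enable_crr().
    cfg = build_replication_config("arn:aws:s3:::backup-bucket", "logs/")
    print(cfg)
```

Keeping the configuration builder separate from the network call makes it easy to review or log the rule before applying it to a bucket.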
Limitations
- S3 geo-replication and CRR can be used on the same clusters for different buckets, but you cannot enable them together for the same bucket.