@@ -19,7 +19,7 @@ versions older than 2.6.0 are supported.
* **rsync-fetcher** - fetches the repository from the remote server, and uploads it to s3.
* **rsync-gateway** - serves the mirrored repository from s3 in **http** protocol.
* **rsync-gc** - periodically removes old versions of files from s3.
- * **rsync-fix-encoding** - see "Migrating from v0.2.11 to older versions" section.
+ * **rsync-migration** - see [Migration](#migration) section for more details.

## Example

@@ -28,67 +28,85 @@ versions older than 2.6.0 are supported.
$ RUST_LOG=info RUST_BACKTRACE=1 AWS_ACCESS_KEY_ID=<ID> AWS_SECRET_ACCESS_KEY=<KEY> \
rsync-fetcher \
--src rsync://upstream/path \
- --s3-url https://s3_api_endpoint --s3-region region --s3-bucket bucket --s3-prefix repo_name \
- --redis redis://localhost --redis-namespace repo_name \
- --repository repo_name
- --gateway-base http://localhost:8081/repo_name
+ --s3-url https://s3_api_endpoint --s3-region region --s3-bucket bucket --s3-prefix prefix \
+ --pg-url postgres://user@localhost/db \
+ --namespace repo_name
```
2. Serve the repository over HTTP.
```bash
$ cat > config.toml <<-EOF
bind = ["localhost:8081"]
+ s3_url = "https://s3_api_endpoint"
+ s3_region = "region"

[endpoints."out"]
- redis = "redis://localhost"
- redis_namespace = "test"
- s3_website = "http://localhost:8080/test/test-prefix"
+ namespace = "repo_name"
+ s3_bucket = "bucket"
+ s3_prefix = "prefix"

EOF

$ RUST_LOG=info RUST_BACKTRACE=1 rsync-gateway <optional config file>
```
-
- 3. GC old versions of files periodically.
+ 3. GC old versions of files manually.
```bash
$ RUST_LOG=info RUST_BACKTRACE=1 AWS_ACCESS_KEY_ID=<ID> AWS_SECRET_ACCESS_KEY=<KEY> \
rsync-gc \
--s3-url https://s3_api_endpoint --s3-region region --s3-bucket bucket --s3-prefix repo_name \
- --redis redis://localhost --redis-namespace repo_name \
- --keep 2
+ --pg-url postgres://user@localhost/db
```
- > It's recommended to keep at least 2 versions of files in case a gateway is still using an old revision.
+ > It's recommended to keep at least 2 revisions in case a gateway is still using an old revision.

## Design

File data and their metadata are stored separately.

### Data

- Files are stored in S3 storage, named by their blake2b-160 hash (`<namespace/<hash>`).
-
- Listing html pages are stored in `<namespace>/listing-<timestamp>/<path>/index.html`.
+ Files are stored in S3 storage, named by their blake2b-160 hash (`<prefix>/<namespace>/<hash>`).

### Metadata

- Metadata is stored in Redis for fast access.
+ Metadata is stored in Postgres.
+
+ An object is the smallest unit of metadata. There are three types of objects:
+ - **File** - a regular file, with its hash, size and mtime
+ - **Directory** - a directory, and its size and mtime
+ - **Symlink** - a symlink, with its size, mtime and target
+
+ Objects (files, directories and symlinks) are organized into revisions, which are immutable. Each revision has a unique
+ id, while an object may appear in multiple revisions. Revisions are further organized into repositories (namespaces),
+ like `debian`, `ubuntu`, etc. Repositories are mutable.
+
+ A revision can be in one of the following states:
+
+ - **Live** - a live revision is a revision in production, which is ready to be served. There can be multiple live
+ revisions, but only the latest one is served by the gateway.
+ - **Partial** - a partial revision is a revision that is still being updated. It's not ready to be served yet.
+ - **Stale** - a stale revision is a revision that is no longer in production, and is ready to be garbage collected.
+
+ ## Migration
+
+ ### Migration from v0.3.x to v0.4.x
+
+ v0.4.x switched from Redis to Postgres for storing metadata, greatly improving the performance of many operations and
+ reducing the storage usage.
+
+ Use `rsync-migration redis-to-pg` to migrate old metadata to the new database. Note that you can only migrate from
+ v0.3.x to v0.4.x, and you can't migrate from v0.2.x to v0.4.x directly.

- Note that there are more than one file index in Redis.
+ The old Redis database is not modified.

- - `<namespace>:index:<timestamp>` - an index of the repository synced at `<timestamp>`.
- - `<namespace>:partial` - a partial index that is still being updated and not committed yet.
- - `<namespace>:partial-stale` - a temporary index that is used to store outdated files when updating the partial index.
- This might happen if you interrupt a synchronization, restart it, and some files downloaded in the first run are
- already outdated. It's ready to be garbage collected.
- - `<namespace>:stale:<timestamp>` - an index that is taken out of production, and is ready to be garbage collected.
+ ### Migrating from v0.2.x to v0.3.x

- > Not all files in partial index should be removed. For example, if a file exists both in a stale index and a "live"
- > index, it should not be removed.
+ v0.3.x uses a new encoding for file metadata, which is incompatible with v0.2.x. Trying to use v0.3.x on old data will
+ fail.

- ## Migrating from v0.2.11 to older versions
+ Use `rsync-migration upgrade-encoding` to upgrade the encoding.

- There's a bug affecting all versions before v0.3.0 and after v0.2.11, which causes the file metadata to be read in a
- wrong format and silently corrupting the index. Note that no data is lost, but the gateway will fail to direct users to
- the correct file. `rsync-fix-encoding` can be used to fix this issue.
+ This is a destructive operation, so make sure you have a backup of the database before running it. It does nothing
+ without the `--do` flag.

- After v0.3.0, all commands are using the new encoding. You can still use this tool to migrate old data to the new
- encoding. Trying to use the new commands on old data will now fail.
+ The new encoding is actually introduced in v0.2.12 by accident. `rsync-gateway` between v0.2.12 and v0.3.0 can't parse
+ old metadata correctly and return garbage data. No data is lost though, so if you used any version between v0.2.12 and
+ v0.3.0, you can still use `rsync-migration` to migrate to the new encoding.
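
The Design text added in this diff describes a data layout (`<prefix>/<namespace>/<hash>` keys named by blake2b-160) and a metadata model (revisions that are live, partial, or stale). The sketch below illustrates that description in Python. It is not the project's code: the names `State`, `Revision`, `object_key`, `served_revision` and `collectable` are hypothetical, the hex encoding of the hash is an assumption, and "latest" is assumed to mean the highest revision id.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class State(Enum):
    LIVE = "live"        # in production, ready to be served
    PARTIAL = "partial"  # still being updated, not ready to be served
    STALE = "stale"      # out of production, ready to be garbage collected


@dataclass(frozen=True)  # revisions are described as immutable
class Revision:
    id: int              # assumption: ids grow over time, so max id == latest
    namespace: str       # repository name, e.g. "debian"
    state: State


def object_key(prefix: str, namespace: str, data: bytes) -> str:
    """Build `<prefix>/<namespace>/<hash>` with a blake2b-160 (20-byte) digest.

    Hex encoding of the digest is an assumption of this sketch.
    """
    digest = hashlib.blake2b(data, digest_size=20).hexdigest()
    return f"{prefix}/{namespace}/{digest}"


def served_revision(revisions: List[Revision], namespace: str) -> Optional[Revision]:
    """Return the latest live revision of a namespace, or None if there is none."""
    live = [r for r in revisions if r.namespace == namespace and r.state is State.LIVE]
    return max(live, key=lambda r: r.id, default=None)


def collectable(revisions: List[Revision], namespace: str) -> List[Revision]:
    """Return the stale revisions of a namespace, i.e. the candidates for GC."""
    return [r for r in revisions if r.namespace == namespace and r.state is State.STALE]
```

With several live revisions in one namespace, `served_revision` picks only the newest one, which matches the statement that older live revisions exist but are not served. Partial revisions are neither served nor collected in this sketch.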