See the Technology Overview for details on the tools.
CRLite promises substantial compression of the dataset. In our staging environment, the binary form of all unexpired certificate serial numbers comprises about 16 GB of memory in Redis; the binary form of all enrolled and unexpired certificate serial numbers comprises about 1.7 GB on disk; and the resulting binary Bloom filter compresses to approximately 5 MB.
These artifacts are in the `mlbf` folder for a given run, available in the published data sets. See "Where can I get the CRLite data that is used to make filters?"
Bloom filters are probabilistic data structures with an error rate due to data collisions. However, if you know the whole range of data that might be tested against the filter, you can compute all the false positives and build another layer to resolve those. Then you keep going until there are no more false positives. In practice, this happens in 25 to 30 layers, which results in substantial compression.
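As a rough illustration of how the layers fit together, here is a toy Python sketch that builds and queries such a cascade. It assumes a naive SHA-256-based Bloom filter and arbitrary sizing; it is not the production rust-create-cascade / filter-cascade code, nor its on-disk format.

```python
# Toy Bloom filter cascade: each layer encodes the previous layer's false positives.
import hashlib

class Bloom:
    """A minimal Bloom filter over byte strings, using k salted slices of SHA-256."""
    def __init__(self, items, m_bits, k=3):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8 + 1)
        for item in items:
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item):
        for salt in range(self.k):
            digest = hashlib.sha256(bytes([salt]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def build_cascade(revoked, not_revoked):
    """Add layers until a layer produces no false positives."""
    layers, include, exclude = [], set(revoked), set(not_revoked)
    while include:
        layer = Bloom(include, m_bits=max(64, 8 * len(include)))
        layers.append(layer)
        # The next layer only needs to encode this layer's false positives.
        include, exclude = {x for x in exclude if x in layer}, include
    return layers

def is_revoked(layers, serial):
    """Walk the layers; falling out of the cascade at an odd depth means 'revoked'."""
    for depth, layer in enumerate(layers):
        if serial not in layer:
            return depth % 2 == 1
    return len(layers) % 2 == 1

revoked = {b"\x01\x02", b"\x0a\x0b"}
valid = {bytes([n]) for n in range(50)}  # stand-in for "all other known certificates"
cascade = build_cascade(revoked, valid)
assert is_revoked(cascade, b"\x01\x02") and not is_revoked(cascade, b"\x05")
```

Because each successive layer only has to encode the previous layer's false positives, the layers shrink rapidly, which is where the compression comes from.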
The key innovation for CRLite is that Certificate Transparency (CT) data can be used as a stand-in for "all the certificates in the Web PKI". It's reasonably easy to tell if a certificate is in Certificate Transparency: Was it delivered with a Signed Certificate Timestamp (SCT) from a CT log? Similarly, it's reasonably easy to tell that a certificate was known to a CT log at the time that the CRLite filter was constructed: Was the SCT at least one Maximum Merge Delay older than the CRLite filter?
The remaining issue is whether the issuer is included/enrolled in the CRLite filter set, which is indicated by a flag provided along with the Firefox Intermediate Preloading data.
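Expressed as a tiny helper (hypothetical names, not the Firefox implementation), the timing check above amounts to:

```python
# A certificate is covered by a filter only if its SCT is at least one
# Maximum Merge Delay (MMD) older than the filter.
from datetime import datetime, timedelta

def covered_by_filter(sct_timestamp: datetime,
                      log_mmd: timedelta,
                      filter_timestamp: datetime) -> bool:
    # The CT log must have had time to merge the certificate before the
    # filter was built; otherwise the filter cannot know about it.
    return sct_timestamp + log_mmd <= filter_timestamp

# Example: with a 24-hour MMD, an SCT issued two days before the filter is covered.
assert covered_by_filter(datetime(2021, 6, 1), timedelta(hours=24), datetime(2021, 6, 3))
```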
The update files published between full filters tend to be between 20 kB and 50 kB, in a form we call "stashes". You can use the `crlite_status` tool to investigate the sizes of recent runs. Similarly, you can use `rust-query-crlite` to read and evaluate certificates against the filter+stash sets.
You can see example output of the `crlite-status` tool, which shows filter statistics by date, here: https://gist.github.com/jcjones/1fd9f63f93c7b85f87f4ac9b0f134905
All CAs that have fresh Certificate Revocation Lists (CRLs) encoded into their issued certificates get included in CRLite. Freshness means that the CRLs' signatures are valid and that they aren't past their `NextUpdate` time.
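As a rough sketch of that freshness test, using the third-party Python `cryptography` package (this is illustrative only, not the actual aggregate-crls implementation):

```python
from datetime import datetime
from cryptography import x509

def crl_is_fresh(crl_der: bytes, issuer_cert_der: bytes) -> bool:
    crl = x509.load_der_x509_crl(crl_der)
    issuer = x509.load_der_x509_certificate(issuer_cert_der)
    # The CRL's signature must verify against the issuing CA's public key...
    if not crl.is_signature_valid(issuer.public_key()):
        return False
    # ...and the CRL must not be past its NextUpdate time.
    # (crl.next_update is a naive UTC datetime in the `cryptography` package.)
    return crl.next_update is not None and crl.next_update > datetime.utcnow()
```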
We initially thought we would hand-pick some issuing CAs, but automation was simpler.
Analysis of why issuers become unenrolled in CRLite is still active, but the usual culprit in the logs is that the next CRL simply can't be downloaded by the CRLite `aggregate-crls` tooling, which has limited retry and resume functionality. There is audit data available using the `crlite-status` tool with the `--crl` options to analyze when issuers are being enrolled or unenrolled in CRLite.
Firefox will use OCSP (stapled or actively queried) if the certificate's Signed Certificate Timestamps are too new for the current filter.
If the certificate's issuer is not included in the filter, CRLite won't be used. If the issuer is truly unknown, Firefox will give an unknown-issuer warning like always; nothing there will change. If the issuer is not in the Mozilla Root Program, then it won't be eligible for CRLite.
Each CRLite filter is published with a list of enrolled issuers. The easiest way to check if an issuer is enrolled is to query a certificate from that issuer against a filter using the `rust-query-crlite` tool with verbose logging (`-v`). If the issuer is not enrolled, the tool will output `NotEnrolled`. For more detailed instructions see "How can I query my CRLite filter".
At Internet scale, disagreement between CRLite and OCSP is likely a common occurrence: Certificate Authorities generally have lag in updating revocation information, and there's no requirement that CRLs and OCSP update together. Firefox can be configured to double-check revoked certificate results from CRLite against OCSP (by setting `security.pki.crlite_mode = 3` in `about:config`). In this mode, if CRLite says a certificate is revoked and OCSP says it is valid, then the OCSP result is used. This is currently the default behavior on the Firefox Beta and Nightly channels. CRLite is not yet enabled on the Release channel.
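Stated as a tiny decision function (hypothetical names; this is not Firefox source, just a restatement of the paragraph above):

```python
def confirmed_status(crlite_says_revoked: bool, ocsp_says_valid: bool) -> str:
    """crlite_mode = 3: a CRLite 'revoked' answer is double-checked against OCSP."""
    if crlite_says_revoked and ocsp_says_valid:
        return "valid"      # a conflicting OCSP answer overrides CRLite in this mode
    return "revoked" if crlite_says_revoked else "valid"
```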
The CRLite filters are published manually at Firefox Remote Settings. You can examine the data using JSON tooling at this URL: https://firefox.settings.services.mozilla.com/v1/buckets/security-state/collections/cert-revocations/records
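For example, here is a minimal sketch using Python's third-party `requests` package to list the records; the endpoint returns plain JSON, and the specific fields printed below are illustrative:

```python
import requests

RECORDS_URL = ("https://firefox.settings.services.mozilla.com/v1/buckets/"
               "security-state/collections/cert-revocations/records")

records = requests.get(RECORDS_URL, timeout=30).json()["data"]
for record in records:
    # Remote Settings records carry their payloads as attachments.
    attachment = record.get("attachment", {})
    print(record.get("id"), attachment.get("location"), attachment.get("size"))
```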
The `rust-query-crlite` tool can be used to download a filter and store it in the format used by Firefox. See "How can I query my CRLite filter" below.
Install the `rust-query-crlite` program from the CRLite repository by running `cargo install --path ./crlite/rust-query-crlite`.
Your Firefox profile contains a subdirectory called `security_state`. To query the CRLite filter used by your Firefox profile, run:
```
rust-query-crlite -vvv --db /path/to/security_state x509 /path/to/certificate1 /path/to/certificate2 [...]
```
or
```
rust-query-crlite -vvv --db /path/to/security_state https host1.example.com host2.example.com [...]
```
The provided `security_state` directory does not have to be in a Firefox profile. To download the current CRLite filter, pass `--update prod`. (Note that this cannot currently be used to populate an empty Firefox `security_state` directory, because Firefox requires additional metadata about the freshness of filters which is not populated by `rust-query-crlite`.)
The production data is hosted in Google Cloud Storage in a bucket named `crlite-filters-prod`. The web interface for the files is publicly accessible here, though browsing it requires a Google login: https://console.cloud.google.com/storage/browser/crlite-filters-prod
The staging environment, which contains only a fraction of the WebPKI, is here: https://console.cloud.google.com/storage/browser/crlite-filters-stage
The Google `gsutil` tool is handy for downloading entire datasets (~7 GB each). These commands would download all the files:
```
mkdir crlite-dataset/
gsutil -m cp -r gs://crlite-filters-prod/20200101-0 crlite-dataset/
```
The `known` folder contains JSON files, one per enrolled issuing CA, listing all of that CA's unexpired DER-encoded serial numbers. The `revoked` folder has files in the same per-issuer format, but contains the DER-encoded serial numbers of the revoked certificates. The serials in `revoked` are not guaranteed to be a subset of `known`, as many are likely expired, so set math is required to derive the known-revoked set from the two directories.
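A sketch of that set math in Python, assuming each per-issuer file parses as a flat JSON list of serial numbers (the exact on-disk layout of a given run may differ, so treat this as illustrative):

```python
import json
from pathlib import Path

def known_revoked(run_dir):
    """For each issuer, intersect its revoked serials with its known (unexpired) serials."""
    known_dir = Path(run_dir) / "known"
    revoked_dir = Path(run_dir) / "revoked"
    result = {}
    for revoked_file in revoked_dir.iterdir():
        known_file = known_dir / revoked_file.name
        if not known_file.is_file():
            continue  # no unexpired certificates known for this issuer
        revoked = set(json.loads(revoked_file.read_text()))
        known = set(json.loads(known_file.read_text()))
        result[revoked_file.name] = revoked & known  # revoked AND unexpired
    return result

# e.g. known_revoked("./crlite-dataset/20200101-0")
```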
The `mlbf` folder contains the filter and its metadata as generated.
The `log` folder contains all the logs for the runs. As of this writing, many errors and warnings are still emitted that require bugfixing in one fashion or another. There are also many pointers to potential CRL problems with CAs, though few are compliance issues, and at least some are known to be innocent problems.
The `crlite-status` tool is probably what you're looking for. You can get it from PyPI:
```
pip3 install crlite-status
crlite_status 8
```
Install the `rust-create-cascade` program from the CRLite repository by running `cargo install --path ./crlite/rust-create-cascade`.
With a full dataset at hand from the above `gsutil` command:
```
rust-create-cascade -vv --known ./20200101-0/known/ --revoked ./20200101-0/revoked/
```
See the main README.md.
Gathering revocation data via OCSP is extremely inefficient, as it requires so many OCSP queries. While the original paper's implementation did it, and so did casebenton/certificate-revocation-analysis (our initial proof-of-principle), downloading CRLs scales much better. If CRLite gains traction, OCSP bandwidth savings and speedups may prove to be reasons for CAs to issue CRLs.
Stash files are binary-encoded flat lists of issuer Subject Public Key Information hashes, followed by a list of serial numbers. The `read_keys.py` script can read stash files.
Currently CRLite uses a heuristic: end-users will collect stashes until the total size of the collected stashes would be larger than a new filter. At that point, the infrastructure will switch over to a new filter and clear all existing stashes.
The contract between CRLite clients and the infrastructure allows the infrastructure to adjust this heuristic at will. Most likely, this will be modified over time to optimize client-side searches, as searching the stashes is slower than searching the Bloom filter cascade, and purely choosing to update the filter on file-size does not account for those speed differences.
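Restated as a tiny function (illustrative only, since the infrastructure can change this heuristic at will):

```python
def should_cut_over_to_new_filter(stash_sizes_bytes, new_filter_size_bytes):
    # Once clients would have to hold more stash bytes than a fresh filter,
    # publish a new filter and clear the accumulated stashes.
    return sum(stash_sizes_bytes) > new_filter_size_bytes
```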
A CT log is monitored if its `crlite_enrolled` flag is set in the ct-logs Remote Settings collection. This collection is periodically updated with entries from Google's log list, but the `crlite_enrolled` flag is only set after manual review by a Mozilla engineer.
`ct-fetch` stores certificate serial numbers and CRL distribution points in the Redis database.
Serial numbers are stored as Redis sets, with keys named in the form `serials::<expiration date and hour>::<issuer>`; each key's expiration is set so the set is automatically expunged upon reaching that expiration day-and-hour.
CRL distribution points are also stored as Redis sets, with keys in the form `crls::<issuer>`. CRL DPs do not expire; as they are discovered, CRLite assumes they will be updated until the retirement of the issuer.