Merge pull request #258 from hirosystems/develop
release to master
rafaelcr authored Aug 30, 2024
2 parents 7d168b9 + b1b8f4b commit a92d448
Showing 18 changed files with 187 additions and 182 deletions.
93 changes: 12 additions & 81 deletions README.md
@@ -13,13 +13,9 @@
* [Stopping the service](#stopping-the-service)
* [Using an image cache service](#using-an-image-cache-service)
* [Service architecture](#service-architecture)
* [External](#external)
* [Internal](#internal)
* [Blockchain importer](#blockchain-importer)
* [Smart Contract Monitor](#smart-contract-monitor)
* [Job Queue](#job-queue)
* [Process Smart Contract Job](#process-smart-contract-job)
* [Process Token Job](#process-token-job)
* [Job Queue](#job-queue)
* [Process Smart Contract Job](#process-smart-contract-job)
* [Process Token Job](#process-token-job)
* [Bugs and Feature Requests](#bugs-and-feature-requests)
* [Contribute](#contribute)
* [Community](#community)
@@ -63,11 +59,9 @@ The Token Metadata Service is a microservice that has hard dependencies on other Stacks
components. Before you start, you'll need to have access to:

1. A fully synchronized [Stacks node](https://github.com/stacks-network/stacks-blockchain)
1. A fully synchronized instance of the [Stacks Blockchain
API](https://github.com/hirosystems/stacks-blockchain-api) running in `default` or `write-only`
mode, with its Postgres database exposed for new connections. A read-only DB replica is also
acceptable.
1. A fully synchronized instance of [Chainhook](https://github.com/hirosystems/chainhook)
1. A local writeable Postgres database for token metadata storage
1. (Optional) A Google Cloud Storage bucket for storing token images

### Running the service

@@ -110,77 +104,14 @@ disconnected.

### Using an image cache service

The Token Metadata Service allows you to specify the path to a custom script that pre-processes
every image URL detected by the service before it's inserted into the DB. This lets you serve
CDN image URLs in your metadata responses instead of raw URLs, providing key advantages such as:

* Faster image load speeds
* Better reliability if the original image becomes unavailable
* Protection for the original image hosts against DDoS attacks
* Improved user privacy

An example IMGIX processor script is included in
[`config/image-cache.js`](https://github.com/hirosystems/token-metadata-api/blob/develop/config/image-cache.js).
You can customize the script path by altering the `METADATA_IMAGE_CACHE_PROCESSOR` environment
variable.
The Token Metadata API allows you to specify a Google Cloud Storage bucket in which to store the
token images scanned by the metadata processor. This is recommended if you want to optimize image
load times when consuming the API from a wallet or similar client. See the `env.ts` file for the
environment variables you need to provide.
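
For reference, the script-based processor replaced here followed a simple contract: the service ran the script for each detected image URL and stored whatever URL the script produced. A minimal sketch of such a processor (hypothetical; the `IMGIX_DOMAIN` variable and the argv/stdout convention are assumptions, not the actual `config/image-cache.js` interface):

```ts
// Hypothetical image-cache processor sketch. Assumes the service passes the
// raw image URL as a CLI argument and reads the rewritten URL from stdout.
const IMGIX_DOMAIN = process.env.IMGIX_DOMAIN ?? 'my-cdn.imgix.net'; // illustrative

const rawUrl = process.argv[2];
if (!rawUrl) {
  console.error('usage: image-cache <image-url>');
  process.exit(1);
}
// Serve the original image through the CDN so it is cached at the edge.
console.log(`https://${IMGIX_DOMAIN}/${encodeURIComponent(rawUrl)}`);
```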

## Service architecture

### External

![Architecture](architecture.png)

The Stacks Token Metadata Service connects to three different systems to operate:

1. A Stacks Blockchain API database to import all historical smart contracts when booting up, and to
listen for new contracts that may be deployed. Read-only access is recommended as this service
will never need to write anything to this DB.
1. A Stacks node to respond to all read-only contract calls required when fetching token metadata
(calls to get token count, token metadata URIs, etc.)
1. A local Postgres DB to store all processed metadata info

The service will also need to fetch external metadata files (JSONs, images) from the Internet, so it
must have access to external networks.
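
As a rough illustration, the three connections could be wired like the sketch below (the `postgres` client is the one this repo already uses, but the env var names and structure here are assumptions, not the service's actual identifiers):

```ts
import postgres from 'postgres';

// Illustrative boot wiring only.
function connect() {
  // 1. Stacks Blockchain API DB: read-only access is enough, since this
  //    service never writes to it.
  const apiDb = postgres(process.env.API_DATABASE_URL as string);
  // 2. Local metadata DB: read-write, owned by this service.
  const metadataDb = postgres(process.env.METADATA_DATABASE_URL as string);
  // 3. Stacks node RPC endpoint, used for read-only contract calls.
  const stacksNodeUrl = process.env.STACKS_NODE_RPC_URL as string;
  return { apiDb, metadataDb, stacksNodeUrl };
}
```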

### Internal

![Flowchart](flowchart.png)

#### Blockchain importer

The
[`BlockchainImporter`](https://github.com/hirosystems/token-metadata-api/blob/develop/src/token-processor/blockchain-api/blockchain-importer.ts)
component is only used on service boot.

It connects to the Stacks Blockchain API database and scans the entire `smart_contracts` table
looking for any contract that conforms to SIP-009, SIP-010 or SIP-013. When it finds a token
contract, it creates a
[`ProcessSmartContractJob`](https://github.com/hirosystems/token-metadata-api/blob/develop/src/token-processor/process-smart-contract-job.ts)
job and adds it to the [Job queue](#job-queue) so its tokens can be read and processed thereafter.

This process is only run once. If the Token Metadata Service is ever restarted, though, this
component re-scans the API `smart_contracts` table from the last processed block height so it can
pick up any newer contracts it might have missed while the service was unavailable.
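
A condensed sketch of that scan-and-enqueue loop (the `smart_contracts` table comes from the text above; the SIP detection and enqueue steps are simplified assumptions):

```ts
import postgres from 'postgres';

type SmartContractRow = { contract_id: string; abi: string };

// Hypothetical SIP detector; a real implementation would check the contract
// ABI or source for the SIP-009/010/013 trait functions.
function detectSip(abi: string): 'sip-009' | 'sip-010' | undefined {
  if (abi.includes('get-last-token-id')) return 'sip-009';
  if (abi.includes('get-decimals')) return 'sip-010';
  return undefined;
}

async function importContracts(sql: postgres.Sql, fromBlockHeight: number): Promise<void> {
  // Re-scan from the last processed block height on every boot.
  const contracts = await sql<SmartContractRow[]>`
    SELECT contract_id, abi FROM smart_contracts
    WHERE block_height >= ${fromBlockHeight}
  `;
  for (const contract of contracts) {
    const sip = detectSip(contract.abi);
    if (sip) {
      // Enqueue a ProcessSmartContractJob for this contract (enqueue elided).
      console.log(`enqueue ProcessSmartContractJob: ${contract.contract_id} (${sip})`);
    }
  }
}
```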

#### Smart Contract Monitor

The
[`BlockchainSmartContractMonitor`](https://github.com/hirosystems/token-metadata-api/blob/develop/src/token-processor/blockchain-api/blockchain-smart-contract-monitor.ts)
component constantly listens for the following Stacks Blockchain API events:

* **Smart contract log events**

If a contract `print` event conforms to SIP-019, it finds the affected tokens and marks them for
metadata refresh.

* **Smart contract deployments**

If the new contract is a token contract, it saves it and enqueues it for token processing.

This process is kept alive throughout the entire service lifetime.
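
In outline, the monitor's dispatch looks roughly like this sketch (the event shapes are assumptions; the real component consumes Stacks Blockchain API event payloads):

```ts
// Assumed event shapes for illustration only.
type MonitoredEvent =
  | { type: 'contractLog'; contractId: string; payload: { notification?: string } }
  | { type: 'contractDeploy'; contractId: string; abi: string };

function handleEvent(event: MonitoredEvent): void {
  switch (event.type) {
    case 'contractLog':
      // A SIP-019 `print` payload signals a metadata update; mark the
      // affected tokens for refresh.
      if (event.payload.notification === 'token-metadata-update') {
        console.log(`refresh tokens for ${event.contractId}`);
      }
      break;
    case 'contractDeploy':
      // New token contracts are saved and enqueued for processing.
      console.log(`enqueue ProcessSmartContractJob: ${event.contractId}`);
      break;
  }
}
```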

#### Job Queue
### Job Queue

The role of the
[`JobQueue`](https://github.com/hirosystems/token-metadata-api/blob/develop/src/token-processor/queue/job-queue.ts)
@@ -204,13 +135,13 @@ There are two env vars that can help you tune how the queue performs:

This queue runs continuously and can handle an unlimited number of jobs.
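
A minimal polling-loop sketch of how a queue like this can drain pending jobs (`JOB_QUEUE_SIZE_LIMIT` appears in `env.ts`; the loop mechanics and `processJob` body are assumptions):

```ts
type Job = { id: number };

async function processJob(id: number): Promise<void> {
  // Run the job: ProcessSmartContractJob or ProcessTokenJob (elided).
}

async function runQueue(getPendingJobBatch: (args: { limit: number }) => Promise<Job[]>) {
  const limit = Number(process.env.JOB_QUEUE_SIZE_LIMIT ?? 200);
  for (;;) {
    const batch = await getPendingJobBatch({ limit });
    if (batch.length === 0) {
      // Nothing runnable right now; idle briefly before polling again.
      await new Promise(resolve => setTimeout(resolve, 1_000));
      continue;
    }
    await Promise.all(batch.map(job => processJob(job.id)));
  }
}
```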

##### Process Smart Contract Job
#### Process Smart Contract Job

This job makes a contract call to the Stacks node to determine the total number of tokens declared
by the given contract. Once that count is known, it creates and enqueues each of those tokens for
metadata ingestion.
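
For SIP-009 contracts that count typically comes from `get-last-token-id`. A sketch of such a read-only call against the Stacks node RPC (decoding of the hex-encoded Clarity result is elided):

```ts
// Read-only contract call sketch; error handling and Clarity decoding omitted.
async function getLastTokenId(nodeUrl: string, address: string, name: string): Promise<string> {
  const res = await fetch(
    `${nodeUrl}/v2/contracts/call-read/${address}/${name}/get-last-token-id`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ sender: address, arguments: [] }),
    }
  );
  const json = (await res.json()) as { okay: boolean; result?: string };
  if (!json.okay || !json.result) throw new Error('read-only call failed');
  return json.result; // hex-encoded Clarity value, e.g. (ok u10000)
}
```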

##### Process Token Job
#### Process Token Job

This job fetches the metadata JSON object for a single token as well as other relevant properties
depending on the token type (symbol, decimals, etc.). Once fetched, it parses and ingests this data
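
A sketch of the fetch step, assuming a SIP-016-style URI with optional `{id}` substitution (the timeout value is illustrative):

```ts
// Resolve the token URI and download its metadata JSON.
async function fetchMetadataJson(tokenUri: string, tokenId: bigint): Promise<unknown> {
  const url = tokenUri.replace('{id}', tokenId.toString());
  const res = await fetch(url, { signal: AbortSignal.timeout(30_000) });
  if (!res.ok) throw new Error(`metadata fetch failed: ${res.status}`);
  return res.json(); // expected fields: name, description, image, properties, ...
}
```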
Binary file removed architecture.png
Binary file removed flowchart.png
12 changes: 12 additions & 0 deletions migrations/1725039927416_job-retry-after.ts
@@ -0,0 +1,12 @@
/* eslint-disable @typescript-eslint/naming-convention */
import { MigrationBuilder, ColumnDefinitions } from 'node-pg-migrate';

export const shorthands: ColumnDefinitions | undefined = undefined;

export function up(pgm: MigrationBuilder): void {
pgm.addColumn('jobs', {
retry_after: {
type: 'timestamptz',
},
});
}
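
The file defines only `up`; node-pg-migrate can usually infer the reverse of `addColumn`, but an explicit rollback would be a one-liner (a sketch reusing the `MigrationBuilder` import above):

```ts
export function down(pgm: MigrationBuilder): void {
  pgm.dropColumn('jobs', 'retry_after');
}
```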
3 changes: 2 additions & 1 deletion src/api/routes/ft.ts
@@ -16,7 +16,7 @@ import {
StacksAddressParam,
TokenQuerystringParams,
} from '../schemas';
import { handleTokenCache } from '../util/cache';
import { handleChainTipCache, handleTokenCache } from '../util/cache';
import { generateTokenErrorResponse, TokenErrorResponseSchema } from '../util/errors';
import { parseMetadataLocaleBundle } from '../util/helpers';

@@ -25,6 +25,7 @@ const IndexRoutes: FastifyPluginCallback<Record<never, never>, Server, TypeBoxTy
options,
done
) => {
fastify.addHook('preHandler', handleChainTipCache);
fastify.get(
'/ft',
{
2 changes: 2 additions & 0 deletions src/api/routes/status.ts
@@ -3,12 +3,14 @@ import { FastifyPluginCallback } from 'fastify';
import { Server } from 'http';
import { ApiStatusResponse } from '../schemas';
import { SERVER_VERSION } from '@hirosystems/api-toolkit';
import { handleChainTipCache } from '../util/cache';

export const StatusRoutes: FastifyPluginCallback<
Record<never, never>,
Server,
TypeBoxTypeProvider
> = (fastify, options, done) => {
fastify.addHook('preHandler', handleChainTipCache);
fastify.get(
'/',
{
1 change: 1 addition & 0 deletions src/api/schemas.ts
@@ -340,6 +340,7 @@ export const ApiStatusResponse = Type.Object(
queued: Type.Optional(Type.Integer({ examples: [512] })),
done: Type.Optional(Type.Integer({ examples: [12532] })),
failed: Type.Optional(Type.Integer({ examples: [11] })),
invalid: Type.Optional(Type.Integer({ examples: [20] })),
},
{ title: 'Api Job Count' }
)
79 changes: 25 additions & 54 deletions src/api/util/cache.ts
@@ -1,18 +1,25 @@
import { FastifyReply, FastifyRequest } from 'fastify';
import { SmartContractRegEx } from '../schemas';
import { logger } from '@hirosystems/api-toolkit';
import { CACHE_CONTROL_MUST_REVALIDATE, parseIfNoneMatchHeader } from '@hirosystems/api-toolkit';

/**
* A `Cache-Control` header used for re-validation based caching.
* * `public` == allow proxies/CDNs to cache as opposed to only local browsers.
* * `no-cache` == clients can cache a resource but should revalidate each time before using it.
* * `must-revalidate` == somewhat redundant directive to assert that cache must be revalidated, required by some CDNs
*/
const CACHE_CONTROL_MUST_REVALIDATE = 'public, no-cache, must-revalidate';
enum ETagType {
chainTip = 'chain_tip',
token = 'token',
}

export async function handleTokenCache(request: FastifyRequest, reply: FastifyReply) {
async function handleCache(type: ETagType, request: FastifyRequest, reply: FastifyReply) {
const ifNoneMatch = parseIfNoneMatchHeader(request.headers['if-none-match']);
const etag = await getTokenEtag(request);
let etag: string | undefined;
switch (type) {
case ETagType.chainTip:
// TODO: We should use the `index_block_hash` here instead of the `block_hash`, but we'll need
// a DB change for this.
etag = (await request.server.db.getChainTipBlockHeight()).toString();
break;
case ETagType.token:
etag = await getTokenEtag(request);
break;
}
if (etag) {
if (ifNoneMatch && ifNoneMatch.includes(etag)) {
await reply.header('Cache-Control', CACHE_CONTROL_MUST_REVALIDATE).code(304).send();
@@ -22,6 +29,14 @@ export async function handleTokenCache(request: FastifyRequest, reply: FastifyRe
}
}

export async function handleTokenCache(request: FastifyRequest, reply: FastifyReply) {
return handleCache(ETagType.token, request, reply);
}

export async function handleChainTipCache(request: FastifyRequest, reply: FastifyReply) {
return handleCache(ETagType.chainTip, request, reply);
}

export function setReplyNonCacheable(reply: FastifyReply) {
reply.removeHeader('Cache-Control');
reply.removeHeader('Etag');
@@ -52,47 +67,3 @@ async function getTokenEtag(request: FastifyRequest): Promise<string | undefined
return undefined;
}
}

/**
* Parses the etag values from a raw `If-None-Match` request header value.
* The wrapping double quotes (if any) and validation prefix (if any) are stripped.
* The parsing is permissive to account for commonly non-spec-compliant clients, proxies, CDNs, etc.
* E.g. the value:
* ```js
* `"a", W/"b", c,d, "e", "f"`
* ```
* Would be parsed and returned as:
* ```js
* ['a', 'b', 'c', 'd', 'e', 'f']
* ```
* @see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match#syntax
* ```
* If-None-Match: "etag_value"
* If-None-Match: "etag_value", "etag_value", ...
* If-None-Match: *
* ```
* @param ifNoneMatchHeaderValue - raw header value
* @returns an array of etag values
*/
function parseIfNoneMatchHeader(ifNoneMatchHeaderValue: string | undefined): string[] | undefined {
if (!ifNoneMatchHeaderValue) {
return undefined;
}
// Strip wrapping double quotes like `"hello"` and the ETag validation-prefix like `W/"hello"`.
// The API returns compliant, strong-validation ETags (double quoted ASCII), but can't control what
// clients, proxies, CDNs, etc may provide.
const normalized = /^(?:"|W\/")?(.*?)"?$/gi.exec(ifNoneMatchHeaderValue.trim())?.[1];
if (!normalized) {
// This should never happen unless handling a buggy request with something like `If-None-Match: ""`,
// or if there's a flaw in the above code. Log warning for now.
logger.warn(`Normalized If-None-Match header is falsy: ${ifNoneMatchHeaderValue}`);
return undefined;
} else if (normalized.includes(',')) {
// Multiple etag values provided, likely irrelevant extra values added by a proxy/CDN.
// Split on comma, also stripping quotes, weak-validation prefixes, and extra whitespace.
return normalized.split(/(?:W\/"|")?(?:\s*),(?:\s*)(?:W\/"|")?/gi);
} else {
// Single value provided (the typical case)
return [normalized];
}
}
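
The net effect of these handlers is standard ETag revalidation: the first response carries an `ETag`, and replaying it in `If-None-Match` yields `304 Not Modified` until the chain tip advances. A hypothetical client-side flow:

```ts
// Conditional-request helper sketch; the URL passed in is up to the caller.
async function cachedFetch(
  url: string,
  etag?: string
): Promise<{ status: number; etag?: string; body?: unknown }> {
  const res = await fetch(url, { headers: etag ? { 'If-None-Match': etag } : {} });
  if (res.status === 304) return { status: 304, etag }; // cached copy still valid
  return {
    status: res.status,
    etag: res.headers.get('etag') ?? undefined,
    body: await res.json(),
  };
}
```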
4 changes: 3 additions & 1 deletion src/env.ts
@@ -84,6 +84,8 @@ const schema = Type.Object({
JOB_QUEUE_SIZE_LIMIT: Type.Number({ default: 200 }),
/** Maximum time a job will run before marking it as failed. */
JOB_QUEUE_TIMEOUT_MS: Type.Number({ default: 60_000 }),
/** Minimum time we will wait to retry a job after it's been executed. */
JOB_QUEUE_RETRY_AFTER_MS: Type.Number({ default: 5_000 }),

/**
* The max number of immediate attempts that will be made to retrieve metadata from external URIs
@@ -118,7 +120,7 @@
* next request that is sent to it (seconds). This value will be overridden by the `Retry-After`
* header returned by the domain, if any.
*/
METADATA_RATE_LIMITED_HOST_RETRY_AFTER: Type.Number({ default: 3600 }), // 1 hour
METADATA_RATE_LIMITED_HOST_RETRY_AFTER: Type.Number({ default: 60 }), // 1 minute
/**
* Maximum number of HTTP redirections to follow when fetching metadata. Defaults to 5.
*/
24 changes: 18 additions & 6 deletions src/pg/pg-store.ts
@@ -207,7 +207,16 @@ export class PgStore extends BasePgStore {
await this.sql`
UPDATE jobs
SET status = ${args.status},
invalid_reason = ${args.invalidReason ? args.invalidReason : this.sql`NULL`},
invalid_reason = ${
args.status == DbJobStatus.invalid && args.invalidReason
? args.invalidReason
: this.sql`NULL`
},
${
args.status != DbJobStatus.pending
? this.sql`retry_count = 0, retry_after = NULL,`
: this.sql``
}
updated_at = NOW()
WHERE id = ${args.id}
`;
@@ -216,30 +225,33 @@
async retryAllFailedJobs(): Promise<void> {
await this.sql`
UPDATE jobs
SET status = ${DbJobStatus.pending}, retry_count = 0, updated_at = NOW()
SET status = ${DbJobStatus.pending}, retry_count = 0, updated_at = NOW(), retry_after = NULL
WHERE status IN (${DbJobStatus.failed}, ${DbJobStatus.invalid})
`;
}

async increaseJobRetryCount(args: { id: number }): Promise<number> {
async increaseJobRetryCount(args: { id: number; retry_after: number }): Promise<number> {
const retryAfter = args.retry_after.toString();
const result = await this.sql<{ retry_count: number }[]>`
UPDATE jobs
SET retry_count = retry_count + 1, updated_at = NOW()
SET retry_count = retry_count + 1,
updated_at = NOW(),
retry_after = NOW() + INTERVAL '${this.sql(retryAfter)} ms'
WHERE id = ${args.id}
RETURNING retry_count
`;
return result[0].retry_count;
}

/**
* Retrieves a number of queued jobs so they can be processed immediately.
* Retrieves a number of pending jobs so they can be processed immediately.
* @param limit - number of jobs to retrieve
* @returns `DbJob[]`
*/
async getPendingJobBatch(args: { limit: number }): Promise<DbJob[]> {
return this.sql<DbJob[]>`
SELECT ${this.sql(JOBS_COLUMNS)} FROM jobs
WHERE status = 'pending'
WHERE status = 'pending' AND (retry_after IS NULL OR retry_after < NOW())
ORDER BY COALESCE(updated_at, created_at) ASC
LIMIT ${args.limit}
`;
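
Taken together with the new `JOB_QUEUE_RETRY_AFTER_MS` default in `env.ts`, the intended flow looks roughly like this sketch (the store interface mirrors the diff; the wiring and retry-limit handling are assumptions):

```ts
interface JobStore {
  increaseJobRetryCount(args: { id: number; retry_after: number }): Promise<number>;
}

const JOB_QUEUE_RETRY_AFTER_MS = 5_000; // default from env.ts above

async function handleTransientFailure(db: JobStore, jobId: number): Promise<void> {
  // Sets retry_after = NOW() + 5000 ms, so getPendingJobBatch skips this job
  // until that timestamp has passed.
  const attempts = await db.increaseJobRetryCount({
    id: jobId,
    retry_after: JOB_QUEUE_RETRY_AFTER_MS,
  });
  console.log(`job ${jobId} scheduled for retry (attempt ${attempts})`);
}
```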
2 changes: 2 additions & 0 deletions src/pg/types.ts
@@ -106,6 +106,7 @@ export type DbJob = {
retry_count: number;
created_at: string;
updated_at?: string;
retry_after?: string;
};

export type DbUpdateNotification = {
@@ -285,6 +286,7 @@ export const JOBS_COLUMNS = [
'retry_count',
'created_at',
'updated_at',
'retry_after',
];

export const METADATA_COLUMNS = [