
Conversation

@alexiswl (Member)
@alexiswl alexiswl requested a review from Copilot June 29, 2025 00:34
@alexiswl alexiswl self-assigned this Jun 29, 2025
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR deprecates the FastqUnarchiving and FastqSync manager stacks by migrating them into separate repositories, and updates the StatelessStackCollection to remove their instantiation.

  • Removed FastqUnarchivingManagerStack and FastqSyncManagerStack class imports and usages.
  • Retained only the props imports for configuration types.
  • Commented out all related stack creation code.
Comments suppressed due to low confidence (1)

lib/workload/stateless/statelessStackCollectionClass.ts:340

  • This duplicated commented instantiation block repeats the deprecation cleanup. Remove the redundant comments to keep the codebase clean.
      ...this.createTemplateProps(env, 'BclConvertManagerStack'),

Comment on the import block:

  FastqUnarchivingManagerStackProps,
} from './stacks/fastq-unarchiving/deploy';
import { FastqSyncManagerStack, FastqSyncManagerStackProps } from './stacks/fastq-sync/deploy';
import { FastqUnarchivingManagerStackProps } from './stacks/fastq-unarchiving/deploy';

Copilot AI Jun 29, 2025


The FastqUnarchivingManagerStackProps import is no longer used in active code (only in commented sections). Consider removing it to reduce clutter and avoid unused imports.

Suggested change
import { FastqUnarchivingManagerStackProps } from './stacks/fastq-unarchiving/deploy';

Comment on lines +211 to +224
// this.fastqUnarchivingManagerStack = new FastqUnarchivingManagerStack(
// scope,
// 'FastqUnarchivingManagerStack',
// {
// ...this.createTemplateProps(env, 'FastqUnarchivingManagerStack'),
// ...statelessConfiguration.fastqUnarchivingManagerStackProps,
// }
// );

// this.fastqSyncManagerStack = new FastqSyncManagerStack(scope, 'FastqSyncManagerStack', {
// ...this.createTemplateProps(env, 'FastqSyncManagerStack'),
// ...statelessConfiguration.fastqSyncManagerStackProps,
// });


Copilot AI Jun 29, 2025


There is a large block of commented-out stack initialization for FastqUnarchiving, FastqSync, and Icav2DataCopyManager. Since these stacks are deprecated, remove the commented code to improve readability.

Suggested change
// this.fastqUnarchivingManagerStack = new FastqUnarchivingManagerStack(
// scope,
// 'FastqUnarchivingManagerStack',
// {
// ...this.createTemplateProps(env, 'FastqUnarchivingManagerStack'),
// ...statelessConfiguration.fastqUnarchivingManagerStackProps,
// }
// );
// this.fastqSyncManagerStack = new FastqSyncManagerStack(scope, 'FastqSyncManagerStack', {
// ...this.createTemplateProps(env, 'FastqSyncManagerStack'),
// ...statelessConfiguration.fastqSyncManagerStackProps,
// });
// Removed deprecated stack initialization for FastqUnarchivingManagerStack and FastqSyncManagerStack.

@alexiswl (Member Author)
alexiswl commented Jun 30, 2025

TODO LIST

DEVELOPMENT

  • Destroy orcabus fastq unarchiving stateful stack in development
  • Deploy new fastq unarchiving stateful stack in development
  • Destroy orcabus fastq unarchiving stateless stack in development
  • Deploy new fastq unarchiving stateless stack in development
  • Destroy orcabus fastq sync stateful stack in development
  • Deploy new fastq sync stateful stack in development
  • Destroy orcabus fastq sync stateless stack in development
  • Deploy new fastq sync stateless stack in development

STAGING

  • Destroy orcabus fastq unarchiving stateful stack in staging
  • Deploy new fastq unarchiving stateful stack in staging
  • Destroy orcabus fastq unarchiving stateless stack in staging
  • Deploy new fastq unarchiving stateless stack in staging
  • Destroy orcabus fastq sync stateful stack in staging
  • Deploy new fastq sync stateful stack in staging
  • Destroy orcabus fastq sync stateless stack in staging
  • Deploy new fastq sync stateless stack in staging

PRODUCTION

  • Destroy orcabus fastq unarchiving stateful stack in production
  • Deploy new fastq unarchiving stateful stack in production
  • Destroy orcabus fastq unarchiving stateless stack in production
  • Deploy new fastq unarchiving stateless stack in production
  • Destroy orcabus fastq sync stateful stack in production
  • Deploy new fastq sync stateful stack in production
  • Destroy orcabus fastq sync stateless stack in production
  • Deploy new fastq sync stateless stack in production

Related issues:

@alexiswl (Member Author)

Some additional migration notes:

Data Sharing Manager DynamoDB Migration Notes

Click to expand!

For DynamoDB, the table names differ between the old and the new stacks, so we will instead need to perform a data migration:

  • data-sharing-packaging-api-table -> DataSharingPackagingApiTable
  • data-sharing-push-api-table -> DataSharingPushApiTable
  • data-sharing-packaging-lookup-table -> DataSharingPackagingLookupTable

From our experience with the Fastq Manager DynamoDB deployment, it is easier to download the items, batch them, and then re-upload.

Batch Write Item / DataSharingPackagingApiTable

We can, however, use the batch-write-item command, which accepts up to 25 items per request:

date

TABLE_NAME="DataSharingPackagingApiTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name data-sharing-packaging-api-table \
  --query 'Items' \
  --output json > data.json

db_length="$( \
  jq --raw-output \
    'length' < data.json
)"

for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date

Batch Write Item / DataSharingPushApiTable

date

TABLE_NAME="DataSharingPushApiTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name data-sharing-push-api-table \
  --query 'Items' \
  --output json > data.json

db_length="$( \
  jq --raw-output \
    'length' < data.json
)"

for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date

Batch Write Item / DataSharingPackagingLookupTable

date

TABLE_NAME="DataSharingPackagingLookupTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name data-sharing-packaging-lookup-table \
  --query 'Items' \
  --output json > data.json

db_length="$( \
  jq --raw-output \
    'length' < data.json
)"

for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date

~3K items in 2 minutes!

Prod has around 80K items in this table, so we should be okay.
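The windowing arithmetic in these loops is easy to get wrong at the boundaries: if the item count divides exactly by 25, a naive `seq 0 25 "${db_length}"` produces an empty trailing slice, and batch-write-item rejects empty batches. A sketch of the same min/max logic factored into a function, so it can be sanity-checked locally without touching AWS (`batch_ranges` is a hypothetical helper name, not code from this PR):

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the seq/min/max windowing used above.
# Prints "min max" (max exclusive) for each batch; no AWS calls involved.
batch_ranges() {
  local db_length="$1" batch_size="$2"
  local min_iter max_iter
  # Stop at db_length - 1 so an exactly-divisible length
  # does not emit an empty trailing batch
  for min_iter in $(seq 0 "${batch_size}" "$(( db_length - 1 ))"); do
    max_iter="$(( min_iter + batch_size ))"
    if (( max_iter > db_length )); then
      max_iter="${db_length}"
    fi
    echo "${min_iter} ${max_iter}"
  done
}

batch_ranges 60 25   # -> 0 25 / 25 50 / 50 60
```

Running it over a few item counts (60, 50, 0) is a quick way to convince yourself the last batch is never empty.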

@alexiswl (Member Author)

alexiswl commented Jul 22, 2025

Data Manager S3 Migration Notes

Click to expand!

CDK Import Steps

Comment out the following lines in the stateful application stack:

  • buildDataSharingS3Bucket

And then deploy with

bash scratch/rsync-deploy.sh \
  cdk-stateful deploy \
    --require-approval never \
    StatefulDataSharingStackPipeline/StatefulDataSharingStackPipeline/OrcaBusBeta/StatefulDataSharingStack

Uncomment the lines in the stateful application stack and run the import command

pnpm cdk-stateful import StatefulDataSharingStackPipeline/StatefulDataSharingStackPipeline/OrcaBusBeta/StatefulDataSharingStack

Run drift detection; everything should be clear.
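Drift detection can also be driven from the AWS CLI rather than the console; a hedged sketch (the stack name passed in is a placeholder, and the calls are wrapped in a function so nothing runs on sourcing):

```shell
#!/usr/bin/env bash
# Sketch: trigger CloudFormation drift detection for a stack and poll
# until the detection run completes, then print the final status.
detect_drift() {
  local stack_name="$1"
  local detection_id status
  detection_id="$(aws cloudformation detect-stack-drift \
    --stack-name "${stack_name}" \
    --query 'StackDriftDetectionId' \
    --output text)"
  while :; do
    status="$(aws cloudformation describe-stack-drift-detection-status \
      --stack-drift-detection-id "${detection_id}" \
      --query 'DetectionStatus' \
      --output text)"
    [[ "${status}" == "DETECTION_IN_PROGRESS" ]] || break
    sleep 5
  done
  echo "${status}"
}
```

Usage would be e.g. `detect_drift StatefulDataSharingStack`; a clean import should report `DETECTION_COMPLETE` with no drifted resources.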

@alexiswl (Member Author)

alexiswl commented Jul 22, 2025

Fastq Manager DynamoDb Migration Notes

Click to expand

Summary

For DynamoDB, the table names differ between the old and the new stacks, so we will instead need to perform a data migration:

  • fastqManagerDynamoDBTable -> FastqDataTable
  • fastqSetDynamoDBTable -> FastqSetDataTable
  • fastqJobDynamoDBTable -> FastqJobsTable

We can either download and re-upload item by item, though this may be quite slow for large databases:

aws dynamodb scan \
  --table-name fastqManagerDynamoDBTable \
  --query 'Items' \
  --output json > fastq_data.json

jq -c '.[]' < fastq_data.json | while read -r item; do
  aws dynamodb put-item \
    --no-cli-pager \
    --table-name FastqDataTable \
    --item "${item}"
done

Or export to S3 and import from there. Note, however, that for the export/import approach the destination table must not yet exist.
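For reference, the S3 export/import path looks roughly like the sketch below. This is an illustrative outline only: it assumes point-in-time recovery is enabled on the source table (required by export-table-to-point-in-time), and the ARN, bucket, prefix, and key schema are placeholders rather than values from this repo:

```shell
#!/usr/bin/env bash
# Sketch: export a table's current state to S3, then import it under a
# new table name (the destination table must not already exist).
export_then_import() {
  local table_arn="$1" bucket="$2" prefix="$3" new_table="$4"

  # Export the source table's data to S3 in DynamoDB JSON format
  aws dynamodb export-table-to-point-in-time \
    --table-arn "${table_arn}" \
    --s3-bucket "${bucket}" \
    --s3-prefix "${prefix}" \
    --export-format DYNAMODB_JSON

  # Import into a brand-new table; key schema here is a placeholder
  aws dynamodb import-table \
    --input-format DYNAMODB_JSON \
    --s3-bucket-source "S3Bucket=${bucket},S3KeyPrefix=${prefix}" \
    --table-creation-parameters \
      "TableName=${new_table},AttributeDefinitions=[{AttributeName=id,AttributeType=S}],KeySchema=[{AttributeName=id,KeyType=HASH}],BillingMode=PAY_PER_REQUEST"
}
```

In practice the export completes asynchronously, so the import step would need to wait for the export to finish before pointing at the S3 prefix.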

Batch Write Item / FastqDataTable

We can, however, use the batch-write-item command, which accepts up to 25 items per request:

date

TABLE_NAME="FastqDataTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name fastqManagerDynamoDBTable \
  --query 'Items' \
  --output json > fastq_data.json

db_length="$( \
  jq --raw-output \
    'length' < fastq_data.json
)"


for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < fastq_data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date

I think this is the way forward!!

Let's do the other two fastq tables

Batch Write Item / FastqSetDataTable

date
TABLE_NAME="FastqSetDataTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name fastqSetDynamoDBTable \
  --query 'Items' \
  --no-cli-pager \
  --output json > fastq_set_data.json


db_length="$( \
  jq --raw-output \
    'length' < fastq_set_data.json
)"


for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < fastq_set_data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date

Batch Write Item / FastqJobsTable

date

TABLE_NAME="FastqJobsTable"
BATCH_ITEM_LIST=25

aws dynamodb scan \
  --table-name fastqJobDynamoDBTable \
  --query 'Items' \
  --no-cli-pager \
  --output json > fastq_job_data.json


db_length="$( \
  jq --raw-output \
    'length' < fastq_job_data.json
)"


for min_iter in $(seq 0 "${BATCH_ITEM_LIST}" "$(( db_length - 1 ))"); do
  # Get the (exclusive) upper bound for this batch
  max_iter="$(( min_iter + BATCH_ITEM_LIST ))"

  if [[ "${max_iter}" -gt "${db_length}" ]]; then
    max_iter="${db_length}"
  fi

  jq --raw-output \
    --arg tableName "${TABLE_NAME}" \
    --argjson min_iter "${min_iter}" \
    --argjson max_iter "${max_iter}" \
    '
      .[$min_iter:$max_iter] |
      {
        "\($tableName)": (
          . | map({"PutRequest": {"Item": .}})
        )
      }
    ' < fastq_job_data.json \
  > "request_items_iter.${min_iter}_${max_iter}.json"

  aws dynamodb batch-write-item \
    --no-cli-pager \
    --request-items "file://request_items_iter.${min_iter}_${max_iter}.json"
done

date
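One caveat worth noting with all of the loops above: batch-write-item can return successfully while still reporting UnprocessedItems (for example under throttling), and those leftovers would be silently dropped. A hedged sketch of a retry wrapper (`submit_batch` is a hypothetical helper, not code from this PR):

```shell
#!/usr/bin/env bash
# Sketch: submit one request-items file and retry anything that
# batch-write-item hands back as UnprocessedItems.
submit_batch() {
  local request_file="$1"
  local attempt=0 response
  while :; do
    response="$(aws dynamodb batch-write-item \
      --no-cli-pager \
      --request-items "file://${request_file}" \
      --query 'UnprocessedItems' \
      --output json)"
    # Stop once nothing was left unprocessed
    if [[ "${response}" == "{}" ]]; then
      echo "batch ${request_file}: done"
      return 0
    fi
    attempt=$(( attempt + 1 ))
    if (( attempt > 5 )); then
      echo "batch ${request_file}: giving up with unprocessed items" >&2
      return 1
    fi
    # UnprocessedItems already has the request-items shape,
    # so write the leftovers back out and retry with simple backoff
    printf '%s' "${response}" > "${request_file}"
    sleep "${attempt}"
  done
}
```

The per-iteration `aws dynamodb batch-write-item` call in the loops above could be swapped for `submit_batch "request_items_iter.${min_iter}_${max_iter}.json"`.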

@alexiswl (Member Author)

Fastq Manager S3 Migration Notes

Click to expand

CDK Import Steps

Comment out the following lines in the stateful application stack:

  • addNtsmBucket
  • addFastqManagerCacheBucket

And the following lines in the S3 configuration:

  • addTemporaryMetadataDataLifeCycleRuleToBucket

And then deploy with

bash scratch/rsync-deploy.sh \
  cdk-stateful deploy \
    --require-approval never \
    StatefulFastqStack/StatefulFastqStackPipeline/OrcaBusBeta/StatefulFastqStack 

Uncomment the lines in the stateful application stack and run the import command

pnpm cdk-stateful import StatefulFastqStack/StatefulFastqStackPipeline/OrcaBusBeta/StatefulFastqStack

Then redeploy.

Run drift detection; everything should be clear. Then add the lifecycle rules back in and redeploy.

Then rerun drift detection.
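To confirm the lifecycle rules actually landed after the final redeploy, the bucket configuration can be read back directly (the bucket name passed in is a placeholder, and `get_lifecycle_rule_ids` is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Sketch: list the lifecycle rule IDs currently attached to a bucket,
# to cross-check against what the stateful stack is expected to add.
get_lifecycle_rule_ids() {
  local bucket="$1"
  aws s3api get-bucket-lifecycle-configuration \
    --bucket "${bucket}" \
    --query 'Rules[].ID' \
    --output text
}
```

If the lifecycle rule was not applied, this call returns a NoSuchLifecycleConfiguration error rather than an empty list, which makes it a handy smoke test.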

@alexiswl (Member Author)

Complete - merging

@alexiswl alexiswl added this pull request to the merge queue Jul 24, 2025
Merged via the queue into main with commit e5f0b98 Jul 24, 2025
6 checks passed
@alexiswl alexiswl deleted the deprecation/deprecate-primary-data-stacks branch July 24, 2025 02:57