Inventory CSV format is not compatible with S3 Batch Operations manifest #201

@jordanpadams

Checked for duplicates

Yes - I've already checked

🐛 Describe the bug

When the generated inventory CSV is used as the manifest for an S3 Batch Operations job that invokes the pds_s3_data_tagger Lambda function, the job fails with the following error:

Failed to parse task from manifest at byte offset 0. ErrorMessage: Unexpected number of task fields. Expected: 3. Observed: 7

The inventory CSV created by the "Determine unique data from existing inventory" task contains 7 columns, but S3 Batch Operations requires a CSV manifest with exactly 2 or 3 columns, in this order: bucket name, object key, and optionally version ID.

🕵️ Expected behavior

The inventory CSV should be generated in a format compatible with S3 Batch Operations, containing only the required 2-3 columns:

  1. Bucket name
  2. Object key
  3. Version ID (optional)

This format is required for S3 Batch Operations to properly invoke the Lambda function for each object.
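
As a rough illustration of the expected output, the 7-column inventory could be reduced to manifest rows along these lines. This is only a sketch: the function name `to_batch_manifest` is hypothetical, and the column positions (`bucket_col`, `key_col`, `version_col`) are assumptions, since the actual layout of the generated inventory is not listed in this issue. Dropping a header row is also an assumption about what the manifest should contain.

```python
import csv
import io


def to_batch_manifest(inventory_csv, bucket_col=0, key_col=1,
                      version_col=None, has_header=False):
    """Reduce a wide inventory CSV to bucket,key[,version] rows.

    Column indexes are assumptions; adjust them to the real inventory layout.
    """
    reader = csv.reader(io.StringIO(inventory_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for index, row in enumerate(reader):
        # Skip blank lines and, if requested, the header row.
        if not row or (has_header and index == 0):
            continue
        fields = [row[bucket_col], row[key_col]]
        if version_col is not None:
            fields.append(row[version_col])
        writer.writerow(fields)
    return out.getvalue()


# Hypothetical 7-column sample; the real inventory columns may differ.
sample = (
    "my-bucket,data/a.xml,v1,100,2024-01-01,abc,STANDARD\n"
    "my-bucket,data/b.xml,v2,200,2024-01-02,def,STANDARD\n"
)
print(to_batch_manifest(sample), end="")
```

Passing `version_col` would produce the 3-column variant; leaving it as `None` produces the 2-column variant.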

📜 To Reproduce

  1. Complete the "Determine unique data from existing inventory" task
  2. Use the generated CSV file as the manifest for an S3 Batch Operations job
  3. Configure the batch operation to invoke the pds_s3_data_tagger Lambda function
  4. Attempt to run the batch operation
  5. Observe the error: "Failed to parse task from manifest at byte offset 0. ErrorMessage: Unexpected number of task fields. Expected: 3. Observed: 7"
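
The failure in the steps above could likely be caught before submitting the job with a local check like the following. It is a minimal sketch, not part of pds_s3_data_tagger: `find_bad_rows` is a hypothetical helper that only counts fields per row and does not reproduce the full validation that S3 Batch Operations performs.

```python
import csv
import io


def find_bad_rows(manifest_csv, allowed_counts=(2, 3)):
    """Return (row_number, field_count) for rows with an unexpected width."""
    reader = csv.reader(io.StringIO(manifest_csv))
    problems = []
    for row_number, row in enumerate(reader, start=1):
        # Blank lines parse as zero fields and are reported as well.
        if len(row) not in allowed_counts:
            problems.append((row_number, len(row)))
    return problems


# Hypothetical 7-column row like the one that triggers the error.
print(find_bad_rows("my-bucket,data/a.xml,v1,100,2024-01-01,abc,STANDARD\n"))
```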

🖥 Environment Info

  • AWS S3 Batch Operations
  • pds_s3_data_tagger Lambda function
  • Generated inventory CSV (7 columns - incompatible format)

📚 Version of Software Used

pds_s3_data_tagger Lambda function (current version in main branch)

🩺 Test Data / Additional context

The generated inventory CSV needs to be transformed to match the S3 Batch Operations manifest format. According to AWS documentation, the CSV format must have either 2 or 3 columns in the following order:

  • Column 1: Bucket name
  • Column 2: Object key
  • Column 3: Version ID (optional)

Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/batch-ops-create-job.html
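
One related detail worth checking during the transformation: the AWS documentation for CSV manifests also asks for object keys to be URL-encoded. The sketch below shows one way to do that with the standard library. `encode_key` is a hypothetical helper, and leaving `/` unencoded is an assumption that should be verified against the AWS documentation. If the source inventory already stores encoded keys, applying this again would double-encode them.

```python
from urllib.parse import quote


def encode_key(key, safe="/"):
    """Percent-encode an object key; '/' is left as-is by assumption."""
    return quote(key, safe=safe)


# Hypothetical key containing a space.
print(encode_key("data/my file.xml"))
```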

🦄 Related requirements

#196 (primary issue)
#187
#194
#197


For Internal Dev Team To Complete

⚙️ Engineering Details

To be filled by engineering team

🎉 Integration & Test

To be filled by engineering team
