
Feature request: Parse records as JSON and S3Schema in S3SqsEventNotificationRecordSchema #3525

Closed · Opened by @dreamorosi (Contributor)

Description

Discussed in #3449

Originally posted by cyrildewit December 30, 2024
Hi, I am implementing a Lambda handler that will consume an SQS queue filled with S3 event notifications. I expected that S3SqsEventNotificationSchema would also validate and parse the payload of the S3 event notification within the SQS message, but that is currently not the case.

Here's a copy of the official schemas for easy reference (source):

const S3SqsEventNotificationRecordSchema = SqsRecordSchema.extend({
  body: z.string(),
});

const S3SqsEventNotificationSchema = z.object({
  Records: z.array(S3SqsEventNotificationRecordSchema),
});

Was this design decision made on purpose?

I have solved it by creating a custom version of S3SqsEventNotificationSchema, like so:

const extendedS3SqsEventNotificationSchema = S3SqsEventNotificationSchema.extend({
    Records: z.array(SqsRecordSchema.extend({
        body: JSONStringified(S3Schema),
    })),
});
type ExtendedS3SqsEventNotification = z.infer<typeof extendedS3SqsEventNotificationSchema>;
Full code example for reference:
import type { SQSHandler } from 'aws-lambda';
import middy from '@middy/core';
import { Logger } from '@aws-lambda-powertools/logger';
import { injectLambdaContext } from '@aws-lambda-powertools/logger/middleware';
import { parser } from '@aws-lambda-powertools/parser/middleware';
import {
  S3Schema,
  S3SqsEventNotificationSchema,
  SqsRecordSchema,
} from '@aws-lambda-powertools/parser/schemas';
import { JSONStringified } from '@aws-lambda-powertools/parser/helpers';
import { BatchProcessor, EventType, processPartialResponse } from '@aws-lambda-powertools/batch';
import { z } from 'zod';

const processor = new BatchProcessor(EventType.SQS);

const logger = new Logger({
  logLevel: 'debug',
});

const extendedS3SqsEventNotificationSchema = S3SqsEventNotificationSchema.extend({
  Records: z.array(SqsRecordSchema.extend({
      body: JSONStringified(S3Schema),
  })),
});

type ExtendedS3SqsEventNotification = z.infer<typeof extendedS3SqsEventNotificationSchema>;

const recordHandler = async (record: ExtendedS3SqsEventNotification['Records'][number]): Promise<void> => {
  logger.debug('S3 Event Records', {
      s3EventRecords: record.body.Records,
  });
};

const lambdaSqsHandler: SQSHandler = async (event, context) =>
  processPartialResponse(event, recordHandler, processor, {
      context,
  });

export const handler = middy(lambdaSqsHandler)
  .use(injectLambdaContext(logger, { logEvent: true }))
  .use(parser({ schema: extendedS3SqsEventNotificationSchema }));
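
For illustration, here is a dependency-free sketch of the kind of data the recordHandler above works with once the body has been parsed into an object. The ParsedS3Body type below is a trimmed, assumed subset of the S3 event notification shape, not the full S3Schema from Powertools:

```typescript
// Minimal, trimmed shape of a parsed S3 event notification body.
// This is an illustrative subset, not the full S3Schema.
type ParsedS3Body = {
  Records: Array<{
    eventName: string;
    s3: { bucket: { name: string }; object: { key: string } };
  }>;
};

// Extract "bucket/key" pairs from a record body that the extended
// schema has already turned into an object (instead of a raw string).
function objectKeys(body: ParsedS3Body): string[] {
  return body.Records.map((r) => `${r.s3.bucket.name}/${r.s3.object.key}`);
}
```

This is the convenience the extended schema buys: the handler receives a typed object and never touches JSON.parse itself.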

Activity

Labels added on Jan 24, 2025: confirmed (The scope is clear, ready for implementation), feature-request (This item refers to a feature request for an existing or new utility), parser (This item relates to the Parser Utility)
dreamorosi

dreamorosi commented on Jan 24, 2025

@dreamorosi
Contributor, Author

I've opened this issue after the discussion linked above; below is my original answer:

The body field in an SQS event is always a string. Because of this, the correct type is z.string() and not S3Schema.

If we defined the schema like this:

const S3SqsEventNotificationRecordSchema = SqsRecordSchema.extend({
  body: S3Schema,
});

The parsing would always fail since body is a string and S3Schema is a z.object({ ... }).

When it comes to generic SQS events we can't assume the body is a JSON string, since you could send anything as a message to the queue. For the specific S3 -> SQS integration, however, it should be safe to assume that the body is always a stringified object, and thus we could potentially JSON.parse it for you.
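
The distinction can be sketched without zod: validating the body directly as an object fails while it is still a string, whereas parsing it first (roughly what the JSONStringified helper does) succeeds. The isNotificationObject check below is an illustrative stand-in, not the Powertools API:

```typescript
// A stringified S3-style body, as it arrives inside the SQS record.
const rawBody = '{"Records":[{"eventName":"ObjectCreated:Put"}]}';

// Naive structural check: is the value already an object with a Records array?
const isNotificationObject = (v: unknown): v is { Records: unknown[] } =>
  typeof v === "object" && v !== null && Array.isArray((v as any).Records);

// Validating the string directly fails, mirroring body: S3Schema.
const directlyValid = isNotificationObject(rawBody); // false: still a string

// Parse first, then validate: mirrors body: JSONStringified(S3Schema).
const parsedValid = isNotificationObject(JSON.parse(rawBody)); // true
```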

After the discussion we agreed to convert this into a feature request so that it can be implemented.

Note

The issue is open for contributions. If you're interested in taking it, please leave a comment to claim it, and a maintainer will assign the issue to you. Below are some details on the proposed implementation; feel free to comment to get more details or to clarify the requirements.

As part of this issue, we should do the following:

  • Update the S3SqsEventNotificationRecordSchema here to use the JSONStringified helper from packages/parser/src/helpers.ts and wrap the S3Schema
  • Update the docstring above the schema to clarify that we automatically transform the body from JSON and then parse it as an S3 event
  • Update the test case in packages/parser/tests/unit/schema/s3.test.ts below to account for the new transform
it('parses an S3 event notification SQS event', () => {
  // Prepare
  const event = getTestEvent<S3SqsEventNotification>({
    eventsPath,
    filename: 'sqs-event',
  });
  const expected = structuredClone(event);
  // @ts-expect-error - Modifying the expected result to account for transform
  expected.Records[0].body = JSON.parse(expected.Records[0].body);

  // Act
  const result = S3SqsEventNotificationSchema.parse(event);

  // Assess
  expect(result).toStrictEqual(expected);
});

Note that the steps above might not be 100% exhaustive and there might be more changes needed, feel free to discuss this with maintainers if unsure.
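
As a rough, zod-free approximation of the proposed change, the transform amounts to parsing each record's body and validating the result. The parseS3Body helper below is a hypothetical stand-in for JSONStringified(S3Schema), checking only a minimal subset of the notification shape, not the actual implementation:

```typescript
// Hypothetical stand-in for JSONStringified(S3Schema): parse the string
// body, then check a minimal subset of the S3 notification shape.
function parseS3Body(body: string): { Records: Array<{ eventName: string }> } {
  const parsed = JSON.parse(body); // throws on invalid JSON, like the helper
  if (!Array.isArray(parsed?.Records)) {
    throw new Error("body is not an S3 event notification");
  }
  return parsed;
}

// Applying it to each SQS record mirrors the updated record schema:
// the string body is replaced by a parsed, validated object.
function transformRecords(event: { Records: Array<{ body: string }> }) {
  return {
    Records: event.Records.map((r) => ({ ...r, body: parseS3Body(r.body) })),
  };
}
```

This also shows why the test case above must JSON.parse the expected body: after the transform, the parsed event no longer strictly equals the raw one.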

Labels added on Jan 24, 2025: good-first-issue (Something that is suitable for those who want to start contributing), help-wanted (We would really appreciate some support from community for this one)
cyrildewit

cyrildewit commented on Jan 24, 2025

@cyrildewit
Contributor

Hi @dreamorosi, thank you very much for converting the discussion into this feature request and providing some pointers.

I would love to pick it up and prepare a pull request!

dreamorosi

dreamorosi commented on Jan 24, 2025

@dreamorosi
Contributor, Author

No problem at all, thank you for offering to help out.

I've assigned it to you; let me know if you have any questions. If not, I'll see you in the PR review!

cyrildewit

cyrildewit commented on Jan 25, 2025

@cyrildewit
Contributor

Hi @dreamorosi, I do have a question about what is preferred in terms of dealing with the breaking change. From what I have read in the process documentation, new major versions are not a common practice.

I could introduce a new schema including the new changes, to make the change non-invasive.

dreamorosi

dreamorosi commented on Jan 27, 2025

@dreamorosi
Contributor, Author

Hi @cyrildewit, thank you for the question and for taking the time to read about our processes.

Technically you're spot on: this is a change in behavior and could be considered a breaking change. Normally we'd want to hold this type of change for a major release, which we are already discussing here.

In this specific instance however one could argue that the proposed behavior should have been the original one all along and that the current handling of the field as a plain string makes the schema practically unusable and generates confusion - which I believe is what also made you ask the question in the first place.

Because of this I am inclined to move forward with the change on the original schema, but I'd also want to hear your opinion as a contributor and user of the tool. If you think this should be a backwards compatible change then let's mark the original one as deprecated (example) and introduce a new one. Then, once we release v3 we'll drop the legacy one.

15 remaining items

Metadata

Labels: completed (This item is complete and has been merged/shipped), feature-request, good-first-issue, parser

Status: Shipped

Participants: @dreamorosi, @cyrildewit

      Feature request: Parse records as JSON and `S3Schema` in `S3SqsEventNotificationRecordSchema` · Issue #3525 · aws-powertools/powertools-lambda-typescript