All classes are under active development and subject to non-backward compatible changes or removal in any future version. These are not subject to the Semantic Versioning model. This means that while you may use them, you may need to update your source code when upgrading to a newer version of this package.
Language | Package |
---|---|
TypeScript | @cdklabs/generative-ai-cdk-constructs |
Python | cdklabs.generative_ai_cdk_constructs |
This CDK construct creates a data ingestion pipeline using Amazon Kendra.
Files in formats supported by Amazon Kendra are uploaded to an input Amazon Simple Storage Service (S3) bucket. Clients authorized through an Amazon Cognito user pool trigger an AWS AppSync mutation to start the ingestion process, and can use subscriptions to receive notifications on the ingestion status. The mutation call triggers an AWS Step Functions workflow.
The construct sets up the following resources:
- An Amazon S3 bucket for document uploads.
- AWS Lambda functions for processing documents and managing Kendra sync jobs.
- An AWS Step Functions workflow to orchestrate document processing steps.
- An Amazon Kendra index for querying processed documents.
- An AWS AppSync GraphQL API for initiating document processing workflows and querying the Kendra index.
If you have multiple workflows using GraphQL endpoints and want to use a single endpoint, you can use an AppSync Merged API. This construct can take as a parameter an existing AppSync Merged API; if provided, the mutation call and subscription updates will be targeted at the Merged API.
This construct will require an existing Amazon Kendra index. You can follow the steps in the official AWS Developer Guide to create and manage your Amazon Kendra index.
Here is a minimal deployable pattern definition:
import * as cdk from 'aws-cdk-lib';
import { RagAppsyncStepfnKendra } from '@cdklabs/aws-rag-appsync-stepfn-kendra';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'MyStack');

// existingCognitoUserPool is a placeholder for a user pool defined elsewhere in your app
new RagAppsyncStepfnKendra(stack, 'MyRagAppsyncStepfnKendra', {
  cognitoUserPool: existingCognitoUserPool,
  kendraDataSourcesProps: [
    // Your Kendra data sources configurations
  ],
  // Other configurations
});

from aws_cdk import App, Stack
from cdklabs.aws_rag_appsync_stepfn_kendra import RagAppsyncStepfnKendra

app = App()
stack = Stack(app, "MyStack")

# existing_cognito_user_pool is a placeholder for a user pool defined elsewhere in your app
RagAppsyncStepfnKendra(stack, "MyRagAppsyncStepfnKendra",
    cognito_user_pool=existing_cognito_user_pool,
    kendra_data_sources_props=[
        # Your Kendra data sources configurations
    ],
    # Other configurations
)
Without any overrides, the construct sets the following defaults:
- Primary authentication method for the AppSync GraphQL API is Amazon Cognito User Pool.
- Secondary authentication method for the AppSync GraphQL API is IAM role.
- Sets up a VPC
  - Uses the existing VPC if provided, otherwise creates a new one
- Sets up a security group used by the AWS Lambda functions
  - Uses the existing security group if provided, otherwise creates a new one
- Sets up two Amazon S3 buckets
  - Uses existing buckets if provided, otherwise creates new ones
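
The "use the existing resource if provided, otherwise create a new one" behavior in the list above can be pictured with a small sketch. This is only an illustration of the pattern, not the construct's implementation: `resolveOrCreate`, `NetworkProps`, `resolveNetwork`, and the string identifiers are all hypothetical stand-ins chosen so the example runs without the CDK.

```typescript
// Illustrative only: a generic "use existing, otherwise create" helper.
// None of these names are part of the construct's API.
function resolveOrCreate<T>(existing: T | undefined, create: () => T): T {
  return existing !== undefined ? existing : create();
}

// Hypothetical stand-ins for optional, user-supplied resources.
interface NetworkProps {
  existingVpcId?: string;
  existingSecurityGroupId?: string;
}

// Returns the supplied identifiers when present and placeholder
// "new" identifiers otherwise, mirroring the defaults described above.
function resolveNetwork(props: NetworkProps): { vpcId: string; securityGroupId: string } {
  return {
    vpcId: resolveOrCreate(props.existingVpcId, () => 'new-vpc'),
    securityGroupId: resolveOrCreate(props.existingSecurityGroupId, () => 'new-sg'),
  };
}
```

Calling `resolveNetwork({ existingVpcId: 'vpc-123' })` keeps the supplied VPC and falls back to a new security group, which is the same per-resource decision the defaults describe.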
By default, the construct enables logging and tracing on all services that support those features. Observability can be turned off by setting the pattern property observability to false.
- AWS Lambda: AWS X-Ray, Amazon CloudWatch Logs
- AWS Step Functions: AWS X-Ray, Amazon CloudWatch Logs
- AWS AppSync GraphQL API: AWS X-Ray, Amazon CloudWatch Logs
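
As a minimal sketch of the default-on behavior, the snippet below models only the observability flag. `ObservabilityProps` and `isObservabilityEnabled` are hypothetical names used for illustration. The only detail taken from this document is that the property is called observability and that setting it to false turns logging and tracing off.

```typescript
// Hypothetical stand-in for the relevant part of the construct's props.
interface ObservabilityProps {
  observability?: boolean;
}

// Observability stays enabled unless it is explicitly set to false.
function isObservabilityEnabled(props: ObservabilityProps): boolean {
  return props.observability !== false;
}

// Omitting the flag keeps the default (logging and tracing enabled).
const defaultProps: ObservabilityProps = {};

// Setting the flag to false opts out of logging and tracing.
const quietProps: ObservabilityProps = { observability: false };
```

In an actual stack, the same `observability: false` entry would go in the props object passed to the construct.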
When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This shared responsibility model reduces your operational burden because AWS operates, manages, and controls the components including the host operating system, virtualization layer, and physical security of the facilities in which the services operate. For more information about AWS security, visit AWS Cloud Security.
This construct requires you to provide an existing Amazon Cognito User Pool and an existing Amazon Kendra index. Please refer to the official documentation on best practices to secure those services.
Optionally, you can provide existing resources to the construct (marked optional in the construct pattern props). If you choose to do so, please refer to the official documentation on best practices to secure each service.
If you grant a user access to the account where this construct is deployed, that user may access information stored by the construct (Amazon Simple Storage Service buckets, Amazon Kendra, Amazon CloudWatch Logs). To help secure your AWS resources, please follow the best practices for AWS Identity and Access Management (IAM).
AWS CloudTrail provides a number of security features to consider as you develop and implement your own security policies. Please follow the related best practices through the official documentation.
Note:
- Validate each file before you ingest it with this construct, following file input validation best practices.
- Ensure you only ingest the appropriate documents into your knowledge base. Any results returned by the knowledge base are eligible for inclusion in the prompt and may therefore be sent to the LLM. If using a third-party LLM, ensure you audit the documents contained within your knowledge base.
- This construct provides several configurable options for logging. Please consider security best practices when enabling or disabling logging and related features. Verbose logging, for instance, may log the content of API calls. You can disable this functionality by ensuring the observability flag is set to false.
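
A pre-upload check along the lines of the note above could look like the following sketch. It is not part of the construct. The allow-list of extensions and the size limit are assumptions made for illustration and should be replaced with the formats and limits documented for Amazon Kendra and your own policies. `CandidateFile` and `validateFile` are hypothetical names.

```typescript
// Assumed allow-list and size limit: confirm both against the Amazon Kendra
// documentation and your own requirements before relying on them.
const ALLOWED_EXTENSIONS = new Set(['pdf', 'html', 'txt', 'docx', 'csv', 'json', 'md']);
const MAX_SIZE_BYTES = 50 * 1024 * 1024;

// Hypothetical description of a file about to be uploaded.
interface CandidateFile {
  name: string;
  sizeBytes: number;
}

// Returns a list of problems; an empty list means the file passed these checks.
function validateFile(file: CandidateFile): string[] {
  const problems: string[] = [];
  const dot = file.name.lastIndexOf('.');
  const ext = dot > 0 ? file.name.slice(dot + 1).toLowerCase() : '';
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    problems.push(`unsupported extension: "${ext}"`);
  }
  if (file.sizeBytes <= 0) {
    problems.push('file is empty');
  }
  if (file.sizeBytes > MAX_SIZE_BYTES) {
    problems.push('file exceeds size limit');
  }
  // Reject names that carry path separators or parent-directory segments.
  if (/[\\\/]|\.\./.test(file.name)) {
    problems.push('file name contains path segments');
  }
  return problems;
}
```

Checks like these only cover file names and sizes. Content-level validation, such as malware scanning or reviewing what the documents contain, still needs to happen before ingestion.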
This solution uses the Amazon Kendra service, which is not currently available in all AWS Regions. You must launch this construct in an AWS Region where Amazon Kendra is available. For the most current availability of AWS services by Region, see the AWS Regional Services List.
Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.
Make sure you have sufficient quota for each of the services implemented in this solution. For more information, refer to AWS service quotas.
To view the service quotas for all AWS services in the documentation without switching pages, view the information in the Service endpoints and quotas page in the PDF instead.
When deleting a stack that uses this construct, follow these steps to avoid unexpected charges:
- Empty and delete the Amazon S3 bucket(s) created by this construct if you did not provide existing ones when creating the construct.
- If the observability flag is turned on, delete all the associated logs created by the different services in Amazon CloudWatch Logs.
© Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.