How to Transcribe and Translate Audio in Real Time with a Serverless Architecture

2024-10-01

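Serverless technology makes it possible to build flexible, efficient, and cost-effective transcription and translation solutions that help businesses and individuals communicate seamlessly in multilingual settings. This article shows how to use Amazon Transcribe and Amazon Translate from a simple event-driven serverless architecture, and builds a small orchestration that transcribes and translates audio files.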

I. Architecture Overview

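When an audio file is uploaded to an Amazon Simple Storage Service (Amazon S3) bucket under the “audio/” prefix, it triggers an AWS Lambda function that uses Amazon Transcribe to extract the text from the audio. The transcript is saved under the “transcriptions/” prefix of the same bucket.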

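When a new file is created under “transcriptions/”, a second AWS Lambda function is triggered to perform the translation. To keep things simple, we assume the recording is in English and should be translated into Italian. This function writes the translation under the “translations/” prefix.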
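
To make the trigger more concrete, here is a minimal sketch of how a handler could read the object keys from the S3 notification it receives. The `S3EventLike` type is a hand-written subset of the real event, and `objectKeysFromEvent` is a hypothetical helper, not part of any AWS SDK. It relies on the fact that S3 notifications URL-encode object keys and use "+" for spaces.

```typescript
// Minimal subset of the S3 notification payload (assumption: only the
// fields used here are modelled; the real event carries many more).
interface S3EventLike {
  Records: { s3: { bucket: { name: string }; object: { key: string } } }[];
}

// Hypothetical helper: returns the decoded key of every record in the event.
function objectKeysFromEvent(s3Event: S3EventLike): string[] {
  return s3Event.Records.map((record) =>
    // Keys arrive URL-encoded, with "+" standing in for spaces.
    decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
  );
}

// Usage with a hand-built sample event
const sampleEvent: S3EventLike = {
  Records: [
    { s3: { bucket: { name: 'example-bucket' }, object: { key: 'audio/team+meeting.mp3' } } },
  ],
};
console.log(objectKeysFromEvent(sampleEvent));
```

Reading every record, rather than only the first, is a small safeguard in case a single invocation carries more than one record.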

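This orchestration works well for small prototypes and proofs of concept. In this example, Amazon S3 acts as the message bus. To take the architecture to production, consider a more capable orchestration tool such as AWS Step Functions.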

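Notes:

1. When orchestrating with Amazon S3, S3 prefixes, and AWS Lambda, take care to avoid recursive Lambda invocations, which generate unnecessary charges. For example, if either of the two functions above mistakenly wrote to the “audio/” prefix, the pipeline would fall into an infinite loop. To guard against this, limit how often the functions can be invoked and consider adding an emergency stop switch to every orchestrator. A common strategy is to give each orchestrator a deny list of Amazon S3 paths.

2. If one of the denied paths shows up in the event that triggered a function, the function can be throttled, the event can be ignored, or the event can be forwarded to a dead-letter queue with an alert sent to a notification system. Some users prefer a separate bucket for each stage. If you do, keep the AWS account bucket quota in mind and request a service quota increase if you need more buckets.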
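
The deny-list idea from the notes above could be sketched as follows. This is a minimal illustration, not the original author's implementation: `TRANSCRIBE_DENY_PREFIXES`, `isDeniedKey`, and `guardEvent` are hypothetical names, and the list assumes the guard runs inside the transcription function, which should only ever see keys under “audio/”.

```typescript
// Hypothetical deny list for the transcription function: it is triggered by
// "audio/" and must never react to objects produced by later stages.
const TRANSCRIBE_DENY_PREFIXES: readonly string[] = ['transcriptions/', 'translations/'];

// Returns true when the key falls under any denied prefix.
function isDeniedKey(objectKey: string, denyPrefixes: readonly string[]): boolean {
  // Tolerate leading slashes so "/audio/x" and "audio/x" are treated alike.
  const normalized = objectKey.replace(/^\/+/, '');
  return denyPrefixes.some((prefix) => normalized.startsWith(prefix));
}

// Decides what to do with an incoming key. On "reject" the caller would skip
// the event, forward it to a dead-letter queue, or raise an alert.
function guardEvent(objectKey: string): 'process' | 'reject' {
  return isDeniedKey(objectKey, TRANSCRIBE_DENY_PREFIXES) ? 'reject' : 'process';
}

console.log(guardEvent('audio/file.mp3'));
console.log(guardEvent('transcriptions/file.mp3.json'));
```

The same check can be applied before writing: a function that refuses to write to a key under its own trigger prefix cannot start a loop by itself.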

II. Create the AWS CDK Stack

1. Install the AWS CDK CLI

If the AWS CDK is not installed yet, install it with npm:

npm install -g aws-cdk

2. Create a New AWS CDK Project

Create a new directory for the AWS CDK project and initialize it:

mkdir transcribe-translate
cd transcribe-translate
cdk init app --language typescript

3. Add the Required AWS CDK Dependencies

The stack in this tutorial imports everything from aws-cdk-lib (AWS CDK v2), which “cdk init” already adds to the project together with constructs, so the per-service @aws-cdk/aws-* packages from CDK v1 are not needed. If the two packages are missing, install them:

npm install aws-cdk-lib constructs

NodejsFunction bundles the handlers with esbuild and falls back to Docker when esbuild is not installed locally, so you may also want to add it as a development dependency:

npm install --save-dev esbuild

4. Define the Infrastructure

Edit “lib/transcribe-translate-stack.ts” to define the resources:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as s3n from 'aws-cdk-lib/aws-s3-notifications';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as path from 'path';

export class TranscribeTranslateStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create the S3 bucket
    const bucket = new s3.Bucket(this, 'UploadBucket', {
      removalPolicy: cdk.RemovalPolicy.DESTROY, // this is OK only for dev
      autoDeleteObjects: true, // this is OK only for dev
    });

    // Create the Transcribe Lambda function
    const transcribeFunction = new NodejsFunction(this, 'TranscribeFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      entry: path.join(__dirname, '../lambda/transcribe.mjs'),
      handler: 'handler',
      environment: {
        BUCKET_NAME: bucket.bucketName,
      },
      timeout: cdk.Duration.seconds(900),
    });

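    // Grant the transcribe function read access to the uploaded audio and
    // read/write access to the transcripts; Amazon Transcribe accesses the
    // bucket with the permissions of the role that starts the job
    bucket.grantRead(transcribeFunction, 'audio/*');
    bucket.grantReadWrite(transcribeFunction, 'transcriptions/*');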

    // Additional permissions for Transcribe
    transcribeFunction.addToRolePolicy(new iam.PolicyStatement({
      actions: ['transcribe:StartTranscriptionJob', 'transcribe:GetTranscriptionJob'],
      resources: ['*'],
    }));

    // Add S3 event notification to trigger Transcribe function
    bucket.addEventNotification(s3.EventType.OBJECT_CREATED, new s3n.LambdaDestination(transcribeFunction), {
      prefix: 'audio/',
    });

    // Create the Translate Lambda function
    const translateFunction = new NodejsFunction(this, 'TranslateFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      entry: path.join(__dirname, '../lambda/translate.mjs'),
      handler: 'handler',
      environment: {
        BUCKET_NAME: bucket.bucketName,
      },
      timeout: cdk.Duration.seconds(900),
    });

    // Grant read/write permissions on specific S3 objects for the translate function
    bucket.grantRead(translateFunction, 'transcriptions/*');
    bucket.grantReadWrite(translateFunction,'translations/*');

    // Additional permissions for Translate
    translateFunction.addToRolePolicy(new iam.PolicyStatement({
      actions: ['translate:TranslateText'],
      resources: ['*'],
    }));

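    // Add S3 event notification to trigger Translate function.
    // The suffix filter limits the trigger to the JSON transcripts, in case
    // other objects (such as temporary files) appear under this prefix
    bucket.addEventNotification(s3.EventType.OBJECT_CREATED, new s3n.LambdaDestination(translateFunction), {
      prefix: 'transcriptions/',
      suffix: '.json',
    });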
  }
}

III. Trigger the Transcription Job

Create a “lambda/transcribe.mjs” file in your local repository containing the code that starts the transcription job. When the job completes, Amazon Transcribe creates a new object under “transcriptions/”, which triggers the translation stage of the orchestration.

The example below starts the transcription job but does not monitor whether it completes successfully. This optimistic approach is acceptable only for a prototype. In production, poll the Amazon Transcribe API with GetTranscriptionJobCommand (https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/transcribe/command/GetTranscriptionJobCommand/) so that you know when the job has completed successfully.

import { TranscribeClient, StartTranscriptionJobCommand } from '@aws-sdk/client-transcribe';

const transcribeClient = new TranscribeClient();

export const handler = async (event) => {
    const bucket = process.env.BUCKET_NAME;
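    // S3 event keys are URL-encoded (spaces arrive as "+"), so decode before use
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));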

    const transcribeParams = {
        TranscriptionJobName: `TranscribeJob-${Date.now()}`,
        LanguageCode: 'en-US',
        Media: {
            MediaFileUri: `s3://${bucket}/${key}`
        },
        OutputBucketName: bucket,
        OutputKey: `transcriptions/${key.split("/").pop()}.json`
    };

    try {
        await transcribeClient.send(new StartTranscriptionJobCommand(transcribeParams));
        console.log('Transcription job started');
    } catch (err) {
        console.error('Error starting transcription job', err);
        return;
    }

};
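
The handler above starts the job and returns immediately. For a production setup, where polling is recommended, a polling loop could look roughly like the sketch below. It is written against an injected `fetchStatus` callback so it runs without AWS credentials. In a real function that callback would send `GetTranscriptionJobCommand` and return `response.TranscriptionJob?.TranscriptionJobStatus`. The names `waitForTranscriptionJob` and `PollOptions`, the interval, and the attempt limit are all assumptions made for illustration.

```typescript
// Job states reported by Amazon Transcribe for a transcription job.
type TranscriptionStatus = 'QUEUED' | 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';

// Hypothetical options; the defaults below are illustrative, not recommendations.
interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
}

const pause = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls fetchStatus until the job reaches a terminal state or attempts run out.
async function waitForTranscriptionJob(
  fetchStatus: () => Promise<TranscriptionStatus>,
  options: PollOptions = { intervalMs: 5000, maxAttempts: 60 }
): Promise<TranscriptionStatus> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const current = await fetchStatus();
    if (current === 'COMPLETED' || current === 'FAILED') {
      return current;
    }
    // Wait before the next attempt, but not after the last one.
    if (attempt < options.maxAttempts) {
      await pause(options.intervalMs);
    }
  }
  throw new Error(`Job still running after ${options.maxAttempts} attempts`);
}

// Usage with a stubbed status source that completes on the third call
let fakeCalls = 0;
const fakeStatus = async (): Promise<TranscriptionStatus> =>
  ++fakeCalls < 3 ? 'IN_PROGRESS' : 'COMPLETED';

waitForTranscriptionJob(fakeStatus, { intervalMs: 10, maxAttempts: 5 })
  .then((finalStatus) => console.log('Job finished with status', finalStatus));
```

Polling inside a Lambda function is billed for the time spent waiting and is bounded by the function timeout, which is one reason the article suggests AWS Step Functions for production.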

IV. Trigger the Translation Job

Create a “lambda/translate.mjs” file containing the code for the translation job:

import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3';
import { TranslateClient, TranslateTextCommand } from '@aws-sdk/client-translate';

const s3Client = new S3Client();
const translateClient = new TranslateClient();

export const handler = async (event) => {
    const bucket = process.env.BUCKET_NAME;
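    // S3 event keys are URL-encoded (spaces arrive as "+"), so decode before use
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));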

    // Get the transcription text from S3
    let transcriptionData;
    
    try {
        transcriptionData = await s3Client.send(new GetObjectCommand({
            Bucket: bucket,
            Key: key
        }));
    } catch (err) {
        console.error('Error getting transcription from S3', err);
        return;
    }

    const { results } = JSON.parse(await streamToString(transcriptionData.Body));

    const transcriptionText = results.transcripts.reduce(
        (acc, curr) => acc + curr.transcript, ''
    );
    console.log('Transcription text:', transcriptionText);

    // Translate the transcription
    const translateParams = {
        SourceLanguageCode: 'en',
        TargetLanguageCode: 'it',
        Text: transcriptionText
    };

    console.log(translateParams);

    let translation; 
    
    try{
        translation = await translateClient.send(new TranslateTextCommand(translateParams));
        console.log('Translation:', translation.TranslatedText);
    }catch(err){
        console.error('Error translating text', err);
        return;
    }

    // Save the translation result to S3
    const translationResult = {
        originalText: transcriptionText,
        translatedText: translation.TranslatedText
    };

    try{
        await s3Client.send(new PutObjectCommand({
            Bucket: bucket,
            Key: `translations/${key.split("/").pop()}.it.txt`,
            Body: translationResult.translatedText,
            ContentType: 'text/plain'
        }));
    }catch(err){
        console.error('Error saving translation to S3', err);
        return;
    }

    console.log('Translation saved');
};

// Helper function to convert a stream to a string
const streamToString = (stream) => {
    return new Promise((resolve, reject) => {
        const chunks = [];
        stream.on('data', (chunk) => chunks.push(chunk));
        stream.on('error', reject);
        stream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
    });
};
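
The handler above sends the whole transcript in a single TranslateText call, which works for short recordings. TranslateText limits the size of the text in one request (10,000 bytes of UTF-8 at the time of writing), so longer transcripts need to be split first. The sketch below shows one possible way to do that at sentence boundaries. `splitForTranslate` and `MAX_TRANSLATE_BYTES` are hypothetical names, and the sentence detection is deliberately naive.

```typescript
// Assumption: TranslateText accepts at most 10,000 bytes of UTF-8 text per call.
const MAX_TRANSLATE_BYTES = 10000;

const utf8Length = (text: string): number => new TextEncoder().encode(text).length;

// Splits text at sentence boundaries into chunks that each fit within maxBytes.
// A single sentence longer than maxBytes is kept whole and would still need
// a finer split before being sent.
function splitForTranslate(text: string, maxBytes: number = MAX_TRANSLATE_BYTES): string[] {
  // Each match is a sentence with its closing punctuation and trailing
  // whitespace, or the unpunctuated remainder at the end of the text.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  let currentChunk = '';
  for (const sentence of sentences) {
    if (currentChunk !== '' && utf8Length(currentChunk + sentence) > maxBytes) {
      chunks.push(currentChunk);
      currentChunk = '';
    }
    currentChunk += sentence;
  }
  if (currentChunk !== '') {
    chunks.push(currentChunk);
  }
  return chunks;
}

// Usage with a tiny limit so the split is visible
const demoChunks = splitForTranslate('One. Two. Three.', 10);
console.log(demoChunks);
```

Each chunk would then be sent in its own TranslateTextCommand and the translated pieces joined in order before the result is written to S3.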

V. Deploy

Run the following commands to deploy:

cdk bootstrap # if you haven't bootstrapped in this region before
cdk deploy

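When the deployment finishes, the CLI prints the stack outputs, if there are any. The stack shown above declares none, so to have the Amazon S3 bucket name printed, add new cdk.CfnOutput(this, 'BucketName', { value: bucket.bucketName }); to the stack constructor (the output ID 'BucketName' is arbitrary), or look the bucket up in the AWS Management Console.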

VI. Test

To test, upload an MP3 file to the “audio/” prefix. You can upload it with the following AWS CLI command:

aws s3 cp ./path/to/your/audio/file s3://your-bucket-here/audio/file.mp3

Alternatively, open the AWS Management Console and upload the MP3 file directly to “audio/”. After a short while, the output of each stage appears under “transcriptions/” and “translations/”.
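
To know which objects to look for, the key templates used by the two handlers can be traced with a short sketch. `expectedOutputKeys` is a hypothetical helper that only mirrors the string templates from “lambda/transcribe.mjs” and “lambda/translate.mjs”. It does not call AWS.

```typescript
// Mirrors the key templates used by the two handlers in this tutorial.
function expectedOutputKeys(audioKey: string): { transcription: string; translation: string } {
  // Same idea as key.split("/").pop() in the handlers: keep the last path segment.
  const audioName = audioKey.split('/').pop() ?? audioKey;
  const transcription = `transcriptions/${audioName}.json`;
  // The translate handler applies the same rule to the transcription key.
  const transcriptionName = transcription.split('/').pop() ?? transcription;
  const translation = `translations/${transcriptionName}.it.txt`;
  return { transcription, translation };
}

console.log(expectedOutputKeys('audio/file.mp3'));
```

For “audio/file.mp3” this gives “transcriptions/file.mp3.json” and “translations/file.mp3.json.it.txt”, since each stage appends its extension to the previous file name.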

VII. Clean Up

When you are done, tear down the infrastructure with the following command:

cdk destroy
