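SAA Exam Daily Practice - 2024/11/26

Source: Amazon AWS Certified Solutions Architect - Associate SAA-C03 Exam
15 questions (No. 66 to No. 80), for my own review only.
If this infringes any rights, please contact me and I will remove it.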

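🌟 Vocabulary:

approximately_{adv. about, roughly, around}
critical_{adj. expressing criticism; crucial, decisive; serious, grave; careful, rigorous; relating to reviews; fault-finding; at a threshold}
appropriate_{adj. suitable; proper, fitting | v. to allocate (funds); to take over, to misappropriate}
visibility_{n. the state of being visible; how far one can see}
corruption_{n. corruption, graft, bribery}
bastion_{n. fortress, defender, fortification}
transaction_{n. deal, business; processing, handling}
instrumentation_{n. instruments (gauges); orchestration of music}
delivery_{n. delivery; handover; childbirth}
unpredictable_{adj. impossible to foretell, not predictable, hard to anticipate, (of a person) changeable}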

1. S3 Lifecycle

A company has an application that generates a large number of files, each approximately_about 5 MB in size. The files are stored in Amazon S3. Company policy requires the files to be stored for 4 years before they can be deleted. Immediate accessibility is always required as the files contain critical_important business data that is not easy to reproduce. The files are frequently accessed in the first 30 days of the object creation but are rarely accessed after the first 30 days.
Which storage solution is MOST cost-effective?

Create an S3 bucket lifecycle policy to move files from S3 Standard to S3 Glacier 30 days from object creation. Delete the files 4 years after object creation.
Create an S3 bucket lifecycle policy to move files from S3 Standard to S3 One Zone-Infrequent Access (S3 One Zone-IA) 30 days from object creation. Delete the files 4 years after object creation.
✅ Create an S3 bucket lifecycle policy to move files from S3 Standard to S3 Standard-Infrequent Access (S3 Standard-IA) 30 days from object creation. Delete the files 4 years after object creation.
Create an S3 bucket lifecycle policy to move files from S3 Standard to S3 Standard-Infrequent Access (S3 Standard-IA) 30 days from object creation. Move the files to S3 Glacier 4 years after object creation.

✨ Keywords: stored for 4 years, Immediate accessibility is always required, frequently accessed in the first 30 days, rarely accessed after the first 30 days

3️⃣ ✅

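💡 Analysis: The files must be kept for 4 years, are accessed frequently during the first month and only occasionally afterwards, but must always be immediately retrievable. The cheapest option is wanted.
Immediate retrieval means staying with S3 Standard plus S3 Standard-IA, tied together by a lifecycle policy, so pick 3️⃣.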
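
The lifecycle policy in 3️⃣ can be sketched as the configuration document below. This is a minimal sketch under stated assumptions: the rule ID is hypothetical, and 4 years is approximated as 4 x 365 days. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument; no AWS call is made here.

```python
def build_lifecycle_configuration(transition_days=30, retention_years=4):
    """Build an S3 lifecycle configuration: Standard -> Standard-IA, then delete."""
    return {
        "Rules": [
            {
                "ID": "standard-ia-then-expire",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "STANDARD_IA"}
                ],
                # Assumption: a year is counted as 365 days.
                "Expiration": {"Days": retention_years * 365},
            }
        ]
    }


config = build_lifecycle_configuration()
```

Swapping `STANDARD_IA` for a Glacier class in the transition is what option 1️⃣ would look like, which is where the retrieval-time concern comes from.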

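The community still debates 1️⃣ and 2️⃣. 2️⃣ is clearly wrong: One Zone-IA keeps data in a single Availability Zone, so losing that zone means losing the data.
As for 1️⃣, S3 Glacier actually comes in several classes: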

S3 Glacier Instant Retrieval

S3 Glacier Flexible Retrieval

S3 Glacier Deep Archive

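S3 Glacier Deep Archive is for long-term archiving, with retrieval times from about 12 hours (standard) up to 48 hours (bulk). I know it well already, so I will skip it.

Amazon S3 Glacier Instant Retrieval: very low-cost archive storage with millisecond retrieval, for data that is rarely accessed

Amazon S3 Glacier Instant Retrieval is an archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed and requires retrieval in milliseconds. If you access your data once per quarter, S3 Glacier Instant Retrieval can save you up to 68% on storage costs compared with the S3 Standard-Infrequent Access (S3 Standard-IA) storage class.
It is designed for rarely accessed data that still needs immediate access in performance-sensitive use cases, such as image hosting, online file-sharing applications, medical imaging and health records, news media assets, and genomics.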

S3 Glacier Flexible Retrieval

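For archive data that does not require immediate access but needs the flexibility to retrieve large sets of data at no cost, such as backup or disaster recovery use cases, choose S3 Glacier Flexible Retrieval, which offers retrieval in minutes or free bulk retrievals in 5-12 hours.

Objects stored in S3 Glacier Flexible Retrieval are not available for real-time access. To access them, you first initiate a restore request, which creates a temporary copy of the object that you can access once the request completes.

One suits objects such as medical images that are accessed about once a quarter, the other suits disaster recovery; neither matches the occasional access described in the question.
Besides, questions that test the individual Glacier storage classes are rare, so there is no need to agonize over this.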

👨‍👨‍👦‍👦 Community discussion: I think C should be the answer here,

Immediate accessibility is always required as the files contain critical business data that is not easy to reproduce

If they do not explicitly mention that they are using Glacier Instant Retrieval, we should assume that Glacier -> takes more time to retrieve and may not meet the requirements

2. Multi-consumers in SQS

A company hosts an application on multiple Amazon EC2 instances. The application processes messages from an Amazon SQS queue, writes to an Amazon RDS table, and deletes the message from the queue. Occasional duplicate records are found in the RDS table. The SQS queue does not contain any duplicate messages.
What should a solutions architect do to ensure messages are being processed once only?

Use the CreateQueue API call to create a new queue.
Use the AddPermission API call to add appropriate_suitable permissions.
Use the ReceiveMessage API call to set an appropriate wait time.
✅ Use the ChangeMessageVisibility API call to increase the visibility_visible timeout.

✨ Keywords: Occasional duplicate records are found in the RDS table

4️⃣ ✅

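💡 Analysis: Multiple EC2 hosts consume messages from an SQS queue and insert rows into an RDS table, but some messages are consumed more than once and create duplicate rows. What ensures each message is processed only once?
Using ChangeMessageVisibility to keep a message hidden for longer ensures that other consumers do not pick it up while it is still being processed.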

ChangeMessageVisibility

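Changes the visibility timeout of a specified message in a queue to a new value. The default visibility timeout for a message is 30 seconds. The minimum is 0 seconds. The maximum is 12 hours.

For example, if the default timeout for a queue is 60 seconds, 15 seconds have elapsed since you received the message, and you send a ChangeMessageVisibility call with the visibility timeout set to 10 seconds, the 10 seconds begin to count from the time that you make the ChangeMessageVisibility call. Thus, any attempt to change the visibility timeout or to delete that message 10 seconds after you initially change the visibility timeout (a total of 25 seconds) might result in an error.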
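
The timing in that example can be worked through with a small model. This is only a sketch of the arithmetic, not SQS itself: the function names are made up for illustration, and times are plain seconds counted from an arbitrary zero.

```python
def visible_again_at(received_at, default_timeout, change_at=None, new_timeout=None):
    """Return the time (in seconds) at which a received message becomes visible again."""
    if change_at is None:
        return received_at + default_timeout
    # ChangeMessageVisibility restarts the clock from the moment of the call.
    return change_at + new_timeout


def may_be_redelivered(received_at, processing_seconds, visible_at):
    """True if the message reappears in the queue before processing has finished."""
    return received_at + processing_seconds > visible_at


# The documentation example: 60 s default, changed to 10 s after 15 s have elapsed.
doc_example = visible_again_at(0, 60, change_at=15, new_timeout=10)

# A 40-second job against the 30-second default leaves a window for duplicates.
short_timeout = may_be_redelivered(0, 40, visible_again_at(0, 30))
# Raising the timeout to 120 seconds right after receipt closes that window.
long_timeout = may_be_redelivered(0, 40, visible_again_at(0, 30, change_at=0, new_timeout=120))
```

The second pair of calls is the situation in the question: processing that outlasts the visibility timeout lets a second consumer receive the same message and write a duplicate row.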

👨‍👨‍👦‍👦 Community discussion: In case of SQS multi-consumers, if one consumer has already picked the message and is processing it, in the meantime another consumer can pick it up and process the message, thereby adding two copies at the end. To avoid this, the message is made invisible from the time it is picked and deleted after processing. This visibility timeout is increased according to the max time taken to process the message

3. AWS Direct Connect & VPN

A solutions architect is designing a new hybrid architecture to extend a company’s on-premises infrastructure to AWS. The company requires a highly available connection with consistent low latency to an AWS Region. The company needs to minimize costs and is willing to accept slower traffic if the primary connection fails.
What should the solutions architect do to meet these requirements?

✅ Provision an AWS Direct Connect connection to a Region. Provision a VPN connection as a backup if the primary Direct Connect connection fails.
Provision a VPN tunnel connection to a Region for private connectivity. Provision a second VPN tunnel for private connectivity and as a backup if the primary VPN connection fails.
Provision an AWS Direct Connect connection to a Region. Provision a second Direct Connect connection to the same Region as a backup if the primary Direct Connect connection fails.
Provision an AWS Direct Connect connection to a Region. Use the Direct Connect failover attribute from the AWS CLI to automatically create a backup connection if the primary Direct Connect connection fails.

✨ Keywords: highly available connection with consistent low latency, accept slower traffic if the primary connection fails

1️⃣ ✅

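💡 Analysis: Extend the on-premises infrastructure to AWS as a hybrid architecture. The connection must be highly available with consistently low latency; a slower link is acceptable if the primary connection fails, and cost should be minimized.
For a low-latency connection to a Region, AWS Direct Connect is the first choice; a VPN is enough for the backup connection.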

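A quick run through the common connectivity options:

AWS Site-to-Site VPN

Builds a virtual private network (VPN) over the public internet to connect an on-premises data center to an AWS VPC

⭐ Operates at the VPC level

Creates two VPN tunnels for redundancy

AWS VPN CloudHub

Networks multiple VPCs and on-premises data centers together

AWS Client VPN

A managed VPN service provided by AWS

Software VPN

Run your own VPN on EC2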

AWS Direct Connect(DX)

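Connects to AWS over a dedicated line, roughly: on-premises data center <-> AWS Direct Connect location <-> a VPC in some Region

Whichever option you choose, setting it up takes more than 1 month

Transit VPC

Not managed by AWS

A cross-site networking approach similar to ZeroTier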

👨‍👨‍👦‍👦 Community discussion: Direct Connect + VPN best of both

Direct Connect goes through 1 Gbps, 10 Gbps or 100 Gbps and the VPN goes up to 1.25 Gbps.
https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/aws-direct-connect-vpn.html

4. HA database

A company is running a business-critical web application on Amazon EC2 instances behind an Application Load Balancer. The EC2 instances are in an Auto Scaling group. The application uses an Amazon Aurora PostgreSQL database that is deployed in a single Availability Zone. The company wants the application to be highly available with minimum downtime and minimum loss of data.
Which solution will meet these requirements with the LEAST operational effort?

Place the EC2 instances in different AWS Regions. Use Amazon Route 53 health checks to redirect traffic. Use Aurora PostgreSQL Cross-Region Replication.
✅ Configure the Auto Scaling group to use multiple Availability Zones. Configure the database as Multi-AZ. Configure an Amazon RDS Proxy instance for the database.
Configure the Auto Scaling group to use one Availability Zone. Generate hourly snapshots of the database. Recover the database from the snapshots in the event of a failure.
Configure the Auto Scaling group to use multiple AWS Regions. Write the data from the application to Amazon S3. Use S3 Event Notifications to launch an AWS Lambda function to write the data to the database.

✨ Keywords: Amazon Aurora PostgreSQL database that is deployed in a single Availability Zone, highly available

2️⃣ ✅

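💡 Analysis: The application runs on an EC2 Auto Scaling group behind an ALB and uses an Aurora PostgreSQL database in a single Availability Zone. The company wants high availability with minimal data loss, using the least operational effort.
The Aurora PostgreSQL database needs to be extended across Availability Zones.

Note that in this scenario the cross-AZ replica of the Aurora PostgreSQL database mainly serves as a failover target that is promoted to primary when the primary fails.

When you create a second, third, and so on DB instance in an Aurora provisioned DB cluster, Aurora automatically sets up replication from the writer DB instance to all the other DB instances. These other DB instances are read-only and are known as Aurora Replicas. We also refer to them as reader instances when discussing the ways that you can combine writer and reader DB instances within a cluster.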

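Reads and writes both go to the primary (writer) instance. The quote below describes manually promoting an RDS read replica, which takes several minutes; an Aurora cluster, by contrast, fails over to an Aurora Replica automatically. Strictly speaking, Amazon RDS Proxy is therefore not required for high availability in this scenario, although it does keep application connections alive across a failover.

The promotion process takes a few minutes to complete. When you promote a read replica, RDS stops replication and reboots the read replica. When the reboot is complete, the read replica is available as a new DB instance.

Amazon RDS Proxy: improve the scalability, resilience, and security of your applications

By using Amazon RDS Proxy, you can allow your applications to pool and share database connections to improve their ability to scale. RDS Proxy makes applications more resilient to database failures by automatically connecting to a standby DB instance while preserving application connections. By using RDS Proxy, you can also enforce AWS Identity and Access Management (IAM) authentication for databases, and securely store credentials in AWS Secrets Manager.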

👨‍👨‍👦‍👦 Community discussion: By configuring the Auto Scaling group to use multiple Availability Zones, the application will be able to continue running even if one Availability Zone goes down. Configuring the database as Multi-AZ will also ensure that the database remains available in the event of a failure in one Availability Zone. Using an Amazon RDS Proxy instance for the database will allow the application to automatically route traffic to healthy database instances, further increasing the availability of the application.
This solution will meet the requirements for high availability with minimal operational effort.

5. NLB and ALB

A company’s HTTP application is behind a Network Load Balancer (NLB). The NLB’s target group is configured to use an Amazon EC2 Auto Scaling group with multiple EC2 instances that run the web service.
The company notices that the NLB is not detecting HTTP errors for the application. These errors require a manual restart of the EC2 instances that run the web service. The company needs to improve the application’s availability without writing custom scripts or code.
What should a solutions architect do to meet these requirements?

Enable HTTP health checks on the NLB, supplying the URL of the company’s application.
Add a cron job to the EC2 instances to check the local application’s logs once each minute. If HTTP errors are detected, the application will restart.
✅ Replace the NLB with an Application Load Balancer. Enable HTTP health checks by supplying the URL of the company’s application. Configure an Auto Scaling action to replace unhealthy instances.
Create an Amazon CloudWatch alarm that monitors the UnhealthyHostCount metric for the NLB. Configure an Auto Scaling action to replace unhealthy instances when the alarm is in the ALARM state.

✨ Keywords: NLB, HTTP errors

3️⃣ ✅

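💡 Analysis: The company's application runs on an EC2 Auto Scaling group behind an NLB. It now needs instances that return HTTP errors to be replaced automatically, without writing custom scripts or code.
An NLB works at layer 4 and handles TCP and UDP traffic, while an ALB works at layer 7 and handles HTTP and HTTPS requests. Replacing the NLB with an ALB, enabling health checks, and letting Auto Scaling replace the instances that fail them solves the problem.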

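The community points out that an NLB can also run HTTP and HTTPS health checks, but it cannot be given a full URL to check, only a health check path.
Even so, when an ALB is available it is always the better choice for health checks based on HTTP and HTTPS errors.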
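
The health check setup in 3️⃣ can be sketched as two parameter sets. This is a hedged sketch, not a complete deployment: the `/health` path, port, thresholds, and grace period are hypothetical values, and the keys mirror parameters accepted by boto3's `create_target_group` (ELBv2) and `update_auto_scaling_group` (Auto Scaling). No AWS call is made here.

```python
def build_health_check_settings(path="/health"):
    """Return (target group settings, Auto Scaling group settings) as plain dicts."""
    target_group = {
        "Protocol": "HTTP",
        "Port": 80,
        "TargetType": "instance",
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": path,  # hypothetical application endpoint
        "HealthCheckIntervalSeconds": 30,
        "HealthyThresholdCount": 3,
        "UnhealthyThresholdCount": 2,
        # Only an HTTP 200 response counts as healthy.
        "Matcher": {"HttpCode": "200"},
    }
    auto_scaling_group = {
        # "ELB" makes Auto Scaling replace instances the load balancer marks unhealthy.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,  # seconds to wait after launch (assumed value)
    }
    return target_group, auto_scaling_group


target_group_settings, asg_settings = build_health_check_settings()
```

Setting the Auto Scaling health check type to `ELB` is what turns a failed HTTP check into an instance replacement without any custom script.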

👨‍👨‍👦‍👦 Community discussion: I would choose A, as NLB supports HTTP and HTTPS Health Checks, BUT you can’t put any URL (as proposed), only the node IP addresses.
So, the solution is C.

NLBs support HTTP, HTTPS and TCP health checks:
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html (check HealthCheckProtocol)
But NLBs only accept either selecting EC2 instances or IP addresses directly as targets. You can’t provide a URL to your endpoints, only a health check path (if you’re using HTTP or HTTPS health checks).

6. DynamoDB point-in-time recovery

A company runs a shopping application that uses Amazon DynamoDB to store customer information. In case of data corruption_damage, a solutions architect needs to design a solution that meets a recovery point objective (RPO) of 15 minutes and a recovery time objective (RTO) of 1 hour.
What should the solutions architect recommend to meet these requirements?

Configure DynamoDB global tables. For RPO recovery, point the application to a different AWS Region.
✅ Configure DynamoDB point-in-time recovery. For RPO recovery, restore to the desired point in time.
Export the DynamoDB data to Amazon S3 Glacier on a daily basis. For RPO recovery, import the data from S3 Glacier to DynamoDB.
Schedule Amazon Elastic Block Store (Amazon EBS) snapshots for the DynamoDB table every 15 minutes. For RPO recovery, restore the DynamoDB table by using the EBS snapshot.

✨ Keywords: DR, RPO for 15 minutes, RTO for 1 hour

2️⃣ ✅

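💡 Analysis: The application stores its data in DynamoDB. At most 15 minutes of data may be lost (RPO), and recovery must finish within 1 hour (RTO).
That calls for a backup at least every 15 minutes. I was not familiar with point-in-time recovery, but 2️⃣ is the only reasonable choice.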

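Point-in-time backups for DynamoDB

Amazon DynamoDB point-in-time recovery (PITR) provides automatic continuous backups of your DynamoDB table data. PITR backups are fully managed by DynamoDB and provide up to 35 days of recovery points at per-second granularity. With point-in-time recovery, you don't have to worry about creating, maintaining, or scheduling on-demand backups. This section provides an overview of how the process works in DynamoDB.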
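
A restore request against that 35-day window could be sketched as follows. This is a simplified illustration: the table names are hypothetical, the window check is my own helper rather than anything DynamoDB exposes, and the returned dictionary mirrors the arguments of boto3's `restore_table_to_point_in_time`. The real service also has an earliest and latest restorable time that this sketch does not query.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)  # recovery points are kept for up to 35 days


def build_restore_request(source_table, target_table, restore_time, now):
    """Validate the requested time against the PITR window and build the request."""
    if not (now - PITR_WINDOW <= restore_time <= now):
        raise ValueError("restore_time is outside the 35-day PITR window")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,  # PITR restores into a new table
        "RestoreDateTime": restore_time,
    }


now = datetime(2024, 11, 26, 12, 0, tzinfo=timezone.utc)
# Corruption noticed now: go back 10 minutes, well inside the 15-minute RPO.
request = build_restore_request(
    "customers", "customers-restored", now - timedelta(minutes=10), now
)
```

Because recovery points exist at per-second granularity, any RPO of a few minutes is covered as long as the corruption is noticed within 35 days.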

👨‍👨‍👦‍👦 Community discussion: A - DynamoDB global tables provide a multi-Region, multi-active database, but that is not valid “in case of data corruption”. In this case, you need a backup. This solution isn’t valid.
B - Point in Time Recovery is designed as a continuous backup just to recover it fast. It covers perfectly the RPO, and probably the RTO. https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/PointInTimeRecovery.html
C - A daily export will not cover the RPO of 15min.
D - DynamoDB is serverless… so what are these EBS snapshots taken from???

7. Gateway Endpoint

A company runs a photo processing application that needs to frequently upload and download pictures from Amazon S3 buckets that are located in the same AWS Region. A solutions architect has noticed an increased cost in data transfer fees and needs to implement a solution to reduce these costs.
How can the solutions architect meet this requirement?

Deploy Amazon API Gateway into a public subnet and adjust the route table to route S3 calls through it.
Deploy a NAT gateway into a public subnet and attach an endpoint policy that allows access to the S3 buckets.
Deploy the application into a public subnet and allow it to route through an internet gateway to access the S3 buckets.
✅ Deploy an S3 VPC gateway endpoint into the VPC and attach an endpoint policy that allows access to the S3 buckets.

✨ Keywords: same AWS Region, increased cost in data transfer fees

4️⃣ ✅

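💡 Analysis: An application in the same Region as its S3 buckets uploads and downloads pictures, and data transfer fees are rising.
Traffic between the VPC and S3 is probably going over the public internet instead of the AWS internal network. A gateway endpoint (supported only for S3 and DynamoDB) fixes this.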
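
The endpoint policy mentioned in 4️⃣ is an ordinary IAM-style JSON document. Below is a minimal sketch of one: the bucket name and the chosen actions are assumptions for illustration, and the function only builds the document rather than creating the endpoint.

```python
import json


def build_s3_endpoint_policy(bucket_names):
    """Build a gateway endpoint policy that allows access to the given buckets only."""
    resources = []
    for name in bucket_names:
        resources.append(f"arn:aws:s3:::{name}")    # the bucket itself (for listing)
        resources.append(f"arn:aws:s3:::{name}/*")  # the objects inside it
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                # Assumed action set for an upload/download workload.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": resources,
            }
        ],
    }


# "example-photo-bucket" is a hypothetical bucket name.
policy = build_s3_endpoint_policy(["example-photo-bucket"])
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON string is what would be attached to the gateway endpoint as its policy document.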

👨‍👨‍👦‍👦 Community discussion: The correct answer is Option D. Deploy an S3 VPC gateway endpoint into the VPC and attach an endpoint policy that allows access to the S3 buckets.
By deploying an S3 VPC gateway endpoint, the application can access the S3 buckets over a private network connection within the VPC, eliminating the need for data transfer over the internet. This can help reduce data transfer fees as well as improve the performance of the application. The endpoint policy can be used to specify which S3 buckets the application has access to.

8. Security Group

A company recently launched Linux-based application instances on Amazon EC2 in a private subnet and launched a Linux-based bastion_fortress host on an Amazon EC2 instance in a public subnet of a VPC. A solutions architect needs to connect from the on-premises network, through the company’s internet connection, to the bastion host, and to the application servers. The solutions architect must make sure that the security groups of all the EC2 instances will allow that access.
Which combination of steps should the solutions architect take to meet these requirements? (Choose two.)

Replace the current security group of the bastion host with one that only allows inbound access from the application instances.
Replace the current security group of the bastion host with one that only allows inbound access from the internal IP range for the company.
✅ Replace the current security group of the bastion host with one that only allows inbound access from the external IP range for the company.
✅ Replace the current security group of the application instances with one that allows inbound SSH access from only the private IP address of the bastion host.
Replace the current security group of the application instances with one that allows inbound SSH access from only the public IP address of the bastion host.

✨ Keywords: bastion host in public subnet, application in private subnet

3️⃣ 4️⃣ ✅

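💡 Analysis: The bastion host is in a public subnet and the application is in a private subnet; the goal is to connect from the on-premises network through the bastion host to the application hosts.
Every answer is a security group change.
The bastion host's security group must allow inbound traffic from the company's external (public) IP range, and the application instances must allow inbound traffic from the bastion host's private IP.

Two related points:

Security groups have no deny rules

Network ACLs, which do support deny rules, operate at the subnet level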
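
The two allow rules from the analysis above can be written out as data. This is a sketch only: the CIDR range and the private IP are hypothetical placeholders, and each dictionary follows the `IpPermissions` entry shape used by boto3's `authorize_security_group_ingress`.

```python
def ssh_rule_from_cidr(cidr, description):
    """Build one inbound rule that allows SSH (TCP 22) from a single CIDR range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }


COMPANY_EXTERNAL_RANGE = "203.0.113.0/24"  # hypothetical public range of the company
BASTION_PRIVATE_IP = "10.0.0.10"           # hypothetical private IP of the bastion host

# Bastion host: reachable only from the company's external IP range.
bastion_ingress = [ssh_rule_from_cidr(COMPANY_EXTERNAL_RANGE, "on-premises via internet")]
# Application instances: reachable only from the bastion host's private IP.
app_ingress = [ssh_rule_from_cidr(f"{BASTION_PRIVATE_IP}/32", "bastion host only")]
```

Both lists contain only allow rules, which matches the point that security groups cannot express a deny.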

👨‍👨‍👦‍👦 Community discussion: C because from on-prem network to bastion through internet (using on-prem resource’s public IP),
D because bastion and ec2 is in same VPC, meaning bastion can communicate to EC2 via its private IP address

9. Security Group

A solutions architect is designing a two-tier web application. The application consists of a public-facing web tier hosted on Amazon EC2 in public subnets. The database tier consists of Microsoft SQL Server running on Amazon EC2 in a private subnet. Security is a high priority for the company.
How should security groups be configured in this situation? (Choose two.)

✅ Configure the security group for the web tier to allow inbound traffic on port 443 from 0.0.0.0/0.
Configure the security group for the web tier to allow outbound traffic on port 443 from 0.0.0.0/0.
✅ Configure the security group for the database tier to allow inbound traffic on port 1433 from the security group for the web tier.
Configure the security group for the database tier to allow outbound traffic on ports 443 and 1433 to the security group for the web tier.
Configure the security group for the database tier to allow inbound traffic on ports 443 and 1433 from the security group for the web tier.

✨ Keywords: web in public subnets, database in a private subnet

1️⃣ 3️⃣

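💡 Analysis: A two-tier architecture with the web application in public subnets and the database in a private subnet.
For the security groups, the EC2 instances in the public subnets allow inbound requests on port 443 from any IP, and the database instance in the private subnet allows inbound requests on port 1433 from the web tier's security group.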
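
Unlike the bastion case, the database rule here references another security group instead of an IP range. A minimal sketch of both rules follows; the security group ID is a hypothetical placeholder, and the dictionaries follow the `IpPermissions` entry shape used by boto3's `authorize_security_group_ingress`.

```python
def tcp_rule_from_anywhere(port):
    """Allow inbound TCP on one port from any IPv4 address."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }


def tcp_rule_from_security_group(port, group_id):
    """Allow inbound TCP on one port from members of another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [{"GroupId": group_id}],
    }


WEB_TIER_SG_ID = "sg-0123456789abcdef0"  # hypothetical web tier security group ID

web_tier_ingress = [tcp_rule_from_anywhere(443)]
database_tier_ingress = [tcp_rule_from_security_group(1433, WEB_TIER_SG_ID)]
```

Referencing the web tier's group means the database rule keeps working as web instances are added or replaced, with no IP addresses to maintain.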

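Configure SQL Server to listen on a specific TCP port

Because port 1433 is the known standard for SQL Server, some organizations specify that the SQL Server port number should be changed to enhance security. This might be helpful in some environments. However, the TCP/IP architecture permits a port scanner to query for open ports, so changing the port number is not a reliable security measure.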

👨‍👨‍👦‍👦 Community discussion: Web Server Rules: Inbound traffic from 443 (HTTPS), Source 0.0.0.0/0 - Allows inbound HTTPS access from any IPv4 address. Database Rules: 1433 (MS SQL), the default port to access a Microsoft SQL Server database, for example, on an Amazon RDS instance
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html

10. Multi-tiered application and overload

A company wants to move a multi-tiered application from on premises to the AWS Cloud to improve the application’s performance. The application consists of application tiers that communicate with each other by way of RESTful services. Transactions_processing are dropped when one tier becomes overloaded. A solutions architect must design a solution that resolves these issues and modernizes the application.
Which solution meets these requirements and is the MOST operationally efficient?

✅ Use Amazon API Gateway and direct transactions to the AWS Lambda functions as the application layer. Use Amazon Simple Queue Service (Amazon SQS) as the communication layer between application services.
Use Amazon CloudWatch metrics to analyze the application performance history to determine the servers’ peak utilization during the performance failures. Increase the size of the application server’s Amazon EC2 instances to meet the peak requirements.
Use Amazon Simple Notification Service (Amazon SNS) to handle the messaging between application servers running on Amazon EC2 in an Auto Scaling group. Use Amazon CloudWatch to monitor the SNS queue length and scale up and down as required.
❌ Use Amazon Simple Queue Service (Amazon SQS) to handle the messaging between application servers running on Amazon EC2 in an Auto Scaling group. Use Amazon CloudWatch to monitor the SQS queue length and scale up when communication failures are detected.

✨ Keywords: multi-tiered application, RESTful, Transactions are dropped when one tier becomes overloaded, modernizes, design a solution

4️⃣ ❌ -> 1️⃣ ✅

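💡 Analysis: The application is made up of several tiers that talk to each other through RESTful APIs. The problem is that transactions are dropped when one tier is overloaded. The most operationally efficient fix is wanted.
Auto scaling plus decoupling through SQS is the general answer to "one overloaded node degrades the whole system", so 4️⃣ looks fine at first.
The flaw in 4️⃣ is that scaling only starts once CloudWatch detects communication failures (communication failures are detected), which is already too late.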

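The question also asks to modernize. Using Amazon API Gateway with Lambda behind it, as in 1️⃣, achieves the same result and scales automatically, which makes it the more modern design.
When a question contains keywords such as modernizes or design a solution, consider whether the architecture is meant to be rethought.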
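
Why a queue stops transactions from being dropped can be shown with a toy simulation. This is an illustrative model only, not SQS or Lambda: each list entry is the number of requests arriving in one time step, and `capacity` is how many the downstream tier can handle per step. Both functions and all numbers are made up for the example.

```python
def direct_calls(arrivals, capacity):
    """Synchronous tier-to-tier calls: anything above capacity in a step is dropped."""
    processed = 0
    dropped = 0
    for arriving in arrivals:
        processed += min(arriving, capacity)
        dropped += max(0, arriving - capacity)
    return processed, dropped


def queued_calls(arrivals, capacity):
    """A queue between tiers: overflow waits in the backlog instead of being dropped."""
    processed = 0
    backlog = 0
    for arriving in arrivals:
        backlog += arriving
        done = min(backlog, capacity)
        processed += done
        backlog -= done
    return processed, backlog


burst = [2, 10, 1, 0, 0]  # a spike of 10 requests in the second step
without_queue = direct_calls(burst, capacity=3)
with_queue = queued_calls(burst, capacity=3)
```

With the queue, the spike is worked off over the following quiet steps; without it, everything beyond capacity in the spike step is lost.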

👨‍👨‍👦‍👦 Community discussion: Agree with A >>> Lambda = serverless + autoscale (modernize), SQS = decouple (no more drops)

The catch phrase is “scale up when communication failures are detected”. Scaling should not be based on communication failures, that’ll be crying over spilled milk! or rather too late. So D is wrong.

11. Data sync to S3

A company receives 10 TB of instrumentation_instruments data each day from several machines located at a single factory. The data consists of JSON files stored on a storage area network (SAN) in an on-premises data center located within the factory. The company wants to send this data to Amazon S3 where it can be accessed by several additional systems that provide critical near-real-time analytics. A secure transfer is important because the data is considered sensitive.
Which solution offers the MOST reliable data transfer?

AWS DataSync over public internet
✅ AWS DataSync over AWS Direct Connect
AWS Database Migration Service (AWS DMS) over public internet
AWS Database Migration Service (AWS DMS) over AWS Direct Connect

✨ Keywords: 10 TB, JSON, SAN in an on-premises data center located within the factory, near-real-time analytics, secure

2️⃣ ✅

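💡 Analysis: A factory produces 10 TB of sensitive data per day as JSON files stored on premises. The data must be transferred to an S3 bucket and analyzed by other systems in near real time.
Because the data is sensitive, choose AWS Direct Connect: once the data reaches the AWS Direct Connect location, it travels the rest of the way to AWS over the AWS network. AWS Database Migration Service, by contrast, is for migrating on-premises databases to AWS.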
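
A quick back-of-the-envelope calculation shows why a dedicated link matters at this volume. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a perfectly even transfer over 24 hours; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_mbps(terabytes_per_day):
    """Sustained throughput (megabits per second) needed to move a daily volume."""
    bits_per_day = terabytes_per_day * 10**12 * 8  # assumes decimal terabytes
    return bits_per_day / SECONDS_PER_DAY / 10**6


needed = required_mbps(10)
# Share of a 1 Gbps link that would be busy around the clock.
utilization_of_1_gbps = needed / 1000
```

Moving 10 TB per day needs a sustained rate of a little over 900 Mbps, which nearly saturates a 1 Gbps connection and leaves little room for the variability of the public internet.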

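What is AWS DataSync?

AWS DataSync is an online data movement and discovery service that simplifies data migration and helps you quickly, easily, and securely transfer your file or object data to, from, and between AWS storage services.

By using DataSync, you get the following benefits:

Simplify migration planning - With automated data collection and recommendations, DataSync Discovery can minimize the time, effort, and cost associated with planning your data migration to AWS. You can use the recommendations to inform your budget planning, and re-run discovery jobs to validate your assumptions as the migration approaches.

Automate data movement - DataSync makes it easier to move data over the network between storage systems and services. DataSync automates both the management of the data transfer process and the infrastructure required for high-performance, secure data transfer.

Transfer data securely - DataSync provides end-to-end security, including encryption and integrity validation, to help ensure that your data arrives securely, intact, and ready to use. DataSync accesses your AWS storage through built-in AWS security mechanisms, such as AWS Identity and Access Management (IAM) roles. It also supports virtual private cloud (VPC) endpoints, giving you the option to transfer data without traversing the public internet and further increasing the security of data copied online.

Move data faster - DataSync uses a purpose-built network protocol and a parallel, multi-threaded architecture to accelerate transfers. This approach speeds up migrations, recurring data processing workflows for analytics and machine learning, and data protection processes.

Reduce operational costs - Move data cost-effectively with flat, per-gigabyte pricing. DataSync saves you from writing and maintaining custom scripts or using costly commercial transfer tools.

It looks like AWS DataSync already provides end-to-end encryption in transit on its own, but adding AWS Direct Connect gives an extra layer of assurance.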

👨‍👨‍👦‍👦 Community discussion: DMS is for databases and here refers to “JSON files”. Public internet is not reliable. So best option is B

The most reliable solution for transferring the data in a secure manner would be option B: AWS DataSync over AWS Direct Connect.
AWS DataSync is a data transfer service that uses network optimization techniques to transfer data efficiently and securely between on-premises storage systems and Amazon S3 or other storage targets. When used over AWS Direct Connect, DataSync can provide a dedicated and secure network connection between your on-premises data center and AWS. This can help to ensure a more reliable and secure data transfer compared to using the public internet.

12. Real-time data ingestion architecture

A company needs to configure a real-time data ingestion architecture for its application. The company needs an API, a process that transforms data as the data is streamed, and a storage solution for the data.
Which solution will meet these requirements with the LEAST operational overhead?

Deploy an Amazon EC2 instance to host an API that sends data to an Amazon Kinesis data stream. Create an Amazon Kinesis Data Firehose delivery_handover stream that uses the Kinesis data stream as a data source. Use AWS Lambda functions to transform the data. Use the Kinesis Data Firehose delivery stream to send the data to Amazon S3.
Deploy an Amazon EC2 instance to host an API that sends data to AWS Glue. Stop source/destination checking on the EC2 instance. Use AWS Glue to transform the data and to send the data to Amazon S3.
✅ Configure an Amazon API Gateway API to send data to an Amazon Kinesis data stream. Create an Amazon Kinesis Data Firehose delivery stream that uses the Kinesis data stream as a data source. Use AWS Lambda functions to transform the data. Use the Kinesis Data Firehose delivery stream to send the data to Amazon S3.
Configure an Amazon API Gateway API to send data to AWS Glue. Use AWS Lambda functions to transform the data. Use AWS Glue to send the data to Amazon S3.

✨ Keywords: real-time data ingestion architecture, an API, a process that transforms data as the data is streamed, a storage solution

3️⃣ ✅

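💡 Analysis: Configure a real-time data ingestion architecture for the application: an API, a process that transforms the data as it streams, and storage. The least operational overhead is wanted.
Amazon API Gateway fits the API, the Amazon Kinesis family fits the stream, and the data ends up in S3.
3️⃣ fits best: in the streaming pipeline, create a Kinesis Data Firehose delivery stream that uses the Kinesis data stream as its source, then transform the data with Lambda.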
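
The Lambda transformation step in 3️⃣ could look roughly like the handler below. It is a sketch under assumptions: the records are taken to be JSON objects, and the added `processed` field is a made-up transformation. The input and output shapes (`records`, `recordId`, base64 `data`, and a `result` of `Ok` or `ProcessingFailed`) follow the Firehose data transformation contract.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical transformation
            # Newline-delimit the records so the objects written to S3 stay parseable.
            encoded = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": encoded.decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Hand the record back unchanged and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming `recordId` must appear in the response, which is why failed records are returned rather than skipped.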

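AWS Glue: discover, prepare, and integrate all your data at any scale

AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development.

AWS Glue looks aimed more at preparing data for analytics and machine learning. It is an ETL (Extract, Transform, Load) service that pulls from existing data stores rather than an endpoint that an API can push records to, so it does not fit the real-time ingestion flow in this question.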

👨‍👨‍👦‍👦 Community discussion: (A) - You don’t need to deploy an EC2 instance to host an API - Operational overhead
(B) - Same as A
(C) - Is the answer
(D) - AWS Glue gets data from S3, not from API GW. AWS Glue could do ETL (Extract-Transform-Load) by itself, so don’t need lambda. Non sense.
https://aws.amazon.com/glue/

The company needs an API = Amazon API Gateway API
A real-time data ingestion = Amazon Kinesis data stream
A process that transforms data = AWS Lambda functions
Kinesis Data Firehose delivery stream to send the data to Amazon S3
A storage solution for the data = Amazon S3

13. DynamoDB backup

A company needs to keep user transaction data in an Amazon DynamoDB table. The company must retain the data for 7 years.
What is the MOST operationally efficient solution that meets these requirements?

Use DynamoDB point-in-time recovery to back up the table continuously.
✅ Use AWS Backup to create backup schedules and retention policies for the table.
Create an on-demand backup of the table by using the DynamoDB console. Store the backup in an Amazon S3 bucket. Set an S3 Lifecycle configuration for the S3 bucket.
Create an Amazon EventBridge (Amazon CloudWatch Events) rule to invoke an AWS Lambda function. Configure the Lambda function to back up the table and to store the backup in an Amazon S3 bucket. Set an S3 Lifecycle configuration for the S3 bucket.

✨ Keywords: transaction data, DynamoDB, retain the data for 7 years

2️⃣ ✅

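💡 Analysis: The company keeps data in a DynamoDB table and must retain it for 7 years. The most operationally efficient solution is wanted.
DynamoDB is already highly available, so scheduled backups with a retention policy are all that is needed, which points to 2️⃣.
DynamoDB point-in-time recovery only covers the most recent 35 days, so it is out. 3️⃣ and 4️⃣ involve too much manual work.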
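
The backup schedule and retention policy from 2️⃣ can be sketched as an AWS Backup plan document. This is a hedged sketch: the plan, rule, and vault names are hypothetical, the daily schedule is an assumption, and 7 years is rounded up to 7 x 366 days so that retention never falls short. The dictionary follows the `BackupPlan` shape accepted by boto3's `create_backup_plan`.

```python
def build_backup_plan(retention_years=7):
    """Build an AWS Backup plan: one daily rule with a long retention period."""
    return {
        "BackupPlanName": "dynamodb-7-year-retention",  # hypothetical name
        "Rules": [
            {
                "RuleName": "daily-backup",               # hypothetical name
                "TargetBackupVaultName": "Default",       # assumed vault
                "ScheduleExpression": "cron(0 5 ? * * *)",  # every day at 05:00 UTC
                # Assumption: count each year as 366 days to stay on the safe side.
                "Lifecycle": {"DeleteAfterDays": retention_years * 366},
            }
        ],
    }


plan = build_backup_plan()
```

The table itself is attached separately through a backup selection; the plan only describes when backups run and how long they are kept.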

Backup and restore for DynamoDB

With on-demand backups, you can create a snapshot backup of your table that DynamoDB stores and manages. You’re charged based on the size and duration of your backups. Using on-demand backup, you can restore your entire DynamoDB table to the exact state it was in when the backup was created.

The DynamoDB backup documentation also does not mention storing an on-demand backup directly in S3, so 3️⃣ has a further flaw.

👨‍👨‍👦‍👦 Community discussion: Answer is B “Amazon DynamoDB offers two types of backups: point-in-time recovery (PITR) and on-demand backups. (==> D is not the answer)
PITR is used to recover your table to any point in time in a rolling 35 day window, which is used to help customers mitigate accidental deletes or writes to their tables from bad code, malicious access, or user error. (==> A isn’t the answer)
On demand backups are designed for long-term archiving and retention, which is typically used to help customers meet compliance and regulatory requirements.
This is the second of a series of two blog posts about using AWS Backup to set up scheduled on-demand backups for Amazon DynamoDB. Part 1 presents the steps to set up a scheduled backup for DynamoDB tables from the AWS Management Console.” (==> Not the DynamoDB console and C isn’t the answer either)
https://aws.amazon.com/blogs/database/part-2-set-up-scheduled-backups-for-amazon-dynamodb-using-aws-backup/

14. On-demand DynamoDB

A company is planning to use an Amazon DynamoDB table for data storage. The company is concerned about cost optimization. The table will not be used on most mornings. In the evenings, the read and write traffic will often be unpredictable_{unforeseeable}. When traffic spikes occur, they will happen very quickly.
What should a solutions architect recommend?

✅ Create a DynamoDB table in on-demand capacity mode.
Create a DynamoDB table with a global secondary index.
Create a DynamoDB table with provisioned capacity and auto scaling.
Create a DynamoDB table in provisioned capacity mode, and configure it as a global table.

✨ Keywords: unpredictable traffic, cost optimization

1️⃣ ✅

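💡 Analysis: DynamoDB stores the data; the table is idle in the mornings, evening traffic is unpredictable, and spikes arrive very quickly. The most cost-effective setup is wanted.
A textbook use case for on-demand capacity.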

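Amazon DynamoDB pricing

Pricing for on-demand capacity mode - DynamoDB on-demand is a serverless option with pay-per-request pricing and automatic scaling, with no need to plan, provision, or manage capacity.

Pricing for provisioned capacity mode - With provisioned capacity mode, you specify the number of reads and writes per second that you expect your application to require. You are billed for the read and write capacity you provision per hour, not for the capacity your application consumes. Provisioned mode is a good fit when:

Existing applications have steady, predictable throughput patterns

Capacity requirements can be forecast to control costs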
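
The difference between the two billing models can be illustrated with a rough comparison. Everything numeric below is hypothetical: the prices are placeholders rather than real DynamoDB rates, the traffic profile is invented, and the model ignores auto scaling and the read/write distinction. It only shows why idle hours plus sharp spikes favor paying per request.

```python
import math


def on_demand_cost(requests_per_hour, price_per_million_requests):
    """Pay only for the requests actually made."""
    return sum(requests_per_hour) / 1_000_000 * price_per_million_requests


def provisioned_cost(requests_per_hour, price_per_unit_hour):
    """Provision for the peak hour and pay for that capacity in every hour."""
    peak_per_second = max(requests_per_hour) / 3600
    units = math.ceil(peak_per_second)
    return units * price_per_unit_hour * len(requests_per_hour)


# Hypothetical day: idle for 18 hours, then an evening with two sharp spikes.
traffic = [0] * 18 + [36_000, 3_600_000, 360_000, 0, 7_200_000, 0]

# Placeholder prices, not actual AWS rates.
pay_per_request = on_demand_cost(traffic, price_per_million_requests=1.25)
pay_for_peak = provisioned_cost(traffic, price_per_unit_hour=0.00065)
```

With a flat, predictable load the comparison flips, which is the provisioned use case described above.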

👨‍👨‍👦‍👦 Community discussion: On-demand mode is a good option if any of the following are true:

You create new tables with unknown workloads.

You have unpredictable application traffic.

You prefer the ease of paying for only what you use.

C - provisioned capacity is recommended for known patterns. Not the case here.

15. AMI Share and KMS Share

A company recently signed a contract with an AWS Managed Service Provider (MSP) Partner for help with an application migration initiative. A solutions architect needs to share an Amazon Machine Image (AMI) from an existing AWS account with the MSP Partner’s AWS account. The AMI is backed by Amazon Elastic Block Store (Amazon EBS) and uses an AWS Key Management Service (AWS KMS) customer managed key to encrypt EBS volume snapshots.
What is the MOST secure way for the solutions architect to share the AMI with the MSP Partner’s AWS account?

Make the encrypted AMI and snapshots publicly available. Modify the key policy to allow the MSP Partner’s AWS account to use the key.
✅ Modify the launchPermission property of the AMI. Share the AMI with the MSP Partner’s AWS account only. Modify the key policy to allow the MSP Partner’s AWS account to use the key.
Modify the launchPermission property of the AMI. Share the AMI with the MSP Partner’s AWS account only. Modify the key policy to trust a new KMS key that is owned by the MSP Partner for encryption.
Export the AMI from the source account to an Amazon S3 bucket in the MSP Partner’s AWS account, Encrypt the S3 bucket with a new KMS key that is owned by the MSP Partner. Copy and launch the AMI in the MSP Partner’s AWS account.

✨ Keywords: share an Amazon Machine Image from an existing AWS account with the MSP Partner’s AWS account

2️⃣ ✅

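💡 Analysis: An AMI must be shared with a third party's AWS account, and its EBS snapshots are encrypted with a KMS customer managed key. How should the AMI be shared?
First share the AMI itself, then, following the principle of least privilege, allow only the third party's AWS account to use the key. Pick 2️⃣.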
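
The two pieces of 2️⃣ can be sketched as data: a launch permission for the AMI and a key policy statement for the KMS key. This is a minimal sketch with a placeholder account ID; the launch permission follows the `LaunchPermission` argument of boto3's `modify_image_attribute`, and the set of KMS actions is the commonly documented one for sharing encrypted snapshots, listed here as an assumption rather than a verified minimum.

```python
PARTNER_ACCOUNT_ID = "111122223333"  # placeholder AWS account ID


def build_launch_permission(account_id):
    """Share the AMI with exactly one account (never with the public 'all' group)."""
    return {"Add": [{"UserId": account_id}]}


def build_key_policy_statement(account_id):
    """Key policy statement that lets the partner account use the existing key."""
    return {
        "Sid": "AllowPartnerAccountUseOfKey",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
        # Assumed action set for launching from encrypted snapshots.
        "Action": [
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:ReEncrypt*",
            "kms:CreateGrant",
        ],
        "Resource": "*",  # in a key policy, "*" refers to the key the policy is attached to
    }


launch_permission = build_launch_permission(PARTNER_ACCOUNT_ID)
key_statement = build_key_policy_statement(PARTNER_ACCOUNT_ID)
```

The statement is appended to the existing key policy; the partner account still has to grant its own users IAM permissions for the same actions.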

Allowing users in other accounts to use a KMS key

Allowing use of external KMS keys with AWS services
You can give a user in a different account permission to use your KMS key with a service that is integrated with AWS KMS. For example, a user in an external account can use your KMS key to encrypt the objects in an Amazon S3 bucket or to encrypt the secrets they store in AWS Secrets Manager.

The key policy must give the external user or the external user’s account permission to use the KMS key. In addition, you need to attach IAM policies to the identity that gives the user permission to use the AWS service. The service might also require that users have additional permissions in the key policy or IAM policy. For a list of permissions that the AWS service requires on a customer managed key, see the Data Protection topic in the Security chapter of the user guide or developer guide for the service.

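I have not done this hands-on, so the flow of sharing the key and decrypting the EBS snapshots is still hazy to me, but the key can certainly be shared.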

👨‍👨‍👦‍👦 Community discussion: Share the existing KMS key with the MSP external account because it has already been used to encrypt the AMI snapshot.
https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying-external-accounts.html
