A company runs an online marketplace web application on AWS. The application serves hundreds of thousands of users during peak hours. The company needs a scalable, near-real-time solution to share the details of millions of financial transactions with several other internal applications. Transactions also need to be processed to remove sensitive data before being stored in a document database for low-latency retrieval.
What should a solutions architect recommend to meet these requirements?
A. Store the transactions data into Amazon DynamoDB. Set up a rule in DynamoDB to remove sensitive data from every transaction upon write. Use DynamoDB Streams to share the transactions data with other applications.
B. Stream the transactions data into Amazon Kinesis Data Firehose to store data in Amazon DynamoDB and Amazon S3. Use AWS Lambda integration with Kinesis Data Firehose to remove sensitive data. Other applications can consume the data stored in Amazon S3.
C. Stream the transactions data into Amazon Kinesis Data Streams. Use AWS Lambda integration to remove sensitive data from every transaction and then store the transactions data in Amazon DynamoDB. Other applications can consume the transactions data off the Kinesis data stream.
D. Store the batched transactions data in Amazon S3 as files. Use AWS Lambda to process every file and remove sensitive data before updating the files in Amazon S3. The Lambda function then stores the data in Amazon DynamoDB. Other applications can consume transaction files stored in Amazon S3.
The best option would be C. Stream the transactions data into Amazon Kinesis Data Streams.
This is because Amazon Kinesis Data Streams can handle the high volume of data and provide near-real-time data processing, which is crucial for this scenario. AWS Lambda integration can be used to process each transaction and remove sensitive data before storing it in Amazon DynamoDB. DynamoDB is a good choice for storing the processed transactions due to its low-latency data access capabilities. Other applications can consume the transactions data off the Kinesis data stream, ensuring that all applications have access to the latest transactions data.
Options A, B, and D have certain limitations:
- Option A: DynamoDB does not have a built-in feature to remove sensitive data upon write.
- Option B: Storing data in S3 would not provide the low-latency retrieval required for this use case.
- Option D: Processing files in S3 with Lambda would not provide near-real-time data processing.
Therefore, option C is the most suitable solution for this scenario.