« Kinesis » : différence entre les versions
De Banane Atomic
Aller à la navigationAller à la recherche
Ligne 89 : | Ligne 89 : | ||
* S3, [https://docs.aws.amazon.com/firehose/latest/dev/s3-prefixes.html Custom Prefixes for Amazon S3 Objects], [https://aws.amazon.com/blogs/big-data/amazon-data-firehose-custom-prefixes-for-amazon-s3-objects/ Amazon Data Firehose custom prefixes for Amazon S3 objects] | * S3, [https://docs.aws.amazon.com/firehose/latest/dev/s3-prefixes.html Custom Prefixes for Amazon S3 Objects], [https://aws.amazon.com/blogs/big-data/amazon-data-firehose-custom-prefixes-for-amazon-s3-objects/ Amazon Data Firehose custom prefixes for Amazon S3 objects] | ||
* MongoDB | * MongoDB | ||
* HTTP End Point | |||
= Data Analytics = | = Data Analytics = | ||
* Studio Notebook | * Studio Notebook | ||
* Glue catalog | * Glue catalog |
Version du 23 mai 2024 à 09:28
Description
Allows to gather data from different sources using data stream or fire hoses, and get that data to a destination in the expected format.
Allows to collect, process and analyze data in near real-time at scale.
Kinesis data streams are used in places where an unbounded stream of data needs to worked on in real time.
And Kinesis Firehose delivery streams are used when data needs to be delivered to a storage destination, such as S3.
Common use cases
- pattern detection
- click stream analysis (who is clicking on what and when)
- log processing for machine learning
- anomyly detection in IoT devices
Benefits
- fully managed service (focus on data and don't worry about the underlying system)
- can handle large amount of data
- can consume process and buffer data in real-time
- allows production of real-time metrics and reporting
Data Stream
Move data from sources to a destination while having analytics, monitoring alerts, and connections to other services.
Common use cases
- log and data intake and procesing
- real-time metrics, reporting, and data analytics
Benefits
- ensures data durability and elasticity
Producers
Producers put records into Kinesis Data Streams.
- EventBridge
- CloudFront, collect views and analyze real-time metrics of CDN
- CloudWatch
var kinesisDataStreamName = "input-stream"; var region = RegionEndpoint.USEast1; var kinesisConfiguration = new AmazonKinesisConfig { Profile = new Profile("..."), RegionEndpoint = region }; var kinesisClient = new AmazonKinesisClient(kinesisConfiguration); for (var i = 1; ; i++) { var message = new Message(i, DateTime.Now); var data = JsonConvert.SerializeObject(message); using var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(data)); var request = new PutRecordRequest { StreamName = kinesisDataStreamName, Data = memoryStream, PartitionKey = "partitions1" }; var response = kinesisClient.PutRecordAsync(request).Result; Console.WriteLine($"Shard ID: {response.ShardId}, Sequence Number: {response.SequenceNumber}, HTTPStatusCode: {response.HttpStatusCode}"); await Task.Delay(1000); // every 1 second } class Message(int id, DateTime date) { public int Id { get; set; } = id; public DateTime Date { get; set; } = date; } |
Consumers
Consumers get records from Kinesis Data Streams and process them.
- Data Firehose
- Lambda, read data by batch
- Glue, ETL
- SNS
Amazon Data Firehose
- ingest data from different sources
- transform source records using AWS lambda functions
- deliver data to a specified destination
Similar to Data Stream but acts as an ETL.
Source
- Kinesis Data Streams
Destination
- S3, Custom Prefixes for Amazon S3 Objects, Amazon Data Firehose custom prefixes for Amazon S3 objects
- MongoDB
- HTTP End Point
Data Analytics
- Studio Notebook
- Glue catalog