Amazon Kinesis Rate Limits

Amazon Kinesis Data Streams enforces per-shard limits (ingest of 1 MB/s or 1000 records/s, whichever is reached first; retrieval of 2 MB/s) and per-account, per-region service quotas on the control-plane API. ProvisionedThroughputExceededException signals that a shard is saturated; LimitExceededException signals that an account quota, such as the stream count, has been reached. AWS recommends exponential backoff with jitter, together with the SDKs' standard retry mode.

5 limits · Throttle: HTTP 400 · Quota: HTTP 400
Rate Limiting · Streaming · Kinesis

Limits

Shard ingest throughput
Scope: shard
Limit: 1048576 bytes per second
Each shard accepts 1 MB/s or 1000 records/s, whichever is reached first. PutRecord and PutRecords report ProvisionedThroughputExceededException when the limit is exceeded.

Shard egress throughput (standard consumer)
Scope: shard
Limit: 2097152 bytes per second
All standard consumers of a shard share 2 MB/s in total. GetRecords is throttled when this is exceeded.

Shard egress throughput (Enhanced Fan-Out consumer)
Scope: shard/consumer
Limit: 2097152 bytes per second
Each EFO consumer gets a dedicated 2 MB/s pipe per shard.

GetRecords calls per shard
Scope: shard
Limit: 5 requests per second
This cap applies to standard consumers calling GetRecords. EFO consumers use SubscribeToShard instead.

Streams per account/region
Scope: account/region
Limit: varies; see the Service Quotas console for Kinesis Data Streams
This is a default soft limit that can be raised through Service Quotas.
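
The per-shard figures above can be turned into a rough capacity estimate. The following is a minimal sketch, not an AWS-provided formula: the function names `shards_needed` and `min_poll_interval` are hypothetical, and it assumes every standard consumer reads the full stream, so egress demand is the ingest rate multiplied by the number of standard consumers.

```python
import math

# Per-shard limits quoted in the list above.
INGEST_BYTES_PER_SEC = 1_048_576
INGEST_RECORDS_PER_SEC = 1_000
EGRESS_BYTES_PER_SEC = 2_097_152
GETRECORDS_CALLS_PER_SEC = 5


def shards_needed(ingest_bytes_per_sec, ingest_records_per_sec, standard_consumers=1):
    """Estimate the minimum shard count for a provisioned stream.

    Assumption: each standard consumer reads the whole stream, and all of
    them share one 2 MB/s egress budget per shard.
    """
    by_ingest_bytes = math.ceil(ingest_bytes_per_sec / INGEST_BYTES_PER_SEC)
    by_ingest_records = math.ceil(ingest_records_per_sec / INGEST_RECORDS_PER_SEC)
    egress_demand = ingest_bytes_per_sec * standard_consumers
    by_egress = math.ceil(egress_demand / EGRESS_BYTES_PER_SEC)
    # The tightest of the three constraints decides the shard count.
    return max(1, by_ingest_bytes, by_ingest_records, by_egress)


def min_poll_interval(standard_consumers):
    """Shortest per-consumer GetRecords interval, in seconds, that keeps one
    shard at or under 5 calls per second across all standard consumers."""
    return standard_consumers / GETRECORDS_CALLS_PER_SEC
```

For example, `shards_needed(5_000_000, 2_000)` returns 5 because the byte rate, not the record rate, is the binding constraint. The estimate ignores uneven partition-key distribution, which can saturate a single shard well before the stream-wide average reaches these limits.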

Policies

Backoff with jitter
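Use truncated exponential backoff with jitter on ProvisionedThroughputExceededException. The AWS SDKs' standard retry mode implements this, but it is not the default in every SDK (boto3, for example, defaults to legacy mode), so set the retry mode explicitly.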
Resharding
Increase capacity by splitting shards (UpdateShardCount or SplitShard); for unpredictable workloads, switch to On-demand mode.
Aggregation via KPL
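Use the Kinesis Producer Library to aggregate many small user records into fewer, larger Kinesis records of up to 1 MB each. This reduces both the record count charged against the 1000 records/s shard limit and the PUT payload-unit count.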
Quota increases
Stream count, shard count, and other soft limits can be raised via Service Quotas or AWS Support.
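
The backoff policy above can be sketched in a few lines. This is an illustrative implementation of truncated exponential backoff with full jitter, not the SDK's own retry logic: `ThrottledError` is a hypothetical stand-in for ProvisionedThroughputExceededException, and the `base`, `cap`, and `max_attempts` values are assumptions chosen for the example rather than AWS defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))  # truncation at `cap`
    return rng() * ceiling


def call_with_backoff(operation, max_attempts=6, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ThrottledError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the throttle to the caller
            sleep(backoff_delay(attempt, base, cap, rng))
```

The `sleep` and `rng` parameters exist only so the behavior can be tested deterministically. With boto3, the equivalent check would typically inspect the error code on botocore's `ClientError` instead of catching a custom exception.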

Sources