AWS boto3 Essentials
What is boto3?
boto3 is the AWS SDK for Python. Instead of making raw HTTP requests to AWS APIs, you use Python objects and methods.
import boto3
# Create a client for a specific service
s3 = boto3.client('s3')
ec2 = boto3.client('ec2')
iam = boto3.client('iam')
Authentication
boto3 searches several locations for credentials and uses the first one it finds. The most common sources, in order:
- Environment variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-west-2
- AWS credentials file (created by aws configure)
~/.aws/credentials:
[default]
aws_access_key_id = your_access_key
aws_secret_access_key = your_secret_key
~/.aws/config (aws configure stores the region here):
[default]
region = us-west-2
- IAM roles (automatic on EC2 instances)
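When a call fails with missing credentials, it can help to see which of the sources above is actually present on the machine. The sketch below is a hypothetical helper (`find_credential_sources` is not part of boto3) that checks only the two environment variables and the credentials file listed above. boto3's real lookup chain has more steps, and IAM roles cannot be detected this way.

```python
import os
from pathlib import Path


def find_credential_sources(env=None, home=None):
    """Return the credential sources from the list above that are present.

    Hypothetical helper, not a boto3 API. It only looks for the two
    environment variables and the ~/.aws/credentials file; it does not
    validate the keys or detect IAM roles.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    sources = []
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")
    if (home / ".aws" / "credentials").is_file():
        sources.append("credentials file")
    return sources
```

Calling `find_credential_sources()` with no arguments inspects the current environment and home directory; an empty list suggests boto3 will have to rely on another source, such as an IAM role.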
Test your credentials:
import boto3
sts = boto3.client('sts')
identity = sts.get_caller_identity()
print(f"Account: {identity['Account']}")
print(f"User: {identity['Arn']}")
Error Handling
AWS calls can fail for many reasons, such as missing credentials, insufficient permissions, or throttling. Catch and handle these errors explicitly:
import boto3
from botocore.exceptions import ClientError, NoCredentialsError
try:
    s3 = boto3.client('s3')
    response = s3.list_buckets()
except NoCredentialsError:
    print("AWS credentials not found")
except ClientError as e:
    error_code = e.response['Error']['Code']
    error_message = e.response['Error']['Message']
    print(f"AWS Error {error_code}: {error_message}")
except Exception as e:
    print(f"Other error: {e}")
Common error codes:
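- AccessDenied - Insufficient permissions
- ResourceNotFound - Resource doesn’t exist
- InvalidParameterValue - Bad parameter
- Throttling - Rate limited
Exact codes vary by service; for example, S3 reports a missing bucket as NoSuchBucket and IAM reports a missing user as NoSuchEntity.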
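One way to act on these codes is to map them to a small set of outcomes. The sketch below works on the `e.response` dictionary of a `ClientError`. The function name, the category labels, and the decision about which code is worth retrying are assumptions for illustration, and real services use additional codes.

```python
def classify_error(error_response):
    """Map a ClientError-style response dict to a coarse category.

    Hypothetical helper: the categories are illustrative and cover
    only the four codes listed above.
    """
    code = error_response.get("Error", {}).get("Code", "")
    if code == "Throttling":
        return "retry"        # rate limited: wait and try again
    if code == "AccessDenied":
        return "permissions"  # fix IAM permissions; retrying won't help
    if code == "ResourceNotFound":
        return "missing"      # the resource doesn't exist
    if code == "InvalidParameterValue":
        return "bad_request"  # fix the call's arguments
    return "unknown"


# Example with a hand-built dict shaped like e.response
sample = {"Error": {"Code": "Throttling", "Message": "Rate exceeded"}}
print(classify_error(sample))  # prints: retry
```

Inside an `except ClientError as e:` block, this would be called as `classify_error(e.response)`.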
Basic Usage Pattern
Most boto3 calls follow the same pattern (service_name, operation_name, and the Items key below are placeholders):
import boto3
from botocore.exceptions import ClientError
# Create client
service = boto3.client('service_name')
try:
    # Call operation
    response = service.operation_name(Parameter='value')
    # Process response (always a dictionary)
    for item in response['Items']:
        print(item['Name'])
except ClientError as e:
    print(f"Error: {e}")
Response Structure
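Client responses are dictionaries. The key that holds the main data varies by operation (Items below is a placeholder), ResponseMetadata is included with every successful response, and a pagination token such as NextToken appears only when more results remain: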
response = {
    'Items': [...],  # Main data
    'ResponseMetadata': {  # AWS metadata
        'RequestId': 'abc123',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {...}
    },
    'NextToken': 'xyz789'  # For pagination
}
Access the data you need:
response = ec2.describe_instances()
# Get the data
for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        print(instance['InstanceId'])
Pagination
Many AWS APIs return results one page at a time (list_objects_v2, for example, returns at most 1,000 keys per call). Use paginators to get everything:
# Wrong - only gets first page
response = s3.list_objects_v2(Bucket='my-bucket')
# Right - gets all objects
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='my-bucket'):
    for obj in page.get('Contents', []):
        print(obj['Key'])
Operations that usually need pagination:
- list_users (IAM)
- describe_instances (EC2)
- list_objects_v2 (S3)
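For context, a paginator roughly automates a loop like the one below: call, read the token, call again. This is a simplified sketch, not boto3's implementation. It assumes an operation that returns and accepts a `NextToken` key, as in the response structure shown earlier; real operations use different token names (`list_objects_v2` uses `ContinuationToken` and `NextContinuationToken`, for example). `iter_pages` and `fake_list` are hypothetical names.

```python
def iter_pages(operation, **kwargs):
    """Yield every page from a NextToken-style operation.

    `operation` is any callable that returns a dict and accepts an
    optional NextToken keyword, e.g. a bound client method.
    """
    while True:
        page = operation(**kwargs)
        yield page
        token = page.get("NextToken")
        if not token:
            break  # no token means this was the last page
        kwargs["NextToken"] = token


# Stand-in for a client method so the sketch runs without AWS access
def fake_list(NextToken=None):
    pages = {
        None: {"Items": ["a", "b"], "NextToken": "t1"},
        "t1": {"Items": ["c"]},
    }
    return pages[NextToken]


items = [item for page in iter_pages(fake_list) for item in page["Items"]]
print(items)  # prints: ['a', 'b', 'c']
```

In practice the built-in paginator is the better choice, since it already knows the token names for each operation.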
Client vs Resource
boto3 offers two interfaces:
Client - Raw API access (recommended)
s3 = boto3.client('s3')
response = s3.list_buckets()  # Returns dictionary
for bucket in response['Buckets']:
    print(bucket['Name'])
Resource - Object-oriented wrapper
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():  # Returns objects
    print(bucket.name)
Use client for production code - it’s more explicit and covers every API operation, whereas resources exist for only some services.
Regions
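Pass region_name when a client must target a specific region; otherwise boto3 falls back to the default from the environment or config file: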
# Default region from credentials/environment
ec2 = boto3.client('ec2')
# Specific region
ec2 = boto3.client('ec2', region_name='us-west-2')
List available regions:
ec2 = boto3.client('ec2')
regions = ec2.describe_regions()
for region in regions['Regions']:
    print(region['RegionName'])
Rate Limiting
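AWS APIs have rate limits. boto3 retries throttled requests automatically a limited number of times, but pacing bulk operations helps avoid hitting the limits in the first place:
import time
from botocore.exceptions import ClientError
# Process many items
for item in large_list:
    for attempt in range(5):  # Retry a throttled item up to 5 times, then move on
        try:
            # Make AWS call
            response = client.operation(Item=item)
            break
        except ClientError as e:
            if e.response['Error']['Code'] != 'Throttling':
                raise  # Re-raise other errors
            time.sleep(1)  # Wait longer for throttling
    # Small delay to avoid rate limits
    time.sleep(0.1)
Service-Specific Notes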
IAM - User and permission management
iam = boto3.client('iam')
# Operations: list_users, get_user, list_attached_user_policies
EC2 - Virtual machines and networking
ec2 = boto3.client('ec2')
# Operations: describe_instances, describe_security_groups
S3 - Object storage
s3 = boto3.client('s3')
# Operations: list_buckets, list_objects_v2, get_object
STS - Identity verification
sts = boto3.client('sts')
# Operations: get_caller_identity
Debugging
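Enable debug logging to see HTTP requests. The request and response details are logged by botocore, the library underneath boto3, so attach the handler to the 'botocore' logger:
import logging
import boto3
boto3.set_stream_logger('botocore', logging.DEBUG)
Check what operation you’re calling:
# This shows the method's parameters and response structure
help(s3.list_buckets)
Common Patterns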
Check if resource exists:
def resource_exists():
    try:
        response = client.describe_resource(Id='resource-id')
        return True
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFound':
            return False
        raise  # Re-raise other errors
Extract data from nested responses:
# AWS responses are often deeply nested
response = ec2.describe_instances()
for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        # Instance data here
        pass
Handle missing optional fields:
# Use .get() for fields that might not exist
instance = response['Instance']
public_ip = instance.get('PublicIpAddress', 'No public IP')
tags = instance.get('Tags', [])
Quick Reference
# Create client
client = boto3.client('service', region_name='us-west-2')
# Basic operation
response = client.operation_name(Parameter='value')
# Handle errors
try:
    response = client.operation()
except ClientError as e:
    error_code = e.response['Error']['Code']
# Pagination
paginator = client.get_paginator('operation_name')
for page in paginator.paginate():
    pass  # Process page
# Check authentication
sts = boto3.client('sts')
identity = sts.get_caller_identity()
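The nested-response and optional-field patterns from Common Patterns can be combined into one helper. This is a sketch under assumptions: `summarize_instances` is a hypothetical name, the sample dictionary is hand-built to mimic the shape of a `describe_instances` response (real responses carry many more fields), and tags are assumed to arrive as a list of `Key`/`Value` pairs.

```python
def summarize_instances(response):
    """Flatten a describe_instances-style response into simple dicts."""
    summaries = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # Tags is a list of {'Key': ..., 'Value': ...}; turn it into a dict
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            summaries.append({
                "id": instance["InstanceId"],
                "public_ip": instance.get("PublicIpAddress", "No public IP"),
                "name": tags.get("Name", ""),
            })
    return summaries


# Hand-built sample shaped like a describe_instances response
sample = {
    "Reservations": [
        {"Instances": [
            {"InstanceId": "i-0abc", "PublicIpAddress": "203.0.113.10",
             "Tags": [{"Key": "Name", "Value": "web"}]},
            {"InstanceId": "i-0def"},
        ]}
    ]
}

for row in summarize_instances(sample):
    print(row)
```

Because every optional field goes through `.get()`, an instance without a public IP or tags produces a row with defaults instead of raising `KeyError`.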