Homework #6: Simple Frontend and Serverless

EE 547: Spring 2026

Important: Assignment Details

Assigned: 8 April
Due: Wednesday, 22 April at 23:59

Gradescope: Homework 6 | How to Submit

Warning: Requirements
  • Python 3.11+ required
  • Docker Desktop installed and running
  • AWS CLI configured with valid credentials
  • HW#4 Problem 2 DynamoDB table (arxiv-papers) populated with data
  • HW#5 Problem 1 Flask API (app.py) working
  • AWS account with access to Lambda, S3, DynamoDB, API Gateway, and EventBridge

Overview

Build a web frontend for the ArXiv paper API using Jinja2 templates, then build a serverless earthquake data pipeline with AWS Lambda, S3, and DynamoDB.

Getting Started

Download the starter code: hw6-starter.zip

unzip hw6-starter.zip
cd hw6-starter

Problem 1: ArXiv Paper Browser

Build a server-rendered web interface for browsing ArXiv paper metadata from DynamoDB, using Jinja2 templates and client-side JavaScript for dynamic UI.

Use Flask, Jinja2, and boto3 for server-side rendering and data access. The starter code includes a JavaScript file and a CSS stylesheet.

Warning

Do not use React, Vue, or other frontend frameworks.

Part A: Jinja2 Templates

Jinja2 is a Python template engine that generates HTML by combining reusable layouts with data. It organizes HTML across multiple pages through inheritance and includes:

  • Inheritance — a base template defines a root page layout with {% block %} placeholders. Child templates extend it with {% extends %}, filling those blocks with page-specific content
  • Includes — a reusable fragment that appears on multiple pages is defined once and pulled in with {% include %}

Create five files in templates/:

templates/
├── base.html            ← base template; all others extend this
├── papers.html          ← child template; paper list by category
├── detail.html          ← child template; single paper, all fields
├── search.html          ← child template; keyword search
└── _paper_card.html     ← partial; included in papers and search

Base Template (base.html)

A base template defines the HTML document structure. It includes everything shared across all pages — navigation, footer, linked assets. Child templates extend it with {% extends 'base.html' %} and fill the content area with {% block content %}{% endblock %}.

The base template contains:

  • Navigation links (Papers, Search)
  • The auth box (described below under Auth Box)
  • A div with id="stats-box" for server status
  • Stylesheet and script links

Page layout (wireframe):

ArXiv Browser   Papers | Search
[username] [password] [Login]   Not connected
(page content)
#stats-box: server region, table name, uptime, request count

Your base.html should follow this structure:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>{% block title %}ArXiv Browser{% endblock %}</title>
    <!-- CSS: Pico framework -->
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@picocss/pico@2/css/pico.min.css">
    <!-- CSS: local overrides -->
    <link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
    <!-- JS: client-side interactions -->
    <script defer src="{{ url_for('static', filename='app.js') }}"></script>
</head>
<body>
    <!-- header: nav, auth box -->
    <main>{% block content %}{% endblock %}</main>
    <!-- footer: #stats-box -->
</body>
</html>

Linked Stylesheets

  • Pico CSS is a classless CSS framework: it styles HTML tags (<nav>, <article>, <table>, <input>, <button>) directly and does not require class= attributes on elements. Pico handles the visual defaults so you can focus on template structure rather than styling.

    The <link> tag in the skeleton loads Pico from a CDN (content delivery network) — a hosted copy of the library served from a URL. You do not need to install or download anything locally.

  • The starter code includes style.css with layout adjustments for the auth box and paper cards. You may add to this file for additional styling.

Partial (_paper_card.html)

papers.html and search.html both display paper results using a card layout. Rather than duplicating the markup in multiple templates, define the card once as a partial and include it in both pages.

Each card includes:

  • Paper title, linked to /papers/<arxiv_id>
  • Author list
  • Published date

The JavaScript linked in the skeleton uses data-arxiv-id on each card to enable click-to-expand. After login, clicking a card fetches the full paper from the API and renders the abstract and categories inline without navigating away from the page. Structure each card like:

<article class="paper-card" data-arxiv-id="{{ paper.arxiv_id }}">
    <!-- title, authors, date -->
    <div class="paper-detail" style="display:none"></div>
</article>

Use a {% for %} loop to render cards for a list of papers:

{% for paper in papers %}
    {% include '_paper_card.html' %}
{% endfor %}

Use {% if %} / {% else %} to handle empty states — display a message when there are no papers for a category, no search results, or no query has been entered.

Auth Box

The API endpoints require a JWT token. In previous assignments you set the Authorization header with curl -H 'Authorization: Bearer ...'. The auth-box form fields provide the same mechanism in the browser — app.js reads the username and password, sends them to POST /api/login, stores the returned token, and sets the Authorization header on all subsequent API requests.
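To exercise the same login flow outside the browser, the exchange can be sketched in Python with only the standard library. The request body fields (username, password) and the token response field mirror what app.js sends and reads; the base URL, the helper names, and any credentials you pass in are assumptions for illustration, not part of the starter code.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: the local dev server from Part D


def build_login_request(username, password, base_url=BASE_URL):
    """Build (but do not send) the POST /api/login request that app.js issues."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_authed_request(path, token, base_url=BASE_URL):
    """Build a GET request that carries the JWT as a Bearer token."""
    return urllib.request.Request(
        base_url + path,
        headers={"Authorization": "Bearer " + token},
    )


def login(username, password):
    """Send the login request and return the token (needs a running server)."""
    request = build_login_request(username, password)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)["token"]
```

Calling login() and passing the result to build_authed_request("/api/papers?category=cs.LG", token) reproduces what the auth box does for every later API call.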

Add the following to your base.html to include the auth-box. Place it in the header or navigation area — the layout is up to you, but the element IDs must match exactly:

<div id="auth-box">
    <input type="text" id="auth-user" placeholder="Username" size="8">
    <input type="password" id="auth-pass" placeholder="Password" size="8">
    <button id="auth-login">Login</button>
    <span id="auth-status"></span>
</div>

Static Files

Static files are assets served directly to the browser without template rendering, such as JavaScript, CSS, and images. Store these in the static/ directory. Flask automatically serves files from static/ next to app.py. No configuration is needed — url_for('static', filename='app.js') resolves to /static/app.js and Flask handles the request.

Part B: HTML Routes

Your existing app.py only has /api/ routes that return JSON.

Add the following new routes that return HTML pages using render_template(). Use the same DynamoDB access patterns as the existing API endpoints, but return rendered HTML instead of JSON. Do not change your /api/ endpoints.

  1. GET /papers?category={category}: paper list by category

    Default category: cs.LG. Include a <select id="category-select"> populated with category options (server-rendered via Jinja2) and a <div id="paper-list"> as the card container. On initial load, render paper cards server-side using _paper_card.html. After login, the JavaScript handles category switching by fetching from the API and replacing the card list without a page reload.

    Example rendering:

    Category: cs.LG ▾
    Attention Is All You Need
    Vaswani, Shazeer, Parmar · 2017-06-12
    BERT: Pre-training of Deep Bidirectional Transformers
    Devlin, Chang, Lee · 2018-10-11

  2. GET /papers/<arxiv_id>: paper detail

    Render all fields for a single paper: title, authors, full abstract, categories, published date, and a link to arxiv.org. If the paper does not exist, return a 404 page with a message.

    Example rendering:

    Attention Is All You Need
    Vaswani, Shazeer, Parmar · 2017-06-12
    The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...
    Categories: cs.CL, cs.LG

  3. GET /search[?q={keyword}]: keyword search

    If q is present, render matching paper cards using _paper_card.html. If absent, render the search form with an empty results area.

    Include an <input id="search-input"> for the keyword field and a <div id="search-results"> as the results container. After login, the JavaScript adds live search: keystrokes fetch results from the API without a page reload.

    Example rendering:

    Search: transformer
    Attention Is All You Need
    Vaswani, Shazeer, Parmar · 2017-06-12
    Language Models are Few-Shot Learners
    Brown, Mann, Ryder · 2020-05-28

  4. GET /: redirect to /papers

Part C: Client-Side Interactions

Your HTML routes return fully rendered pages. The provided JavaScript file app.js adds dynamic behavior: after login, interactions like changing the category or typing a search query fetch data from the API and replace the content in that area without a full page reload.

app.js

On page load, app.js uses document.getElementById() to select each element and attach event handlers. Elements expected on every page (#stats-box, #auth-login) trigger an alert() if missing — this means your base.html is not set up correctly. Page-specific elements (#category-select, #search-input) are silently skipped on pages where they should not appear.

Element            Page     JS Function            Behavior
#stats-box         all      loadStats()            On load: fetches GET /api/stats, renders server info
#auth-login        all      initAuth()             On click: sends credentials to POST /api/login, stores JWT
#category-select   papers   initCategoryBrowse()   On change: fetches papers for category, replaces #paper-list
#search-input      search   initSearch()           On input (300ms debounce): fetches results, replaces #search-results
[data-arxiv-id]    any      initPaperExpand()      On click: fetches full paper, expands abstract inline

Code: static/app.js
/**
 * ArXiv Paper Browser — Client-Side Interactions
 *
 * This script adds dynamic behavior to the server-rendered pages.
 * Each feature checks for the presence of its target elements
 * and skips initialization if they are not on the current page.
 *
 * Expected elements:
 *   #stats-box         — server status (all pages, in base template)
 *   #auth-user         — username input
 *   #auth-pass         — password input
 *   #auth-login        — login button
 *   #auth-status       — login status text
 *   #category-select   — category dropdown (papers page)
 *   #paper-list        — paper card container (papers page)
 *   #search-input      — keyword input (search page)
 *   #search-results    — search results container (search page)
 *   [data-arxiv-id]    — paper cards (any page, for detail expansion)
 */

// ---- State ----

let authToken = null;
let authUser = null;

// ---- API ----

async function apiFetch(path, options = {}) {
  const headers = { "Content-Type": "application/json", ...options.headers };
  if (authToken) {
    headers["Authorization"] = "Bearer " + authToken;
  }
  const resp = await fetch(path, { ...options, headers });
  if (!resp.ok) {
    throw new Error(resp.status + " " + resp.statusText);
  }
  return resp.json();
}

// ---- Stats ----

async function loadStats() {
  const box = document.getElementById("stats-box");
  if (!box) { alert("app.js: missing #stats-box element"); return; }
  try {
    const s = await apiFetch("/api/stats");
    const mins = Math.floor(s.uptime_seconds / 60);
    box.innerHTML =
      "<small>" +
      s.region + " &middot; " + s.table +
      " &middot; uptime " + mins + "m" +
      " &middot; " + s.requests.total + " requests" +
      "</small>";
  } catch (_) {
    box.innerHTML = "<small>API unavailable</small>";
  }
}

// ---- Auth ----

function initAuth() {
  const btn = document.getElementById("auth-login");
  if (!btn) { alert("app.js: missing #auth-login element"); return; }

  btn.addEventListener("click", async function () {
    const user = document.getElementById("auth-user").value;
    const pass = document.getElementById("auth-pass").value;
    const status = document.getElementById("auth-status");

    try {
      const data = await apiFetch("/api/login", {
        method: "POST",
        body: JSON.stringify({ username: user, password: pass }),
      });
      authToken = data.token;
      authUser = user;
      status.textContent =
        "Logged in as " + user + " (" + authToken.slice(0, 12) + "\u2026)";
      status.style.color = "green";
    } catch (_) {
      authToken = null;
      authUser = null;
      status.textContent = "Login failed";
      status.style.color = "red";
    }
  });
}

// ---- Paper Card Rendering ----

function renderPaperCard(paper) {
  const card = document.createElement("article");
  card.setAttribute("data-arxiv-id", paper.arxiv_id);
  card.style.cursor = "pointer";

  const title = document.createElement("strong");
  const link = document.createElement("a");
  link.href = "/papers/" + paper.arxiv_id;
  link.textContent = paper.title;
  title.appendChild(link);

  const authors = document.createElement("small");
  authors.textContent = (paper.authors || []).join(", ");

  const date = document.createElement("small");
  date.textContent = paper.published || "";

  const detail = document.createElement("div");
  detail.className = "paper-detail";
  detail.style.display = "none";

  card.appendChild(title);
  card.appendChild(document.createElement("br"));
  card.appendChild(authors);
  card.appendChild(document.createTextNode(" \u00b7 "));
  card.appendChild(date);
  card.appendChild(detail);

  return card;
}

function renderPaperCards(container, papers) {
  container.innerHTML = "";
  if (!papers || papers.length === 0) {
    container.innerHTML = "<p>No papers found.</p>";
    return;
  }
  papers.forEach(function (p) {
    container.appendChild(renderPaperCard(p));
  });
}

// ---- Category Browse ----

function initCategoryBrowse() {
  const select = document.getElementById("category-select");
  const list = document.getElementById("paper-list");
  if (!select || !list) return;

  select.addEventListener("change", async function () {
    if (!authToken) {
      list.innerHTML = "<p>Login to browse by category.</p>";
      return;
    }
    list.innerHTML = "<p>Loading\u2026</p>";
    try {
      const data = await apiFetch(
        "/api/papers?category=" +
          encodeURIComponent(select.value) +
          "&limit=20"
      );
      renderPaperCards(list, data.papers);
    } catch (err) {
      list.innerHTML = "<p>Error: " + err.message + "</p>";
    }
  });
}

// ---- Keyword Search ----

function initSearch() {
  const input = document.getElementById("search-input");
  const results = document.getElementById("search-results");
  if (!input || !results) return;

  let timer = null;
  input.addEventListener("input", function () {
    clearTimeout(timer);
    const q = input.value.trim();
    if (q.length < 2) {
      return;
    }
    if (!authToken) {
      results.innerHTML = "<p>Login to search.</p>";
      return;
    }
    timer = setTimeout(async function () {
      results.innerHTML = "<p>Searching\u2026</p>";
      try {
        const data = await apiFetch(
          "/api/papers/keyword/" + encodeURIComponent(q) + "?limit=20"
        );
        renderPaperCards(results, data.papers);
      } catch (err) {
        results.innerHTML = "<p>Error: " + err.message + "</p>";
      }
    }, 300);
  });
}

// ---- Paper Detail Expand ----

function initPaperExpand() {
  document.addEventListener("click", async function (e) {
    // ignore clicks on links
    if (e.target.tagName === "A") return;

    var card = e.target.closest("[data-arxiv-id]");
    if (!card) return;

    // find or create detail container
    var detail = card.querySelector(".paper-detail");
    if (!detail) {
      detail = document.createElement("div");
      detail.className = "paper-detail";
      detail.style.display = "none";
      card.appendChild(detail);
    }

    // toggle if already loaded
    if (detail.dataset.loaded) {
      detail.style.display =
        detail.style.display === "none" ? "block" : "none";
      return;
    }

    if (!authToken) return;

    var arxivId = card.getAttribute("data-arxiv-id");
    detail.innerHTML = "<small>Loading\u2026</small>";
    detail.style.display = "block";

    try {
      var paper = await apiFetch(
        "/api/papers/" + encodeURIComponent(arxivId)
      );
      detail.innerHTML =
        "<hr>" +
        "<p>" + (paper.abstract || "No abstract available.") + "</p>" +
        "<small>Categories: " +
        (paper.categories || []).join(", ") +
        "</small>";
      detail.dataset.loaded = "true";
    } catch (_) {
      detail.innerHTML = "<small>Could not load details.</small>";
    }
  });
}

// ---- Init ----

document.addEventListener("DOMContentLoaded", function () {
  loadStats();
  initAuth();
  initCategoryBrowse();
  initSearch();
  initPaperExpand();
});

Code: static/style.css
/* ArXiv Paper Browser — layout overrides for Pico CSS */

#auth-box {
    display: flex;
    gap: 0.5rem;
    align-items: center;
}

#auth-box input {
    width: auto;
}

#auth-status {
    font-size: 0.85rem;
}

.paper-card {
    cursor: pointer;
}

.paper-detail {
    margin-top: 0.5rem;
}

Part D: Run and Test

pip install -r requirements.txt
python app.py

Open http://localhost:8080 in a browser. Verify:

Server-rendered pages:

  • Navigate between Papers and Search — both render content on load
  • Click a paper title — detail page shows all fields including abstract
  • Visit /search?q=transformer directly — results render server-side
  • Visit /papers/nonexistent — a 404 page appears

JavaScript interactions:

  • Click Login in the auth box — status updates with username and truncated token
  • Change the category dropdown — paper list updates without page reload
  • Type in the search input — results appear as you type
  • Click a paper card body (not the title link) — abstract expands inline

Deliverables

See Submission.

Important: Grading Commands

We will validate your submission by running the following commands from your q1/ directory:

docker build -t arxiv-app:latest .
docker run --rm -p 8080:8080 \
    -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_SESSION_TOKEN \
    -e DYNAMODB_TABLE=arxiv-papers \
    arxiv-app:latest

These commands must complete without errors. We will then verify:

Server-rendered pages:

  • Papers, detail, and search routes return HTML with server-rendered content
  • Empty states display a message; nonexistent papers return a 404 page

Template structure:

  • All pages extend base.html
  • _paper_card.html is included in both papers.html and search.html

API and client-side:

  • /api/ endpoints still return JSON
  • JavaScript interactions function after login

Problem 2: Earthquake Data Pipeline

Build a serverless data pipeline using AWS Lambda and event-driven triggers that collects earthquake data from the public USGS Earthquake Hazards API, processes it into structured records, and serves it through HTTP endpoints.

Use only boto3 and Python standard library modules (json, urllib.request, os, datetime, decimal). boto3 is available in the Lambda Python runtime — no packaging or layers are needed.

Architecture

The pipeline has three stages. Each stage is a separate Lambda function, triggered by a different AWS service. The services between them (S3, DynamoDB) handle coordination — no function calls another directly.

Data pipeline (diagram):

EventBridge (every 10 min) → Fetch (USGS API → S3) → S3 (raw JSON) → Process (validate → DynamoDB)
API Gateway (HTTP requests) → Query (DynamoDB → JSON)

  • Fetch: EventBridge invokes on a schedule (every 10 minutes). Pulls raw earthquake data from the USGS API and writes the response to S3.
  • Process: S3 invokes on object creation. Reads the raw data, validates and transforms each record, and writes structured items to DynamoDB.
  • Query: API Gateway invokes on HTTP request. Queries DynamoDB and returns JSON.

Earthquake Data Source

The USGS Earthquake Hazards Program publishes real-time earthquake data as a GeoJSON feed:

https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson

This endpoint returns all earthquakes recorded in the past hour, updated every minute. The response is a GeoJSON FeatureCollection:

{
  "type": "FeatureCollection",
  "metadata": {"generated": 1775768252000, "count": 6},
  "features": [
    {
      "type": "Feature",
      "properties": {
        "mag": 1.3,
        "place": "29 km WSW of Cantwell, Alaska",
        "time": 1775767515931,
        "updated": 1775767622832,
        "url": "https://earthquake.usgs.gov/earthquakes/eventpage/aka2026gzmlim",
        "status": "automatic",
        "tsunami": 0,
        "sig": 26,
        "magType": "ml",
        "type": "earthquake",
        "title": "M 1.3 - 29 km WSW of Cantwell, Alaska"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [-149.449, 63.25, 82.6]
      },
      "id": "aka2026gzmlim"
    }
  ]
}

Warning

geometry.coordinates is ordered [longitude, latitude, depth_km]: longitude comes first, not latitude.

Part A: AWS Infrastructure

Use the AWS Console to create the following resources before writing the Lambda functions.

1. S3 Bucket

Create an S3 bucket to store raw earthquake feed responses. Bucket names must be globally unique — a common convention is to include your AWS account ID (e.g., earthquakes-123456789012).

2. DynamoDB Table

Create a DynamoDB table to store processed earthquake records. The Query Lambda serves two access patterns:

  • Querying earthquakes within a date range, sorted by time (most recent first)
  • Looking up a single earthquake by its USGS event ID

Design your table to support both. Use the following schema structure:

# Main table item
{
    "PK": "EQ",                                      # Fixed partition key
    "SK": "2026-04-08T14:30:00Z#us7000abc1",         # time_iso#earthquake_id
    "earthquake_id": "us7000abc1",
    "magnitude": Decimal("5.2"),
    "place": "10km NW of The Geysers, CA",
    # ... remaining fields
}

# Date range query: PK = "EQ", SK between "2026-04-01" and "2026-04-10"
# ISO-8601 strings sort lexicographically — no date parsing needed

# Individual lookup needs a GSI:
# GSI partition key: earthquake_id

Document your schema choices in your README.
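To make the key design concrete, here is a small sketch of how the sort key and the date-range bounds could be built under the schema above. The helper names are hypothetical. One detail worth checking in your own design: every sort key for a given day starts with that date followed by "T", so an upper bound of plain "2026-04-10" sorts before all of that day's records. The sketch appends "~" to make the end date inclusive, which is one possible approach among several.

```python
def make_sort_key(time_iso, earthquake_id):
    """Compose the sort key as time_iso#earthquake_id."""
    return f"{time_iso}#{earthquake_id}"


def date_range_bounds(start, end):
    """Return (low, high) sort-key bounds covering start..end inclusive.

    "~" sorts after "T", digits, ":" and "#" in ASCII, so end + "~" is
    greater than every sort key that begins with the end date.
    """
    return start, end + "~"


def build_range_query(start, end):
    """Keyword arguments intended for a boto3 Table.query(**kwargs) call."""
    low, high = date_range_bounds(start, end)
    return {
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :low AND :high",
        "ExpressionAttributeValues": {":pk": "EQ", ":low": low, ":high": high},
        "ScanIndexForward": False,  # descending sort key: most recent first
    }
```

Because the comparison is purely lexicographic, the bounds can be checked with ordinary Python string comparison before touching DynamoDB.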

3. IAM Role

Your Lambda functions need access to several AWS services: S3 (to read and write raw data), DynamoDB (to read and write earthquake records), and CloudWatch Logs (for logging). Rather than configuring permissions separately for each function, create a single IAM role (earthquake-pipeline-role) and assign it to all three.

An IAM role defines what actions a service (in this case Lambda) is allowed to perform. The trust policy specifies who can assume the role; the permission policy specifies what the role can do.

IAM Policy JSON

Trust policy (allows Lambda to assume the role):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole"
        }
    ]
}

Permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:Query",
                "dynamodb:GetItem"
            ],
            "Resource": "arn:aws:dynamodb:*:*:table/YOUR_TABLE_NAME*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}

Replace YOUR_BUCKET_NAME and YOUR_TABLE_NAME with your actual resource names.

Part B: Lambda Functions

Create a Lambda function for each stage. Each function is a single .py file, configured with:

  • Runtime: Python 3.12
  • Memory: 128 MB
  • Timeout: 30 seconds
  • Role: earthquake-pipeline-role

Environment Variables

Each Lambda function reads configuration from environment variables (os.environ). Set these per-function in the Lambda console under Configuration → Environment variables:

  • Fetch: S3_BUCKET, FEED_URL (optional)
  • Process: DYNAMODB_TABLE
  • Query: DYNAMODB_TABLE

Lambda Functions

1. Fetch Lambda (fetch.py)

Triggered by EventBridge with schedule expression rate(10 minutes).

The USGS feed returns the past hour of data. With a 10-minute fetch interval, consecutive fetches overlap, so the same earthquake ID may appear in multiple responses. Because the earthquake ID is part of the DynamoDB primary key, writing the same record again overwrites the existing item rather than creating a duplicate.

fetch.py calls the USGS endpoint and writes the raw JSON response to S3. The S3 key includes a UTC timestamp so each fetch produces a distinct object:

import json
import os
import urllib.request
from datetime import datetime, timezone

import boto3

S3_BUCKET = os.environ["S3_BUCKET"]
FEED_URL = os.environ.get(
    "FEED_URL",
    "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson"
)

def lambda_handler(event, context):
    # Fetch USGS feed (10-second timeout)
    # On non-200 or timeout: log the error, return without writing to S3
    # Do not raise: a clean exit avoids the automatic retries that follow a failed asynchronous invocation

    # Write raw response to S3:
    #   key format: raw/YYYY-MM-DDTHH-MM-SS.json (UTC)

    # Log: timestamp, HTTP status, feature count
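As one way to produce the key format described in the comments above, the sketch below formats a UTC timestamp with strftime. The helper name make_raw_key is hypothetical; the raw/ prefix and the timestamp layout come from the skeleton.

```python
from datetime import datetime, timezone


def make_raw_key(now=None):
    """Return an S3 key of the form raw/YYYY-MM-DDTHH-MM-SS.json in UTC."""
    if now is None:
        now = datetime.now(timezone.utc)
    # astimezone() normalizes any timezone-aware datetime to UTC first.
    return now.astimezone(timezone.utc).strftime("raw/%Y-%m-%dT%H-%M-%S.json")
```

Hyphens replace the colons of the usual ISO time so the key stays easy to handle in URLs and shells, and keys still sort chronologically.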

2. Process Lambda (process.py)

Triggered by S3 ObjectCreated events on your bucket, filtered to the raw/ prefix.

When the Fetch Lambda writes a new file to S3, this function is invoked automatically. It reads the raw GeoJSON, validates each feature, extracts a flat set of fields, and writes the results to DynamoDB.

Fields to Extract
{
    "earthquake_id": feature["id"],                    # String
    "magnitude":     Decimal(str(props["mag"])),       # Number
    "mag_type":      props["magType"],                 # String
    "place":         props["place"],                   # String
    "time":          props["time"],                    # Number (Unix ms)
    "time_iso":      "...",                            # String (ISO-8601 UTC, converted from time)
    "latitude":      Decimal(str(coords[1])),          # Number
    "longitude":     Decimal(str(coords[0])),          # Number
    "depth_km":      Decimal(str(coords[2])),          # Number
    "tsunami":       props["tsunami"],                 # Number (0 or 1)
    "significance":  props["sig"],                     # Number
    "status":        props["status"],                  # String
    "url":           props["url"],                     # String
}

Add your DynamoDB partition key and sort key fields to each item according to your schema design.
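The time_iso field is left as a placeholder in the listing above. One way to derive it from the Unix-millisecond time value, assuming second precision is enough for your sort key, is sketched below. The function name is hypothetical.

```python
from datetime import datetime, timezone


def ms_to_iso(time_ms):
    """Convert Unix milliseconds to an ISO-8601 UTC string (second precision)."""
    seconds = int(time_ms) // 1000  # integer division avoids float rounding
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

For the sample feature shown earlier, ms_to_iso(1775767515931) returns "2026-04-09T20:45:15Z".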

Validation

Skip features where:

  • mag is null or missing
  • geometry.coordinates is missing or has fewer than 3 elements
  • properties.type is not "earthquake" (the feed includes quarry blasts, explosions, and other non-earthquake events)
  • Latitude is outside [-90, 90] or longitude is outside [-180, 180]

Log the number of features in the file, the number written, and the number skipped.
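The skip rules above can be collected into a single predicate. This is a minimal sketch under the assumption that each feature is a dict shaped like the GeoJSON sample; the function names are hypothetical, and you may prefer to log the reason for each skip.

```python
def is_valid_feature(feature):
    """Return True if the feature passes all four validation rules."""
    props = feature.get("properties") or {}
    geometry = feature.get("geometry") or {}
    coords = geometry.get("coordinates")

    if props.get("mag") is None:          # null or missing magnitude
        return False
    if props.get("type") != "earthquake":  # quarry blasts, explosions, etc.
        return False
    if not isinstance(coords, (list, tuple)) or len(coords) < 3:
        return False

    lon, lat = coords[0], coords[1]        # GeoJSON order: longitude first
    if not isinstance(lon, (int, float)) or not isinstance(lat, (int, float)):
        return False
    return -180 <= lon <= 180 and -90 <= lat <= 90


def split_features(features):
    """Return (valid_features, skipped_count) for logging."""
    valid = [f for f in features if is_valid_feature(f)]
    return valid, len(features) - len(valid)
```

The second return value of split_features gives the skipped count, and the length of the first gives the number of features that remain to be written.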

Idempotent Insert (De-duplication)

The Fetch Lambda runs every 10 minutes, but the USGS feed returns the past hour. Consecutive fetches contain overlapping data. Use the earthquake’s USGS event ID in your DynamoDB key so that writing the same earthquake twice is an idempotent overwrite, not a duplicate.
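The overwrite behavior can be illustrated without AWS by modelling the table as a dict keyed on (PK, SK), a simplification of how PutItem replaces an item that has the same primary key. FakeTable and to_item are hypothetical stand-ins for illustration, not boto3 classes, and the key layout assumes the example schema from Part A.

```python
from decimal import Decimal


class FakeTable:
    """In-memory stand-in for a DynamoDB table (illustration only)."""

    def __init__(self):
        self.items = {}

    def put_item(self, Item):
        # Like PutItem: an item with the same (PK, SK) replaces the old one.
        self.items[(Item["PK"], Item["SK"])] = Item


def to_item(earthquake_id, time_iso, magnitude):
    """Build an item whose sort key embeds the USGS event ID."""
    return {
        "PK": "EQ",
        "SK": f"{time_iso}#{earthquake_id}",
        "earthquake_id": earthquake_id,
        "magnitude": Decimal(str(magnitude)),
    }


table = FakeTable()
# Two overlapping fetches: the first event appears in both responses.
for fetched_ids in (["us7000abc1"], ["us7000abc1", "us7000abc2"]):
    for eq_id in fetched_ids:
        table.put_item(Item=to_item(eq_id, "2026-04-08T14:30:00Z", 5.2))

print(len(table.items))  # 2 distinct earthquakes from 3 writes
```

Note that this only holds while the full key stays the same between fetches. If a key component such as the event time were revised in a later feed, a key that embeds it would change as well, which is worth considering when you justify your design in the README.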

Warning: DynamoDB and Python Floats

boto3 does not accept Python float values for DynamoDB number fields. Convert through Decimal(str(value)):

from decimal import Decimal

Decimal(str(5.2))     # Decimal('5.2') — correct
Decimal(5.2)          # Decimal('5.2000000000000001776...'), wrong: keeps the float's binary rounding error

import json
import os
import boto3
from decimal import Decimal
from datetime import datetime, timezone
from urllib.parse import unquote_plus

DYNAMODB_TABLE = os.environ["DYNAMODB_TABLE"]

def lambda_handler(event, context):
    # Get bucket and key from the S3 event record:
    #   bucket: event["Records"][0]["s3"]["bucket"]["name"]
    #   key:    unquote_plus(event["Records"][0]["s3"]["object"]["key"])

    # Read raw GeoJSON from S3

    # For each feature:
    #   Validate (skip invalid)
    #   Extract and flatten fields
    #   Add partition key, sort key (per your schema design)
    #   Write to DynamoDB
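The event parsing step in the skeleton can be sketched as a small helper. S3 URL-encodes object keys in event notifications, which is why the skeleton imports unquote_plus. The function name is hypothetical, and looping over all records (rather than reading only Records[0]) is a defensive choice, not a requirement of the assignment.

```python
from urllib.parse import unquote_plus


def parse_s3_event(event):
    """Return a list of (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        key = unquote_plus(s3_info["object"]["key"])  # decode %XX and "+"
        pairs.append((bucket, key))
    return pairs
```

Each returned pair can then be passed to the S3 client to read the raw GeoJSON object.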

3. Query Lambda (query.py)

Triggered by API Gateway (HTTP API).

Route requests by path using the API Gateway event fields:

  • event["rawPath"] — the request path, e.g. "/earthquakes" or "/earthquakes/us7000abc1"
  • event["queryStringParameters"] — query parameters as a dict
  • event["pathParameters"] — path parameters as a dict

Implement two endpoints:

  1. GET /earthquakes?start={date}&end={date}&min_magnitude={n}&limit={n} — list earthquakes

    start and end are required (format: YYYY-MM-DD). min_magnitude defaults to 0. limit defaults to 50, maximum 500. Return earthquakes in the date range, filtered by minimum magnitude, sorted by time (most recent first).

    {
      "parameters": {
        "start": "2026-04-01",
        "end": "2026-04-09",
        "min_magnitude": 4.0,
        "limit": 50
      },
      "earthquakes": [
        {
          "earthquake_id": "us7000abc1",
          "magnitude": 5.2,
          "place": "10km NW of The Geysers, CA",
          "time": "2026-04-08T14:30:00Z",
          "latitude": 38.85,
          "longitude": -122.80,
          "depth_km": 2.5
        }
      ],
      "count": 15
    }

    Return 400 if start or end is missing or not valid YYYY-MM-DD.

  2. GET /earthquakes/{earthquake_id} — single earthquake

    Return all stored fields:

    {
      "earthquake_id": "us7000abc1",
      "magnitude": 5.2,
      "mag_type": "mw",
      "place": "10km NW of The Geysers, CA",
      "time": "2026-04-08T14:30:00Z",
      "latitude": 38.85,
      "longitude": -122.80,
      "depth_km": 2.5,
      "tsunami": 0,
      "significance": 416,
      "status": "reviewed",
      "url": "https://earthquake.usgs.gov/earthquakes/eventpage/us7000abc1"
    }

    Return 404 if the ID does not exist:

    {"error": "Earthquake not found", "earthquake_id": "invalid_id"}
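The parameter rules for the list endpoint (required start and end, defaults for min_magnitude and limit) could be checked along these lines. This is a sketch, not the required implementation: BadRequest and parse_list_params are hypothetical names, and clamping an oversized limit to 500 rather than rejecting it is an assumption about how to read "maximum 500".

```python
from datetime import datetime


class BadRequest(ValueError):
    """Raised when query parameters should produce a 400 response."""


def parse_list_params(query):
    """Validate and normalize the query parameters for GET /earthquakes."""
    query = query or {}  # queryStringParameters may be absent or None
    params = {}
    for name in ("start", "end"):
        value = query.get(name)
        if not value:
            raise BadRequest(f"{name} is required (YYYY-MM-DD)")
        try:
            parsed = datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            raise BadRequest(f"{name} must be a valid YYYY-MM-DD date")
        # strptime also accepts "2026-4-1"; require the zero-padded form
        # so the value compares correctly against ISO sort keys.
        if parsed.strftime("%Y-%m-%d") != value:
            raise BadRequest(f"{name} must be a valid YYYY-MM-DD date")
        params[name] = value
    try:
        params["min_magnitude"] = float(query.get("min_magnitude", 0))
        limit = int(query.get("limit", 50))
    except ValueError:
        raise BadRequest("min_magnitude and limit must be numeric")
    if limit < 1:
        raise BadRequest("limit must be at least 1")
    params["limit"] = min(limit, 500)  # assumption: clamp instead of reject
    return params
```

The handler can catch BadRequest and turn its message into a 400 response, and echo the returned dict in the parameters field of a successful response.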

Response Format

All Lambda responses through API Gateway use this structure:

return {
    "statusCode": 200,
    "headers": {"Content-Type": "application/json"},
    "body": json.dumps(response_data)
}

Error responses return JSON with an error field and the appropriate status code (400, 404, or 500).
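Items read back through the boto3 resource interface carry Decimal values, which json.dumps cannot serialize by default. A small pair of helpers, sketched here with hypothetical names, can build the response structure and handle that conversion in one place.

```python
import json
from decimal import Decimal


def _json_default(value):
    """Convert Decimal to int or float so json.dumps can serialize it."""
    if isinstance(value, Decimal):
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


def json_response(status_code, body):
    """Build the response dict that API Gateway expects from Lambda."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body, default=_json_default),
    }


def error_response(status_code, message, **extra):
    """Build an error response with an error field plus optional details."""
    return json_response(status_code, {"error": message, **extra})
```

For example, error_response(404, "Earthquake not found", earthquake_id="invalid_id") yields the 404 body shown above. With this conversion, whole-number Decimals are emitted as integers, so a magnitude stored as 5.0 serializes as 5; adjust the helper if you need a fixed number format.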

import json
import os
import boto3
from decimal import Decimal

DYNAMODB_TABLE = os.environ["DYNAMODB_TABLE"]

def lambda_handler(event, context):
    # Route by path:
    #   event["rawPath"]                  — "/earthquakes" or "/earthquakes/{id}"
    #   event["queryStringParameters"]    — {"start": "...", "end": "...", ...}
    #   event["pathParameters"]           — {"earthquake_id": "..."}

    # /earthquakes       → query DynamoDB (date range on sort key, filter by magnitude)
    # /earthquakes/{id}  → GSI lookup by earthquake_id
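The routing step in the skeleton might look like the following sketch. It assumes the API is deployed on the default stage, so rawPath carries no stage prefix, and that the detail route was defined with an {earthquake_id} path parameter; the function name and the returned labels are hypothetical.

```python
def route(event):
    """Map an API Gateway HTTP API event to (route_name, earthquake_id)."""
    path = (event.get("rawPath") or "").rstrip("/")
    path_params = event.get("pathParameters") or {}
    earthquake_id = path_params.get("earthquake_id")

    if path == "/earthquakes":
        return ("list", None)

    # Fall back to parsing the path if no path parameter was supplied.
    prefix = "/earthquakes/"
    if earthquake_id is None and path.startswith(prefix):
        earthquake_id = path[len(prefix):]

    if earthquake_id and "/" not in earthquake_id:
        return ("detail", earthquake_id)
    return ("not_found", None)
```

The handler can then dispatch on the first element of the tuple and return a 404 response for "not_found".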

Part C: Deployment and Testing

Deploy all three functions using the AWS Console. Each is a single .py file small enough to paste into the console’s inline code editor.

Triggers

Attach triggers through the console:

  • Process Lambda — S3 trigger, event type ObjectCreated, prefix raw/
  • Query Lambda — API Gateway (HTTP API) trigger
  • Fetch Lambda — EventBridge rule, rate(10 minutes)

Testing

Invoke the Fetch Lambda manually from the console Test tab with an empty event ({}). Verify the raw file appears in S3 under raw/, then check that the S3 trigger fires the Process Lambda (CloudWatch Logs), and records appear in DynamoDB. Test the Query Lambda with the API Gateway URL from curl or a browser:

# List earthquakes in date range
curl "https://YOUR_API_ID.execute-api.us-west-2.amazonaws.com/earthquakes?start=2026-04-01&end=2026-04-10"

# Get single earthquake
curl "https://YOUR_API_ID.execute-api.us-west-2.amazonaws.com/earthquakes/us7000abc1"

Deliverables

See Submission.

Download your Lambda function code from the console (each function → Code tab) and include in your submission.

README.md must include:

  1. API Gateway base URL
  2. DynamoDB table name, partition key, sort key, and any GSIs — with justification for your key design
  3. How data flows through the three Lambdas
  4. The Fetch Lambda runs every 10 minutes, but the USGS feed returns the past hour of data. Explain why this does not produce duplicate records in your table.

Important: Grading Commands

We will review your three handler files and README. We will verify:

Lambda functions:

  • fetch.py calls the USGS API, writes raw GeoJSON to S3 under raw/, handles errors without raising
  • process.py parses the S3 event, reads GeoJSON, validates features (skips null magnitude, non-earthquake types, invalid coordinates), writes structured records to DynamoDB using Decimal for number fields
  • query.py routes API Gateway events to two endpoints, returns correct JSON, validates required parameters (400), handles missing records (404)

README:

  • Schema documented with key design justification
  • API Gateway URL included
  • Data flow described
  • Idempotent insert explained

Tip: Submission

README.md
q1/
├── app.py
├── Dockerfile
├── requirements.txt
├── static/
│   ├── app.js
│   └── style.css
└── templates/
    ├── base.html
    ├── papers.html
    ├── detail.html
    ├── search.html
    └── _paper_card.html
q2/
├── fetch.py
├── process.py
├── query.py
└── README.md