HTTP Fundamentals

What is HTTP?

HTTP (HyperText Transfer Protocol) is a request-response protocol for communication between clients and servers over networks. When you visit a website, your browser (client) sends an HTTP request to a web server, which responds with the requested data.

HTTP operates on a simple pattern:

  1. Client sends request: GET /api/users HTTP/1.1
  2. Server sends response: 200 OK + data
  3. Connection closes (or stays open for reuse)

Every HTTP interaction follows this request-response cycle. The client always initiates; the server always responds.

HTTP Message Structure

Both requests and responses have the same basic structure:

START LINE
HEADERS
[blank line]
BODY (optional)
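
As a rough illustration of this layout, the Python sketch below splits a raw message into its three parts. The name parse_http_message is a hypothetical helper written for this note, not a library function, and it assumes CRLF line endings, a text body, and no repeated header names.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP/1.x message into (start_line, headers, body).

    Simplified on purpose: assumes CRLF line endings, a text body,
    and unique header names.
    """
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw_request = (
    "GET /api/users HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept: application/json\r\n"
    "\r\n"
)
start_line, headers, body = parse_http_message(raw_request)
print(start_line)       # GET /api/users HTTP/1.1
print(headers["host"])  # example.com
print(repr(body))       # ''
```

The same function works for responses, because only the start line differs between the two message types.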

HTTP Request Example

GET /api/papers/2301.12345 HTTP/1.1
Host: arxiv.org
User-Agent: curl/7.68.0
Accept: application/json

HTTP Response Example

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1247
Server: nginx/1.18.0

{
  "title": "Attention Is All You Need",
  "authors": ["Vaswani", "Shazeer"]
}

HTTP Methods (Verbs)

HTTP methods indicate the intended action:

GET - Retrieve data (no body sent)

GET /papers HTTP/1.1
Host: api.arxiv.org

POST - Send data to server (body contains data)

POST /search HTTP/1.1
Host: api.arxiv.org
Content-Type: application/json
Content-Length: 45

{
  "query": "machine learning",
  "limit": 10
}

PUT - Replace a resource with the data in the request body

PUT /papers/2301.12345 HTTP/1.1
Host: api.arxiv.org
Content-Type: application/json

{
  "title": "Updated Title",
  "abstract": "New abstract text..."
}

DELETE - Remove resource

DELETE /papers/2301.12345 HTTP/1.1
Host: api.arxiv.org

HEAD - Like GET but only returns headers (no body)

HEAD /papers/2301.12345 HTTP/1.1
Host: api.arxiv.org
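
To make the differences between these methods concrete, here is a small Python sketch that serializes a request for any method. The helper format_request is hypothetical (written for this note), sends the payload as JSON, and adds Content-Type and Content-Length only when a body is present.

```python
import json


def format_request(method, path, host, payload=None):
    """Serialize a minimal HTTP/1.1 request; payload (a dict) is sent as JSON."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    body = b""
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        lines.append("Content-Type: application/json")
        # Content-Length counts bytes of the encoded body, not characters.
        lines.append(f"Content-Length: {len(body)}")
    # A blank line (CRLF CRLF) ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(format_request("DELETE", "/papers/2301.12345", "api.arxiv.org").decode())
print(format_request("POST", "/search", "api.arxiv.org", {"limit": 10}).decode())
```

GET, HEAD, and DELETE calls normally omit the payload argument, while POST and PUT pass one.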

Status Codes

The server’s response starts with a status line whose three-digit status code indicates success or failure:

2xx Success

  • 200 OK - Request succeeded
  • 201 Created - Resource created successfully
  • 204 No Content - Success, but no data to return

3xx Redirection

  • 301 Moved Permanently - Resource has new URL
  • 302 Found - Temporary redirect
  • 304 Not Modified - Cached version is current

4xx Client Error

  • 400 Bad Request - Invalid request format
  • 401 Unauthorized - Authentication required
  • 403 Forbidden - Server refuses access, even if authenticated
  • 404 Not Found - Resource doesn’t exist

5xx Server Error

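  • 500 Internal Server Error - Server hit an unexpected error while handling the request
  • 502 Bad Gateway - Gateway or proxy got an invalid response from the upstream server
  • 503 Service Unavailable - Server temporarily unable to handle requests (overloaded or under maintenance)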
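
Client code usually branches on the first digit of the status code rather than on each individual value. The sketch below shows one way to do that in Python using the standard library's http.HTTPStatus enum. The function describe_status and the STATUS_CLASSES table are illustrative names, not part of any API.

```python
from http import HTTPStatus

# The first digit of a status code identifies its class.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code: int) -> str:
    """Return e.g. '404 Not Found (Client Error)' for a status code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not registered in the standard library's enum.
        phrase = "Unknown"
    return f"{code} {phrase} ({status_class})"


for code in (200, 204, 301, 404, 503):
    print(describe_status(code))
```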

Common Headers

Headers provide metadata about requests and responses:

Request Headers

Host: api.arxiv.org              # Required in HTTP/1.1
User-Agent: MyApp/1.0            # Client identification
Accept: application/json         # Preferred response format
Content-Type: application/json   # Body format (POST/PUT)
Content-Length: 123              # Body size in bytes
Authorization: Bearer abc123     # Authentication token

Response Headers

Content-Type: application/json   # Body format
Content-Length: 1247             # Body size
Server: nginx/1.18.0             # Server software
Date: Mon, 16 Sep 2024 14:30:00 GMT
Cache-Control: max-age=3600      # Caching instructions
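
Header names are case-insensitive, which matters when reading them in code. As a minimal sketch, Python's standard http.client.parse_headers function reads header lines up to the blank line and returns a case-insensitive, dict-like object. The header bytes below are sample data taken from the response headers above.

```python
import io
import http.client

raw_headers = (
    b"Content-Type: application/json\r\n"
    b"Content-Length: 1247\r\n"
    b"Cache-Control: max-age=3600\r\n"
    b"\r\n"
)

# parse_headers expects a binary file-like object positioned at the
# first header line (that is, just after the start line).
headers = http.client.parse_headers(io.BytesIO(raw_headers))

print(headers["content-type"])         # application/json
print(int(headers["CONTENT-LENGTH"]))  # 1247
print(headers.get("Authorization"))    # None (header not present)
```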

Testing HTTP with curl

curl is a command-line tool for making HTTP requests. Use -v (verbose) to see the full request/response:

Basic GET Request

curl -v https://httpbin.org/get

Output (abridged; curl also prints connection and TLS details on lines starting with *):

> GET /get HTTP/1.1
> Host: httpbin.org
> User-Agent: curl/7.68.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 294
< 
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Host": "httpbin.org", 
    "User-Agent": "curl/7.68.0"
  }, 
  "origin": "203.0.113.1", 
  "url": "https://httpbin.org/get"
}

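Lines starting with > are request headers sent by curl, and lines starting with < are response headers received from the server. The response body is printed without a prefix.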

POST Request with JSON Data

curl -v -X POST https://httpbin.org/post \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning", "limit": 5}'

Flags:

  • -X POST - Use POST method (optional here, since -d already implies POST)
  • -H "Header: Value" - Add custom header
  • -d "data" - Send data in request body

Testing Different Status Codes

# Test 404 Not Found
curl -v https://httpbin.org/status/404

# Test 500 Server Error  
curl -v https://httpbin.org/status/500

# Test redirect
curl -v https://httpbin.org/redirect/1

Testing with Headers

# Send custom User-Agent
curl -v -H "User-Agent: MyBot/1.0" https://httpbin.org/get

# Send multiple headers
curl -v \
  -H "Accept: application/json" \
  -H "Authorization: Bearer token123" \
  https://httpbin.org/headers

Following Redirects

# Don't follow redirects (default)
curl -v https://httpbin.org/redirect/1

# Follow redirects automatically
curl -v -L https://httpbin.org/redirect/1
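
What -L does can be sketched in a few lines of Python: keep requesting the URL named in the Location header until a non-redirect status arrives. The code below is an illustration only. It uses a canned lookup table (CANNED, with made-up example.test URLs) instead of real network calls, and the function follow_redirects and its max_hops limit are hypothetical.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(url, fetch, max_hops=10):
    """Follow Location headers until a non-redirect response.

    `fetch(url)` must return (status_code, headers_dict).
    Returns (final_url, final_status, number_of_redirects_followed).
    """
    for hops in range(max_hops + 1):
        status, headers = fetch(url)
        if status not in REDIRECT_CODES:
            return url, status, hops
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, headers["Location"])
    raise RuntimeError(f"gave up after {max_hops} redirects")


# Simulated server: a canned table instead of real network calls.
CANNED = {
    "https://example.test/old": (302, {"Location": "/new"}),
    "https://example.test/new": (200, {}),
}
print(follow_redirects("https://example.test/old", CANNED.__getitem__))
```

The hop limit guards against redirect loops, where two URLs keep pointing at each other.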

HTTP in Web APIs

Most web APIs use HTTP with JSON data:

API Request Pattern

curl -v https://api.github.com/users/octocat \
  -H "Accept: application/vnd.github.v3+json" \
  -H "User-Agent: MyApp/1.0"

API Response Pattern

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59

{
  "login": "octocat",
  "id": 1,
  "name": "The Octocat",
  "public_repos": 8
}

APIs often use:

  • JSON for data exchange
  • Bearer tokens for authentication
  • Custom headers for API versioning and rate limiting
  • Query parameters for filtering: ?limit=10&offset=20
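
The conventions in this list can be combined in one request. The Python sketch below builds such a request with the standard urllib modules but does not send it. The host api.example.test, the token value, and the helper name build_api_request are placeholders chosen for illustration.

```python
import urllib.parse
import urllib.request


def build_api_request(base_url, path, token, params=None):
    """Build (but do not send) a GET request following common API conventions."""
    url = base_url.rstrip("/") + path
    if params:
        # Query parameters for filtering, e.g. ?limit=10&offset=20
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",        # ask for JSON
            "Authorization": f"Bearer {token}",  # bearer-token authentication
            "User-Agent": "MyApp/1.0",
        },
        method="GET",
    )


req = build_api_request(
    "https://api.example.test", "/papers", "token123", {"limit": 10, "offset": 20}
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the request object to urllib.request.urlopen would actually send it. That step is left out here so the example runs without network access.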

HTTP Connection Behavior

Connection Lifecycle

  1. TCP connection established to server
  2. HTTP request sent over connection
  3. HTTP response received
  4. Connection closed (HTTP/1.0 default) or kept alive (HTTP/1.1+ default)

Keep-Alive vs New Connections

# Single request - connection closes
curl -v -H "Connection: close" https://httpbin.org/get

# Multiple requests - reuse connection
curl -v https://httpbin.org/get https://httpbin.org/ip

Connection reuse improves performance by avoiding a new TCP handshake (and, for HTTPS, a new TLS handshake) for every request.
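
Connection reuse can be observed locally without any external service. The following Python sketch starts a throwaway HTTP/1.1 server on the loopback interface and sends two requests over one http.client.HTTPConnection. The handler class EchoHandler, the seen_ports list, and the function name are all made up for this demonstration. If both requests arrive from the same client port, they shared one TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_ports = []  # client source port of every request the server handled


class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections alive

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Keep-alive needs Content-Length so the client knows where the body ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_twice_on_one_connection():
    """Send two GET requests over a single TCP connection; return the statuses."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        statuses = []
        for path in ("/get", "/ip"):
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before reusing the connection
            statuses.append(response.status)
        conn.close()
        return statuses
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_twice_on_one_connection())
    print("same client port:", len(set(seen_ports)) == 1)
```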

Common HTTP Patterns

Query Parameters

# Search with parameters
curl -v "https://httpbin.org/get?query=python&limit=10&sort=date"

Parameters are URL-encoded:

  • Spaces become %20 (or + in form-encoded data)
  • Special characters become %XX codes
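
These encoding rules are implemented in Python's standard urllib.parse module, shown below as a short sketch. The parameter names and values are sample data.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode

print(quote("machine learning"))       # machine%20learning
print(quote_plus("machine learning"))  # machine+learning
print(quote("a&b=c", safe=""))         # a%26b%3Dc

# urlencode builds a whole query string and uses + for spaces by default.
query = urlencode({"query": "deep learning", "limit": 10, "sort": "date"})
print(query)  # query=deep+learning&limit=10&sort=date

# Decoding reverses the process; every value comes back as a list of strings.
print(parse_qs(query))
```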

Form Data

# Send form data (like HTML form submission)
curl -v -X POST https://httpbin.org/post \
  -d "username=admin&password=secret"

With -d, curl sets Content-Type: application/x-www-form-urlencoded automatically.

File Upload

# Upload file
curl -v -X POST https://httpbin.org/post \
  -F "file=@document.pdf" \
  -F "description=Important document"

With -F, curl encodes the request body as multipart/form-data, the encoding used for file uploads.

Debugging HTTP Issues

Check Response Headers

# Only show response headers (sends a HEAD request)
curl -I https://httpbin.org/get

# Show response headers with body
curl -i https://httpbin.org/get

Test Connectivity

# Test if server responds
curl -v --connect-timeout 5 https://example.com/

# Test with specific HTTP version
curl -v --http1.1 https://example.com/
curl -v --http2 https://example.com/

Save Response

# Save response body to file
curl -o response.json https://httpbin.org/json

# Save response headers to file (the body still goes to stdout)
curl -D headers.txt https://httpbin.org/get

HTTP vs HTTPS

HTTP - Unencrypted communication (port 80)

curl -v http://httpbin.org/get

HTTPS - Encrypted with TLS, the successor to SSL (port 443)

curl -v https://httpbin.org/get

HTTPS wraps HTTP in an encrypted tunnel. The HTTP message format remains identical, but network traffic is encrypted.
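
Because the scheme determines the default port, a client can work out where to connect from the URL alone. A minimal Python sketch, with effective_port and DEFAULT_PORTS as illustrative names:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://httpbin.org/get"))        # 80
print(effective_port("https://httpbin.org/get"))       # 443
print(effective_port("http://localhost:8080/papers"))  # 8080
```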

Testing Your Own API

When building HTTP APIs, test common scenarios:

# Test your local server
curl -v http://localhost:8080/papers

# Test error handling
curl -v http://localhost:8080/papers/invalid-id

# Test POST requests
curl -v -X POST http://localhost:8080/search \
  -H "Content-Type: application/json" \
  -d '{"query": "test"}'

# Test with malformed data
curl -v -X POST http://localhost:8080/search \
  -H "Content-Type: application/json" \
  -d '{"invalid": json'
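
The curl commands above exercise the server from the outside. The same cases can also be checked at the function level. The sketch below shows the core of a hypothetical POST /search handler in Python. The function name handle_search, the response shape, and the rule that query must be a string are assumptions made for illustration, not a description of any particular server.

```python
import json


def handle_search(raw_body: bytes):
    """Core of a hypothetical POST /search handler: return (status, payload)."""
    try:
        data = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Malformed input is the client's mistake, so answer 400, not 500.
        return 400, {"error": "request body is not valid JSON"}
    if not isinstance(data, dict) or not isinstance(data.get("query"), str):
        return 400, {"error": "'query' must be a string"}
    # A real handler would run the search here; this sketch returns no results.
    return 200, {"query": data["query"], "results": []}


print(handle_search(b'{"query": "test"}'))
print(handle_search(b'{"invalid": json'))
```

Returning 400 for bad input keeps 5xx codes reserved for genuine server faults, which makes client-side error handling simpler.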

Quick Reference

curl Flags

-v, --verbose             # Show request/response headers
-X, --request METHOD      # HTTP method (GET, POST, etc.)
-H, --header "Name: Val"  # Add header
-d, --data "data"         # Send data (POST body)
-o, --output file         # Save response to file
-I, --head                # HEAD request (headers only)
-L, --location            # Follow redirects
-f, --fail                # Exit with error on HTTP errors
-s, --silent              # Quiet mode (no progress bar)
-w, --write-out format    # Output format string

Status Code Ranges

  • 1xx - Informational (rarely used)
  • 2xx - Success
  • 3xx - Redirection
  • 4xx - Client error (your fault)
  • 5xx - Server error (their fault)

Common Content Types

  • application/json - JSON data
  • application/xml - XML data
  • text/html - Web pages
  • text/plain - Plain text
  • application/x-www-form-urlencoded - Form data
  • multipart/form-data - File uploads