HTTP

HTTP, the HyperText Transfer Protocol, is one of the protocols at the heart of the Internet. Messages always come in pairs: a request receives a response.

Methods (verbs)

HTTP exposes a number of methods, used to indicate the type of action to perform. By convention:

  • GET reads a resource and returns its content and metadata.
  • HEAD reads and returns just the metadata (headers).
  • POST creates a new resource.
  • PUT updates a resource in place.
  • DELETE removes a resource.
  • TRACE provides loop-back testing by echoing the received request back to the client.

The HTTP specification allows extension through addition of HTTP verbs, a property exploited by WebDAV to provide filesystem semantics atop HTTP.

Safe requests are read-only (GET, HEAD): there should be no risk of modification to a resource if the request is replayed. For unsafe requests, such as a POST form submission, browsers will ask users to confirm resubmission.
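The safe/unsafe distinction can be sketched as a small check a client might run before replaying a request. This is only an illustration: SAFE_METHODS and needs_confirmation are hypothetical names, and the set holds just the two safe methods named in the text (a fuller list would also include OPTIONS and TRACE).

```python
# Assumption: only the two safe methods named in the text are listed here.
SAFE_METHODS = frozenset({"GET", "HEAD"})


def needs_confirmation(method: str) -> bool:
    """Return True if replaying a request with this method may change a resource."""
    return method.upper() not in SAFE_METHODS


for method in ("GET", "HEAD", "POST", "DELETE"):
    print(method, needs_confirmation(method))
```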

WebDAV

WebDAV, defined in RFC 4918, provides a virtual file system atop HTTP. It reuses the existing HTTP methods:

  • GET
  • HEAD
  • POST
  • PUT
  • DELETE

It also extends HTTP with domain-specific methods:

  • OPTIONS (a standard HTTP method rather than a WebDAV addition) provides discovery of WebDAV support via the DAV response header.
  • COPY copies an existing resource to a new location.
  • MOVE moves an existing resource to a new location.
  • MKCOL creates a collection.
  • LOCK locks a resource.
  • UNLOCK unlocks a resource.
  • PROPFIND retrieves a resource's properties.
  • PROPPATCH modifies a resource's properties.

Resources

URLs

Uniform Resource Locators uniquely identify HTTP resources. Their syntax is defined in RFC 3986.

scheme://some.host[:port]/[absolute/path.ext][?query][#fragment]
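To see how the template maps onto a concrete URL, the sketch below splits one with Python's standard urllib.parse. The URL itself is made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URL that exercises every component of the template.
url = "https://some.host:8443/absolute/path.ext?query=1#fragment"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # some.host
print(parts.port)      # 8443
print(parts.path)      # /absolute/path.ext
print(parts.query)     # query=1
print(parts.fragment)  # fragment
```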

URL encoding (percent-encoding) replaces reserved or "unsafe" characters, and any bytes outside the ASCII range, with a % followed by two hexadecimal digits:

Character              Encoded form
Space                  %20
Exclamation mark (!)   %21
Double quote (")       %22
Hash (#)               %23
Dollar ($)             %24
Percent (%)            %25
Ampersand (&)          %26
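The table can be reproduced with urllib.parse.quote from the standard library. This is a minimal sketch; passing safe="" is a choice made here so that every reserved character gets escaped.

```python
from urllib.parse import quote, unquote

# safe="" asks quote() to escape every reserved character, including "/".
raw = ' !"#$%&'
encoded = quote(raw, safe="")
print(encoded)                  # %20%21%22%23%24%25%26
print(unquote(encoded) == raw)  # True

# Non-ASCII text is encoded as UTF-8 first, with one escape per byte.
print(quote("é"))               # %C3%A9
```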

Negotiation

Content types are indicated in the form of MIME types using the Content-Type response header.

Resources might be available in a number of different representations, selected via content negotiation using request headers:

  • Accept: text/html,application/xhtml+xml,*/* indicates a preferred response format of HTML, but acceptance of everything.
  • Accept-Language: en-GB indicates a preference for British English.
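A server-side view of negotiation can be sketched as follows. This is a simplified illustration, not a complete implementation of the HTTP rules: parse_accept and choose_type are hypothetical helpers, and ranges are ranked by q-value only, ignoring specificity.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, best first."""
    ranges = []
    for item in header.split(","):
        media_range, *params = (part.strip() for part in item.split(";"))
        q = 1.0  # q defaults to 1 when omitted
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((media_range, q))
    # sorted() is stable, so equal q-values keep the header's order.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def choose_type(header: str, available: List[str]) -> Optional[str]:
    """Pick the first available type that matches the client's ranges."""
    for media_range, q in parse_accept(header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        for candidate in available:
            main_type = candidate.split("/")[0]
            if media_range in (candidate, main_type + "/*", "*/*"):
                return candidate
    return None


accept = "text/html,application/xhtml+xml,*/*"
print(choose_type(accept, ["application/json", "text/html"]))  # text/html
print(choose_type(accept, ["application/json"]))               # application/json
```

The second call shows the effect of the trailing */*: with no HTML on offer, the client still accepts whatever the server has.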

Request format

GET / HTTP/1.1
Host: example.com
Connection: close

Request headers

  • Connection: keep-alive specifies that the browser wants the server to keep the connection open after sending a response so that another request can be sent.
  • Cookie: name=value; name2=value2 lets a client send previously set cookies back to the server.
  • Host: example.com allows "virtual hosting": hosting multiple websites on the same IP address.

Response format

HTTP/1.1 200 OK
Content-Length: 22
Content-Type: text/html
Date: Sat, 19 Sep 2020 21:29:29 GMT

<h1>Hello, world!</h1>
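The layout of the response can be made concrete by pulling it apart by hand. The sketch below parses the same message as raw bytes and checks that Content-Length matches the body; it is an illustration only, and real code would normally rely on an HTTP library.

```python
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 22\r\n"
    b"Content-Type: text/html\r\n"
    b"Date: Sat, 19 Sep 2020 21:29:29 GMT\r\n"
    b"\r\n"
    b"<h1>Hello, world!</h1>"
)

# A blank line (CRLF CRLF) separates the head from the body.
head, _, body = raw.partition(b"\r\n\r\n")
status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

version, status, reason = status_line.split(" ", 2)
headers = {}
for line in header_lines:
    name, _, value = line.partition(":")
    headers[name.strip().lower()] = value.strip()  # header names are case-insensitive

print(version, status, reason)                      # HTTP/1.1 200 OK
print(int(headers["content-length"]) == len(body))  # True
```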

Connections

Connection header:

  • close immediately terminates the connection upon delivery of the response.
  • keep-alive keeps the connection open once the initial request/response flow is complete, allowing reuse.
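Connection reuse can be observed with a throwaway local server. The sketch below uses only the standard library; the Handler class and its seen_ports list are hypothetical names introduced for the demonstration, and the server listens on the loopback interface only.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 connections are persistent by default
    seen_ports = []                # client source port recorded for each request

    def do_GET(self):
        Handler.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        # Content-Length tells the client where the body ends, so the
        # connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for _ in range(2):
    conn.request("GET", "/")
    conn.getresponse().read()  # drain the body before reusing the connection
conn.close()

server.shutdown()
server.server_close()

# Both requests should arrive from the same client port, i.e. over one TCP connection.
print(Handler.seen_ports[0] == Handler.seen_ports[1])
```

Sending Connection: close with the first request would instead force the client to open a new connection for the second.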

User agents (browsers) will aggressively download resources over multiple open connections.

Pipelining allows a user agent to request multiple resources over a single HTTP connection without waiting for each response. The server returns the responses in the same order as the requests.

Proxies

  • Forward proxies relay requests to external servers.
    • As content filters
    • For privacy, e.g. stripping Referer headers
  • Reverse proxies are closer to the server and are typically invisible to clients.
    • Compression
    • Load balancing

Cookies

Cookies extend HTTP with a basic form of session state. They're typically used to store user preferences or authentication tokens. A token can be matched to a server-side session, which lets the stored state exceed the roughly 4KB size limit on a cookie.

Servers can set cookies in response headers:

Set-Cookie: name=value; domain=.example.com; path=/

The client will send relevant cookies in requests:

Cookie: name=value

Cookies can be set with a number of optional parameters:

  • path=/ constrains the cookie to the given path and everything below it. The root (/) covers the whole site, but the value could indicate a specific resource.
  • domain=.example.com determines the relevant domain. The leading . allows access to example.com and subdomains.
  • HttpOnly prevents access to the cookie via JavaScript's document.cookie, limiting cookie theft through XSS.
  • Secure only sends the cookie over encrypted transports (HTTPS).
  • SameSite=Strict only sends the cookie with requests that originate from the same site, mitigating CSRF. Lax also sends it on top-level navigations to the site that use a safe method, such as following a link. None allows third-party (cross-site) use and must be combined with Secure.
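Both headers can be produced and parsed with http.cookies from the standard library. This is a minimal sketch: the cookie names and values are made up, and the samesite attribute assumes Python 3.8 or later.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header value with the optional parameters.
jar = SimpleCookie()
jar["name"] = "value"
jar["name"]["domain"] = ".example.com"
jar["name"]["path"] = "/"
jar["name"]["httponly"] = True
jar["name"]["secure"] = True
jar["name"]["samesite"] = "Strict"
header_value = jar["name"].OutputString()
print("Set-Cookie:", header_value)

# Server side, on a later request: parse the Cookie header the client sent back.
received = SimpleCookie()
received.load("name=value; theme=dark")
print(received["name"].value)   # value
print(received["theme"].value)  # dark
```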

Authentication

HTTP performs authentication via a challenge-response sequence.

  1. Client requests GET /private/resource.
  2. Server responds with 401 Unauthorized, setting the WWW-Authenticate header.
  3. The browser prompts for authentication credentials, which are then transmitted to the server in the Authorization header.

Two common authentication schemes are Basic and Digest.

Basic

Basic authentication sends a username and password over the wire with every request, base64-encoded but otherwise in cleartext. The server will request authentication as follows:

WWW-Authenticate: Basic realm="Protected area"

The client should respond with the base64-encoded username and password pair, delimited by a colon (admin:hunter2 in the example below):

Authorization: Basic YWRtaW46aHVudGVyMg==
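The header value can be derived with the standard base64 module. A minimal sketch using the admin:hunter2 pair; note that the credentials are encoded without a trailing newline.

```python
import base64

username, password = "admin", "hunter2"

# Join with a colon, then base64-encode the UTF-8 bytes (no trailing newline).
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
header = f"Authorization: Basic {token}"
print(header)  # Authorization: Basic YWRtaW46aHVudGVyMg==

# Base64 is an encoding, not encryption: anyone can reverse it.
print(base64.b64decode(token).decode("utf-8"))  # admin:hunter2
```

Because the encoding is trivially reversible, Basic authentication is only reasonable over an encrypted transport.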

Digest

Digest authentication goes some way toward protecting the password in transit by having the client compute an MD5 hash of the password combined with a nonce value supplied by the server.

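In the simplest case, where no qop is in use, the hash is computed using the following sequence. With qop=auth, the final step also mixes in a nonce count, a client nonce and the qop value.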

ha1 = md5(username:realm:password)
ha2 = md5(method:digestUri)
response = md5(ha1:nonce:ha2)
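The sequence translates directly into Python with hashlib. This sketch follows the three steps exactly as written here, so it covers the no-qop form only; md5_hex and digest_response are hypothetical helper names, and every input value is made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, digest_uri, nonce):
    """Compute the response hash using the three steps from the text."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{digest_uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Illustrative values only; the nonce would normally come from the server's challenge.
response = digest_response(
    "admin", "Protected area", "hunter2", "GET", "/private/resource", "abc123"
)
print(response)
```

Changing the nonce changes the response, which is what stops a captured hash from being replayed against a fresh challenge.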

The server's challenge looks like:

WWW-Authenticate: Digest realm="Protected area", qop="auth,auth-int", nonce="", opaque=""

And the hash is returned via the following request:

Authorization: Digest xxx

Patterns

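URL canonicalisation ensures a service is accessed only via a single host, typically by redirecting requests for alternate hosts (such as a www. prefix) to the canonical one.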
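One way to sketch canonicalisation is a helper that works out where to redirect a request that arrived on a non-canonical host. CANONICAL_HOST and canonical_redirect are hypothetical names, and the example assumes any port or userinfo in the original URL can be dropped.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "example.com"  # hypothetical canonical host


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    parts = urlsplit(url)
    if parts.hostname == CANONICAL_HOST:
        return None
    # Swap only the host; keep the scheme, path, query and fragment.
    return urlunsplit(parts._replace(netloc=CANONICAL_HOST))


print(canonical_redirect("https://www.example.com/docs?page=2"))  # https://example.com/docs?page=2
print(canonical_redirect("https://example.com/docs"))             # None
```

A server would typically send the returned URL in the Location header of a permanent redirect.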

Children
  1. Caching
  2. Extended Log file format
  3. GraphQL
  4. Kong
  5. Media (MIME) types
  6. OpenResty
  7. REST
  8. Response statuses
  9. SOAP
  10. W3C Trace Context
  11. nginx
