A common mistake is shipping your entire build environment — compilers, package managers, SDKs — in your production Docker image.

A Go application needs the Go compiler to build. But once it is built, the binary runs without Go installed at all. Why include 1 GB of Go tooling in an image that only needs a 10 MB binary?

Multi-stage builds solve this. They let you compile your app in one container and copy only the final artifact into a minimal production image.

The Problem: Fat Production Images

Here is a Dockerfile without multi-stage builds for a Go app:

# Single-stage: bad for production
FROM golang:1.22

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN go build -o myapp .

CMD ["./myapp"]

The resulting image: ~1 GB. It includes the entire Go SDK, build cache, and source code — none of which are needed at runtime.

Multi-stage Builds: The Solution

A multi-stage Dockerfile has multiple FROM statements. Each FROM starts a new stage, which you can name with AS. You can copy files from any earlier stage into a later one using COPY --from=stagename.

Only the last stage ends up in the final image. Earlier stages are discarded.
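
To isolate the mechanism before the language-specific examples, here is a minimal sketch that involves no compiler at all. The file name /artifact.txt is a placeholder chosen for illustration, not something from a real project.

```dockerfile
# Stage 1: named "builder"; anything created here stays in this stage
FROM alpine:3.21 AS builder
RUN echo "built artifact" > /artifact.txt

# Stage 2: the final image; only files copied explicitly arrive here
FROM alpine:3.21
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

If you want to inspect an intermediate stage, docker build --target builder . stops the build at that stage and tags its result instead of the final one.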

Go: Binary in a Minimal Container

# Stage 1: Build
FROM golang:1.22-alpine AS builder

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o myapp .

# Stage 2: Production image
FROM scratch

# Copy only the compiled binary
COPY --from=builder /app/myapp /myapp

# Copy TLS certificates (needed for HTTPS calls)
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

EXPOSE 8080
CMD ["/myapp"]

Image size comparison:

  • Single-stage with golang:1.22: ~1 GB
  • Multi-stage with scratch: ~8 MB

The final image is built on scratch — an empty base image. It contains only your binary and the CA certificates.

What CGO_ENABLED=0 and -ldflags="-s -w" mean

  • CGO_ENABLED=0: disables cgo (C bindings). This produces a fully static binary that runs on scratch, which has no C library.
  • -ldflags="-s -w": -s omits the symbol table and debug information, -w omits the DWARF debug data. Together they can reduce binary size by up to 30-40%, depending on the program.

Rust: Distroless for Safety

Rust also compiles to a single native binary, but the build environment is large:

# Stage 1: Build
FROM rust:1.87-slim AS builder

WORKDIR /app
COPY Cargo.toml Cargo.lock ./

# Cache dependencies by building a placeholder project first
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release

# Now build the real application.
# touch bumps the mtime so Cargo rebuilds the crate instead of reusing the placeholder build.
COPY src ./src
RUN touch src/main.rs && cargo build --release

# Stage 2: Production image (distroless)
FROM gcr.io/distroless/cc-debian12

COPY --from=builder /app/target/release/myapp /myapp

EXPOSE 8080
CMD ["/myapp"]

Why distroless/cc instead of scratch?

Rust binaries built for the default (glibc) target link against libc dynamically; only a musl target produces a fully static binary. scratch has no libc, so a dynamically linked binary cannot start there. distroless/cc ships glibc, libgcc, and CA certificates but nothing else: no shell, no package manager, no utilities.

Image size comparison:

  • Rust builder image: ~1.4 GB
  • Final distroless image: ~25 MB
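
If you would rather ship the Rust binary on scratch, the musl route mentioned above might look like the sketch below. It assumes an x86_64 build host, a binary named myapp as in the example above, and dependencies that build cleanly against musl; crates that link system libraries such as OpenSSL usually need extra setup that is not shown here.

```dockerfile
# Stage 1: build for the musl target (assumes an x86_64 build host)
FROM rust:1.87-slim AS builder

# musl-tools provides musl-gcc, needed when a dependency compiles C code
RUN apt-get update \
    && apt-get install -y --no-install-recommends musl-tools \
    && rm -rf /var/lib/apt/lists/*
RUN rustup target add x86_64-unknown-linux-musl

WORKDIR /app
COPY Cargo.toml Cargo.lock ./
COPY src ./src
RUN cargo build --release --target x86_64-unknown-linux-musl

# Stage 2: a static binary needs no libc, so scratch is enough
FROM scratch
COPY --from=builder /app/target/x86_64-unknown-linux-musl/release/myapp /myapp

EXPOSE 8080
CMD ["/myapp"]
```

Note that the output path changes: with an explicit --target, Cargo writes the binary under target/<triple>/release/ rather than target/release/.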

Python: Slim Production Image

Python is interpreted, so you cannot use scratch. But you can still slim down significantly:

# Stage 1: Build dependencies
FROM python:3.12-slim AS builder

WORKDIR /app
COPY requirements.txt .

# Install dependencies into /install directory
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: Production image
FROM python:3.12-slim

WORKDIR /app

# Copy installed packages from builder
COPY --from=builder /install /usr/local

# Copy application code
COPY . .

# Run as non-root
RUN useradd --uid 1001 --no-create-home appuser
USER appuser

EXPOSE 8080
CMD ["gunicorn", "main:app", "--bind", "0.0.0.0:8080"]

Why two stages for Python?

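The builder stage installs dependencies, which may involve compiling C extensions. Note that python:3.12-slim does not ship gcc, so if a package has no prebuilt wheel you need to install build tools in the builder stage. The production stage starts from a clean python:3.12-slim, so anything installed only in the builder never reaches it; only the installed packages are copied over.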
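
When a dependency does need a compiler, a builder stage along these lines is one way to provide it. This is a sketch: build-essential is a broad choice, and your packages may need additional -dev headers that are not shown here.

```dockerfile
# Stage 1: builder with a compiler toolchain for C extensions
FROM python:3.12-slim AS builder

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: same base image, but gcc and headers never reach it
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .

EXPOSE 8080
CMD ["gunicorn", "main:app", "--bind", "0.0.0.0:8080"]
```

Compiled extensions that link against shared system libraries at runtime still need those libraries installed in the final stage.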

Java/Kotlin (Ktor): JRE-only Production Image

Kotlin/Ktor apps need the JDK to build but only the JRE to run:

# Stage 1: Build with Gradle
FROM gradle:8.14-jdk21 AS builder

WORKDIR /app
COPY build.gradle.kts settings.gradle.kts ./
COPY gradle ./gradle

# Download dependencies first (cache optimization)
RUN gradle dependencies --no-daemon

# Build the application
COPY src ./src
RUN gradle shadowJar --no-daemon

# Stage 2: Minimal JRE runtime
FROM eclipse-temurin:21-jre-alpine

WORKDIR /app
COPY --from=builder /app/build/libs/myapp-all.jar myapp.jar

# -D skips the password prompt, which would otherwise fail in a non-interactive build
RUN addgroup -g 1001 appgroup && \
    adduser -u 1001 -G appgroup -H -D appuser
USER appuser

EXPOSE 8080
CMD ["java", "-jar", "myapp.jar"]

Image size comparison:

  • Full Gradle JDK builder: ~600 MB
  • Final JRE runtime: ~85 MB

BuildKit Cache Mounts: Faster Repeated Builds

BuildKit (available since Docker Engine 18.09 and the default builder since 23.0) supports cache mounts via RUN --mount=type=cache. A cache mount persists a directory, such as a dependency cache, across builds without adding it to the image:

# Go with BuildKit cache for module downloads
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./

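# Cache the Go module cache between builds.
# The official golang images set GOPATH=/go, so modules live in /go/pkg/mod.
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

COPY . .
# Mount both the module cache and the compiler's build cache
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -o myapp .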

With cache mounts, repeated builds reuse downloaded modules and previously compiled packages, so only changed packages are recompiled. On large projects this can reduce build time by as much as 80%.
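
The same idea carries over to other package managers. As a sketch, here is the builder stage from the Python example with pip's cache mounted; it assumes pip runs as root, so its default cache directory is /root/.cache/pip.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .

# Keep pip's download and wheel cache between builds.
# No --no-cache-dir here: that flag would make pip ignore the mounted cache.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --prefix=/install -r requirements.txt
```

Because the cache lives in a mount rather than an image layer, dropping --no-cache-dir does not make the image any larger.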

Multi-platform Builds with docker buildx

docker buildx is Docker's CLI plugin for BuildKit. It ships with Docker Desktop and is the default build client as of Docker Engine 23.0. One of its main uses is building images for multiple CPU architectures.

For example, build an image for both AMD64 (Intel/AMD servers) and ARM64 (Apple Silicon, AWS Graviton) from your Mac:

# Create a builder that supports multi-platform
docker buildx create --use --name multiplatform

# Build for both architectures and push to Docker Hub
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t kemalcodes/myapp:1.0 \
  --push \
  .

This is essential if you develop on an Apple Silicon Mac but deploy to AMD64 servers.
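
By default, a stage built for a foreign architecture runs under emulation, which can be slow. For Go, one common alternative is to run the compiler natively and cross-compile. The sketch below adapts the Go Dockerfile from earlier; BUILDPLATFORM, TARGETOS, and TARGETARCH are build arguments that BuildKit sets automatically for each requested platform.

```dockerfile
# syntax=docker/dockerfile:1
# Run the build stage on the build machine's native platform (no emulation)
FROM --platform=$BUILDPLATFORM golang:1.22-alpine AS builder

# Filled in by BuildKit for each entry in --platform
ARG TARGETOS
ARG TARGETARCH

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# Cross-compile for the requested target platform
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH \
    go build -ldflags="-s -w" -o myapp .

FROM scratch
COPY --from=builder /app/myapp /myapp
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

EXPOSE 8080
CMD ["/myapp"]
```

The docker buildx build command stays the same; only the Dockerfile changes. This approach relies on CGO_ENABLED=0, since cross-compiling with cgo needs a C cross-toolchain.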

Choosing the Right Base Image for Final Stage

Use Case               | Final Stage Image                | Pros
-----------------------|----------------------------------|--------------------------------
Go/Rust static binary  | scratch                          | 0 MB overhead, maximum security
Go/Rust dynamic binary | gcr.io/distroless/cc-debian12    | Minimal libc, no shell
Python/Node.js         | python:3.12-slim or node:22-slim | Small, has runtime
Java/Kotlin            | eclipse-temurin:21-jre-alpine    | Small JRE only
Debugging needed       | alpine:3.21                      | Has shell and package manager

Common Mistakes

Shipping the full SDK in production

If your production image starts with FROM golang:1.22 and does not use multi-stage builds, you are shipping the entire Go SDK to production. Use multi-stage builds for any compiled language.

Copying files that are not needed in the final stage

Only copy what you need in the final stage. Do not do COPY --from=builder /app /app if you only need the binary. Be specific: COPY --from=builder /app/myapp /myapp.

Not using CGO_ENABLED=0 for Go on scratch

If you try to run a Go binary compiled with CGO on a scratch image, it will fail because scratch has no C library. Always use CGO_ENABLED=0 when targeting scratch.
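
If your program genuinely needs cgo, the fix is to change the final image rather than the flag. A sketch, assuming the Debian-based golang:1.22 image as builder so that the glibc version lines up with the distroless Debian 12 runtime:

```dockerfile
# Stage 1: build with cgo enabled (the Debian-based golang image includes gcc)
FROM golang:1.22 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=1 go build -o myapp .

# Stage 2: a runtime with glibc instead of scratch
FROM gcr.io/distroless/cc-debian12
COPY --from=builder /app/myapp /myapp

EXPOSE 8080
CMD ["/myapp"]
```

Any additional shared libraries your C code links against would also have to be present in the final stage.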

What’s Next?

Next: Docker Tutorial #8: Docker for Development Workflows