Envoy

Envoy is often compared with Nginx. The free version of Nginx is a great web server (caching, serving static files) but a poor load balancer: it has no active health checks, few balancing algorithms, and close to no metrics.

Envoy cannot serve static files and has no cache. But it is a great load balancer and API gateway, with plenty of balancer features: an API, health checks, a choice of balancing algorithms, and great metrics.
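
As a rough sketch of those balancer features, a cluster can combine a non-default balancing algorithm with an active health check. The cluster name, the backend host and the /healthz path below are made up for illustration; adjust them to your setup:

```yaml
static_resources:
  clusters:
    - name: backend              # hypothetical cluster name
      connect_timeout: 1s
      type: STRICT_DNS
      lb_policy: LEAST_REQUEST   # the default is ROUND_ROBIN
      health_checks:
        - timeout: 1s
          interval: 5s
          unhealthy_threshold: 3 # failed checks before a host is marked unhealthy
          healthy_threshold: 2   # successful checks before it is marked healthy again
          http_health_check:
            path: "/healthz"     # assumed health endpoint of the backend
      load_assignment:
        cluster_name: backend
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: backend.example.internal  # placeholder host
                      port_value: 8080
```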

Basic security

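The biggest issue for Envoy users is an admin interface that is publicly reachable. In most cases outside clients have no use for it, but it does host the metrics endpoint for Prometheus, and at an unusual location: /stats/prometheus.

We can solve both issues at once by binding the admin interface to localhost and proxying only the metrics endpoint through a separate listener: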

---
admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
    - address:
        socket_address:
          address: "::"
          ipv4_compat: true
          port_value: 9900
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: auto
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains:
                        - "*"
                      routes:
                        - match:
                            path: "/metrics"
                          route:
                            prefix_rewrite: "/stats/prometheus"
                            cluster: admin_api
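                http_filters:
                  - name: envoy.filters.http.router
                    # Recent Envoy versions require an explicit typed_config here.
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router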
                access_log:
                  - name: envoy.access_loggers.file
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                      path: /dev/null
  clusters:
    - name: admin_api
      connect_timeout: 600s
      type: LOGICAL_DNS
      load_assignment:
        cluster_name: admin_api
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 9901

Now port 9900 exposes only the metrics, at the conventional URL /metrics.
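
On the Prometheus side, a scrape job for this listener might look like the sketch below. The job name and the target host are placeholders, not something from the setup above:

```yaml
scrape_configs:
  - job_name: envoy                 # hypothetical job name
    metrics_path: /metrics          # the Prometheus default, spelled out for clarity
    static_configs:
      - targets:
          - "envoy.example.internal:9900"  # placeholder host, port from the listener above
```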

HTTP to HTTPS

static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 80
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: auto
                stat_prefix: ingress_http
                route_config:
                  virtual_hosts:
                    - name: backend
                      domains:
                        - "example.com"
                      routes:
                        - match:
                            prefix: "/"
                          redirect:
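                            https_redirect: true
                http_filters:
                  - name: envoy.filters.http.router
                    # Recent Envoy versions require an explicit typed_config here.
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router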

HTTPS

static_resources:
  listeners:
    - address:
        socket_address:
          address: "::"
          ipv4_compat: true
          port_value: 443
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: auto
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains:
                        - "*"
                      routes:
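                        - match:
                            path: "/metrics"
                          # Assumption: the same route action as in the metrics example;
                          # point it at your own cluster.
                          route:
                            prefix_rewrite: "/stats/prometheus"
                            cluster: admin_api
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router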
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: "/etc/envoy/certs/server-cert.pem"
                    private_key:
                      filename: "/etc/envoy/certs/server-key.pem"

Health-checks

Envoy has passive checks for HTTP upstreams (outlier detection on response codes 500+), but it is designed to serve swarms of microservices, so its default thresholds assume large clusters with plenty of traffic. I'd recommend reviewing the default values for max_ejection_percent, success_rate_minimum_hosts and success_rate_request_volume. My choice is:

static_resources:
  clusters:
    - name: external
      outlier_detection:
        consecutive_5xx: "3"
        base_ejection_time: "60s"
        success_rate_minimum_hosts: "2"
        max_ejection_percent: "50"
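
The block above leaves success_rate_request_volume untouched. For a small, low-traffic cluster it may make sense to lower it as well. The sketch below is only an illustration: the value 20 is arbitrary, and the defaults quoted in the comments should be checked against the documentation for your Envoy version:

```yaml
static_resources:
  clusters:
    - name: external
      outlier_detection:
        consecutive_5xx: 3              # default: 5
        base_ejection_time: 60s         # default: 30s
        max_ejection_percent: 50        # default: 10
        success_rate_minimum_hosts: 2   # default: 5
        success_rate_request_volume: 20 # default: 100; arbitrary example value
```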

Related links