Recommendations for production

When running a tomodachi service in a production environment, it's important to ensure that the service is set up correctly to handle the demands and constraints of a live system. Here are some recommended options and operating practices to make running the services a breeze.

The preferred way of operating a tomodachi service

  • Go for a Dockerized environment if possible, preferably orchestrated (with Kubernetes, for example) to handle automated scaling events that meet the demand of incoming requests and/or event queues.

  • Make sure that a SIGTERM signal is passed to the Python process when a pod is scheduled for termination, to give it time to gracefully stop listeners and consumers and to finish active handler tasks.

    • This should work automatically for services in Docker if the CMD statement in your Dockerfile is starting the tomodachi service directly.
    • In case a script is used in CMD, you might need to trap signals and forward them to the service process.
  • To give services the time to gracefully complete active handler executions and shut down, make sure that the orchestration engine waits at least 30 seconds after sending the SIGTERM before removing the pod.

    • For extra compatibility in k8s and to get around most kinds of edge cases involving intermittent timeouts and problems with ingress connections, set the pod spec terminationGracePeriodSeconds to 90 seconds and use a preStop lifecycle hook of 20 seconds.

      spec:
        terminationGracePeriodSeconds: 90
        containers:
          - name: service  # placeholder container name
            lifecycle:
              preStop:
                exec:
                  command: ["/bin/sh", "-c", "sleep 20"]
      
  • If your service exposes inbound network access to HTTP handlers for users or API clients, then it's usually preferred to put some kind of ingress (nginx, haproxy or other type of load balancer) in front to proxy connections to the service pods.

    • Let the ingress handle public TLS, HTTP/2 and HTTP/3, client-facing keep-alives and WebSocket protocol upgrades, and let the service handler just take care of the business logic.

    • Use HTTP options such as the ones in this service to have the service rotate keep-alive connections so that ingress connections don't stick to the old pods after a scaling event.
      If keep-alive connections from ingresses to services stick for too long, the new replicas added when scaling out won't get their balanced share of the requests and the old pods will continue to receive most of the requests.

      import tomodachi

      class Service(tomodachi.Service):
          name = "service"
          options = tomodachi.Options(
              http=tomodachi.Options.HTTP(
                  port=80,
                  content_type="application/json; charset=utf-8",
                  real_ip_from=["127.0.0.1/32", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"],
                  keepalive_timeout=10,
                  max_keepalive_time=30,
              )
          )
      
  • Use a JSON log formatter such as the one enabled via --logger json (or env variable TOMODACHI_LOGGER=json) so that the log entries can be picked up by a log collector.

  • Always start the service with the --production CLI argument (or set the env variable TOMODACHI_PRODUCTION=1) to disable the file watcher that restarts the service on file changes, and to hide the start banner so it doesn't end up in log buffers.

  • Not related to tomodachi directly, but always remember to collect the log output and monitor your instances or clusters.
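
For the case mentioned above where CMD starts a wrapper script rather than the service itself, the following is a minimal sketch of one way to forward signals, assuming the wrapper is written in Python. The function name run_and_forward and the entrypoint file name are hypothetical and not part of tomodachi.

```python
import signal
import subprocess
import sys


def run_and_forward(cmd):
    """Start cmd as a child process and relay SIGTERM/SIGINT to it.

    Returns the child's exit code once it has stopped.
    """
    child = subprocess.Popen(cmd)

    def relay(signum, _frame):
        # Pass the signal on so the service can begin its graceful shutdown.
        child.send_signal(signum)

    # Install the relay and remember the previous handlers so they can be restored.
    previous = {s: signal.signal(s, relay) for s in (signal.SIGTERM, signal.SIGINT)}
    try:
        return child.wait()
    finally:
        for s, handler in previous.items():
            signal.signal(s, handler if handler is not None else signal.SIG_DFL)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage:
    #   python entrypoint.py tomodachi run service/app.py --production
    sys.exit(run_and_forward(sys.argv[1:]))
```

If the wrapper does not need to outlive the service, replacing the wrapper process with the service process (for example with exec in a shell script) avoids the need for forwarding altogether.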

Recommended arguments to tomodachi run in the CLI

tomodachi run service/app.py --loop uvloop --production --log-level warning --logger json

In this example:

  • --loop uvloop: This argument sets the event loop implementation to uvloop, which is known to be faster than the default asyncio loop. This can help improve the performance of your service. However, you should ensure that uvloop is installed in your environment before using this option.

  • --production: This argument disables the file watcher that restarts the service on file changes and hides the startup info banner. This is important in a production environment where you don't want your service to restart every time a file changes. It also helps to reduce unnecessary output in your logs.

  • --log-level warning: This argument sets the minimum log level to warning. In a production environment, you typically don't want to log every single detail of your service's operation. By setting the log level to warning, you ensure that only important messages are logged.

    • If your infrastructure supports rapid collection of log entries and you see a clear benefit in including info-level logs, it makes sense to use --log-level info instead of filtering at warning and above.
  • --logger json: This argument sets the log formatter to output logs in JSON format. This is useful in a production environment where you might have a log management system that can parse and index JSON logs for easier searching and analysis.
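
Since --loop uvloop requires uvloop to be installed, a launcher could check for it before adding the flag. The sketch below assumes that leaving out --loop falls back to the default asyncio loop; the helper names module_available and build_run_command are illustrative only.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def build_run_command(service_path="service/app.py"):
    """Assemble the recommended command line, adding uvloop only when installed."""
    command = ["tomodachi", "run", service_path]
    if module_available("uvloop"):
        command += ["--loop", "uvloop"]
    command += ["--production", "--log-level", "warning", "--logger", "json"]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_run_command()))
```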

Environment variables

You can also set these options using environment variables. This can be useful if you're deploying your service in a containerized environment like Docker or Kubernetes, where you can set environment variables in your service's configuration. Here's how you would set the same options using environment variables:

export TOMODACHI_LOOP=uvloop
export TOMODACHI_PRODUCTION=1
export TOMODACHI_LOG_LEVEL=warning
export TOMODACHI_LOGGER=json

tomodachi run service/app.py

By using environment variables, you can easily change the configuration of your service without having to modify your code or your command line arguments. This can be especially useful in a CI/CD pipeline where you might want to adjust your service's configuration based on the environment it's being deployed to.
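
As a rough illustration of per-environment configuration, a deployment script might assemble the variables before starting the service. The variable names are the ones listed above; the environment names, the ENVIRONMENT_OVERRIDES table and the helper functions are assumptions made for this sketch.

```python
import os
import subprocess

# Hypothetical per-environment overrides; adjust to your own pipeline.
ENVIRONMENT_OVERRIDES = {
    "staging": {"TOMODACHI_LOG_LEVEL": "info"},
    "production": {"TOMODACHI_LOG_LEVEL": "warning"},
}


def tomodachi_env(environment, base=None):
    """Return a copy of the base environment with the TOMODACHI_* settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(
        {
            "TOMODACHI_LOOP": "uvloop",
            "TOMODACHI_PRODUCTION": "1",
            "TOMODACHI_LOGGER": "json",
        }
    )
    env.update(ENVIRONMENT_OVERRIDES[environment])
    return env


def run_service(environment, service_path="service/app.py"):
    """Start the service with the settings for the given environment."""
    return subprocess.run(
        ["tomodachi", "run", service_path],
        env=tomodachi_env(environment),
        check=False,
    )
```

Calling run_service("staging") would then start the same service file with info-level logging, while run_service("production") keeps the warning level.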