Real-time alerting on hidden errors

- for young teams that use Slack

I’ve recently embarked on the adventure of launching a startup. We have all the freedom in the world to choose whatever technology takes our fancy, but our choices are guided by a desire to build a stable tool quickly. Iterating fast takes on a whole new meaning when fast is counted in days and hours instead of weeks. So I have been furiously putting our ideas into code in Python and evolving those ideas - rinse & repeat. As a team, we have had good initial success, with regular usage by Koinearth, Gradvine, and others. This is pretty amazing. We obsess about the user experience of our early adopters, as it has a disproportionate influence on shaping how Paco grows.

Catching Errors

The typical setup at established technology firms is that applications log by severity level - debug, info, warning, and error are common across many languages. These logs either go into separate files on separate machines, or they are centrally logged to one system. When the logs go into separate files and are spread across machines, most firms with the budget aggregate them using a tool such as Splunk or Humio. Then you can build rules that decide which kinds of log entries trigger which actions - ranging from creating tickets to sending emails.
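To make the severity levels concrete, here is a minimal sketch using Python's standard logging module. The logger name and the messages are made up for illustration; the point is only that a threshold set on the logger decides which messages get through.

```python
import io
import logging

# Write log output to an in-memory buffer so the example is self-contained.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("severity_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # drop anything below WARNING
logger.propagate = False  # keep the demo isolated from the root logger
logger.addHandler(handler)

logger.debug("cache miss")        # filtered out
logger.info("user logged in")     # filtered out
logger.warning("disk almost full")
logger.error("payment failed")

captured = buffer.getvalue().splitlines()
print(captured)
# ['WARNING:disk almost full', 'ERROR:payment failed']
```

The same threshold idea is what the rules in a log aggregation tool build on: decide by level (and content) what deserves attention.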

With this background, I want to dive into a little hack that has proved very useful for us. In our product, we don’t yet need to worry about aggregating logs and we don’t have the need or the budget for elaborate tooling. Given our focus on getting it right, we do want to know about any errors immediately. So here is a simple setup that I put in place, which is working well for us and might be useful for other teams too.

  1. Write a wrapper around Python’s logging module that exposes the same functions as the object returned by logging.getLogger() - debug, info, warning, error, critical

  2. Set up a new “error-reporting” channel and create a simple Slack app with permission to write to this channel. You’ll need the chat:write scope

  3. In the wrapper functions for error and critical, use slack-sdk to post the incoming messages to the new channel created above

Stripped-down code for the logging wrapper (you can use a single global object; for a discussion of the Singleton pattern using dunder new, see python-patterns.guide/gang-of-four/singleton):

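import logging
import os

from slack_sdk import WebClient

# Assumed configuration: the bot token and channel name are read from
# environment variables (these variable names are illustrative).
BOT_TOKEN = os.environ.get("SLACK_BOT_TOKEN", "")
SLACK_CHANNEL = os.environ.get("SLACK_ERROR_CHANNEL", "#error-reporting")


class Logger(object):
    # Stored on the class so the logger and Slack client are created only once.
    _logger = None
    _slack_client = None

    def __init__(self, slack_notifications: bool = False):
        self._slack_notifications = slack_notifications
        if Logger._logger is None:
            Logger._logger = logging.getLogger(__name__)
        if slack_notifications and Logger._slack_client is None:
            Logger._slack_client = WebClient(token=BOT_TOKEN)

    def format_msg(self, msg) -> str:
        # Placeholder: add hostnames, request ids, etc. here.
        return str(msg)

    def notify_on_slack(self, msg: str) -> None:
        if not self._slack_notifications:
            return
        try:
            self._slack_client.chat_postMessage(channel=SLACK_CHANNEL, text=msg)
        except Exception:
            # A Slack outage must never break the application itself.
            self._logger.exception("Could not post message to Slack")

    def debug(self, msg):
        msg = self.format_msg(msg)
        self._logger.debug(msg)

    def info(self, msg):
        msg = self.format_msg(msg)
        self._logger.info(msg)

    def warning(self, msg):
        msg = self.format_msg(msg)
        self._logger.warning(msg)

    def error(self, msg):
        msg = self.format_msg(msg)
        self._logger.error(msg)
        self.notify_on_slack(msg)

    def critical(self, msg):
        msg = self.format_msg(msg)
        self._logger.critical(msg)
        self.notify_on_slack(msg)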

Voila! This will provide easy log monitoring for the messages you care about. I am going to extend this solution to capture analytics and other events that we want to monitor as well, so we can stay on top of our game when it comes to knowing about Paco’s vitals.
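If you would rather keep calling logging.getLogger() directly, a variation on the same idea is to attach a custom logging.Handler that forwards only error and critical records. This is a sketch of that alternative, not what we run: SlackErrorHandler, FakeSlackClient, and the "paco_demo" logger name are hypothetical, and the fake client stands in for slack_sdk's WebClient so the example runs without a token or network access.

```python
import logging


class SlackErrorHandler(logging.Handler):
    """Forward ERROR and CRITICAL records to a Slack channel."""

    def __init__(self, client, channel: str):
        # The handler's own level filters out everything below ERROR.
        super().__init__(level=logging.ERROR)
        self._client = client
        self._channel = channel

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._client.chat_postMessage(
                channel=self._channel, text=self.format(record)
            )
        except Exception:
            # Standard logging hook: report the failure without raising.
            self.handleError(record)


class FakeSlackClient:
    """Stand-in for slack_sdk.WebClient; records messages instead of sending."""

    def __init__(self):
        self.sent = []

    def chat_postMessage(self, *, channel, text):
        self.sent.append((channel, text))


# In production you would pass slack_sdk.WebClient(token=...) instead.
fake_client = FakeSlackClient()
slack_handler = SlackErrorHandler(fake_client, "#error-reporting")
slack_handler.setFormatter(
    logging.Formatter("%(levelname)s %(name)s: %(message)s")
)

app_logger = logging.getLogger("paco_demo")  # hypothetical logger name
app_logger.setLevel(logging.DEBUG)
app_logger.propagate = False
app_logger.addHandler(slack_handler)

app_logger.info("started")           # below ERROR, not forwarded
app_logger.error("could not save")   # forwarded to the (fake) Slack client
print(fake_client.sent)
# [('#error-reporting', 'ERROR paco_demo: could not save')]
```

The trade-off is mostly taste: the wrapper keeps everything in one class you control, while the handler plugs into logging configuration that other libraries already understand.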

P.S. There are third-party tools out there (e.g. Sentry) that you can use off the shelf. Nothing against them. For me, the 2 hours I spent putting this in place was a better trade than paying monthly fees over the next few months.