Asking for advice on open-sourcing a logging system

Hello everyone 🙂

I’m an engineer developing software based on Kubernetes and I’m happy to write my first post here.

The reason I’m posting here is to ask for advice on developing and sharing a logging system.

There are many great log-related projects in the Kubernetes ecosystem.
But the logging system I came up with is a little different from commonly used systems.
I provide this system to our in-house developers (we run a Kubernetes cluster) and find it useful myself for tracking logs, but I’d like to take your advice and turn it into a more general-purpose project.

Background

  • As I understand it, most logging systems are centralized, which makes their compute and storage costs high.
    • As a result, many people spend a lot of time trimming unnecessary logs to keep costs down.
    • Centralized systems also run into cardinality problems.
    • Initial costs can be high because a large disk has to be provisioned up front.
  • Projects that collect logs and projects that store and index them are mostly separate (Fluent Bit, ELK…).
    • This is a sensible separation from a microservice perspective, but sometimes an all-in-one package that can both store and query, like Prometheus, is what you need.

What I’m trying to make

  • A low-cost, decentralized logging system
    • I think that centralized and decentralized logging systems have opposite strengths and weaknesses.
    • The initial idea of the project was to reduce the costs that come with centralization and with initial setup.
  • A lightweight and simple logging system based on Kubernetes
    • Tails and stores logs from Kubernetes container stdout/stderr or from log files in a volume.
      • Think of a model similar to rsyslog, but able to store and query logs directly.
      • Queries use a simple, grep-level interface with RE2 regular-expression syntax (see the "Syntax" page of the google/re2 wiki on GitHub).
      • All logs are stored locally on the nodes of the Kubernetes cluster.
      • Each log chunk carries timestamps, so queries can be limited to a time range.
    • Users can select log targets based on Kubernetes objects.
      • Targets include cluster, namespace, pod label, set, pod, container, etc.
  • Users can search the container logs of all clusters at once, by namespace.
    • The system runs as a DaemonSet in Kubernetes.
  • Short retention, plus export to external systems and log-based metrics
    • Most logs are meaningless, but they should be traceable when a problem occurs.
    • Retention can be determined by the local storage size of each cluster node, so the system needs only the nodes that already run Kubernetes and no separate disk.
    • If a user describes what they need in a custom resource, the system creates a sink that computes metrics from the logs and exports them.
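
To make the query and retention ideas above a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not the actual implementation: the `Chunk` class, `query`, and `enforce_retention` are hypothetical names, chunks are held in memory instead of on a node's disk, the time filter works at chunk granularity, and Python's `re` module stands in for RE2 (the simple pattern used here is valid in both).

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class Chunk:
    """Hypothetical node-local log chunk: a time span plus its raw lines."""
    namespace: str
    pod: str
    start: datetime
    end: datetime
    lines: List[str]

    @property
    def size(self) -> int:
        # Approximate size: UTF-8 bytes of every line plus one newline each.
        return sum(len(line.encode("utf-8")) + 1 for line in self.lines)


def query(chunks: List[Chunk], namespace: str, pattern: str,
          since: datetime, until: datetime) -> List[Tuple[str, str]]:
    """Grep-style search: (pod, line) pairs from chunks in `namespace`
    whose time span overlaps [since, until] and whose line matches."""
    rx = re.compile(pattern)
    hits = []
    for c in chunks:
        if c.namespace != namespace:
            continue
        # Skip chunks that lie entirely outside the requested time range.
        if c.end < since or c.start > until:
            continue
        hits.extend((c.pod, line) for line in c.lines if rx.search(line))
    return hits


def enforce_retention(chunks: List[Chunk], max_bytes: int) -> List[Chunk]:
    """Size-based retention: drop the oldest chunks until the total fits."""
    kept = sorted(chunks, key=lambda c: c.start)
    total = sum(c.size for c in kept)
    while kept and total > max_bytes:
        total -= kept.pop(0).size
    return kept


if __name__ == "__main__":
    def ts(hour: int) -> datetime:
        return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)

    demo = [
        Chunk("shop", "api-0", ts(0), ts(1), ["GET /health 200", "POST /order 500"]),
        Chunk("shop", "api-1", ts(2), ts(3), ["POST /order 500"]),
    ]
    # Search namespace "shop" for lines ending in a 5xx status, first hour only.
    print(query(demo, "shop", r"\s5\d\d$", ts(0), ts(1)))
```

In this sketch the retention limit is a byte budget per node, which is how I imagine "retention based on local storage size" working; a real agent would read the budget from the node's available disk rather than take it as an argument.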

Thanks for reading this far.

I kept this brief, so some details may be missing. Please leave a comment if anything is unclear; I’ll keep checking the thread.

As I wrote at the beginning of the post, I would like advice on whether this system could work well as a general-purpose tool, and on what it would need to get there.
