Hello everyone!
I’m an engineer developing software based on Kubernetes and I’m happy to write my first post here.
The reason I’m posting here is to ask for advice on developing and sharing a logging system.
There are many great log-related projects in the Kubernetes ecosystem.
But the logging system I came up with is a little different from commonly used systems.
I provide this system to in-house developers (we run a Kubernetes cluster) and find it useful myself for tracking logs, but I’d like to take your advice and make it a more general-purpose project.
Background
- As I understand it, most logging systems are centralized, and the resulting compute and storage costs are high.
- As a result, many people spend a lot of time trimming unnecessary logs to keep costs down.
- Centralized systems also run into cardinality problems.
- Initial costs may be high because a large disk is required.
- Projects that collect logs and projects that store and index them are mostly separate (flb, elk…).
- This separation works well from a microservice perspective, but sometimes an all-in-one package that can both store and query, like Prometheus, is needed.
What I’m trying to make
- Low-cost decentralized logging system
- I think that centralized and decentralized logging systems have opposite strengths and weaknesses.
- The initial idea of the project was to reduce the costs that come with centralization, including setup costs.
- A lightweight and simple logging system based on Kubernetes
- Tracks and stores logs from k8s container stdout/stderr or from log files in a volume.
- Think of a model similar to rsyslog, but capable of storing and querying logs directly.
- Queries use a simple grep-level interface with RE2 regular expression syntax (see Syntax · google/re2 Wiki · GitHub).
- All logs are stored locally on the nodes of the Kubernetes cluster.
- Each log chunk contains timestamps, so queries can be limited to a time range.
- Users can select log targets based on Kubernetes objects.
- Targets include cluster, namespace, pod label, set, pod, container, etc.
- Users can search container logs across all clusters at once by namespace.
- This system operates as a DaemonSet in Kubernetes.
- Short retention, plus external transfer and metrics functions
- Most logs are meaningless, but they should be traceable when a problem occurs.
- Retention can be set based on the local storage size of the cluster node, so the system needs only the nodes that already run Kubernetes and no separate disk.
- If a user defines the configuration as a custom resource, the system creates a sink that computes log metrics and exports logs.
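
To make the query and retention ideas above a bit more concrete, here is a minimal sketch in Python. It is not the project's code: the `Chunk` type, the `query` and `enforce_retention` functions, and all field names are hypothetical, chosen only for illustration. Python's `re` module stands in for RE2 here, and the pattern in the usage example is valid in both. The sketch assumes each chunk records the timestamps of its first and last line, and that retention is a simple byte budget per node.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class Chunk:
    """Hypothetical log chunk: lines from one container plus its time bounds."""
    namespace: str
    pod: str
    container: str
    start: float  # Unix seconds of the first line in the chunk
    end: float    # Unix seconds of the last line in the chunk
    lines: List[str] = field(default_factory=list)

    def size(self) -> int:
        """Size of the chunk's lines in bytes (UTF-8)."""
        return sum(len(line.encode("utf-8")) for line in self.lines)


def query(chunks: Iterable[Chunk], pattern: str, since: float, until: float,
          namespace: Optional[str] = None) -> Iterator[str]:
    """Yield lines matching `pattern` from chunks that overlap [since, until].

    Time filtering is done per chunk, not per line, so a matching line from
    an overlapping chunk is returned even if that line is outside the range.
    """
    regex = re.compile(pattern)
    for chunk in chunks:
        if namespace is not None and chunk.namespace != namespace:
            continue
        if chunk.end < since or chunk.start > until:
            continue  # chunk lies entirely outside the requested time range
        for line in chunk.lines:
            if regex.search(line):
                yield line


def enforce_retention(chunks: Iterable[Chunk], max_bytes: int) -> List[Chunk]:
    """Drop the oldest chunks until the total size fits within `max_bytes`."""
    kept = sorted(chunks, key=lambda c: c.start)
    total = sum(c.size() for c in kept)
    while kept and total > max_bytes:
        total -= kept.pop(0).size()
    return kept


if __name__ == "__main__":
    chunks = [
        Chunk("shop", "api-0", "app", 100.0, 199.0,
              ["GET /health 200", "error: db timeout"]),
        Chunk("shop", "api-0", "app", 200.0, 299.0,
              ["error: retry failed", "GET /cart 200"]),
        Chunk("infra", "dns-0", "dns", 200.0, 299.0,
              ["error: upstream unreachable"]),
    ]
    for line in query(chunks, r"error: \w+", 150.0, 400.0, namespace="shop"):
        print(line)
```

In a DaemonSet, each node would run this kind of query only over its own local chunks, and a search across the cluster would fan out to every node and merge the results. Evicting the oldest chunk first is just one possible retention policy.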
Thanks for reading this far.
I kept this brief, so some details may be missing, but please leave a comment and I’ll keep checking.
As I wrote at the beginning of the post, I would like some advice on whether this system could work well for general-purpose use, and what it would need to get there.