Inspiration
We looked at all stages of software development and decided to focus specifically on code in production, since there seemed to be a lot of room for improvement. There's not much you can do to monitor code in production without significantly slowing it down, so we turned to server logs. Some libraries are nice and provide clean output that can be processed with regexes, but sadly this is nowhere near universal. Server logs are generally unstructured, unlabeled text data, so we looked to ML and NLP to potentially make some sense of them.
What it does
Lumberjack allows you to monitor Docker containers. Using Socket.io, Lumberjack streams Docker containers' log output and monitors metrics like CPU and RAM usage. It then plugs these log outputs into an unsupervised clustering model to check for outliers. Additionally, OpenAI GPT integration can help explain any log message. If a message is an error, Lumberjack will also attempt to suggest possible debugging steps.
How we built it
Machine Learning
The pipeline, roughly, is:
1. Data Collection -> 2. Vectorization (FastText) -> 3. Normalization -> 4. Dimensionality Reduction (SVD) -> 5. Clustering (MiniBatchKMeans) -> 6. Outlier Detection
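The pipeline above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not Lumberjack's actual code: we stand in for FastText with a character n-gram HashingVectorizer (the real project embeds logs with FastText), and the sample log lines are invented.

```python
# Sketch of the log pipeline: vectorize -> normalize -> reduce -> cluster.
# Assumes scikit-learn; HashingVectorizer is a stand-in for FastText.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.preprocessing import Normalizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import MiniBatchKMeans
from sklearn.pipeline import make_pipeline

logs = [
    "GET /index.html 200",
    "GET /style.css 200",
    "GET /app.js 200",
    "POST /login 200",
    "FATAL: out of memory in worker 3",
]

pipeline = make_pipeline(
    HashingVectorizer(analyzer="char_wb", ngram_range=(2, 4),
                      n_features=2**12),            # 2. vectorization
    Normalizer(),                                   # 3. normalization
    TruncatedSVD(n_components=3, random_state=0),   # 4. dimensionality reduction
    MiniBatchKMeans(n_clusters=2, n_init=3,
                    random_state=0),                # 5. clustering
)
labels = pipeline.fit_predict(logs)  # one cluster label per log line
```

Each log line ends up with a cluster label; lines that sit far from every centroid are candidates for step 6, outlier detection.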
The anomaly detection model is implemented using a MiniBatchKMeans clustering algorithm, which is a variation on K-means. This allows for, by ML standards, very fast clustering and incremental learning.
After the model is initially trained, it's possible to incrementally fit new data, with the option of retraining the vectorizer. It's also possible to simply evaluate new information.
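One way to realize the outlier detection and incremental fitting described above, sketched with scikit-learn on synthetic vectors (the threshold choice and the distance-to-nearest-centroid score are our own illustration, not necessarily Lumberjack's exact method; feature extraction is assumed to have already happened):

```python
# Outlier scoring + incremental learning with MiniBatchKMeans.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(200, 8))  # stand-in "normal" log vectors
model = MiniBatchKMeans(n_clusters=4, n_init=3, random_state=0).fit(train)

def outlier_scores(model, X):
    # transform() gives distances to every centroid; the distance to
    # the nearest centroid serves as a simple anomaly score
    return model.transform(X).min(axis=1)

# Flag anything farther from its centroid than 99% of the training data
threshold = np.percentile(outlier_scores(model, train), 99)

new_logs = np.vstack([rng.normal(0.0, 1.0, size=(5, 8)),
                      np.full((1, 8), 25.0)])  # one obvious outlier
flags = outlier_scores(model, new_logs) > threshold

# Incrementally fold the non-outlier vectors into the model
model.partial_fit(new_logs[~flags])
```

`partial_fit` is what makes incremental learning cheap here: the centroids are nudged by each new batch without retraining from scratch.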
Backend
We built the backend with a Flask server and some Socket.io sockets. We also integrated the OpenAI API, but the bulk of the backend was GET requests rerouted to the Docker daemon's Unix socket through the Python bindings of the Docker SDK.
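The GET-request-to-Docker-socket pattern can be sketched as below. The `make_app` factory and the route path are our own illustration (Lumberjack's actual routes may differ); injecting the Docker SDK client lets the app run against `docker.from_env()` in production or a stub in tests.

```python
# Minimal Flask route that proxies a GET request to the Docker SDK.
from flask import Flask, jsonify

def make_app(docker_client):
    """Build the app around an injected Docker SDK client
    (docker.from_env() in production; a stub for testing)."""
    app = Flask(__name__)

    @app.route("/containers/<cid>/logs")
    def container_logs(cid):
        container = docker_client.containers.get(cid)
        # The Docker SDK returns bytes; tail keeps the payload small
        return jsonify({"logs": container.logs(tail=100).decode()})

    return app
```

In production this would be created as `make_app(docker.from_env())`, with the Docker daemon's Unix socket doing the real work behind the SDK call.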
Frontend
For the frontend, we mainly used React, with Tailwind for styling. HTTP requests were handled with Axios, and streaming data like logs and CPU/RAM metrics was sent over Socket.io.
Challenges we ran into
Finding an ML solution that works well on large volumes of text in real time was particularly challenging. Additionally, logs vary enormously in structure, and there's not much high-quality or labeled log data available out there.
Accomplishments that we're proud of
Everything actually works!! We felt this project was pretty ambitious in scope, with all the different technologies involved, but it all worked out :))) We're especially proud of the machine learning part, and of getting genuinely decent results out of both test data and the logs we generated while working on the project. We weren't even initially sure that log parsing was a problem ML could address.
What we learned
We tried a lot of different machine learning approaches to the problem, including different vectorizers, models such as isolation forests and transformers, and dimensionality reduction techniques. We learned a lot about the ML pipeline and how to actually deploy ML in a somewhat-production environment.
What's next for Lumberjack
One of our major goals for Lumberjack is simply to test our NLP algorithms on significantly more data. We didn't really get to work with a service container for an extended period of time, so we'd like to see what the experience of using Lumberjack would be like long term. We'd also like to work more on properly integrating the incremental learning aspect.
In the future, we'd like to expand beyond Docker to create an application that can handle more server configurations. One way we might go about this is simply pointing at logfiles, but there are many different avenues for us to explore.