How to Reduce Spring Boot Memory Usage?
Last updated: October 29, 2025
1. Introduction
Spring Boot enables us to create production-ready applications with features such as auto-configuration and starter dependencies. However, one of the most common complaints about Spring Boot applications is their memory footprint. Even a basic Spring Boot application with an embedded server can consume around 150 MB of memory at startup.
In this tutorial, we’ll explore why this happens and examine ways to reduce memory usage without impacting application functionality.
2. Why Does Spring Boot Use So Much Memory?
Several factors influence the amount of memory consumed. Let’s explore some of the most significant ones in this section.
2.1. JVM Architecture
Object references in a 64-bit JVM are twice as big as object references in a 32-bit JVM, unless compressed object pointers are in effect (the JVM enables them by default for heaps under 32 GB). As an example, the same app might use 110 MB on a 32-bit JVM and 190 MB on a 64-bit JVM.
In addition, the JVM itself will consume memory for the JIT compiler, code cache, class metadata, internal structures, thread stacks, etc. Even if we set the maximum heap size to 64 megabytes with -Xmx64m, the total memory usage is often much higher.
2.2. Embedded Server Threads
Spring Boot runs an embedded server, such as Tomcat or Jetty. By default, Tomcat allows up to 200 worker threads, each reserving 1 MB of stack space (on a 64-bit JVM). Threads are created on demand, but under load, thread stacks alone can account for up to 200 MB of memory.
2.3. Framework Overhead
Spring Boot does a lot of heavy lifting under the hood, including auto-configuration, dependency injection, and proxies, among other features. All these features need metadata and cached objects in memory.
3. Setting JVM Options
JVM allows us to set various options that help improve memory usage. In this section, let’s examine a few of these settings.
3.1. Use Serial Garbage Collector
Every Java application creates short-lived objects that are used in the application. These objects live in heap memory. The JVM periodically runs a garbage collection process to clean up objects that are no longer in use.
The JVM ships with several garbage collectors, each differing in terms of speed, memory, and CPU usage. By default, modern JVMs select a multi-threaded collector (G1, since JDK 9), which offers high throughput and is ideal for large applications with many CPU cores. But it's overkill for small applications.
If our application is small, like a Spring Boot application running with tight memory limits, it makes more sense to switch to the Serial Garbage Collector, which uses fewer background threads, thereby demanding less memory.
To enable the serial garbage collector, we can use the -XX:+UseSerialGC option while launching the Spring Boot app, as follows:
java -XX:+UseSerialGC -jar myapp.jar
3.2. Reduce Thread Stack Size
When the JVM starts a thread, it allocates stack memory for it. The stack size determines the number of nested method calls it can handle.
By default, the JVM allocates 1 MB of stack memory for each thread. With 1,000 threads, that's 1 GB of reserved memory even for an application sitting idle. For small applications, 1 MB per thread is wasted memory.
Fortunately, JVM allows us to control the stack size with the -Xss option.
If our app’s recursive method calls aren’t deep, like with most Spring Boot applications, we can reduce the size to 512 KB instead of the default 1 MB. When we have hundreds of threads running, this saving will add up fast.
Below is an example that combines the serial garbage collector with a 512 KB stack per thread:
java -Xss512k -XX:+UseSerialGC -jar myapp.jar
3.3. Limit Maximum RAM
When the JVM starts, it examines the available memory and, based on that, determines the heap size and non-heap space. If our laptop has 16 GB of RAM, the JVM assumes it has an ample amount of memory available.
But if we deploy the same app in a Docker container, the container may only allow 100 to 200 MB of memory. The problem is that the JVM doesn’t always know container limits, and it may think it can use more memory than what is allowed. This may even cause the application to crash.
Fortunately, we can set the memory cap with the -XX:MaxRAM option. For example, -XX:MaxRAM=72m sets a hard cap of 72 MB for all memory, and the JVM sizes its heap and other regions within that budget.
Below, we cap total memory at 72 MB, set each thread's stack to 512 KB, and use the serial garbage collector:
java -XX:MaxRAM=72m -Xss512k -XX:+UseSerialGC -jar myapp.jar
4. Reducing Web Server Threads
When we launch a Spring Boot application with an embedded Tomcat server, it handles incoming requests using a thread pool, with each request handled by one thread. By default, the pool allows up to 200 worker threads, which can be overkill for smaller applications.
These threads aren't free even when idle; each one reserves stack memory, which is undesirable in low-memory environments such as Docker containers or free cloud tiers with tight resource limits.
Fortunately, there is a way for us to reduce the thread pool size. We can achieve this with a small addition to the application.properties or application.yml file in our application.
The setting below limits the Tomcat server to 20 worker threads, which is plenty for small apps (in Spring Boot versions before 2.3, the property is named server.tomcat.max-threads):
server.tomcat.threads.max=20
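If the project uses YAML configuration instead, the same limit can be written in application.yml. The nested path below corresponds to the server.tomcat.threads.max property introduced in Spring Boot 2.3; earlier versions use the flat server.tomcat.max-threads key:

```yaml
server:
  tomcat:
    threads:
      max: 20  # cap the Tomcat worker pool for low-memory deployments
```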
5. Container-Friendly Practices
When we deploy the application inside a Docker container, we usually set limits on CPU and memory usage. If the app exceeds its memory limit, the kernel kills the container immediately. This is commonly referred to as an Out Of Memory Kill (OOMKill).
In this section, let’s examine a few container-friendly practices.
5.1. Set Container Limits Explicitly
When we launch a container, it is a good practice to always define its memory limit.
For example, the command below will launch the container with a memory allocation of at most 128 MB. If the JVM tries to allocate more than 128 MB, the container will be killed:
docker run -m 128m my-spring-boot-app
5.2. Match JVM Flags to Container Limits
Even if we set the memory cap for the container, the JVM might still think it has access to GBs of RAM. To fix this, we need to explicitly tell the JVM how much memory it can use by using the -XX:MaxRAMPercentage option.
Below is the command with example usage. This command will launch the container with a maximum of 128 MB of memory, where the JVM can use 75% of it, which is 96 MB. We’re also using SerialGC, which further saves memory:
docker run -m 128m openjdk:17-jdk java -XX:MaxRAMPercentage=75.0 -XX:+UseSerialGC -Xss512k -jar myapp.jar
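Putting these flags together, a minimal Dockerfile sketch might look like the following. The base image tag and jar path are assumptions; the JAVA_TOOL_OPTIONS environment variable is picked up automatically by the JVM at startup, so the flags apply without changing the entrypoint:

```dockerfile
# Sketch of a container-friendly image; adjust the image tag and jar name to your build
FROM eclipse-temurin:17-jre
COPY target/myapp.jar /app/myapp.jar
# Size the JVM from the container limit instead of host RAM, with lightweight GC and smaller stacks
ENV JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0 -XX:+UseSerialGC -Xss512k"
ENTRYPOINT ["java", "-jar", "/app/myapp.jar"]
```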
6. Other Optimization Techniques
In addition to the strategies we’ve discussed so far, other optimization techniques can be considered for efficient memory usage. A straightforward way is to remove any unused starter dependencies. Each starter brings in additional dependencies and initialization code, which consumes memory.
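As a hypothetical illustration, if a project declares spring-boot-starter-actuator but never uses its endpoints, deleting the dependency from pom.xml removes its auto-configuration and the beans it would otherwise keep in memory:

```xml
<!-- Hypothetical example: remove this block if the actuator endpoints are unused -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
```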
Another practical approach is to disable caches we don’t use in our app. When we don’t use caching frameworks like EhCache or Caffeine, it’s best not to include them, as they often store data in memory, which can quickly add up and bloat the memory footprint.
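If a caching library has to remain on the classpath, for example because a shared module pulls it in, we can still switch off Spring's cache abstraction with a single property so no cache manager is created:

```properties
# Disable Spring's cache abstraction entirely; no in-memory cache manager is instantiated
spring.cache.type=none
```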
Finally, we can also consider switching to a 32-bit JVM where one is still available for our platform. If our application is small and requires less than about 1.5 GB of heap space, a 32-bit JVM can significantly reduce memory overhead.
7. Avoid Over-Optimization
While the strategies we’ve discussed reduce the overall memory footprint, over-optimization could result in unexpected issues and can be costly.
It's essential to understand the demands of our application and its future scale before optimizing. Sometimes an application genuinely needs more memory, and the right move is to increase resources rather than squeeze the configuration.
The goal isn't simply to slash memory usage to cut costs, but to match memory to the application's actual demands and prevent failures caused by running short.
8. Conclusion
Spring Boot apps generally tend to use more memory than plain Java applications because of embedded servers, framework features, and JVM overhead.
However, by tuning JVM options, reducing server threads, trimming dependencies, and setting container-aware flags during launch, we can significantly cut the memory usage.
While over-optimization is not advised, the strategies we’ve discussed will make the application more efficient and cloud-friendly.