Skip to content

Reduce duplicate string allocation for GC improvements#952

Merged
merlimat merged 2 commits into
apache:masterfrom
rdhabalia:gc_obj
Dec 11, 2017
Merged

Reduce duplicate string allocation for GC improvements#952
merlimat merged 2 commits into
apache:masterfrom
rdhabalia:gc_obj

Conversation

@rdhabalia

Copy link
Copy Markdown
Contributor

Motivation

If broker loads 100K topics with replication enabled for them, then we see around 3M of String objects presents into heap. this contribute in gc-pause for brokers. Out of 3M strings, we saw half of them duplicate string which we can reduce.

Modifications

After checking GC roots for duplicate string:

  • interning replicator strings
  • cache NamespaceName that keeps duplicate namespace-name

Result

Reduce duplicate string.

@jai1 jai1 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - keeping the maximumSize or expireAfterAccess configurable will be better.

@apache apache deleted a comment from jai1 Dec 8, 2017
@merlimat

merlimat commented Dec 8, 2017

Copy link
Copy Markdown
Contributor

@rdhabalia It might be worth to just try to have the JVM do that with -XX:+UseStringDeduplication

@merlimat

merlimat commented Dec 8, 2017

Copy link
Copy Markdown
Contributor


public static String getReplicatorName(String replicatorPrefix, String cluster) {
return String.format("%s.%s", replicatorPrefix, cluster);
return (replicatorPrefix + "." + cluster).intern();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't the string be interned also after String.format() ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I ran a test where I saw that String.format("%s%s",x,y).intern() allocates some more number of string objects than (x+"."+y).intern() when I checked with jmap -histo <pid> | grep String. However, intern() still reuses objects in both cases but somehow, String.format() adds some more instances compare to append.

@merlimat

merlimat commented Dec 9, 2017

Copy link
Copy Markdown
Contributor

@rdhabalia The only concern I have around intern() is to whether the Strings are ever deleted from the interned pool. It's not clear from the Javadoc (https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#intern--). Anyway, since it's done on the clusters string, it probably won't matter.

@merlimat merlimat left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@rdhabalia

Copy link
Copy Markdown
Contributor Author

What about the client version?

Yes, I just targeted the main strings. We can do it for client-version as it will be small set. let me add it,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants