KSP Memory Leak Dive

An accounting of tools and techniques used to reproduce, visualize, fix and test a KSP memory leak.

Paul Klauser asked in a community forum about a KSP memory leak and mentioned my blog post about removing Metaspace limits. I decided to try to reproduce the issue in NowInAndroid by slapping ./gradlew assembleDebug --no-configuration-cache --no-build-cache --rerun-tasks in a while loop and letting it rip with VisualVM attached.
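A minimal sketch of that loop, assuming it runs from the NowInAndroid checkout. The repeat_build helper name and the iteration count are my own illustration, not what was actually typed at the time:

```shell
# Hypothetical helper: run a command N times, stopping at the first failure.
repeat_build() {
  iterations="$1"
  shift
  i=1
  while [ "$i" -le "$iterations" ]; do
    echo "build iteration $i of $iterations"
    "$@" || return 1
    i=$((i + 1))
  done
}

# Usage (assumed iteration count). Every pass forces all tasks to re-run,
# so the same Gradle daemon keeps doing the full build over and over:
# repeat_build 20 ./gradlew assembleDebug --no-configuration-cache --no-build-cache --rerun-tasks
```

Bounding the loop rather than using while true makes it easier to compare runs with the same number of builds.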

I took a heap dump and examined it in Eclipse MemoryAnalyzer (Eclipse MAT) to see what its Leak Report would tell me.


Debugging & Fixing

A reproduction like this is a great starting point for investigating issues. I decided to begin by ruling out potential causes to narrow the scope of my investigation.

I happened to already have set up various JDKs via SDKMAN so I could spot check whether JDK version could matter at all. Paul had done his evaluation with 17 so I checked 19, 21, 22, and 23. Kotlin doesn’t yet have support for targeting 23 and falls back to 22, but none of the tests changed the rate of Metaspace growth nor the leak suspects found via Eclipse MAT in the dumps.
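One way to compare Metaspace growth across JDKs without reading it off a graph is to sample the Gradle daemon with jstat, which ships with the JDK. A rough sketch, assuming the jstat -gc output starts with a header row containing an MU (Metaspace used, in KB) column; the metaspace_used helper name is hypothetical:

```shell
# Hypothetical helper: read `jstat -gc` output on stdin and print the MU column
# (Metaspace used, KB), locating the column by its header name.
metaspace_used() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "MU") col = i; next }
       col     { print $col }'
}

# Usage (assumes <pid> is the Gradle daemon's PID, e.g. found via `jps -l`);
# samples every 5000 ms until interrupted:
# jstat -gc <pid> 5000 | metaspace_used
```

Looking the column up by name rather than by position keeps the sketch from depending on the exact column order of a given JDK version.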

Next, I explored references Paul shared about class loader leaks on Kotlin’s YouTrack. There are a number of past and new issues referring to class loader leaks. Fortunately Brian Norm from JetBrains pointed out that there was a known leak in KSP and that the Kotlin incremental leak suspects we found were expected - those class loaders get reused between builds.

Heading over to the KSP issue, Brian identified that the fix was most likely cleaning up references to private fields that were clearly holding the class loaders referenced in our heap dumps. Now that I could see the actual field, it all made sense how Brian got from the leak suspects in the heap dump to this answer. I decided to clone the KSP repo and look for any other private fields that might fit this behavior - and there weren’t any. Was this really going to be a 4 line fix? Sounds too easy…


Testing the Fix

After making my PR I wondered how I might actually test the change. NowInAndroid uses Dagger, which uses KSP, which in turn requires a dev snapshot of Kotlin to build locally. Oof, that’s a big chain, but I’ve built snapshots and pushed them into my local Maven repository at ~/.m2 before - how hard could it be?

Building KSP

First let’s find the dev snapshot of Kotlin. It is not on OSS Sonatype nor Maven Central. Instead JetBrains has its own Maven repository for dev snapshots. Nothing on the Kotlin website documentation hinted at where this might be so I looked into the JetBrains/Kotlin repo itself to find a Using -dev versions section that clearly showed the right repository:

maven("https://maven.pkg.jetbrains.space/kotlin/p/kotlin/bootstrap")

Now we just have to add it to the repositories sections throughout KSP:

repositories {
    mavenLocal()
    gradlePluginPortal()
    maven("https://maven.pkg.jetbrains.space/kotlin/p/kotlin/bootstrap/")
    maven("https://www.jetbrains.com/intellij-repository/snapshots")
}

That's pretty much it. But how do I create a snapshot of KSP myself?

./gradlew tasks
# This renders quite a lot of tasks so I’m going to omit most of it for brevity.
Publishing tasks
----------------
publish - Publishes all publications produced by this project.
publishAllPublicationsToMavenLocalRepository - Publishes all Maven publications produced by this project to the MavenLocal repository.
publishAllPublicationsToSonatypeRepository - Publishes all Maven publications produced by this project to the sonatype repository.
publishPluginMavenPublicationToMavenLocal - Publishes Maven publication 'pluginMaven' to the local Maven repository.
publishPluginMavenPublicationToMavenLocalRepository - Publishes Maven publication 'pluginMaven' to Maven repository 'MavenLocal'.
publishPluginMavenPublicationToSonatypeRepository - Publishes Maven publication 'pluginMaven' to Maven repository 'sonatype'.
publishToMavenLocal - Publishes all Maven publications produced by this project to the local Maven cache.
publishToSonatype - Publishes all Maven publications produced by this project to the 'sonatype' Nexus repository.
releaseSonatypeStagingRepository - Releases closed staging repository in 'sonatype' Nexus instance.

publishToMavenLocal looks to be what we want, so I’ll run that… and it appears to have created artifacts at ~/.m2/repository/com/google/devtools/ksp as expected. However, inspecting the actual files, I don’t see the typical md5 and sha1 hashes… and that’s when I realized that the Maven publishing plugin will not generate these hashes for a local Maven repository. Those hashes are required for Gradle to properly recognize dependency artifacts. Can we just create them ourselves?

Yup!

cd ~/.m2/repository/com/google/devtools/ksp/symbol-processing-api/2.0.255/
md5 -q symbol-processing-api-2.0.255.jar > symbol-processing-api-2.0.255.jar.md5
shasum -a 1 symbol-processing-api-2.0.255.jar > symbol-processing-api-2.0.255.jar.sha1
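The same idea extends to every file a publication produces (jar, pom, module metadata). A sketch, assuming a checksum file should hold just the bare hex digest; the write_checksums helper is hypothetical and falls back between the usual Linux and macOS tool names:

```shell
# Hypothetical helper: write <file>.md5 and <file>.sha1 next to each regular
# file in a directory, skipping checksum files that already exist.
write_checksums() {
  dir="$1"
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    case "$f" in *.md5 | *.sha1) continue ;; esac
    if command -v md5sum >/dev/null 2>&1; then
      md5sum "$f" | awk '{ print $1 }' > "$f.md5"
    else
      md5 -q "$f" > "$f.md5"   # macOS ships md5 rather than md5sum
    fi
    if command -v sha1sum >/dev/null 2>&1; then
      sha1sum "$f" | awk '{ print $1 }' > "$f.sha1"
    else
      shasum -a 1 "$f" | awk '{ print $1 }' > "$f.sha1"
    fi
  done
}

# Usage: from inside an artifact's version directory
# write_checksums .
```

It has to be run once per artifact version directory, since each published module gets its own folder under ~/.m2/repository.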

Building Dagger 🗡️

We've gotten our local Maven snapshot with manually written md5 and sha1 hashes. Now we can start to try to use it in Dagger…

But wait! We need that Kotlin bootstrap repository again so KSP can compile. We can just do the same thing again right? …Right?

git clone git@github.com:google/dagger.git
cd dagger
./gradlew tasks
zsh: no such file or directory: ./gradlew

What?

Bazel has entered the chat.

So at first glance, Dagger uses a Bazel WORKSPACE for its builds, and there seems to be a migration going on in the Bazel community towards Bazel modules (Bzlmod). Of course there is no documentation anywhere about using the Kotlin bootstrap Maven repo. Therefore we're going to plow through this WORKSPACE file until we can correctly consume dev Kotlin and the local KSP snapshot.

> bazel build android
Starting local Bazel server and connecting to it...
ERROR: /Users/jason/github/dagger/BUILD:64:16: While resolving toolchains for target //:android: No matching toolchains found for types @bazel_tools//tools/android:sdk_toolchain_type.
To debug, rerun with --toolchain_resolution_debug='@bazel_tools//tools/android:sdk_toolchain_type'
If platforms or toolchains are a new concept for you, we'd encourage reading https://bazel.build/concepts/platforms-intro.
ERROR: Analysis of target '//:android' failed; build aborted:
INFO: Elapsed time: 2.076s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (35 packages loaded, 172 targets configured)

This error is trying to say we haven’t set ANDROID_HOME and therefore it cannot find the Android SDK and related toolchain. Once I set that correctly we get this:
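For reference, a small sketch of that setup. The SDK path in the usage comment is an assumption (Android Studio's usual default on macOS) and the check_android_home helper is hypothetical; point it at wherever your SDK actually lives:

```shell
# Hypothetical helper: fail early with a readable message if ANDROID_HOME is unusable.
check_android_home() {
  if [ -z "${ANDROID_HOME:-}" ]; then
    echo "ANDROID_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "$ANDROID_HOME" ]; then
    echo "ANDROID_HOME does not point to a directory: $ANDROID_HOME" >&2
    return 1
  fi
}

# Usage (assumed default SDK location on macOS):
# export ANDROID_HOME="$HOME/Library/Android/sdk"
# check_android_home && bazel build android
```

A check like this gives a clearer message than the toolchain resolution error when the variable is missing.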


> bazel build android
Analyzing: target //:android (77 packages loaded, 1120 targets configured)
INFO: Analyzed target //:android (78 packages loaded, 1310 targets configured).
INFO: Found 1 target...
[12 / 139] 6 actions, 0 running
    Compiling Java headers java/dagger/internal/codegen/base/libshared-hjar.jar (1 source file) [for tool]; 0s darwin-sandbox
    Compiling Java headers external/rules_jvm_external/private/tools/java/rules/jvm/external/libbyte-streams-hjar.jar (1 source file) [for tool]; 0s darwin-sandbox
[14 / 139] 7 actions running
    Compiling Java headers java/dagger/internal/codegen/base/libshared-hjar.jar (1 source file) [for tool]; 0s darwin-sandbox
    Compiling Java headers external/rules_jvm_external/private/tools/java/rules/jvm/external/libbyte-streams-hjar.jar (1 source file) [for tool]; 0s darwin-sandbox
[18 / 139] 4 actions running
    Building external/rules_jvm_external/private/tools/java/rules/jvm/external/libbyte-streams.jar (1 source file) [for tool]; 0s multiplex-worker
[91 / 139] 7 actions running
    Compiling Java headers java/dagger/internal/codegen/libpackage_info-hjar.jar (1 source file) [for tool]; 0s darwin-sa
[98 / 139] 3 actions running
[102 / 139] KotlinKapt //java/dagger/spi/model:model { kt: 1, java: 18, srcjars: 0 } for darwin_arm64 [for tool]; 0s work
[102 / 139] KotlinKapt //java/dagger/spi/model:model { kt: 1, java: 18, srcjars: 0 } for darwin_arm64 [for tool]; 1s work
[103 / 139] [Prepa] KotlinCompile //java/dagger/spi/model:model { kt: 1, java: 18, srcjars: 0 } for darwin_arm64 [for too
[105 / 139] KotlinCompile //java/dagger/spi/model:model { kt: 1, java: 18, srcjars: 0 } for darwin_arm64 [for tool]; 0s w
[105 / 139] KotlinCompile //java/dagger/spi/model:model { kt: 1, java: 18, srcjars: 0 } for darwin_arm64 [for tool]; 1s w
[108 / 139] 2 actions running

Now that it finally builds, how should I get it to publish to my local Maven repository so I can make use of it? The documentation mentions “Snapshot releases are auto-deployed to Sonatype's central Maven repository on every clean build”, so I started digging into Dagger’s CI setup:

release.yml looks promising…

      - name: Publish artifacts
        run: |
          util/deploy-all.sh \
            "gpg:sign-and-deploy-file" \
            "${{ env.DAGGER_RELEASE_VERSION }}" \
            "-DrepositoryId=sonatype-nexus-staging" \
            "-Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/"

Alright, so we have to run util/deploy-all.sh and direct it to push files to local Maven. I created a new version of Dagger, both in the WORKSPACE and in the version argument here. This will become important later when I want to consume Dagger in NowInAndroid.

util/deploy-all.sh \
    "deploy:deploy-file" \
    "2.53" \
    "-DrepositoryId=local-repo" \
    "-Durl=file://$HOME/.m2/repository"

Now that I have it building the artifacts in the right place, it's time to make the necessary modifications to use dev Kotlin and my local KSP artifacts. This is where I got stumped and Erik Kerber helped me out:

… you could probably set up an archive_override to switch to your dev version. The syntax of that would be a matter of if your example is using Bzlmod or WORKSPACE. I don't see any hooks in rules_kotlin specifically for this use case otherwise. Worth an issue on rules_kotlin. Surprised this isn't built in. We can use a custom toolchain for rules_swift.

This led me to learning about archive_override and http_archive. There were also a number of small tweaks I had to make to use the latest versions of Kotlin + KSP in Dagger’s source code, though thankfully all of them were very straightforward. Here is the full changeset for Dagger to use dev Kotlin & local KSP artifacts.

Building NowInAndroid

My local Maven repo finally has everything I need to build NowInAndroid and verify the fix. We’re out of Bazel and I’ve had some practice at this, so these changes are fairly straightforward. I do need to add the JetBrains Compose dev Maven repo because the Kotlin version I'm specifying is a dev version. Here is the full changeset I made to get NowInAndroid compiling.

Now to run VisualVM one last time to see if the KSP change made an impact on the Metaspace leak.

You Can Do This Too

This concludes our KSP memory leak journey, and the fix was merged in KSP 2.0.21-1.0.26. Hopefully this walk through the debugging and verification process helps you understand a bit more about how this part of the build pipeline works. And now if you spot a KSP Metaspace memory leak, you’re armed to take it on yourself.