Strangler-Fig Migration: .NET to Java Sequencing Guide

The .NET-to-Java migration conversation surfaces more often than vendor briefings suggest. Java platform consolidation, Windows licensing pressure, and the operational cost of running two language ecosystems in parallel push organizations to ask whether their .NET estate can be moved to the JVM without a full rewrite. The strangler-fig pattern offers a viable path, but it adds a layer of friction that within-ecosystem modernizations do not face: the two runtimes have different threading models, memory management approaches, and serialization defaults, which means the routing and consistency infrastructure has to work harder than most teams anticipate.

We have worked through enough of these programs to say clearly that the pattern holds, but teams who treat a .NET-to-Java migration the way they would treat a .NET-to-.NET 8 lift are in for an unpleasant surprise around month four. The differences are specific and manageable if you see them early. They are expensive if you discover them during a cutover.

When this migration is actually worth doing

Cross-language migrations carry a higher overhead than within-ecosystem modernizations. The routing infrastructure, the dual-write layer, and the observability stack all have to span two runtimes with different native tooling, different profiling formats, and different monitoring conventions. A .NET-to-.NET 8 migration can reuse most of its operational toolchain; a .NET-to-Java migration cannot.

That overhead is justified in roughly three scenarios. First, your organization is consolidating to a single JVM platform: your Java services outnumber your .NET services, your infrastructure and tooling investment is already JVM-centric, and carrying a parallel .NET operational model has measurable recurring cost. Second, Windows server licensing pressure cannot be resolved by moving to .NET on Linux. Third, your target deployment model requires GraalVM native images or JVM-native frameworks such as Quarkus or Micronaut, which offer startup and memory profiles that the .NET runtime family cannot match on your infrastructure.

A decision chain of three checks. First, is the organization consolidating to a single JVM platform where Java services already outnumber .NET services? If yes, migrate to Java via strangler-fig. If no, second, is there Windows server licensing pressure that moving to .NET on Linux cannot resolve? If yes, migrate. If no, third, does the target deployment require GraalVM native images or a JVM-native framework such as Quarkus or Micronaut? If yes, migrate. If none of the three apply, for example an estate that is 80 percent .NET and 20 percent Java, modernize within .NET 8 instead.

Figure 1. The three scenarios that justify a .NET-to-Java migration, and the outside case the paper says should modernize within .NET 8 instead.

Outside these scenarios, the overhead is rarely justified. An estate that is 80% .NET and 20% Java, where the Java presence is incidental, should modernize within .NET 8. The pattern should follow the organization's actual platform direction, not the other way around.

The seam problem is harder across languages

The strangler-fig seam, the routing layer that intercepts traffic and directs it to either the legacy .NET service or the new Java equivalent, is not technically difficult to build in either ecosystem. The difficulty is that the two systems must agree on wire formats, error semantics, and timeout behaviors that neither runtime enforces identically by default.
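To make the seam concrete, here is a minimal sketch of the routing decision itself, assuming path-prefix routing. The class name, host names, and prefix list are hypothetical; in practice this logic usually lives in an API gateway or reverse proxy configuration rather than hand-written code.

```java
import java.net.URI;
import java.util.List;

// Minimal sketch of the routing decision at the seam (illustrative only).
// Host names and prefixes are hypothetical, not taken from any real system.
final class SeamRouter {
    private final URI legacyDotNet;
    private final URI javaReplacement;
    private final List<String> migratedPrefixes;

    SeamRouter(URI legacyDotNet, URI javaReplacement, List<String> migratedPrefixes) {
        this.legacyDotNet = legacyDotNet;
        this.javaReplacement = javaReplacement;
        this.migratedPrefixes = List.copyOf(migratedPrefixes);
    }

    /** Returns the backend that should serve the given request path. */
    URI backendFor(String path) {
        for (String prefix : migratedPrefixes) {
            // Match on a path-segment boundary so "/api/catalog" does not capture "/api/catalogue".
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return javaReplacement;
            }
        }
        // Default: anything not explicitly migrated stays on the legacy .NET service.
        return legacyDotNet;
    }

    public static void main(String[] args) {
        SeamRouter router = new SeamRouter(
                URI.create("http://legacy-dotnet.internal"),
                URI.create("http://catalog-java.internal"),
                List.of("/api/catalog"));
        System.out.println(router.backendFor("/api/catalog/items/42"));
        System.out.println(router.backendFor("/api/orders/7"));
    }
}
```

Defaulting to the legacy backend keeps each extraction an explicit, reversible change: rolling back means removing one prefix from the list.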

.NET's built-in JSON serializer (System.Text.Json) and Java's most common alternatives (Jackson, Gson) have different default rules for null handling, date-time formatting, and property naming. A .NET service that returns null for a missing optional field can produce a different JSON payload than a Java service serializing the same domain model, unless both sides are configured explicitly and consistently. Gson, for example, omits null fields by default, while System.Text.Json and Jackson write them. We have seen routing layers that passed all functional tests fail in production because the legacy .NET client was stripping nulls that the Java service expected to be present, producing silent data loss in downstream consumers.
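The null-handling difference is easy to miss because a field that is explicitly null and a field that is absent look identical in most domain-level assertions. The sketch below, which uses plain maps as stand-ins for parsed JSON payloads, shows one way to flag presence differences between the two sides. The class and method names are hypothetical, and a real check would run against the actual serialized output of both services.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch: detect fields whose presence differs between two parsed payloads.
// Maps stand in for parsed JSON objects; names here are illustrative.
final class NullVersusAbsent {

    /** Returns fieldName -> description for every field present on only one side. */
    static Map<String, String> presenceDiff(Map<String, Object> dotnetPayload,
                                            Map<String, Object> javaPayload) {
        Map<String, String> diff = new TreeMap<>();
        for (String key : dotnetPayload.keySet()) {
            if (!javaPayload.containsKey(key)) {
                diff.put(key, dotnetPayload.get(key) == null
                        ? "null on .NET side, absent on Java side"
                        : "present on .NET side, absent on Java side");
            }
        }
        for (String key : javaPayload.keySet()) {
            if (!dotnetPayload.containsKey(key)) {
                diff.put(key, javaPayload.get(key) == null
                        ? "absent on .NET side, null on Java side"
                        : "absent on .NET side, present on Java side");
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // HashMap rather than Map.of, because Map.of rejects null values.
        Map<String, Object> dotnetPayload = new HashMap<>();
        dotnetPayload.put("id", 42);
        dotnetPayload.put("middleName", null); // this side writes nulls

        Map<String, Object> javaPayload = new HashMap<>();
        javaPayload.put("id", 42);             // this side omits nulls

        System.out.println(presenceDiff(dotnetPayload, javaPayload));
    }
}
```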

The corrective is to specify the wire contract before writing the Java replacement. OpenAPI 3.1 gives you a language-neutral contract specification; generate client and server stubs from it for both sides of the seam, and treat any deviation from the spec as a bug, whichever system produces it, not as a negotiated difference. This step is slower up front and faster at every subsequent extraction.

A related issue: .NET exception handling and Java exception handling produce structurally different error payloads by default. Define a shared error schema in the OpenAPI spec, map both runtimes to it explicitly, and validate with integration tests using actual payloads before routing live traffic. The discrepancy will surface eventually; the question is whether it surfaces in test or in an on-call rotation.
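On the Java side, mapping to a shared error schema might look like the following sketch. The envelope fields (status, code, message, traceId) and the code strings are assumptions for illustration; the real shape should come from the schema defined in your OpenAPI spec, and the .NET side would map its own exception types to the same codes.

```java
import java.util.NoSuchElementException;

// Sketch of mapping Java exceptions onto one shared error envelope.
// The envelope fields and code strings are hypothetical, not a standard.
final class ErrorMapping {

    record ErrorEnvelope(int status, String code, String message, String traceId) {}

    static ErrorEnvelope toEnvelope(Throwable t, String traceId) {
        if (t instanceof IllegalArgumentException) {
            return new ErrorEnvelope(400, "VALIDATION_FAILED", safeMessage(t), traceId);
        }
        if (t instanceof NoSuchElementException) {
            return new ErrorEnvelope(404, "NOT_FOUND", safeMessage(t), traceId);
        }
        // Do not leak exception class names or stack traces across the seam.
        return new ErrorEnvelope(500, "INTERNAL_ERROR", "Unexpected error", traceId);
    }

    private static String safeMessage(Throwable t) {
        return t.getMessage() == null ? "" : t.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(toEnvelope(new IllegalArgumentException("quantity must be positive"), "trace-1"));
        System.out.println(toEnvelope(new IllegalStateException("pool exhausted"), "trace-2"));
    }
}
```

Keeping the mapping in one place per runtime makes it straightforward to assert, in an integration test, that both services emit byte-compatible error payloads for the same failure.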

Sequencing the extraction

The sequencing logic for a .NET-to-Java strangler-fig follows the same principles as any incremental extraction, with one addition: extract by API surface first, not by data domain.

In a within-ecosystem migration, you can sometimes extract a data domain and let the application layer follow. Cross-language migrations make this impractical because the Java replacement and the .NET original will both be live against the same data store during the transition, and ORM behavior across the two ecosystems differs in ways that produce subtle inconsistencies. Entity Framework and Hibernate handle lazy loading, cascade behavior, and transaction boundaries differently. Running both ORMs against the same schema is possible, but it produces an inconsistency surface proportional to how much each side relies on ORM-managed behavior.

The sequence that reduces risk:

1. Start with read-heavy, stateless API surfaces that carry clear OpenAPI contracts and no shared mutable state. These surfaces let your team build the routing infrastructure and Java operational toolchain on low-risk traffic before touching anything with write semantics.
2. Run shadow traffic on the Java replacement before promoting it to primary. Diffy, originally from Twitter's engineering team, automates response comparison across a candidate service and a control service using live traffic without affecting users. It surfaces semantic differences that unit tests consistently miss.
3. Extract write paths only after the read infrastructure is stable and your team has demonstrated it can operate the Java service through at least one production incident.
4. Treat the database as a separate strangler, extracted after the application layer stabilizes. Change data capture via Debezium handles the consistency gap between the two ORMs during the transition period.
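The shadow-traffic comparison in the second step reduces to a simple idea: send the same request to the control (.NET) and the candidate (Java), then compare the responses while ignoring fields that are expected to differ. The sketch below illustrates that idea with plain maps standing in for parsed responses. It is not Diffy's API, and the ignored-field approach is a simplification of how such tools filter noise.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Sketch of shadow-response comparison (illustrative; not Diffy's API).
final class ShadowDiff {

    /** One mirrored request: the legacy response and the Java candidate's response. */
    record Pair(Map<String, Object> control, Map<String, Object> candidate) {}

    /** True when both responses agree on every field outside ignoredFields. */
    static boolean matches(Pair p, Set<String> ignoredFields) {
        Set<String> keys = new HashSet<>(p.control().keySet());
        keys.addAll(p.candidate().keySet());
        keys.removeAll(ignoredFields);
        for (String k : keys) {
            if (p.control().containsKey(k) != p.candidate().containsKey(k)) return false;
            if (!Objects.equals(p.control().get(k), p.candidate().get(k))) return false;
        }
        return true;
    }

    /** Fraction of mirrored requests whose responses differ. */
    static double diffRate(List<Pair> pairs, Set<String> ignoredFields) {
        if (pairs.isEmpty()) return 0.0;
        long mismatches = pairs.stream().filter(p -> !matches(p, ignoredFields)).count();
        return (double) mismatches / pairs.size();
    }

    public static void main(String[] args) {
        List<Pair> pairs = List.of(
                new Pair(Map.of("total", 100, "generatedAt", "t1"),
                         Map.of("total", 100, "generatedAt", "t2")),
                new Pair(Map.of("total", 100), Map.of("total", 101)));
        // "generatedAt" is a hypothetical field expected to differ on every call.
        System.out.println(diffRate(pairs, Set.of("generatedAt")));
    }
}
```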

A four-step vertical sequence. Step one: start with read-heavy, stateless API surfaces that carry clear OpenAPI contracts and no shared mutable state. Step two: run shadow traffic on the Java replacement before promoting it to primary, using Diffy to compare responses on live traffic. Step three: extract write paths only after the read infrastructure is stable and the team has operated the Java service through at least one production incident. Step four: extract the database last, after the application layer stabilizes, using change data capture via Debezium.

Figure 2. The four-step extraction sequence the paper recommends for a .NET-to-Java strangler-fig, in the order it says reduces risk.

The most common sequencing mistake we see is extracting a write path before the team has built operational confidence in the Java service. A failed write path mid-migration creates a recovery scenario far more expensive than the delay cost of better sequencing.

The toolchain gap is real and must be budgeted for

Every organization migrating from .NET to Java underestimates the toolchain gap. The two ecosystems have nominally equivalent tools (NuGet versus Maven or Gradle, xUnit versus JUnit, Serilog versus Logback, dotnet-trace versus async-profiler), but the operational practices built up around .NET tooling do not transfer. A team that has spent five years tuning .NET profiling, structured logging, and health checks has to rebuild that institutional knowledge from scratch on the JVM side.

This is not an argument against the migration. It is an argument for treating the toolchain build-out as a first-class workstream, not a side effect of the code port. The workstream should cover, at minimum: Maven or Gradle build standards with reproducible builds, JVM memory and GC parameter configuration for your container targets (a JVM running in a 512 MB container without explicit heap flags will behave in ways the .NET runtime does not), OpenTelemetry-based observability wired up before any service handles live traffic, and a deployment pipeline for Java services that matches the quality bar of your existing .NET pipelines.
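The container memory point is worth checking empirically rather than assuming. On recent HotSpot JVMs the default -XX:MaxRAMPercentage is 25, so with no heap flags a 512 MB container typically ends up with a heap ceiling of roughly 128 MB. The sketch below prints what the running JVM actually believes about its limits. The maxHeapMiB helper is our own simplified approximation of the percentage calculation, not a JVM API, and other JVM distributions may size the heap differently.

```java
// Sketch: report the heap ceiling and CPU count the JVM has detected,
// and approximate the ceiling for a given container limit and percentage.
final class HeapCheck {

    /** Approximate heap ceiling in MiB for a container limit and a MaxRAMPercentage value. */
    static long maxHeapMiB(long containerLimitMiB, double maxRamPercentage) {
        return (long) (containerLimitMiB * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Values the JVM detected for this process (container limits included, when supported).
        System.out.println("max heap (MiB): " + rt.maxMemory() / (1024 * 1024));
        System.out.println("processors:     " + rt.availableProcessors());
        // Worked numbers for a 512 MiB container limit.
        System.out.println("512 MiB at 25%: " + maxHeapMiB(512, 25.0) + " MiB");
        System.out.println("512 MiB at 75%: " + maxHeapMiB(512, 75.0) + " MiB");
    }
}
```

Running this inside the target container image, once without flags and once with an explicit -XX:MaxRAMPercentage or -Xmx, is a quick way to confirm that the configured limits are the ones the JVM is actually using.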

A concrete cost profile from the migrations we have run: the toolchain workstream consumes roughly 20% to 25% of total migration effort for the first service extracted, dropping to approximately 8% to 12% for subsequent extractions as standards stabilize. Organizations that treat this cost as zero pay for the assumption in their first production incident.

The counter-take on Java as the target platform

Conventional advice in 2025 treats Java on Kubernetes as an unambiguous improvement over .NET Framework estates. We disagree with the framing, if not always the conclusion.

.NET 8 and .NET 9 on Linux containers are credible, production-grade modernization targets. The startup time and memory footprint of ASP.NET Core on .NET 8 are competitive with Spring Boot, and the Native AOT path is mature for stateless services. The TechEmpower Framework Benchmarks consistently show ASP.NET Core at or ahead of Spring Boot on raw throughput across most test categories. The JVM is not the only credible runtime for high-scale, container-native workloads.

The migration to Java is warranted when organizational context demands it: platform consolidation, tooling investment, team expertise. It is not warranted because "Java is more enterprise" or because the new team prefers it. Organizations that migrate on the basis of language preference rather than platform strategy tend to arrive at the same operational complexity they left, with a different runtime and a 12 to 18 month delay before they are as proficient in the new ecosystem as they were in the old.

Pick the target because it fits the platform. Not because the pattern makes either choice easy.

Timeline and cost profile

A .NET-to-Java strangler-fig migration for an estate of 100K to 500K lines of code runs 18 to 36 months from first extraction to final .NET decommissioning, with the first Java service in production around month 3 to 4 and meaningful API surface coverage by month 12. Smaller, cleanly bounded estates under 50K lines can realistically complete in 9 to 12 months if the organizational preconditions are in place.

Cost distribution, based on programs we have supported: roughly 25% on routing infrastructure, toolchain build-out, and contract specification; 55% on extraction and Java reimplementation; 20% on data migration, cutover coordination, and .NET decommissioning. Teams that skip the first bucket, treating infrastructure and contracts as overhead rather than investment, typically see the second bucket expand to absorb the missed work at a worse exchange rate.
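As a worked example of that split, the sketch below applies the 25/55/20 distribution to a hypothetical program budget. The 12,000 person-hour total is an invented figure for illustration, not data from any program.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Worked arithmetic for the 25/55/20 cost distribution (budget figure is hypothetical).
final class CostSplit {

    /** Splits a total effort budget into the three buckets described above. */
    static Map<String, Long> split(long totalPersonHours) {
        long infrastructure = Math.round(totalPersonHours * 0.25);
        long extraction = Math.round(totalPersonHours * 0.55);
        // Assign the remainder to the last bucket so the parts always sum to the total.
        long dataAndCutover = totalPersonHours - infrastructure - extraction;

        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("routing, toolchain, contracts", infrastructure);
        buckets.put("extraction and reimplementation", extraction);
        buckets.put("data migration, cutover, decommissioning", dataAndCutover);
        return buckets;
    }

    public static void main(String[] args) {
        split(12_000).forEach((bucket, hours) -> System.out.println(bucket + ": " + hours + " h"));
    }
}
```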

A bar chart with three bars. Routing infrastructure, toolchain build-out, and contract specification is 25 percent. Extraction and Java reimplementation is 55 percent. Data migration, cutover coordination, and .NET decommissioning is 20 percent.

Figure 3. The paper's cost distribution for a .NET-to-Java strangler-fig program: 25% routing infrastructure and contracts, 55% extraction and reimplementation, 20% data migration and cutover.

The comparison point matters here. A big-bang rewrite of the same 100K to 500K line estate typically runs 24 to 48 months with a materially higher failure rate. The Standish Group's CHAOS report data on large software projects puts the on-time, on-budget, in-scope completion rate for large rewrites at under 10%. The incremental strangler-fig path is slower in perception and faster in outcome.

Where to start

If the .NET-to-Java migration is the right direction for your platform, the following sequence reduces exposure:

1. Audit your .NET estate for COM dependencies, Windows-specific runtime assumptions, WCF service contracts, and third-party controls with no Java equivalent. These items break the porting estimate; find them before scoping the program, not during cutover.
2. Establish the Java operational toolchain, build, test, deploy, and observe, for one low-risk service before extracting anything business-critical. The goal is operational confidence, not feature delivery.
3. Write the OpenAPI 3.1 contract for your first extraction target and validate that the existing .NET implementation actually matches it. Gaps in the current service's contract compliance will surface here, and they are far cheaper to address before the migration starts.
4. Stand up your routing layer, an API gateway or YARP with an OpenTelemetry-instrumented Java target, and validate end-to-end distributed tracing before routing any live traffic.
5. Extract read paths first. Shadow the Java replacement against live traffic for 2 to 4 weeks, diff the responses systematically, and promote to primary only after the diff rate has been clean for the final week.
6. Put the decommissioning schedule for .NET components in writing before beginning extractions. Programs without explicit retirement commitments routinely run legacy and new systems in parallel for 24 months or longer, paying double operational cost and earning none of the consolidation benefit.
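The promotion rule for read paths (shadow for 2 to 4 weeks, promote only after a clean final week) is simple enough to encode as an explicit gate rather than a judgment call. The sketch below is one possible reading of that rule. It assumes "clean" means zero mismatched responses per day and uses the two-week lower bound as the minimum window; both are our assumptions, and the names are hypothetical.

```java
import java.util.List;

// Sketch of a promotion gate for a shadowed Java service (thresholds are assumptions).
final class PromotionGate {
    static final int MIN_SHADOW_DAYS = 14; // lower bound of the 2 to 4 week shadow window
    static final int CLEAN_TAIL_DAYS = 7;  // the final week must be clean

    /**
     * @param dailyDiffCounts mismatched responses per shadow day, oldest first
     * @return true when the window is long enough and the last week had no diffs
     */
    static boolean readyToPromote(List<Integer> dailyDiffCounts) {
        int days = dailyDiffCounts.size();
        if (days < MIN_SHADOW_DAYS) {
            return false;
        }
        return dailyDiffCounts.subList(days - CLEAN_TAIL_DAYS, days)
                .stream()
                .allMatch(count -> count == 0);
    }

    public static void main(String[] args) {
        // Two weeks of shadowing: diffs in the first week, none in the second.
        List<Integer> history = List.of(12, 9, 4, 4, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0);
        System.out.println(readyToPromote(history));
    }
}
```

A tolerance above zero may be more realistic for noisy endpoints; the point is that the threshold is written down and checked automatically before the routing change is made.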

The migration is achievable. It is not a technology problem; the technology on both sides is mature and well-documented. The problems that stall these programs are almost always organizational: undefined retirement commitments, underinvestment in routing infrastructure and wire-contract specification, and teams that discover too late how much of their operational knowledge is ecosystem-specific rather than transferable. Get those three things right, and the engineering is the straightforward part.
