How to migrate a Guid identifier to string with Marten and .NET 8?
Below, you will find a practical, step-by-step guide to evolving document identifiers safely in a production Marten database.
Why do it?
The decision to move from Guid-based identifiers to string identifiers was not something we took lightly.
Our system was already live, running on Marten’s event store, with 2 years of events safely persisted. Every stream, every aggregate, every projection was built around UUIDs. Changing that meant touching the very foundation of our data.
But external constraints changed. We needed identifiers that were:
- easier to share and reason about (something close to what Stripe does with prefixed identifiers)
- compatible with systems that didn’t speak Guid
So we committed to migrating our Marten event store from Guid stream IDs to string stream IDs — without rewriting history.
How to choose the right approach?
The first decision was where to perform the migration. Meaning: application-side or database-side?
Replaying events through application code was an option, but it came with serious downsides:
- slow execution
- increased operational risk
- too many moving parts for something that should be deterministic
Instead, we opted for a database-side migration.
Our database is PostgreSQL, and it already had everything we needed. The data was correct; all we had to do 🙄 was copy it, converting UUIDs into a canonical string representation along the way.
And one fact shaped our decision: we could afford downtime. Not a day, but a few minutes, and if absolutely necessary, up to an hour.
That last point is crucial:
- the event store wasn’t changing under our feet
- no new events were being appended while we migrated
- no projections were racing against partially migrated data
So, how to do it?
1. Prep the future data model
We updated all our C# classes/models so the id was a string instead of a Guid. In our case, that meant every aggregate.
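For illustration, the change on an aggregate is just a swap of the identifier type (the class below is a hypothetical example, not our actual model):

// Before the migration, the aggregate was identified by a Guid:
// public Guid Id { get; set; }

// After the migration, the identifier is a plain string.
public class SeriesAggregate
{
    public string Id { get; set; } = default!;

    // ...the rest of the aggregate stays untouched...
}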
In our case, all the Marten objects were stored under a dedicated schema: eventstore. So we had two ways to create the new schema from scratch (that is, with mt_events.stream_id and mt_streams.id typed as varchar instead of uuid):
- running one of our .NET apps with the new code (we had a web app and a console app)
- getting the Marten SQL script that creates the tables from scratch and running it against the database
We chose the first one, simply because we could.
That meant that, after updating all our classes, we also needed to update the extension method that configures our Marten dependencies, as below:
var martenServices = services.AddMarten(options =>
{
    //stuff
    options.DatabaseSchemaName = "events";
    options.Events.StreamIdentity = StreamIdentity.AsString;
    //stuff
});
The schema name was now “events” instead of the previous “eventstore”, and the stream identity type was set to string instead of Guid.
Once everything was updated, we ran our web app, and because Marten did not find any existing schema objects, it created the schema and all the related tables.
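This relies on Marten being allowed to create missing schema objects. If that is not already enabled in your configuration, a minimal sketch could look like this (AutoCreate comes from the Weasel.Core namespace in recent Marten versions):

var martenServices = services.AddMarten(options =>
{
    options.DatabaseSchemaName = "events";
    options.Events.StreamIdentity = StreamIdentity.AsString;
    // Allow Marten to create or update the "events" schema and its tables on first use.
    options.AutoCreateSchemaObjects = AutoCreate.CreateOrUpdate;
});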
2. Migrating Streams without breaking anything
Marten stores stream metadata in the mt_streams table. This table is the backbone of the event store, so we started there.
Our goal was to:
- convert UUIDs to strings
- normalize them to lowercase
- avoid touching streams that were already migrated
The result was a simple INSERT … SELECT statement (built carefully, with a little help from ChatGPT):
-- Copy STREAMS (uuid -> lowercase text), skip ones already present
INSERT INTO events.mt_streams (id, type, version, "timestamp", snapshot, snapshot_version, created, tenant_id, is_archived)
SELECT
    lower(s.id::text) AS id,
    s.type,
    COALESCE(s.version, 0),
    s."timestamp",
    s.snapshot,
    s.snapshot_version,
    s.created,
    s.tenant_id,
    s.is_archived
FROM eventstore.mt_streams s
LEFT JOIN events.mt_streams t
    ON t.id = lower(s.id::text)
WHERE t.id IS NULL;
Lowercasing the IDs wasn’t just cosmetic. It ensured that:
- ids remained deterministic
- comparisons stayed case-insensitive
- no surprises appeared later when integrating with other systems
On top of that, the LEFT JOIN … WHERE t.id IS NULL pattern made the migration idempotent: we could run the script multiple times without fear.
3. Migrating Events while preserving history
With streams in place, the next step was the real test: events.
Events reference their streams by stream_id, and if we got this wrong, we could end up with orphaned events or broken aggregates.
Therefore, the constraints were strict:
- preserve the original seq_id
- maintain event order
- avoid duplicates
- leave payloads untouched
Here’s the SQL we used:
-- Copy EVENTS (uuid -> lowercase text), skip (stream_id, version) pairs already present
INSERT INTO events.mt_events
    (seq_id, id, stream_id, version, data, type, "timestamp", tenant_id, mt_dotnet_type, is_archived)
SELECT
    e.seq_id,
    e.id,
    lower(e.stream_id::text) AS stream_id,
    e.version,
    e.data,
    e.type,
    e."timestamp",
    e.tenant_id,
    e.mt_dotnet_type,
    e.is_archived
FROM eventstore.mt_events e
LEFT JOIN events.mt_events t
    ON t.stream_id = lower(e.stream_id::text)
    AND t.version = e.version
WHERE t.stream_id IS NULL
ORDER BY e.seq_id;
Matching on (stream_id, version) ensured each event was copied exactly once. Ordering by seq_id preserved the original timeline, a small detail that makes a big difference when you care about auditability.
At this point, the data looked right. But there was still one thing left.
4. Teaching PostgreSQL where to continue
Because we manually inserted seq_id values, PostgreSQL’s sequence no longer knew where it should resume.
Left unfixed, the next event append would fail, or worse, overwrite history.
So we explicitly recreated the sequence if it was missing (you never know!) and aligned it:
-- Ensure a sequence/default exists for future inserts on seq_id
DO $$
BEGIN
    -- Look for the sequence in the "events" schema specifically, not just by name.
    IF NOT EXISTS (
        SELECT 1
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relname = 'mt_events_sequence'
          AND n.nspname = 'events'
    ) THEN
        CREATE SEQUENCE events.mt_events_sequence;
    END IF;

    -- Move the sequence past the highest copied seq_id so new appends don't collide.
    PERFORM setval(
        'events.mt_events_sequence',
        (SELECT COALESCE(MAX(seq_id), 0) FROM events.mt_events) + 1,
        true
    );

    ALTER TABLE events.mt_events
        ALTER COLUMN seq_id SET DEFAULT nextval('events.mt_events_sequence');
END$$;
From PostgreSQL’s point of view, nothing unusual had happened. From Marten’s point of view, the event store was ready to accept new events again.
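As a quick sanity check before reopening the application, you can compare the sequence position with the highest copied seq_id (a minimal sketch using the names from the script above):

-- The sequence's last_value should sit just above the highest copied seq_id.
SELECT last_value, is_called FROM events.mt_events_sequence;
SELECT MAX(seq_id) AS max_seq_id FROM events.mt_events;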
5. Rebuilding projections
Since the stream identifiers had changed type, we rebuilt all projections from scratch to ensure full consistency. In our case, our console app already used Hangfire for other features, so we built a disabled recurring background job that lets us rebuild the projections manually whenever needed.
As an example, the job looked like this:
public class RecreateProjections : JobBase, IAsynchronousJob
{
    private readonly IDocumentStore _eventsStorage;

    public RecreateProjections(
        ILogger<RecreateProjections> logger,
        IDocumentStore documentSession,
        IOptionsSnapshot<JobSettings> settings) : base(settings, logger, nameof(RecreateProjections))
    {
        _eventsStorage = documentSession;
    }

    public override string Name => "Rebuilding the projections.";

    [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Fail)]
    public async Task ExecuteAsync(
        IJobCancellationToken cancellationToken,
        PerformContext performContext
    )
    {
        performContext.WriteLine("Starting the projection rebuilds...");

        // The daemon gives us programmatic control over projection rebuilds.
        using var daemon = await _eventsStorage.BuildProjectionDaemonAsync();
        try
        {
            using (HangfireConsoleScope.Bind(performContext))
            {
                try
                {
                    performContext.WriteLine($"{nameof(SeriesDetailsProjection)} ...");
                    await daemon.RebuildProjectionAsync<SeriesDetailsProjection>(cancellationToken.ShutdownToken);
                }
                catch (Exception exception)
                {
                    Logger.LogError(exception, "While rebuilding the views.");
                }

                try
                {
                    performContext.WriteLine($"{nameof(SeriesSummaryProjection)} ...");
                    await daemon.RebuildProjectionAsync<SeriesSummaryProjection>(cancellationToken.ShutdownToken);
                }
                catch (Exception exception)
                {
                    Logger.LogError(exception, "While rebuilding the views.");
                }

                // other stuff
            }
        }
        catch (Exception exception)
        {
            Logger.LogError(exception, "An error occurred while running the job.");
            throw; // Rethrow so the job is marked as failed.
        }
        finally
        {
            Logger.LogInformation("Done");
        }
    }
}
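The registration of that job is not shown above; a minimal sketch, assuming the standard Hangfire API and an arbitrary job id, could look like this:

// Register the job with a cron expression that never fires (Cron.Never()),
// so it only runs when triggered manually from the Hangfire dashboard.
RecurringJob.AddOrUpdate<RecreateProjections>(
    "recreate-projections",                                   // hypothetical recurring job id
    job => job.ExecuteAsync(JobCancellationToken.Null, null), // Hangfire supplies the real context at runtime
    Cron.Never());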
Safety checks
- comparing the number of streams and events in the old schema vs the new schema (see the sketch below)
- if you use projections, comparing their document counts between the old and the new schema as well
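A minimal sketch of what those counting queries can look like, assuming the eventstore and events schema names used throughout this post:

-- Streams: both schemas should report the same number of rows.
SELECT
    (SELECT COUNT(*) FROM eventstore.mt_streams) AS old_streams,
    (SELECT COUNT(*) FROM events.mt_streams)     AS new_streams;

-- Events: same check on the events table.
SELECT
    (SELECT COUNT(*) FROM eventstore.mt_events) AS old_events,
    (SELECT COUNT(*) FROM events.mt_events)     AS new_events;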
Lessons learned
Looking back, a few lessons stood out clearly.
First, changing identifier types in an event-sourced system is never a refactor. It’s a data problem.
Second, boring solutions scale better under pressure. SQL scripts, explicit ordering, and deterministic transformations are far easier to reason about.
Finally, operational constraints matter as much as technical ones. Having a short maintenance window wasn’t a limitation; it was a huge advantage that allowed us to keep the solution straightforward and safe.
Why did this approach work well?
- no application-level replays
- no event rewriting
- downtime was possible
- fully reversible before switching traffic
- most importantly, this treats the identifier change as a data evolution, not a refactor