Conversation

@brandur
Contributor

@brandur brandur commented Dec 23, 2025

We've gotten a couple of requests so far (see #342 and #1105) to be able
to start multiple River clients targeting different queues within the
same database/schema, with each given enough independence to be
functional on its own. This currently isn't possible because a single
leader is elected per schema, and that leader handles all maintenance
operations, including non-queue ones like periodic job enqueuing.

Here, add the idea of a `LeaderDomain`. This lets a user set the
"domain" on which a client will elect its leader, allowing multiple
leaders to be elected in a single schema. Each leader runs its own
maintenance services.

Setting `LeaderDomain` has the additional effect of making maintenance
services operate only on the queues that their client is configured for.
The idea is to preserve backwards compatibility, in that the default
behavior (with `LeaderDomain` unset) stays the same, while providing a
path for multiple leaders to interoperate with each other.

There are still a few rough edges: for example, reindexing is not queue
specific, so multiple leaders could each be running a reindexer. I've
provided guidance in the config documentation that ideally, all clients
but one should have their reindexer disabled.
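
To make the intended setup concrete, here's a rough, self-contained model of the queue scoping described above. All type and field names are illustrative stand-ins rather than River's actual API; only the `LeaderDomain` concept itself comes from this PR:

```go
package main

import (
	"fmt"
	"sort"
)

// ClientConfig is a hypothetical, simplified stand-in for a River client's
// configuration; only LeaderDomain corresponds to the field this PR proposes.
type ClientConfig struct {
	LeaderDomain string              // empty means the default, schema-wide leader
	Queues       map[string]struct{} // queues this client works
}

// maintenanceQueues models the behavior described above: with LeaderDomain
// unset, maintenance services cover every queue (a nil return meaning "no
// filter"), preserving the default behavior; with a domain set, they cover
// only the client's own queues.
func maintenanceQueues(cfg ClientConfig) []string {
	if cfg.LeaderDomain == "" {
		return nil
	}
	queues := make([]string, 0, len(cfg.Queues))
	for name := range cfg.Queues {
		queues = append(queues, name)
	}
	sort.Strings(queues)
	return queues
}

func main() {
	defaultClient := ClientConfig{Queues: map[string]struct{}{"email": {}}}
	domainClient := ClientConfig{
		LeaderDomain: "analytics",
		Queues:       map[string]struct{}{"reports": {}},
	}

	fmt.Println(maintenanceQueues(defaultClient)) // []
	fmt.Println(maintenanceQueues(domainClient))  // [reports]
}
```

Under this model, two clients with distinct domains would each elect a leader over their own maintenance scope, while a client with no domain keeps today's schema-wide behavior.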

@brandur brandur requested a review from bgentry December 23, 2025 22:21
@brandur
Contributor Author

brandur commented Dec 23, 2025

@bgentry This works (I think), but still needs some testing added. Wanted to get your general reaction before taking it all the way.

@brandur
Contributor Author

brandur commented Dec 24, 2025

Okay, the tests for this should be in good shape now.

@@ -0,0 +1,3 @@
ALTER TABLE /* TEMPLATE: schema */river_leader
Contributor Author


Since we're adding a migration here, before shipping I'd also go through the existing "needs migration" issues and pull them in.

One nice thing about this migration as currently written is that if you did not run it, it wouldn't be a problem as long as you didn't try to use LeaderDomain. So it's safer than your average migration.

@brandur brandur force-pushed the brandur-leadership-realms branch from 1c32845 to ec7e682 Compare December 25, 2025 19:49
@brandur brandur force-pushed the brandur-leadership-realms branch from ec7e682 to f41afcc Compare December 26, 2025 19:31
// because the default client(s) will infringe on the domains of the
// non-default one(s).
//
// Certain maintenance services that aren't queue-related like the indexer
Contributor


Suggested change
// Certain maintenance services that aren't queue-related like the indexer
// Certain maintenance services that aren't queue-related like the reindexer

Comment on lines +247 to +249
// In general, most River users should not need LeaderDomain, and when
// running multiple Rivers may want to consider using multiple databases and
// multiple schemas instead.
Contributor


I would probably try to get some version of this warning into the first paragraph to dissuade people from using this feature. Could describe it as "an advanced option" or something like that. I'd just like to ensure people don't start using this setting automatically just because it's there, since it has some major footguns.

Comment on lines +229 to +234
// A warning though that River *does not protect against configuration
// mistakes*. If client1 on domain1 is configured for queue_a and queue_b,
// and client2 on domain2 is *also* configured for queue_a and queue_b, then
// both clients may end up running maintenance services on the same queues
// at the same time. It's the caller's responsibility to ensure that doesn't
// happen.
Contributor


Gotta say this definitely feels dangerous. I'm wondering what else can break if we allow for breaking one of the main promises of leader election (there can only be one), including Pro features.
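
Since River leaves this responsibility to the caller, one mitigation would be a pre-flight check at startup that fails fast on overlapping queue sets. A minimal sketch (`queueOverlap` is a hypothetical helper, not a River API):

```go
package main

import "fmt"

// queueOverlap returns the queue names configured on both clients. Returns
// nil when the two sets are disjoint.
func queueOverlap(a, b []string) []string {
	seen := make(map[string]bool, len(a))
	for _, q := range a {
		seen[q] = true
	}
	var overlap []string
	for _, q := range b {
		if seen[q] {
			overlap = append(overlap, q)
		}
	}
	return overlap
}

func main() {
	client1 := []string{"queue_a", "queue_b"} // domain1
	client2 := []string{"queue_b", "queue_c"} // domain2
	if o := queueOverlap(client1, client2); len(o) > 0 {
		// In real startup code this would be a returned error rather than a print.
		fmt.Println("misconfiguration: domains share queues:", o)
	}
}
```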

Comment on lines +907 to +908
// It's important for queuesIncluded to be `nil` in case it's not in use
// for the various driver queries to work correctly.
Contributor


maybe the driver layer should handle nil and []string{} equivalently to avoid needing to deal with that concern at this level?
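
That suggestion could look something like this at the driver boundary. A hedged sketch, with `normalizeQueueFilter` a hypothetical name rather than anything in River:

```go
package main

import "fmt"

// normalizeQueueFilter treats nil and an empty slice equivalently so that
// callers above the driver layer don't have to care which one they pass.
// Returns whether the queue filter is active, plus a slice that is always
// nil when it isn't.
func normalizeQueueFilter(queues []string) (active bool, normalized []string) {
	if len(queues) == 0 {
		return false, nil // nil and []string{} both mean "no queue filter"
	}
	return true, queues
}

func main() {
	for _, in := range [][]string{nil, {}, {"email"}} {
		active, norm := normalizeQueueFilter(in)
		fmt.Println(active, norm)
	}
}
```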

var sub *notifier.Subscription
if e.notifier == nil {
e.Logger.DebugContext(ctx, e.Name+": No notifier configured; starting in poll mode", "client_id", e.config.ClientID)
e.Logger.DebugContext(ctx, e.Name+": Resigned leadership successfully", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


was this log text altered by mistake?

}

e.Logger.DebugContext(ctx, e.Name+": Current leader attempting reelect", "client_id", e.config.ClientID)
e.Logger.InfoContext(ctx, e.Name+": Current leader received forced resignation", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


the text here was also changed, I think accidentally

// another client, even in the event of a cancellation.
func (e *Elector) attemptResignLoop(ctx context.Context) {
e.Logger.DebugContext(ctx, e.Name+": Attempting to resign leadership", "client_id", e.config.ClientID)
e.Logger.InfoContext(ctx, e.Name+": Current leader received forced resignation", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


I don't think this text is necessarily true, there are other reasons this could be called besides a forced resignation.

Comment on lines 93 to +107
-- name: JobDeleteBefore :execresult
DELETE FROM /* TEMPLATE: schema */river_job
WHERE
id IN (
SELECT id
FROM /* TEMPLATE: schema */river_job
WHERE
WHERE id IN (
SELECT id
FROM /* TEMPLATE: schema */river_job
WHERE (
(state = 'cancelled' AND finalized_at < cast(@cancelled_finalized_at_horizon AS text)) OR
(state = 'completed' AND finalized_at < cast(@completed_finalized_at_horizon AS text)) OR
(state = 'discarded' AND finalized_at < cast(@discarded_finalized_at_horizon AS text))
ORDER BY id
LIMIT @max
)
-- This is really awful, but unless the `sqlc.slice` appears as the very
-- last parameter in the query things will fail if it includes more than one
-- element. The sqlc SQLite driver uses position-based placeholders (?1) for
-- most parameters, but unnamed ones with `sqlc.slice` (?), and when
-- positional parameters follow unnamed parameters great confusion is the
-- result. Making sure `sqlc.slice` is last is the only workaround I could
-- find, but it stops working if there are multiple clauses that need a
-- positional placeholder plus `sqlc.slice` like this one (the Postgres
-- driver supports a `queues_included` parameter that I couldn't support
-- here). The non-workaround version is (unfortunately) to never, ever use
-- the sqlc driver for SQLite -- it's not a little buggy, it's off the
-- charts buggy, and there's little interest from the maintainers in fixing
-- any of it. We already started using it though, so plough on.
AND (
cast(@queues_excluded_empty AS boolean)
OR river_job.queue NOT IN (sqlc.slice('queues_excluded'))
);
)
AND (/* TEMPLATE_BEGIN: queues_excluded_clause */ true /* TEMPLATE_END */)
AND (/* TEMPLATE_BEGIN: queues_included_clause */ true /* TEMPLATE_END */)
ORDER BY id
LIMIT @max
);
Contributor


This comment applies to all drivers but I'm leaving it here because it doesn't seem like the pgx driver's JobDeleteBefore was updated in this PR yet. This query is currently targeted at the following index:

"river_job_state_and_finalized_at_index" btree (state, finalized_at) WHERE finalized_at IS NOT NULL

That will break if we're also filtering by queue, and this may turn into a sequential scan.

I haven't checked the other queries you've modified here, but this is an important consideration we'd need to be very careful of for all of them.

Comment on lines +2 to +4
-- Alter `river_leader` to remove check constraint that `name` must be
-- `default`. SQLite doesn't allow schema modifications, so this redefines the
-- table entirely.
Contributor


SQLite doesn't allow schema modifications, so this redefines the table entirely.

🤯

updatedSQL := sql
updatedSQL = replaceTemplate(updatedSQL, templateBeginEndRE)
updatedSQL = replaceTemplate(updatedSQL, templateRE)
updatedSQL = replaceTemplate(updatedSQL, templateBeginEndRE)
Contributor


What's going on with the reordering here?

Comment on lines +243 to +245
// will continue to run on all leaders regardless of domain. If using this
// feature, it's a good idea to configure ReindexerTimeout on all but a
// single leader domain to river.NeverSchedule().


Now that we run many pods of the same worker, is a "leader" here a single pod or a group of workers? If only one pod should be reindexing, what happens when that pod dies? Or would it also be alright to have, say, 3 reindexer pods alongside many more that don't reindex?
