Conversation

@brandur
Contributor

@brandur brandur commented Dec 23, 2025

We've gotten a couple of requests so far (see #342 and #1105) to be able
to start multiple River clients targeting different queues within the
same database/schema, with each given enough independence to be
functional on its own. This currently isn't possible because a single
leader is elected per schema, and that leader handles all maintenance
operations, including non-queue ones like periodic job enqueuing.

Here, add the idea of a `LeaderDomain`. This lets a user set the
"domain" on which a client will elect its leader, allowing multiple
leaders to be elected in a single schema. Each leader runs its own
maintenance services.

Setting `LeaderDomain` has the additional effect of making maintenance
services operate only on the queues that their client is configured for.
The idea is to preserve backwards compatibility, in that the default
behavior (with `LeaderDomain` unset) stays the same, while providing a
path for multiple leaders to interoperate with each other.

There are still a few rough edges: for example, reindexing is not queue
specific, so multiple leaders could each be running a reindexer. I've
provided guidance in the config documentation that ideally, all clients
but one should have their reindexer disabled.
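
To make the intended setup concrete, here's a rough, self-contained model of the queue scoping described above. All type and field names are illustrative stand-ins rather than River's actual API; only the `LeaderDomain` concept itself comes from this PR:

```go
package main

import (
	"fmt"
	"sort"
)

// ClientConfig is a hypothetical, simplified stand-in for a River client's
// configuration; only LeaderDomain corresponds to the field this PR proposes.
type ClientConfig struct {
	LeaderDomain string              // empty means the default, schema-wide leader
	Queues       map[string]struct{} // queues this client works
}

// maintenanceQueues models the behavior described above: with LeaderDomain
// unset, maintenance services cover every queue (a nil return meaning "no
// filter"), preserving the default behavior; with a domain set, they cover
// only the client's own queues.
func maintenanceQueues(cfg ClientConfig) []string {
	if cfg.LeaderDomain == "" {
		return nil
	}
	queues := make([]string, 0, len(cfg.Queues))
	for name := range cfg.Queues {
		queues = append(queues, name)
	}
	sort.Strings(queues)
	return queues
}

func main() {
	defaultClient := ClientConfig{Queues: map[string]struct{}{"email": {}}}
	domainClient := ClientConfig{
		LeaderDomain: "analytics",
		Queues:       map[string]struct{}{"reports": {}},
	}

	fmt.Println(maintenanceQueues(defaultClient)) // []
	fmt.Println(maintenanceQueues(domainClient))  // [reports]
}
```

Under this model, two clients with distinct domains would each elect a leader over their own maintenance scope, while a client with no domain keeps today's schema-wide behavior.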

@brandur brandur requested a review from bgentry December 23, 2025 22:21
@brandur
Contributor Author

brandur commented Dec 23, 2025

@bgentry This works (I think), but still needs some testing added. Wanted to get your general reaction before taking it all the way.

@brandur
Contributor Author

brandur commented Dec 24, 2025

Okay, the tests for this should be in good shape now.

@@ -0,0 +1,3 @@
ALTER TABLE /* TEMPLATE: schema */river_leader
Contributor Author


Since we're adding a migration here, before shipping I'd also go through the existing "needs migration" issues and pull them in.

One nice thing about this migration as currently written is that if you did not run it, it wouldn't be a problem as long as you didn't try to use LeaderDomain. So it's safer than your average migration.

@brandur brandur force-pushed the brandur-leadership-realms branch from 1c32845 to ec7e682 Compare December 25, 2025 19:49
@brandur brandur force-pushed the brandur-leadership-realms branch from ec7e682 to f41afcc Compare December 26, 2025 19:31
// because the default client(s) will infringe on the domains of the
// non-default one(s).
//
// Certain maintenance services that aren't queue-related like the indexer
Contributor


Suggested change
// Certain maintenance services that aren't queue-related like the indexer
// Certain maintenance services that aren't queue-related like the reindexer

Comment on lines +247 to +249
// In general, most River users should not need LeaderDomain, and when
// running multiple Rivers may want to consider using multiple databases and
// multiple schemas instead.
Contributor


I would probably try to get some version of this warning into the first paragraph to dissuade people from using this feature. Could describe it as "an advanced option" or something like that. I'd just like to ensure people don't start using this setting automatically just because it's there, since it has some major footguns.

Comment on lines +229 to +234
// A warning though that River *does not protect against configuration
// mistakes*. If client1 on domain1 is configured for queue_a and queue_b,
// and client2 on domain2 is *also* configured for queue_a and queue_b, then
// both clients may end up running maintenance services on the same queues
// at the same time. It's the caller's responsibility to ensure that doesn't
// happen.
Contributor


Gotta say this definitely feels dangerous. I'm wondering what else can break if we allow for breaking one of the main promises of leader election (there can only be one), including Pro features.
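
Since River leaves this responsibility to the caller, one mitigation would be a pre-flight check at startup that fails fast on overlapping queue sets. A minimal sketch (`queueOverlap` is a hypothetical helper, not a River API):

```go
package main

import "fmt"

// queueOverlap returns the queue names configured on both clients. Returns
// nil when the two sets are disjoint.
func queueOverlap(a, b []string) []string {
	seen := make(map[string]bool, len(a))
	for _, q := range a {
		seen[q] = true
	}
	var overlap []string
	for _, q := range b {
		if seen[q] {
			overlap = append(overlap, q)
		}
	}
	return overlap
}

func main() {
	client1 := []string{"queue_a", "queue_b"} // domain1
	client2 := []string{"queue_b", "queue_c"} // domain2
	if o := queueOverlap(client1, client2); len(o) > 0 {
		// In real startup code this would be a returned error rather than a print.
		fmt.Println("misconfiguration: domains share queues:", o)
	}
}
```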

Comment on lines +907 to +908
// It's important for queuesIncluded to be `nil` in case it's not in use
// for the various driver queries to work correctly.
Contributor


maybe the driver layer should handle nil and []string{} equivalently to avoid needing to deal with that concern at this level?
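
That suggestion could look something like this at the driver boundary. A hedged sketch, with `normalizeQueueFilter` a hypothetical name rather than anything in River:

```go
package main

import "fmt"

// normalizeQueueFilter treats nil and an empty slice equivalently so that
// callers above the driver layer don't have to care which one they pass.
// Returns whether the queue filter is active, plus a slice that is always
// nil when it isn't.
func normalizeQueueFilter(queues []string) (active bool, normalized []string) {
	if len(queues) == 0 {
		return false, nil // nil and []string{} both mean "no queue filter"
	}
	return true, queues
}

func main() {
	for _, in := range [][]string{nil, {}, {"email"}} {
		active, norm := normalizeQueueFilter(in)
		fmt.Println(active, norm)
	}
}
```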

var sub *notifier.Subscription
if e.notifier == nil {
e.Logger.DebugContext(ctx, e.Name+": No notifier configured; starting in poll mode", "client_id", e.config.ClientID)
e.Logger.DebugContext(ctx, e.Name+": Resigned leadership successfully", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


was this log text altered by mistake?

}

e.Logger.DebugContext(ctx, e.Name+": Current leader attempting reelect", "client_id", e.config.ClientID)
e.Logger.InfoContext(ctx, e.Name+": Current leader received forced resignation", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


the text here was also changed, I think accidentally

// another client, even in the event of a cancellation.
func (e *Elector) attemptResignLoop(ctx context.Context) {
e.Logger.DebugContext(ctx, e.Name+": Attempting to resign leadership", "client_id", e.config.ClientID)
e.Logger.InfoContext(ctx, e.Name+": Current leader received forced resignation", "client_id", e.config.ClientID, "domain", e.config.Domain)
Contributor


I don't think this text is necessarily true, there are other reasons this could be called besides a forced resignation.

Comment on lines 93 to +107
-- name: JobDeleteBefore :execresult
DELETE FROM /* TEMPLATE: schema */river_job
WHERE
id IN (
SELECT id
FROM /* TEMPLATE: schema */river_job
WHERE
WHERE id IN (
SELECT id
FROM /* TEMPLATE: schema */river_job
WHERE (
(state = 'cancelled' AND finalized_at < cast(@cancelled_finalized_at_horizon AS text)) OR
(state = 'completed' AND finalized_at < cast(@completed_finalized_at_horizon AS text)) OR
(state = 'discarded' AND finalized_at < cast(@discarded_finalized_at_horizon AS text))
ORDER BY id
LIMIT @max
)
-- This is really awful, but unless the `sqlc.slice` appears as the very
-- last parameter in the query things will fail if it includes more than one
-- element. The sqlc SQLite driver uses position-based placeholders (?1) for
-- most parameters, but unnamed ones with `sqlc.slice` (?), and when
-- positional parameters follow unnamed parameters great confusion is the
-- result. Making sure `sqlc.slice` is last is the only workaround I could
-- find, but it stops working if there are multiple clauses that need a
-- positional placeholder plus `sqlc.slice` like this one (the Postgres
-- driver supports a `queues_included` parameter that I couldn't support
-- here). The non-workaround version is (unfortunately) to never, ever use
-- the sqlc driver for SQLite -- it's not a little buggy, it's off the
-- charts buggy, and there's little interest from the maintainers in fixing
-- any of it. We already started using it though, so plough on.
AND (
cast(@queues_excluded_empty AS boolean)
OR river_job.queue NOT IN (sqlc.slice('queues_excluded'))
);
)
AND (/* TEMPLATE_BEGIN: queues_excluded_clause */ true /* TEMPLATE_END */)
AND (/* TEMPLATE_BEGIN: queues_included_clause */ true /* TEMPLATE_END */)
ORDER BY id
LIMIT @max
);
Contributor


This comment applies to all drivers but I'm leaving it here because it doesn't seem like the pgx driver's JobDeleteBefore was updated in this PR yet. This query is currently targeted at the following index:

"river_job_state_and_finalized_at_index" btree (state, finalized_at) WHERE finalized_at IS NOT NULL

That will break if we're also filtering by queue, and this may turn into a sequential scan.

I haven't checked the other queries you've modified here, but this is an important consideration we'd need to be very careful of for all of them.

Comment on lines +2 to +4
-- Alter `river_leader` to remove check constraint that `name` must be
-- `default`. SQLite doesn't allow schema modifications, so this redefines the
-- table entirely.
Contributor


SQLite doesn't allow schema modifications, so this redefines the table entirely.

🤯

updatedSQL := sql
updatedSQL = replaceTemplate(updatedSQL, templateBeginEndRE)
updatedSQL = replaceTemplate(updatedSQL, templateRE)
updatedSQL = replaceTemplate(updatedSQL, templateBeginEndRE)
Contributor


What's going on with the reordering here?

Comment on lines +243 to +245
// will continue to run on all leaders regardless of domain. If using this
// feature, it's a good idea to configure ReindexerTimeout on all but a
// single leader domain to river.NeverSchedule().


Now that we run many pods of the same worker, is a "leader" here a single pod or a group of workers? If only one pod should be reindexing, what happens when that pod dies? Or would it also be alright to have, say, 3 reindexer pods alongside many more that don't reindex?
