feat: add channel-level echo suppression spec points#433
feat: add channel-level echo suppression spec points#433mattheworiordan wants to merge 3 commits intomainfrom
Conversation
New spec points RTL4n, RTL4n1, RTL4n2 define echo=false as a channel param for per-channel echo suppression. RTL7i and RTL7j add test requirements. Updates TO3h and RTN2b with cross-references. Updates params api-docstring to document the echo channel param.
| ** @(RTL4m)@ On receipt of an @ATTACHED@, the client library should decode the @flags@ into an array of @ChannelMode@ s (that is, the same format as @ChannelOptions.modes@) and expose it as a read-only @modes@ field of the @RealtimeChannel@ (or a @getModes()@ method where that is more idiomatic). This should only contain @ChannelMode@ s: it should not contain flags which are not modes (see "@TB2d@":#TB2d) | ||
| ** @(RTL4n)@ If the @params@ object includes an @echo@ key with value @"false"@, messages originating from this connection must not be echoed back on this channel, regardless of the connection-level @echoMessages@ setting ("@TO3h@":#TO3h). The only valid value for the @echo@ channel param is @"false"@. If the @echo@ param is absent or empty, the connection-level @echoMessages@ setting applies. | ||
| *** @(RTL4n1)@ The @echo@ channel param only affects messages (action @MESSAGE@). Presence messages must always be delivered regardless of the @echo@ setting, consistent with connection-level @echoMessages@ behavior ("@RTN2b@":#RTN2b). | ||
| *** @(RTL4n2)@ The @echo@ channel param is a continuing property: changing it via @setOptions@ ("@RTL16a@":#RTL16a) on an already-attached channel must trigger a new @ATTACH@ to update the server. |
There was a problem hiding this comment.
None of these spec items are needed. This is the SDK spec, but these are all instructing the server how to behave ("messages originating from this connection must not be echoed back on this channel" etc). It isn't the SDK's responsibility to interpret, validate, or act based on the params (indeed that's the whole point of params), its responsibility is just to encode them correctly and send them to the server, which behaviour is specified by RTL4k.
There was a problem hiding this comment.
I've removed the cross-references from RTN2b, TO3h, and the docstring change - those were clearly wrong (leaking channel concerns into connection/docstring domains, thanks for spotting 🙏 )
However, I'd like to keep RTL4n and its sub-points. A few reasons:
- RTL4j2 sets precedent. It describes the expected behavior of the rewind channel param in a test scenario, including what the client should receive when rewind=1 is set. If params were truly opaque to the spec, RTL4j2
wouldn't exist either. - This is end-to-end expected behavior, not server internals. The echo param changes what the end user expects from the library. An SDK setting echo=false as a channel param is making a promise to the developer that their messages won't be echoed. If a future regression corrupts or voids that param, without these spec points and tests, no SDK would detect the breakage.
- The spec is the single source of truth for AI agents building SDKs. We've seen this directly with @paddybyers' recent work using the spec to drive SDK implementations. These assertions describe expected behavior that agents (and humans) need to understand. The slight verbosity is a feature, not a bug.
Happy to discuss trimming the wording, but I don't think removing these entirely is right. RTL4n2 I could see dropping since it's just restating RTL16a behavior for params changes.
There was a problem hiding this comment.
I think that spec points like these new RTL4n ones are not just unnecessary but actively confusing, because as written it's very easy to come away with the impression that the SDK is meant to interpret the channel echo param.
I've seen agents fall into similar traps already when implementing presence — e.g. RTP9b for presence.update() says "If the client was not already entered, it enters this client into this channel". This spec point is describing a behaviour of Realtime — all that the client needs to do is send the UPDATE — but the agent wrote an implementation that looked in the local presence set and, if not there, sent an ENTER before proceeding to the UPDATE.
I agree that it's good to test end-to-end behaviours, and if we don't have a better way of enforcing that in a cross-SDK fashion at the moment than having a "there must be a test that x" spec point (i.e. your new RTL7 subpoints), then let's do that. But I don't think we should be specifying end-to-end behaviours other than to describe what the client needs to do to enable them.
There was a problem hiding this comment.
RTL4j2 sets precedent. It describes the expected behavior of the rewind channel param in a test scenario, including what the client should receive when rewind=1 is set. If params were truly opaque to the spec, RTL4j2
wouldn't exist either.
That doesn't follow. At least one channelparams test is necessary to test that the SDK is correctly actually including and correctly encoding channelparams. But since it's opaque, you shouldn't need to test them all.
without these spec points and tests, no SDK would detect the breakage.
None of our SDK test suites come remotely close to being a comprehensive set of tests for all our serverside behaviour. And that's ok, because that's not their job. Such tests should be in the realtime repo, which they are since you added them in http://github.com/ably/realtime/pull/8179/changes#diff-0d808dcb3738cdcf18633f4ba4e6620a800ed8381cba2b2a90c32dc86bf9f512R1566 .
These assertions describe expected behavior that agents (and humans) need to understand. The slight verbosity is a feature, not a bug.
Sure. You could imagine we could have a spec which does that job, essentially a combination of the sdk spec with all our documentation, and it would be quite useful for those reasons.
But that's just not what the spec is right now, not even close. Unless you're proposing that these items (plus a handful of others, mostly from you) are the first in a series of PRs that will increase the size of the spec by a factor of 5 by moving to this approach more generally, the only thing this does it make the spec a bit more incoherent. As Lawrence says, a handful of items speccing out serverside behaviour in a spec that mostly specs out clientside behaviour just increases confusion, both for humans and agents.
And I actually don't think we should move in that direction, for DRY reasons. I think we should keep the spec describing in specific detail what SDKs should do, and encourage agents (and humans) that need to understand how the system works as a whole to read it in conjunction with our docs, not try and make this spec do both.
| ** @(RTL7e)@ This clause has been deleted (redundant to "RSL6b":#RSL6b). | ||
| ** @(RTL7f)@ A test should exist ensuring published messages are not echoed back to the subscriber when @echoMessages@ is set to false in the @RealtimeClient@ library constructor | ||
| ** @(RTL7i)@ A test should exist ensuring published messages are not echoed back to the subscriber when @echo@ is set to @"false"@ in the channel @params@ ("@RTL4n@":#RTL4n), even when @echoMessages@ is @true@ (default) at the connection level. | ||
| ** @(RTL7j)@ A test should exist ensuring that on a connection with @echoMessages: false@, a channel without the @echo@ param still suppresses echo (inherits connection-level behavior). |
There was a problem hiding this comment.
don't think we should be mandating all SDKs ave a test for a serverside behaviour
There was a problem hiding this comment.
RTL7f already mandates exactly this pattern: "A test should exist ensuring published messages are not echoed back to the subscriber when echoMessages is set to false." That's testing server-side echo suppression, the SDK just sets a query param, the server does the work.
RTL7i/RTL7j follow that exact pattern for the channel-level equivalent. If the principle is that we shouldn't mandate tests for server behaviour, then RTL7f should also be removed (see comments below on this).
These tests exist to catch regressions. If a server update changes echo param handling, or if an SDK regression corrupts the param encoding, these tests are the safety net. That's exactly why we have end-to-end test requirements in the spec. If we want to change it, then let's change it, but not just for this PR.
There was a problem hiding this comment.
The plan is that all statements about what tests should exist will move to the UTS. They shouldn't be in this spec.
textile/features.textile
Outdated
| * @(RTN2)@ The default host used for realtime "websocket":https://ably.com/topic/websockets connections is the "@REC1@":#REC1 primary domain and the following query string params should be used when opening a new connection: | ||
| ** @(RTN2a)@ @format@ should be @msgpack@ (default) or @json@ | ||
| ** @(RTN2b)@ @echo@ should be @true@ if the @echoMessages@ client option is true; else it should be @false@, which will prevent messages published by the client being echoed back | ||
| ** @(RTN2b)@ @echo@ should be @true@ if the @echoMessages@ client option is true; else it should be @false@, which will prevent messages published by the client being echoed back. The connection-level @echo@ setting may be overridden on a per-channel basis using the @echo@ channel param; see "@RTL4n@":#RTL4n |
There was a problem hiding this comment.
Similar. First sentence is an instruction for how the sdk should behave (turn echoMessages into an echo qs param), plus a clause explaining what it will do (so the sdk can test it). The new second sentence is essentially user-facing documentation, it's not actionable for an SDK dev.
textile/api-docstrings.md
Outdated
| | push: PushChannel |||| A [`PushChannel`]{@link PushChannel} object. | | ||
| | modes: readonly [ChannelMode] ||| RTL4m | An array of [`ChannelMode`]{@link ChannelMode} objects. | | ||
| | params: readonly `Dict<String, String>` ||| RTL4k1 | Optional [channel parameters](https://ably.com/docs/realtime/channels/channel-parameters/overview) that configure the behavior of the channel. | | ||
| | params: readonly `Dict<String, String>` ||| RTL4k1 | Optional [channel parameters](https://ably.com/docs/realtime/channels/channel-parameters/overview) that configure the behavior of the channel. The `echo` channel param can be set to `"false"` to suppress messages originating from this connection being echoed back on this channel, regardless of the connection-level `echoMessages` setting. See RTL4n. | |
There was a problem hiding this comment.
This is a user-visible docstring, it links to the user docs which have a list of accepted params, it shouldn't be too wordy or try to list or explain all (or any) of them. It also shouldn't reference a spec item, which is meaningless to a user trying to integrate with the library.
There was a problem hiding this comment.
Agreed, reverted in c4da037, I should have spotted that one 😢
The params docstring is user-facing and should remain generic, linking to the docs for details. It shouldn't call out specific params or reference spec items. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These cross-references leak channel-level concerns into the connection domain. The echo channel param is documented in RTL4n where it belongs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the review @SimonWoolf, I've addressed points above. More broadly though, this review highlights something I think we need to address. With AI agents increasingly contributing to the spec (and using it as source of truth for SDK implementations, as we've seen with Paddy's Right now, conventions are implicit and sometimes inconsistently applied. For example, RTL7f mandates a test for server-side echo suppression (which is the exact pattern RTL7i/RTL7j follow), and RTL4j2 describes expected Simon, you're the second-highest contributor to the spec by lines added (~1,766 additions in the last year), and one of the most technically rigorous reviewers. I'd like to see you drive an To show what I mean, here's a rough draft based on patterns I've extracted from the last ~12 months of PR review feedback. This is just a starting point / example to be thrown away. Could you own and refine it based on your deeper Here's an example of what Claude extracted for an AGENTS.md file: Ably Specification: AI Agent & Contributor GuidelinesThis file provides conventions and guidance for contributing to the Ably features These conventions are derived from patterns observed across PR reviews by core Scope: SDK Spec vs Server BehaviorThis is the SDK specification — it describes what client libraries must do.
Example — good: "If Example — avoid: "The server maintains an internal echo flag per connection Channel ParamsChannel params are passed to the server via RTL4k. The SDK's role is to encode When a channel param has user-observable behavioral implications that SDKs Language and Precision
Cross-References
Redundancy
Test Requirements
API Docstrings (
|
|
Totally agree re having a good |
|
@mattheworiordan fully agree with this 👍, I'm happy to take on, but also very happy for @lawrence-forooghian to drive if he'd like to. |
Summary
Adds specification points for per-channel echo suppression via the
echochannel param:echo=falsesuppresses own messages on a channel regardless of connection-levelechoMessagessetOptionstriggers re-ATTACHWhy
Echo control has been connection-level only (
echoMessages: false) since the beginning. This forces all channels on a connection to share the same echo behavior, which is inadequate for:echoMessages: falseat the connection levelechoMessages: falseat the connection levelThis has been a long-standing discussion. Channel-level control is simple, fully backwards compatible, and removes a common friction point for customers.
Related PRs
Test plan