Domain Driven Design versus ActivityPub

In many places on the web I have been explaining how I prefer to start modeling any application by explicitly following Strategic DDD. (Note that even if you don’t do it explicitly, you still do it implicitly for any software you create.)

Here is an example DDD model showing how two Bounded Contexts relate:

[Image: strategic-ddd-bounded-contexts]

The Fediverse has enormous potential to support social networking for a huge number of use cases, where different distributed services seamlessly integrate in the delivery of functionality / capabilities. Ideally the Fediverse would be like a task-based service-oriented architecture, where one can combine numerous building blocks to model tasks in support of real-world activity.

A fedi with this level of socio-technical support I call a Peopleverse, which is a (technological) vision to work towards (see also: Let’s Reimagine Social).

However, this vision is only within reach if the Fediverse manages to reach much deeper levels of interoperability in its protocol-level communication. This requires careful standardization of the domains that enable different use cases. And the trend is that this isn’t happening at all (see: Major challenges of the Fediverse).

This topic is dedicated to how DDD relates to extensibility mechanisms of the ActivityPub linked data format.

Semantics of AP Vocabulary Extensions

In the Forge Federation chatroom @okno started a discussion on how expressive the AP messages should be for the ForgeFed specification when it comes to “forking a repository”. I gave a very lengthy response with my thoughts on the matter, and below is a copy of the Matrix conversation…

@okno: Also @pere is it planned to add a “Fork” activity to the user activities so a repository can keep track of the existing forks ?

@a: Isn’t it as simple as sending a Create to the original repo, and the original repo lists the fork in an ordered collection?

@fr33domlover: That approach makes sense; I just mean it’s not explicitly in the spec yet. And personally, perhaps due to little experience with forges (ironically), I have questions: Why do repos list and count their forks, what’s the use of that? And how many of those are separate projects, and how many are just personal forks for PRs/collaboration on the main project? Forges don’t seem to care about that difference - why?

Answering these questions also affects the priority of that feature being federated

@Joe: From my experience: 99% of “forks” are really just for working on patches, much less often it’s an actual fork, as in a standalone project. In terms of why: mostly helpful as an ecosystem activity metric, and less often to find a fork if the main project is inactive.

Differentiation between the two would be useful, I think. E.g. sourcehut definitely makes a difference between them, at least in terms of vocabulary; not sure if it’s exposed as UI/API.

@okno: Why not just define a Fork activity? There is no need to use the default activity vocabulary here. Only ForgeFed implementors will create forks, so there is no backward compatibility to maintain here (Mastodon isn’t going to fork your repo). Using the Create activity will work, but this would be very confusing in terms of wording imo.

@circlebuilder: I am personally in favor of having domain models reflect the well-understood activities that occur in the domain. In this case the domain expert is a “Developer”. If you interview them with “What do you do when you want to create a Patch for the Repository?” and they answer “Well, first I fork the Repository to my own account”, that might indicate that “Fork a Repository” is part of the domain and of the common (ubiquitous) language used when talking about it. Specs and impls should use the same terminology, so it follows that there may be a Fork activity. I would not shy away from creating meaningful semantic terminology if it truly reflects the process that is used. It is a different approach than mapping most concepts to CRUD, and may lead to more readable and comprehensible implementation code (avoiding huge if-then-else constructs upon receiving Create). Consider this:

  • Domain expert: “I fork the repository”
  • Interviewer: “Okay, in my specification I turn that into ‘you create a repository that references the IRI of the other repository’.”
  • Domain expert: “Uh yes, I guess so, though we call it ‘forking’.”

@fr33domlover: I see your point, but also (1) Activities aren’t UI elements, do you see a reason that their vocabulary matching domain concepts would be a high priority? (2) Activities represent both notifications and action requests (i.e. RPC), which can be much more low-level than domain-level human-meaningful events (3) some actions can be phrased using different verbs, allowing to converge vocabulary with other, related actions, and some vocabulary terms are relevant beyond forges (e.g. Create{Repo} has a meaning even if you don’t understand what Repo is, you still know it’s a Create) so when I pick a name and try to attend to all these considerations, sometimes a less-than-ideal name might get picked, please remember that

Bottom line: I’m open to a “Fork” activity but Create is the sane safe default; repo creation isn’t in the spec yet so this is a good time to open an issue about it to make sure “Fork” verb is considered when time comes

(You could also open a PR if you’re eager to move this forward, but consider to use Fork/Create Repo for RPC in your implementation first, not just for notification, because the RPC aspect is where the important considerations show up)

Create{Repo} will probably exist either way, for creating repos that aren’t forks, so there needs to be a good reason to add a Fork activity for such a specific thing, instead of using Create{Repo} for this as well

@circlebuilder: fr33domlover: Yes, those are exactly some of the considerations. Note that in my description I did not go as far as saying there should be a Fork activity, but that there might be one. I would say that it makes sense to use Fork if the outcome of the activity is different from that of a Create. With a Create, a wholly independent repository is created. With a Fork, the fork relationship is tracked: the upstream has a collection of forks, and (at least in some forge impls like GH) the downstream repo indicates it is a fork.

(1) Activities aren’t UI elements, do you see a reason that their vocabulary matching domain concepts would be a high priority?

Yes, I do. AP is a Linked Data format, based on the idea of expressing semantics as accurately as possible in a format that is both human- and machine-readable. While in LD circles they focus on ontology design, I personally find domain driven design more intuitive. The semantics express the domain, and they are neither UI-level nor a lower-level wire format that other protocols may express.

In theory you could get away with modeling any domain using just Note and the CRUD-related activities of AS/AP. It would be very unwieldy as only by the combinations of properties that are present can you deduce what objects you deal with, and what domain logic is invoked would be fully implicit.
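As a rough illustration of that difference, here is a minimal Python sketch. It is not taken from the ForgeFed spec: the `forkedFrom` and `repoUrl` properties and the `forge:Fork` type are hypothetical names invented for this example, and `forge:Repo` is just the shorthand used in this conversation.

```python
def handle_crud_only(activity: dict) -> str:
    """CRUD-only model: guess the domain intent from whichever properties are present."""
    obj = activity.get("object", {})
    if activity.get("type") == "Create":
        if obj.get("type") == "Note" and "forkedFrom" in obj:  # hypothetical property
            return "fork"
        if obj.get("type") == "Note" and "repoUrl" in obj:     # hypothetical property
            return "new-repository"
        return "note"
    return "unknown"


# Explicit vocabulary: the (activity type, object type) pair names the domain logic.
HANDLERS = {
    ("forge:Fork", "forge:Repo"): "fork",              # hypothetical Fork activity
    ("as:Create", "forge:Repo"): "new-repository",
    ("as:Create", "as:Note"): "note",
}


def handle_with_vocabulary(activity: dict) -> str:
    """Domain vocabulary: dispatch on explicit terms, no property sniffing."""
    key = (activity.get("type"), activity.get("object", {}).get("type"))
    return HANDLERS.get(key, "unknown")
```

In the first function the domain logic hides in the order and shape of the if-statements; in the second, the table of handlers reads like the domain language itself.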

Some more background on DDD… A CRUD-only model can have value. In DDD analysis of the Context Map (which depicts subdomains / bounded contexts and their relationships) some subdomains may be less complex and/or less important and best modeled with CRUD. But if the entire domain consists of CRUD-like models, it becomes an anti-pattern called an “anemic domain model”. If the CRUD is justified you probably don’t need DDD, which is then overkill. But if the domain is complex and CRUD isn’t justified - i.e. using it obscures meaningful relationships and makes business / domain logic implicit - then DDD is valuable and refactoring is in order.

(2) Activities represent both notifications and action requests (i.e. RPC), which can be much more low-level than domain-level human-meaningful events.

Some tricky bits here. First, instead of “notifications and action requests” I prefer the terminology of Events (something happened) and Commands (I want something to happen). In DDD the Commands aren’t part of the domain, only domain events are. If you have an Onion Architecture with 3 layers, outer-to-inner Infra, Service, and Domain, then the Commands would be created in the Service layer based on invocations from the UI or via external endpoints in the Infra layer. Via the Service layer they would invoke functions on a domain object (called the Aggregate Root, which is the single-point-of-entry encapsulation of a particular domain model / bounded context, thus creating a consistency boundary). The domain object then creates domain Events. These events can then trigger further Commands anywhere in the rest of the application or beyond.

In other words: In a distributed (DDD-based) event-driven architecture only the Events are truly relevant in conveying meaning.
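The Command → Aggregate Root → Event flow can be sketched roughly as follows. This is an illustrative Python sketch only; the class names (`ForkRepository`, `RepositoryForked`, `ForkService`) and the way the fork IRI is built are assumptions made up for the example, not anything ForgeFed defines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ForkRepository:
    """Command: "I want something to happen". Created in the Service layer."""
    actor: str
    upstream: str


@dataclass(frozen=True)
class RepositoryForked:
    """Domain Event: "something happened". Created by the domain object."""
    actor: str
    upstream: str
    fork: str


@dataclass
class Repository:
    """Aggregate Root: the single point of entry and consistency boundary."""
    iri: str
    forks: list[str] = field(default_factory=list)

    def fork(self, actor: str) -> RepositoryForked:
        # Hypothetical IRI scheme, chosen only to keep the example deterministic.
        fork_iri = f"{actor}/forks/{len(self.forks) + 1}"
        self.forks.append(fork_iri)
        return RepositoryForked(actor=actor, upstream=self.iri, fork=fork_iri)


class ForkService:
    """Service layer: turns a Command into a call on the Aggregate Root."""

    def __init__(self, repos: dict[str, Repository]) -> None:
        self.repos = repos

    def handle(self, command: ForkRepository) -> RepositoryForked:
        repo = self.repos[command.upstream]
        return repo.fork(command.actor)
```

Only `RepositoryForked` would need to leave the application; the `ForkRepository` command stays an internal detail of the Service layer.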

However, this is not how AP is spec’ed. It is often not clear if something is a Command or an Event, or it might be both depending on context. Furthermore AP presents itself as an actor model based spec. Actors in general receive msgs that are Commands. What makes things even more confusing is that the spec covers both C2S and S2S. I haven’t really figured out fully how to deal with this. But maybe (relating things to the architecture I want to use) I can still interpret each AP message more or less as an Event.

Consider this: A C2S client where I Like someone’s toot. As soon as I click “like” in the UI the activity has happened. When handled in the Domain layer a Liked domain event would fire, and via the Infra layer a Like{Note} would be sent. This AP message is an Event, not a Command. It only needs to be handled by anyone interested in my action, until eventual consistency is reached and every party handled my event. In this process numerous S2S transactions might have happened, but they are all consolidating my Event. The event reaches Actors (who treat it as a Command) and do some follow-up action. Conceptually I still have an Event Driven Architecture.
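A small Python sketch of this Like flow, under stated assumptions: the `Liked` dataclass and the `Inbox` class are hypothetical names for this example, and the receiving side is reduced to an in-memory set. Only the message shape (a `Like` activity with `actor` and `object` in the ActivityStreams context) comes from the AS vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Liked:
    """Domain event: the like has already happened on the sender's side."""
    actor: str
    note: str


def to_like_activity(event: Liked) -> dict:
    """Infra layer: translate the domain event into a Like{Note} message."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Like",
        "actor": event.actor,
        "object": event.note,
    }


class Inbox:
    """A receiving actor that decides for itself what follow-up a Like triggers."""

    def __init__(self) -> None:
        self.likes: dict[str, set[str]] = {}

    def receive(self, activity: dict) -> None:
        if activity.get("type") == "Like":
            # A set makes redelivery harmless: the same event counts only once.
            self.likes.setdefault(activity["object"], set()).add(activity["actor"])
```

Because the receiver stores likes in a set, delivering the same event twice on the way to eventual consistency does not change the outcome.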

(I am sorry that I am so lengthy in my response, but I wanna explain the way I am thinking of this).

Now on to RPC… First, using the term “RPC” can lead to confusion, as it is an umbrella term which covers both pure message transfer as well as API invocations remotely. I feel that the latter should be avoided on the Fediverse as it will tie all communications rigidly to remote API contracts, which are constantly in flux as apps / domain evolve. It would make my architecture dependent on remote domains in ways that are limiting the flexibility of my application.

OTOH - and I feel your struggle here - you should know enough about remote domains to know what business / domain logic is taking place here. Still I think that with just Events traveling the wire (which in some contexts are processed as Commands) I have enough information to do so appropriately → And here expressing the rich semantics of the domain is vital.

What are Events (and Commands) from a technical perspective? They are like structs with properties, and when sent over the wire as Linked Data they are JSON-LD formatted AP messages (which use namespaces to indicate particular vocabulary extensions / domain models).
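To make the "struct with properties" idea concrete, here is a hedged Python sketch that serializes an event to a JSON-LD style message. The `forge` prefix IRI below is a placeholder rather than the real ForgeFed namespace, and `forge:Fork` is a hypothetical activity type; `actor`, `object` and `result` are existing ActivityStreams properties.

```python
import json
from dataclasses import dataclass

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
FORGE_NS = "https://forge.example/ns#"  # placeholder IRI, NOT the real ForgeFed namespace


@dataclass(frozen=True)
class RepositoryForked:
    """The event as a plain struct inside the application."""
    actor: str
    upstream: str
    fork: str


def to_json_ld(event: RepositoryForked) -> str:
    """Serialize the event as a JSON-LD message that names its vocabularies."""
    message = {
        # The second context entry declares the "forge" prefix used in "type".
        "@context": [AS_CONTEXT, {"forge": FORGE_NS}],
        "type": "forge:Fork",  # hypothetical activity type
        "actor": event.actor,
        "object": event.upstream,
        "result": event.fork,
    }
    return json.dumps(message, indent=2)
```

The `@context` array is where the namespaces show up: the first entry pulls in the AS vocabulary, the second declares the extension the message relies on.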

I am still not up-to-speed on OCaps and Cap’n Proto. But my gut feeling is that if a remote endpoint sends me their API contract to invoke, which I must then adhere to and write logic for, that would severely limit the flexibility of the Fediverse in contrast to a pure event-based message exchange. But maybe I’m mistaken in this perception. If the API contract accurately represents a domain interface (the aforementioned Aggregate Root object in a domain model) then it would teach me about the remote domain, same as I would have to learn about the remote domain for handling domain Events coming from it.

But these API contracts MUST be part of a very stable specification to be usable across the fedi. For ForgeFed you can enforce that by making them part of the standard. But on the whole the fedi currently doesn’t do standards well. If a Fediverse vNext had API contracts as part of its communication mechanism, there’d be a wild proliferation of ad-hoc API flavours. Some would accurately reflect a domain, but many likely would not, simply because most developers have a different notion of APIs and will model what they need in an Infra layer, not a domain interface.

(3) some actions can be phrased using different verbs, allowing to converge vocabulary with other, related actions, and some vocabulary terms are relevant beyond forges (e.g. Create{Repo} has a meaning even if you don’t understand what Repo is, you still know it’s a Create) so when I pick a name and try to attend to all these considerations, sometimes a less-than-ideal name might get picked, please remember that

In DDD vocabulary terms are meaningful within a particular Bounded Context. The bounded context is a consistency boundary. Some terminology may still have its proper meaning beyond the context, or it may be that it should be translated into a slightly different concept that is meaningful in another context. E.g. Product is a different concept in a Webshop context than it is in a Shipping context. The only related property in both contexts might be the ProductID, while all other properties are different.
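The Product example can be sketched in a few lines of Python. The field names (`title`, `price_cents`, `weight_grams`, `fragile`) are assumptions chosen for illustration; the point is only that the two contexts share the ID and nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebshopProduct:
    """Product as the Webshop context understands it."""
    product_id: str
    title: str
    price_cents: int


@dataclass(frozen=True)
class ShippingProduct:
    """Product as the Shipping context understands it."""
    product_id: str
    weight_grams: int
    fragile: bool


def to_shipping(product: WebshopProduct, weight_grams: int, fragile: bool) -> ShippingProduct:
    """Translate across the context boundary; only the ID carries over."""
    return ShippingProduct(
        product_id=product.product_id,
        weight_grams=weight_grams,
        fragile=fragile,
    )
```

The explicit translation function is the place where one context's term is mapped to the other's, instead of both sharing one oversized Product type.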

In Linked Data / AP we have namespaces to indicate to which bounded context some object or property belongs. In AP messages we can combine many different namespaces / contexts in a single piece of information that is transferred. These namespaces are very important, and their importance is underestimated. Too many fedi apps just look at the ActivityStreams vocabulary and try to cram their domain using the primitives defined here. Some primitives of AS are generally reusable, like the CRUD-like activities. Others are more specific to a Social Media / Microblogging domain (e.g. Like). It is okay to use them wherever possible, but you should take into account the domain to which they apply, i.e. the namespace is very relevant.

I see AS as kind of a “base library” with the most common primitives for social networking applications. Numerous vocabulary extensions can be defined which, each in their own namespace, define building blocks for more specific types of applications (i.e. in different domains).

In the text above where you wrote Create{Repo} you used a shorthand form omitting the namespaces, which may lead to confusion. What you really wanted to convey was as:Create{forge:Repo}. In other words you must understand the ForgeFed domain to be able to handle this message properly, or there would be a fallback mechanism in the message type of e.g.

{
    "type": ["as:Note", "forge:Repo"],
    ...
}

So that Mastodon could say “I don’t understand forge vocab, but this is just a Note so I’ll display it as such.”
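A minimal Python sketch of such a fallback on the receiving side. This is an assumption about how a receiver might handle a multi-valued `type`, not a description of what Mastodon actually does; `KNOWN_TYPES` and `pick_renderer` are hypothetical names.

```python
# Vocabulary this hypothetical microblogging server understands.
KNOWN_TYPES = {"as:Note"}


def pick_renderer(obj: dict) -> str:
    """Return the first type the server understands, or "unsupported"."""
    types = obj.get("type", [])
    if isinstance(types, str):  # "type" may be a single string or a list
        types = [types]
    for candidate in types:
        if candidate in KNOWN_TYPES:
            return candidate
    return "unsupported"
```

For the message above, `pick_renderer({"type": ["as:Note", "forge:Repo"]})` falls back to `"as:Note"`, while an object typed only as `forge:Repo` is reported as unsupported.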


Long response.

  • TODO: Gain further insight into the relationship between DDD and AP vocab design.
  • TODO: Add issue to ForgeFed regarding Fork vs. Create when it comes to forking a repo.