My current goals for ActivityPub and academic data

I build and run Encyclia, a project that bridges ORCID records into the ActivityPub-based fediverse. The idea is to make ORCID records followable, so that when a researcher publishes a new work, that information can appear in your social inbox.

Independently of that, I have a personal website where, among other things, I host my own academic publications. I want to make my website ActivityPub-interactable as well in the future, but that is a lower-priority project.

I am not closely involved with Bonfire, but they are building Open Science Network (OSN), a social platform for academic work and exchange. For a more detailed summary of what they do, please ask them rather than me, but I have written an entry in the Encyclia FAQ on why our projects are different.

With two active implementations working on the federated exchange of academic data, we might think about ways in which it makes sense to augment our data structures beyond the established patterns of microblogging and generic long-form text to add value for academics.

To start us out, in this post I’m going to talk about the goals I want to aim for right now.

The goals I want to aim for right now

If you’re reading this, there’s a high chance you haven’t seen Encyclia in action, since its alpha test hasn’t yet started at time of writing. But the idea, as described above, is fairly simple if you know ORCID. People add metadata about their new publications to their ORCID record (or academic publishers add it for them), Encyclia queries the ORCID API for new works, grabs the metadata, formats it as a human-readable short social post (an ActivityPub Note), and publishes it. Each post’s content typically consists of title, authors, abstract, and identifiers (URL, DOI, ISBN, …), although this data is frequently only partially present in the ORCID record.
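To make the formatting step concrete, here is a rough Python sketch. It is not Encyclia's actual code: the shape of the `work` dictionary, the function name `work_to_note`, and the actor URL are all assumptions made up for illustration, and the ORCID API's real response format is not modeled. Only the Note properties (`type`, `attributedTo`, `content`) come from ActivityStreams.

```python
from html import escape


def work_to_note(work: dict, actor_id: str) -> dict:
    """Format partially filled work metadata as an ActivityPub Note.

    `work` uses hypothetical keys (title, authors, abstract, identifiers);
    they are not the field names of the ORCID API.
    """
    parts = []
    # Every field is optional, because ORCID records are often incomplete.
    if work.get("title"):
        parts.append(f"<p><strong>{escape(work['title'])}</strong></p>")
    if work.get("authors"):
        parts.append(f"<p>{escape(', '.join(work['authors']))}</p>")
    if work.get("abstract"):
        parts.append(f"<p>{escape(work['abstract'])}</p>")
    for kind, value in work.get("identifiers", {}).items():
        parts.append(f"<p>{escape(kind.upper())}: {escape(value)}</p>")
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Note",
        "attributedTo": actor_id,
        "content": "".join(parts),
    }


# Placeholder actor URL and DOI, for illustration only.
note = work_to_note(
    {"title": "An Example Work", "identifiers": {"doi": "10.1234/example"}},
    "https://encyclia.example/actors/0000-0000-0000-0000",
)
print(note["content"])
```

Fields that are missing from the record are simply left out of the post, which matches the "frequently only partially present" situation described above.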

Independently of Encyclia, OSN allows users to browse someone else’s ORCID record (if they have added their ORCID iD to their profile) and check out their individual publications.

I would like to come up with a way to add a machine-readable declaration to each ORCID post saying “this is a metadata summary of this specific academic artifact”. Most federated software won’t care, but it would allow OSN to treat Encyclia posts differently from human-created posts about the same article, without needing to special-case domains or check NodeInfo for the software name.
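Purely as a conversation starter, such a declaration could look something like the sketch below. Everything specific in it is an assumption: the `https://example.org/ns/academic#` namespace, the `academic:summarizes` and `academic:identifiers` terms, and the helper names are placeholders I invented, not an existing or agreed vocabulary.

```python
from typing import Optional

# Hypothetical namespace and term names; nothing here is an agreed vocabulary.
ACADEMIC_NS = "https://example.org/ns/academic#"


def declare_summary(note: dict, identifiers: dict) -> dict:
    """Return a copy of `note` that declares which work it summarizes."""
    declared = dict(note)
    # For simplicity this replaces any existing @context instead of merging.
    declared["@context"] = [
        "https://www.w3.org/ns/activitystreams",
        {"academic": ACADEMIC_NS},
    ]
    declared["academic:summarizes"] = {
        "type": "academic:Work",
        "academic:identifiers": dict(identifiers),
    }
    return declared


def summarized_work(obj: dict) -> Optional[dict]:
    """Return the declared work, or None for ordinary posts."""
    return obj.get("academic:summarizes")


plain = {"type": "Note", "content": "Just read a great paper!"}
tagged = declare_summary(plain, {"doi": "10.1234/example"})
print(summarized_work(plain))   # None
print(summarized_work(tagged))
```

Software that does not know the extra term would still see a normal Note, while a consumer such as OSN could branch on the presence of the property rather than on the sending domain or software name.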

As for how OSN might like to treat Encyclia posts differently, I can imagine that it would be hugely useful to somehow cross-reference independent posts discussing the same academic article. You can kinda get at that by scraping post contents for DOIs, but not every academic artifact has a DOI, and a declarative approach would allow for clearer and more intent-based labeling. I also think it could make sense depending on the context to separate automated posts about a specific work from human-authored posts, although that can already be done by checking the ActivityPub actor types (Encyclia accounts are Services).
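For comparison, the two things that already work today (scraping DOIs out of post text and checking the actor type) can be sketched as follows. The regular expression is a deliberately rough assumption of mine, not a DOI validator, and the function names are made up.

```python
import re

# Rough pattern for DOI-like strings; it will miss unusual DOIs
# and is not a validator.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text: str) -> set:
    """Scrape DOIs from post text, trimming trailing sentence punctuation."""
    # DOIs are case-insensitive, so lowercase them for comparison.
    return {m.rstrip(".,;:)").lower() for m in DOI_PATTERN.findall(text)}


def is_automated(actor: dict) -> bool:
    """Treat ActivityPub actors of type Service as automated accounts."""
    actor_type = actor.get("type")
    types = actor_type if isinstance(actor_type, list) else [actor_type]
    return "Service" in types


post = "See https://doi.org/10.1234/Example.5, which is great."
print(extract_dois(post))
print(is_automated({"type": "Service"}), is_automated({"type": "Person"}))
```

The scraping half is exactly the fragile part: it depends on punctuation heuristics and finds nothing for works without a DOI, which is the gap a declarative label would close.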

Thinking slightly beyond my immediate needs, it becomes apparent that this is a special case of declarative academic citation. Encyclia posts essentially cite one academic work (in a specific way), and OSN might be interested in a more generalized schema for representing academic citations. That is, for any published ActivityPub object, a way to attach a list of references to other existing works that it cites. For example, when I finish the ActivityPub support for my website, someone could load up one of my articles in OSN, read the full text, or look at the extracted list of references, get information about individual ones, see richly rendered links and updated info that might not have been available at the time I wrote the article’s bibliography, search for other references to the same works, etc.
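To illustrate what "search for other references to the same works" could mean in practice, here is a small sketch assuming each object carries a list of references under a hypothetical `academic:references` property. The property name, the shape of a reference, and the helper names are all placeholders, not a proposal for the final schema.

```python
from collections import defaultdict

# Hypothetical property name; no such term is standardized for ActivityPub.
REFERENCES = "academic:references"


def work_keys(reference: dict) -> set:
    """Normalize a reference's identifiers into comparable (scheme, value) keys."""
    keys = set()
    for scheme, value in reference.get("identifiers", {}).items():
        scheme = scheme.lower()
        value = str(value).strip()
        if scheme == "doi":
            value = value.lower()  # DOIs are case-insensitive
        keys.add((scheme, value))
    return keys


def index_by_work(objects: list) -> dict:
    """Map each identifier key to the ids of the objects that reference it."""
    index = defaultdict(set)
    for obj in objects:
        for reference in obj.get(REFERENCES, []):
            for key in work_keys(reference):
                index[key].add(obj["id"])
    return dict(index)


# Placeholder ids and identifiers, for illustration only.
article = {
    "id": "https://example.org/articles/1",
    "type": "Article",
    REFERENCES: [{"identifiers": {"doi": "10.1234/Example"}}],
}
summary = {
    "id": "https://encyclia.example/notes/1",
    "type": "Note",
    REFERENCES: [{"identifiers": {"DOI": "10.1234/example"}}],
}
index = index_by_work([article, summary])
print(index[("doi", "10.1234/example")])
```

Because the references are structured, matching does not depend on how either post happens to write the DOI in its text, and the same approach extends to ISBNs or URLs for works that have no DOI.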

Look, I’m not in charge of OSN’s features, but the ACM DL shows an extracted list of references almost right below the article’s abstract. Clearly this is something that researchers want. I also remember that OSN wants to offer pathways for academic co-creation, not just consumption and discussion, so a structured way to add academic references to posts would probably be an important component of that.

So, do we want to talk about how to introduce a linked data schema for academic citations in ActivityPub objects?
