My current goals for ActivityPub and academic data

I build and run Encyclia, a project that bridges ORCID records into the ActivityPub-based fediverse. The idea is to make ORCID records followable, so that when a researcher publishes a new work, that information can appear in your social inbox.

Independently of that, I have a personal website where, among other things, I host my own academic publications. I want to make my website ActivityPub-interactable as well in the future, but it is a lower priority project.

I am not closely involved with Bonfire, but they are building Open Science Network, a social platform for academic work and exchange. Please get a more detailed summary of what they do from them instead of me, but I have written an entry in the Encyclia FAQ on why our projects are different.

With two active implementations working on the federated exchange of academic data, we might think about where it makes sense to augment our data structures beyond the established patterns of microblogging and generic long-form text, to add value for academics.

To start us out, in this post I’m going to talk about the goals I want to aim for right now.

The goals I want to aim for right now

If you’re reading this, there’s a high chance you haven’t seen Encyclia in action, since its alpha test hasn’t yet started at time of writing. But the idea, as described above, is fairly simple if you know ORCID. People add metadata about their new publications to their ORCID record (or academic publishers add it for them), Encyclia queries the ORCID API for new works, grabs the metadata, formats it as a human-readable short social post (an ActivityPub Note), and publishes it. Each post’s content typically consists of title, authors, abstract, and identifiers (URL, DOI, ISBN, …), although this data is frequently only partially present in the ORCID record.
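
To make the formatting step concrete, here is a minimal Python sketch. It is not Encyclia's actual code: the `work_to_note` function, the shape of the `work` dict, and the example actor URL are assumptions made for illustration, and the ORCID API query itself is left out. It mainly shows how a post can still be built when the record is only partially filled in.

```python
import html

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def work_to_note(work: dict, actor_id: str) -> dict:
    """Format (possibly partial) work metadata as an ActivityPub Note.

    `work` is a hypothetical, already-parsed dict with optional keys
    "title", "authors", "abstract" and "identifiers".
    """
    parts = []
    if work.get("title"):
        parts.append(f"<p><strong>{html.escape(work['title'])}</strong></p>")
    if work.get("authors"):
        parts.append(f"<p>{html.escape(', '.join(work['authors']))}</p>")
    if work.get("abstract"):
        parts.append(f"<p>{html.escape(work['abstract'])}</p>")
    # Identifiers (URL, DOI, ISBN, ...) are rendered as "KIND: value" lines.
    ids = [
        f"{html.escape(kind.upper())}: {html.escape(value)}"
        for kind, value in work.get("identifiers", {}).items()
    ]
    if ids:
        parts.append("<p>" + "<br>".join(ids) + "</p>")
    return {
        "@context": AS_CONTEXT,
        "type": "Note",
        "attributedTo": actor_id,
        "content": "".join(parts),
    }


# Usage: a record with a title and a DOI but no authors or abstract.
example = work_to_note(
    {"title": "Cats & Dogs", "identifiers": {"doi": "10.1234/abc"}},
    "https://example.org/actors/0000-0000-0000-0000",
)
```

Missing fields are simply skipped, so the resulting Note only contains what the ORCID record actually provides.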

Independently of Encyclia, OSN allows users to browse someone else’s ORCID record (if they have added their ORCID iD to their profile) and check out their individual publications.

I would like to come up with a way to add a machine-readable declaration to each ORCID post saying “this is a metadata summary of this specific academic artifact”. Most federated software won’t care, but it would allow OSN to treat Encyclia posts differently from human-created posts about the same article, without needing to special-case domains or check NodeInfo for the software name.
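
As a rough illustration of what such a declaration could look like, here is a Python sketch. Everything specific in it is hypothetical: the `https://example.org/ns#` namespace, the `metadataSummaryOf` term, and both helper functions are invented for this example and are not part of ActivityPub or any existing vocabulary.

```python
from typing import Optional

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"

# Hypothetical extension context: neither the namespace nor the term exists.
# "@type": "@id" tells JSON-LD processors that the value is an IRI.
EX_CONTEXT = {
    "ex": "https://example.org/ns#",
    "metadataSummaryOf": {"@id": "ex:metadataSummaryOf", "@type": "@id"},
}


def declare_summary(note: dict, work_iri: str) -> dict:
    """Return a copy of `note` declaring that it summarizes `work_iri`."""
    declared = dict(note)
    declared["@context"] = [AS_CONTEXT, EX_CONTEXT]
    declared["metadataSummaryOf"] = work_iri
    return declared


def summarized_work(obj: dict) -> Optional[str]:
    """Consumer side: return the summarized work's IRI, or None if absent."""
    value = obj.get("metadataSummaryOf")
    return value if isinstance(value, str) else None


# Usage: the original note is left untouched; only the copy is declared.
plain = {"@context": AS_CONTEXT, "type": "Note", "content": "<p>New work</p>"}
declared = declare_summary(plain, "https://doi.org/10.1234/abc")
```

Software that does not know the extra term just sees an ordinary Note, while a consumer like OSN could call something like `summarized_work` without looking at domains or NodeInfo.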

As for how OSN might like to treat Encyclia posts differently, I can imagine that it would be hugely useful to somehow cross-reference independent posts discussing the same academic article. You can kinda get at that by scraping post contents for DOIs, but not every academic artifact has a DOI, and a declarative approach would allow for clearer and more intent-based labeling. I also think it could make sense depending on the context to separate automated posts about a specific work from human-authored posts, although that can already be done by checking the ActivityPub actor types (Encyclia accounts are Services).
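
For comparison, the two approaches that already work today can be sketched in Python as follows. The DOI pattern is a heuristic that will miss some valid DOIs and may truncate ones that end in punctuation; the function names and sample data are assumptions for illustration.

```python
import re

# Heuristic pattern for modern DOIs; not a complete grammar.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_dois(content: str) -> set[str]:
    """Scrape DOIs from a post's HTML content (link targets included)."""
    found = set()
    for match in DOI_PATTERN.findall(content):
        # Drop trailing sentence punctuation and normalize case,
        # since DOIs compare case-insensitively.
        found.add(match.rstrip(".,;:)").lower())
    return found


def is_automated(actor: dict) -> bool:
    """Treat actors of type Service as automated accounts."""
    types = actor.get("type", [])
    if isinstance(types, str):
        types = [types]
    return "Service" in types


# Usage with made-up data.
post = '<p>See <a href="https://doi.org/10.1234/ABC.def">this</a> and 10.5555/xyz-1.</p>'
dois = extract_dois(post)
bot = is_automated({"type": "Service", "id": "https://example.org/actors/1"})
```

The scraping half is exactly the part that breaks down for artifacts without a DOI, which is why a declarative label looks more attractive to me.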

Thinking slightly beyond my immediate needs, it becomes apparent that this is a special case of declarative academic citation. Encyclia posts essentially cite one academic work (in a specific way), and OSN might be interested in a more generalized schema for representing academic citations. That is, for any published ActivityPub object, a way to attach a list of references to other existing works that it cites. For example, when I finish the ActivityPub support for my website, someone could load up one of my articles in OSN, read the full text, or look at the extracted list of references, get information about individual ones, see richly rendered links and updated info that might not have been available at the time I wrote the article’s bibliography, search for other references to the same works, etc.
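
Here is a small Python sketch of how a consumer might use such reference lists to find other objects citing the same work. The `references` property and the `index_by_cited_work` helper are hypothetical; no such property is defined for ActivityPub objects today, and the sample objects are made up.

```python
from collections import defaultdict


def index_by_cited_work(objects: list[dict]) -> dict[str, list[str]]:
    """Map each cited work's IRI to the ids of the objects that cite it.

    Each object may carry a hypothetical "references" list whose entries
    are either plain IRIs or dicts with an "id" key.
    """
    index = defaultdict(list)
    for obj in objects:
        for ref in obj.get("references", []):
            work = ref.get("id") if isinstance(ref, dict) else ref
            if work and obj.get("id"):
                index[work].append(obj["id"])
    return dict(index)


# Usage: an article and a short post that cite the same work.
article = {
    "id": "https://example.org/articles/1",
    "type": "Article",
    "references": [
        {"id": "https://doi.org/10.1234/abc", "name": "Some cited work"},
        "https://doi.org/10.5555/xyz",
    ],
}
note = {
    "id": "https://example.org/notes/2",
    "type": "Note",
    "references": ["https://doi.org/10.1234/abc"],
}
index = index_by_cited_work([article, note])
```

Because the references are keyed by IRI rather than scraped from text, this would work for any artifact that has a stable identifier, with or without a DOI.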

Look, I’m not in charge of OSN’s features, but the ACM DL shows an extracted list of references almost right below the article’s abstract. Clearly this is something that researchers want. I also remember that OSN wants to offer pathways for academic co-creation, not just consumption and discussion, so a structured way to add academic references to posts would probably be an important component of that.

So, do we want to talk about how to introduce a linked data schema for academic citations in ActivityPub objects?

Social coding commons invites anyone interested to explore Open Science opportunities on the fediverse to join the discussion at Common social groundwork on Matrix, the technical chatroom of Social coding commons. The forum topics here are for more substantive posts, based on findings from the chat.

I want to add a more generic post with some points, and you can gauge for yourself how relevant they are to your project. For the Social web, and for the vision of the Decentralized Web in general, a “web of standards” approach is best practice for laying solid groundwork: the technological foundations upon which a peopleverse can stand.

Web of standards is a topic of interest to the #commons-cocreation:forginggrounds :o: Commons collective, in particular for elaborating further on the #sx:open-social-stack and on maturing processes for Grassroots standards evolution.

Recently I bumped into Fedora: Open-source repository for long-term digital preservation (this Fedora is unrelated to, and predates, the Red Hat one), a Java project that focuses particularly on open standards support. I think @sneakers-the-rat is doing work in similar fields here. Not sure about @mayel and @ivanminutillo re: Bonfire.

Some product highlights are:

  • Native Linked Data Support: Integrate your data with other semantic web data sources, such as WikiData, VIAF, LinkedGeoData, and many more.

  • Standards Based Services: Interact with the application using internationally recognized standards.

For their open standards support they mention some examples:

Fedora is just one single application. I maintain delightful-linked-data and delightful-open-science, and the projects listed there use a ton of open standards. Which ones are popular, and what are the considerations for adoption?

:information_source: Attention reader

I am seeking co-curators for both curated lists, especially for the open-science list, as I don’t use that list myself and the quality of its entries is of greater importance. The splendid list as it stands was created by @venema (fediverse: @VictorVenema and @OpenScienceFeed). If there are volunteers, please post in #commons-cocreation:delightful-commons or in the Codeberg repository issue tracker.

:people_hugging: Technology alignment: Open science standards stack

I noticed that most of those interested in the intersection of open science and the academic world with the open social web (Commons based social stack in SX terminology) prefer to conduct their research on their own and do mostly solitary work. This carries a high risk of divergence, of creating a patchwork and splinterverse, and of an ever more protocol-decayed, high-maintenance future fediverse.

The ScienceFed discovery :o: Commons collective is meant to be a loose alliance, where at least the bare minimum of coordination can take place to mitigate this challenge, which is common to any chaotic grassroots environment. One of the key objectives of Social coding commons is to facilitate collaboration and cocreation at scale, so that in the course of its operations, the members of the collective are stimulated to find mutual benefits, explore synergies, and form closer collaborative arrangements on that basis. :magnet:

Cohesion is among the emergent forces to be stimulated, and so is :zap: Attraction of newcomers to the commons collective.

With regard to Grassroots standardization, a kick-off of the collection might be an investigation of the techstacks that the various fedi-related open science projects are using, and of where there are :sparkles: Opportunities for technology alignment, in particular where there are open standard candidates.

Could creating a wiki post to collect the input from all of you be a starting point?

:question: What activities and services would be most helpful to you?

We are here on the basis of Hedonic commons based peer production, and the pursuit of self-interested motives is fine, since it provides participants the most intrinsic motivation to proactively collaborate.

What other topics come to mind where the FediScience discovery commons collective can help you reach what you are working to establish? How can participation in Social coding commons contribute to satisfying your needs?