Weekly Guild Reports

CSDUMMI · April 2, 2023, 10:53am

In this topic the editors (@CSDUMMI and @Tomat0) will publish brief reports about the activity of the guild.

Anything everyone should know must be sent to the editors to be included in this report.

CSDUMMI · June 5, 2023, 6:45pm

June 5. 2023 - Bookwyrm Proposal Accepted

Last week we gathered three proposals and held a vote between them.

The Bookwyrm Migration proposal has won the vote by a 4/5 majority on the first preference.

The form’s results have been published here.

Because this project will be contributing to the Bookwyrm project, we’ll be using the Anti-Capitalist License although the AGPL 3.0 License has won the second question.

What’s next?

We have forked Bookwyrm to our Codeberg organisation. If you send your Codeberg username on the Matrix, we’ll add you to this organisation.

These are the current questions we need to answer to work on the Bookwyrm project:

Identify the structure of the codebase (where are the frontend, backend, database schema, etc.?). Collect this information in the wiki.
What data needs to be included in an account archive? See this issue
Reach out to the Bookwyrm project. Their Matrix room is this.
How do we set up a test environment? Who wants to set up a test environment? See discussion on Matrix
Should we use ActivityPub as the format for archives?

Everyone should pick whichever question or task they most want to work on, and we’ll figure out later what to do about the questions that nobody picks up.

For quick communications, use the Matrix room.
If you have a large discussion on a technical topic, open an issue in the Codeberg repository.

CSDUMMI · June 13, 2023, 8:07pm

June 12. 2023 - Code Review

This week we began reviewing the Bookwyrm source code. The results of this review are being collected on our wiki in the form of an FAQ. Thanks to Valery Briz, @tomat0 and @Ryuno-Ki for working on this.

@dannymate has created two Ansible playbooks to set up an environment and a Bookwyrm instance. These should make it easier to create test instances.

We also created and discussed several issues to organize our work.

What should be the goal of our work?

This has by now crystallized into roughly three possible options for us:

We can base our work on @hugh’s. This means we’ll use plain JSON in the archive, add the media in the form of a tar or zip to the archive, and create an activity
We can follow Mastodon’s approach and use ActivityPub objects (actor, inbox, outbox, likes, followers, following) in the archive, together with media attachments. This would be a more limited archive, because not all Bookwyrm activity is actually serialized as ActivityPub yet.
We can make Bookwyrm Groups federated. This would allow groups to be included and permit Group membership across instances.

Which option do you prefer? As far as I can tell right now, all three require roughly equal effort.

We have created this poll on the question.

– @CSDUMMI
– @tomat0

Tomat0 · June 19, 2023, 3:45pm

June 19. 2023 - Test Environment Has Been Set Up

During the previous week, we have decided how we will implement the account migration and successfully set up the test environment for our work. There are now two staging instances of Bookwyrm (linked below) which can be used to import/export data to and from each other.

Decision 1

In our first proper, in-project decision, we had the choice between:

Build based on hugh’s work, debug their code and improve it.
Use ActivityPub as archive. This would be somewhat compatible with other ActivityPub software.
Federate Bookwyrm groups. Which is a prerequisite for adding them to the archive.

We decided to base our work on hugh’s development. For this we have now set up staging instances:

Staging Instances & Invite Links

For the next twenty-four hours, these invite links will be valid to generate an unlimited number of test accounts on both instances. Afterwards, you can contact an admin (either @Tomat0 or @CSDUMMI) to generate a new invite link if need be.

While testing with this new instance, @dannymate and @CSDUMMI found and analyzed a bug in the exporting function created by hugh. The issue can be found here.

What’s next?

We need to solve issue #11 (Export/Import - User-Export Download Errors when User has no Avatar) with a PR on Codeberg.
Between the two staging instances, we need to test hugh’s current import/export functionality and come up with a range of test-cases to identify problems.
@Tomat0 and @CSDUMMI have written a wiki page on how to set up a staging instance of Bookwyrm yourself.

Tomat0 · June 26, 2023, 7:06pm

June 26. 2023 – Integrating Existing Code

What did we accomplish?

Having decided to base our work on Hugh’s code, we started using the staging instances to test, debug, and fix it.

One of the bugs fixed by @dannymate involved a case in which import/export would fail for accounts without an avatar.
@Ryuno-Ki began drafting the documentation on the Wiki for “Views”, explaining the various Django views Bookwyrm makes use of and their purpose.
Hugh has created a PR which adds import functionality for readthroughs, saved lists, and follows. This code likely has bugs that need to be tested and ironed out.

What’s next?

The steps we should aim to tackle in the coming week should include:

As discussed in Issue #1, we have decided upon using a tar format for the importer which will allow for all relevant files and images to be transferred, even when the exporting instance is offline. It is a top priority that we begin work on creating this tar format for use.
As per Issue #3, the code currently performs import/export directly in the request-response loop, which risks timeouts, especially on larger accounts. We should instead implement task scheduling (via Celery) for the import/export functionality.
The PR created by Hugh should be tested and reviewed for any bugs or missing functionality. This can be done on the staging instances.
Continue work on the documentation, hopefully finishing up the page related to Django Views. We should also begin drafting documentation pertaining to Celery and Bookwyrm’s usage of it.
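
To make the Issue #3 item a bit more concrete, here is a minimal sketch of the pattern: the request handler only enqueues the export and returns a job id, and the slow work happens elsewhere. Celery itself is deliberately not used so that the sketch runs with the standard library alone; a thread pool stands in for the worker. All names (`build_export`, `start_export`, `export_status`, `export_result`) are made up for illustration and are not Bookwyrm code.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for a Celery worker pool (assumption: real code would use Celery).
executor = ThreadPoolExecutor(max_workers=2)
jobs: dict[str, Future] = {}


def build_export(user_id: int) -> dict:
    """Pretend to do the slow export work for one user."""
    time.sleep(0.1)  # simulates walking a large account
    return {"user_id": user_id, "books": []}


def start_export(user_id: int) -> str:
    """What the request handler would do: enqueue the job and return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(build_export, user_id)
    return job_id


def export_status(job_id: str) -> str:
    """What a status page could report while the job is running."""
    return "complete" if jobs[job_id].done() else "pending"


def export_result(job_id: str) -> dict:
    """Wait for the job to finish and return its result."""
    return jobs[job_id].result(timeout=5)
```

With Celery, `build_export` would presumably become a task and `start_export` would enqueue it (for example with `.delay()`), but the shape of the request handler stays the same.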

CSDUMMI · July 4, 2023, 7:37am

July 3. 2023 - Successful progress on tasks, extending archives and TAR archives

What did we accomplish?

Hugh continues progress on #15
@dannymate shares a functional prototype for offloading import to Celery. #17
@CSDUMMI begins work & discusses further with @Ryuno-Ki about the use of TAR Archives #16.
Further discussion occurred over possible issues with the export mechanism (#3, Create tasks for export/import instead of performing them in the request-response loop).

What’s next?

Currently we are progressing quite well. We should aim to complete #15 and #17.

We’ll have to decide on how to create the archive. Currently there appear to be three options:

Create it in-memory (currently the case)
Store it on disk
Stream the archive (creating it on-demand and archiving most used parts).

See #16 for more.
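
For orientation, the first two options could look roughly like this with Python's `tarfile` module. This is only a sketch under assumptions: the helper names and the file names in the archive are invented, and the real export code may be organised quite differently.

```python
import io
import os
import tarfile
import tempfile


def add_bytes(tar: tarfile.TarFile, name: str, data: bytes) -> None:
    """Add an in-memory byte string to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def archive_in_memory(files: dict[str, bytes]) -> bytes:
    """Option 1: build the whole archive in RAM and return its bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            add_bytes(tar, name, data)
    return buffer.getvalue()


def archive_on_disk(files: dict[str, bytes], directory: str) -> str:
    """Option 2: write the archive to a file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".tar.gz", dir=directory)
    with os.fdopen(fd, "wb") as handle, tarfile.open(fileobj=handle, mode="w:gz") as tar:
        for name, data in files.items():
            add_bytes(tar, name, data)
    return path
```

The in-memory variant keeps the full archive in RAM per request, which is the main reason to consider the other two. For the streaming option, `tarfile` also offers stream modes such as `"w|gz"` that write to non-seekable file objects; whether that fits our download mechanism is exactly what #16 has to answer.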

While we can experiment with different ways of generating the archive and creating archive files (tars), we are blocked from implementing a completely self-contained archive until #15 is completed.

Having #17 completed would also be a huge help here.

CSDUMMI · July 14, 2023, 6:09am

July 10. 2023 - Incremental advancements

What did we accomplish?

@dannymate makes progress on the Export PoC and made SubTasks generic #17
Deployment of Hugh’s #15 to facilitate further code reviews. There were group discussions around the export download mechanism, tarfile data inclusion and unit testing.

Our goals

This week, we should merge #15. There are a few conditions that need to be met for that:
1. All code reviews need to be resolved
2. Several manual tests must have been completed, including:
  - A test that a no-op archive (exporting & importing into the same account) does not result in any change to the account (and does not throw an error)
  - A test that a full archive (export from an account with books, readthroughs, saved lists, follows, blocks, comments and reviews) to an empty account, recreates all of the archive (without error)
#17 should be rebased on #15 and deployed to test instances.
We should start developing unit tests for the behavior implemented by #15.

The development of the tar archive should be halted in favour of work on these goals this week, as the work on creating self-contained tar archives depends heavily on them.
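
The two manual tests can also be phrased as assertions, which may help once we start on unit tests. The sketch below uses plain dictionaries and two hypothetical functions, `export_account` and `import_account`; they only stand in for the real export/import code in #15 and say nothing about how it is implemented.

```python
import copy


def export_account(account: dict) -> dict:
    """Hypothetical exporter: copy every list that belongs in the archive."""
    return copy.deepcopy(account)


def import_account(account: dict, archive: dict) -> dict:
    """Hypothetical importer: add the archive entries the account lacks."""
    merged = copy.deepcopy(account)
    for key, items in archive.items():
        existing = merged.setdefault(key, [])
        for item in items:
            if item not in existing:
                existing.append(item)
    return merged


def check_round_trips(account: dict) -> None:
    """Assert both conditions from the list above for one account."""
    archive = export_account(account)
    # No-op archive: importing into the same account changes nothing.
    assert import_account(account, archive) == account
    # Full archive: importing into an empty account recreates everything.
    assert import_account({}, archive) == account
```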

Tomat0 · July 28, 2023, 5:45pm

July 17. 2023 - JSON, Tests and Mergers

hugh resolved all Code Reviews for #15
@dannymate ran through manual tests related to Hugh’s #15 and raised a couple issues.
@CSDUMMI combed through Hugh’s code and replaced the continuous JSON serialization by a one-time serialization.
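
One possible reading of the serialization change, as a sketch only: instead of turning each part of the export into a JSON string as it is collected (and parsing it back to combine the parts), the code collects plain Python objects and serializes once at the end. The function names and the `user`/`books` structure are assumptions for illustration, not the actual code.

```python
import json


def serialize_repeatedly(user: dict, books: list[dict]) -> str:
    """Serialize each part as it is collected, then parse it back to combine."""
    user_json = json.dumps(user)
    books_json = json.dumps(books)
    combined = {"user": json.loads(user_json), "books": json.loads(books_json)}
    return json.dumps(combined)


def serialize_once(user: dict, books: list[dict]) -> str:
    """Collect plain Python objects and serialize a single time at the end."""
    return json.dumps({"user": user, "books": books})
```

Both functions produce the same output; the second simply avoids the intermediate strings.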

What’s next?

#20 should be discussed and merged. This requires:
- Code Review
- Deployment on the staging instance
#19 can be merged independently from #20.
Unit Tests for the importing and exporting of users should be added.

The last week of the month should focus on creating self-contained tar archives

CSDUMMI · August 1, 2023, 7:43am

July 31. 2023 - Celery Tasks

What have we done?

@dannymate created the final two PRs for moving the Import/Export process to Celery #24 #26.
hugh fixed an export bug #25

What’s next?

Add link to created archive file after export job has completed to export page
Add messages about successful import job on import page
Complete the migration of import/export to celery tasks
Start work on self-contained tar archive

CSDUMMI · August 8, 2023, 8:22pm

August 7. 2023 - Mergers and Tasks

What have we done?

@CSDUMMI added links to exported files in #26
All export and import logic has been moved to celery tasks and merged by @dannymate and @CSDUMMI.

What’s next?

The current state of the export-import branch needs to be tested on the staging server.
Documentation for the export-import branch (in the form of docstrings) should be added and reviewed
All media files fetched from the origin server during imports should be added to a tar archive
A pull request needs to be created to bookwyrm with our changes in export-import. We should reach out to bookwyrm maintainers and create a draft of the PR text.

CSDUMMI · August 16, 2023, 1:34pm

August 14. 2023 -

What have we accomplished?

@dannymate fixed some issues caused by the merger for Celery tasks #27 and re-added Hugh’s fix for #25.
hugh fixed numerous bugs with #28, which is currently being tested for merge.

What’s next?

We’ll have a check-up meeting this week to conclude this sprint and discuss the organization of the next.
@dannymate and hugh work on making #28 stable and tested
Research the Django File Storage and FileField API and answer the questions:
1. How can files be created using FileFields?
2. Where can the FileField be stored?
3. How can old archives be removed to clean-up disk space?
4. How to generate a tar archive file using tarfile and a zip archive file using zipfile.
Implement self-contained archive as a tar or zip file
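
As a starting point for question 4, this is roughly how the same set of files can be written as a tar.gz with `tarfile` and as a zip with `zipfile`. It is a sketch with invented helper names and file names, and it does not touch the Django side (questions 1 to 3).

```python
import io
import tarfile
import zipfile


def make_tar_gz(files: dict[str, bytes]) -> bytes:
    """Pack the given files into a gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar needs the size up front
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def make_zip(files: dict[str, bytes]) -> bytes:
    """Pack the given files into a deflate-compressed zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

One practical difference: a zip has a central directory, so single files can be read without scanning the whole archive, while a tar.gz has to be read front to back.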

Tomat0 · August 21, 2023, 9:04pm

August 21. 2023 - Archives

What have we accomplished?

hugh’s code quality improvements, #28 & #30 have been merged
@dannymate fixed some Celery/Django race conditions which were merged alongside #28
@CSDUMMI is making progress with tar.gz exports in #29

End of the Sprint

In the check-up video conference, we discussed the end of this sprint.
The PR to Bookwyrm will be made by the 27th of August.
We will focus on the necessary remaining tasks to create either a PR or Draft PR.
Our PR will still have some open issues and missing improvements that we could not address but do not consider crucial. All of those we are aware of will be listed in the PR for Bookwyrm’s consideration.

Tasks

Implement and merge tar archives containing the avatar image and any covers of books the user exports. (@CSDUMMI)
Implement tests (@hugh & @CSDUMMI for tar archives)
Implement a cool-down period between creating archives to prevent overloading the server
Implement regularly scheduled task to remove old archives according to an environment variable
Write PR text
Propose Pull Request
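
The cool-down and clean-up tasks could be sketched like this. The environment variable names `EXPORT_MAX_AGE_HOURS` and `EXPORT_COOLDOWN_HOURS`, their defaults, and both function names are placeholders; nothing has been decided yet, and the real clean-up would run as a scheduled task rather than a plain function call.

```python
import os
import time
from typing import Optional

# Placeholder setting names and defaults (assumptions, not decided values).
MAX_AGE_HOURS = float(os.environ.get("EXPORT_MAX_AGE_HOURS", "72"))
COOLDOWN_HOURS = float(os.environ.get("EXPORT_COOLDOWN_HOURS", "24"))


def may_create_archive(
    last_export: Optional[float], now: float, cooldown_hours: float = COOLDOWN_HOURS
) -> bool:
    """Allow a new export only if the cool-down since the last one has passed."""
    if last_export is None:
        return True
    return now - last_export >= cooldown_hours * 3600


def remove_old_archives(
    directory: str, now: float, max_age_hours: float = MAX_AGE_HOURS
) -> list[str]:
    """Delete files older than the limit and return their names, sorted."""
    cutoff = now - max_age_hours * 3600
    removed = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Usage would be along the lines of `remove_old_archives("/path/to/exports", time.time())`, called on a schedule.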

dannymate · August 28, 2023, 8:43pm

August 28. 2023 - Of tar and Tests

What have we accomplished?

@CSDUMMI & @hugh have reached a major milestone: importing and exporting with tar.gz archives (#29). The archive wraps the previously used JSON file together with the avatar and book cover images for a better experience.
hugh has made major progress implementing import model testing #31 which has allowed him to find and fix a few bugs too.
@dannymate spent a bit of time polishing up job.py #32

What’s left to do?

Merge #29 with a review from @hugh and @dannymate
Add tests for tar.py utility
Rebase #31 on #29 after #29 is merged and adjust tests
@CSDUMMI & @hugh should review #32
Discuss how much of job.py is still needed
Write PR to Bookwyrm
Propose PR
Maybe have a look at security? Does @Ryuno-Ki still want to have a look at this w.r.t. zipslip vulnerabilities?
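
On the zipslip question, the core check is small: before extracting, make sure no member of the uploaded archive would end up outside the target directory. The sketch below is an assumption about what such a check could look like, not a review of tar.py; `is_within` and `unsafe_members` are invented names, and links are rejected outright to keep things simple.

```python
import os
import tarfile


def is_within(base: str, target: str) -> bool:
    """Return True if target resolves to a path inside base."""
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base


def unsafe_members(tar: tarfile.TarFile, destination: str) -> list[str]:
    """List members that are links or would be written outside destination."""
    bad = []
    for member in tar.getmembers():
        target = os.path.join(destination, member.name)
        if member.issym() or member.islnk() or not is_within(destination, target):
            bad.append(member.name)
    return bad
```

An import would refuse the archive if `unsafe_members` returns anything. Newer Python versions (3.12 and later) also ship extraction filters for `tarfile`, such as `filter="data"` on `extractall`, which may be worth a look as well.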

CSDUMMI · September 13, 2023, 3:16pm

September 13. 2023 - Start of the second sprint

Today we have announced the Second Guild Alpha Sprint.

The last two weeks were spent in the inter-sprint. We have set up a website and a kbin magazine, and rewritten the forms for starting a guild.

The first sprint has been concluded with the proposal of the PR with our changes to Bookwyrm. Our concluding thoughts on that sprint can be read here.

The second sprint is now looking for members and open for project proposals. Read about everything on our website.

We will continue posting our weekly reports of the second sprint here.