flypig.co.uk


19 Nov 2023 : The Turing Way Book Dash
I've been making noises about taking part in The Turing Way Book Dash for the last couple of weeks. All of the actual dashing took place last week, and in spite of the long build-up, it feels like it's come and gone rather quickly. It was a fantastic week of writing, reviewing, collaborating and generally working a lot with markdown and git. The Turing Way is undoubtedly an impressive project and the Book Dash somehow manages to capture the project's spirit and use that energy to generate words.
 
The Turing Way project is illustrated as a road or path with shops for different data science skills. People can go in and out with their shopping cart and pick and choose what they need.

In this post I'll be sharing my experiences: what is the Book Dash for? How does it work? Why was I involved? If you're interested in reproducible data science, in the broadest sense, then you should consider joining a future Book Dash. Applications are open to anyone and reading this post might help you decide on whether it's right for you.

But, without wanting to spoil the punchline: it almost certainly is. On my travels through space and time I've met many very talented researchers, software developers, technical writers and data scientists. If you fall into any of these camps, then there's a good chance you'd have something to contribute to The Turing Way.

What is The Turing Way book dash?

Let's start with The Turing Way. It's an online book, originally started at The Alan Turing Institute (where I work), but increasingly written by and directed by the community it tries to serve.

Here's how the book describes itself:
 
The Turing Way project is open source, open collaboration, and community-driven. We involve and support a diverse community of contributors to make data science accessible, comprehensible and effective for everyone. Our goal is to provide all the information that researchers and data scientists in academia, industry and the public sector need to ensure that the projects they work on are easy to reproduce and reuse.

That's a great description, but what it doesn't quite say is that The Turing Way project itself tries to embody the good practices it preaches. That means it aims to be open and community-oriented. The processes used to achieve this are themselves documented in The Turing Way. It doesn't quite reach Russellian levels of paradoxical self-reference, but it sometimes feels that might be what it's aiming for.

The Turing Way has a Book Dash every six months. This one was the tenth, which means they must have been running for around five years. The idea is that people at The Turing are busy. People in the community are busy. This can make it hard to focus and contributions to the book can get stuck or de-prioritised. The Book Dash is a full week to focus on getting stuff done and getting changes merged into the book.

I say merged, because the book is built around git and GitHub. It follows the same general rules that many open source projects follow: anyone can create a pull request, there's a solid continuous integration pipeline with a range of pre-merge checks and once something is merged the book automatically republishes itself with the new material. If you're a software developer, you'll feel right at home with the way things work.
 
A diagram describing how GitHub Actions listen for an event (for example, PR created, issue created, PR merged) and then trigger a job, which can be testing, sorting, labelling or deployment.

My Book Dash motivations

Earlier this year I attended RSECon23, a big shindig for Research Software Engineers held in Wales. During the conference dinner one of my colleagues was forced to suffer me ranting about code review.

I've been involved in software development, both commercial and open source, for at least twenty years. My first experience of code review was when I worked at Codemasters in the early noughties. Software development has changed radically since then and code review has become an established norm. And yet, from then until now, I've always found the process deeply uncomfortable at best, and soul-destroyingly antagonistic at worst.

Arrogance leads me to believe I might know some solutions. My colleague, who was probably trying to find a way out of the conversation, suggested I channel some of my frustration into content for The Turing Way. I've had a strong respect for The Turing Way since before I started work at The Turing, so this was immediately appealing.

By coincidence the deadline for applying to join the Book Dash was the following day, so I spent a bleary-eyed couple of late-night hours completing the application form. If I hadn't been so tired I might have had second thoughts. I'm glad I didn't.

Onboarding and dawning realisation

The Book Dash is split into three pieces. A couple of weeks before the Book Dash itself there's an onboarding session, during which we were encouraged to structure our thoughts in a SMART way and to plan our week's work. Between onboarding and the dash there's a git primer session. This covers the basics of git and GitHub for those not familiar with them and also touches on how they're used during the Book Dash. Finally there's the Book Dash itself, five days of intense work, each day split into five 150-minute sessions starting at 8:00 in the morning and running through until 21:30 at night (UK time).

It was during the onboarding session that the realisation of what I'd done really sank in. Ironically it came about through a misunderstanding. During one of the breakouts my session partner explained that their line manager had been hugely supportive of their work on the Book Dash and that they'd also taken some time off as annual leave.

This commitment impressed me, especially given I was planning to work on the Book Dash only in the evenings. This information, coupled with the intense working sessions throughout the day, made me think I'd not fully appreciated how much would be expected of me during the Book Dash.

Following this realisation I rearranged my annual leave to take a couple of days off during the Book Dash. This would mean working in the evenings on three of the days, then working full days for two.
 
Two people sit across from each other at their laptops on a table but in different places. On the left hand side it's the morning, the sun is shining, someone is bringing in a cup of hot drink. On the screen is a videocall with four participants. On the right hand side it's night with the moon outside and the person is also taking part in a videocall. There is also a cat.

I also committed to work on preparation the week and weekend before the Book Dash. I decided the best way to be sure I'd make progress during the week was to have enough material prepared in advance.

With two additional days and enough material already prepared I felt convinced I could do something worthwhile by the end of the Book Dash.

It turns out I was wrong about two things, and right about two things. We'll come back to these.

Here are the plans I set myself during the onboarding session. We'll come back to these too.
 
  1. Specific: What do I want to accomplish?:
    1. Updated references to Twitter so that they reference X and/or Mastodon
    2. Contribute to the Code Review sections to introduce more material about how to handle code review respectfully and avoiding the pitfalls of power imbalance inherent in the process.
  2. Measurable: How will I measure my success?:
    1. All references to Twitter are gone. Create a PR that removes references to Twitter.
    2. Start a separate PR to discuss Mastodon, just as a skeleton.
    3. A PR that introduces - at a minimum - a new section on code review power dynamics.
  3. Attainable: Can I realistically achieve this goal? What steps will I take?
    1. The switching from Twitter to X is somewhat mechanical so should be a quick win.
    2. The aim to create a PR related to Mastodon only needs to be a skeleton, but can be more fully realised if there's time. There's a lot of good material and expertise to draw from, from the Turing Way community.
    3. Creating the text on Code Review is more of a challenge, but I'm highly motivated to get something done. The hardest challenge will be understanding the text that's already there and aligning what I'm hoping to include with it.
  4. Relevant: Does this goal meet a specific need?:
    1. The shift away from Twitter - both the name and platform - I believe makes this timely.
    2. On a personal level I continue to find code review challenging in spite of the existing good material written about it. It also remains a really important part of the development process, so I hope any improvements that can be made in the sections of the Turing Way would remain relevant.
  5. Time-bound: What is my target deadline? (potentially between 14-18 Nov):
    1. I'm aiming for a simple Twitter/X PR before the end of Tuesday 15th.
    2. The remaining days I'll dedicate to the Code Review changes, ideally a PR before the end of the 18th.
  6. Goal statement:
    1. My goal is to ensure no reference to Twitter goes unchallenged, and to capture my thoughts on the challenges of code review in a way that will be beneficial to others.

Preparation

On the Monday before the Book Dash we had a git training session. This was an excellent refresher. I continue to be astonished at how every organisation or project I've worked on uses git slightly differently. It seems to tap into a project's culture. The Turing Way doesn't work with forks, but rather gives all contributors direct merge and write access to the upstream repository. That surprised me a little at first (it's a very trusting approach) but in retrospect it makes perfect sense for a project so defined by community.

I also discovered that on The Turing Way you merge feature branches rather than rebase them. There are pros and cons to both, but these are the kinds of cultural differences it's helpful to understand before force-pushing changes or complaining about unsquashed commits (which is what I might do on other projects).

Cultural differences, you see.

The instructor for the git session was knowledgeable and clearly understood these nuances well. Very reassuring.

On the following evenings in the week prior to the Book Dash I read through the Community Handbook and started collecting ideas and material together about Code Review.

While reading through the Handbook I started noticing typos and small grammatical errors here and there. The Turing Way being an open project, it felt negligent to leave them unfixed. So I put together a pre-emptive pull request full of small changes. Everyone's first pull request to The Turing Way gets adorned with a huge "Thanks" banner. It's a great way to make new contributors feel welcome and is just one of the many small touches that make the project so special.

The week of the Book Dash part I

My Book Dash week itself was a game of two halves. For the first half I continued my usual Alan Turing Institute work (unrelated to the Book Dash) from home during the day. Then in the evenings I attended the 17:00 and 20:00 Book Dash sessions. These are carefully orchestrated using Etherpad and Zoom, with instant messaging provided by Slack. The sessions provide an opportunity to collaborate with others, or to simply focus on writing material. The Cuckoo timer is used to split the session up into Pomodoro periods.

I picked a rather mechanical task as a way of easing myself into the process: converting all of the Twitter references to use X instead. It turns out there are quite a few, plus there was some work to be done rewriting the parts that no longer apply. X has diverged a fair bit from the site it was when the Twitter material was written.
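
For the mechanical part of this task, finding the references in the first place, something like the following could help. It's only a minimal sketch and not what I actually used during the Book Dash: the function name, the pattern and the sample text are all made up for illustration, and the pattern is deliberately crude.

```python
import re

# Hypothetical pattern (an assumption, not taken from The Turing Way tooling):
# matches "Twitter", "tweet", "tweets", "tweeted" and "tweeting" as whole words.
TWITTER_PATTERN = re.compile(r"\b(?:twitter|tweet(?:s|ed|ing)?)\b", re.IGNORECASE)


def find_twitter_references(text):
    """Return (line_number, line) pairs for every line that mentions Twitter."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if TWITTER_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up sample standing in for the contents of a markdown chapter.
sample = "\n".join([
    "# Social media",
    "Follow the project on Twitter.",
    "We also have a newsletter.",
    "Feel free to tweet about your contributions.",
])

for number, line in find_twitter_references(sample):
    print(f"{number}: {line}")
```

Running the same function over each markdown file in the repository would give a list of places to check. The rewriting itself still needs a human, because much of the surrounding text describes features that have changed.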
 
Two birds in a fountain of open data.

Although I was able to make progress on this, I admit I found it quite challenging to focus initially during the sessions. My problem wasn't so much that I didn't know what I was doing, but more that there were so many things I could have been doing. It was difficult to pick something to focus on. This was exacerbated by the numerous notifications from GitHub streaming into my inbox. When multiple people are working intently on a project for a period of time this is one of the consequences. At this early stage the notifications were a really useful way to get an understanding of how things were supposed to work. But they sure were distracting.

Plus, working from 8:00 to 17:00 on my usual job before switching straight to several hours of Book Dash was a challenge. Switching context from one to the other left my head reeling, leaving a gap of up to an hour before I could properly focus.

Besides updating the Twitter content I'd hoped to write something about Mastodon, but this didn't get any further than me writing an issue about it.

The week of the Book Dash part II

Things changed for the second half of the week. These were the days I booked as annual leave. Ironically I decided to travel to London for them, so I was in the same office where I usually work in the British Library. This is the first time I've chosen to spend my holiday in the office.

It felt very different to my usual working day. I switched to my second task, that of writing about Code Review. There's already a lot of great material in The Turing Way on the subject, but I wanted to adjust it in three main ways:
  1. Making a clearer distinction between open and closed source projects; I strongly believe that the different modes do and should support different approaches to Code Review.
  2. Emphasising the need to build and run the code as part of the code review process. Code Review shouldn't just be a theoretical exercise.
  3. Adding a section on when Code Review goes wrong. This is one of the few topics on which I feel I have some useful experience to share.

I handled the first two points on the Thursday and the third point on the Friday. I managed to do a fair bit of writing and am happy with the progress I made. I also had a great session with the illustrator from Scriberia, who managed to instantly take my poorly defined idea and turn it into something clear and visually striking. That was a great experience.

Celebrations

On the final day we had two public celebration sessions. I was also able to do a fair bit of writing between them.

The celebrations were superb. The Turing Way team knows how to run an online event that's inclusive, accessible, lively, but which also manages to allow everyone's voice to be heard. It's hard to overstate how hard it is to get this mix right, but the team managed it with style.

It made for a brilliant showcase of all the work everyone had done during the week; an excellent way to round things off.

Final thoughts and the future

Although it was a bit of a slow start, I think I eventually found my rhythm during the Book Dash. I wrote some words, made some edits, got a couple of pull requests merged. The largest of my contributions is still in a draft pull request, but I feel like it's now gained sufficient mass and momentum that I will finish it. Throughout the week I got great feedback from The Turing Way team and the community. I was also particularly pleased to get detailed and insightful feedback on my pull request from thigg, my fellow Sailfish OS contributor, after posting about it on Mastodon. The text draws on the literature, but also on my personal experience, which makes the need for a plurality of views all the more important. It's really difficult for me to express how happy I am that I got such useful feedback from the reviewers.

Going back to the plan I threw together during the onboarding session, here's how I got on:
  1. All references to Twitter are gone. Create a PR that removes references to Twitter. Done!
  2. Start a separate PR to discuss Mastodon, just as a skeleton. Not done! But I did at least create an issue.
  3. A PR that introduces - at a minimum - a new section on code review power dynamics. Done!

It was a shame not to write anything about Mastodon, but once I've finished off the material on Code Review, that'll be my next task.

Finally, what were the two things I was wrong about and the two things I was right about?

I was wrong to think I could get away with doing the Book Dash in the evenings alone. It requires sustained commitment.

I was wrong to think my breakout session partner had taken two days off to work on the Book Dash. They actually took it as a break from both work and the Book Dash. I'd just misunderstood!

But I was right to conclude, based on this faulty information, that I should myself take some days off to work exclusively on the Book Dash. If there is a next time (which I hope there will be) I'll aim to take the entire week off for it.

And finally, I was right to think that success during the week depended on spending some time beforehand planning things out. If I hadn't had a pull request ready and several issues lodged already at the start, I think I'd have lost too much focus early on.

The Book Dash was a brilliant experience; I'm glad I took part. It successfully embodied the principles of openness, reproducibility and community that the book itself espouses. And it was great fun. Thanks to everyone involved!

All the images in this post are The Turing Way project illustrations by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.

If you've been following along with my Gecko dev diary you'll know I took a couple of weeks out to take part in the Book Dash. I'll soon be heading back to daily Gecko development, so if you've been waiting for that, thank you for your patience and it'll be returning very soon.
