Why I Don’t Like “Strategy”, Part II: Execution First; Plus: Rotate external presentations with the Share-Out; Cognitive load; Data engineering code is software, not duct tape; LLM tools
Hi, everyone!
Sorry for missing the last couple of weeks - after a particularly busy stretch, we ended up making some last-minute plans to enjoy last weekend's holiday here in Canada, and I'm just now catching up.
If one of the last couple of weekends was a long weekend for you, I hope you enjoyed it!
Last issue I wrote about not being a fan of how our community talks about “strategy” because it’s a catch-all for very different activities, activities which are important enough to be routine parts of the job we’ve taken on rather than something we save for special occasions.
Today I want to talk about another reason I don’t like how we talk about “strategy” - it’s too often used as a way to avoid the quotidian work of running an effective team.
When I visit or talk to RCD organizations, the biggest opportunities I see for managers to get better results are overwhelmingly in the day-to-day functioning of the team.
This is completely understandable! We’re not taught to be managers, and the peer teams we might use as role models are almost invariably somewhat haphazardly managed. We don’t have much to guide us.
Because of that, it's often here that we can most profitably spend our time first.
I frequently see issues like quiet frustration that something isn’t being done well, team members not collaborating because of some long-avoided conflict, not being sure who’s doing what, duplicated effort or even working at cross-purposes, lack of effective hand-off between teams or sub-teams, poorly articulated goals and expectations, and (increasingly) poor retention.
If there are routine team-functioning issues, if the team isn’t firing on all cylinders, there’s no point in working on strategy, because:
Any meaningful strategy comes down to choices about what to do and not do. None of that matters if things don’t get done.
The kinds of common problems described above — quiet frustration that something isn’t being done well, team members not collaborating because of some long-avoided conflict, not being sure who’s doing what… — halt forward progress no matter which direction “forward” is.
Teams in this situation already have some implicit strategy, and it's not being executed well. That's the problem that needs to be addressed first.
I’ve heard the following argument several times, and even made it myself once or twice:
What this team needs is clarity on vision. Once that’s in place, it’ll be much easier to improve execution because everyone will be pulling in the same direction.
It’s a very plausible argument!
But it doesn’t work.
I’ve tried it myself. I’ve watched as others — in some cases much better leaders than I am — tried it.
Maybe it should work. Maybe I and the others were just holding it wrong.
But the only way I’ve seen to successfully deal with problems of routine operations of the team is to face them head on.
The “strategy first” argument is particularly seductive for many of us, because it conveniently postpones the unpleasant interpersonal stuff to some future point after the strategic Big Thinking is done (“Maybe the problem will fix itself once we’ve Set A Strategy!”). And we’re often much more comfortable doing the Big Thinking work.
But if tasks currently aren’t getting done to a level consistent with our expectations, Strategy won’t fix that. If there’s no culture of feedback and recognition currently, Strategy won’t help. If work is being done that doesn’t align with current priorities, there’ll be work that’s at cross-purposes with the new priorities, too.
There’s no glory in doing the routine management work our teams need (and deserve). We don’t get to send glossy documents about it around to stakeholders and decision makers.
But it matters. And it’s the job.
It’s not particularly hard! Yes, it’s labour intensive, and slow, and can be kind of stressful at times. But quietly not dealing with the problem is also kind of stressful.
There are some time-tested approaches for dealing with these problems specifically:
We know how to address these common issues, and there are resources to help. We, and our teams, deserve to have them resolved. Once they are, thinking about positioning, medium-term prioritization, problem solving, and stakeholder engagement can be valuable.
Are there team issues you see that I haven’t listed here? Any success stories about dealing with them you’d like to share? Any questions you have about any of them? Email me at jonathan@researchcomputingteams.org.
Before we get to the roundup, I really want to hear from you about what you'd like to read more of in this newsletter, what challenges you're facing (or see other managers from the research world facing), and what tradeoffs you're considering. I want to make sure this newsletter and occasional resources are as valuable as possible for our community!
You can always email me, but I’d love to have a reader input chat with you if that works better - we could talk about what you’d like to see more (or less!) of, what you think would be most valuable for managers like us, or just ask about things you’re seeing. Feel free to schedule a quick free chat!
And now, on to the roundup!
Across the way at last week’s Manager, Ph.D. I wrote about how to handle an influx of new team members. In the roundup were articles on:
This week I talked about some more general considerations when helping a team adapt to change, and the roundup had articles on:
The Share-Out - Jim Savage
Savage describes a couple of nice approaches to getting all the team members aligned and communicating on a new project by having them involved in internal or external discussions; it culminates in his favoured approach, “the Share-Out”:
I’m a huge fan of rotating between (willing) team members for external presentations - it’s great for deepening knowledge, building intra-team “connective tissue”, and growing the team members. I’ve tried pieces of this before but never this systematically - I can imagine it working extremely well.
Do you have team members responsible for big-deal external presentations? What do you find works and doesn’t work? Let me know at jonathan@researchcomputingteams.org.
Clever Code Considered Harmful - Josh Comeau
Cognitive Load Developer’s Handbook - Artem Zakirullin
Comeau references Kernighan's old line: "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
For us, though, it's often even worse than that; we're frequently going to be handing off our code to non-dev researchers. If we're as clever as we can be while coding, what chance do the researchers have of debugging it?
Comeau’s article strongly advocates for preferring readability over cleverness, and using average developers or even interns as the benchmark for readability. This can be enforced with code reviews (or for some things, linting and metrics).
Zakirullin talks about cognitive load - not just complexity or cleverness, although that's part of it, but how spread out the knowledge needed to understand a piece of code is, and how many facts one has to hold in one's head at once.
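To make both points concrete, here's a toy contrast in Python (my own illustration, not from either article) - the two versions do the same thing, but only one is kind to the next reader:

```python
from collections import Counter

words = "tea tea tea cake cake scone".split()

# "Clever": one dense expression. The reader has to unpack a Counter,
# a sort with a negated key, a slice, and a dict() call all at once.
top2 = dict(sorted(Counter(words).items(), key=lambda kv: -kv[1])[:2])

# Readable: the same result as named steps that an average developer -
# or a researcher inheriting the code - can follow and debug.
def most_common_words(words: list[str], n: int) -> dict[str, int]:
    """Return the n most frequent words and their counts."""
    counts = Counter(words)
    return dict(counts.most_common(n))

assert top2 == most_common_words(words, 2) == {"tea": 3, "cake": 2}
```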
Readability and understandability are vital considerations for research software. Our communities are rightly strongly in favour of open science, of using open source code for science so the methods are transparent and reproducible. But if the code is impenetrable, then we’re throwing away some of that transparency.
A referee would, rightly, bounce a manuscript back for having a Methods section that’s “cleverly” written but requires hours upon hours of tracing through to understand. Referees have standards for readability, and if we’re serious about our software being a research output like a paper (or a research input many people will use and contribute to), we should have the same standards.
From Chaos to Collaboration: 5 “Do’s and Don’ts” for Data Engineers Working in Teams - Eden Bar-Tov
Like bash scripts for sysadmin tasks, we tend to think of data engineering scripts and pipelines as one-offs, to be written out and forgotten about. But also like bash scripts for sysadmin tasks, they tend to last a long time and eventually become load-bearing.
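Here's a minimal sketch of what treating a pipeline step as software can look like - a named, testable transformation plus explicit data-quality checks. (The column names and files are hypothetical, my own illustration rather than anything from the article below.)

```python
import pandas as pd

def clean_measurements(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing a sample ID and convert temperatures to kelvin."""
    out = df.dropna(subset=["sample_id"]).copy()
    out["temperature_k"] = out["temperature_c"] + 273.15
    return out

def validate(df: pd.DataFrame) -> None:
    """Fail loudly at the pipeline boundary, not three stages downstream."""
    if not df["sample_id"].is_unique:
        raise ValueError("duplicate sample IDs")
    if not (df["temperature_k"] > 0).all():
        raise ValueError("non-physical temperatures")

if __name__ == "__main__":
    raw = pd.read_csv("measurements.csv")       # hypothetical input
    cleaned = clean_measurements(raw)
    validate(cleaned)
    cleaned.to_parquet("measurements.parquet")  # hypothetical output
```

A step written this way can be unit-tested, code-reviewed, and rerun with confidence - which is exactly the shift from duct tape to software.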
Bar-Tov shares some of his advice for data engineers, which largely applies what we've learned in the last decades of software development to data pipelines:
Speaking of data quality, Data Wrangler is a new tool from Microsoft (with VS Code integration) which looks pretty cool, as does datalab (a “data linter”).
This also looks useful - thebe adds live Jupyter cells to otherwise static webpages, connecting to a JupyterLab server or, with thebe-lite, running entirely in the browser via JupyterLite.
My prediction (disclaimer, my day job is at NVIDIA) is that deep learning frameworks and models, and especially those for generative AI models like LLMs, are quickly going to become, from our point of view as people who support research, something akin to (say) linear algebra libraries for our researchers.
It won't be as ubiquitous as linear algebra libraries, if only because deep learning itself uses linear algebra. But I think this is a useful mental model, because there'll be:
Our data science/engineering/management teams will be closer to this than others, but if I’m right many of even our software and systems teams will have to have some familiarity with some combination of:
Eventually I’d like to put together a curriculum for a crash-course for research computing & software teams about effectively supporting use of these tools. Does that sound useful? What does your team need, and what has it found useful? I’d love to hear - just email me at jonathan@researchcomputingteams.org and if there’s demand I’ll try to put something together.
For now, what I'll be doing is reporting some useful-looking tools, tutorials, and techniques. In the past couple of weeks, some things that caught my eye:
Learn how HTTPS works by watching this website fetch itself, byte by byte, over TLS.
Computational problems can be far, far harder than NP-complete.
Deep dive into dm-verity, file system integrity for embedded Linux (in the kernel since 3.4).
Super handy macOS utility I had never heard of - networkQuality (and a server you can stand up if you want to test particular links).
This is cool - a clustered Map of GitHub, by the person who brought you Map of Reddit.
Finally, a handheld 386 gaming PC.
MS Paint finally gets a dark mode.
Prime finding using find(1).
A formal model of x86 instructions in Z3.
.bashrc for Python REPLs - PYTHONSTARTUP (a minimal example follows this list).
WebAssembly is going to be more composable with POSIX programs - TCP, pthreads, pipes… - with WASIX.
Outputting SVG images directly from Postgres.
IP6oS3 - you know, IP6 over S3, like you do.
Bioinformaticians are finding ChatGPT useful for automating some tasks.
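And, as promised above, here's what a minimal PYTHONSTARTUP file can look like - the interpreter runs it at the start of every interactive session (the particular contents here are just my own hypothetical example):

```python
# ~/.pythonstartup - executed at the start of every interactive Python
# session once the PYTHONSTARTUP environment variable points at it,
# e.g. in your .bashrc:
#   export PYTHONSTARTUP=~/.pythonstartup
import json                 # preload modules you reach for constantly
import pprint
from pathlib import Path

pp = pprint.pprint          # quick pretty-printing shorthand

def cat(path):
    """Print a file's contents - handy for quick inspection at the REPL."""
    print(Path(path).read_text())
```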
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Jonathan
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.