#183 - 23 June 2024

We Need To Talk About AI. Plus: Upfront technical requirements are possible; (Too much) Manual work is a bug; Technical governance and leadership; Define your audience better than 'non-experts'; MicroVMs and isolation

I think I’ve served you poorly by not writing more about AI.

As many of you know, my current day job is with a company now best known as an AI hardware/software vendor. I prefer to avoid topics where there would appear to be a conflict of interest. That’s a fine goal, but in this case I think I’ve taken it too far. It means I’ve shirked my duty to talk about a range of timely topics that all our teams are wrestling with.

Whether wearing this hat or my day-job hat, I talk to a lot of academic leadership. Many VPRs and Deans are freaking out, some less quietly than others, about what they’re going to do about supporting AI in their research institutions - where they should start, what resources they should commit, what to pay attention to, and what to dismiss.

And that last part is a real problem for them to suss out. Like all of us, they have limited resources, including time to seek out advice. The conversations they can have most readily are with various genuinely passionate advocates, who think that where they should start is “everywhere” and what resources they should commit is “everything”.

Or alternatively, while their researchers are demanding more support immediately, there are conversations available with others telling them to wait it out, that this is all a fad that will pass.

(Research HPC teams, you know I love you, but as a community we’re the worst for exactly that kind of ornery dismissiveness of not-invented-here kinds of research computing. The history of the last twenty years of computing, as narrated by the median research HPC cluster team, would have gone something like: “Big data - that’s stupid, ignore it and it’ll go away. Cloud - that’s stupid, ignore it and it’ll go away. Containers - that’s stupid, ignore it and it’ll go away. Machine learning - you mean curve fitting? Ignore it and it’ll go away.” Eventually people stop asking. If what they wanted was a reflexive “no” from a computer person, they could file a ticket with enterprise IT.)

This should be a time for our teams to shine. We’re perfectly positioned to take our rightful places at the table, in discussions on one of the biggest research strategy topics of the day, where we can offer informed, nuanced advice about what’s possible now, where there’s low-hanging fruit, and how and where modest investments (both hardware and personnel) made in our bailiwicks could advance our institution’s research priorities while preserving the other work that’s going on.

But bafflingly, many of the teams I’ve talked to have been sitting on the sidelines, waiting for the decisions to be made elsewhere and the consequences to happen to them.

That’s madness. Instead of building trust and demonstrating that we’re expert peers in these strategic conversations, it positions us as junior support players content to do what we’re told.

If decision makers don’t see us as trusted advisors on how best to apply new digital technologies to institutional research priorities, then what is the point of us? If all we do is perform the technical tasks assigned to us, why do they need us to be part of the institution at all when there are so many vendors out there?

I’m seeing a couple of institutions build parallel teams and structures right now, explicitly to bypass teams they don’t see as up to the job. It’s rare, but it happens. Each one represents a disastrous squandering of an opportunity: not just for resources, but for being seen as worthy participants in research strategy discussions. Advancing research in our institutions and communities is our entire job, and if we’re not willing to take a leadership role in discussions where we have relevant expertise, what are we doing?

As I’ve said before (#104), yes, research should be driving what we do, but sometimes things that happen in our sphere make possible different kinds of research. We follow, but we need to lead, too. It’s our responsibility to take that leadership when necessary.

When AlphaFold 2 made its big showing in 2020, teams could be forgiven for not really having much of an opinion on what it meant for research. Stable Diffusion didn’t seem obviously interesting to researchers outside fields like computer vision. And when ChatGPT came on the scene in 2022, no one really had any good sense of what, if anything, it might be useful for other than writing poetry like a pirate or cheating on grade-school essay questions.

We’re experts, and we’re most comfortable having discussions in our areas of expertise. It made sense that we avoided getting drawn into heated, seemingly hypothetical discussions outside our experience while they seemed tangential to our mission.

But it’s now summer 2024, and a lot more is known. We don’t need to be experts in a technology to keep abreast of journals and see where people are having important successes.

Just as scripting languages like Python and R made automating work on quantitative data vastly more accessible to researchers and trainees across the academy, LLMs and multimodal models are making automating work on qualitative data like text, images, and videos more accessible. This opens doors for more and larger-scale qualitative research in the humanities and social sciences, but also for text- or video-based work in clinical fields.
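
To make that concrete, here’s a minimal sketch of what “scripting” a qualitative analysis can look like - automatically applying a fixed codebook to interview excerpts. The OpenAI Python SDK is used purely as an example; the model name and the codebook are placeholder assumptions, and any chat-capable model would work similarly:

    # A minimal sketch of LLM-assisted qualitative coding: label each
    # interview excerpt with one code from a fixed codebook. Assumes the
    # OpenAI Python SDK only as an example; model name and codes are
    # placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CODEBOOK = ["funding concerns", "training needs", "infrastructure gaps"]

    def code_excerpt(excerpt: str) -> str:
        prompt = (
            "Label the following interview excerpt with exactly one of "
            f"these codes: {', '.join(CODEBOOK)}.\n\n"
            f"Excerpt: {excerpt}\n\nReply with the code only."
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content.strip()

    excerpts = ["We can't keep good staff on two-year grant cycles."]
    labels = [code_excerpt(e) for e in excerpts]

The particular API doesn’t matter; the point is that the same scripting skills we already teach now apply to transcripts, images, and video - with all the validation-against-human-coders caveats that implies.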

That’s in addition to successes that people are seeing in chemistry, or PDEs, or…

At the same time, we’re seeing enthusiastic overreach in those same fields. Work that can’t be readily replicated elsewhere, or models latching onto features specific to the training data set rather than anything that generalizes. That might indicate real limitations, or it could be researchers lacking solid advice from knowledgeable technical teams, or both.

What all this means for our institutions will vary. Existing resources and expertise are different from place to place, and priorities are never the same in any two organizations.

But we have more than enough existing expertise to combine with our deep local understanding of our institutions’ needs and take leadership in offering advice. We can talk to departments and faculties, give seminars, send relevant papers up the CIO/VPR chain…

Yes, we’re overwhelmed with other challenges and needs, but so are our institution’s decision makers. They need, and deserve, our leadership on this new set of technologies, and the next one to come after that.


We should also have conversations about how our work is likely to change as, increasingly, the trainees we work with arrive having already used these technologies extensively for two-plus years.

I’ll write more about these things in the fall, but some thoughts before I go:

I spend a lot of time talking about changes an RSE service might see when all of the trainees and researchers we work with can generate code instantly. And can ask an LLM to explain code to them. And can have useful “rubber duck” debugging conversations with a rubber duck who is actually quite clever, always available, and also sometimes drunk.

For teaching coding:

  • Teaching and evaluation of coding have to incorporate these kinds of tools, because the trainees will absolutely be using them.
  • With code-explanation-as-a-service readily available, teaching reading code becomes much easier, and showing learners how to identify high-quality codebases becomes especially important.
  • Higher-level concepts become more important and usefully happen earlier: when code writing is so fast but accuracy can be spotty, testing fundamentals, version control, and CI/CD become vital (see the sketch after this list).
  • Higher-level architectural concepts also become more important earlier, because integrating generated code with existing code will matter.
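
On the testing point above, a minimal sketch of the habit worth building early - treating generated code as untrusted until tests pin it down. The moving_average function here is a made-up stand-in for any LLM-generated helper:

    # A minimal sketch: treat LLM-generated code as untrusted until tests
    # pin it down. moving_average is a made-up example; run with pytest.

    def moving_average(xs: list[float], window: int) -> list[float]:
        # (imagine this body came straight from an LLM)
        return [sum(xs[i:i + window]) / window
                for i in range(len(xs) - window + 1)]

    def test_known_values():
        assert moving_average([1.0, 2.0, 3.0, 4.0], 2) == [1.5, 2.5, 3.5]

    def test_window_equal_to_length():
        assert moving_average([2.0, 4.0], 2) == [3.0]

    def test_empty_input():
        # Edge cases are exactly where generated code tends to go wrong.
        assert moving_average([], 3) == []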

For writing software:

  • Documenting code, giving examples, and keeping them up to date become even more important, as those are now also instructions for LLM coding tools.
  • When glue code connecting well-documented APIs is essentially zero-cost and particularly accurate to generate, plugin support becomes a very useful way to allow extensions to a code base, even where that would previously have required pretty sophisticated users (a minimal sketch follows below).
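
Here’s a minimal sketch of that kind of plugin hook, with all names illustrative rather than from any particular codebase - the design point is a small, documented extension surface that a user (or their coding assistant) can target without touching the core:

    # A minimal sketch of a decorator-based plugin registry; all names
    # here are illustrative.
    import csv
    from typing import Callable

    CONVERTERS: dict[str, Callable] = {}

    def register_converter(fmt: str):
        """Decorator: register a reader for a new input format."""
        def wrap(func: Callable) -> Callable:
            CONVERTERS[fmt] = func
            return func
        return wrap

    @register_converter("csv")
    def read_csv(path: str):
        with open(path, newline="") as f:
            return list(csv.reader(f))

    def load(path: str):
        """Dispatch on file extension to whichever converter is registered."""
        return CONVERTERS[path.rsplit(".", 1)[-1]](path)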

For research data management, we’ll see acceleration of trends that had already existed: well-curated data sets are going to be even more valuable; policy, governance, and access will be things that get thought about first, not towards the end; and there’ll be more and more pressure for larger datasets, faster.

For research computing systems, we’ll again see acceleration of existing trends: supporting wide mixes of very different kinds of workloads; complex workflows; and standing up APIs and supporting call-outs to external APIs.

And for our day-to-day as managers, these tools can be useful for telling our stories. When every manager has an automated comms intern who can draft success-story blog posts from email chains, generate images (or alt text for existing images), and summarize presentations and discussions into bite-sized chunks, highlighting our wins and the value we bring to our research institutions becomes a task we’d be remiss not to prioritize.

What are you seeing change now? What conversations are you having? Hit reply or email me at jonathan@researchcomputingteams.org.


As suggested above, I’ll be taking off for July and August, so this will be the last new newsletter for a bit. I’m going to try doing something different over the summer, and sending “best of” issues from the archive. Let me know what you think, or if you have suggestions for issues to re-share.

And speaking of the archives, I’ve improved the newsletter archive search (moved to pagefind, for those curious) by quite a bit. Indexing the well over half a million words in the newsletter (not including job ads!) on a static page is going to allow me to do a couple of new things later this year, including linking to particular roundup articles or essays. More on that later.

And with that, on to the roundup!


Managing Teams

Over at Manager, Ph.D., in #175 I talked about how useful vacations can be for thinking about how your staff will work without you. It’s a great opportunity to step back, look at what the team is doing, and think about the systems, priorities, and expectations you’re putting in place. Also, in doing so you’re going to find some things you can delegate even when you are around.

In the roundup there were articles that discussed:

  • Giving the right amount of context
  • How effective feedback nudges behaviour
  • That we need respectful disagreement
  • Project management mistakes
  • Proposing action
  • Debugging management

Technical Leadership

Debunking the Myth that Upfront Requirements are Infeasible for Scientific Computing Software - Rinku Gupta, BSSW
Debunking the Myth That Upfront Requirements Are Infeasible for Scientific Computing Software - Smith, Srinivasan, and Shanka, SE4Science 2019

Any of these sound familiar?

As the following quotes highlight, previous research has repeatedly shown that many in the community believe that upfront requirements are infeasible for [Scientific Computing Software]: “Full up-front requirement specifications are impossible: requirements emerge as the software and the concomitant understanding of the domain progress.” [5] “Since scientific software is deeply embedded into an exploratory process, you never know where its development might take you. Thus, it is hard to specify the requirements for this kind of software up front as demanded by traditional software processes.” [6] “The research scientists … do not appreciate the need to articulate requirements fully and upfront as demanded by a staged methodology, and found this articulation very difficult to do.” [7]

I see two opposite issues with planning technical projects among members of our communities, which both stem from the same black-or-white thinking. Either there is vast over-specification and over-planning, where the team then sticks to the details of the plan long after it’s clear the wheels are starting to come off; or there’s the dismissive “we can’t plan this, this is research! It’s the frontier of knowledge! Let’s just get started” sort of rationalization seen in the quotes above.

Both come from the same mistaken belief: that we have either perfect knowledge or zero knowledge of our goals and the future, with nothing in between.

But that’s goofy. The entire field of project management is one of planning and execution under uncertainty. That’s why risk registers are discussed so often. Every project is uncertain. Any project may have to take unexpected changes in direction. Every project starts off with the least amount of understanding of what will actually happen right at the beginning.

That doesn’t mean you don’t plan, discuss, and make it clear what success looks like right from the start. Because research is so complex and uncertain, it behooves us to do the basics right and create as much of a foundation for success as we can at the beginning. Not laying out clearly what a successful outcome looks like at the beginning of a hard project is just dereliction of duty. With no star to steer by, success on turbulent waters quickly becomes very unlikely.

Smith, Srinivasan, and Shanka’s article, summarized by Gupta, does a nice and nuanced job of discussing up-front requirements for scientific software, but the same approach can be applied to many technical projects in research support. No, you can’t perfectly define every requirement, but many things you can, and other things can be sketched out provisionally and revisited.


Manual Work is a Bug - Roscoe A. Bartlett, BSSW
Manual Work is a Bug - Thomas A. Limoncelli, ACM Queue

This touches on something about processes that I talk about quite a bit over on Manager, Ph.D.

I really like Limoncelli’s article and Bartlett’s summary. They both drive home that you can’t jump to automation right away - you have to go through the process once or twice to really understand it and what might happen. But then it’s straightforward to incrementally automate:

This culture can be summarized in two sentences: (1) Every manual action must have a dual purpose of completing a task and improving the system. (2) Manual work should not be tolerated unless it generates an artifact or improves an existing one.

This touches on so many things we do.

And I’d take a pretty broad view of automation. There are two things wrong with manual processes: someone has to figure out what to do, and then they have to do it. Part of the “semi-automation” can just be writing down a runbook / standard operating procedure for the process, so people have ready access to a known-good way of doing the thing, even if they personally have to do a couple of the steps. Once there’s a written artifact somewhere, you can just constantly make the next time better (Manager, Ph.D. #153).

We have too few resources available to us to squander them on figuring out how to do, then manually doing, thankless tasks.
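
To make that semi-automation idea concrete, here’s a minimal sketch of a runbook-as-code, with entirely made-up steps - the cheap parts are automated, and the rest are at least printed in a known-good order:

    # A minimal sketch of runbook-as-code: cheap steps are automated,
    # the rest are printed in a known-good order, and every run
    # exercises (and implicitly tests) the written-down process.
    # The steps themselves are made-up examples.
    import subprocess

    def automated(cmd: str) -> None:
        print(f"[auto]   {cmd}")
        subprocess.run(cmd, shell=True, check=True)

    def manual(instruction: str) -> None:
        input(f"[manual] {instruction} (press Enter when done) ")

    def onboard_user(username: str) -> None:
        automated(f"getent passwd {username}")   # fails fast if no account
        manual(f"Add {username} to the project allocation in the portal")
        automated(f"ls /home/{username}")        # confirm home dir exists
        manual(f"Send {username} the getting-started email")

The script is itself the artifact, satisfying the Limoncelli test quoted above; and each manual() call is a candidate for the next round of automation.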


Product Management and Working with Research Communities

Switzerland mulls research infrastructure governance body - Emily Twinch, Research Professional News
Scientific leadership should be responsible for data platforms - Kenny Workman, Latch Bio

These two recent articles, read together, point to a key tension in our roles, and one that I spend a lot of time thinking and writing about.

Twinch’s article points to yet another example in a welcome trend: decision makers recognizing the importance of a coherent, consistently funded approach to developing, maintaining, and growing our teams and infrastructures, which are key components of any modern research and scholarship programme:

Infrastructures are “fragmented, heterogeneous and geared to short-term funding cycles”, says Swiss National Science Foundation

Our teams are technical experts and professionals and need a governance and funding structure which reflects and builds on that technical expertise and professionalism.

But also, as Workman points out:

The current standard is to delegate [responsibility for technical research platforms] to a computational lead who is further removed from the scientific goals of the organization by training and detached from the daily behavior and needs of the end users. This has led to the widespread deployment of bloated, buggy, and often disjointed platforms that are never truly adopted by research teams and take many years and resources to stand up.

The purpose of our teams is the advancement of science and scholarship with the computational experience we bring, not the computation itself. Workman writes from the point of view of leadership in a biotech company, but it applies at least as much in academic or government labs. Major decisions must be driven by specific research needs.

This tension comes up all the time. One of the first questions technical research support leads ask each other when they meet is whether they report up to the CIO or the VPR. One is better aligned to the operational and funding needs of the team; the other is better aligned to the mission.

To be clear, I think this tension is a feature, not a bug. It absolutely is awkward to (literally or figuratively) have two bosses, and to be pulled in two directions constantly. But it’s the nature of our work. Technology serves the needs of research, but new technologies can also open up avenues for new research questions, in a never-ending back-and-forth (#104), and our cross-discipline vantage point can help us translate approaches to new fields (#119). As I wrote above, navigating that, and offering advice about what new research is possible with new technology, is just as much a part of our job as shaping the technology we develop to meet current research needs.

As with so many things about our line of work: if it were easy, others could do it. But they can’t, so we do.


The Mythical Non-Roboticist - Benjie Holson

Holson’s article discusses the problems he sees with periodic efforts to write robotics frameworks for non-roboticists; the argument is that any such effort, as framed, is fatally flawed, but lies very close to an effort that would be extremely worthwhile.

There are really useful points made here that I think matter for product management of any sort of technical effort supporting domain (but not technical) experts. In particular:

“Non-[X]” isn’t defined well enough to be useful, for any value of X. I love this quote:

Don’t design for amorphous groups. If you can’t name three real people (that you have talked to), that your API is for, then you are designing for an amorphous group and only amorphous people will like your API.

And two, a reminder to distinguish between the intrinsic complexity of a problem domain and the unnecessary complexity of a particular implementation. Any effort to stamp out unnecessary complexity (even complexity we might have become inured to) is valuable! But be wary of trying to hide intrinsic complexity unless you know you can safely reduce the scope of the problem being solved (which, again, comes down to targeting your audience better than “non-experts”).


Research Software Development

When Is Parallelism Fearless and Zero-Cost with Rust? - Abdi et al., SPAA ’24

The article is worth reading, but the abstract sums up the results:

The Rust programming language is lauded for enabling fearless concurrency with zero cost: detecting concurrency errors at compile time. […] We find that Rust, with the Rayon library, indeed delivers fearlessness for program phases comprising only regular parallelism, e.g., prefix-sum. However, for applications with any irregular parallelism, the programmer must choose between unsafe code or high-overhead dynamic checks with errors that manifest at run time, leaving the arduous task of parallel programming as scary with Rust as with its predecessors.

This basic issue isn’t unique to Rust. In fact, it’s my complaint about many novel parallelism frameworks, and I’m guilty of some of this myself: if someone demonstrates to you how trivial a parallel stencil computation is in framework X but shows nothing else, run away, because the real challenge is when you’re trying to do something like a tree-based method or use some other sort of dynamic shared data structure.


Emerging Technologies and Practices

A couple of neat items on lightweight isolation crossed my desk this week. These approaches are increasingly of interest for running complex workflows of not-necessarily-trusted software.

Isolate, a long-running effort initially developed by Martin Mareš and Bernard Blackham, is “a sandbox built to safely run untrusted executables, like programs submitted by competitors in a programming contest”. That could be interesting either just for teaching, or for testing out code.

Then there’s How we scale our microVM infrastructure using low-latency memory decompression by Ives van Hoorne, which describes CodeSandbox’s approach to decompressing memory on the fly for cloning microVMs (for migration, or for scaling up). I keep hoping for microVMs to gain traction in environments where people don’t want to run Docker containers, but I’ve been disappointed so far, partly because passthrough of things like networking or accelerators remains unevenly supported.


Random

Discovering a 55-year-old bug in the first Lunar Lander game, which in retrospect is the only possible explanation for why I never got the hang of it.

If this affects you, you probably already know about it, but just in case - Numpy 2.0 is out, and it’s going to break some stuff. It’s good and important that occasionally key pieces of the ecosystem break some backwards compatibility - I’ve argued in the past that we don’t do it enough - but it does mean some work for everyone. Periodic reminder that dependencies with pinned versions and CI/CD with a decent test suite are both good things.
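
If your code hasn’t been tested against 2.0 yet, the usual stopgap applies - something like this in a requirements file holds you on the 1.x series until your test suite says otherwise (the bounds here are just an example):

    # requirements.txt: hold NumPy to the 1.x series until your own
    # test suite passes against 2.0 (the lower bound is just an example)
    numpy>=1.24,<2.0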

Interesting distinction between three ways of using vector instructions - horizontal (do more of the same operations), vertical (do larger operations), or exotic (implement specific operations that would otherwise require branching logic).
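
As a loose analogy for the least-familiar “exotic” style, here’s branch-free selection in NumPy - roughly what SIMD blend/select instructions do one vector register at a time:

    # A loose NumPy analogy for the "exotic" style: replace per-element
    # branching with a masked select, as SIMD blend/select instructions do.
    import numpy as np

    x = np.array([-2.0, 3.0, -1.0, 4.0])

    # Branching logic, one element at a time:
    y_branchy = np.array([v if v > 0 else 0.0 for v in x])

    # Branch-free masked select over the whole vector at once:
    y_vector = np.where(x > 0, x, 0.0)

    assert (y_branchy == y_vector).all()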

Relatedly, orc, from GStreamer, is a runtime compiler for inner loops, and a library of motifs for inner loops which can be compiled down to various targets.

Choose your cereal-based terminal color scheme with root loops, and learn a bit about color spaces while doing it.

When SQLite will and won’t work in production.

Spreadsheet is All You Need - a small transformer in a spreadsheet.

They’re all you need to animate a roller coaster, as well.

AI for (surface) meshes.

50 years of diff.


That’s it…

And that’s it for another week. If any of the above was interesting or helpful, feel free to share it wherever you think it’d be useful! And let me know what you thought, or if you have anything you’d like to share about the newsletter or stewarding and leading our teams. Just email me, or reply to this newsletter if you get it in your inbox.

Have a great weekend, and good luck in the coming week with your research computing team,

Jonathan

About This Newsletter

Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.

So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.

This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.

All original material shared in this newsletter is licensed under CC BY-NC 4.0. Others’ material referred to in the newsletter are copyright the respective owners.


Jobs Leading Research Computing Teams

This week’s new-listing highlights are below in the email edition; the full listing of 247 jobs is, as ever, available on the job board.