Hi, all:
Some jurisdictions are starting to make plans for people returning to offices. For those of us who can readily work from home, that return may not be for a while; but planning is still worthwhile.
Return will likely look like “split offices” for a while - a few people going in in “shifts” while others work from home. The challenge then is to get the best of both worlds, convenience of WFH and serendipitous interactions, rather than the worst of isolation and cliquing up.
Have you started thinking about how this would work for your team? I’d like to have our on-site team members be on staggered shifts so everyone gets a chance to be in-person with everyone else over time, but I don’t know a good way to do that while still giving people some predictability. I think it’ll also be important for everyone to call into meetings independently from their own computer — physical distancing will probably demand that, but I don’t want there to be an “on-site” experience radically different from those still distributed.
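One way to think about the staggered-shift question is as a small scheduling puzzle: pick a short, repeating cycle of on-site shifts in which every pair of teammates overlaps at least once. Below is a minimal sketch of a greedy approach, not a recommendation for any particular team. The names, the shift size of three, and the `build_rotation` helper are all hypothetical, and the greedy choice does not guarantee the shortest cycle or an equal number of on-site days per person.

```python
from itertools import combinations


def pairs_in(shift):
    """Return the set of unordered pairs of people in one shift."""
    return {frozenset(pair) for pair in combinations(shift, 2)}


def build_rotation(people, shift_size):
    """Greedily choose shifts until every pair has shared at least one.

    The returned list is meant to be repeated as a fixed cycle, which is
    where the predictability comes from (an assumption of this sketch).
    """
    if not 2 <= shift_size <= len(people):
        raise ValueError("shift_size must be between 2 and the team size")

    uncovered = pairs_in(people)
    rotation = []
    while uncovered:
        # Pick the shift that brings together the most not-yet-met pairs.
        best = max(
            combinations(people, shift_size),
            key=lambda shift: len(pairs_in(shift) & uncovered),
        )
        rotation.append(best)
        uncovered -= pairs_in(best)
    return rotation


# Hypothetical six-person team with three people on site per shift.
team = ["Ana", "Ben", "Chi", "Dev", "Eli", "Fay"]
rotation = build_rotation(team, shift_size=3)
for number, shift in enumerate(rotation, start=1):
    print(f"Shift {number}: {', '.join(shift)}")
```

Publishing the resulting cycle once and repeating it gives people a schedule they can plan around, while still guaranteeing that everyone eventually overlaps with everyone else.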
On a sadder note, this week, research computing lost an enthusiastic reporter on HPC topics and a pioneer of some of the earliest research-computing publications: Rich Brueckner. Research computing discussions online would be much poorer today if he hadn’t been such a driving force.
This week’s link roundup follows:
The New Science of Building Great Teams - Alex “Sandy” Pentland, HBR
“Bursty” Communication Can Help Remote Teams Thrive
These two articles circulated independently this week, expressing related ideas about how communication works in high-performing teams.
The first emphasizes how communication works between team members. It’s worth reading, but two key points:
The first is also mentioned by Google among many others. Preconditions for that rough equality of contribution are that it’s safe to speak up without having someone jump down your throat and that no one person dominates conversation.
The second is just as important. We don’t want to be bottlenecks for our team! One-on-ones are great ways to spur teammate-to-manager conversation. Stand-ups and staff meetings are great forums for encouraging team members to talk and work together without our intervention.
The second emphasizes another point:
Bursty is a feature, not a bug, here. Alternating flurries of peer discussion and relative quiet as each works diligently away in silence is a good and healthy sign. It’s a signal that communication is achieving a vital goal - helping the peers get meaningful work done.
Remote brainstorming for regular humans - Bartek Ciszkowski
Whiteboarding and brainstorming are harder to do when the team is distributed. Here are some suggestions from Ciszkowski on how to do distributed brainstorming:
I’ll just add something that’s not unique to distributed brainstorming: ban commenting on the ideas raised (especially critiques!) You are aiming for lots of ideas first. Only after idea-gathering is complete is it time for distillation and evaluation.
How to Search and Find Layoff Lists Online - Jonathan Kidder, Wizard Sourcer
For those with open positions — with layoffs everywhere, are you considering taking a more active approach to recruitment this time ‘round? Many opt-in lists of laid-off workers, especially in tech, are out there. This post pointed me to a list for Canada, and I’m weighing the possibility of contacting promising candidates to let them know about current and upcoming positions.
Product for Internal Platforms - Camille Fournier
This is an article written for tech companies about how easy it is to go off the rails when developing an internally used tech platform for developers. It holds a lot of lessons for research computing (software, systems, or data) though. The traps you can fall into are the same, because you are developing tools for a small, captive audience. It’s too easy to lose track of what a broad range of “customers” need to succeed:
When platform teams build to be building, especially when they have grand visions of complex end goals with few intermediary states, you end up with products that are confusing, overengineered, and far from beloved.
Recommendations:
A month long conference is a neat concept - Matt Webb
We’re all learning quickly what does and doesn’t work for online conferences and webinars. Here Webb mentions three things that seem to work well; the first I’ve seen a lot of, but the last two are interesting:
Technical discussions are hard; a few tips - Gaël Varoquaux
The challenges of maintaining community software, as seen by a well-known neuroscience and machine learning software developer and manager at INRIA. Varoquaux discusses maintainer’s anxiety, contributor’s fatigue, and the difficulty of communication. Varoquaux also describes things he’s found that helped:
Critique software, but understand the constraints it’s written under - Neil Chue Hong and Simon Hettrick, Research Professional News
Why you can ignore reviews of scientific code by commercial software developers - Phill Bull
An open letter to software engineers criticizing Neil Ferguson’s epidemics simulation code - Konrad Hinsen
BCS says software for scientific modelling needs standards - Mark Say, UK Authority
If you follow research software discussions at all you’re aware of the mess around Neil Ferguson’s epidemic simulation software. The code was a mess and had bugs but was mostly fine. The results were highly politicized. Soon enough people with no experience in research software or epidemiology but strong opinions about what the modelling software should be showing wrote scathing and unfair (and irrelevant) critiques. Some of the posts above are backlash to the backlash. It’s all just a mess.
Rather than rehash the debate, let’s highlight some points that can get drowned out in the loud back-and-forth:
I don’t know what if anything this situation teaches us, honestly. The government should have previously had some kind of trusted modelling ready as part of pandemic preparedness. Ferguson’s code, which his group was using for papers, shouldn’t have been such a mess. Conscripting code only intended for research into the policy-making loop in a time of crisis put everyone in an untenable position.
New users generate more exceptions than existing users (in one dataset) - Derek Jones, The Shape Of Code
Not surprising for us in research computing but nice to have it validated with data: new users of software find new ways to trigger software faults. This is one of the reasons why the transitions that research software goes through (from being used by the creator, to being used by friendly users, and then to being used by a wider community) are so challenging and require so much retooling.
Hypermodern Python - Claudio Jolowicz
As long as you’re porting that Python 2 code to Python 3, maybe it’s time to revisit package management, testing, CI/CD, and documentation with current Python ecosystem approaches.
The Mystery of the Failing Jobs: Insights from Operational Data from Two University-Wide Computing Systems - Kumar et al.
Research Computing Team Studies Supercomputer Reliability - HPCWire
A really interesting paper from job history of two HPC clusters - and the authors made the data set they used publicly available! Some findings:
Very interesting stuff. Results will be presented at the IEEE Conference on Dependable Systems and Networks.
Deploying Scientific AI Networks at Petaflop Scale on Secure Large Scale HPC Production Systems with Containers - David Brayford and Sofia Vallecorsa
A paper describing the work done at the Leibniz Supercomputing Centre using containers (LANL’s Charliecloud for rootless containers) to run a large TensorFlow model training job within Slurm on their system.
Choosing 2FA authenticator apps can be hard. Ars did it so you don’t have to - Dan Goodin & Mark Gamache, Ars Technica
If you’re considering rolling out 2FA for your systems, this review at Ars Technica may be of interest - they liked Duo, Authy, or LastPass. They really didn’t like Google’s or Microsoft’s offerings, which they found either too hard or too easy to deal with in the case of a lost device.
A war story about COVID, cloud, and cost - Forrest Brazeal
A reminder (again) that provisioning things in the cloud as if you’re building a static cluster is not a guarantee you’re doing things the cheapest or most performant way.
ISC 2020 Digital - 22-25 June, Free Registration required
ISC 2020’s virtual agenda is out: it’ll be a mix of prerecorded and live-streamed talks. Videos will be available solely to registered “attendees” for two weeks after the event and then publicly available. The workshops on Thursday June 25th — HPC containers, I/O, LLVM, monitoring — look particularly interesting.
IEEE/IFIP Intl. Conference on Dependable Systems and Networks - June 29-Jul 2
The conference Kumar et al.’s work above will be presented at. Work on dependable systems, software, networks, security & privacy, and fault injection tools. Looks really interesting. Registration not yet open.
RustConf 2020 - 20 Aug, $105
If you or one of your team is interested in Rust, this is an inexpensive one day live-streamed conference which includes at least two pretty research-computing relevant talks: Rust for Computational Biology, and Rust for telescope control.
Wow, the new $75USD Raspberry Pi 4 has 8GB of RAM?
I still keep getting surprised a ~decade into the data science phenomenon how data-intensive research computing is everywhere now. A big US trucking company, DAT, now has a Vice-President responsible for data science (amongst other things).
You can now make your web page look like a default LaTeX style with CSS. (Controversial take: I like what LaTeX can do but I really dislike the default styles and fonts).
A good reminder about why saving numerical data in binary formats is the right way to go: even integer parsing, never mind floating point, is surprisingly hard to do efficiently.
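To make the text-versus-binary point concrete, here is a minimal sketch using only the Python standard library. The sample values are made up, and the `array` module writes integers in the machine's native byte order, so treat this as an illustration rather than a portable file format. The text path has to parse every digit of every number; the binary path only copies fixed-width 8-byte integers back into memory.

```python
from array import array

# Hypothetical sample data: a few signed 64-bit integers.
values = [0, -1, 42, 2**40, -(2**62)]

# Text route: every number is rendered as decimal digits and must be
# parsed character by character when it is read back.
text_blob = "\n".join(str(v) for v in values).encode("ascii")
parsed_from_text = [int(token) for token in text_blob.decode("ascii").split()]

# Binary route: each value is stored as a fixed-width signed 64-bit
# integer ("q"), so reading is a straight copy with no parsing.
binary_blob = array("q", values).tobytes()
loaded = array("q")
loaded.frombytes(binary_blob)
parsed_from_binary = loaded.tolist()

print(len(text_blob), "bytes as text,", len(binary_blob), "bytes as binary")
```

For real data sets, a self-describing binary format is usually a safer choice than raw bytes, since it records the type and byte order alongside the values.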
Interesting take on GUIs vs CLIs - the argument is that CLIs are what happen when you prioritize and make explicit (reify) the interactions between the user and the system, as opposed to trying to make those interactions easy/“frictionless”.
A very deep dive into the simplest possible C++ program.
That’s it…
And that’s it for another week.
To those who knew Rich Brueckner, and to everyone who’s lost someone, take care of yourself this weekend. And my best of luck in the coming week with your research computing team and with everything else going on,
Jonathan