Hi!
I’m back and refreshed from my mini-break from the newsletter; there’s a lot of optimism in the air here as COVID-19 numbers continue to drop and projects are moving forwards.
I don’t have a lot else to add right now, so let’s go straight to the roundup!
When a 1:1 turns into a vent session - Yihwan Kim
One-on-one 101 - Tiffany Longworth, Clickety
The Update, The Vent, and The Disaster - Michael Lopp
A couple of good articles on one-on-ones from a couple of weeks ago.
In the first, Kim offers five pieces of advice around vents. In particular, vents aren’t about problem solving, they’re about letting something out - and that something has been festering for a while. So Kim’s advice is:
Sometimes just letting something out is enough - but sometimes the vent raises something that needs to be addressed in future discussions.
In Longworth’s article, she talks about her formula for one-on-ones:
The last article is an old one but includes elements of both Kim and Longworth’s articles. In particular, Lopp considers a one-on-one that is just “an update” to be a bit of a miss, and shares his own approach for dealing with those - bringing up three particular points, doing a mini performance review on some specific area, or taking the time to get input on some fire he’s fighting. For the vent or the disaster Lopp gives similar advice to Kim - and suggests that one purpose of one-on-ones is to provide an environment where some of those minor dissatisfactions can be surfaced before they become vents or disasters.
Robert Half Research Points To Strong Job Optimism Among U.S. Workers - Robert Half Press Release
A recruiting company (so take with a grain of salt) did an online survey
The online survey was developed by Robert Half and conducted by an independent research firm from March 26 to April 15, 2021. It includes responses from more than 2,800 workers 18 years of age or older at companies in 28 major U.S. cities with 20 or more employees.
and they found that 32% were planning to start a job search in the next few months. There’s a lot of natural attrition that hasn’t happened over the last year and a half due to people clinging to familiarity in a time of uncertainty, and now as things open up a bit people are tired and looking for a change. Salary and not enough job advancement were tied for the top reason people were considering looking for new work.
The survey also said that 47% wanted fully remote work.
There’s an interesting discussion of effective messaging for candidates for those doing active recruiting - on LinkedIn in this case, but it could just as well be over email or somewhere else. The advice is pretty straightforward but easy to forget - talk to them like an individual human - with some concrete suggestions about when and how to send the message.
Are other groups in research computing doing active recruiting, or soliciting recommendations? I don’t hear of a lot of groups doing it.
Always be quitting - Julio Merino
If we knew we were quitting (or just going on a long vacation) in two months, what would we be doing differently at work? Probably documenting a lot more, making sure people were coming to meetings so that they could take our place when we weren’t there, training up people to be able to take over parts of our role for us.
But those activities are key and routine parts of being an effective manager or technical leader. Merino commends this approach to us - to be imagining that we were leaving, and being in the mindset of making ourselves less necessary. It’s a vital approach to scaling up our team and making room to take on new challenges, even in the same role.
In particular, Merino suggests -
These are great points to be consistently working on, whether you’re a manager, an individual contributor with a technical leadership role, or even just intending to take on either of those roles.
How to move a 400-person hackathon online - Juri Chomé
Even post-pandemic, a lot of people are going to be more willing to do virtual events than they were in 2019, which opens up a lot of collaborative opportunities. Here Chomé walks through how they moved ZuriHac, a Haskell hackathon held in June every year, to purely online. They talk about the tools they used (Discord for chat, with one use of Zoom, Streamyard + YouTube for streaming presentations, repl.it for some live coding, and an existing registration system) and how they actually ran things. With ~50 projects, the remote collaboration within projects wasn’t hard, but coordinating all of the activities across projects and the hackathon-wide activities took a lot of doing.
Resource and Career Center Pilot for Advancing Computational and Data-intensive Research Earns $1.49M NSF Award - press release, Internet2
This looks to be a promising step for the development of research computing and data as a profession. Since #21 I’ve been talking about the CaRCC’s Research Computing and Data Capabilities Model, a still-underused tool for thinking about the variety of services needed for providing RCD support to researchers in academic institutions. Some of the same group is going to turn their minds to RCD staff training and career pathways. From the press release:
Among the goals of the pilot program is to develop resources for staff training and workforce development, including leading practices for recruitment, onboarding, advancing diversity, equity and inclusion, professional development, and proven models for student internship and training programs. […] In addition to training and workforce development, the Resource and Career Center pilot will create and share a model of career arcs for RCD professionals to explain career options and help existing RCD professionals explore professional development and advancement opportunities.
Glue: the Dark Matter of Software - Marcel Weiher
Weiher hypothesizes that one reason that code sizes of systems are exploding even as individual reusable components are plentiful is that we don’t have any real sensible way to deal with glue code. There are DSLs for algorithms but nothing similar for the plumbing code that makes the output of one component useful as the input for another; and what’s more, the number of those connections is quadratic in the number of components.
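To put rough numbers on that quadratic growth, here is a minimal sketch. It assumes a worst case, chosen for illustration and not taken from Weiher's article, in which any component's output might need to be adapted to any other component's input. The function name is hypothetical.

```python
def max_glue_connections(n_components: int) -> int:
    """Worst-case count of directed connections (output of A -> input of B)
    between distinct components, assuming any ordered pair might need glue."""
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    # Each of the n components can feed each of the other n - 1 components.
    return n_components * (n_components - 1)


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(f"{n:>3} components -> up to {max_glue_connections(n)} connections")
```

Doubling the number of components roughly quadruples the potential glue, which is the pattern the article points at.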
Clever vs Insightful Code - Hillel Wayne
Wayne distinguishes “clever” code, which exploits arcane knowledge of the programming language/OS/library, from “insightful” code, which exploits knowledge of the problem. Both can make code brittle - changes may make the cleverness/insight irrelevant - but documenting the insight you’re exploiting is a good and useful thing, and puts the focus where it should be, on the problem.
I tried out VSCode Remote Repositories this past week (including for the newsletter website) and I have to say it was really slick. Fire up VSCode, open a GitHub repo, and browse code or even create and develop on a branch right there. If you want to run tests etc. you’ll want to use Codespaces, but for code browsing or simple development it works very well.
(Technical) Infosec Core Competencies - Jan Schaumann
Linux Privilege Escalation: Exploiting Capabilities - Stefano Lanaro
Kerckhoffs’s Law for Security Engineers - Devdatta Akhawe
How to Handle Secrets on the Command Line - Carl Tashian
Good security is good operations.
Schaumann gives a very approachable run-through of what he views as core competencies for information security practitioners, and crucially most of them are not deep and profound understanding of obscure corners of the kernel or cryptographic implementations, but being able to read and learn from relevant news sources and being a careful and deliberate system administrator:
You may also notice that a lot of this overlaps with a general understanding of… well, computering on the internet, with operations and system administration concepts. This is no coincidence. Good ops is good security.
Relatedly, Lanaro walks us through exploiting Linux capabilities - not by plunging deep into the Linux kernel, but by treating it correctly as a finer-grained but similar issue to exploiting SUID executables, looking for files that have capabilities they shouldn’t have and using them.
Akhawe talks about Kerckhoffs’s law - no “security via obscurity”, rely on no secrets other than the key - and why it matters: you can change the key easily. Except that, unless you are actually changing the key from time to time, you don’t know for sure that you can change it easily. Build in key rotation from the beginning to make sure that a breach of the key can be readily mitigated.
And finally, Tashian goes through the steps necessary to make sure secrets provided on the command line aren’t available to anyone who happens to “ps” at the right time.
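As one illustration of the general idea (a sketch only, and not necessarily one of the approaches Tashian recommends), a wrapper script can hand a secret to a child process on standard input instead of as a command-line argument, so it never shows up in the argument list that “ps” displays. The function name and the stand-in child command below are hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def run_with_secret_on_stdin(cmd: Sequence[str], secret: str) -> str:
    """Run `cmd`, passing `secret` on the child's stdin rather than in its
    argument list, so it is not part of what `ps` shows for that process."""
    completed = subprocess.run(
        list(cmd),
        input=secret,          # written to the child's stdin
        capture_output=True,
        text=True,
        check=True,            # raise if the child exits non-zero
    )
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical stand-in for a real tool: a child that reads the secret
    # from stdin and reports only its length.
    child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    print(run_with_secret_on_stdin(child, "not-a-real-secret"))
```

This only works if the tool being called can read its secret from stdin; many can, but check the tool's own documentation.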
All of these have a similar theme - good security is good operations, and being excellent at security doesn’t mean wizard-level knowledge of hardware and kernels but broad knowledge of a lot of tools, and using them wisely.
Unreliability At Scale - David Rosenthal
Silent Data Corruptions at Scale - Dixit et al., arXiv:2102.11245
Cores that don’t count - Hochschild et al., HotOS ’21
CPUs with bugs aren’t new - some of us will remember the Pentium FDIV bug - but with increasing complexity (and smaller components) hyperscalers are starting to see bugs on individual cores of individual machines that are extremely difficult to detect, much less debug.
Rosenthal provides a nice summary of two recent papers, by Facebook (Dixit et al.) and Google (Hochschild et al.). Both tell wonderful debugging stories. Facebook discovered that files were being dropped by a pipeline, and it all came down to a specific mathematical operation being repeatably mis-performed on core 59 of a particular box; they nailed it down to a 60-line assembly reproducer. Google found silent data corruption caused by a “few mercurial cores per several thousand machines” that included (from Rosenthal’s summary):
Migrate Your Workloads with the Graviton Challenge! - Steve Roberts, AWS News Blog
In #74 we mentioned an AWS-sponsored hackathon for porting codes for the Graviton2s; for those who are interested in the work but are too busy to block off time for a hackathon, AWS is also supporting a Graviton Challenge, with all valid entries getting a $500 AWS credit, and prizes for best adoptions. Between 22 June and 31 Aug, perform eight steps over any four, non-consecutive, days, and record a short video about your results and you’re in.
A tonne of calls for submissions for events to be held as part of SC21 in November in St Louis, MO:
ECMLPKDD Workshop on Automating Data Science (ADS2021) - 17 Sept, Virtual, 6 page papers due 23 June
Topics include
Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads (REX-IO ’21) - 7 Sept, Virtual, 6 page papers due 2 July
Topics include
SeptembRSE, the Fifth Conference of Research Software Engineers - 6-30 Sept, Virtual, Submissions (talks, posters, walkthroughs, workshops, panels, discussions) due 2 July
Covering anything of interest to research software development.
18th Annual IFIP International Conference on Network and Parallel Computing (IFIP NPC) - 3-5 Nov 2021, Paris, 12-page papers due 11 July
A wide conference covering topics on parallel and distributed applications and algorithms, architectures and systems, and distributed software environments and tools.
OpenMPCon - Virtual event 3 Sept, Submissions (presentations, posters, tutorials, BoF, panel discussions) due 5 July
Topics of interest include
ACM HPDC 2021 - The 30th International Symposium on High-Performance Parallel and Distributed Computing, 21-25 June Online, $50-$100
Little short notice for this one, sorry, but next week there’s a series of talks on a pretty wide range of parallel and distributed computing topics - program is here.
Nordic-RSE online unconference 2021 - 29-30 June, 13h-16h CET, Free(?) registration
Not limited to those in the Nordics (or even those who consider themselves RSEs). Helpfully, the conference provides shorter and longer teasers for the event (note to organizers! This stuff makes it much easier for people like me to include your event in roundups!)
Are you developing software or tools that are driven by research/engineering in either academia or industry? Need to network, share knowledge and experiences with your peers? Maybe you have heard of something called research software engineers and you would like to know more? Nordic-RSE invites everyone interested in such topics to join our online unconference (lightweight get-together) on June 29 and 30, where we let the participants shape the agenda (“birds of a feather”). To kick off the event, we also have four invited speakers, including:
Kristoffer Carlsson (JuliaComputing) “Julia for research software engineers”, Athanasia Monika Mowinckel (University of Oslo) “Developing and distributing in-house R-packages”, Shahnawaz Ahmed (Wallenberg Centre for Quantum Technology) “Keep your code alive - lessons from the QuTiP project”.
In further “cryptocurrencies making everything worse” news, Docker Hub is now cancelling autobuilds for free accounts. I’m sort of dreading GitHub Actions going the same way. We’ve signed up for an open source Docker Hub teams account but the line is so long we’ve been waiting to hear back for weeks.
A simple handmade CP/M or DOS linker for teaching about how linkers (and the old .COM files) work.
Modularize your bashrc with a .config/bashrc.d directory.
A tutorial on memoizing slow bash script operations that are used repeatedly in a long script - using bash4 coprocesses.
Bash quoting is subtle, and then with ssh commands another layer of quoting and parsing is added. Here’s a walkthrough of quoting bash commands called over ssh.
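When a command is assembled from a script rather than typed by hand, one way to sidestep the second layer of quoting is to let a library do it. This is a minimal Python sketch, not taken from the linked walkthrough; it assumes the remote login shell is POSIX-compatible, and the function name and host are hypothetical.

```python
import shlex
from typing import List, Sequence


def build_ssh_argv(host: str, remote_argv: Sequence[str]) -> List[str]:
    """Build an argv list for running `remote_argv` on `host` via ssh.

    The local side needs no shell quoting because the result is an argv list
    (no local shell is involved when it is passed to subprocess.run). The
    remote side does: ssh hands a single string to the remote shell, so each
    argument is quoted with shlex so that it survives that parse intact.
    """
    remote_command = shlex.join(remote_argv)  # Python 3.8+
    return ["ssh", host, remote_command]


if __name__ == "__main__":
    argv = build_ssh_argv("login.example.org", ["echo", "it's $HOME"])
    # Pass `argv` to subprocess.run(argv) to actually execute it.
    print(argv)
```

The remote shell then sees the apostrophe and the dollar sign as literal characters rather than as quoting or variable expansion.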
VSCode now lets you directly work with remote GitHub repos without downloading them; I tried them myself this week. This alongside Codespaces is probably going to change my development workflow quite a bit.
A walkthrough on using extremely cheap Google Cloud Platform resources to run an app - the example is an Elixir app and there’s some stuff in there that’s Elixir-specific to optimize for the constrained resources but most of it is pretty general.
There’ll be a Twitter chat on the extreme-scale scientific software stack on June 21st - interesting to see the attempt to use different media for communication and outreach.
“Modulinos” - files that behave as a library when imported, or a script when executed - in bash.
asciiflow for drawing ascii diagrams, and exporting them (including prepending them with typical comment characters for use in code).