The Problem with Research Software Engineering

A discussion about how to make research software engineering a bit better!

Bibliography

Leios Labs

Comments

A bit of a different video today about something that's been on my mind. I know it's a bit of a rant and more or less a clip from my livestream, but I thought some people might benefit from it! Let me know if you like this type of content as well. If so, I am happy to do more "lecture-style" videos on various topics.

LeiosLabs

Having recently started as a "real" software engineer after finishing my PhD, I recognize many of these problems. We did do version control and unit-testing for our research software, but I often passed up on good software documentation in favor of writing the actual research articles. I've also had many requests from colleagues to share my code for making high-quality graphs. Most of the time I had to reply with: "You can have my code, but it won't work directly on any other data than mine. Please take my code as-is, and use it as an example to try writing something of your own." I know I could have made my graphing tools much more modular and general, but at the end of the day I needed to have my thesis finished.

bartzijlstra

As someone who has worked in both pure software development and pure CS research positions, I completely agree. Especially when it comes to documentation and peer review of code, I’m shocked by the lack of standardization. Asking a researcher for access to their code is a true roll of the dice.

AngryArmadillo

I worked as a research assistant in a chemistry laboratory that primarily deals with simulation. The lab head is still using a FORTRAN code for nucleation simulation. I believe that code is at least 20 years old. When I tried to read it, I found variables with names like 'xxx' and 'yyy'.

rentristandelacruz

Congrats on your PhD! I completely agree with everything you say in this video.

zebulon

Competitive programmers may be able to help. You can get relatively clean and simple code from very complex new algorithms if you ask competitive programmers. We are trained to code common algorithms really quickly and occasionally search for better (faster, more memory efficient, working online, etc.) algorithms to implement so we can use them as "secret weapons" during contests.
As an example: given a tree graph of N nodes, it is widely known that you can find its centroid decomposition in O(N log N) time. However, a quick Google search will lead you to a paper demonstrating O(N) centroid decomposition which has no code. To verify, we usually just read the paper, code the algorithm ourselves, and stress test it against the verified slower algorithm with thousands of randomly generated cases.
Might it be possible for researchers to get competitive programmers to verify their work?
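To make the stress-testing idea concrete, here is a minimal sketch in Python. It is not the O(N) centroid decomposition from the paper mentioned above. As a simpler stand-in, it checks a linear-time routine that finds the centroid(s) of a tree against an obviously correct quadratic brute force on randomly generated trees. All function names (`random_tree`, `centroids_fast`, `centroids_slow`, `stress_test`) are made up for this illustration.

```python
import random


def build_adj(n, edges):
    """Adjacency lists for an undirected tree on nodes 0..n-1."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def random_tree(n, rng):
    """Random tree: node i attaches to a random earlier node."""
    return [(rng.randrange(i), i) for i in range(1, n)]


def centroids_slow(n, edges):
    """Brute force, O(N^2): remove each node and measure what is left."""
    adj = build_adj(n, edges)
    result = []
    for removed in range(n):
        seen = [False] * n
        seen[removed] = True
        largest = 0
        # In a tree, every neighbour of the removed node starts its own component.
        for start in adj[removed]:
            count = 0
            stack = [start]
            seen[start] = True
            while stack:
                v = stack.pop()
                count += 1
                for w in adj[v]:
                    if not seen[w]:
                        seen[w] = True
                        stack.append(w)
            largest = max(largest, count)
        if 2 * largest <= n:
            result.append(removed)
    return result


def centroids_fast(n, edges):
    """Linear time: one traversal to get subtree sizes, then a local check."""
    adj = build_adj(n, edges)
    parent = [-1] * n
    order = []
    seen = [False] * n
    seen[0] = True
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if not seen[w]:
                seen[w] = True
                parent[w] = v
                stack.append(w)
    size = [1] * n
    heaviest_child = [0] * n
    # Parents precede children in `order`, so the reverse visits children first.
    for v in reversed(order):
        p = parent[v]
        if p != -1:
            size[p] += size[v]
            heaviest_child[p] = max(heaviest_child[p], size[v])
    return [v for v in range(n)
            if 2 * max(heaviest_child[v], n - size[v]) <= n]


def stress_test(cases=1000, max_n=30, seed=0):
    """Compare both implementations on many small random trees."""
    rng = random.Random(seed)
    for _ in range(cases):
        n = rng.randint(1, max_n)
        edges = random_tree(n, rng)
        fast = centroids_fast(n, edges)
        slow = centroids_slow(n, edges)
        assert fast == slow, f"mismatch: n={n}, edges={edges}, {fast} vs {slow}"
    return cases


if __name__ == "__main__":
    print("cases passed:", stress_test())
```

Small trees are deliberate: when the two versions disagree, the assertion message prints a failing case that is short enough to debug by hand.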

Pa_Nic

I work in the DSP field and we work closely with people in academia. I 100% agree with what you say. So much time could have been saved if the code handed to us was written better or even followed the paper.
I think a big thing is that some older people in academia have the attitude of "if you used simulations, you didn't solve the problem." I personally think it's weird to see people not use software as a tool for verification on both generated and real data.

AaronPM

This was very helpful. I'm going to look more into JOSS.
As a physicist interested in scientific computing, unit testing seems like almost a foreign concept, and I feel fairly inadequate compared to my computer science peers.
I've had enough exposure to the importance of version control to prompt me to learn git myself. For anyone else in a similar position, look at the MIT Missing Semester (January 2020 IAP) for similar computer-sciencey "filler" education.
More videos about CliMA would be cool : )

apurbabiswas

My experience with researchers writing code was that the piece of software they needed most was git. So much

Axman

Thank you for posting this. Going through my PhD now, I experience many of these pains that you've clearly outlined here. If we could continue to grow this discussion and build a scientific community more embracing of software engineering practices, starting with git and code re-usability, the long term gains would certainly outpace the short term learning pains.

brandonnelson

Thank you so much for posting this video. What I've often heard about algorithms is that when a paper claims great performance, it's very likely that the implementation will be very costly and won't perform better than the current solution. Of course, there are also some breakthroughs.

youtubereview

As a master's student working in a research group, I could not agree more with all the things you just said.

felixrichter

This video is so spot on! Publishers need to see this.

alijassim

Congrats on your PhD!

Thank you for your perspective on research software engineering. I have never seen courses at my university that teach scientists how to write good software, and in the end it comes down to teaching yourself.
I work in the same field (PhD candidate in computational fluid dynamics with LBM / physics) and I've seen lots of bad code as well, due to the points you've discussed. But that's not always the case.

There is an incentive to write clean code, at least once you work on software as a team. We refactor on a regular basis and make sure every line is properly documented.
Because of the teamwork, version control becomes a necessity as well.
Testing code is actually most of the work. If code is not testable and the results are not reproducible, it is trash code, no matter in which field.
The main incentive for our software project actually was hardware (GPU) efficiency and performance. No other software on the market is capable of comparable performance, so we had to write our own.
Regarding job prospects, research software engineering is not a dead end at all. If you really master scientific programming, you don't have to apply for a job because companies will apply for your time.

ProjectPhysX

Right on point. I am currently trying to refactor an old academic codebase consisting of Matlab, Python, Java, and C++ that are glued together using Matlab, and it is just a nightmare. And yes, Matlab is evil: you often see thousands of lines of code without encapsulation and a huge namespace. I genuinely think that many more people would have used the code if it had been written in a more professional fashion.

gavinpeng

This is an essential topic for research. More incentive should be given to research software development. Much high-quality research depends on how well a simulation or model has been formulated and executed. Better programming practices in research work will lead to better research outcomes.

rifatahamed

This video speaks to me so much. I was a software engineer/systems engineer before going back to grad school, and I was the only computational-focused person in my lab for Neuroscience. There were other folks who knew how to program (and some who couldn't do more than a stats script), but writing "good" code (as loaded as that is) was just not a priority because no one else was ever going to see it (because there was no avenue to share and no one wants to replicate results anyway).
Lo and behold, my code ended up being pretty useful for some other work (related to TBI), and it is fortunately well documented, so I was able to share it. It's far from perfect, and finding the balance of where to stop on it because it was good enough was a huge challenge. I would've loved to submit it to a journal and get it more polished, but there was no value in that (at least relative to the other priorities I had to graduate).
I wish I knew how to help push the culture forward in this space. I left academia after graduating, so I'm afraid I'm not being very helpful. I've started publishing again recently around my volunteer work, so maybe that's my avenue to help.

KevinHorecka

found you through OIST's youtube channel, love your videos! thanks for sharing your passions

tallon

This video has really made clear some issues I've noticed at my current (research focussed) job and it's very satisfying to hear it stated succinctly

HatersGonnaHate

Thank you for this video. I'm a 3rd year doctoral student in Applied Math, specifically in the scientific computing subdisciplines you mention. I'm currently finalizing a moderate-size (about 4000 lines of C) codebase to be open sourced along with a paper submission. There is a serious crunch-time feeling which is causing various holes in documentation as well as crappy inefficient fixes. You're definitely right, writing well-documented code feels impossible when one is also supposed to be pushing out theoretical breakthroughs of some flavor.

On the other hand, it is also very hard to write code that works without a strong grounding in the theory of a subject.

SoopaPop
