
Code review by yourself #1

Open
chendaniely opened this issue Sep 29, 2014 · 3 comments

@chendaniely

Steve Eddins from Mathworks made a reference to this book (IIRC) when I asked him about how to conduct code reviews when you do not have someone else to review your code.

Apparently I am not the only one; this seems to be a common 'problem' in science.

He (or the book) suggested that if there is no other person there to review your code, a good way to do it yourself is to use a debugger, step through every line of code, and check that each variable actually holds the value you think it does.
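A minimal sketch of that step-through-and-check workflow, assuming Python and the standard-library debugger `pdb` (the thread doesn't name a specific language or tool, and the `clean` function is a made-up example):

```python
# Sketch of the "step through every line and check each variable" review,
# using Python's standard-library debugger pdb.
import pdb


def clean(values):
    """Drop missing entries and convert the rest to floats."""
    return [float(v) for v in values if v is not None]


# pdb.set_trace() drops you into an interactive prompt; from there,
# `n` steps to the next line and `p result` prints what a variable
# actually holds, so you can compare it against what you expected.
# pdb.set_trace()  # uncomment to step through interactively

result = clean(["1.5", None, "2.0"])
print(result)  # [1.5, 2.0]
```

The same prompt accepts `s` to step into function calls and `c` to continue to the next breakpoint, so you can walk the whole script one line at a time.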

Unit testing is one way to tie the debugger into the process, and it isn't as time-consuming as it may seem.
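One way to read that: a unit test pins down what you think a variable should contain, so the check you made once in the debugger gets re-run automatically from then on. A hypothetical example with the standard-library `unittest` module (the `normalize` function is invented for illustration):

```python
# A unit test records the expected value of a computation, replacing
# the one-off "inspect it in the debugger" check with a repeatable one.
import unittest


def normalize(values):
    """Scale values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


class TestNormalize(unittest.TestCase):
    def test_sums_to_one(self):
        result = normalize([2.0, 2.0, 4.0])
        self.assertEqual(result, [0.25, 0.25, 0.5])
        self.assertAlmostEqual(sum(result), 1.0)


if __name__ == "__main__":
    unittest.main()
```

Run it with `python -m unittest` and the expectation is verified on every change, not just the day you stepped through it.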

Not sure if anyone else does this, but I'd love to hear how you 'code review' on your own.

@bkatiemills
Member

Hey @chendaniely,
Awesome, I will check this out! My typical advice for solo code review isn't actually that much different from regular code review:

  • test
  • have a checklist of standards set before you write one line of code
  • test

Testing is key for vetting that things actually work, but some of the real magic is in that checklist of things to look out for. If you don't set some rigorous standards down from square one, the temptation to say "ennnnnhhhh it's probably good" once your code runs / compiles for the first time is overwhelming to the point of being a genuine bias. If you decide what's "good enough" before you begin, it becomes much harder to just give yourself a pass at the earliest opportunity :)

That being said, nothing will ever beat a second set of eyes; a wise person once said: "code is not for precisely communicating with computers; it is for precisely communicating with your colleagues." - without someone else to spot check your work, it's really tough to tell if it's going to make sense to anyone else. Will definitely check out your book, fold that info in, and explore how to break down some barriers so you don't have to code alone.

You're right, this is a key problem - thanks for pointing it out! What have you (and others) tried in the past?

@chendaniely
Author

@BillMills I know code review is important when building packages/modules/etc... but what about scripts?

For example, a script (or ipython notebook) that reads in some data, cleans it a bit, and plots some output. Do your 'standards' change, if at all, when coding modules versus analysis scripts?

What about scripts in the early data analysis process when you are just exploring and trying to see/understand what is going on?

I won't mention how often, but I do use the forbidden copy + paste method at times for these... and I can't be the only one!
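For what it's worth, a common cure for the copy + paste habit in analysis scripts like the one described above is to wrap the repeated read-clean-summarize steps in a function and call it once per input. A hedged sketch using only the standard library (the column name and sample data are hypothetical):

```python
# Instead of pasting the same read-and-summarize block once per file,
# wrap it in a function and loop over the inputs.
import csv
import tempfile


def summarize(path, column):
    """Return the mean of one numeric column in a CSV file."""
    with open(path, newline="") as f:
        values = [float(row[column]) for row in csv.DictReader(f)]
    return sum(values) / len(values)


# Tiny sample file so the sketch runs end to end.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as f:
    f.write("response_time\n1.0\n3.0\n")
    sample = f.name

print(summarize(sample, "response_time"))  # 2.0
```

With real data this becomes `for path in paths: summarize(path, "response_time")`, and the logic you would otherwise have pasted three times lives in exactly one place.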

@bkatiemills
Member

TL;DR: If you're publishing results based on it, review it - reproducibility demands it.

That's an interesting point! There is certainly throwaway code written all the time just to see if something will fly; no one is going to run a formal review on these back-of-the-envelope exercises. But then, where is the boundary between casual and substantial?

Review (code or otherwise) is a process to determine if the subject under review has lived up to the standards set for it. So right away, we can throw out any circumstance where we have no expectations or standards for the code in question. Fun hacking is just fun hacking.

Standards, in turn, are codifications of characteristics that (we hope) help the thing satisfy a set of values. In which case, what are the values that code review is trying to make code live up to via those standards? I submit that two of the things we're reaching for are code longevity and reusability (maybe performance too, but that's a topic for another day). So, if you're writing code meant to be used once by one person, maybe review doesn't matter; sometimes, you just need a result now.

This is true in general, but now consider the needs of open science. If that script you describe is part of an analysis you intend to draw publishable conclusions from, at the barest minimum the code should be as simple to understand as possible, so that those validating & reproducing your work can follow your method. So in some sense, all code used for science needs to be reusable for the sake of reproducibility, and needs to be held to some corresponding standard.

Which sounds like a ghastly amount of effort, but let's not conflate the science you want to actually publish and share, with the quick script you describe for just seeing what's up :)
