SE-EDU
  • Home
  • About
  • Guides for SE student projects »

    Guidelines for working with PRs

    Selecting what to work on

    • If submitting a PR for an existing issue,
      • It's best for first time contributors to start by fixing an issue specifically labelled for first timers e.g., d.FirstTimers, good first issue.
      • It's best to post in the issue to ask if it is OK for you to submit a PR for that issue and wait for approval.
      • Check the issue discussion thread to see if there are Some activity during last seven daysactive PRs for that issue. You can offer to PR for an issue that has stalled PRs i.e., there is a PRs submitted for that issue but the PR author seems to have stopped working on it.
    • If the issue list does not contain what you want to work on, post an issue first and wait for it to be acknowledged. Otherwise you could end up fixing something that does not need fixing.

    Scoping a PR

    General rule: try to keep PRs as small as possible because smaller PRs get merged faster.

    A PR should contain a single, standalone, and complete change to the code base, unless in exceptional cases where the PR is part of a bigger change.

    • Single means a PR should not try to fix more than one fix, unless there are multiple things that must be done together or not at all.

      Refrain from do minor improvements to the code e.g., fix typoshousekeeping in the neighboring code unless the line in concern is already touched by the PR (i.e., the housekeeping does not increase the line count of the PR.

      Rationale: Imagine we decide to revert the PR for some reason. If a PR contains, unrelated changes or multiple independent changes, we will not be able to revert the offending change without losing the other changes as well.

      If you notice a need for housekeeping in the neighboring code as you do your PR, create an issue for it in the issue tracker.

    • Standalone means the PR should contain a meaningful change that moves the code base from one working state to another.

    • Complete means the PR should contain everything related to the change, including the following:

      • functional code
      • code comments
      • test cases
      • user docs and developer docs

    "This PR is just the code fix. I'll update tests and documentation in a separate PR" is not acceptable!

    However, it is fine to push the functional code first to get early feedback, as long as the rest is added to the same PR later.

    Submitting a PR

    When submitting PRs, follow the forking workflow. A summary of the steps is given below.

    Step 0 Do these steps if you haven't done them already:

    0.1 Fork the upstream repo.

    0.2 Clone the fork to your computer.

    0.3 Set up the dev environment as described in the project docs. Confirm the set up is correct.

    Step 1 Create a branch from the master branch, following the naming convention given.

    Follow these rules to improve consistency:

    • Use a meaningful name consisting of some relevant keywords, in the kebab case format e.g., refactor-ui-tests.
    • If the branch is related to an issue, use the format issueNumber-some-keywords-from-issue-title e.g., 1234-ui-freeze-error

    Step 2 Add your code to the branch while ensuring you follow these:

    • relevant coding standards (the full list is given here)
    • Commit message format

    Every commit must have a well-written commit message subject line.

    • Try to limit the subject line to 50 characters (hard limit: 72 chars)

    Rationale: Some tools show only a limited number of characters from the commit message.

    Use the imperative mood in the subject line.

    • Good: Add README.md
    • Bad: Added README.md
    • Bad: Adding README.md

    Capitalize the first letter of the subject line.

    • Good: Move index.html file to root
    • Bad: move index.html file to root

    Do not end the subject line with a period.

    • Good: Update sample data
    • Bad: Update sample data.

    You can use scope: change format (when applicable).

    • e.g. Person class: remove static imports
      Unit tests: remove blank lines

    Commit messages for non-trivial commits should have a body giving details of the commit.

    • Separate subject from body with a blank line.
    • Wrap the body at 72 characters.
    • Use blank lines to separate paragraphs.

    Example: A commit message for a commit that is part of a multi-commit PR:

    Unify variations of toSet() methods

    There are several methods that convert a collection to a set. In some
    cases the conversion is in-lined as a code block in another method.

    Unifying all those duplicated code improves the code quality.

    As a step towards such unification, let's extract those duplicated code
    blocks into separate methods in their respective classes. Doing so will
    make the subsequent unification easier.

    Tip for SourceTree users: To help you keep the commit message width to a 72 characters:

    1. Choose Tools -> Options.
    2. Click on the General tab, scroll down to the Commit settings section.
    3. Tick the Display a column guide at commit message at [72] characters option.

    Use bullet points as necessary. Instead of relying entirely on paragraphs of text, use other constructs such as bullet lists when it helps.

    Example: A commit message for a bug fix:

    Find command: make matching case insensitive

    Find command is case sensitive.

    A case insensitive find is more user-friendly because users cannot be
    expected to remember the exact case of the keywords.

    Let's,
    * update the search algorithm to use case-insensitive matching
    * add a script to migrate stress tests to the new format

    Explain WHAT, WHY, not HOW.

    • Use the body to explain WHAT the commit is about and WHY it was done that way. The reader can refer to the diff to understand HOW the change was done.

    • Give an explanation for the change(s) that is detailed enough so that the reader can judge if it is a good thing to do, without reading the actual diff to determine how well the code does what the explanation promises to do.
      If your description starts to get too long, that's a sign that you probably need to split up your commit to finer grained pieces. [adapted from: git project]

    • Minimize repeating information that are given in code comments of the same commit.

    Structure the body as follows:

    {current situation} -- use present tense

    {why it needs to change}

    {what is being done about it} -- use imperative mood

    {why it is done that way}

    {any other relevant info}
    • Avoid terms such as 'currently', 'originally' when describing the current situation. They are implied.
    • The word Let's can be used to indicate the beginning of the section that describes the change done in the commit.

    Example: A commit message for a code quality refactoring:

    Person attributes classes: extract a parent class PersonAttribute

    Person attribute classes (e.g. Name, Address, Age etc.) have some common
    behaviors (e.g. isValid()).

    The common behaviors across person attribute classes cause code duplication.

    Extracting the common behavior into a super class allows us to use
    polymorphism when dealing with person attributes. For example, validity
    checking can be done for all attributes of a person in one loop.

    Let's pull up behaviors common to all person attribute classes into a new
    parent class named PersonAttribute.

    Using inheritance is preferable over composition in this situation
    because the common behaviors are not composable.

    Refer to this S/O discussion on dealing with attributes
    http://stackoverflow.com/some/question

    Refer to the article How to Write a Git Commit Message for more advice on writing good commit messages.

    • guidelines for commit organization

    Working with Git

    Organizing commits

    Commits in a branch or a PR is said to be well-organized if they have the following qualities:

    • Each commit contains a single logical change, and this change must stand on its own. i.e. each commit has a single responsibility, and that responsibility must be fully carried out.
      For example, if the commit message says Move delete() from Person class to Address class, the commit cannot contain the addition of delete() to Address class only; it should also contain the deletion of delete() from the Person class for it to be a complete implementation what is stated in the commit message.

    • Each commit has a well-written commit message i.e., it follows these guidelines.

    • Commits are ordered in a bottom-up fashion, each commit building on top of the previous one towards the end goal of the PR.

      Rationale: Reviewers should be able to review one commit at a time.

    • Ideally, a commit does not modify more than 100 lines of code.

      Rationale: Bigger commits are harder to review.

      "Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good." --[source]

      Commits containing mechanical changes (e.g. automated refactorings, cut-paste type code movements, file renames, etc.),

      • should include only one mechanical change per commit e.g., rename a single variable across the code base.
      • should not contain other non-mechanical changes, unless unavoidable.
      • can exceed 100 LoC.
      • should have the description of the change in the commit message (so that the results can be reproduced).
    • Every commit pass CI. when you merge a series of commits (without squashing), every commit in your push (not just the last commit) should pass CI.

      Rationale: Build-breaking commits in the version history hinder the ability to use git bisect for locating bugs.

    Here is an example PR of commits that are organized as described above.

    Refactor commits before pushing. It is unlikely that you can produce a series of commits that meet all the above criteria in the first try. In such cases, refactor commits until they meet the criteria. This S/O post describes how to refactor commits (even easier to do with visual tools such as SourceTree -- see this video).

    Merging branches

    When merging branch, the aim is to keep the version history neat so that it is easy to do things such as the following:

    • Find which commit introduced a bug using git bisect.
    • Undo a specific change by reverting a commit in the history without breaking anything else.
    • The default strategy is to do a squash-merge. This is suitable when the branch tackles one task but multiple commits that are not well-organized (as per the definition of 'well-organized' in the panel below).

    Organizing commits

    Commits in a branch or a PR is said to be well-organized if they have the following qualities:

    • Each commit contains a single logical change, and this change must stand on its own. i.e. each commit has a single responsibility, and that responsibility must be fully carried out.
      For example, if the commit message says Move delete() from Person class to Address class, the commit cannot contain the addition of delete() to Address class only; it should also contain the deletion of delete() from the Person class for it to be a complete implementation what is stated in the commit message.

    • Each commit has a well-written commit message i.e., it follows these guidelines.

    • Commits are ordered in a bottom-up fashion, each commit building on top of the previous one towards the end goal of the PR.

      Rationale: Reviewers should be able to review one commit at a time.

    • Ideally, a commit does not modify more than 100 lines of code.

      Rationale: Bigger commits are harder to review.

      "Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good." --[source]

      Commits containing mechanical changes (e.g. automated refactorings, cut-paste type code movements, file renames, etc.),

      • should include only one mechanical change per commit e.g., rename a single variable across the code base.
      • should not contain other non-mechanical changes, unless unavoidable.
      • can exceed 100 LoC.
      • should have the description of the change in the commit message (so that the results can be reproduced).
    • Every commit pass CI. when you merge a series of commits (without squashing), every commit in your push (not just the last commit) should pass CI.

      Rationale: Build-breaking commits in the version history hinder the ability to use git bisect for locating bugs.

    Here is an example PR of commits that are organized as described above.

    Refactor commits before pushing. It is unlikely that you can produce a series of commits that meet all the above criteria in the first try. In such cases, refactor commits until they meet the criteria. This S/O post describes how to refactor commits (even easier to do with visual tools such as SourceTree -- see this video).

    • Use a merge commit if the commits are well-organized, and the branch tackles only one task. In this case the commit message of the merge commit should explain the full task.

    • Use a rebase-merge if the commits are well-organized and each commit is an independent task (as opposed to steps or a bigger tasks).

    • In other cases, consider reorganizing/splitting the branch to match one of the above.

    • PR scoping guidelines given above

    General rule: try to keep PRs as small as possible because smaller PRs get merged faster.

    A PR should contain a single, standalone, and complete change to the code base, unless in exceptional cases where the PR is part of a bigger change.

    • Single means a PR should not try to fix more than one fix, unless there are multiple things that must be done together or not at all.

      Refrain from do minor improvements to the code e.g., fix typoshousekeeping in the neighboring code unless the line in concern is already touched by the PR (i.e., the housekeeping does not increase the line count of the PR.

      Rationale: Imagine we decide to revert the PR for some reason. If a PR contains, unrelated changes or multiple independent changes, we will not be able to revert the offending change without losing the other changes as well.

      If you notice a need for housekeeping in the neighboring code as you do your PR, create an issue for it in the issue tracker.

    • Standalone means the PR should contain a meaningful change that moves the code base from one working state to another.

    • Complete means the PR should contain everything related to the change, including the following:

      • functional code
      • code comments
      • test cases
      • user docs and developer docs

    "This PR is just the code fix. I'll update tests and documentation in a separate PR" is not acceptable!

    However, it is fine to push the functional code first to get early feedback, as long as the rest is added to the same PR later.

    Step 3 Sync your branch with the upstream master, if the master branch advances while you work on your code (i.e., pull upstream master, merge to your branch).

    Step 4 Create a PR when the code is ready, as follows:

    1. Run code style checks (if any) to ensure the code complies with the project standards.
    2. Push the branch to your fork.
    3. Create a draft PR from your fork to the upstream repo.
    4. Check the draft PR on GitHub to confirm the following:
      • it follows the PR format conventions
      • it does not contain any unintended changes.
      • it passes Continuous Integration (Travis, AppVeyor, GitHub Actions, etc.)CI checks, if any.
    5. Remove the 'draft' status of the PR. Post a ready for review comment for good measure.

    Step 5 Revise as per reviews until the PR is merged.

    • Feel free to post a reminder comment if you don't get a review within 2-3 days.
    • When you receive a review,
      1. Revise the code as per the review.
      2. Push the new code to the branch in your fork.
      3. Post a comment to indicate the PR is ready for a new review.

    Reviewing a PR

    Best practices for reviewing PRs

    Reviewing PRs is not just about the code or the tools, the way you phrase your comments matters a too, especially if you are a peer reviewer.

    Best practices for reviewers

    • Add specific comments at relevant places of the code, rather than give one overall comment for the entire PR.
      • It is typical for the comment to be added right below the code line it refers to.
        It is possible to mark multiple lines as linked to the comment too.
      • You can use Markdown (specifically, GitHub-Flavored Markdown) in your comments.
    • It's best to phrase comments as questions, especially if you are a peer reviewer.
      e.g., Should this be extracted out? rather than Extract this out or This should be extracted out.
    • Say 'I like', not 'good/bad'. Consider these two alternatives:
      Option 1: This separation of X from Y is good (or correct or wrong or bad)
      Option 2: I like how you separated X from Y (or didn't like or Not sure I like)
      The second one is less judgemental and less likely to cause the author to become defensive.
    • Feel free to ask for more info from the author, to help you understand the code/design. For example, you can ask why the author chose to write the code in a specific way.
    • You can also suggest alternatives for the author to consider.
      Combining this with the previous point, you can ask Any reason why you did it this way instead of that way?
    • Feel free to compliment the author when appropriate instead of focusing on negative things only.
      e.g., hey, I like how clean this bit of code is 👍
    • Say please, but don't say please. Beware of overusing 'please' as it can be interpreted as a condescending tone. For example, someone can interpret Please use better variable names as Please for the love of God can you use better variable names?. Instead, you can say Perhaps a more intuitive variable name here? which doesn't run any risk of misinterpretation.
    • No need to repeat the same comment many times. It's not the job of the reviewer to clean up after a sloppy author. If you notice the same problem in multiple places, after commenting an a few of them, you can simply say ... I noticed the same issue in several other places too.
    • Remember the other readers. PR comments can be read by people other than the reviewer and the author e.g., future programmers. Use regular English and avoid slang, colloquialisms, cultural references etc.

    Further readings:

    Best practices for authors

    • Don't get into arguments with reviewers. If you disagree with the reviewer, you can explain your own view in a non-confrontational way without trying to prove your way is better.
    • Thank reviewers for their inputs.

    • Check for basic PR hygiene, and remind the PR author to rectify if necessary.
      • contains a single, stand-alone, complete fix
      • relevant comments, dev/user docs, tests have been updated
      • PR title/description format is expected
      • doesn't contain unrelated changes

    • Before approving a PR, confirm that all your previous comments have been addressed.

    Merging PRs

    Follow the convention for Git branch merging, as given in the panel below.

    Merging branches

    When merging branch, the aim is to keep the version history neat so that it is easy to do things such as the following:

    • Find which commit introduced a bug using git bisect.
    • Undo a specific change by reverting a commit in the history without breaking anything else.
    • The default strategy is to do a squash-merge. This is suitable when the branch tackles one task but multiple commits that are not well-organized (as per the definition of 'well-organized' in the panel below).

    Organizing commits

    Commits in a branch or a PR is said to be well-organized if they have the following qualities:

    • Each commit contains a single logical change, and this change must stand on its own. i.e. each commit has a single responsibility, and that responsibility must be fully carried out.
      For example, if the commit message says Move delete() from Person class to Address class, the commit cannot contain the addition of delete() to Address class only; it should also contain the deletion of delete() from the Person class for it to be a complete implementation what is stated in the commit message.

    • Each commit has a well-written commit message i.e., it follows these guidelines.

    • Commits are ordered in a bottom-up fashion, each commit building on top of the previous one towards the end goal of the PR.

      Rationale: Reviewers should be able to review one commit at a time.

    • Ideally, a commit does not modify more than 100 lines of code.

      Rationale: Bigger commits are harder to review.

      "Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good." --[source]

      Commits containing mechanical changes (e.g. automated refactorings, cut-paste type code movements, file renames, etc.),

      • should include only one mechanical change per commit e.g., rename a single variable across the code base.
      • should not contain other non-mechanical changes, unless unavoidable.
      • can exceed 100 LoC.
      • should have the description of the change in the commit message (so that the results can be reproduced).
    • Every commit pass CI. when you merge a series of commits (without squashing), every commit in your push (not just the last commit) should pass CI.

      Rationale: Build-breaking commits in the version history hinder the ability to use git bisect for locating bugs.

    Here is an example PR of commits that are organized as described above.

    Refactor commits before pushing. It is unlikely that you can produce a series of commits that meet all the above criteria in the first try. In such cases, refactor commits until they meet the criteria. This S/O post describes how to refactor commits (even easier to do with visual tools such as SourceTree -- see this video).

    • Use a merge commit if the commits are well-organized, and the branch tackles only one task. In this case the commit message of the merge commit should explain the full task.

    • Use a rebase-merge if the commits are well-organized and each commit is an independent task (as opposed to steps or a bigger tasks).

    • In other cases, consider reorganizing/splitting the branch to match one of the above.