or comment on changes (especially when collaborating with others)
## EK 2023-01-10 use geom_bar instead of geom_histogram
## p <- p + geom_histogram(stat = "identity")
p <- p + geom_bar(stat = "identity")
## HT 2023-01-09 remove legend from plot
p <- p + theme(legend.position = "none")
Either way it can get messy and hard to track/revert changes!
git
git is a version control system that allows us to record changes made to files in a repository or repo.
Each version has a unique ID and metadata:
Who created the new version
A short description of changes made
When the version was made
Versions can be compared, restored and merged.
git repository
To get started, a repository must be created locally (within a working directory on your computer) or on a remote hosting platform (we’ll use GitHub).
git can then track when files/folders are
Added
Modified
Deleted
Repositories can have multiple branches of development. We will work on a single branch, with the default name of main.
Staging and committing
Versions are created in a commit.
We prepare the commit by staging changes we want to record:
Untracked files (git treats the whole content as new)
Tracked files that have been modified or deleted since the last commit
Think of it like taking photographs: we stage the scene by adding/removing people, or changing people’s outfits. When we have a scene we want to save, we take a photograph.
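The stage-then-commit cycle can be sketched at the command line as follows. This is a minimal illustration rather than the workshop's own workflow (which uses the RStudio Git pane): the scratch directory, file name and identity are placeholders chosen for the example.

```shell
set -eu
# Placeholder identity so the commit works on a machine with no git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo="$(mktemp -d)"                     # hypothetical scratch directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch main

echo "# demo" > README.md               # a new file: git sees it as untracked
git status --short                      # README.md is listed with ?? (untracked)

git add README.md                       # stage the change (set up the scene)
git commit -q -m "add README"           # commit it (take the photograph)
git log --oneline                       # one commit, with its unique ID and message
```

Each commit shown by `git log` carries the metadata described earlier: an ID, the author, the time and the message.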
GitHub
git + GitHub
The full power comes by connecting a local repo to GitHub.
You can make changes locally and push them to GitHub
You can make changes via the GitHub website and later pull them into your local copy.
Collaborators can also push/pull changes to the repo.
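The push/pull cycle can be tried out without a GitHub account. In the sketch below a local bare repository stands in for GitHub, which is an assumption made purely so the example is self-contained; the directory names `local` and `collab` and the identity are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work="$(mktemp -d)"
remote="$work/remote.git"               # stand-in for the repo on GitHub
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Your local repo: commit, then push to the remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$remote"
echo "# demo" > README.md
git add README.md
git commit -q -m "add README"
git push -q -u origin main

# A collaborator clones the remote, makes a change and pushes it
git clone -q "$remote" "$work/collab"
cd "$work/collab"
echo "More detail" >> README.md
git commit -q -am "expand README"
git push -q origin main

# Back in your local repo: pull the collaborator's change
cd "$work/local"
git pull -q
cat README.md                           # now includes the collaborator's line
```

With a real GitHub repo, the URL of the repository takes the place of the path in `$remote`.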
Further exercises to do while other people set up authentication.
Try uploading a picture from Unsplash. Go to Add file > Upload files. Edit your README to add the image.
Go to Add file > Create new file. Type subfolder/ in the “Name your file” box to create a subfolder. Now type README.md in the same box. Add some content to the README and commit. Try some new markdown syntax, e.g. emoji or a table.
Turn on two-factor authentication for your GitHub account.
Use a personal access token (PAT) for all Git remote operations from the command line or from R.
Allow tools to store and retrieve your credentials from the Git credential store. If you have previously set your GitHub PAT in .Renviron, stop doing that.
We highly recommend reading this entire vignette and following all of its guidance.
sitrep and vaccinate
library(usethis) # make sure > v2.0.0
git_sitrep() # current situation report
git_vaccinate() # add files to global .gitignore (best practice)
Get a personal access token (PAT)
First, make sure you’re signed into GitHub. Then run
Important! Copy token to clipboard, do not close window until stored (see next slide)!
You may want to store the token in a secure vault, like 1Password or Bitwarden
Put your PAT into the local Git credential store
By installing usethis, you will also have the gitcreds package to manage git credentials.
Put your PAT into the Git credential store by running the following command and entering your copied PAT at the prompt (assume the PAT is on your clipboard).
gitcreds::gitcreds_set()
If you don’t have a PAT stored, it will prompt you to enter one: paste!
If you do, you will be given a choice to keep, replace or see the stored credentials
At this stage you should see two untracked files in your Git Pane that were created when setting up the project: an .Rproj file and a .gitignore file.
The .gitignore file specifies files that git should ignore - they won’t appear in the git pane even as untracked files.
Examples of how to specify files in .gitignore:
Single file: .Rhistory
File pattern: *.log (all files with .log extension)
Directory (and files in it): /dirname/
The .gitignore file takes effect as soon as it is saved in the repo. Stage and commit it so that the same rules apply for collaborators.
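The three kinds of pattern listed above can be checked in a scratch repository. This is a small sketch with made-up file names (`debug.log`, `dirname/data.csv`, `analysis.R`); `git check-ignore -v` reports which rule matched each ignored path.

```shell
set -eu
repo="$(mktemp -d)"                     # hypothetical scratch directory
cd "$repo"
git init -q

# The three example patterns: single file, file pattern, directory
printf '%s\n' '.Rhistory' '*.log' '/dirname/' > .gitignore

mkdir dirname
touch .Rhistory debug.log dirname/data.csv analysis.R

git status --short                      # only .gitignore and analysis.R are listed
git check-ignore -v debug.log dirname/data.csv   # shows the matching rule per path
```

The leading slash in `/dirname/` anchors the pattern to the directory that contains the .gitignore file; without it, a directory called `dirname` at any depth would be ignored.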
First commit from RStudio
Stage and commit the .Rproj file and .gitignore file, with the message “setup RStudio project”.
Click on the clock icon in the Git pane to view the history of previous commits.
Close the “Review Changes” window. Now click the green up arrow to push your changes to GitHub.
Go to the repo on GitHub and verify your changes have been pushed.
Pulling changes from GitHub
Edit the README once more on GitHub in a new commit.
Back in RStudio, on the Git tab, click the blue down arrow to pull the changes from GitHub
The changes to README should now appear in your local copy
Avoid conflicts
If you work on both the local and GitHub copy, it’s possible to get out of sync and end up with conflicting versions of the same file.
It is possible to fix this, but it can be tricky/confusing. It’s best to avoid the problem in the first place!
Recommended practice:
Always commit and push changes at the end of an RStudio session
Always pull changes at the beginning of an RStudio session
Set .Rprofile to check git status
This is a neat trick (credit: Lisa De Bruine)
Open your .Rprofile
usethis::edit_r_profile()
and add the following:
cat(cli::col_blue(system("git status -uno", TRUE)))
This will run git status from the command line when you start R, giving a (blue coloured!) message e.g.
On branch main Your branch is behind 'origin/main' by 1 commit,
and can be fast-forwarded. (use "git pull" to update your
local branch) nothing to commit, working tree clean
If your main branch is behind the main branch on origin (GitHub), you should pull changes before making new edits.
Advanced: Amend commit (before pushing)
Sometimes we don’t stage everything we intended to include in a commit, e.g. we committed a file before saving the latest changes.
If we haven’t yet pushed the commit to GitHub, simply stage the extra changes and check the “Amend previous commit” box under the commit message.
The original commit message will be shown - you can edit this to change the message for the amended commit (useful if you forgot to reference a GitHub issue number)
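At the command line, the equivalent of the “Amend previous commit” box is `git commit --amend`. The sketch below uses a scratch repository, a made-up file `script.R` and a hypothetical issue number `#1` to show a forgotten change being folded into the previous commit.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo="$(mktemp -d)"                     # hypothetical scratch directory
cd "$repo"
git init -q

echo "x <- 1" > script.R
git add script.R
git commit -q -m "add script"

# A change that was saved only after committing
echo "y <- 2" >> script.R
git add script.R

# Fold the staged change into the previous commit and update its message
git commit -q --amend -m "add script (closes #1)"

git log --oneline                       # still a single commit, with the new message
```

Use `git commit --amend --no-edit` instead to keep the original message unchanged.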
Advanced: Undo last commit (before pushing)
Alternatively, you can undo a commit before pushing.
To undo the commit, keeping files as they are
git reset HEAD~1
(change the 1 to a higher number to go back more than 1 commit).
To undo the commit and all the changes in that commit
git reset --hard HEAD~1
This goes back to the version at the previous commit, discarding any uncommitted changes to tracked files.
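The difference between the two forms of reset can be seen in a scratch repository. This is a sketch only, with a made-up file `notes.txt` whose content is `v1` after the first commit and `v2` after the second.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo="$(mktemp -d)"                     # hypothetical scratch directory
cd "$repo"
git init -q

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "v2" > notes.txt
git commit -q -am "second"

# Undo the commit but keep the files as they are
git reset -q HEAD~1
cat notes.txt                           # still v2; the change is simply uncommitted
git commit -q -am "second, redone"

# Undo the commit and the changes it contained
git reset -q --hard HEAD~1
cat notes.txt                           # back to v1; the v2 edit is gone
```

Because `--hard` discards work, it is worth checking `git status` before running it.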
Advanced: Undo last commit (after pushing)
It is best practice to create a new commit that undoes the changes. Run
git revert HEAD
This creates a new commit that reverses the changes made in your last commit. Git opens an editor with a default commit message, which you can edit to something more relevant. Then push the new commit as usual.
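A minimal sketch of `git revert` in a scratch repository follows, again with a made-up file `notes.txt`. The `--no-edit` flag accepts the default commit message instead of opening an editor, which keeps the example non-interactive.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

repo="$(mktemp -d)"                     # hypothetical scratch directory
cd "$repo"
git init -q

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "v2" > notes.txt
git commit -q -am "second"

# Add a new commit that undoes "second"; earlier history is left untouched
git revert --no-edit HEAD

cat notes.txt                           # back to v1
git log --oneline                       # three commits: first, second, and the revert
```

Since history is only added to, the revert can be pushed without `--force`.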
It is possible to use git reset --hard to undo a commit and then git push origin main --force to force this change onto GitHub. Sometimes repository maintainers do not allow this as it rewrites the history, which can cause problems for people that have cloned or forked your repo.
General workflow
Commit regularly, once you’ve got a small complete change, e.g. a working draft of a function, a bug fix, a draft of a README.
It is easier to review/revert changes if they relate to a single file or common issue
Ideally, make a commit every time you make a substantial, coherent set of changes.
At least make a commit every time you take a break, especially when leaving at the end of your working session
Push often enough that GitHub is a useful backup
Adding an existing project to GitHub
The simplest approach is to create a GitHub repo with just a README as before, create the corresponding RStudio project, copy your files into the new directory, stage and commit them.
If you are already using git and want to move the project to GitHub, see Adding a local repository to GitHub using git. Once the project is on GitHub you can clone it into an RStudio project.
Some other tools
The gert R package has functions for interacting with git
It can be worth using dedicated software for interacting with git/GitHub