This repository was archived by the owner on Jul 4, 2023. It is now read-only.

A blog post to promote the use of GitHub and open sourcing at the Office for National Statistics


best-practice-and-impact/ons-github-post

Gitting the most out of version control

Git at the ONS

Git is a version-control system for tracking changes in a collection of files, called a repository. It works best for text-based files like code but can be used to version-control any file type. It provides additional features that support multiple people working on the same files within a project, whilst creating an audit trail for all changes to those files. This means that you don’t need to manually keep copies of file versions, for example by saving them with a new name: “some_code_v3_new_final.py”.


This form of version control is particularly useful when carrying out analysis using coding tools, such as Python or R. You're able to save collections of related changes (called commits), resolve conflicts between changes from multiple users and effortlessly roll back to earlier versions of the project or individual files. This is essential for reproducing outputs from previous versions of your analyses. You can find an Introduction to Git course and screencast on the Learning Hub, or more detailed guidance in the online Pro Git book.
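As a rough illustration of commits and rolling back, the sketch below builds a throwaway repository, records two versions of a file, then restores the earlier one. The file name, commit messages and identity values are hypothetical placeholders, and this is only one of several ways to return to an earlier version.

```shell
# Work in a throwaway directory so that no real project is touched.
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet demo
cd demo

# Identity for this repository only (placeholder values).
git config user.name "Example Analyst"
git config user.email "analyst@example.com"

# First commit: the initial version of a hypothetical analysis script.
echo 'print("mean of column a")' > analysis.py
git add analysis.py
git commit --quiet -m "Add initial analysis script"

# Second commit: a change that we later decide to undo.
echo 'print("median of column a")' > analysis.py
git commit --quiet -am "Switch summary statistic to median"

# Inspect the audit trail, newest commit first.
git log --oneline

# Restore the file as it was one commit ago and record that as a new commit.
git checkout HEAD~1 -- analysis.py
git commit --quiet -m "Restore mean-based analysis"

# Show the restored contents.
cat analysis.py
```

Restoring the old version as a new commit, rather than deleting history, keeps the audit trail intact: the log still shows that the median version existed and when it was replaced.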

Last month a new version of Git (v2.27) was rolled out to users across the ONS.

Git forms the basis of several version control platforms. You may have experience using Git to version control your projects on the ONS GitLab platform, or the equivalent in the Data Access Platform (DAP). An update to the system-level configuration of Git now enables ONS machines to connect to GitHub - the world's leading software development platform. GitHub provides very similar features to GitLab but allows us to easily share work between government departments and even publicly. It provides tight control over access for viewing and contributing to each project, so that code can still be kept private when necessary.

This form of version control is a core element of Reproducible Analytical Pipelines (RAP), which automate analytical processes. The Centre for Crime and Justice (ONS) have recently moved versioning of the code for their newly developed RAP to GitHub. This code has been used to produce the latest “Nature of Crime” statistical tables. You can read more about their RAP development and use of Git.

Open source

Open source describes software whose source code is freely available to use, modify and redistribute.

"Make all new source code open and reusable, and publish it under appropriate licences. Or if this isn’t possible, provide a convincing explanation of why this can’t be done for specific subsets of the source code."

--- Government Service Manual

Whenever possible, we should open source the code behind our analyses and statistics. This practice promotes collaboration and innovation, which are core elements of our current strategy. Doing so is beneficial for ONS-based programmers, developers in other departments and the public. These benefits include:

Personal development:

  • Attribution - coding in the open creates a public record of your contributions to software, as evidence of your programming skill
  • Collaboration - you can gain experience working with other programmers
  • Review - peers and experts in the field can provide advice on developing your code further
  • Satisfaction - it's nice to be able to give back to the community that provides software that we all use on a day-to-day basis

Public good:

  • Transparency - users can understand and reproduce our analysis, which is a core element of the Statistics Code of Practice
  • Shared value - others can benefit from using your code and won't have to reinvent the wheel
  • Shared opportunity - others can gain insight and experience from reading and possibly even contributing to your code

As with all documents that the civil service publishes, there are security considerations to be made when releasing code publicly. It is important to note that code repositories are not an appropriate way to distribute data. ONS policies, including those on security, can be found on SharePoint.

Your work is covered by Crown copyright. Noting this and providing an appropriate licence alongside your code lets others know how they can use, modify and redistribute your work. This is typically detailed as text in a LICENSE file. This online tool might help you to choose an appropriate licence for a given project. However, the Government Digital Service generally recommends using the MIT License for code and the Open Government Licence (OGL) for documentation.

Getting started with GitHub

You should discuss the prospect of open sourcing with the head(s) of your team before getting started. If you have concerns or questions surrounding the topic, please look to online resources first but do feel free to get in touch for advice.

To get started, you'll need to register for a personal GitHub account. You can use a personal email address for this, as your use of the platform might extend beyond your work at the ONS. However, you should still add your ONS email address to the account once it's created, to ensure that GitHub knows which account to assign your work to when working from an office machine.
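GitHub matches commits to accounts using the author email recorded in each commit, which is why the address on your account needs to match the one Git uses. The sketch below, run in a scratch repository with placeholder name and email values, shows how to set that address and check what Git actually records.

```shell
# Create a scratch repository so that the settings below stay local to it.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Placeholder values: substitute the name and work email registered on your GitHub account.
git config user.name "Example Analyst"
git config user.email "first.last@example.com"

# Make a commit, then print the author email that Git recorded for it.
echo "# Demo" > README.md
git add README.md
git commit --quiet -m "Add README"
git log -1 --format='%ae'
```

If the address printed here is not one listed on your GitHub account, the commit will not be linked to your profile.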


You can now follow these simple steps to create and clone a repository from GitHub. The courses on the GitHub Learning Lab are strongly recommended as interactive tutorials that walk through the basics. If you have any issues with Git or would like to modify your Git configuration further, I've put together a guide on troubleshooting ONS Git configuration that you might find useful. Our team also have example projects for Python and R that you might find a useful starting point.
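The clone, commit and push cycle can be tried without touching GitHub at all. In the sketch below, a local bare repository stands in for the repository you would create on GitHub, so the paths, file name and identity values are all illustrative assumptions; with a real project you would clone the URL shown on its GitHub page instead.

```shell
# A local bare repository stands in for the repository you would create on GitHub.
base="$(mktemp -d)"
git init --quiet --bare "$base/remote.git"

# Clone it. Git may warn that the repository is empty, which is expected here.
git clone --quiet "$base/remote.git" "$base/local"
cd "$base/local"

# Identity for this clone only (placeholder values).
git config user.name "Example Analyst"
git config user.email "first.last@example.com"

# Commit a file locally, then publish the current branch to the remote.
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet origin HEAD

# Confirm that the remote now holds the commit.
git --git-dir="$base/remote.git" log --oneline --all
```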

Once you've created your own account, you are also able to create "Organizations" (I know, the z makes me sad too). Creating one of these for your team/division allows you to group open source code repositories together and easily collaborate on these projects internally and externally. Additionally, teams can be created, within these organization groups, to manage access to view or work on specific projects.

It's worth noting that there is currently no centralised coordination of these organization groups at the ONS. You can, however, look towards owners of existing ONS GitHub Organizations for advice on creating one within your division/team, or perhaps associating your work with theirs:

If you're generally interested in contributing to or releasing your own open source software, you should also keep an eye out for next month’s Hacktoberfest. This international event points out plenty of resources on how to start contributing to open source and highlights beginner-friendly issues across GitHub (plus you can get a free t-shirt and stickers!).

I hope that this article has been a useful introduction to versioning with Git and to open source, and that it has started conversations around open sourcing your work using GitHub.
