ECHILD: Structuring your project and code
21 Sep 2023
Based on the principles in the UK Government’s Analytical Quality Assurance (“Aqua”) Book
Arose from the Review of quality assurance of government analytical models after the InterCity West Coast franchise competition in 2012
Seeks to ensure appropriate quality assurance of models used by government: their inputs, methodology and outputs
Long title: QUality Assurance of Code for Analysis and Research (QUACAR /kwakə/)
And you thought initialisms and acronyms in academia were contrived…
If you can’t prove that you can run the same analysis, with the same data, and obtain the same results, then you are not adding a valuable analysis.
With a repeatable, transparent production process we can:
Requires good documentation, not simply code in a repo.
Requires good documentation.
Quality assurance is time-consuming and resource-intensive
Quality assure code and outputs proportionately
Some assurance processes can be automated, e.g.
does your code run
…But not all, e.g.
does your code do what you think it does (though automation can help here)
Document any quality assurance processes or their absence.
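A minimal sketch of the distinction above, assuming a hypothetical analysis function called mean_age. The first check only confirms that the code runs; the second compares the result with a value worked out by hand. Both can be run automatically (for example by a test runner such as pytest), but the expected value still has to come from a person.

```python
def mean_age(ages):
    """Return the mean of a list of ages, ignoring missing values (None)."""
    observed = [age for age in ages if age is not None]
    if not observed:
        raise ValueError("no non-missing ages supplied")
    return sum(observed) / len(observed)


def test_runs():
    # "Does your code run?" Passes as long as no exception is raised.
    mean_age([10, 12, None])


def test_gives_expected_answer():
    # "Does your code do what you think it does?" Compares the result
    # with a value worked out by hand: (10 + 12) / 2 = 11.
    assert mean_age([10, 12, None]) == 11


test_runs()
test_gives_expected_answer()
```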
This is most helpful when conducting the same analysis routinely (e.g. an official statistic).
Less helpful in research, where analyses are generally new and one-off.
However, both should adopt a modular design:
break complex logic down into small, understandable chunks that can be documented and tested more easily.
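One way to picture a modular design is sketched below. The record layout, field names and age threshold are all made up for illustration; the point is only that each step is a small function that can be documented and tested on its own, and the full analysis is a short chain of those steps.

```python
def drop_missing(records, field):
    """Keep only the records that have a non-missing value for `field`."""
    return [r for r in records if r.get(field) is not None]


def add_age_group(records):
    """Return copies of the records with a derived `age_group` field."""
    return [
        {**r, "age_group": "under_11" if r["age"] < 11 else "11_plus"}
        for r in records
    ]


def count_by(records, field):
    """Count the records for each distinct value of `field`."""
    counts = {}
    for r in records:
        counts[r[field]] = counts.get(r[field], 0) + 1
    return counts


def run_analysis(records):
    """Chain the small steps into the full analysis."""
    return count_by(add_age_group(drop_missing(records, "age")), "age_group")


# Hypothetical input data.
records = [{"id": 1, "age": 9}, {"id": 2, "age": None}, {"id": 3, "age": 14}]
print(run_analysis(records))
```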
Good directory structure and file hygiene go a long way…
Logical segregation of projects and analytical tasks
Consistency, above all else
Short but descriptive and human readable names
No spaces, for machine readability: use underscores (_) or dashes (-) instead
Use of consistent ISO date formatting (i.e. YYYY-MM-DD)
Padding the left side of numbers with zeros to maintain order
e.g. 01 instead of 1. So 10 appears after 09 rather than before 2.
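The effect of zero-padding and ISO dates on sort order can be checked directly. In the sketch below the file names and the make_filename helper are hypothetical; only the standard-library sorted function and date.isoformat method are relied on.

```python
from datetime import date

# Without padding, plain text sorting puts 10 before 1, 2 and 9.
unpadded = ["2_clean.py", "10_model.py", "1_import.py", "9_tables.py"]
sorted_unpadded = sorted(unpadded)

# With padding, text order matches the intended numeric order.
padded = ["02_clean.py", "10_model.py", "01_import.py", "09_tables.py"]
sorted_padded = sorted(padded)

# ISO dates (YYYY-MM-DD) sort chronologically when sorted as text.
sorted_dates = sorted(["2023-10-01", "2023-09-21", "2022-12-31"])


def make_filename(step, description, run_date, extension="csv"):
    """Build a name such as 03_cohort_summary_2023-09-21.csv (hypothetical scheme)."""
    slug = description.strip().lower().replace(" ", "_")
    return f"{step:02d}_{slug}_{run_date.isoformat()}.{extension}"


print(sorted_unpadded)
print(sorted_padded)
print(make_filename(3, "cohort summary", date(2023, 9, 21)))
```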
A DAG is for life, not just for causal inference.
–(Euler, 1736)
Visualise relationships between tasks:
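A text-only sketch of such a task graph, using the graphlib module from the Python standard library (Python 3.9 or later). The task names are hypothetical. Each task is mapped to the set of tasks it depends on, and TopologicalSorter returns an order in which every task comes after its dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key depends on the tasks in its set.
dependencies = {
    "import_data": set(),
    "clean_data": {"import_data"},
    "derive_variables": {"clean_data"},
    "fit_model": {"derive_variables"},
    "make_tables": {"derive_variables"},
    "write_report": {"fit_model", "make_tables"},
}

# List the edges of the graph, one "upstream -> downstream" pair per line.
for task, needs in dependencies.items():
    for need in sorted(needs):
        print(f"{need} -> {task}")

# One valid running order; tasks with no path between them may come out
# in either order.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

If the dependencies contained a loop, static_order would raise graphlib.CycleError, which is a useful check that the workflow really is acyclic.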
Documentation is a love letter that you write to your future self.
–Damian Conway
Focus on readable, modular code
Use docstrings for modules, functions, classes, and methods
Use code comments with purpose:
But sparingly:
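A short illustration of both points, using a made-up helper. The docstrings say what the module and function are for; the single comment explains why the code does something, rather than restating what the line does. The September to August academic year is an assumption for the example.

```python
"""Helpers for deriving school-year variables (hypothetical example module)."""

from datetime import date


def academic_year_start(day):
    """Return the calendar year in which the academic year containing `day` began.

    Assumes, for illustration, that academic years run from 1 September
    to 31 August.
    """
    # Dates before September belong to the academic year that started in
    # the previous calendar year.
    return day.year if day.month >= 9 else day.year - 1


print(academic_year_start(date(2023, 9, 21)))
print(academic_year_start(date(2024, 3, 1)))
```

A comment such as "subtract one from the year" would add nothing here, because the code already says that.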
Documenting your project makes it much easier for others to understand your goals and ways of working
WRITE A README
You’ll thank yourself later