<aside> 📌 The thinking below has now crystallised into instructions in a README on the chemnlp repo here.

</aside>

Initial thinking

Overview

We need a Git strategy for key dependencies that we want to modify. Two concrete examples at the moment are the gpt-neox and lm-evaluation-harness codebases. Some things to consider:

  1. We want the workflow to be as easy as possible for collaborators to contribute.
  2. We want to be able to pull upstream changes from either repo.
  3. We likely want to maintain our own set of changes, which we may or may not push back to the original repo at some point.
  4. Many ChemNLP GH issues are likely to require changes to gpt-neox or lm-evaluation-harness.
  5. These aren't just "normal" dependencies - they are key parts of the pipeline we're likely going to want to develop in conjunction with ChemNLP.
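
To make considerations 2 and 3 concrete, here is a minimal sketch of keeping our own changes on a separate copy of a dependency while still pulling in upstream changes. It is an illustration under assumptions, not the agreed workflow: local directories stand in for the GitHub repositories, and every name in it (`upstream`, `fork.git`, `checkout`, `model.txt`, `chemnlp_patch.txt`) is hypothetical.

```shell
#!/bin/sh
# Sketch only: local directories stand in for GitHub repos; all names are hypothetical.
set -eu

# Work in a throwaway directory so no real repository is touched.
work=$(mktemp -d)
cd "$work"

# Run git with a fixed identity so commits succeed on an unconfigured machine.
g() { git -c user.name="Example Dev" -c user.email="dev@example.org" "$@"; }

# "upstream": stand-in for the original project (e.g. gpt-neox).
git init -q upstream
# The default branch is main or master, depending on the local git config.
branch=$(git -C upstream symbolic-ref --short HEAD)
( cd upstream && echo "v1" > model.txt && g add model.txt && g commit -q -m "upstream: initial commit" )

# "fork.git": stand-in for our copy; "checkout": a collaborator's clone of it.
git clone -q --bare upstream fork.git
git clone -q fork.git checkout

# Upstream moves on after we copied it.
( cd upstream && echo "v2" > model.txt && g commit -q -am "upstream: update model" )

cd checkout
# Our own change, committed on our copy only.
echo "chemnlp tweak" > chemnlp_patch.txt
g add chemnlp_patch.txt
g commit -q -m "chemnlp: local modification"

# Consideration 2: pull upstream changes into our copy.
git remote add upstream "$work/upstream"
g fetch -q upstream
g merge -q --no-edit "upstream/$branch"

# Consideration 3: our changes are pushed to our copy (origin); upstream is untouched.
g push -q origin "$branch"

echo "model.txt: $(cat model.txt)"
echo "our patch kept: $(cat chemnlp_patch.txt)"
```

After the merge, the checkout contains both the upstream update and our own commit. Pushing changes back to the original repo remains a separate, optional step.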

Reference

TODO

Option 1: GitHub forks

Option 2: Git submodules