Implementing $(IA)^3$

Add (IA)^3 to model training pipeline · Issue #91 · OpenBioML/chemnlp

feat: add IA3 prompt tuning by maw501 · Pull Request #2 · OpenBioML/gpt-neox

Next steps

Add $l_v$ and $l_k$ ✅
Set up configuration for $l_{ff}$ ✅
Check total trainable parameters ✅
Don’t perform weight decay ✅
Sort out loading of checkpoints ✅
Re-read paper (+ LoRA) to sense-check against implementation configs ✅
Check if stride needed ✅
Change ia3_prompt_tuning → ia3_tuning ✅
Fix FFN being in wrong place ✅
Add docstrings to new classes ✅
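
For reference, the scaling and parameter-count items above can be sanity-checked with a small sketch. It is not the code from the PR: it assumes a decoder-only stack with one $l_k$, one $l_v$ and one $l_{ff}$ vector per layer, each initialised to ones and applied element-wise, as described in the $(IA)^3$ paper. The names `IA3Dims`, `ia3_trainable_params` and `ia3_scale` are hypothetical and the sizes are toy values.

```python
from dataclasses import dataclass


@dataclass
class IA3Dims:
    """Hypothetical container for the sizes that determine the (IA)^3 vectors."""

    num_layers: int
    d_k: int   # key dimension scaled by l_k
    d_v: int   # value dimension scaled by l_v
    d_ff: int  # feed-forward intermediate dimension scaled by l_ff


def ia3_trainable_params(dims: IA3Dims) -> int:
    # Assumption: decoder-only model, so each layer adds exactly
    # one l_k, one l_v and one l_ff vector and nothing else is trained.
    return dims.num_layers * (dims.d_k + dims.d_v + dims.d_ff)


def ia3_scale(activations, scaling_vector):
    # Element-wise rescaling l * x along the feature dimension.
    if len(activations) != len(scaling_vector):
        raise ValueError("scaling vector must match the feature dimension")
    return [l * x for l, x in zip(scaling_vector, activations)]


# Toy sizes, chosen only for illustration.
dims = IA3Dims(num_layers=2, d_k=8, d_v=8, d_ff=32)

# Vectors start at ones, so the model output is unchanged before training.
l_ff = [1.0] * dims.d_ff
hidden = [0.5 * i for i in range(dims.d_ff)]
assert ia3_scale(hidden, l_ff) == hidden

print(ia3_trainable_params(dims))  # 2 * (8 + 8 + 32) = 96
```

Comparing this expected count against the number of parameters with `requires_grad=True` in the real model is one way to confirm that only the $(IA)^3$ vectors are being trained.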

TODO (before merging PR)

Resolve git strategy ✅
Test on multi-GPU server
Update submodule ref in chemnlp

Resources

PEFT implementation of LoRA (which is similar)
$(IA)^3$ codebase