Models

Types

Baseline LLM - pretrained model that has not been trained on any of our* chemistry data
PEFT LLM - parameter-efficient finetuning: only some parameters have been trained on our* chemistry data
Finetuned LLM - all parameters have been trained on our* chemistry data
~~SFT + Instruction Tuned LLM - additional safety / reinforcement learning …? (tbd)~~

*A pretrained model has probably already been exposed to some chemistry data during pretraining.
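
To make the difference between the three types concrete, here is a rough sketch that counts how many parameters would receive gradient updates in each case. It is an illustration only: the toy layer shapes are hypothetical, and LoRA is assumed as the PEFT method (the notes above do not name one). LoRA freezes the base weights and trains a small set of added low-rank factors, so "some parameters" here means the added adapter weights.

```python
from dataclasses import dataclass


@dataclass
class LinearLayer:
    """Shape of one dense weight matrix (bias ignored for simplicity)."""
    d_in: int
    d_out: int

    @property
    def n_params(self) -> int:
        return self.d_in * self.d_out


def trainable_params(layers, mode, lora_rank=8):
    """Count parameters updated on our chemistry data for a given model type.

    mode is one of "baseline", "peft", "finetuned" (hypothetical labels
    mirroring the three types listed above).
    """
    if mode == "baseline":
        # No training on our data at all.
        return 0
    if mode == "finetuned":
        # Every weight is updated.
        return sum(layer.n_params for layer in layers)
    if mode == "peft":
        # Assumption: LoRA. Each frozen d_in x d_out weight gets two trainable
        # low-rank factors of shape d_in x r and r x d_out.
        return sum(lora_rank * (layer.d_in + layer.d_out) for layer in layers)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical toy model: four 512 x 512 projection matrices.
layers = [LinearLayer(512, 512) for _ in range(4)]
total = sum(layer.n_params for layer in layers)

for mode in ("baseline", "peft", "finetuned"):
    n = trainable_params(layers, mode)
    print(f"{mode}: {n} of {total} base parameters' worth trainable")
```

The point of the sketch is the ordering: baseline trains nothing, PEFT trains a small fraction, and full finetuning trains everything, which is what drives the memory and compute cost of each option.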

Sizes

This assumes it will be easiest to stay close to EleutherAI’s model artefacts, for integration with the https://github.com/EleutherAI/gpt-neox training codebase. The models listed below are a subset of the open-source models published by EleutherAI on Hugging Face.

Types

Sizes

Small

Medium

Big

Others to consider