Author:
None
License:
Apache-2.0
Summary:
Mechanistic interpretability as reward signal for RL training of LLMs
Latest version:
0.1.0
Required dependencies:
accelerate | datasets | huggingface-hub | numpy | pydantic | requests | safetensors | torch | tqdm | transformers
Optional dependencies:
einops | mypy | nnsight | peft | pre-commit | pytest | pytest-cov | ruff | sae-lens | transformer-lens | trl | vllm
Downloads last day:
0
Downloads last week:
9
Downloads last month:
203