Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models ...
Note: jsrun is experimental. Expect breaking changes between versions. One of the most compelling use cases for jsrun is building safe execution environments for AI agents. When LLMs generate code, ...