diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
index 5493c4201..279994c59 100644
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -13,4 +13,4 @@
 - [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit).
 - [ ] Add unit tests as outlined in the [Running Unit Tests](https://docs.sglang.ai/references/contribution_guide.html#running-unit-tests-adding-to-ci).
 - [ ] Update documentation / docstrings / example tutorials as needed, according to [Writing Documentation](https://docs.sglang.ai/references/contribution_guide.html#writing-documentation-running-docs-ci).
-- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html).
+- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html) and [Accuracy Results](https://docs.sglang.ai/references/accuracy_evaluation.html).
diff --git a/docs/references/accuracy_evaluation.md b/docs/references/accuracy_evaluation.md
index 053dd8369..123d1cab0 100644
--- a/docs/references/accuracy_evaluation.md
+++ b/docs/references/accuracy_evaluation.md
@@ -1,6 +1,6 @@
 # Measuring Model Accuracy in SGLang
 
-This guide shows how to evaluate model accuracy using SGLang's [built-in benchmarks](https://github.com/sgl-project/sglang/tree/b045841baeff37a5601fcde23fa98bd09d942c36/benchmark).
+This guide shows how to evaluate model accuracy using SGLang's [built-in benchmarks](https://github.com/sgl-project/sglang/tree/b045841baeff37a5601fcde23fa98bd09d942c36/benchmark). Please include accuracy on crucial benchmarks in your PR if you make modifications on the model side, like the kernel and model architecture.
 
 ## Benchmarking Model Accuracy
 
@@ -47,7 +47,7 @@ def few_shot_gsm8k(s, question):
     )
 ```
 
-These adjustments give us the us the reported accuracy.
+These adjustments should return the desired accuracy.
 
 ## Extending Evaluation Capabilities
 