1. Prompt
Could you please write documentation for the `next_obs` variable in Markdown, following the format and style of the PyTorch documentation?
2. Evaluation using 2080Ti
```bash
python submit_eval_jobs.py --n-gpus 1
```
3. Scripts
3.1 Infer/run_cot_eval.py
3.1.1 Arguments
Required Arguments
Argument | Type | Description |
---|---|---|
`--answer_extraction_fn` | str | Function name for extracting answers from model outputs |
`--eval_fn` | str | Function name for evaluating predictions |
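
Both required arguments take a function name rather than a callable, so the name has to be resolved to a function at runtime. The sketch below shows one way such a lookup could work, assuming a simple registry dictionary. `extract_last_number`, `exact_match`, `REGISTRY`, and `resolve` are hypothetical names chosen for illustration; they are not taken from `run_cot_eval.py`.

```python
import re
from typing import Callable, Dict


def extract_last_number(output: str) -> str:
    """Hypothetical extractor: return the last number found in the model output."""
    # Drop thousands separators so "1,234" is read as a single number.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", output.replace(",", ""))
    return numbers[-1] if numbers else ""


def exact_match(prediction: str, reference: str) -> bool:
    """Hypothetical evaluator: compare prediction and reference as stripped strings."""
    return prediction.strip() == reference.strip()


# Assumed registry mapping command-line names to functions.
REGISTRY: Dict[str, Callable] = {
    "extract_last_number": extract_last_number,
    "exact_match": exact_match,
}


def resolve(name: str) -> Callable:
    """Look up a function by the name passed on the command line."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown function name: {name!r}") from None
```

With this sketch, `resolve("extract_last_number")` would stand in for whatever `--answer_extraction_fn extract_last_number` selects, and an unknown name fails early with a clear error.
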
Model Configuration
Argument | Type | Default | Description |
---|---|---|---|
`--model_name_or_path` | str | None | Path or HuggingFace model identifier |
`--tokenizer_name_or_path` | str | None | Tokenizer path (defaults to model path) |
`--load_in_8bit` | bool | False | Load model in 8-bit quantization mode |
`--load_in_half` | bool | False | Load model in half precision (float16) |
`--gptq` | bool | False | Use GPTQ 4-bit quantization |
`--use_vllm` | bool | False | Use vLLM for inference acceleration |
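
The tokenizer fallback described in the table can be sketched with `argparse`. This is a minimal illustration, not the script's actual parser: it assumes the boolean options are `store_true` switches, and `build_model_parser` and `parse_model_args` are hypothetical helper names.

```python
import argparse
from typing import List, Optional


def build_model_parser() -> argparse.ArgumentParser:
    """Declare the model options from the table (boolean flags assumed to be switches)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name_or_path", type=str, default=None)
    parser.add_argument("--tokenizer_name_or_path", type=str, default=None)
    parser.add_argument("--load_in_8bit", action="store_true")
    parser.add_argument("--load_in_half", action="store_true")
    parser.add_argument("--gptq", action="store_true")
    parser.add_argument("--use_vllm", action="store_true")
    return parser


def parse_model_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments and apply the documented tokenizer fallback."""
    args = build_model_parser().parse_args(argv)
    # When no tokenizer path is given, reuse the model path.
    if args.tokenizer_name_or_path is None:
        args.tokenizer_name_or_path = args.model_name_or_path
    return args
```

Calling `parse_model_args(["--model_name_or_path", "my-model"])` leaves `tokenizer_name_or_path` equal to `"my-model"`; here `my-model` is a placeholder, not a real model identifier.
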
Data Configuration
Argument | Type | Default | Description |
---|---|---|---|
`--data_dir` | str | "data/mgsm" | Directory containing test data |
`--max_num_examples` | int | None | Maximum number of examples to evaluate |
`--infer_train_set` | bool | False | Evaluate on training set instead of test set |
`--prompt_format` | str | "sft" | Prompt format: 'sft' or 'few_shot' |
`--few_shot_prompt` | str | None | Few-shot prompt class name |
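
The data options interact in two ways worth making concrete: `--prompt_format` accepts only two values, and `--max_num_examples` caps the evaluated set. The sketch below illustrates both under stated assumptions. Requiring `--few_shot_prompt` whenever the format is `few_shot` is an assumption, not documented behavior, and `parse_data_args` and `limit_examples` are hypothetical helper names.

```python
import argparse
from typing import List, Optional, Sequence


def parse_data_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the data options from the table with the documented defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_dir", type=str, default="data/mgsm")
    parser.add_argument("--max_num_examples", type=int, default=None)
    parser.add_argument("--infer_train_set", action="store_true")
    parser.add_argument("--prompt_format", type=str, default="sft",
                        choices=["sft", "few_shot"])
    parser.add_argument("--few_shot_prompt", type=str, default=None)
    args = parser.parse_args(argv)
    # Assumption: a few-shot run needs a prompt class name to be usable.
    if args.prompt_format == "few_shot" and args.few_shot_prompt is None:
        parser.error("--few_shot_prompt is required when --prompt_format is few_shot")
    return args


def limit_examples(examples: Sequence, max_num_examples: Optional[int]) -> Sequence:
    """Keep at most max_num_examples items; None means no limit."""
    if max_num_examples is None:
        return examples
    return examples[:max_num_examples]
```
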
Inference Configuration
Argument | Type | Default | Description |
---|---|---|---|
`--eval_batch_size` | int | 1 | Batch size for evaluation |
`--temperature` | float | 0.0 | Sampling temperature |
`--max_tokens` | int | 1024 | Maximum tokens to generate |
`--gpus` | str | None | Comma-separated GPU IDs |
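
Because `--gpus` is a comma-separated string, it has to be split into device IDs before use. The sketch below shows one plausible handling: parse the string and export it through the `CUDA_VISIBLE_DEVICES` environment variable. Whether the script does this is an assumption, and `parse_gpu_ids` and `restrict_visible_gpus` are hypothetical helper names.

```python
import os
from typing import List, MutableMapping, Optional


def parse_gpu_ids(gpus: Optional[str]) -> List[int]:
    """Turn a string such as "0,1" into [0, 1]; None or "" yields an empty list."""
    if not gpus:
        return []
    return [int(part) for part in gpus.split(",") if part.strip()]


def restrict_visible_gpus(
    gpus: Optional[str],
    env: Optional[MutableMapping[str, str]] = None,
) -> List[int]:
    """Write the parsed IDs to CUDA_VISIBLE_DEVICES in env (os.environ by default)."""
    target = os.environ if env is None else env
    ids = parse_gpu_ids(gpus)
    if ids:
        target["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in ids)
    return ids
```

For the variable to take effect, it needs to be set before the deep learning framework initializes CUDA in the process.
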
Parallel Processing
Argument | Type | Default | Description |
---|---|---|---|
`--n_subsets` | int | 1 | Number of data subsets for parallel processing |
`--subset_id` | int | 0 | Current subset ID for this process |
`--n_repeat_sampling` | int | 1 | Number of repeated samplings |
`--repeat_id_start` | int | 0 | Starting repeat ID |
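
The subset and repeat options suggest that each process handles one shard of the data and a range of repeat IDs. The sketch below illustrates that idea with a strided split. The actual partitioning scheme in the script is not documented here, so the strided split is an assumption, and `select_subset` and `repeat_ids` are hypothetical helper names.

```python
from typing import List, Sequence


def select_subset(examples: Sequence, n_subsets: int, subset_id: int) -> Sequence:
    """Return the shard for subset_id using a strided split (assumed scheme)."""
    if not 0 <= subset_id < n_subsets:
        raise ValueError(f"subset_id must be in [0, {n_subsets}), got {subset_id}")
    # Process k takes items k, k + n_subsets, k + 2 * n_subsets, ...
    return examples[subset_id::n_subsets]


def repeat_ids(repeat_id_start: int, n_repeat_sampling: int) -> List[int]:
    """Return the repeat IDs covered by this run."""
    return list(range(repeat_id_start, repeat_id_start + n_repeat_sampling))
```

With the defaults (`--n_subsets 1`, `--subset_id 0`), `select_subset` returns the whole dataset, and `repeat_ids(0, 1)` gives a single repeat with ID 0. Across all subset IDs, the shards are disjoint and together cover every example.
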
Output Configuration
Argument | Type | Default | Description |
---|---|---|---|
`--save_dir` | str | "results/mgsm" | Directory to save evaluation results |
`--complete_partial_output` | bool | False | Complete partial model outputs |