Generation

Efficient generation with flexible stopping criteria

from dart_math.gen import *

Difficulty-Aware Rejection Sampling (with Code Execution) in 5 Lines of Code

from dart_math.data import load_query_dps
from dart_math.gen import Generator, is_dp_dars_finished
from dart_math.eval import EvaluatorMathBatch
# ...
generator = Generator(llm, sampling_params, resp_sample_cls=RespSampleVLLM, batch_evaluator=(EvaluatorMathBatch() if not args.gen_only else None), code_exec_cfg=CodeExecCfg.load_from_id_or_path(args.code_exec_cfg) if args.code_exec_cfg else None)
generator.gen(query_dps=query_dps, dp_stop_criteria=is_dp_dars_finished, save_path=args.gen_save_path, n_paths_per_save=args.save_gen_path_bs)

`generator.gen` generates responses with the vLLM model `llm`, using the sampling parameters `sampling_params`, for the query data points `query_dps` until every data point meets the stopping criterion `dp_stop_criteria`.
Samples are generated in batches and evaluated with `batch_evaluator` if one is specified.
Generated samples are saved to `save_path`.

For a more detailed usage example, please refer to our generation script for DART-Math.
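
The generate-until-finished loop described above can be illustrated with a minimal, dependency-free sketch. It is not the `Generator.gen` implementation: `ToyQueryDP`, `toy_stop`, `toy_gen`, and `sample_fn` are hypothetical names introduced only for this example, and the "model" is a plain Python function.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyQueryDP:
    """Hypothetical stand-in for a query data point (not dart_math's class)."""
    query: str
    max_n_trials: int
    n_trials: int = 0
    responses: list = field(default_factory=list)


def toy_stop(dp: ToyQueryDP) -> bool:
    """Hypothetical stopping criterion: stop once the trial budget is used up."""
    return dp.n_trials >= dp.max_n_trials


def toy_gen(query_dps, dp_stop_criteria, sample_fn):
    """Sample one response per unfinished data point per round until all stop."""
    all_resps = []
    active = [dp for dp in query_dps if not dp_stop_criteria(dp)]
    while active:
        for dp in active:  # one "batch" = one new sample per active data point
            resp = sample_fn(dp.query)
            dp.n_trials += 1
            dp.responses.append(resp)
            all_resps.append(resp)
        # Drop the data points that now satisfy the stopping criterion.
        active = [dp for dp in active if not dp_stop_criteria(dp)]
    return all_resps


rng = random.Random(0)
dps = [ToyQueryDP("1+1=?", max_n_trials=2), ToyQueryDP("2*3=?", max_n_trials=3)]
resps = toy_gen(dps, toy_stop, lambda q: f"{q} -> {rng.randint(0, 9)}")
print(len(resps))  # 5 responses in total: 2 for the first query, 3 for the second
```

The point of the sketch is that the stopping check is applied per data point between batches, so easy queries leave the active set early while harder ones keep receiving samples.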

Generator

 Generator (llm:vllm.entrypoints.llm.LLM,
            sampling_params:vllm.sampling_params.SamplingParams,
            resp_sample_cls:type=<class 'dart_math.data.RespSampleVLLM'>,
            batch_evaluator:dart_math.eval.EvaluatorBatchBase|None=None,
            code_exec_cfg:dart_math.exec.CodeExecCfg|str|None=None)

Generator with various features such as stopping criteria and code execution.

| | Type | Default | Details |
|---|---|---|---|
| llm | LLM | | The `vllm` model to generate with (or another object with a compatible `generate` interface). |
| sampling_params | SamplingParams | | The sampling parameters for the `llm` (or another object with a compatible interface). NOTE: `n > 1` might cause bugs in `vllm` as of version 0.4.2. |
| resp_sample_cls | type | RespSampleVLLM | The class to collect the generated responses as. |
| batch_evaluator | dart_math.eval.EvaluatorBatchBase \| None | None | The batch evaluator used to evaluate the generated responses. `None` means no evaluation. |
| code_exec_cfg | dart_math.exec.CodeExecCfg \| str \| None | None | The tool-use configuration for code execution. |

Generator.gen

 Generator.gen (query_dps:list[dart_math.data.QueryDataPoint],
                dp_stop_criteria:Callable[[dart_math.data.QueryDataPoint],bool],
                save_path:str|None=None,
                n_paths_per_save:int|None=None)

Generate responses on the given query data points with specified stopping criteria.

| | Type | Default | Details |
|---|---|---|---|
| query_dps | list | | The query-level data points to generate responses on. |
| dp_stop_criteria | Callable | | The function that checks whether generation should stop for a query data point. |
| save_path | str \| None | None | Path to save the generated responses to. `None` or `""` means no saving. |
| n_paths_per_save | int \| None | None | Number of response-level samples (paths) per save. Only relevant when saving. |
| Returns | list[dart_math.data.RespSampleBase] \| None | | The generated responses, or `None` if they are saved to `save_path`. |
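
To make the interplay of `save_path` and `n_paths_per_save` concrete, here is a rough sketch of chunked saving to a JSONL file. It is an assumption about the general pattern, not the library's actual saving code: `save_in_chunks` and `_flush` are hypothetical helpers, and the record layout is made up for the example.

```python
import json
import os
import tempfile


def _flush(buffer, save_path):
    """Append the buffered responses to the JSONL file, one JSON object per line."""
    with open(save_path, "a", encoding="utf-8") as f:
        for resp in buffer:
            f.write(json.dumps(resp) + "\n")


def save_in_chunks(resps, save_path, n_paths_per_save):
    """Write responses to `save_path`, flushing every `n_paths_per_save` items."""
    buffer, n_flushes = [], 0
    for resp in resps:
        buffer.append(resp)
        if len(buffer) >= n_paths_per_save:
            _flush(buffer, save_path)
            buffer, n_flushes = [], n_flushes + 1
    if buffer:  # write whatever is left after the last full chunk
        _flush(buffer, save_path)
        n_flushes += 1
    return n_flushes


# Hypothetical response records; the real response classes carry more fields.
toy_resps = [{"query": "1+1=?", "resp": str(i)} for i in range(5)]
toy_path = os.path.join(tempfile.mkdtemp(), "toy.jsonl")
n_flushes = save_in_chunks(toy_resps, toy_path, n_paths_per_save=2)
with open(toy_path, encoding="utf-8") as f:
    n_lines = sum(1 for _ in f)
```

Flushing in chunks bounds how much work is lost if a long generation run is interrupted, at the cost of more frequent file writes.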

Generator.gen_pure

 Generator.gen_pure (input_strs:list[str])

Code execution only supports one-path generation for now.

| | Type | Details |
|---|---|---|
| input_strs | list | The input strings as direct input to the model. |
| Returns | list | The generated responses grouped by input strings. |
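
As an illustration of "responses grouped by input strings", the sketch below uses a stub with a vLLM-like `generate(prompts, sampling_params)` method that returns one request output per prompt, each holding a list of completions with a `text` attribute. `EchoLLM`, `ToyRequestOutput`, `ToyCompletion`, and `group_texts` are hypothetical names for this example only, and the exact return type of `gen_pure` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyCompletion:
    text: str


@dataclass
class ToyRequestOutput:
    prompt: str
    outputs: list  # list of ToyCompletion, one per sampled path


class EchoLLM:
    """Hypothetical stub exposing a vLLM-like generate(prompts, sampling_params)."""

    def generate(self, prompts, sampling_params=None):
        # Fall back to a single path when no sampling parameters are given.
        n = getattr(sampling_params, "n", 1) or 1
        return [
            ToyRequestOutput(p, [ToyCompletion(f"{p} [path {i}]") for i in range(n)])
            for p in prompts
        ]


def group_texts(llm, input_strs, sampling_params=None):
    """Return one list of generated texts per input string, in input order."""
    req_outputs = llm.generate(input_strs, sampling_params)
    return [[c.text for c in ro.outputs] for ro in req_outputs]


grouped = group_texts(EchoLLM(), ["a", "b"])
print(grouped)  # [['a [path 0]'], ['b [path 0]']]
```

Any object with a compatible `generate` method can be swapped in for the stub, which is convenient for testing the surrounding pipeline without a GPU.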

API Reference

Data Preprocessing

get_icl_egs

 get_icl_egs (dataset:str, n_shots:int=None, model_dirname:str|None=None)

Get the ICL examples for the dataset.

| | Type | Default | Details |
|---|---|---|---|
| dataset | str | | Preset dataset ID. |
| n_shots | int | None | Number of examples in the few-shot prompt. `None` or a negative value means the number adapts to the dataset. |
| model_dirname | str \| None | None | HF ID or path to the model. |
| Returns | list | | ICL examples adaptive to the dataset (and model). |

Stopping Criteria

is_dp_dars_finished

 is_dp_dars_finished (dp:dart_math.data.QueryDataPoint)

Judge whether DARS for a data point is finished and return the stopping reason or None if not finished.

| | Type | Details |
|---|---|---|
| dp | QueryDataPoint | Query data point having at least the following attributes: `max_n_trials` (and `n_trials`), `min_n_corrects` (and `n_corrects`). |
| Returns | str \| None | The stopping reason or `None` if not finished. |
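
A stopping criterion of this shape can be sketched as follows. This is a simplified illustration based only on the attributes listed above, not the actual `is_dp_dars_finished` code: `ToyDP`, `toy_is_dars_finished`, and the reason strings are hypothetical, and the real function may treat unset limits differently.

```python
from dataclasses import dataclass


@dataclass
class ToyDP:
    """Hypothetical data point carrying only the attributes the check needs."""
    max_n_trials: int
    min_n_corrects: int
    n_trials: int = 0
    n_corrects: int = 0


def toy_is_dars_finished(dp):
    """Return a stopping reason (str), or None if sampling should continue."""
    if dp.n_corrects >= dp.min_n_corrects:
        return "enough_corrects"  # hypothetical reason string
    if dp.n_trials >= dp.max_n_trials:
        return "max_trials_reached"  # hypothetical reason string
    return None


easy = ToyDP(max_n_trials=100, min_n_corrects=4, n_trials=5, n_corrects=4)
hard = ToyDP(max_n_trials=100, min_n_corrects=4, n_trials=100, n_corrects=1)
todo = ToyDP(max_n_trials=100, min_n_corrects=4, n_trials=10, n_corrects=2)
reasons = [toy_is_dars_finished(dp) for dp in (easy, hard, todo)]
```

Because a non-empty string is truthy and `None` is falsy in Python, a function returning `str | None` can be passed wherever a `bool`-returning `dp_stop_criteria` is expected.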

IO

get_res_fname

 get_res_fname (model_name_or_path:str, max_new_toks:int,
                temperature:float, top_p:float, prompt_template:str,
                dataset:str, n_shots:int, tag:str, inf_seed:int)

Get the JSONL file name to save results to.

| | Type | Details |
|---|---|---|
| model_name_or_path | str | HF ID or path to the model. |
| max_new_toks | int | Maximum length of the model output in tokens. |
| temperature | float | Temperature for sampling. |
| top_p | float | Top-p for sampling. |
| prompt_template | str | ID or path to the prompt template. |
| dataset | str | Name of the dataset to generate on. |
| n_shots | int | Number of examples in the few-shot prompt. |
| tag | str | Tag describing sample number information for the result file. |
| inf_seed | int | Seed for randomness. |
| Returns | str | Path to the result file. |