
Usage

Learn more about using AlphaCodium.

Configuration

The file alpha_codium/settings/configuration.toml contains the configuration for the project.

In the config section, choose the model you'd like to use ("gpt-4", "gpt-3.5-turbo-16k", or others).

You can adjust the flow by setting these configurations:

solve
self_reflection
possible_solutions
generate_ai_tests
initial_code_generation
public_tests
ai_tests

Solving Problems

To solve a specific problem, run:


python -m alpha_codium.solve_problem \
--dataset_name /path/to/dataset \
--split_name test \
--problem_number 0

Parameters:
- dataset_name: Path to the dataset folder downloaded in the installation step.
- split_name: Either valid or test.
- problem_number: Index of the problem (zero-based).

Each run logs its results to alpha_codium/example.log. Reviewing the log file is a good way to understand what happens at each stage of the flow.
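
The solve_problem command can also be assembled from Python, for example to try several problem indices in a row. This is a minimal sketch that uses only the three flags documented above; the helper `build_solve_command` is hypothetical, and the actual run is left commented out because it requires AlphaCodium and the dataset to be installed.

```python
import sys


def build_solve_command(dataset_name: str, split_name: str, problem_number: int) -> list[str]:
    """Build the argument list for the solve_problem command shown above."""
    if split_name not in ("valid", "test"):
        raise ValueError("split_name must be 'valid' or 'test'")
    if problem_number < 0:
        raise ValueError("problem_number is zero-based and cannot be negative")
    return [
        sys.executable, "-m", "alpha_codium.solve_problem",
        "--dataset_name", dataset_name,
        "--split_name", split_name,
        "--problem_number", str(problem_number),
    ]


# Print the commands for the first three problems of the test split.
for number in range(3):
    print(" ".join(build_solve_command("/path/to/dataset", "test", number)))

# To actually run one of them (requires AlphaCodium to be installed):
# import subprocess
# subprocess.run(build_solve_command("/path/to/dataset", "test", 0), check=True)
```

Passing the arguments as a list avoids shell quoting issues when the dataset path contains spaces.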

Solving the entire dataset

To solve the entire dataset, run:


python -m alpha_codium.solve_dataset \
--dataset_name /path/to/dataset \
--split_name test \
--database_solution_path /path/to/output/dir/dataset_output.json

Parameters:
- split_name: Either valid or test.
- database_solution_path: Path where the solutions will be saved (a JSON file in the example above).

The dataset section in the configuration file contains the settings for running and evaluating a dataset.

dataset.num_iterations sets the number of iterations for each problem (the K in pass@K). With a large number of iterations, it is recommended to introduce some randomness and vary the options between iterations to achieve top results.
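
To make the pass@K wording concrete: under the usual reading, a problem counts as solved if at least one of its K iterations passes. The sketch below computes that rate from made-up per-iteration results. The data layout and the function name `pass_at_k` are assumptions for illustration, not AlphaCodium's evaluation code.

```python
def pass_at_k(results: dict[str, list[bool]], k: int) -> float:
    """Fraction of problems with at least one passing attempt among the first k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not results:
        return 0.0
    solved = sum(any(attempts[:k]) for attempts in results.values())
    return solved / len(results)


# Hypothetical outcomes: one list of pass/fail flags per problem, one flag per iteration.
example = {
    "problem_0": [False, True, False],
    "problem_1": [False, False, False],
    "problem_2": [True, True, False],
    "problem_3": [False, False, True],
}

print(pass_at_k(example, k=1))  # 0.25
print(pass_at_k(example, k=3))  # 0.75
```

The jump from 0.25 to 0.75 shows why more iterations help only when the attempts differ from one another: repeating an identical failing attempt would leave the rate unchanged.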

Important Note: Solving the entire dataset is a long process, and it may take a few days to complete with large models (e.g. GPT-4) and several iterations per problem.

Evaluation

To evaluate the solutions, run:

