Automatic Prompt Generation using DSPy

Karthik Kalyanaraman

⸱

Cofounder and CTO

Oct 25, 2024

Introduction

In this post, I will show you a simple implementation of "automatic prompt generation" for solving math problems from the GSM8K dataset, using the techniques behind DSPy's MIPROv2 optimizer. The program is made up of 3 modules:

Module 1 generates demos for the prompt
Module 2 generates an instruction for the prompt
Module 3 uses the outputs of modules 1 and 2 to generate the final prompt.

Module 1

This module takes a labeled training data set and generates 2 (NUM_SETS) sets of 10 demos each:

5 demos are directly sampled from the dataset
5 demos are generated by the model and kept only if they satisfy the metric, i.e. generated output = expected output
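To make the two kinds of demos concrete, here is a minimal sketch of the idea in plain Python. It is not the DSPy implementation: the helper names (build_demo_set, build_demo_sets, toy_predict), the exact-match metric, the toy dataset, and the choice to bootstrap only from examples that were not already sampled as labeled demos are all assumptions made for illustration.

```python
import random
from typing import Callable

NUM_SETS = 2          # number of demo sets to build
NUM_LABELED = 5       # demos copied straight from the training data
NUM_BOOTSTRAPPED = 5  # demos generated by the model, kept only if correct

Example = dict[str, str]  # {"question": ..., "answer": ...}


def metric(predicted: str, expected: str) -> bool:
    """Exact match on the answer, ignoring surrounding whitespace."""
    return predicted.strip() == expected.strip()


def build_demo_set(trainset: list[Example],
                   predict: Callable[[str], str],
                   rng: random.Random) -> list[Example]:
    # Part 1: sample labeled demos directly from the dataset.
    labeled = rng.sample(trainset, NUM_LABELED)

    # Part 2: ask the model to answer other examples and keep the ones
    # that pass the metric. (Assumption: skip examples already sampled.)
    pool = [ex for ex in trainset if ex not in labeled]
    rng.shuffle(pool)
    bootstrapped: list[Example] = []
    for ex in pool:
        if len(bootstrapped) == NUM_BOOTSTRAPPED:
            break
        predicted = predict(ex["question"])
        if metric(predicted, ex["answer"]):
            bootstrapped.append({"question": ex["question"], "answer": predicted})
    return labeled + bootstrapped


def build_demo_sets(trainset: list[Example],
                    predict: Callable[[str], str],
                    seed: int = 0) -> list[list[Example]]:
    rng = random.Random(seed)
    return [build_demo_set(trainset, predict, rng) for _ in range(NUM_SETS)]


# Toy stand-ins so the sketch runs without an LLM or the real dataset.
toy_trainset = [{"question": f"What is {i} + {i}?", "answer": str(2 * i)}
                for i in range(30)]


def toy_predict(question: str) -> str:
    """Hypothetical model: wrong whenever the operand is a multiple of 3."""
    i = int(question.split()[2])
    return str(2 * i) if i % 3 else "unknown"


demo_sets = build_demo_sets(toy_trainset, toy_predict)
```

In a real run, predict would call your language model, and the metric would compare the final numeric answer extracted from the model's response with the GSM8K label.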

Module 2

This module takes the 2 sets of 10 demos generated by Module 1, along with a string representation of the application code (that is, the code of this program), and generates 2 (NUM_INSTRUCTIONS) different instructions by:

Identifying the class of problems using the demos
Identifying the intent of the user using the program semantics
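The two steps above can be sketched as three model calls per instruction: one to summarize the problem class from the demos, one to summarize the intent from the program code, and one to write the instruction from both summaries. This is only an illustration under assumptions: the prompt wording, the helper names, the llm callable, and the choice of one instruction per demo set are mine and are not taken from DSPy.

```python
from typing import Callable

NUM_INSTRUCTIONS = 2

Example = dict[str, str]


def format_demos(demos: list[Example]) -> str:
    return "\n\n".join(f"Question: {d['question']}\nAnswer: {d['answer']}"
                       for d in demos)


def propose_instruction(demos: list[Example],
                        program_code: str,
                        llm: Callable[[str], str]) -> str:
    # Step 1: identify the class of problems from the demos.
    problem_class = llm(
        "In one sentence, describe the class of problems these examples "
        "belong to.\n\n" + format_demos(demos))
    # Step 2: identify the user's intent from the program code.
    user_intent = llm(
        "In one sentence, describe what the author of this program wants "
        "it to do.\n\n" + program_code)
    # Step 3: combine both summaries into an instruction.
    return llm(
        "Write an instruction that tells a language model how to solve "
        "this task.\n"
        f"Problem class: {problem_class}\n"
        f"User intent: {user_intent}")


def propose_instructions(demo_sets: list[list[Example]],
                         program_code: str,
                         llm: Callable[[str], str]) -> list[str]:
    # Assumption: one instruction is proposed per demo set.
    return [propose_instruction(demos, program_code, llm)
            for demos in demo_sets[:NUM_INSTRUCTIONS]]


# Toy stand-in for the model: records each prompt and returns a numbered reply.
calls: list[str] = []


def echo_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"reply {len(calls)}"


toy_demo_sets = [
    [{"question": "What is 2 + 2?", "answer": "4"}],
    [{"question": "What is 3 + 5?", "answer": "8"}],
]
toy_program_code = "def solve(question): ...  # answer grade-school math problems"
instructions = propose_instructions(toy_demo_sets, toy_program_code, echo_llm)
```

Because the program code is passed to the model as plain text, anything in that text, including comments and function names, becomes part of what the model uses to infer the intent.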

Module 3

In this final step, the module takes the outputs of the previous two modules as inputs and generates 2 different final prompts (since we have 2 sets of 10 demos from Module 1 and 2 instructions from Module 2).
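As a rough picture of what a final prompt looks like, the sketch below pairs each instruction with one demo set and joins them with a plain string template. Note the substitution: in this post the final prompt is generated by a DSPy module, whereas the template, the placeholder, and the helper names here are hypothetical and only show how the pieces fit together.

```python
Example = dict[str, str]

QUESTION_SLOT = "{question}"  # hypothetical placeholder for the new question


def build_prompt(instruction: str, demos: list[Example]) -> str:
    parts = [instruction.strip(), ""]
    for demo in demos:
        parts += [f"Question: {demo['question']}",
                  f"Answer: {demo['answer']}",
                  ""]
    parts += [f"Question: {QUESTION_SLOT}", "Answer:"]
    return "\n".join(parts)


def build_prompts(instructions: list[str],
                  demo_sets: list[list[Example]]) -> list[str]:
    # Pair instruction i with demo set i, giving one prompt per pair.
    return [build_prompt(instruction, demos)
            for instruction, demos in zip(instructions, demo_sets)]


toy_instructions = [
    "Solve the arithmetic problem and give only the final number.",
    "Work through the problem step by step, then state the answer.",
]
toy_demo_sets = [
    [{"question": "What is 2 + 2?", "answer": "4"}],
    [{"question": "What is 3 + 5?", "answer": "8"},
     {"question": "What is 6 + 1?", "answer": "7"}],
]

prompts = build_prompts(toy_instructions, toy_demo_sets)

# Fill the slot with a new question before sending the prompt to the model.
filled = prompts[0].replace(QUESTION_SLOT, "What is 7 + 6?")
```

The slot is filled with str.replace rather than str.format so that braces inside a demo or an instruction cannot break the template.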

Conclusion

That's how you can generate prompt candidates using DSPy. Note that we started with nothing but a labeled dataset. If you are curious to dive deeper into this prompt optimization technique, check out the research paper here. If you would like to start using this optimizer, check out the DSPy docs here.

Source Code

You can find the full source code for this example here.

Additional Notes

Each of the 3 modules is built using DSPy's ChainOfThought module and Signature hints to guide the program toward the behavior we want.
You can use Langtrace to understand what goes in and out of the LLM and trace every step within each module deeply.
The final prompts can be further optimized using a metric, and you can technically generate 4 prompts from 2 demo sets and 2 instructions (2 x 2 combinations). Both are left out for the sake of simplicity.
Since module 2 uses the program code to identify the intent, re-structuring your code or adding comments can affect the outputs.
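For the 2 x 2 case mentioned in the notes, a minimal sketch of enumerating all four instruction and demo-set combinations and picking the best one with a metric could look like this. The run callable, the accuracy helper, and the toy data are hypothetical, and an exhaustive pass like this is a simplification for illustration, not MIPROv2's own search procedure.

```python
from itertools import product
from typing import Callable

Example = dict[str, str]
Candidate = tuple[str, list[Example]]  # (instruction, demo set)
Runner = Callable[[str, list[Example], str], str]


def all_candidates(instructions: list[str],
                   demo_sets: list[list[Example]]) -> list[Candidate]:
    # Every instruction paired with every demo set: 2 x 2 = 4 candidates.
    return list(product(instructions, demo_sets))


def accuracy(candidate: Candidate, devset: list[Example], run: Runner) -> float:
    instruction, demos = candidate
    correct = sum(
        run(instruction, demos, ex["question"]).strip() == ex["answer"].strip()
        for ex in devset)
    return correct / len(devset)


def pick_best(candidates: list[Candidate],
              devset: list[Example],
              run: Runner) -> Candidate:
    return max(candidates, key=lambda c: accuracy(c, devset, run))


# Toy stand-ins so the sketch runs without an LLM.
toy_instructions = ["Answer the question.", "Solve the problem step by step."]
toy_demo_sets = [
    [{"question": "What is 2 + 2?", "answer": "4"}],
    [{"question": "What is 2 + 2?", "answer": "4"},
     {"question": "What is 3 + 5?", "answer": "8"}],
]
toy_devset = [
    {"question": "What is 4 + 9?", "answer": "13"},
    {"question": "What is 1 + 1?", "answer": "2"},
]


def toy_run(instruction: str, demos: list[Example], question: str) -> str:
    """Hypothetical model: only right with the better instruction and 2+ demos."""
    words = question.split()
    a, b = int(words[2]), int(words[4].rstrip("?"))
    if "step by step" in instruction and len(demos) >= 2:
        return str(a + b)
    return "0"


candidates = all_candidates(toy_instructions, toy_demo_sets)
best = pick_best(candidates, toy_devset, toy_run)
```

In practice, run would format the prompt from the instruction and demos, send it to the model, and return the model's answer for the given question.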

Langtrace x DSPy

Langtrace natively supports tracing and monitoring key metrics from DSPy optimizers and pipelines. This helps you understand how a chosen DSPy module or optimizer works under the hood and gives you the visibility you need to optimize the performance of your application.

For more information, check out our previous blog post on this integration here. Here are some additional threads that people have found helpful: