Structured Output Generation using DSPy and Outlines

Karthik Kalyanaraman

⸱

Cofounder and CTO

Oct 27, 2024

Introduction

In this post, we will see how to solve classification and named-entity recognition problems using DSPy and Outlines (from dottxt). This approach is ergonomic and clean, and it also guarantees schema adherence.

Approach

Let's do a simple boolean classification problem. We start by defining the DSPy signature.

Now we write our program using the ChainOfThought module from DSPy's library.

Next, we write a custom dspy.LM class that uses the Outlines library to generate text that follows the provided schema.

Finally, we do a two-pass generation to get the output in the desired format, a boolean in this case.

First, we pass the input passage to our DSPy program and generate an output.
Next, we pass the result of the previous step to the Outlines LM class as input, along with the response schema we have defined.
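The control flow of the two passes can be sketched without any model at all. In the snippet below, `reason` stands in for the DSPy program and `structure` for the Outlines-backed LM. Both, along with `RESPONSE_SCHEMA` and the function names, are hypothetical stand-ins so the example runs on the standard library alone.

```python
import json
from typing import Callable

# Hypothetical response schema: a JSON object with a single boolean field.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {"answer": {"type": "boolean"}},
    "required": ["answer"],
}


def two_pass_classify(
    passage: str,
    reason: Callable[[str], str],
    structure: Callable[[str, dict], str],
) -> bool:
    # Pass 1: free-form generation (the DSPy program in the post).
    draft = reason(passage)
    # Pass 2: schema-constrained generation over the first pass's output.
    raw = structure(draft, RESPONSE_SCHEMA)
    answer = json.loads(raw)["answer"]
    if not isinstance(answer, bool):
        raise ValueError(f"expected a boolean, got {answer!r}")
    return answer


# Stand-in callables so the sketch runs without any model.
def fake_reason(passage: str) -> str:
    return "The passage describes a football match, so the answer is yes."


def fake_structure(text: str, schema: dict) -> str:
    return json.dumps({"answer": "yes" in text.lower()})


result = two_pass_classify("The striker scored twice.", fake_reason, fake_structure)
print(result)
```

Separating the passes this way lets the first step reason freely while the second step is the only place where the output format is enforced.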

Conclusion

That's it! This approach combines the modularity of DSPy with the efficiency of structured output generation using outlines built by dottxt.

Source Code

You can find the full source code for this example here.

Langtrace x DSPy

Langtrace natively supports tracing and monitoring key metrics from DSPy optimizers and pipelines. This helps you understand how a chosen DSPy module or optimizer works under the hood, and gives you the visibility you need to optimize the performance of your application.

For more information, check out our previous blog post on this integration here.