Evaluate CrewAI Agents using Langtrace

Karthik Kalyanaraman

Cofounder and CTO

Dec 20, 2024

Introduction

With the latest update, you can now evaluate both entire agent sessions and individual operations across your agent stack: tool calls, VectorDB retrievals, and LLM inferences.
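As a rough sketch of what this looks like in practice, the snippet below initializes the Langtrace SDK before building a minimal CrewAI crew, so the session and its nested operations are captured as traces. The agent, task, and API key values are illustrative, not from the original post.

```python
# A minimal sketch: instrument a CrewAI crew with Langtrace so the whole
# session and its individual operations (tool calls, LLM inferences) are
# traced. Assumes `pip install langtrace-python-sdk crewai`.
from langtrace_python_sdk import langtrace

# Initialize Langtrace first so downstream libraries are auto-instrumented.
langtrace.init(api_key="<LANGTRACE_API_KEY>")  # placeholder key

from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Summarize recent developments in LLM evaluation",
    backstory="An analyst who digs through technical sources.",
)

task = Task(
    description="Write a short summary of current LLM evaluation practices.",
    expected_output="A concise paragraph.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task])

# Each kickoff becomes an agent session in Langtrace; the nested LLM and
# tool spans can then be evaluated individually in the dashboard.
result = crew.kickoff()
print(result)
```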

For instance, evaluating individual agents in a CrewAI-powered setup shows how each agent is performing. By comparing the median and average scores across runs, you can easily spot outlier agent runs: when an agent's median diverges noticeably from its average, a few unusually high- or low-scoring runs are skewing the results.
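To make the median-versus-average comparison concrete, here is a small, self-contained Python sketch. The scores and threshold are made up for illustration and are not part of Langtrace itself.

```python
# Illustrative only: flag agents whose median and mean evaluation scores
# diverge, which suggests a few outlier runs are skewing the results.
from statistics import mean, median

# Hypothetical per-run evaluation scores for two agents.
scores_by_agent = {
    "researcher": [0.82, 0.85, 0.80, 0.84, 0.83],
    "writer":     [0.90, 0.88, 0.15, 0.91, 0.89],  # one badly failed run
}

THRESHOLD = 0.1  # acceptable gap between median and mean

for agent, scores in scores_by_agent.items():
    med, avg = median(scores), mean(scores)
    if abs(med - avg) > THRESHOLD:
        print(f"{agent}: median={med:.2f}, mean={avg:.2f} -> outlier runs likely")
    else:
        print(f"{agent}: median={med:.2f}, mean={avg:.2f} -> consistent runs")
```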

Track and optimize agent performance with precision across every aspect of your AI operations.


Ready to try Langtrace?

Try out the Langtrace SDK with just 2 lines of code.

Want to learn more?

Check out our documentation to learn more about how Langtrace works.

Join the Community

Check out our Discord community to ask questions and connect with other users.