Comet

AI-ASSISTANTS
Velocity 5.0

ML experiment tracking and LLM observability platform, including Opik for evaluating LLM apps.

Comet's Opik pushes deeper into agent eval and framework-portable observability.

llm-observability, agent-evaluation, framework-portability, cost-tracking, crawl-source-mismatch
Current state
This feed tracks Opik, Comet's observability and evaluation tool for LLM apps and agents. Because the crawl source is Comet's marketing site, the feed mixes genuine feature posts with cost-tracking and observability explainers. The product-bearing items center on agent evaluation (Test Suites), tracing, and a new integration with Oracle's Open Agent Specification for framework portability.
Where it's heading
Opik is broadening from observability into the full agent build-test-ship loop: standardized agent specs, automated evaluation suites, and an Agent Playground (visible deeper in the feed). The throughline is reducing lock-in to any single agent framework while owning the evaluation and debugging layer that sits on top of those frameworks.
Prediction
Expect more eval-automation and framework-interop features, plus continued cost-tracking content aimed at teams feeling LLM-spend pain. Release cadence is partly obscured because the crawl source is the marketing site rather than a dedicated changelog.

Recent moves

  1. 18h ago

    How Evaluation-Driven Development (EDD) Works

  2. 3d ago

    Opik + Oracle Agent Specification: Build Once, Run Anywhere

    Opik integrates with Oracle's Open Agent Specification, letting teams build agents against a portable spec and avoid framework lock-in. Extends Opik's reach from observability toward standards-based agent portability.

  3. 7d ago

    AI Evaluation Simplified: Automate Dataset & Metric Eval Workflows with Test Suites

    Introduces Test Suites to automate dataset and metric evaluation workflows, reducing the manual work of building reference datasets and judge prompts. Advances Opik's move into the full agent evaluation loop.

  4. 7d ago

    Advanced Claude Code Cost Tracking: How to Save 30% on Token Spend

    How-to on cutting Claude Code token spend. Cost-tracking marketing content riding on coding-agent adoption, not a product change.

  5. 15d ago

    Understanding Your Claude Code Spend: What’s Actually Driving the Cost

    Analysis post on what drives Claude Code spend. Thought-leadership content, not a release.

  6. 29d ago

    Agent Tracing and Observability: Log & Debug Complex AI Systems

    Explainer on agent tracing and observability for debugging multi-agent systems. Educational content reinforcing Opik's core use case, not a product change.
