Autoresearch
Autoresearch searches for the most accurate SQL for a metric by generating variations and scoring each one against ground truth data.
How it works
- You provide a metric and its ground truth SQL (the "correct" answer)
- OnlyMetrix generates up to 30 SQL variations with different approaches
- Each variation is scored against the ground truth using precision, recall, and F1
- The best variation wins and can be promoted to your metric catalog
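The scoring and selection steps above can be sketched in a few lines of Python. This is a minimal illustration, not OnlyMetrix's actual implementation: it assumes result sets are compared row by row as multisets, and the names `score`, `best_variation`, and the sample variation labels are hypothetical.

```python
from collections import Counter

def score(candidate_rows, truth_rows):
    """Return (precision, recall, f1) for a candidate result set.

    Assumption: rows are compared as exact-match multisets.
    """
    cand = Counter(candidate_rows)
    truth = Counter(truth_rows)
    true_pos = sum((cand & truth).values())  # rows present in both result sets
    n_cand = sum(cand.values())
    n_truth = sum(truth.values())
    precision = true_pos / n_cand if n_cand else 0.0
    recall = true_pos / n_truth if n_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def best_variation(variations, truth_rows):
    """Pick the variation with the highest F1.

    `variations` maps a (hypothetical) variation name to its result rows.
    """
    scored = {name: score(rows, truth_rows) for name, rows in variations.items()}
    best = max(scored, key=lambda name: scored[name][2])
    return best, scored[best]

# Illustrative data only: rows returned by the ground truth SQL and two variations.
truth = [("2024-01", 100), ("2024-02", 120), ("2024-03", 90)]
variations = {
    "v1_inner_join": [("2024-01", 100), ("2024-02", 120)],
    "v2_left_join": [("2024-01", 100), ("2024-02", 120), ("2024-03", 90), ("2024-04", 0)],
}

winner, (precision, recall, f1) = best_variation(variations, truth)
print(winner, round(precision, 3), round(recall, 3), round(f1, 3))
```

Here the first variation misses a row (perfect precision, lower recall) and the second returns one extra row (perfect recall, lower precision). Ranking by F1 balances the two kinds of error.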
When to use it
- You have a complex metric and aren't sure which SQL approach is most accurate
- You want to validate that your metric SQL matches a known-good baseline
- You're migrating metrics from another tool and want to verify accuracy
Running Autoresearch
Navigate to Autoresearch in the sidebar, select a metric, and click Run. The run executes in the background, and the results appear when it completes.
Understanding results
Results show each variation with:
- F1 score: the harmonic mean of precision and recall, which is high only when both are high
- Precision: the fraction of returned results that are correct
- Recall: the fraction of correct results that were returned
- SQL: the exact query used
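For reference, these are the standard definitions of the three scores, followed by a worked example. The symbols are assumptions introduced here for illustration: $TP$ is the number of returned results that match the ground truth, $R$ is the total number of returned results, and $G$ is the total number of ground truth results. How OnlyMetrix matches individual results is not specified here.

```latex
\text{Precision} = \frac{TP}{R}, \qquad
\text{Recall} = \frac{TP}{G}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}

% Worked example with TP = 8, R = 10, G = 16:
\text{Precision} = \frac{8}{10} = 0.8, \qquad
\text{Recall} = \frac{8}{16} = 0.5, \qquad
F_1 = \frac{2 \cdot 0.8 \cdot 0.5}{0.8 + 0.5} = \frac{0.8}{1.3} \approx 0.615
```

In this example F1 falls closer to the lower of the two scores, so a variation cannot reach a high F1 by doing well on precision or recall alone.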
Three-way classification
Metrics are classified based on their Autoresearch results:
| Classification | Meaning |
|---|---|
| Structured | High F1, compiled to an intermediate representation (IR), supports all analysis primitives |
| Opaque-Upgradeable | Low F1 but could be improved with better SQL |
| Opaque-Intentional | Intentionally kept as raw SQL (complex custom logic) |