Topic 02 · AI Tagging
AI Tagging.
The vendor-neutral research index for AI image tagging. Capability matrices across classical computer-vision APIs (Google, AWS, Azure, Clarifai, Imagga, Hive, Cloudinary) and frontier multimodal LLMs (Anthropic Claude, OpenAI GPT-4o, Google Gemini). Same rubric. Same month. Public docs only.
Featured report
Statistics — featured (cited from external research)
Statistic · Vision API economics
70×
Cost spread between cheapest and priciest frontier multimodal LLM
Gemini 2.0 Flash $0.0002 to Claude Opus 4.7 (3 MP) $0.014 per image. Computed from published per-token rates × documented image-token formulas.
AI Tagging · n=7 products
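The arithmetic behind the card above can be sketched as follows. This is a minimal illustration, not the index's actual method: `per_image_cost` and `cost_spread` are hypothetical helper names, and the token count and per-million-token rate in the second example are made-up placeholders rather than any provider's published figures. Only the two per-image prices ($0.0002 and $0.014) come from the card.

```python
from decimal import Decimal


def per_image_cost(image_tokens: int, usd_per_million_tokens: Decimal) -> Decimal:
    """Per-image cost = tokens one image consumes x input price per token."""
    return Decimal(image_tokens) * usd_per_million_tokens / Decimal(1_000_000)


def cost_spread(cheapest: Decimal, priciest: Decimal) -> Decimal:
    """Ratio between the priciest and cheapest per-image cost."""
    if cheapest <= 0:
        raise ValueError("cheapest cost must be positive")
    return priciest / cheapest


# Per-image figures quoted in the card (cheapest and priciest frontier LLM).
spread = cost_spread(Decimal("0.0002"), Decimal("0.014"))
print(f"cost spread: {spread:f}x")

# Hypothetical inputs, for illustration only: 1,000 image tokens billed
# at $3.00 per million input tokens. Neither number is from the source.
example_cost = per_image_cost(1_000, Decimal("3.00"))
print(f"example per-image cost: ${example_cost:f}")
```

`Decimal` is used instead of `float` so that prices this small divide without binary rounding noise.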
Statistic · Benchmark
94%
Top MMMU-Pro multimodal score, May 2026 (GPT-5.4 Pro)
GPT-5.4 Pro 94% · Claude Mythos Preview 92.7% · Gemini 3.1 Pro 83.9%. The top tier has pulled away. Source: BenchLM.ai.
AI Tagging · n=27 models
Statistic · Benchmark methodology
17–27pt drop
Multimodal LLMs lose 17-27 points when text shortcuts are removed
On MMMU-Pro vision-only, Claude 3.5 Sonnet drops from 68.3% to 48.0%. Source: Yue et al., arXiv 2409.02813.
AI Tagging · n=6 models cited
Statistic · Market sizing
$7.49B by 2032
Image recognition API market: $3.12B (2025) to $7.49B (2032), 13.6% CAGR
Three analysts converge on 13-15% CAGR. Sources: LP Information, Fortune Business Insights, Grand View Research.
AI Tagging · 3 sources
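For readers who want to sanity-check a market-sizing figure like the one above, the standard compound annual growth rate formula is easy to apply. The sketch below assumes seven compounding periods (2025 to 2032) and a hypothetical helper named `cagr`; analysts often compound from a different base year or forecast window, so the implied rate can differ from a published CAGR by a few tenths of a point.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


# Market sizes in billions of USD, as quoted in the card.
# Assumption: growth compounds annually over 2032 - 2025 = 7 periods.
implied_rate = cagr(3.12, 7.49, 2032 - 2025)
print(f"implied CAGR: {implied_rate:.1%}")
```

Under that seven-period assumption the implied rate sits close to the 13-15% range the three analysts report.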
Statistics — from the AI Tagging Provider Index (capability slices)
Statistic · AI Tagging Index
3 of 10
Vision APIs that can answer free-form questions about an image
Only three of ten leading image-tagging APIs can answer arbitrary natural-language questions about an image. The other seven return labels from a closed taxonomy.
AI Tagging · n=10
Statistic · AI Tagging Index
9 of 10
Vision APIs publishing per-unit pricing
Nine of ten publish a per-unit rate. One requires a sales call. Token-based LLM pricing carries a 5-15× per-image cost premium over the classical CV APIs.
AI Tagging · n=10
Statistic · AI Tagging Index
4 of 10
Vision APIs with a free tier — no credit card
Four of ten let a developer make real API calls without entering a payment method. The three hyperscalers and all frontier LLMs require a card.
AI Tagging · n=10
Statistic · AI Tagging Index
5 of 10
Vision APIs with documented OCR / text extraction
Five ship a documented OCR endpoint. The frontier LLMs read text well via prompt but it's not a named, documented capability.
AI Tagging · n=10
Statistic · AI Tagging Index
6 of 10
Vision APIs with custom model training via API
Six let operators train on their own labeled data via API. This is the column the classical CV providers still win.
AI Tagging · n=10
Statistic · AI Tagging Index
5 of 10
Vision APIs publishing a paid-tier uptime SLA
Five publish a paid-tier SLA. Two of the three frontier multimodal LLMs publish only a status page, not a contract.
AI Tagging · n=10
Statistic · From the corpus
5+ tags / asset
Average AI-generated tags per creative asset
Across 1M+ creative assets. Objects, mood, brand elements, clip-level video tags.
AI Tagging · n=1M+ assets
In preparation
Report 05 · In preparation
AI Tagging Accuracy Field Study
Precision, recall, and human-agreement rates of LLM-generated creative tags across 10,000+ human-verified assets. Per category, per model. Drops Q3 2026.
Q3 2026
What we're publishing on this topic
AI tagging is the most-claimed feature in 2026 DAM and creative-tooling marketing, and the least rigorously benchmarked in that year's journalism. The Provider Index (Report 03) measures the capability surface of every major image-tagging API: what they can do, where they hide pricing, and where the classical-CV vs frontier-LLM split shows up. The Accuracy Field Study (Report 05, in preparation) will measure what tags they actually produce, scored against human-verified ground truth. Both reports re-verify on a published cadence; corrections from providers fold in within 7 days.
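Scoring generated tags against human-verified ground truth could look something like the sketch below. It is an illustration under stated assumptions, not the Field Study's methodology: `normalize_tags` and `score_tags` are hypothetical names, tags are matched by exact string after lowercasing and trimming, and Jaccard overlap stands in for a "human-agreement rate", which the study may well define differently.

```python
from typing import Dict, Iterable, Set


def normalize_tags(tags: Iterable[str]) -> Set[str]:
    """Lowercase and trim tags so 'Dog ' and 'dog' count as the same tag."""
    return {t.strip().lower() for t in tags if t.strip()}


def score_tags(predicted: Iterable[str], verified: Iterable[str]) -> Dict[str, float]:
    """Compare model-generated tags with human-verified tags for one asset."""
    pred = normalize_tags(predicted)
    truth = normalize_tags(verified)
    true_positives = len(pred & truth)
    union = pred | truth
    return {
        # Share of generated tags that a human also applied.
        "precision": true_positives / len(pred) if pred else 0.0,
        # Share of human-applied tags that the model recovered.
        "recall": true_positives / len(truth) if truth else 0.0,
        # Assumption: Jaccard overlap as a simple agreement measure.
        "agreement": true_positives / len(union) if union else 1.0,
    }


# Made-up example asset, for illustration only.
scores = score_tags(
    predicted=["Dog", "beach", "sunset"],
    verified=["dog", "beach", "ocean", "summer"],
)
print(scores)
```

Exact-match scoring is deliberately strict: synonyms such as "sea" and "ocean" count as misses, so a real study would likely need a synonym map or a human adjudication step on top of this.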
If you build on a vision API and want to nominate a provider for the next index revision, or share a behavioural data point we should incorporate, mail the team via the About page.