How We Detect 300K+ Sponsorships from YouTube Video Descriptions

Every day, thousands of YouTube creators publish videos with sponsorship deals embedded in their descriptions. Affiliate links, promo codes, "sponsored by" mentions, tracking URLs — they're all there, hiding in plain text.

The question is: how do you extract structured data from unstructured video descriptions at scale?

The Two-Stage Pipeline

We didn't start with an LLM. We started with regex.

Stage 1: Regex Detection

The first pass runs every video description through a pattern-matching engine. We look for:

Promo code patterns — "Use code MKBHD", "discount code:", "coupon:"
Sponsorship disclosure — "#ad", "sponsored by", "paid partnership", "this video is brought to you by"
Known brand URL patterns — nordvpn.com/creator, squarespace.com/creator

This stage is fast, cheap, and catches ~80% of obvious sponsorships. But it misses nuanced mentions and can't identify the specific brand from a generic tracking URL.
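To make the first pass concrete, here is a minimal sketch of what such a pattern-matching step could look like. The pattern set, the signal names, and the detect_signals helper are illustrative assumptions built from the examples listed above, not our production rules, which cover far more phrasings and brands.

```python
import re

# Hypothetical patterns modeled on the examples in this post (assumption:
# the real rule set is much larger and tuned per brand).
SIGNAL_PATTERNS = {
    "promo_code": re.compile(
        r"\b(?:use\s+code\s+\w+|discount\s+code\s*:|coupon\s*:)", re.IGNORECASE
    ),
    "disclosure": re.compile(
        r"#ad\b|\b(?:sponsored\s+by|paid\s+partnership|brought\s+to\s+you\s+by)\b",
        re.IGNORECASE,
    ),
    "brand_url": re.compile(r"\b(?:nordvpn|squarespace)\.com/[\w-]+", re.IGNORECASE),
}


def detect_signals(description: str) -> dict[str, list[str]]:
    """Return the matched text for every signal type found in a description."""
    hits: dict[str, list[str]] = {}
    for name, pattern in SIGNAL_PATTERNS.items():
        matches = [m.group(0) for m in pattern.finditer(description)]
        if matches:
            hits[name] = matches
    return hits


if __name__ == "__main__":
    sample = (
        "This video is brought to you by NordVPN. "
        "Use code MKBHD at https://nordvpn.com/mkbhd #ad"
    )
    print(detect_signals(sample))
```

A description with no matches returns an empty dict, so it can be dropped before any further processing. Note that a match here only says a signal exists; it does not say which brand is behind a generic tracking URL, which is why a second stage is needed.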

Stage 2: LLM Verification

Videos flagged with high confidence in Stage 1 go to the LLM for brand identification and relationship classification. The model:

Identifies specific brand names from context
Classifies the relationship type (sponsor, affiliate, promo code)
Determines placement location (description, title, pinned comment)
Assigns a confidence score

We use structured output with strict JSON schemas to ensure consistent, parseable results.
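As a rough illustration of what a strict schema buys you, the sketch below defines a JSON Schema for one detection and a small validator that rejects any response that deviates from it. The field names, the 0 to 1 confidence range, and the parse_llm_result helper are assumptions made for this example; the enum values mirror the relationship types and placements listed above. The validator uses only the standard library rather than a schema-validation package.

```python
import json

RELATIONSHIP_TYPES = ("sponsor", "affiliate", "promo_code")
PLACEMENTS = ("description", "title", "pinned_comment")

# Hypothetical schema: field names and the confidence range are assumptions.
SPONSORSHIP_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "required": ["brand", "relationship", "placement", "confidence"],
    "properties": {
        "brand": {"type": "string", "minLength": 1},
        "relationship": {"enum": list(RELATIONSHIP_TYPES)},
        "placement": {"enum": list(PLACEMENTS)},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
}


def parse_llm_result(raw: str) -> dict:
    """Parse a model response and raise ValueError if it breaks the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    required = set(SPONSORSHIP_SCHEMA["required"])
    if set(data) != required:
        raise ValueError(f"missing or unexpected keys: {sorted(set(data) ^ required)}")
    if not isinstance(data["brand"], str) or not data["brand"].strip():
        raise ValueError("brand must be a non-empty string")
    if data["relationship"] not in RELATIONSHIP_TYPES:
        raise ValueError(f"unknown relationship: {data['relationship']!r}")
    if data["placement"] not in PLACEMENTS:
        raise ValueError(f"unknown placement: {data['placement']!r}")
    conf = data["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a number between 0 and 1")
    return data


if __name__ == "__main__":
    raw = (
        '{"brand": "NordVPN", "relationship": "sponsor", '
        '"placement": "description", "confidence": 0.92}'
    )
    print(parse_llm_result(raw))
```

Rejecting malformed responses outright, instead of trying to repair them, keeps downstream storage simple: every stored record has the same four fields with known value ranges.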

The Numbers

After 6 months of running this pipeline:

2M+ videos processed
300K+ sponsorship signals detected
10,847 unique brands identified
95%+ accuracy on verified relationships
Processing time: ~4 hours for a full daily batch

What's Next

We're exploring transcript analysis as a third detection stage — catching verbal sponsor mentions that never appear in the description. Early tests show this could increase detection by another 15-20%.

This is part of our Engineering series where we share how SponsorTrace is built.
