
Web Scraping for Brand Monitoring

Monitor brand mentions across the web using fastCRW search + scrape: discover mentions on news sites, blogs, and forums, extract sentiment, and get real-time alerts.

Published: May 12, 2026
Updated: May 12, 2026
Category: Use cases
Verdict

fastCRW enables cost-effective brand monitoring at scale. Use `/v1/search` to find mentions across the web, scrape results with `/v1/scrape`, analyze sentiment with LLM extraction, and set up real-time alerts to protect and understand your brand reputation.

  • Search the web for brand mentions using the `/v1/search` endpoint
  • Scrape full articles and forums to extract context and sentiment
  • Extract structured sentiment and topic tags with LLM analysis

Why Brand Monitoring Requires Web Scraping

Your brand is discussed across hundreds of websites every day: news articles, blog reviews, customer forums, social media aggregators, and competitor websites. Manual monitoring is impossible.

Automated monitoring gives you:

  • Real-time visibility into brand mentions across news, blogs, forums, and reviews
  • Sentiment analysis to understand whether mentions are positive, negative, or neutral
  • Context extraction to see exactly what's being said and in what context (product review vs. crisis coverage)
  • Trend detection to spot emerging issues before they blow up
  • Competitive intelligence on how your brand is positioned versus competitors
  • Crisis alerts to notify leadership immediately when negative sentiment spikes

Without scraping, you're flying blind. Manual monitoring catches only obvious mentions, and sentiment analysis requires reading dozens of articles manually—work that compounds daily.

Where fastCRW Helps

| Monitoring need | fastCRW role |
| --- | --- |
| Mention discovery | `/v1/search` finds brand mentions across news, blogs, forums, and reviews |
| Full context extraction | `/v1/scrape` returns complete article text to understand the mention's context |
| Sentiment analysis | LLM extraction automatically categorizes mentions as positive, negative, or neutral |
| Alert routing | Custom logic on scraped data triggers alerts for crisis situations (negative spikes, security mentions) |
| Historical tracking | Store scraped mentions with timestamps to track sentiment trends over months |
| Competitive analysis | Search for competitor names alongside your brand to understand relative positioning |
| Trend identification | Identify emerging topics, keywords, or concerns linked to brand mentions |

Typical Brand Monitoring Workflow

  1. Define monitoring scope: Identify brand keywords, product names, executive names, and competitor names to track.
  2. Schedule searches: Use /v1/search to find mentions daily (or hourly for high-priority keywords).
  3. Scrape results: For each search result, use /v1/scrape to extract the full article text, author, and publish date.
  4. Analyze sentiment: Use LLM extraction to automatically categorize sentiment and identify key topics mentioned.
  5. Deduplicate: Store content hashes to avoid processing the same mention multiple times.
  6. Alert on anomalies: Trigger immediate notifications for crisis keywords (security breach, lawsuit, executive departure), negative sentiment spikes, or mentions from key influencers.
  7. Report trends: Monthly or quarterly, aggregate sentiment and themes to brief leadership on brand perception.
  8. Take action: Respond to major issues via PR, social media, or direct outreach.

Good Fits for Brand Monitoring

  • Enterprise brands with significant online presence and reputation risk
  • SaaS startups tracking product reviews, customer sentiment, and competitive comparisons
  • Consumer brands monitoring retail reviews, user discussions, and social mentions
  • Executives and public figures tracking personal brand perception and media coverage
  • Crisis management teams detecting emerging issues before escalation
  • Competitive intelligence teams comparing brand positioning across the market
  • Customer success teams discovering user-generated content and feedback
  • Marketing teams measuring campaign impact and brand sentiment changes

Architecture: Building a Brand Monitoring Pipeline

A production-grade brand monitoring system requires:

1. Search Layer: Use /v1/search to query the web for brand mentions on a recurring schedule. Maintain a list of keywords:

  • Primary keywords (exact brand name, domain, tagline)
  • Product keywords (product names, feature names)
  • Executive keywords (CEO name, founder name)
  • Competitive keywords (competitor names, comparative phrases)

2. Result Filtering: Not all search results are relevant. Filter by domain type (news sites, blogs, and forums are useful; spam directories are noise). fastCRW helps by returning clean URLs; your logic filters by domain authority or category.

3. Scraping Layer: For each relevant search result, use /v1/scrape to extract full article text and metadata (author, publish date, source domain). Store raw results.

4. Analysis Layer: Use LLM extraction to analyze:

  • Sentiment (positive, negative, neutral, mixed)
  • Topics (what aspects of your brand are discussed: product quality, pricing, customer service, security, etc.)
  • Tone (news coverage vs. review vs. criticism vs. praise)
  • Key quotes or claims about your brand

5. Deduplication: Multiple outlets may cover the same news story. Deduplicate by content hash or semantic similarity to avoid counting the same mention multiple times.
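
Content-hash deduplication can be as simple as normalizing text before hashing, so trivially reformatted copies of the same article collapse to one key. A minimal sketch (semantic-similarity dedup would additionally need an embedding model):

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Lowercase and collapse whitespace before hashing, so
    trivially reformatted copies of an article produce the same key."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

seen_hashes: set[str] = set()

def is_duplicate(article_text: str) -> bool:
    """True if an equivalent article has already been processed."""
    key = content_fingerprint(article_text)
    if key in seen_hashes:
        return True
    seen_hashes.add(key)
    return False
```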

6. Storage & Indexing: Store mentions in your database with mention date, source URL, article text, extracted sentiment, topics, and any custom fields. Index by date, sentiment, and topic for quick filtering.
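
As one possible storage layout, here is a SQLite sketch of that mention store. The column names are illustrative, not a fastCRW requirement; adapt them to your own schema:

```python
import sqlite3
import json

conn = sqlite3.connect(":memory:")  # use a file path in production
conn.execute("""
    CREATE TABLE IF NOT EXISTS mentions (
        id INTEGER PRIMARY KEY,
        scraped_at TEXT NOT NULL,   -- ISO 8601 timestamp
        source_url TEXT UNIQUE,     -- also deduplicates at the DB level
        article_text TEXT,
        sentiment TEXT,             -- positive / negative / neutral / mixed
        topics TEXT                 -- JSON-encoded list of topic tags
    )
""")
# Indexes matching the filters you'll actually run: by date and sentiment
conn.execute("CREATE INDEX IF NOT EXISTS idx_date ON mentions(scraped_at)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_sentiment ON mentions(sentiment)")

conn.execute(
    "INSERT OR IGNORE INTO mentions (scraped_at, source_url, sentiment, topics) "
    "VALUES (?, ?, ?, ?)",
    ("2026-05-12T09:00:00Z", "https://example.com/review", "positive",
     json.dumps(["pricing", "customer service"])),
)
positive = conn.execute(
    "SELECT COUNT(*) FROM mentions WHERE sentiment = 'positive'"
).fetchone()[0]
```

The `UNIQUE` constraint on `source_url` plus `INSERT OR IGNORE` gives you URL-level deduplication for free on insert.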

7. Alerting & Escalation: Set up rules to trigger alerts:

  • Crisis keywords (breach, lawsuit, bankruptcy) → immediate exec alert
  • Negative sentiment spike → daily digest alert
  • Major news outlets (NYT, TechCrunch, Forbes) → always alert
  • Competitor mentions → log for competitive intelligence

8. Reporting & Dashboard: Build a dashboard showing:

  • Mention volume and sentiment trend over time
  • Top sources and influencers mentioning your brand
  • Sentiment breakdown (% positive, negative, neutral)
  • Emerging topics and themes
  • Alert history and responses

Implementation Walkthrough: Brand Monitoring Pipeline

Here's a working Python implementation that searches for brand mentions, scrapes results, analyzes sentiment, and triggers alerts:

import requests
import json
from datetime import datetime
from enum import Enum

# fastCRW API configuration
CRW_API_KEY = "your-api-key"
CRW_BASE_URL = "https://api.fastcrw.com/v1"

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"

def search_brand_mentions(query: str, max_results: int = 20) -> list[dict]:
    """Search for brand mentions across the web using /v1/search."""
    payload = {
        "query": query,
        "limit": max_results,
        "lang": "en"
    }
    
    response = requests.post(
        f"{CRW_BASE_URL}/search",
        json=payload,
        headers={"Authorization": f"Bearer {CRW_API_KEY}"},
        timeout=30  # fail fast instead of hanging on a slow response
    )
    
    if response.status_code == 200:
        data = response.json()
        return data.get("results", [])
    else:
        print(f"Error searching for '{query}': {response.status_code}")
        return []

def scrape_mention(url: str) -> dict:
    """Scrape a mention URL to extract full context and analyze sentiment."""
    extraction_schema = {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Article or page title"},
            "author": {"type": "string", "description": "Author name"},
            "published_date": {"type": "string", "description": "Publication date (ISO 8601)"},
            "sentiment": {
                "type": "string",
                "enum": ["positive", "negative", "neutral", "mixed"],
                "description": "Overall sentiment about the brand"
            },
            "topics": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Key topics mentioned (product quality, pricing, customer service, security, innovation, etc.)"
            },
            "key_quote": {
                "type": "string",
                "description": "Most representative quote about the brand from the article"
            },
            "brand_mention_type": {
                "type": "string",
                "enum": ["news", "review", "comparison", "criticism", "praise", "other"],
                "description": "Type of mention"
            }
        },
        "required": ["sentiment", "topics"]
    }
    
    payload = {
        "url": url,
        "format": "markdown",
        "extraction": {
            "schema": extraction_schema
        }
    }
    
    response = requests.post(
        f"{CRW_BASE_URL}/scrape",
        json=payload,
        headers={"Authorization": f"Bearer {CRW_API_KEY}"},
        timeout=60  # scrape + LLM extraction can be slow; fail fast beyond this
    )
    
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error scraping {url}: {response.status_code}")
        return {}

def is_relevant_result(search_result: dict, brand_name: str) -> bool:
    """Filter search results to keep only relevant mentions."""
    # Exclude known spam and noise domains
    spam_domains = [
        "pinterest.com",
        "amazon.com/dp/",  # Product reviews
        "aliexpress.com",
        "ebay.com",
        "youtube.com/watch",  # Video links without text
        "instagram.com",  # Social media links
        "twitter.com/web"  # Web version of tweets without content
    ]
    
    url = search_result.get("url", "").lower()
    
    for spam in spam_domains:
        if spam in url:
            return False
    
    # Keep if brand name is in title or URL
    title = search_result.get("title", "").lower()
    return brand_name.lower() in title or brand_name.lower() in url

def monitor_brand(brand_name: str, keywords: list[str], alert_threshold: float = 0.3):
    """Main pipeline: search for mentions, scrape, analyze sentiment, alert on negative spikes."""
    all_mentions = []
    crisis_alerts = []
    
    print(f"=== MONITORING BRAND: {brand_name} ===\n")
    
    for keyword in keywords:
        print(f"Searching for: {keyword}")
        results = search_brand_mentions(keyword, max_results=20)
        
        filtered_results = [r for r in results if is_relevant_result(r, brand_name)]
        print(f"  Found {len(filtered_results)} relevant results")
        
        # Scrape and analyze each mention
        for result in filtered_results[:10]:  # Limit to 10 per keyword for demo
            url = result.get("url")
            if not url:  # skip malformed results with no URL
                continue
            print(f"    Scraping {url[:60]}...")
            
            mention = scrape_mention(url)
            if mention:
                extraction = mention.get("extraction", {})
                all_mentions.append({
                    "keyword": keyword,
                    "url": url,
                    "title": result.get("title"),
                    "snippet": result.get("snippet"),
                    "scraped_at": datetime.utcnow().isoformat(),
                    "sentiment": extraction.get("sentiment"),
                    "topics": extraction.get("topics", []),
                    "mention_type": extraction.get("brand_mention_type"),
                    "key_quote": extraction.get("key_quote"),
                    "author": extraction.get("author"),
                    "published_date": extraction.get("published_date")
                })
                
                # Check for crisis keywords
                sentiment = extraction.get("sentiment", "").lower()
                topics = extraction.get("topics", [])
                
                if sentiment == "negative" and any(crisis in str(topics).lower() for crisis in ["security", "breach", "lawsuit", "scandal"]):
                    crisis_alerts.append({
                        "severity": "high",
                        "url": url,
                        "sentiment": sentiment,
                        "topics": topics,
                        "key_quote": extraction.get("key_quote")
                    })
    
    # Analyze sentiment distribution
    sentiments = [m["sentiment"] for m in all_mentions if m["sentiment"]]
    negative_count = sum(1 for s in sentiments if s == "negative")
    positive_count = sum(1 for s in sentiments if s == "positive")
    neutral_count = sum(1 for s in sentiments if s == "neutral")
    
    if sentiments:
        negative_ratio = negative_count / len(sentiments)
    else:
        negative_ratio = 0
    
    # Alert if negative sentiment exceeds threshold
    if negative_ratio > alert_threshold:
        crisis_alerts.append({
            "severity": "medium",
            "reason": "Negative sentiment spike",
            "details": f"{negative_ratio:.1%} of mentions are negative (threshold: {alert_threshold:.1%})"
        })
    
    return {
        "brand": brand_name,
        "total_mentions": len(all_mentions),
        "sentiment_breakdown": {
            "positive": positive_count,
            "negative": negative_count,
            "neutral": neutral_count,
            "mixed": sum(1 for s in sentiments if s == "mixed")
        },
        "mentions": all_mentions,
        "crisis_alerts": crisis_alerts
    }

def format_report(monitoring_result: dict) -> str:
    """Format monitoring results for display and logging."""
    report = f"""
=== BRAND MONITORING REPORT ===
Brand: {monitoring_result['brand']}
Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}

SUMMARY
-------
Total Mentions Found: {monitoring_result['total_mentions']}

SENTIMENT BREAKDOWN
-------------------
Positive: {monitoring_result['sentiment_breakdown']['positive']}
Negative: {monitoring_result['sentiment_breakdown']['negative']}
Neutral:  {monitoring_result['sentiment_breakdown']['neutral']}
Mixed:    {monitoring_result['sentiment_breakdown']['mixed']}

TOP 5 MENTIONS
---------------
"""
    
    for mention in monitoring_result['mentions'][:5]:
        report += f"\n[{mention['sentiment'].upper()}] {mention['title']}\n"
        report += f"  Source: {mention['url']}\n"
        report += f"  Topics: {', '.join(mention['topics'][:3])}\n"
        if mention['key_quote']:
            report += f"  Quote: \"{mention['key_quote'][:100]}...\"\n"
    
    if monitoring_result['crisis_alerts']:
        report += f"\n\nCRISIS ALERTS ({len(monitoring_result['crisis_alerts'])})\n"
        report += "-" * 40
        for alert in monitoring_result['crisis_alerts']:
            report += f"\n[{alert.get('severity', 'unknown').upper()}] {alert.get('reason', 'Alert')}\n"
            if 'details' in alert:
                report += f"  {alert['details']}\n"
            if 'url' in alert:
                report += f"  Source: {alert['url']}\n"
    
    return report

# Example usage
if __name__ == "__main__":
    brand_name = "TechCorp"
    monitor_keywords = [
        "TechCorp",
        "TechCorp acquisition",
        "TechCorp CEO",
        "TechCorp security breach",
        "TechCorp vs Competitor"
    ]
    
    result = monitor_brand(brand_name, monitor_keywords, alert_threshold=0.35)
    
    # Print formatted report
    report = format_report(result)
    print(report)
    
    # Save report for audit trail
    with open(f"brand_monitoring_{brand_name}_{datetime.now().strftime('%Y%m%d')}.json", "w") as f:
        json.dump(result, f, indent=2, default=str)

Production Considerations

Scaling to 24/7 monitoring:

  • Use a scheduler (Celery, APScheduler, or fastCRW's built-in scheduling) to run searches on a recurring basis
  • For breaking crises, implement an alert cascade: immediate email/SMS for high-severity issues, daily digest for lower priority
  • Cache search results for 1–2 hours to avoid redundant queries on the same terms
  • Implement backoff logic for blocked or rate-limited sites
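
Backoff logic can wrap any fetch callable. A sketch with exponential delays plus jitter; the base delay and retry count are tuning knobs, not fastCRW requirements:

```python
import time
import random

def fetch_with_backoff(fetch, url: str, max_retries: int = 4,
                       base_delay: float = 1.0):
    """Call fetch(url), retrying with exponential backoff and jitter
    when it raises (e.g. on a blocked or rate-limited response)."""
    for attempt in range(max_retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # 1x, 2x, 4x the base delay, plus jitter to spread out retries
            time.sleep(base_delay * (2 ** attempt)
                       + random.uniform(0, base_delay / 2))
```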

Handling false positives:

  • Many search results mention your brand in passing (e.g., "Tesla" mentioned in an article about electric vehicles unrelated to sentiment). Use filtering and LLM extraction to reduce noise.
  • Maintain a whitelist of trusted sources and a blacklist of spammy domains
  • Manual review of high-severity alerts before escalation to leadership

Maintaining historical data:

  • Store every mention with sentiment, topic, and metadata for trend analysis
  • Implement data retention policies (e.g., keep raw mentions for 90 days, aggregated metrics for 2 years)
  • Use time-series aggregation to plot sentiment trends, mention volume over time, and topic shifts

Crisis response:

  • When a crisis alert fires, immediately notify PR, legal, and leadership
  • Prepare templated responses for common issues
  • Track response time and effectiveness for post-crisis analysis
  • Monitor sentiment recovery after a crisis to gauge success of response efforts

Competitive monitoring:

  • Set up parallel monitoring for 3–5 competitors alongside your own brand
  • Track comparative mentions (e.g., articles comparing your product to competitors)
  • Identify where competitors are winning and use insights to inform product/marketing strategy

Pricing Math: Brand Monitoring at Scale

Assume you monitor 5 brand keywords plus 2 competitor names (7 terms total), searching each on ~20 days per month with 20 results per search and scraping the top 10: 7 × 20 = 140 searches/month.

Breakdown:

  • Searches: 140 searches/month × 3 credits = 420 credits
  • Scraping: 140 searches × 10 results/search × 8 credits per scrape = 11,200 credits
  • LLM extraction: 1,400 scraped articles × 5 credits = 7,000 credits
  • Total: ~18,620 credits/month
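
That arithmetic is easy to parametrize so you can re-run it for your own keyword count and frequency; the per-operation costs below mirror the breakdown above:

```python
# Per-operation credit costs, taken from the breakdown above
SEARCH_CREDITS = 3
SCRAPE_CREDITS = 8
LLM_CREDITS = 5

def monthly_credits(keywords: int, search_days: int,
                    scraped_per_search: int) -> int:
    """Estimated monthly credit spend for search + scrape + LLM extraction."""
    searches = keywords * search_days
    scrapes = searches * scraped_per_search
    return (searches * SEARCH_CREDITS
            + scrapes * SCRAPE_CREDITS   # full-page scrape per kept result
            + scrapes * LLM_CREDITS)     # LLM extraction per scraped page

monthly_credits(keywords=7, search_days=20, scraped_per_search=10)  # 18,620
```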

Plan options:

  • Pro plan ($13/mo, 10,000 credits): Covers monitoring of 1–2 brands with basic search frequency
  • Business plan ($49/mo, 50,000 credits): Covers 5+ brand keywords plus competitors, with daily searches and LLM sentiment analysis. Typical for mid-market companies.
  • Enterprise ($custom): For continuous real-time monitoring of 10+ brands with hourly searches and crisis response workflows.

Cost optimization:

  • Search less frequently for low-priority keywords (weekly instead of daily)
  • Use HTTP scraping (cheaper) for most mentions; reserve Chrome rendering for complex layouts
  • Implement smart alerting to scrape only high-relevance results (news sites, high-authority domains)
  • Cache results for 12–24 hours to avoid re-scraping the same URL

FAQ

Q: How do I monitor real-time social media mentions?

A: fastCRW's /v1/search finds links to social media discussions (Twitter/X threads, Reddit posts, TikTok comments). Scrape those links to extract context. For true real-time Twitter monitoring, combine fastCRW with Twitter's API (for authenticated subscribers). fastCRW works best for persistent, citable content (articles, blog posts, forums).

Q: What's the difference between fastCRW brand monitoring and tools like Mention or Brandwatch?

A: Those tools offer full-stack solutions: real-time search, aggregation, sentiment, dashboard, alerts, and team workflows. fastCRW is the scraping/search layer; you build the monitoring pipeline on top. Use fastCRW if you want to customize logic (add custom alerts, integrate with your CRM, control data retention). Use Mention/Brandwatch if you want turnkey monitoring out of the box.

Q: How do I track sentiment changes for the same brand over months?

A: Store each mention with timestamp, sentiment, and topic. Aggregate by day/week/month, computing average sentiment score per period. Plot sentiment over time in a dashboard. Use moving averages to smooth out noise and detect trends.
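
The moving average itself needs nothing beyond the standard library. A sketch over daily sentiment scores (e.g. +1 positive, 0 neutral, -1 negative):

```python
def moving_average(scores: list[float], window: int = 7) -> list[float]:
    """Trailing moving average: each point averages the last `window`
    values (fewer at the start of the series)."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```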

Q: What should I do when I find false negative reviews?

A: First, verify the claim is actually false. Then, assess whether a public response is warranted. For major platforms (G2, Trustpilot), directly respond in the comments. For blog posts, reach out to the author privately. For lies or defamation, consult legal counsel before taking action.

Q: Can I monitor mentions behind paywalls?

A: No. fastCRW cannot bypass paywalls (NY Times, Wall Street Journal, Bloomberg). Focus on open-access sources. For enterprise-grade coverage of paywalled content, use media monitoring APIs like Cision or MediaDailyNews (which license press databases).

Q: How do I distinguish my brand from similarly-named competitors?

A: Be specific in search queries: search for your exact brand name, domain, or unique tagline. When scraping, use LLM extraction to ask "Is this mention about [your brand] or a competitor?". Manually review ambiguous results. Store the determination for future reference.

Q: What's the fastest way to respond to a brand crisis?

A: (1) Set up high-severity crisis keyword alerts to notify leadership immediately. (2) Maintain a rapid response team (PR lead, CEO, legal). (3) Scrape and log the initial mention for documentation. (4) Publish a response within 2–4 hours while facts are being verified. (5) Update response as more information becomes clear. (6) Monitor sentiment recovery over the following days/weeks.

Q: Can I use fastCRW to monitor internal (private/intranet) brand mentions?

A: No. fastCRW scrapes public web content only. For internal mentions, use internal communication tools (Slack, Teams, Confluence). fastCRW is for external brand perception and public mentions.

Q: How long does it take to scrape 100 mentions?

A: At ~2 seconds per scrape, scraping 100 mentions serially takes about 3.5 minutes (100 × 2 s ≈ 200 s). With 10 concurrent requests it drops to roughly 20–30 seconds. Use fastCRW's batch endpoint or parallelize via your task queue.
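
Client-side parallelism with a bounded thread pool looks like this; `scrape` stands in for any single-URL callable, such as the scrape_mention() function from the walkthrough:

```python
from concurrent.futures import ThreadPoolExecutor

def scrape_all(urls: list[str], scrape, max_workers: int = 10) -> list:
    """Run scrape(url) over all URLs with at most max_workers
    concurrent requests, preserving input order in the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape, urls))
```

Keep `max_workers` modest (around 10) so you stay within API rate limits and avoid hammering source sites.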
