Web Agents for E-commerce Price Monitoring: MAP Enforcement, Competitive Intelligence, and Catalog Sync

The scale problem: why traditional scrapers can't keep up
You have 1,000 SKUs. Sold across 20 retailers. You need daily pricing data.
That's 20,000 data points to collect every morning — before your pricing team starts work.
Traditional scrapers were built for a simpler web. Static HTML, predictable selectors, no JavaScript. Today's retail sites are the opposite: prices rendered client-side, layout A/B tests that silently break your CSS selectors every few weeks, bot detection that blocks entire IP ranges. Maintaining that infrastructure isn't a one-time project. It becomes a full-time engineering responsibility.
The failure mode is quiet and expensive. Your extraction returns stale data with no error. A competitor drops below your Minimum Advertised Price (MAP). You find out three weeks later when a distributor emails to complain. By then, other retailers have matched the low price, and you're managing a channel conflict instead of preventing one.
Web agents change the architecture of the problem. Instead of maintaining brittle selectors across 20 retailer codebases, you describe the goal in plain English — "find the current price of this product" — and the agent navigates, renders, and extracts by reading page content directly. No selectors to maintain. No silent failures when layouts change.
This article walks through three production-ready use cases: competitor price monitoring, MAP violation detection, and dynamic pricing intelligence — with working code for each.
When does a web agent actually beat a scraper for price monitoring?
Before the use cases: if you're still deciding whether this applies to you, here's the honest decision framework.
Quick reference: when a web agent beats a scraper for price monitoring
- Target sites render prices via JavaScript: anything a traditional scraper can't see after initial page load
- Volume × frequency exceeds scraper maintenance capacity: roughly 50+ SKUs daily across 5+ retailers
- You need schema-consistent output — downstream pricing systems consuming structured JSON
- Your existing tooling stops working on sites with strict access requirements — behavioral fingerprinting, session management, IP variability
- You need an audit trail — each agent run produces a timestamped record, not just a number
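The checklist above can be expressed as a small helper that reports which signals apply to a given workload. This is only a sketch: `MonitoringProfile`, its field names, and `agent_signals` are hypothetical, and the 50-SKU / 5-retailer threshold is the rough figure from the list, not a hard rule.

```python
from dataclasses import dataclass


@dataclass
class MonitoringProfile:
    """Hypothetical description of a price-monitoring workload."""
    js_rendered_prices: bool
    skus_per_day: int
    retailer_count: int
    needs_structured_json: bool
    blocked_by_access_controls: bool
    needs_audit_trail: bool


def agent_signals(profile: MonitoringProfile) -> list[str]:
    """Return the checklist items that apply to this workload."""
    signals = []
    if profile.js_rendered_prices:
        signals.append("javascript_rendering")
    # Rough threshold from the checklist: 50+ SKUs daily across 5+ retailers
    if profile.skus_per_day >= 50 and profile.retailer_count >= 5:
        signals.append("volume_exceeds_maintenance_capacity")
    if profile.needs_structured_json:
        signals.append("schema_consistent_output")
    if profile.blocked_by_access_controls:
        signals.append("strict_access_requirements")
    if profile.needs_audit_trail:
        signals.append("audit_trail")
    return signals
```

An empty list suggests the existing tooling is probably sufficient; how many signals justify a switch is a judgment call this sketch does not make.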
Use case 1: Competitor price monitoring at scale
The pattern: give the agent a list of product URLs, a structured output schema in the goal prompt, and run them concurrently. Total time equals the slowest single task — not the sum of all tasks.
Installation
pip install tinyfish
export TINYFISH_API_KEY=sk-tinyfish-*****
The code
import asyncio
import json
from datetime import datetime, timezone

from tinyfish import AsyncTinyFish, BrowserProfile

client = AsyncTinyFish()  # Reads TINYFISH_API_KEY from environment

PRODUCTS = [
    {"product_id": "airpods-pro-3", "url": "https://www.bestbuy.com/site/..."},
    {"product_id": "airpods-pro-3", "url": "https://www.amazon.com/dp/..."},
    {"product_id": "sony-wh1000xm6", "url": "https://www.target.com/p/..."},
    # Add up to 1,000 product URLs
]

async def extract_price(product_id: str, url: str) -> dict:
    goal = (
        "Find the current listed price of the product on this page. "
        "Return a JSON object with these fields only: "
        "price (number, no currency symbol), currency (ISO 4217 code), in_stock (boolean). "
        "If no price is visible, set price to null."
    )
    response = await client.agent.run(
        url=url,
        goal=goal,
        browser_profile=BrowserProfile.STEALTH,
    )
    # For debugging: response.streaming_url contains a live browser replay of this run (valid 24h)
    # response.result is a dict shaped by your goal prompt, or None if the run failed
    # at the infrastructure level (browser crash, timeout, etc.)
    result = response.result or {}

    # Two distinct failure modes to handle:
    # 1. Infrastructure failure: response.result is None → caught by `or {}`
    #    (falls through to the final return with price set to None)
    # 2. Goal failure: run completed but agent couldn't achieve the task →
    #    result contains {"status": "failure", "reason": "..."} instead of your data
    # This is the most common production surprise: COMPLETED status ≠ goal achieved
    if result.get("status") == "failure":
        return {
            "product_id": product_id,
            "price": None,
            "error": result.get("reason", "goal_failed"),
            "scraped_at": datetime.now(timezone.utc).isoformat(),
            "source_url": url,
        }

    # Happy path: result contains the fields you specified in the goal prompt
    return {
        "product_id": product_id,
        "price": result.get("price"),  # number, as specified in goal
        "currency": result.get("currency", "USD"),
        "in_stock": result.get("in_stock"),  # boolean, as specified in goal
        "scraped_at": datetime.now(timezone.utc).isoformat(),
        "source_url": url,
    }

async def main():
    # All requests fire concurrently, so total time = slowest single task
    tasks = [extract_price(p["product_id"], p["url"]) for p in PRODUCTS]
    results = await asyncio.gather(*tasks)
    print(json.dumps(results, indent=2))

asyncio.run(main())

Note on concurrency: When you send more requests than your plan's concurrent session limit, TinyFish queues the excess runs automatically (status: PENDING), and they start as sessions free up. Size your batches to your plan's concurrency limit for predictable run times.

Note on result handling: `status: "COMPLETED"` means the browser ran successfully, not that your goal succeeded. A run that hit an Access Denied page will also return `COMPLETED`, but `result` will contain `{"status": "failure", "reason": "..."}` instead of your data. The code above handles both cases explicitly. This is the most common source of silent failures in production price monitors.
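The two failure modes described above can be folded into one helper so that every caller handles them the same way. The sketch below is not part of the TinyFish SDK: `unwrap_result` is a hypothetical name, and it assumes only the result shapes described in this article (`None` for an infrastructure failure, `{"status": "failure", "reason": "..."}` for a goal failure).

```python
from typing import Optional, Tuple


def unwrap_result(raw: object) -> Tuple[Optional[dict], Optional[str]]:
    """Split a run result into (data, error); exactly one of the two is None."""
    if raw is None:
        # Infrastructure failure: the run produced no result at all
        return None, "infrastructure_failure"
    if not isinstance(raw, dict):
        # Defensive: the goal prompt asked for a JSON object
        return None, "unexpected_result_type"
    if raw.get("status") == "failure":
        # Goal failure: the run completed but the agent could not do the task
        return None, raw.get("reason", "goal_failed")
    return raw, None
```

Returning the error as a separate value keeps an infrastructure failure from being mistaken for a page that simply shows no price.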
Output schema
[
  {
    "product_id": "airpods-pro-3",
    "price": 249.99,
    "currency": "USD",
    "in_stock": true,
    "scraped_at": "2026-03-27T14:32:01Z",
    "source_url": "https://www.bestbuy.com/site/..."
  },
  {
    "product_id": "airpods-pro-3",
    "price": 239.00,
    "currency": "USD",
    "in_stock": true,
    "scraped_at": "2026-03-27T14:32:04Z",
    "source_url": "https://www.amazon.com/dp/..."
  }
]
The parallel execution advantage
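| Approach | Total work (1,000 products @ 3s each) | Wall-clock time |
|---|---|---|
| Sequential scraper | 1,000 × 3s, one after another | ~50 minutes |
| TinyFish concurrent agents | 1,000 × 3s, run in parallel | ~3–5 minutes |
The math: when every run is in flight at once, total time equals the slowest single task, not the sum. A batch that took 50 minutes sequentially can then complete in roughly the time it takes to process one page. Runs beyond your plan's concurrent session limit queue, which is the likely reason the table shows minutes rather than seconds.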
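To see how a concurrency limit changes that arithmetic, here is a rough estimator. It is a simplification, not a TinyFish formula: it assumes every task takes the same time and that runs proceed in full "waves" of `concurrency_limit` tasks, whereas real queued runs start individually as sessions free up. The function name and parameters are hypothetical.

```python
import math


def estimated_wall_clock_seconds(
    task_count: int, seconds_per_task: float, concurrency_limit: int
) -> float:
    """Estimate batch duration when at most `concurrency_limit` tasks run at once."""
    if concurrency_limit < 1:
        raise ValueError("concurrency_limit must be at least 1")
    # Each wave runs up to `concurrency_limit` tasks side by side
    waves = math.ceil(task_count / concurrency_limit)
    return waves * seconds_per_task


sequential = estimated_wall_clock_seconds(1000, 3.0, 1)           # 3000.0 s = 50 minutes
ten_at_a_time = estimated_wall_clock_seconds(1000, 3.0, 10)       # 300.0 s = 5 minutes
fully_parallel = estimated_wall_clock_seconds(1000, 3.0, 1000)    # 3.0 s
```

Under this simplified model, a 3 to 5 minute batch corresponds to a limit of roughly 10 to 17 concurrent sessions. That range is an inference from the arithmetic, not a published plan limit.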
Success rates by retailer category
| Retailer category | Success rate |
|---|---|
| Major global e-commerce platforms | up to 85% on sites with strict automation requirements |
| Sites with strict access requirements | Lower — success rate varies by site configuration |
Use case 2: MAP violation detection with evidence reports
MAP enforcement has a discovery problem. Your team can't manually check every authorized retailer every day — and by the time a violation surfaces through a distributor complaint, the channel damage is done.
A scheduled agent run closes the gap: extract prices, compare against your MAP database, generate a timestamped evidence record per violation. The whole workflow runs while your team sleeps.
The code
import asyncio
import json
from datetime import datetime, timezone

from tinyfish import AsyncTinyFish, BrowserProfile

client = AsyncTinyFish()

# Your MAP pricing database
MAP_PRICES = {
    "airpods-pro-3": 249.00,
    "sony-wh1000xm6": 299.00,
}

# Authorized retailers to monitor per product
RETAILERS = {
    "airpods-pro-3": [
        {"name": "BestBuy", "url": "https://www.bestbuy.com/site/..."},
        {"name": "Target", "url": "https://www.target.com/p/..."},
        {"name": "Walmart", "url": "https://www.walmart.com/ip/..."},
    ],
}

async def check_retailer(product_id: str, map_price: float, retailer: dict) -> dict | None:
    response = await client.agent.run(
        url=retailer["url"],
        goal=(
            "Return a JSON object with: price (number, no symbol), currency (ISO 4217). "
            "If no price is found, set price to null."
        ),
        browser_profile=BrowserProfile.STEALTH,
    )
    result = response.result or {}
    advertised = result.get("price")

    if advertised is None or result.get("status") == "failure":
        return None  # Could not extract price; log separately

    if advertised < map_price:
        return {
            "product_id": product_id,
            "retailer": retailer["name"],
            "map_price": map_price,
            "advertised_price": advertised,
            "violation_amount": round(map_price - advertised, 2),
            "evidence_url": retailer["url"],
            "detected_at": datetime.now(timezone.utc).isoformat(),
        }
    return None  # Compliant

async def run_map_check():
    all_tasks = []
    for product_id, map_price in MAP_PRICES.items():
        for retailer in RETAILERS.get(product_id, []):
            all_tasks.append(check_retailer(product_id, map_price, retailer))
    results = await asyncio.gather(*all_tasks)
    violations = [r for r in results if r is not None]
    print(json.dumps(violations, indent=2))
    return violations

asyncio.run(run_map_check())

Evidence report output
{
  "product_id": "airpods-pro-3",
  "retailer": "Target",
  "map_price": 249.00,
  "advertised_price": 219.99,
  "violation_amount": 29.01,
  "evidence_url": "https://www.target.com/p/...",
  "detected_at": "2026-03-27T14:35:22Z"
}
This structure pipes directly into your brand protection workflow. Route violations to Slack for same-day retailer outreach, to your ERP for automated distributor notification, or to a compliance dashboard. The timestamped URL is your legal evidence record, captured at the exact moment of detection.
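One gap worth closing before routing anything: the MAP check above returns `None` both when a retailer is compliant and when the price could not be extracted, so the two cases look the same downstream. A minimal sketch of a three-state classification follows. `MapStatus` and `classify_price` are hypothetical names, and the result shapes are the ones described earlier in this article.

```python
from enum import Enum
from typing import Optional


class MapStatus(Enum):
    VIOLATION = "violation"
    COMPLIANT = "compliant"
    UNKNOWN = "unknown"  # price could not be extracted; needs a retry or manual check


def classify_price(map_price: float, result: Optional[dict]) -> MapStatus:
    """Classify one run result against the MAP price."""
    if not isinstance(result, dict) or result.get("status") == "failure":
        return MapStatus.UNKNOWN
    price = result.get("price")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        return MapStatus.UNKNOWN
    return MapStatus.VIOLATION if price < map_price else MapStatus.COMPLIANT
```

Counting `UNKNOWN` outcomes per retailer is a cheap way to notice when a site has started blocking runs, rather than reading the silence as compliance.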
Use case 3: Dynamic pricing intelligence
Static MAP enforcement catches violations after the fact. The harder problem is building a pricing system that reacts to market changes before they compound.
The structural issue: prices on food delivery, travel, and marketplace platforms load dynamically per user session, vary by location, and change multiple times per day. A scraper built on CSS selectors breaks every time a vendor updates their layout, and in high-velocity markets that happens often. An agent built on a goal ("find the current price of this item at this location") adapts to layout changes because it reads the page content and layout directly instead of pattern-matching against a selector you wrote last quarter.
The output that makes this useful downstream isn't a report — it's a structured feed with enough granularity to drive pricing decisions:
{
  "market_id": "sf-94102",
  "restaurant_id": "R_8821",
  "item_id": "burger-classic",
  "base_price": 12.99,
  "delivery_fee": 1.99,
  "promo_price": null,
  "scraped_at": "2026-03-27T18:00:00Z"
}
The code structure to produce this is identical to Use case 1: AsyncTinyFish + asyncio.gather() across a list of URLs, with a goal prompt that specifies the schema above. The only difference is the schema itself. If you've already built Use case 1, this is a goal prompt change, not an architecture change.
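Because only the schema changes, the goal prompt can be generated from a field specification instead of being hand-written per use case. The sketch below is illustrative: `FEED_FIELDS`, `build_goal`, and `missing_fields` are hypothetical names, the type descriptions are assumptions, and `scraped_at` is left out on the assumption that the caller adds the timestamp, as in Use case 1.

```python
# Field name -> plain-English type description (descriptions are assumptions)
FEED_FIELDS = {
    "market_id": "string",
    "restaurant_id": "string",
    "item_id": "string",
    "base_price": "number, no currency symbol",
    "delivery_fee": "number, no currency symbol",
    "promo_price": "number or null",
}


def build_goal(fields: dict[str, str]) -> str:
    """Build a goal prompt that asks for exactly the given fields."""
    spec = ", ".join(f"{name} ({kind})" for name, kind in fields.items())
    return (
        "Find the current price of this item at this location. "
        f"Return a JSON object with these fields only: {spec}. "
        "If a value is not visible, set it to null."
    )


def missing_fields(record: dict, fields: dict[str, str]) -> list[str]:
    """List expected fields that are absent from a returned record."""
    return [name for name in fields if name not in record]
```

Checking `missing_fields` on every record before it enters the feed catches schema drift early, since a goal prompt is an instruction to the agent and not a guarantee about its output.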
Note: This represents a generalized deployment pattern, not a published case study. Specific customers are not identified.
A major food delivery platform uses this pattern to track millions of pricing variables per month across thousands of markets. The scale is unusual; the architecture isn't.
Handling access requirements on retail sites: an honest assessment
On the majority of major consumer retail platforms, including sites with strict automation requirements, runs succeed at a rate of up to 85%. For most teams, this covers 80%+ of the retailers they care about.
The exception is sites running enterprise behavioral analysis systems. These systems don't look for a missing header — they model whether the entire session pattern looks human. Success rates are lower and inconsistent across all automation tools. No vendor publishes reliable numbers for the hardest-protected sites, and you should be skeptical of any that do.
Practical approach for protected retailers:
from tinyfish import TinyFish, BrowserProfile, ProxyConfig, ProxyCountryCode

client = TinyFish()
result = None

with client.agent.stream(
    url="https://protected-retailer.com/product/...",
    goal="Extract the current price and stock status",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US),
) as stream:
    for event in stream:
        # SDK CompleteEvent: result lives in event.result_json
        if getattr(event, "type", None) == "COMPLETE":
            # Layer 1: infrastructure failure (result stays None if the run produced no result)
            result = event.result_json
            break

# Layer 2: goal failure (the run completed, but the agent could not achieve the goal)
if result and isinstance(result, dict) and result.get("status") == "failure":
    result = None

TinyFish handles detection at the infrastructure level, not through JS injection applied after browser start. Adding a matching country-code proxy handles geo-specific access requirements. For sites that still reliably block: weigh whether the data value justifies ongoing engineering time, or whether an alternative data source exists.
Cost model: what does daily monitoring actually cost?
TinyFish pricing by plan: Pay-as-you-go $0.015/credit · Starter $15/mo (1,650 included credits, $0.014/credit overage) · Pro $150/mo (16,500 included credits, $0.012/credit overage) · Enterprise custom. One price check = one step, and the table below counts each step as one credit.
| Scale | Daily steps | Monthly steps | Pay-as-you-go | Pro plan |
|---|---|---|---|---|
| 100 products × 5 retailers | 500 | ~15,000 | ~$225/mo | $150/mo (within 16,500 included) |
| 500 products × 10 retailers | 5,000 | ~150,000 | ~$2,250/mo | $150 + ~$1,602 overage ≈ $1,752/mo |
| 1,000 products × 20 retailers | 20,000 | ~600,000 | ~$9,000/mo | $150 + ~$7,002 overage ≈ $7,152/mo |
| Enterprise scale | millions/month | — | — | Enterprise |
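The table's figures can be reproduced with a few lines of arithmetic, which is useful for plugging in your own volumes. This sketch assumes a 30-day month and one credit per price check, as the table does. The function names are hypothetical, and the default rates are the Pay-as-you-go and Pro figures quoted above.

```python
def monthly_cost_payg(daily_steps: int, days: int = 30, per_credit: float = 0.015) -> float:
    """Pay-as-you-go: every credit is billed at the same rate."""
    return round(daily_steps * days * per_credit, 2)


def monthly_cost_plan(
    daily_steps: int,
    days: int = 30,
    base_fee: float = 150.0,     # Pro plan monthly fee
    included: int = 16_500,      # credits included in the Pro plan
    overage: float = 0.012,      # Pro plan per-credit overage rate
) -> float:
    """Subscription plan: base fee plus overage on credits beyond the included amount."""
    credits = daily_steps * days
    extra = max(0, credits - included)
    return round(base_fee + extra * overage, 2)


# 500 products x 10 retailers = 5,000 steps per day
payg = monthly_cost_payg(5_000)   # 2250.0
pro = monthly_cost_plan(5_000)    # 1752.0
```

Passing `base_fee=15.0, included=1_650, overage=0.014` models the Starter plan the same way.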
The free tier — 500 credits, no credit card — covers a complete scan of up to 500 product URLs. Enough to test your real target retailer list and validate extraction quality before committing to a production schedule.
For context on scraper maintenance costs: a junior engineer spending 25% of their time keeping selectors current across 20 retailer sites is a real line item. That's not in the table above.
Build vs. buy: when your existing scraper is still the right answer
The honest answer is that many teams don't need a web agent for price monitoring. If your requirements are under 50 products, static pages, and a handful of stable low-protection retailers — Scrapy or a basic Playwright script is cheaper. Build it yourself.
The crossover point isn't a product decision, it's a maintenance math question: when does the engineering time spent keeping selectors current across 20 retailer sites exceed the cost of outsourcing that infrastructure? That typically happens around 100+ products, 5+ retailers, or the first time a retailer redesign takes down your monitoring for a week without anyone noticing.
Web agents also make sense when extraction accuracy has downstream consequences. MAP violation reports submitted with incorrect prices are worse than no report: they expose you to retailer disputes you can't back up. A scraper that silently returns the wrong price is a liability. An agent run with a failure status is just a gap in your data, which is recoverable.
Get started
The free tier gives you 500 credits with no credit card required — enough to run a complete price scan across 500 products and validate results against your real target retailers.
For teams monitoring 10,000+ products or needing SLA guarantees, contact our enterprise team for volume pricing and dedicated support.
FAQ
Can web agents handle price monitoring on major e-commerce platforms?
Yes. Runs on major e-commerce platforms succeed at a rate of up to 85%, including on sites with strict automation requirements. The standard browser profile covers most product pages; switch to the managed browser profile for high-volume or well-protected retailers.
How fresh is the price data?
Agents run on demand, so freshness is whatever schedule you set. Daily is the most common pattern. For flash-sale categories, teams commonly run 4× daily or trigger runs on inventory alerts.
Does this work for international retailers?
Yes. Include a currency field (ISO 4217) in your goal schema. For geo-restricted content, add proxy_config with the matching country code (US, GB, CA, DE, FR, JP, AU supported).
What does `COMPLETED` status actually mean?
Infrastructure success only — the browser launched and finished. It does not mean your goal succeeded. Always check the result field: if it contains {"status": "failure"}, the agent ran but couldn't extract the data. This is the most common production gotcha.
What if one agent in a batch fails?
Each run is independent — one failure doesn't affect others. Every run includes a streaming_url for debugging (valid 24 hours). Failed runs where infrastructure succeeded are not billed.
Is there a concurrency limit?
Yes. Exceeding it triggers automatic queuing — no 429 errors, but later requests take longer. Check your plan's limit in the dashboard.
See It in Action
The free tier includes 500 steps — enough to run a complete e-commerce monitoring workflow against real data before committing to a plan.



