Bot Filtering
Ovyxa automatically detects and filters bot traffic so your analytics reflect real human visitors. Bot filtering is enabled by default — no configuration needed.
How Bot Detection Works
Ovyxa uses a multi-layered approach to detect non-human traffic:
1. User Agent Detection (isbot)
The isbot library identifies known bots, crawlers, and automated tools by matching User-Agent strings against a comprehensive database of 1,500+ bot signatures, including:
- Search engine crawlers (Googlebot, Bingbot, DuckDuckBot)
- Social media preview bots (FacebookExternalHit, Twitterbot, LinkedInBot)
- SEO and monitoring tools (Ahrefs, Semrush, Pingdom, UptimeRobot)
- Infrastructure bots (AWS Lambda, Google Cloud Functions)
- AI scrapers and training bots
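As an illustration of this layer, signature matching reduces to testing the User-Agent string against known patterns. The pattern list below is a tiny hypothetical subset for demonstration only; the real isbot library maintains the 1,500+ signatures mentioned above:

```typescript
// Simplified user-agent bot check. Illustrative subset only —
// the isbot library covers 1,500+ signatures and handles edge cases.
const BOT_SIGNATURES: RegExp[] = [
  /googlebot|bingbot|duckduckbot/i,   // search engine crawlers
  /facebookexternalhit|twitterbot/i,  // social media preview bots
  /ahrefsbot|semrushbot|pingdom/i,    // SEO and monitoring tools
  /headlesschrome|phantomjs/i,        // common automation tools
];

function matchesBotSignature(userAgent: string): boolean {
  return BOT_SIGNATURES.some((re) => re.test(userAgent));
}
```

A Googlebot User-Agent such as `Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)` matches, while a typical desktop browser User-Agent does not.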
2. Datacenter IP Ranges
Requests originating from known datacenter IP ranges are flagged. These include major cloud providers where automated scripts typically run:
- AWS, Google Cloud, Azure
- DigitalOcean, Linode, Vultr
- OVH, Hetzner, Scaleway
Legitimate users occasionally use datacenter IPs (VPNs, corporate proxies), so this signal is weighted rather than absolute.
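At its core, the datacenter check is an IP-in-CIDR membership test. A minimal IPv4 sketch is shown below; the range list is illustrative, not Ovyxa's actual data, and real provider lists are large and updated regularly:

```typescript
// Convert a dotted-quad IPv4 address to a 32-bit unsigned integer.
function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => (acc << 8) + Number(octet), 0) >>> 0;
}

// True if `ip` falls inside the CIDR block, e.g. "3.0.0.0/9".
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsStr] = cidr.split("/");
  const bits = Number(bitsStr);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(base) & mask);
}

// Illustrative ranges only — real datacenter lists contain thousands of blocks.
const DATACENTER_RANGES = ["3.0.0.0/9", "104.196.0.0/14"];

function isDatacenterIp(ip: string): boolean {
  return DATACENTER_RANGES.some((range) => inCidr(ip, range));
}
```

Because this signal is weighted rather than absolute, a match here would contribute to the bot score rather than flag the event outright.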
3. Referrer Spam Detection
Known referrer spam domains are identified and flagged. These are fake referrers designed to pollute your analytics with junk traffic sources.
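This check amounts to a blocklist lookup on the referrer's hostname. A sketch with a hypothetical blocklist (real lists contain thousands of entries):

```typescript
// Hypothetical spam domains for illustration only.
const SPAM_REFERRERS = new Set(["best-seo-offer.example", "free-traffic.example"]);

function isReferrerSpam(referrerUrl: string): boolean {
  try {
    return SPAM_REFERRERS.has(new URL(referrerUrl).hostname);
  } catch {
    return false; // unparseable referrer: not treated as spam here
  }
}
```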
How Filtered Traffic Is Handled
Bot events are not discarded — they're flagged with is_bot=1 and still stored in the events table. This means:
- Aggregate views and materialized views filter bots out automatically (WHERE is_bot=0)
- Your dashboard shows only human traffic by default
- Raw data still contains bot events for auditing and analysis
- No data is lost — you can always query bot traffic separately if needed
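The flag-don't-drop approach can be sketched as follows. Event shape and helper names are hypothetical stand-ins for the detection layers described above, and the real pipeline weights signals (e.g. the datacenter check) rather than simply OR-ing them:

```typescript
interface RawEvent {
  userAgent: string;
  ip: string;
  referrer: string;
}

interface StoredEvent extends RawEvent {
  is_bot: 0 | 1; // flagged, never discarded
}

// Toy stand-ins for the three detection layers.
const uaIsBot = (ua: string) => /googlebot|bingbot/i.test(ua);
const ipIsDatacenter = (ip: string) => ip.startsWith("3.");
const referrerIsSpam = (ref: string) => ref.includes("free-traffic");

function classify(event: RawEvent): StoredEvent {
  const bot =
    uaIsBot(event.userAgent) ||
    ipIsDatacenter(event.ip) ||
    referrerIsSpam(event.referrer);
  return { ...event, is_bot: bot ? 1 : 0 };
}

// Dashboards read only human traffic, mirroring WHERE is_bot=0.
function humanEvents(events: StoredEvent[]): StoredEvent[] {
  return events.filter((e) => e.is_bot === 0);
}
```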
Automatic — No Setup Required
Bot filtering works out of the box for all sites. There's nothing to enable or configure. Every event that hits the ingestion pipeline is automatically checked against all detection layers before being classified.
Site Shields
For additional control beyond automatic bot filtering, use Shields in your site settings:
- IP Exclusions — block specific IPs (your own, office networks, CI/CD servers)
- Path Exclusions — exclude URL patterns from tracking (admin pages, staging paths)
- Hostname Filtering — only track events from allowed hostnames
- Country Blocking — exclude traffic from specific countries
See Shields Settings for configuration details.
Viewing Bot Statistics
Bot traffic metrics are available in the API response metadata:
{
  "data": { ... },
  "meta": {
    "filtered_bot_traffic": 127
  }
}
The filtered_bot_traffic field shows how many bot events were excluded from the current query results.
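For example, reading the field from a parsed response matching the shape above (fetching and error handling omitted; the interface name is illustrative):

```typescript
interface StatsResponse {
  data: unknown;
  meta: { filtered_bot_traffic: number };
}

function reportBotFiltering(res: StatsResponse): string {
  return `${res.meta.filtered_bot_traffic} bot events excluded from this query`;
}
```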
FAQ
Will bot filtering affect my event counts?
Your human event counts are unaffected. Bot events are flagged and still stored (they count toward storage), but they are excluded from your dashboard metrics and aggregate views, so your analytics show only real human visitors.
Can I see my bot traffic?
Bot events are stored with is_bot=1. You can query them directly via the ClickHouse database if you're self-hosting, or via the API in a future update.
What if a real user is flagged as a bot?
False positives are rare but possible (e.g., users with unusual User-Agent strings). The multi-layered approach minimizes this — a user would need to match multiple bot signals to be filtered.
Can I disable bot filtering?
Bot filtering is always active. You cannot disable it at the site level, as it protects data quality for all users.