How VPN and Proxy Detection Works (and Why It's Hard)

VPN and proxy detection works by layering multiple independent techniques — known IP range databases, ASN and infrastructure analysis, network fingerprinting, shared blocklists, and behavioral signals — because no single method catches every anonymizer. Datacenter VPNs are relatively easy to identify; residential proxies, which borrow real consumer IP addresses, are genuinely hard. The systems that perform best treat detection as a probability problem, not a yes/no lookup.
This post walks through why the problem matters, how each detection layer works, and what to actually do once you have the signal.
Why Businesses Need to Detect VPNs and Proxies
Anonymizers break the assumption that an IP address tells you something true about a visitor. That matters in concrete, expensive ways:
- Payment fraud. Stolen-card operations route through proxies to match the cardholder's billing country and dodge velocity checks.
- Account takeover. Credential-stuffing tools distribute login attempts across thousands of proxy IPs so no single address trips a rate limit.
- Promo and trial abuse. One operator behind a rotating proxy pool can register hundreds of "new" accounts to drain referral bonuses and free tiers.
- Geo-compliance. Streaming rights, gambling licenses, and sanctions rules require knowing where a user really is — which means knowing when location is spoofed.
- Scraping and bot traffic. Industrial scrapers rotate through proxy networks specifically to defeat IP-based rate limiting.
None of this means VPN users are fraudsters — most aren't. It means VPN status is essential context for every other decision your risk logic makes.
Know Your Adversary: The Five Anonymizer Types
| Type | How it works | Typical IP origin | Detection difficulty |
|---|---|---|---|
| Datacenter VPN | Encrypted tunnel to a commercial VPN server | Hosting/cloud ASNs | Low–moderate |
| Tor | Multi-hop onion routing through volunteer relays | Published exit nodes | Low |
| Datacenter proxy | Simple HTTP/SOCKS relay on a rented server | Hosting/cloud ASNs | Low–moderate |
| Residential proxy | Traffic relayed through real consumer devices | Home ISP ranges | High |
| Mobile proxy | Traffic relayed through cellular connections | Carrier CGNAT ranges | Very high |
The pattern is clear: the closer an anonymizer's exit point looks to a real consumer connection, the harder it is to catch. Mobile proxies are the extreme case — carrier-grade NAT means thousands of legitimate users already share each IP, so the address itself is nearly useless as evidence.
The Detection Toolbox
Known IP range databases
The foundation layer. Commercial VPN providers operate finite server fleets, and those servers live at addresses that can be discovered — by subscribing to the services, resolving their endpoint hostnames, and monitoring their infrastructure over time. The same applies to open proxies and datacenter proxy vendors. Tor is even simpler: exit nodes are published by the Tor Project itself.
The weakness is freshness. VPN providers add and rotate servers constantly, so a range database is only as good as its update cadence.
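As a minimal sketch of the range-lookup layer, the snippet below checks an address against a small labeled set of CIDR blocks using Python's standard `ipaddress` module. The `KNOWN_RANGES` table and the `classify_ip` helper are hypothetical, and the ranges are reserved documentation blocks rather than real VPN or Tor addresses. A production system would load a frequently refreshed feed instead.

```python
import ipaddress

# Hypothetical sample data: reserved documentation ranges stand in for a
# real, frequently refreshed feed of VPN, Tor, and proxy addresses.
KNOWN_RANGES = {
    "vpn": ["192.0.2.0/24", "2001:db8:1::/48"],
    "tor_exit": ["198.51.100.17/32"],
    "datacenter_proxy": ["203.0.113.0/25"],
}

# Parse the CIDR strings once so each lookup only does membership tests.
PARSED_RANGES = [
    (label, ipaddress.ip_network(cidr))
    for label, cidrs in KNOWN_RANGES.items()
    for cidr in cidrs
]


def classify_ip(address: str) -> list[str]:
    """Return the label of every known range that contains the address."""
    ip = ipaddress.ip_address(address)
    return [
        label
        for label, network in PARSED_RANGES
        if ip.version == network.version and ip in network
    ]


print(classify_ip("192.0.2.44"))     # inside the sample "vpn" block
print(classify_ip("203.0.113.200"))  # outside the sample /25, so no labels
```

The linear scan keeps the example short. With millions of ranges, a prefix trie or sorted-interval lookup would likely be the better fit, and the refresh job matters more than the lookup structure.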
ASN and infrastructure analysis
Every IP belongs to an Autonomous System, and ASNs are classifiable: consumer ISP, mobile carrier, cloud provider, hosting company. Ordinary consumer traffic rarely originates from a hosting ASN, because people don't browse the web from inside a rented server. Datacenter origin is therefore a strong anonymizer indicator on its own, even for VPN servers that haven't been individually cataloged yet.
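One way to picture ASN classification is a lookup from ASN to network type, with each type contributing a weight to later scoring. Everything in this sketch is an assumption made for illustration: the `ASN_TYPES` table uses private-use ASNs rather than real operators, and the weights are placeholders, not calibrated values.

```python
from enum import Enum


class NetworkType(Enum):
    CONSUMER_ISP = "consumer_isp"
    MOBILE_CARRIER = "mobile_carrier"
    HOSTING = "hosting"
    UNKNOWN = "unknown"


# Hypothetical classification table keyed by ASN. The numbers come from the
# private-use ASN range, so they do not refer to any real network.
ASN_TYPES = {
    64512: NetworkType.CONSUMER_ISP,
    64513: NetworkType.MOBILE_CARRIER,
    64514: NetworkType.HOSTING,
}

# Illustrative weights only: hosting origin counts heavily, an unclassified
# ASN counts a little, consumer and mobile networks add nothing by themselves.
ANONYMIZER_WEIGHT = {
    NetworkType.CONSUMER_ISP: 0.0,
    NetworkType.MOBILE_CARRIER: 0.0,
    NetworkType.HOSTING: 0.7,
    NetworkType.UNKNOWN: 0.2,
}


def asn_signal(asn: int) -> tuple[NetworkType, float]:
    """Map an ASN to its network type and an anonymizer weight in [0, 1]."""
    network_type = ASN_TYPES.get(asn, NetworkType.UNKNOWN)
    return network_type, ANONYMIZER_WEIGHT[network_type]
```

Returning a weight rather than a boolean keeps the signal usable as one input among several, which matters for hosting-origin traffic that turns out to be a corporate gateway.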
Latency and network fingerprinting
A relayed connection has physics working against it. If an IP claims to be in one region but round-trip timing behaves like the traffic traveled much farther, something is in the middle. Related techniques examine TCP/IP characteristics such as maximum segment size, initial TTL, and window size, which shift when traffic passes through a tunnel or originates from a different operating system than the browser claims. These signals are noisy individually but powerful as corroboration.
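The timing argument can be sketched as a plausibility check: compute the great-circle distance between your server and the IP's claimed location, convert it to a minimum round-trip time, and compare that with what you measured. The function names, the `factor` and `slack_ms` tolerances, and the figure of roughly 200 km per millisecond for light in fiber are assumptions chosen for illustration. Real paths are longer than great circles, so treat the result as corroboration rather than proof.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: light in optical fiber covers roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def check_latency(measured_rtt_ms, server, claimed, factor=3.0, slack_ms=40.0):
    """Compare a measured RTT with the floor implied by the claimed location.

    server and claimed are (lat, lon) tuples. factor and slack_ms are
    hypothetical tolerances for routing detours and last-mile delay.
    """
    floor_ms = 2 * great_circle_km(*server, *claimed) / FIBER_KM_PER_MS
    if measured_rtt_ms < floor_ms:
        # The client cannot be as far away as the geolocation claims.
        return "faster_than_possible"
    if measured_rtt_ms > factor * floor_ms + slack_ms:
        # Consistent with an extra relay hop, but also with a bad network.
        return "slower_than_expected"
    return "consistent"


frankfurt = (50.11, 8.68)
amsterdam = (52.37, 4.90)
print(check_latency(250.0, frankfurt, amsterdam))
```

Only the "faster than possible" result is a hard physical contradiction. A slow round trip has many innocent explanations, such as congested Wi-Fi or a distant peering point, which is why the text above calls these signals noisy.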
Shared blocklists and abuse intelligence
When an IP participates in credential stuffing, spam, or scanning across many networks, that history becomes a reputation signal. An address that was clean yesterday but is suddenly generating abuse reports across unrelated services has very likely joined a proxy pool or botnet.
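The "clean yesterday, noisy today" pattern can be expressed as a simple check over abuse reports for one IP. This is a sketch under stated assumptions: the report format (a timestamp paired with a source name), the two-day window, and the three-source minimum are all hypothetical, and real feeds differ in structure and reliability.

```python
from datetime import datetime, timedelta


def sudden_abuse_onset(reports, now, recent=timedelta(days=2), min_sources=3):
    """Return True when an IP with no older history is suddenly reported
    by several independent sources.

    reports is a list of (timestamp, source_name) tuples for a single IP.
    The window and threshold defaults are illustrative, not tuned values.
    """
    cutoff = now - recent
    recent_sources = {source for ts, source in reports if ts >= cutoff}
    had_older_history = any(ts < cutoff for ts, _ in reports)
    return not had_older_history and len(recent_sources) >= min_sources


now = datetime(2024, 1, 10, 12, 0)
reports = [
    (datetime(2024, 1, 10, 9, 0), "feed_a"),
    (datetime(2024, 1, 10, 10, 0), "feed_b"),
    (datetime(2024, 1, 9, 22, 0), "feed_c"),
]
print(sudden_abuse_onset(reports, now))
```

Counting distinct sources rather than raw reports keeps one noisy feed from triggering the flag by itself.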
Behavioral signals
Some evidence only emerges from your own traffic: dozens of distinct accounts appearing from one IP within minutes, a single session hopping across IPs in different countries, or request patterns too uniform to be human. Behavioral data is the layer attackers can't buy their way around, because it's generated by what they do, not where they connect from.
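The first of those behavioral signals, many distinct accounts from one IP in a short span, can be sketched as a sliding-window counter. The class name, the ten-minute window, and the ten-account limit are hypothetical, and the sketch assumes events arrive in timestamp order and fit in memory.

```python
from collections import defaultdict, deque


class AccountVelocityTracker:
    """Flag IPs that show too many distinct accounts within a time window."""

    def __init__(self, window_seconds=600, max_accounts=10):
        self.window_seconds = window_seconds
        self.max_accounts = max_accounts
        # Maps each IP to a deque of (timestamp, account_id), oldest first.
        self._events = defaultdict(deque)

    def record(self, ip, account_id, timestamp):
        """Record one event and return True if the IP now exceeds the limit.

        Assumes timestamps (in seconds) are non-decreasing per IP.
        """
        events = self._events[ip]
        events.append((timestamp, account_id))
        # Drop events that have aged out of the window.
        while events and events[0][0] <= timestamp - self.window_seconds:
            events.popleft()
        distinct_accounts = {account for _, account in events}
        return len(distinct_accounts) > self.max_accounts


tracker = AccountVelocityTracker(window_seconds=600, max_accounts=3)
for i in range(5):
    flagged = tracker.record("203.0.113.9", f"acct-{i}", 1000.0 + i * 10)
    print(i, flagged)
```

Counting distinct accounts rather than requests means a single user refreshing a page repeatedly never trips the check. The limit needs care on shared addresses such as CGNAT ranges and office gateways, where many legitimate accounts share one IP.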
Why Residential Proxies Are the Hardest Case
Residential proxy networks route traffic through real consumer devices — often via SDKs bundled into free apps whose users rarely understand they've become exit nodes, sometimes via outright malware. The result: your attacker's request arrives from a genuine home broadband IP, on a consumer ISP's ASN, in a plausible city.
Range databases don't help, because the IP genuinely is residential. ASN analysis doesn't help, for the same reason. Detection has to lean on the harder layers:
- Rotation patterns. Residential proxy IPs cycle through many "users" quickly. An address serving five unrelated sessions in an hour doesn't look like a family's router.
- Latency mismatch. The relay hop adds delay inconsistent with the claimed location.
- Network membership intelligence. Proxy networks can be probed and mapped from the inside, identifying participating addresses directly.
- Behavioral corroboration. Automation artifacts — headless browser tells, impossible navigation speed — surface what the IP conceals.
This is exactly why serious detection is layered. GeoIPHub's 8-layer detection engine draws on 60+ intelligence sources for this reason: when the cheap signals fail, the expensive ones have to be ready. We documented how that engine evolved in the GeoIPHub case study.
The False-Positive Tradeoff
Detection aggressiveness is a dial, not a switch, and turning it up has costs:
- Corporate networks route employees through gateways that look proxy-like.
- CGNAT puts thousands of legitimate mobile users behind one shared IP.
- Privacy-conscious customers use consumer VPNs for entirely benign reasons — and they skew toward technical, higher-value users.
Flag too eagerly and you're rejecting real revenue to stop hypothetical fraud. Flag too cautiously and the proxy traffic walks in. The honest answer is that the right threshold depends on what's at stake per transaction: a comments section can tolerate proxy traffic; a payout flow cannot. Which leads to the most important design decision of all.
Use the Signal, Don't Worship It: Risk Scoring Over Hard Blocking
The mature pattern is to treat anonymizer detection as one weighted input into a composite score, then respond proportionally:
- Low risk — clean residential IP, normal behavior: let it through, no friction.
- Moderate risk — VPN detected, everything else normal: allow, but log and watch.
- Elevated risk — proxy plus new account plus velocity anomaly: step up verification (MFA, email confirmation).
- High risk — known abusive IP, automation fingerprints, stacked signals: block or route to manual review.
A single 0–100 risk score makes this easy to implement — one threshold check per tier instead of a tangle of field-level rules. GeoIPHub returns exactly that, alongside 140+ data fields and sub-100ms responses, with a free tier of 2,000 requests per day for testing the integration against your real traffic. For a broader look at the data behind the score, see What is IP intelligence.
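The tiered response above reduces to a short threshold table. The sketch below assumes a 0–100 score has already been obtained from whatever scoring source you use. The cut points (30, 60, 85), the stricter payout variant, and the `Action` names are hypothetical and would need tuning against your own traffic.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ALLOW_AND_LOG = "allow_and_log"
    STEP_UP = "step_up_verification"
    BLOCK_OR_REVIEW = "block_or_review"


# Hypothetical cut points: each entry is (exclusive upper bound, action).
DEFAULT_TIERS = [
    (30, Action.ALLOW),
    (60, Action.ALLOW_AND_LOG),
    (85, Action.STEP_UP),
]

# A higher-stakes flow can reuse the same logic with tighter cut points.
PAYOUT_TIERS = [
    (15, Action.ALLOW),
    (40, Action.ALLOW_AND_LOG),
    (65, Action.STEP_UP),
]


def decide(score, tiers=DEFAULT_TIERS):
    """Map a 0-100 risk score to a proportional response."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for upper_bound, action in tiers:
        if score < upper_bound:
            return action
    # Anything at or above the last cut point gets the strongest response.
    return Action.BLOCK_OR_REVIEW


print(decide(45))                # default flow
print(decide(45, PAYOUT_TIERS))  # same score, stricter flow
```

Keeping the tiers as data rather than branching logic makes the per-flow tradeoff explicit: the same score can mean "log and watch" on a comments section and "step up verification" on a payout flow.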
Closing Thoughts
VPN and proxy detection is hard because the adversary gets to choose where their traffic appears to come from — and the best anonymizers choose addresses indistinguishable from your customers'. The defense isn't one clever trick; it's layers that fail independently, refreshed constantly, feeding a score that lets you apply friction only where evidence stacks up.
If you're weighing how to build this capability into your own product, Keplaris builds and operates this kind of system in production — we'd be glad to compare notes on your detection architecture.
Frequently asked questions
How does VPN and proxy detection work?
VPN and proxy detection combines several techniques: databases of known VPN and proxy IP ranges, ASN analysis to spot datacenter-hosted traffic, latency and network fingerprinting to detect relayed connections, shared blocklists of abused addresses, and behavioral signals like many accounts appearing from one IP.
Can websites detect VPN use?
Often, yes. Commercial VPNs route traffic through servers in datacenters whose IP ranges are publicly attributable to hosting providers and known VPN operators. Well-maintained detection systems identify the bulk of mainstream VPN traffic, though smaller or self-hosted VPNs are harder to flag.
Why are residential proxies so hard to detect?
Residential proxies route traffic through real consumer devices on home ISP connections, so the IP looks identical to a legitimate customer's. Detection relies on subtler evidence (rapid IP rotation, mismatched latency, behavioral anomalies, and intelligence on proxy network membership) rather than simple range lookups.
Should you block all VPN users?
Usually not. Many legitimate users run VPNs for privacy or on corporate networks, so hard-blocking them sacrifices real customers. The better pattern is risk scoring: treat VPN use as one input, and reserve blocks or step-up verification for sessions where multiple risk signals stack up.
What is a Tor exit node?
A Tor exit node is the final relay where Tor traffic re-enters the public internet, so websites see the exit node's IP instead of the user's. Because the Tor Project publishes its exit node list, Tor is the easiest anonymizer category to detect reliably.
