Cloudflare WAF Rules V3¶
Reference: https://webagencyhero.com/cloudflare-waf-rules-v3/
A few years ago, I created some custom firewall rules on Cloudflare to help protect my clients' sites from bots, spammers, hackers, etc. Over the years, those rules have helped stop thousands, if not millions, of attacks on my clients and on other websites hosted or managed by designers and marketers from The Admin Bar Facebook Group.
I am a HUGE FAN of Cloudflare and highly recommend it for everyone. I have clients on the Free, Pro, and Business plans. Cloudflare is a saving grace for anyone hosting and/or managing websites. After much testing and tweaking, I finally have my version 3 ready. The rules are similar to my old ones, but improved, and I kept the set to five rules so it fits within the custom-rule limit of any Cloudflare plan.
Cloudflare Enterprise
These rules WILL NOT work with Cloudflare Enterprise. Some providers that use Cloudflare Enterprise are Kinsta (required), Rocket.net (required), and Cloudways (optional). You need direct access to Cloudflare.com to add these rules.
Main Rules¶
Allow Good Bots¶
The “Allow Good Bots” rule grants full, unrestricted access to bots that you approve of, including those you manually add and those classified as safe by Cloudflare.
Cloudflare publishes details about these bots in its Verified Bots categories:
Cloudflare Radar Verified Bots
While you can customize this list to suit your needs, Cloudflare generally does an excellent job of allowing legitimate bots through its Known Bot and Verified Bot categories.
- Whitelisting
For this set of rules, I did not include a third-party services allow list because I wanted to identify which services are not already covered by Cloudflare's Known Bots and Verified Bots rules. Many of the third-party services we use may already be included, but I want to verify this. Since we all use different services, I need your assistance with testing: please test on a few of your clients' sites and report back. I was planning to create a Cloudflare IP list, but I ran into issues with Cloudflare accepting some of the IPv6 addresses.
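If you do end up curating your own list, Cloudflare custom lists can be referenced directly in an expression. A minimal sketch, assuming you've created a list named third_party_services under Manage Account > Configurations > Lists (the list name here is a placeholder):
(ip.src in $third_party_services)
Add it as an 'or' clause to the Allow Good Bots expression so matching requests skip the remaining rules.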
- Note
If you're using my previous rules, you might already have a Good Bot rule in place. You can continue using that rule, but make sure to add the Cloudflare Verified Bots categories to it. See the Modified Good Bots section below.
KNOWN BOT RULE UPDATE - 4-15-2025
The Known Bots rule is no longer included in my default rules. Originally, Cloudflare categorized ChatGPT (and some others) under the Verified Bots group AI, but it has since been moved to the broader Known Bots category. While ChatGPT is a powerful tool, it has been known to overload sites or servers during its scans. Because of this, I've excluded it from my default configuration. If you'd like to allow it again, you can re-enable it by adding the Known Bots entry as shown in the screenshot. Note that the current expression no longer includes it.
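In expression form, re-enabling Known Bots just means appending Cloudflare's known-bots field as one more 'or' clause to the Allow Good Bots expression below:
or (cf.client.bot)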
Allow Good Bots Screenshot
Allow Good Bots Expression
(cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher"}) or (http.user_agent contains "letsencrypt" and http.request.uri.path contains "acme-challenge")
Modified Good Bots¶
Ensure that you include the “Verified Bots” category in your existing Allow/Skip Good Bot rule.
I added everything under Verified Bot Category except for AI Crawler, Aggregator, and Other.
Note
The expression below is only the 'or' portion covering the Verified Bots categories. You can append it to the end of your existing Good Bot rule.
YOU ONLY NEED TO DO THIS PART IF YOU ALREADY HAVE A GOOD BOT RULE. IF YOU ARE STARTING FRESH YOU CAN IGNORE THE MODIFIED ALLOW GOOD BOT RULE.
Modified Allow Good Bots Screenshot
Modified Allow Good Bots Expression
or (cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher"}) or (http.user_agent contains "letsencrypt" and http.request.uri.path contains "acme-challenge")
Note
You can add the Let's Encrypt entries as well, although the Verified Bots rule should already cover them. I included them as a precautionary measure.
Aggressive Crawlers¶
The “Aggressive Crawlers” rule is designed to block overly persistent bots. While it effectively prevents many fake bots, it can also block aggressive SEO crawler bots.
Note
To block the aggressive SEO crawlers, you need to uncheck "All remaining custom rules" in the "Allow Good Bots" rule. I've included a screenshot below, under the Aggressive Crawler screenshot/expression.
Aggressive Crawler Screenshot
Aggressive Crawler Expression
(http.user_agent contains "yandex") or (http.user_agent contains "sogou") or (http.user_agent contains "semrush") or (http.user_agent contains "ahrefs") or (http.user_agent contains "baidu") or (http.user_agent contains "python-requests") or (http.user_agent contains "neevabot") or (http.user_agent contains "CF-UC") or (http.user_agent contains "sitelock") or (http.user_agent contains "crawl" and not cf.client.bot) or (http.user_agent contains "bot" and not cf.client.bot) or (http.user_agent contains "Bot" and not cf.client.bot) or (http.user_agent contains "Crawl" and not cf.client.bot) or (http.user_agent contains "spider" and not cf.client.bot) or (http.user_agent contains "mj12bot") or (http.user_agent contains "ZoominfoBot") or (http.user_agent contains "mojeek") or (ip.src.asnum in {135061 23724 4808} and http.user_agent contains "siteaudit")
How to block the aggressive crawlers (CAREFUL WITH THIS ONE)
To block aggressive SEO crawler bots like Ahrefs and SEMrush, uncheck "All remaining custom rules" in your allow rule settings. However, be cautious: without including the Known Bots and Verified Bots categories in some other rules, you might inadvertently block some legitimate services. For a more targeted approach, you can add the specific user agents of these SEO crawler bots to your Block List, as sketched below.
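A minimal sketch of that targeted approach (the user-agent tokens below are the ones Ahrefs and SEMrush commonly advertise; confirm them in your logs first, and remember that contains is case-sensitive in Cloudflare expressions):
(http.user_agent contains "AhrefsBot") or (http.user_agent contains "SemrushBot")
Set the action to Block, or to Managed Challenge if you want a softer touch.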
- Uncheck this box on the Allow Good Bot Rule
Challenge Large Providers / Country¶
This rule addresses two key issues: it issues managed challenges to connections from VPS servers hosted on Google Cloud, Amazon EC2, and Azure, as well as to visitors from outside your country of origin.
Hackers and spammers often use VPS servers from Google, Amazon, and Azure to launch rapid attacks on sites or waste resources by scanning them. These servers can be active for a day or longer, consuming resources and posing a threat to your site.
- Note
Legitimate services do use Amazon, Google, and Azure, so if you're using a third-party service that needs to connect to your site, you might need to whitelist its IPs in the Allow Good Bots rule. However, Cloudflare's Known & Verified Bots lists might already include these services, so you may not need to take additional steps. It depends on the specific service you're using.
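If you do need to whitelist a service, one option is to append its published IP range to the Allow Good Bots expression. A minimal sketch, using the documentation range 203.0.113.0/24 as a placeholder for the service's real IPs:
or (ip.src in {203.0.113.0/24})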
MC Large Providers / Country Screenshot
MC Large Providers / Country Expression
(ip.src.asnum in {7224 16509 14618 8075 396982} and not cf.client.bot and not cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher" "Aggregator"}) or (not ip.src.country in {"US"} and not cf.client.bot and not cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher" "Aggregator"} and not http.request.uri.path contains "acme-challenge" and not http.request.uri.query contains "fbclid" and not ip.src.asnum in {32934})
Note
Make sure you change the country setting in the rule above to your client's primary country of origin. If you do not want to issue managed challenges outside the country of origin, use the rule below.
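For example, for a UK-based client, the country clause in the rule above would swap the ISO 3166-1 alpha-2 code like so:
not ip.src.country in {"GB"}
Only the code inside the braces changes; the rest of the expression stays the same.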
MC Large Providers W/O Challenge Country Screenshot
MC Large Providers W/O Challenge Country Expression
(ip.src.asnum in {7224 16509 14618 15169 8075 396982} and not cf.client.bot and not cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher" "Aggregator"} and not http.request.uri.path contains "acme-challenge")
Challenge Path / VPN Managed Challenge¶
This rule tackles two primary concerns: it issues managed challenges to connections from VPN providers and to requests for the wp-login.php path (xmlrpc.php is handled by the Block rule further below).
Note
While legitimate users do use VPNs, hackers and spammers often exploit them too. In my experience, the negative impact from malicious visitors outweighs the benefits for legitimate users. Therefore, I restrict full access to websites for VPN providers, implementing managed challenges via Cloudflare.
Cloudflare Access
Another thing I recommend is Cloudflare Access for the WordPress login page. I will create another post in the future on how to set that up. Cloudflare Access is free for up to 50 users (or 50 email addresses) per Cloudflare account.
Path / VPN Managed Challenge Screenshot
Path / VPN Managed Challenge Expression
(ip.src.asnum in {60068 9009 16247 51332 212238 131199 22298 29761 62639 206150 210277 46562 8100 3214 206092 206074 206164 213074}) or (http.request.uri.path contains "wp-login")
Block Web Host / Paths / TOR¶
This rule includes a list of web hosts I have compiled over the years. While it doesn't cover every host, it does encompass many of the major ones. Additionally, this rule blocks access to paths such as xmlrpc.php, wp-config.php, and wlwmanifest. It also blocks the AI Crawler and Other categories from Cloudflare's Verified Bots list.
On very rare occasions, someone might use their own custom VPN, or run something like VMware on a provider such as DigitalOcean or phoenixNAP (rather than one of the major providers), and this rule would block them. In such cases, you might consider setting this rule to "Managed Challenge" instead of "Block." Although it's uncommon, it does happen.
In my experience, the risk outweighs the benefit, so I keep mine set to "Block" and do the same for my largest client's site. They run an eCommerce site and have experienced significant fraud. Since we started blocking VPNs, their fraud incidents have decreased.
Note
I don't allow TOR or TOR Exit Nodes with Cloudflare. Legitimate users may use TOR, but so do the bad guys. I prefer to block them.
Block Web Host / Paths / TOR Screenshot
Block Web Host / Paths / TOR Expression
(ip.src.asnum in {200373 198571 26496 31815 18450 398101 50673 7393 14061 205544 199610 21501 16125 51540 264649 39020 30083 35540 55293 36943 32244 6724 63949 7203 201924 30633 208046 36352 25264 32475 23033 212047 31898 210920 211252 16276 23470 136907 12876 210558 132203 61317 212238 37963 13238 2639 20473 63018 395954 19437 207990 27411 53667 27176 396507 206575 20454 51167 60781 62240 398493 206092 63023 213230 26347 20738 45102 24940 57523 8100 8560 6939 14178 46606 197540 397630 9009 11878}) or (http.request.uri.path contains "xmlrpc") or (http.request.uri.path contains "wp-config") or (http.request.uri.path contains "wlwmanifest") or (cf.verified_bot_category in {"AI Crawler" "Other"}) or (ip.src.country in {"T1"})
Additional Options¶
Whitelisting Server IP¶
Since we are blocking some website hosts, adding your web server's IP under Security > WAF > Tools on Cloudflare is a must to ensure your cron jobs continue working.
Whitelisting Server IP for CRONs
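If you prefer to keep everything in custom rules rather than the Tools section, an equivalent Skip rule is a one-liner. A minimal sketch, using the placeholder address 203.0.113.10 in place of your actual server IP:
(ip.src eq 203.0.113.10)
Set the action to Skip, check "All remaining custom rules," and place it at the top of the list so your server's cron requests are never evaluated against the blocking rules.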
Disable Onion Routing¶
As mentioned above, I do not allow TOR or TOR exit nodes, so I also turn off Cloudflare's Onion Routing setting. The setting is located under the Network tab on Cloudflare.
You can learn more about TOR and Onion Routing on Cloudflare’s KB Onion Routing and Tor support.