
Forum Moderators: Ocean10000 & phranque

Requests for No ads in error log

"No ads" being added after last directories

     
6:57 pm on Jan 7, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Anyone else getting 404 errors in their logs for requests with "No ads" added after the last directory slash?

It's not limited to one particular directory or to any specific number of subdirectories, and it doesn't happen on all search requests.

Example requests:
/home/example/public_html/happy/new/year/No ads
/home/example/public_html/niddlenaddlenoo/No ads


Which receives a 404 response.

I get around 100 of these requests per day, and from various IP addresses.
7:17 pm on Jan 7, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2018
posts:68
votes: 19


I get these too.

So far they all seem to be from residential IPs using recent browser versions.

Something to do with ad blockers maybe?
7:34 pm on Jan 7, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


I honestly have no idea. I tried to redirect them in .htaccess to the directory they follow, but can't get it to work because of the space between the words. I tried dozens of combinations, but nothing worked 100%.
7:53 pm on Jan 7, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2018
posts:68
votes: 19


Try this, works for me.

RewriteCond %{REQUEST_URI} No\sads$
8:13 pm on Jan 7, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Still no luck. I just get the URI with No%20ads following it.
8:28 pm on Jan 7, 2019 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2018
posts:68
votes: 19


No\sads$ will catch 'No ads' at the end of the URI.

You may need to add other RewriteCond declarations to cater for your specific needs, and your own RewriteRule at the end.

Clear your browser cache while you test.
9:37 pm on Jan 7, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15494
votes: 744


Try this
But that's just a Condition. What rule is it attached to? If you throw in a RewriteCond and forget to make the rule itself, the condition will be applied to the next RewriteRule, no matter what it is. (In Apache, unlike in robots.txt, blank lines have no syntactic significance.)
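
A minimal sketch of that pitfall, assuming a hypothetical, unrelated rule for old-page.html (not something from this thread). The stray condition attaches itself to whatever RewriteRule comes next:

```apache
RewriteEngine On

# Stray condition with no rule of its own...
RewriteCond %{REQUEST_URI} No\sads$

# ...so it attaches to this unrelated (hypothetical) rule. The rule can
# now never fire, because a request for old-page.html does not end in
# "No ads". The blank line in between makes no difference.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

The usual fix is to put the intended RewriteRule directly after the condition, or to drop the condition altogether.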

Is it definitely "No ads" with a space, rather than "No_ads" with a lowline? Are they part of an ordinary human request, or isolated robotic requests?

:: detour to recent archived logs ::

Nope, didn't think I'd ever seen it (covering all bases with a case-insensitive \bNo\W?ads). Do you use some particular CMS?
12:47 am on Jan 8, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


OK, this seems to work (Apache, .htaccess)

RewriteCond %{REQUEST_URI} No\sads$
RewriteRule (^.*)No\sads$ https://www.example.com/$1 [R=301,L]


@Lucy24

definitely "No ads" with a space.

They appear to come from a variety of IPs, as part of human requests.

Custom CMS, but there's no way that text could get into a URI or page link.
1:02 am on Jan 8, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15494
votes: 744


What's the condition for? The body of the rule has already specified the "No ads" part, so the condition adds no extra constraints. If the body of the rule matches, the condition will always, 100%, succeed.

Now, this happens to be one of the rare cases where a condition might actually be the way to go, because the part you want to capture comes before the part you're matching. But only if you switch the capture to the condition, like:
RewriteCond %{REQUEST_URI} ^/(.*)No\sads$
RewriteRule No\sads$ https://www.example.com/%1 [R=301,L]
The idea here is that the act of capturing makes a teeny bit of extra work for the server--and 999 times out of 1000, it ends up having to throw away the capture when the request turns out not to have the "No ads" element after all. A condition is only evaluated once the rule's pattern has matched, so moving the capture there means the work is done only for the rare requests that need it. There's really no efficient way to do it in the rule itself; ^((\w+/)*) or ^(([\w-]+/)*) would be even more work to achieve the same result. So why not shift the capture to a condition?

If you don't feel like doing this, stick with your existing rule but get rid of the condition.

All of this is assuming that your URLs already end in / slash, since that's what is getting captured. If instead you've got extensionless URLs and then the requests add "/No ads" then make sure the / is omitted from the capture.
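
For the extensionless case, a sketch of what omitting the slash from the capture could look like. It assumes a site whose URLs look like /dir/page with no trailing slash, and the same example.com hostname used above:

```apache
RewriteEngine On

# Assumed layout: extensionless URLs such as /dir/page (no trailing slash).
# The "/" before "No ads" sits outside the capture, so /dir/page/No%20ads
# redirects to /dir/page rather than /dir/page/.
# mod_rewrite matches against the decoded path, so \s matches the %20.
RewriteRule ^(.*)/No\sads$ https://www.example.com/$1 [R=301,L]
```

A request for "No ads" directly under the root has no slash in the per-directory path, so this version leaves it alone and it would still get a 404.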
1:42 pm on Jan 8, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Thanks as always @Lucy24

I now have

RewriteRule (^.*)No\sads$ https://www.example.com/$1 [R=301,L]


Which redirects to the directory with its trailing slash, as per site setup.

Still curious as to why/where the "No ads" is coming from, so if anyone has more info...
3:08 pm on Jan 8, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Oops, thanks as well @ClosedForLunch - still half asleep when I previously posted and edit time had expired.
9:58 pm on Jan 8, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15494
votes: 744


if anyone has more info...
Another WebmasterWorld member sent me a log snippet with requests just like the one you describe. From this I learn:
-- in logs it comes through as "No%20ads" (should have realized this, but after re-checking, I've still never personally seen it)
-- the "No ads" requests came after a perfectly normal human visit, starting about 10 minutes later and continuing sporadically for at least half an hour
-- the "No ads" part is not actually tacked on to a full URL; instead it is requested as if it were a file in the same directory. So for page /dir/subdir/pagename.html, there will be repeated requests for /dir/subdir/No%20ads. If the original human request was in fact for a URL ending in / this will make no difference, but if you have pages in the form pagename.html then keep this in mind when redirecting.
-- the site in question is hand-rolled, so any CMS at your end is probably a red herring.

Continue checking your logs and see if the redirect is in fact followed up. Once you start serving a 301 instead of a 404, there won't be as many requests. (Does anyone happen to know how long a browser typically remembers a 301? I have no idea: could be 10 minutes, could be a week.)
3:33 pm on Jan 24, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Just an update - and another question :-)

The rewrite appears to be working fine for the "No Ads" as I no longer see the error in the logs.

I do have another query though. I have also been receiving these requests in the error logs for a while:

AH00128: File does not exist: /home/EXAMPLE/public_html/https:/www.exa-mple.com/


Where exa-mple.com is my correct domain name. (exa-mple.com has a dash as part of my domain name)

Any gurus out there know the correct way to rewrite that to
https://www.exa-mple.com
7:37 pm on Jan 24, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15494
votes: 744


Any gurus out there know the correct way to rewrite that

Are you sure you want to? Typically these requests come from ineptly coded robots who don't deserve anything but a 404 anyway. Cross-check against your access logs and see what else comes up: If by some freak of chance it is a human, you'll also see requests for the favicon as well as any supporting files used by your error documents. (I have an /errorstyles.css in part so I can more easily identify wrongly blocked humans. Or perhaps rightly blocked ones; it depends.)

/home/EXAMPLE/public_html/https:/www.exa-mple.com/
Is it really https:/ with just one slash, or did you mistype?
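
If a redirect were wanted despite the reservations above, one possible sketch follows. It assumes the rule lives in the .htaccess of the document root, and it uses /+ so that it does not matter whether the path reaches the rule with one slash or two after https:. It has not been tested against the site in question:

```apache
RewriteEngine On

# Matches a request path that begins with the site's own URL, e.g.
# /https://www.exa-mple.com/ (or https:/ with the slashes collapsed).
# Anything after the hostname is captured and kept in the redirect.
RewriteRule ^https?:/+www\.exa-mple\.com/?(.*)$ https://www.exa-mple.com/$1 [R=301,L]
```

Leaving these requests to 404 remains the simpler option if they turn out to be robots.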
11:34 pm on Jan 24, 2019 (gmt 0)

Junior Member from CA 

10+ Year Member Top Contributors Of The Month

joined:Oct 1, 2002
posts: 142
votes: 11


Yeah, thinking about it, you're probably right, Lucy24; it's just annoying seeing lots of errors in the logs. OCD...

My bad on the https:/ -- it should be two slashes.
2:39 am on Jan 25, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15494
votes: 744


it's just annoying seeing lots of errors in the logs
In some respects, errors in logs are like the “errors” reported by GSC: Yes, OK, thank you, everything is happening exactly the way I want, now will you shut up about it already and stop calling everything an error.

In the server’s case, an “error” is anything leading to a 400-class or 500-class response, even when that is exactly the response you want to have happen. In fact it would be worrying if there were no errors ever. As with GSC “errors”, you can glance at them and make sure there’s nothing unwanted or unexpected. Concentrate on 500-class errors, as those are far more likely to mean you really did do something wrong.