preventing directories being detectable

         

wex65

4:44 pm on Nov 23, 2021 (gmt 0)

Top Contributors Of The Month



I am running Apache on a Rocky Linux 8.5 server (WordPress site) and have managed to get everything through the PCI test cleanly. However, I have several hundred information-only messages about directories being detectable.

An example:

Evidence
HTTP Response Code: 503
URL: https: //XXX.XXX.XXX.XXX:443/_vti_bot/

I assume they would prefer a 404(?) to be returned instead of an HTTP status that indicates the presence of a directory? Is there any way to configure Apache for this?

[edited by: engine at 4:51 pm (utc) on Nov 23, 2021]
[edit reason] tweaked url [/edit]

w3dk

5:51 pm on Nov 23, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



What's triggering the 503? Ordinarily, a directory request would return a 403 Forbidden (when there's no directory index document and mod_autoindex is disabled).

Do other requests return a 503?

Are you using FrontPage?

> Is there any way to configure Apache for this?

Yes, but first find out what's triggering the "503 Service Unavailable".

wex65

6:04 pm on Nov 23, 2021 (gmt 0)

Top Contributors Of The Month



Thanks, I am not running anything other than apache/wordpress.

Any way to determine on the server what is causing the 503? i.e. some RHEL8 command?

not2easy

6:21 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You could view your server logs or check the error logs (that may depend on some configuration settings). I would first look at the server access logs to find the UA and the request that is setting it off. It could well be something just doing its job.
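As a rough sketch of that log check (not a definitive tool), a few lines of Python can pull every 503 out of an access log together with the request line and user agent. It assumes the Apache "combined" LogFormat; the name `find_status` and the sample log lines are made up for illustration.

```python
import re

# Apache "combined" log format (an assumption; check your LogFormat directive).
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def find_status(lines, status="503"):
    """Yield (request, user_agent) for every log line with the given status."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("status") == status:
            yield m.group("request"), m.group("agent")

# Hypothetical sample lines; a real run would iterate over an open log file.
SAMPLE = [
    '203.0.113.7 - - [23/Nov/2021:16:40:01 +0000] "GET /_vti_bot/ HTTP/1.1" 503 299 "-" "ExampleScanner/1.0"',
    '203.0.113.7 - - [23/Nov/2021:16:40:02 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "ExampleScanner/1.0"',
]

for request, agent in find_status(SAMPLE):
    print(request, "|", agent)
```

To use it on a real server, pass an open file handle in place of SAMPLE. Depending on the configuration, HTTPS requests may be written to a separate log file from plain HTTP requests, so it is worth checking every access log the server writes.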

w3dk

6:30 pm on Nov 23, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



> I am not running anything other than apache/wordpress.

In that case, why do you have a "_vti_bot" subdirectory? This is indicative of MS FrontPage (perhaps leftover from a previous site?). If this is no longer required then just delete it - you should then have your 404!

wex65

6:55 pm on Nov 23, 2021 (gmt 0)

Top Contributors Of The Month



I checked and do not have a "_vti_bot" directory. I think they are hitting many hundreds of potential directories for various platforms to test which are 'accessible/detectable'. I looked into the Apache logs and it seems that the hits to my domain are OK but those made directly to the IP address are not.

For example, mydomain.com/_vti_bot gives a 404 whereas 111.222.333.444:443/_vti_bot provides the 503.

I think I need to somehow redirect attempts to the IP address to the domain, I am guessing in the virtual host?

lucy24

7:05 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Edit: We overlapped, so I didn’t see the most recent post.

> URL: https: //XXX.XXX.XXX.XXX:443/_vti_bot/
Now, wait. I assume you have a canonicalization redirect. The 503 response would then imply that the Mystery Error arises before the request reaches the redirect; otherwise it would instead be recorded as a response to
https://example.com/_vti_bot/
(You did say this is your own server, right? If it were any form of shared hosting, requests without a hostname would never even reach the site.)

503 responses, unlike 403 or 404--or, more to the point, 500--don’t normally arise on their own. There has to be something in the site code telling the server to return that response. A crude but simple test is to open your configuration files in a text editor and search globally for the string “503”.
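The "crude but simple" search can also be scripted. This is only a sketch: `search_configs` is a hypothetical helper name, and the set of file suffixes it looks at is an assumption that may need widening for a given setup.

```python
import os

def search_configs(root, needle="503", suffixes=(".conf", ".htaccess")):
    """Return (path, line_number, line) for each config line containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue  # only look at files that resemble Apache config
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file; skip it
    return hits

# Example call (the directory is an assumption about where the config lives):
#   for hit in search_configs("/etc/httpd"):
#       print(*hit)
```

A plain substring match will also catch unrelated numbers that happen to contain "503", so the hits still need reading by eye.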

Oh, and, before anything else ... Is that 503 the response issued by the server (as recorded in logs), or is it the response received by the visitor? They’re not always the same, especially when a CMS is involved.

phranque

7:26 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



requests for an IP address will go to the default virtual host configured for that Apache server.
assuming that you are on a shared hosting environment, you will not likely have control of the configuration for the default virtual host.

phranque

7:30 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> Is that 503 the response issued by the server (as recorded in logs), ...

in a shared hosting environment you probably won't be able to see the server logs for the default virtual host.

wex65

8:01 pm on Nov 23, 2021 (gmt 0)

Top Contributors Of The Month



I will do more research this evening. The server is a dedicated machine (Linode) so I have root access and the ability to make any necessary changes.

I will do some tests from another server using curl -i to see what response I get back, as I am unable to replicate the 503 they claim to be seeing.

The access log I have been checking is at /var/log/httpd/access_log

lucy24

8:59 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I am unable to replicate the 503 they claim to be seeing.

Oh, cripes. You mean all of this is a response reported by someone else? Why are they requesting your IP address rather than the hostname, with appended nonexistent directory?

:: idly thinking that on your own server it would be the easiest thing in the world to return a 503 to any request for a numerical IP, should you choose to do so ::

:: further idly wondering if robots go away faster if they meet a 500-class rather than a 400-class response ::

> requests for an IP address will go to the default virtual host configured for that Apache server
But I should hope that the administrator would not use some randomly chosen customer’s website for this purpose, without telling them they’d done so.

Edit: Out of curiosity I looked up my personal site’s IP address and then requested that IP in the browser. I landed on a “Site not found” message from the host. That seems a more reasonable response.

phranque

9:34 pm on Nov 23, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



btw - welcome to WebmasterWorld [webmasterworld.com], wex65!

> The server is a dedicated machine (Linode) so I have root access...


look in your server config file(s) for the <VirtualHost> container(s).
from the apache documentation:
If multiple virtual hosts contain the best matching IP address and port, the server selects from these virtual hosts the best match based on the requested hostname. If no matching name-based virtual host is found, then the first listed virtual host that matched the IP address will be used. As a consequence, the first listed virtual host for a given IP address and port combination is the default virtual host for that IP and port combination.

source: https://httpd.apache.org/docs/current/mod/core.html#virtualhost

also see:
VirtualHost Examples [httpd.apache.org]
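The selection rule quoted from the documentation can be modelled in a few lines of Python. This is a deliberately simplified sketch, not Apache's actual implementation: it assumes every virtual host in the list shares the same IP address and port, it ignores IPv6 literals, and `select_vhost` and the dictionary layout are invented for illustration.

```python
from fnmatch import fnmatchcase

def select_vhost(vhosts, host_header):
    """Pick a name-based vhost from a list that shares one IP:port (simplified)."""
    host = host_header.split(":", 1)[0].lower()  # drop an optional ":port"
    for vh in vhosts:
        if host == vh["ServerName"].lower():
            return vh
        # ServerAlias entries may contain wildcards such as *.example.com
        if any(fnmatchcase(host, alias.lower()) for alias in vh.get("ServerAlias", [])):
            return vh
    return vhosts[0]  # no name matched: the first listed vhost acts as the default

# Hypothetical configuration, in the order the vhosts appear in the config files.
vhosts = [
    {"ServerName": "default.invalid"},
    {"ServerName": "example.com", "ServerAlias": ["www.example.com"]},
]

print(select_vhost(vhosts, "www.example.com")["ServerName"])
print(select_vhost(vhosts, "203.0.113.10:443")["ServerName"])
```

A request whose Host header is a bare IP address matches no ServerName or ServerAlias here, so it falls through to whichever virtual host is listed first.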

wex65

8:35 pm on Nov 26, 2021 (gmt 0)

Top Contributors Of The Month



Guys, apologies for the late reply but it took me a day or so to get my head around what was/is going on.

First, to clarify the question above... "Why are they requesting your IP address rather than the hostname, with appended nonexistent directory?" This is part of a banking PCI test. They perform many thousands of tests against the server and one of them is to 'sniff' what directories exist. I think the fact that they get back a 503 indicates (wrongly!) to them the existence of the folders.

I did some testing to try to work out what is going on and think I am a lot closer to at least identifying the underlying cause.

I am using a Let's Encrypt SSL certificate for my domain, but you cannot apply their certificates to an IP address.

So, when the PCI tester attempts to visit a folder using the IP address instead of the domain, they get an SSL error. Below is one of about 700 near-identical reports from them.

Evidence
HTTP Response Code: 503
URL: https://111.222.333.444:443/sm/

I think they get a 503 for EVERY folder they test so they assume they all exist as they are not getting 404s?

I tried to curl the URL from another server I have and get the following:


[root@xyz ~]# curl -IL https://111.222.333.444:443 | grep "^HTTP\/"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (51) Unable to communicate securely with peer: requested domain name does not match the server's certificate.
[root@xyz ~]#

I think it is essentially telling me it cannot communicate as there is no matching SSL certificate for the IP address.

I suspect the solution is to auto-redirect all attempts to connect directly to the IP address to the domain?

Doing some digging, I found a proposed solution that says to insert the following into httpd.conf, which I did, and it had no effect.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^https://111.222.333.444
RewriteRule (.*) http://example.com/$1 [R=301,L]
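A side note on the RewriteCond just above, separate from whether the request reaches Apache at all: %{HTTP_HOST} holds the value of the Host request header, which carries only the host name and an optional port, never the scheme. A pattern that starts with "https://" therefore cannot match it. The quick check below uses Python's `re` module as a stand-in for the regex engine in mod_rewrite, which behaves the same way for patterns this simple; the `bare_ip` pattern is a hypothetical alternative, not something from the original suggestion.

```python
import re

# Pattern copied from the RewriteCond in the post.
posted = re.compile(r"^https://111.222.333.444")

# Hypothetical alternative: bare address, optional port, dots escaped.
bare_ip = re.compile(r"^111\.222\.333\.444(:\d+)?$")

# Typical Host header values: no scheme, sometimes a port.
for host in ("111.222.333.444", "111.222.333.444:443", "example.com"):
    print(host, bool(posted.search(host)), bool(bare_ip.search(host)))
```

The posted pattern matches none of the three values, while the alternative matches only the two IP forms.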

Another suggestion was to include the following in httpd.conf, again I see no impact. Yes, I did restart httpd after each change.

##this is for http redirection to domain name###
<VirtualHost *:80>
ServerName www.example.com
ServerAlias *
Redirect / https://www.example.com/
</VirtualHost>

##This is for https redirection to domain name###
<VirtualHost *:443>
ServerName www.example.com
ServerAlias *
Redirect / https://www.example.com/
</VirtualHost>

So I am somewhat at a loss on how to direct this traffic to my domain or at least return a 404 instead of the 503.

[edited by: phranque at 2:02 am (utc) on Nov 27, 2021]
[edit reason] unlinked urls [/edit]

lucy24

1:34 am on Nov 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> I think the fact that they get back a 503 indicates (wrongly!) to them the existence of the folders
I don’t follow the reasoning here.

phranque

2:25 am on Nov 27, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> I think the fact that they get back a 503 indicates (wrongly!) to them the existence of the folders.

it indicates no such thing:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.4
10.5.4 503 Service Unavailable

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

10.5.1 500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.


> curl: (51) Unable to communicate securely with peer: requested domain name does not match the server's certificate.
> ...
>
> I think it is essentially telling me it cannot communicate as there is no matching SSL certificate for the IP address.
>
> I suspect the solution is to auto-redirect all attempts to connect directly to the IP address to the domain?

it is telling you that it cannot complete the secure handshake required for the https connection.
this means it never gets to the point of the HTTP Request reaching your web server, so you'll never get a chance to issue the 301 response.

> I am using a Let's Encrypt SSL certificate for my domain, but you cannot apply their certificates to an IP address.

this is the problem you must solve.
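For anyone trying to reproduce the handshake failure described above, here is a minimal Python sketch. The certificate name check is performed by the client, so a client that verifies (like curl in the output quoted earlier) gives up before any HTTP request is sent. A client that skips verification may behave differently, and whether the PCI scanner skips it is an assumption, not something the thread establishes. `make_context` and `head_status` are hypothetical helper names.

```python
import http.client
import ssl

def make_context(verify=True):
    """Build a TLS context; verify=False skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    if not verify:
        ctx.check_hostname = False       # must be turned off before verify_mode
        ctx.verify_mode = ssl.CERT_NONE  # accept any certificate (testing only)
    return ctx

def head_status(address, path="/", verify=True, timeout=10):
    """Send a HEAD request over HTTPS and return the numeric status code."""
    conn = http.client.HTTPSConnection(
        address, 443, timeout=timeout, context=make_context(verify)
    )
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()

# Example calls (they need network access, so they are left commented out):
#   head_status("203.0.113.10", "/sm/")                # expected to raise ssl.SSLCertVerificationError on a name mismatch
#   head_status("203.0.113.10", "/sm/", verify=False)  # returns whatever status the server sends, if any
```

Comparing the unverified result with the server's access logs is one way to tell whether a reported status came from the web server or from the testing tool.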

wex65

1:22 pm on Nov 27, 2021 (gmt 0)

Top Contributors Of The Month



"it indicates no such thing:"

I know what you mean and agree, maybe I worded it badly but what I meant to say is that they have an issue with the presence of a 503 as opposed to a 404.

I didn't realize, but I see from the post above that Apache isn't even 'seeing' the hits, as the lack of an SSL certificate for the IP address prevents them from getting as far as Apache.

I see there is a mod_security way to prevent hits to the IP address. I will look into whether this will be possible without the SSL being present.

Thanks for the direction, it helps to know that the reason my changes to httpd.conf weren't having an effect is that the hits weren't making it to Apache.

phranque

7:33 pm on Nov 27, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> ... they have an issue with the presence of a 503 as opposed to a 404.

503 implies a temporary status.
that means that the next similar request might be a 200 or a 404 or anything else and that's what they need to know.

if you cannot find these 503 responses in your web server access log files, that means that the "503" is being generated/reported by their testing server or reporting tool.