web.config indicate 410 pages

Tell SEs that numerous pages should be de-indexed

     
3:51 pm on Dec 24, 2018 (gmt 0)

Senior Member from US 


joined:Sept 4, 2001
posts:2297
votes: 100


First, a quick back story. I have a website that was hacked in June. The hack was not discovered for several days because it was rather ingenious, and by then the damage was already done.

The hack was a VBScript inserted into the index.asp file. If you typed the web address directly, it took you to the website. However, if it detected an SE bot, or the referrer was a SERP link, it directed you to a site selling counterfeit sports memorabilia, while using the site's own web address, which I will refer to as MYURI, as the source website. There were two variants of the resulting URLs:
MYURI/?spam_page/
and
MYURI/?spam_page.html

The damage was that SEs, mostly Google, indexed over 13,500 of those spam pages, despite the fact that the sitemap.xml only ever listed 12 pages for the site. Needless to say, this has knocked the website out of the SERPs.

Despite all best efforts, I cannot get Google to remove those 13,500+ pages. I have tried a few rules in the web.config file to report the pages as 410, but I am apparently missing something and web.config files are not my forte.

What I am looking for is a rule or condition that detects any URI with a question mark (?) as that seems to be the only constant factor, and redirect them to a 410.asp page. Any help would be appreciated.

As an FYI, the site was moved to a new host and all the .htm pages were changed to .asp pages. The current web.config file is:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<system.webServer>
<defaultDocument>
<files>
<clear />
<add value="index.asp" />
<add value="default.asp" />
<add value="default.aspx" />
<add value="index.php" />
<add value="index.htm" />
<add value="index.html" />
</files>
</defaultDocument>
<rewrite>
<rules>
<rule name="410 Response" patternSyntax="Wildcard" stopProcessing="true">
<match url="*http://MYURI/?*" />
<action type="Redirect" url="https://MYURI/410.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 1" stopProcessing="true">
<match url="^index\.htm$" />
<action type="Redirect" url="https://MYURI/index.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 2" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 3" stopProcessing="true">
<match url="^v\.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 4" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page .asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 5" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 6" stopProcessing="true">
<match url="^godspell- another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 7" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 8" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 9" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 10" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 11" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 12" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page .asp" redirectType="Permanent" />
</rule>
<rule name="301 Redirect 13" stopProcessing="true">
<match url="^ another-page \.htm$" />
<action type="Redirect" url="https://MYURI/ another-page.asp" redirectType="Permanent" />
</rule>
</rules>
</rewrite>
<httpProtocol>
<customHeaders>
<remove name="X-Powered-By-Plesk" />
</customHeaders>
</httpProtocol>
<httpErrors>
<remove statusCode="502" subStatusCode="-1" />
<remove statusCode="501" subStatusCode="-1" />
<remove statusCode="500" subStatusCode="-1" />
<remove statusCode="412" subStatusCode="-1" />
<remove statusCode="406" subStatusCode="-1" />
<remove statusCode="405" subStatusCode="-1" />
<remove statusCode="404" subStatusCode="-1" />
<remove statusCode="403" subStatusCode="-1" />
<remove statusCode="401" subStatusCode="-1" />
<remove statusCode="400" />
<error statusCode="400" path="D:\www\MYURI\error_docs\bad_request.html" />
<remove statusCode="407" />
<error statusCode="407" path="D:\www\MYURI\error_docs\proxy_authentication_required.html" />
<remove statusCode="414" />
<error statusCode="414" path="D:\www\MYURI\error_docs\request-uri_too_long.html" />
<remove statusCode="415" />
<error statusCode="415" path="D:\www\MYURI\error_docs\unsupported_media_type.html" />
<remove statusCode="503" />
<error statusCode="503" path="D:\www\MYURI\error_docs\maintenance.html" />
<error statusCode="401" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\unauthorized.html" />
<error statusCode="403" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\forbidden.html" />
<error statusCode="404" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\not_found.html" />
<error statusCode="405" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\method_not_allowed.html" />
<error statusCode="406" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\not_acceptable.html" />
<error statusCode="412" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\precondition_failed.html" />
<error statusCode="500" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\internal_server_error.html" />
<error statusCode="501" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\not_implemented.html" />
<error statusCode="502" prefixLanguageFilePath="" path="D:\www\MYURI\error_docs\bad_gateway.html" />
</httpErrors>
</system.webServer>
</configuration>
8:02 pm on Dec 24, 2018 (gmt 0)

Preferred Member


joined:Feb 5, 2004
posts: 618
votes: 105


I had a client who had a similar type of hack, which was probably discovered after a week. This was probably 3 years ago. It didn't create new pages but changed the existing ones by inserting links and advertisements that were shown to SEs only. We removed the infected files and the pages slowly changed back in the search results.

Has your site not come back yet in the search results? Are you just worried because these pages show up in the Search Console as 404 errors? These are notoriously hard to get rid of, but I don't think they do any damage.
9:52 pm on Dec 24, 2018 (gmt 0)

Senior Member from US 


joined:Sept 4, 2001
posts:2297
votes: 100


The pages are still showing up as MYURI/?spam-page. According to G's Search Console, they have over 13,500 indexed. If I do a site:MYURI search, they appear in the results. I would have thought that since June they would have started to decline, but it appears they have not. The hack knocked the website from the 2 or 3 position on page 1 of the SERPs to pretty much non-existent. The site is very, very product specific and only 2 others are permitted to offer the product. Obviously, if MYURI is not appearing where it used to, it is devastating.
"Are you just worried because these pages show up in the Search Console as 404 errors?"
And yes. I want them to show up/respond as 410.
3:33 pm on Dec 25, 2018 (gmt 0)

Preferred Member


joined:Feb 5, 2004
posts: 618
votes: 105


They should have. I am wondering whether you have a manual or automated action against you because of the hacked site, but I am sure you have probably already checked that in the Search Console. You may have to resubmit your site for inclusion in the search engine.

I would maybe also post about this in the Google Search forum so more people can read about it.

There have also been a few search ranking changes since June that have affected a lot of sites, which may also account in part for the lower ranking.
7:14 am on Dec 26, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 887


"any URI with a question mark (?) as that seems to be the only constant factor"
Two general observations--obviously can’t be specific, as I don’t speak IIS.

#1 The question mark means that everything after it is interpreted as a query string, even if it looks like a normal URL ending in .html. Can the rule handle this as-is, or do you need a separate layer of instructions? (I’m thinking by analogy of Apache, which breaks the request into pieces: protocol, hostname, urlpath-as-such, and query. Everything other than the urlpath needs special handling.)
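For what it's worth, the IIS URL Rewrite module (which the web.config in the first post already uses) appears to split the request in much the same way. The fragment below is only a sketch of that split, not a fix: the rule name is made up, and the action is deliberately `None` so the rule does nothing except show where each piece of the request ends up.

```xml
<!-- Sketch only. For a request to https://MYURI/?spam_page.html,
     URL Rewrite hands the pieces to a rule roughly like this:
       match url        is the path only, with no scheme, host,
                        leading slash, or query string (empty here)
       {HTTP_HOST}      is MYURI
       {QUERY_STRING}   is spam_page.html (everything after the ?)
     So a match pattern containing "http://MYURI/?" never sees
     the text it is looking for. -->
<rule name="Query string demo (hypothetical)" stopProcessing="false">
  <match url=".*" />
  <conditions>
    <!-- The query string has to be tested in a condition,
         not in the match pattern. -->
    <add input="{QUERY_STRING}" pattern="^spam_page" />
  </conditions>
  <action type="None" />
</rule>
```

This would explain why the wildcard rule in the first post never fires: the query string is not part of what `match url` is tested against.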

#2 Never mind about GSC. Have search engines been crawling these bogus URLs, and if so, have they been receiving a 410 response? (You will know when they do, because GSC will then start reporting thousands of “errors” which you can happily ignore.)

If your legitimate URLs never contain query strings, it seems like a far smaller number of rules should do the trick. “If the request contains a ? then take such-and-such action”.
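In IIS terms, that single rule might look something like the sketch below. It assumes the URL Rewrite module is installed (the existing web.config suggests it is) and, importantly, that no legitimate page on the site ever uses a query string. The rule name and the status text are placeholders, not anything from the original config.

```xml
<!-- Sketch only: answer 410 Gone for any request that carries a
     non-empty query string. Assumes legitimate URLs on this site
     never use query strings; otherwise tighten the pattern. -->
<rule name="410 for query strings (hypothetical)" stopProcessing="true">
  <!-- Match every path; the real test is in the condition. -->
  <match url=".*" />
  <conditions>
    <!-- True when anything follows the ? in the request. -->
    <add input="{QUERY_STRING}" pattern=".+" />
  </conditions>
  <!-- CustomResponse sends the status code directly. A Redirect
       action would send a 301 first, and the bot would only see
       a 410 if the target page set that status itself. -->
  <action type="CustomResponse" statusCode="410"
          statusReason="Gone"
          statusDescription="This page has been permanently removed." />
</rule>
```

The rule would go inside the existing `<rules>` element, ahead of the 301 rules, in place of the current "410 Response" rule. Answering with the 410 directly, rather than redirecting to 410.asp, means the crawler gets the status on the spam URL itself.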