I have a page on my website that alphabetically lists links to around 100 directories. Late last year I started seeing an excessive number of requests from Googlebot in my logs: it was appending a slash after the file extension and requesting nonexistent URLs built off this one page, such as:
example.com/a-zIndex.htm/ExampleDirectory1/ExampleDirectory36/ExampleDirectory8/ExampleDirectory65/anotherPage.htm
example.com/a-zIndex.htm/ExampleDirectory16/ExampleDirectory26/ExampleDirectory81/YetAnotherPage.htm
example.com/a-zIndex.htm/ExampleDirectory18/ExampleDirectory16/ExampleDirectory84/ExampleDirectory94/ExampleDirectory4/YetAnotherPageAgain.htm
...
...
etc
To my horror, I discovered that these requests were all resolving, so I redirected example.com/a-zIndex.htm/ to example.com/a-zIndex.htm, thinking that would sort things out. Now I am seeing over 69,000 URLs in GWT returning 404s. When I click the tab showing where each page is linked from, the URLs listed there are also 404 Not Found, yet the date on some of them shows the page was first discovered only 4 days ago, which is about 3 weeks after I set up the redirect.
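For reference, the redirect I set up looks roughly like this (a minimal sketch, assuming an Apache server with mod_rewrite enabled in .htaccess; adjust for your setup):

```apache
# Sketch only: send anything under the phantom /a-zIndex.htm/ "directory"
# back to the real page with a permanent redirect.
RewriteEngine On
RewriteRule ^a-zIndex\.htm/.+$ /a-zIndex.htm [R=301,L]
```

The 301 should tell Google the phantom URLs have permanently moved, but as described above it seems to be taking a long time to process them.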
My alphabetical listing page has disappeared from the SERPs, yet I notice Google is happy to include one or two of the now nonexistent pages in its results.
What would be the best way to handle this problem? I'm thinking of blocking Googlebot's access to the nonexistent directory /a-zIndex.htm/.
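If I go the robots.txt route, I assume it would look something like this (a sketch; disallowing only the phantom subtree, since the trailing slash means the real page /a-zIndex.htm itself stays crawlable):

```
User-agent: Googlebot
Disallow: /a-zIndex.htm/
```

Is that the right approach, or is it better to let the 404s/redirect get processed naturally?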