403 Forbidden WP-CONTENT

         

cheegum

10:27 am on Aug 29, 2018 (gmt 0)

10+ Year Member



My /wp-content/ folder returns 403 Forbidden because I have disabled directory indexing. This has now started showing up in Webmaster Tools as Access Denied. Why is Google crawling /wp-content/uploads, and how do I stop them from crawling these folders?

Images are still accessible via their full URLs.

TorontoBoy

1:10 pm on Aug 29, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



All images and attachments are stored in your /wp-content/uploads directories, so Google and most other bots just go directly to the source and bypass your page. For indexing just images, Google does not want your web page content, just the image. This is very common. I think you should enable these directories.

Can you disallow viewing the directory index but still allow direct access to the files for anyone who already knows the URLs? It is easy for a bot to scrape your page and extract only the image and download links; scripts that do this are common. The bot then has the direct URLs. Google and others do this.

tangor

4:00 pm on Aug 29, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google does not want your web page content, just the image


Rather amazing statement if you think about it. The images are available via the normal content, so it does beg the question why the need to bypass?

Leosghost

4:17 pm on Aug 29, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The images are available via the normal content, so it does beg the question why the need to bypass?

So they can cache a copy and wrap ads around it* rather than send the searcher to the site.
If they don't do it now, they will; this is "just in case we need them".

not2easy

6:06 pm on Aug 29, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You can stop the crawling with Disallow: in robots.txt for the folders you don't want crawled. They may still list these as "blocked resources" but they won't crawl if disallowed.

If you want them to have access to the images, you can use "Disallow:" for the folders and "Allow:" for the */*.jpg, */*.png or the filenames you use within those disallowed folders. Use Allow after Disallow.

Google will follow your instructions, but other search engines may or may not. robots.txt doesn't block the robots; it just tells them your preferences.
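As a rough illustration of the Disallow/Allow combination described above, here is a minimal robots.txt sketch. The /wp-content/uploads/ path is the WordPress default and the .jpg/.png extensions are assumptions; adjust both to your own site. The Allow patterns are written with the full folder prefix because Google documents that it applies the most specific (longest) matching rule rather than going by the order of the lines, so an Allow pattern shorter than the Disallow path would likely lose. Crawlers that don't support the * and $ wildcards may ignore the Allow lines entirely.

```text
# Sketch only: paths and extensions are assumptions, adjust to your site.
User-agent: *

# Ask crawlers not to fetch the uploads folder itself...
Disallow: /wp-content/uploads/

# ...but still permit the image files inside it.
# "*" matches any run of characters, "$" anchors the end of the URL.
Allow: /wp-content/uploads/*.jpg$
Allow: /wp-content/uploads/*.png$
```

The file goes in the site root as /robots.txt. It is worth checking a few real image URLs and one folder URL against it in the search engine's robots.txt testing tool before relying on it.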