

Is Google getting site info by searching for non-webpage files?


Sgt_Kickaxe

6:09 pm on Jun 16, 2010 (gmt 0)



My GWT account lists an error that I had never seen before today.

A 401 "Authorization Required" error is showing up for a WordPress file called wp-app.php.

The page isn't linked from anywhere, and the 401 error coincides with a temporary (45-minute) sitewide .htaccess change that the new host put in place while the site was being moved from the old host.

Is Google looking for standard files associated with common CMSs in order to work out which content management system a site runs? Why would Google be requesting non-webpage files anyway?
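
One way to confirm what was actually requested is to scan the server's access log for hits on wp-app.php from a Googlebot user agent. Below is a minimal Python sketch, assuming the log is in Apache "combined" format. The function name `find_probes`, the sample lines, and the IP addresses are all hypothetical, made up for illustration. Note that a user-agent string can be faked, so a match here only shows that something calling itself Googlebot asked for the file.

```python
import re

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def find_probes(lines, path_fragment="wp-app.php", agent_fragment="Googlebot"):
    """Return (time, path, status) for requests matching both fragments."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        if path_fragment in m.group("path") and agent_fragment in m.group("agent"):
            hits.append((m.group("time"), m.group("path"), int(m.group("status"))))
    return hits


# Hypothetical sample lines, not taken from a real server.
SAMPLE_LOG = [
    '192.0.2.10 - - [16/Jun/2010:17:20:01 +0000] "GET /wp-app.php HTTP/1.1" 401 498 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [16/Jun/2010:17:21:09 +0000] "GET /index.php HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.12 - - [16/Jun/2010:17:22:30 +0000] "GET /wp-app.php HTTP/1.1" 404 312 '
    '"-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    for time, path, status in find_probes(SAMPLE_LOG):
        print(time, path, status)
```

If the matching timestamps fall inside the 45-minute window of the .htaccess change, that would support the idea that the 401 came from the temporary rule rather than from WordPress itself.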

tedster

11:30 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One reason might be to get a list of sites that could be vulnerable to hacks and malware hosting in the future. I've seen a number of "probes" like this recently.

claus

11:59 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All large search engines run 404 tests, asking for random files in order to see if the 404 page does something "suspicious".

However, in this case Google is searching for a part of WordPress and this will not give a 404 error. This is not a security probe: The file in question is a legitimate part of a WordPress install.

So, there is only one possible explanation: Google wants to know if you run WordPress or not.

As to the "why" ... Google's mission statement is to catalog all the world's information. AFAIK, this includes what kind of software you run your site on, what you had for breakfast, and the eye colour of your best friend's first pet.

Sgt_Kickaxe

12:07 am on Jun 17, 2010 (gmt 0)



My best friend's pet is blind, so that VW Bug in the laneway with the camera on the roof is going to be sitting there a long time.

Should I assume that an up-to-date WordPress install is also a ranking factor versus a non-updated install?

tedster

12:13 am on Jun 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To clarify my idea: given that WordPress hacks are epidemic, Google might well want to run a malware check more often on sites they know are running WordPress.

Sgt_Kickaxe

3:56 am on Jun 17, 2010 (gmt 0)



... and penalize sites that aren't running the latest version?

Or, as Google puts it, reward sites that ARE up to date?

Matt Cutts did go out of his way to emphasize using version 2.9.2 of WordPress in his latest I/O video. I like receiving instructions like that, but I despise having to guess which instructions have ranking penalties attached. Upgrading is good practice, but some people modify core files and are capable of patching without upgrading (with good reason not to upgrade, depending on the patch notes).

tedster

4:25 am on Jun 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"and penalize sites that aren't running the latest version?"

How about just checking in more often, especially after a new WordPress update? That would help Google catch malware more quickly, protecting their SERPs a bit more.

Sgt_Kickaxe

4:42 am on Jun 17, 2010 (gmt 0)



I'd like that solution better too. Surely the new Google Caffeine horsepower could be used to that end?

I've been trying to figure out why one of my sites lost some traffic recently (I have a couple that did). I've made posts in various forums about related topics, and I'm having a "duh" moment. The WordPress installation on this site is outdated by one release. Though I did make some changes using the changelog, I didn't download the full latest package, and the site reports that it's not version 2.9.2.

~ The Google cache for the index page is dated May 1st, the site is GoDaddy-hosted and GoDaddy just got attacked on a large scale, and Googlebot did come looking for WordPress-exclusive files...

A downgrade in trust seems not only plausible but likely. I deserve it, and I can fix it easily, so I've done so (sorta; it was upgraded but didn't say so). Now I wait and see how long it takes to get a current cache, with fingers crossed. Sometimes the simplest reasons are hard to spot! (old age?)

tangor

5:26 am on Jun 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sometimes the obvious is overlooked while looking for "reasons". My take? Google is shut down, licking its wounds from the Caffeine injection (too much data), with algos stressed in that regard, and... a belated realization that the web does not CHANGE all that fast. Bing and Yahoo still hammer my sites; Google is at 1/5th of that... and has been for the last three weeks. Not just yesterday or the day before...