

webmaster tools - external links

how do you tell what's a scraper?

         

needhelp

9:53 am on Apr 21, 2007 (gmt 0)

10+ Year Member



First, my apologies. I was reading a great post talking about the function in Webmaster tools to view your external links (for verified sites) and how you can use it to find sites "to file DMCA notices". I've been up for hours reading posts (sadly yes) and now can't find that particular post again!

In the list of external pages linking to my site (main and internal pages), there are hundreds of URLs that, when I click on them from the tools page, take me to a site that has no mention of my site anywhere (so clearly it's not a valid inbound link). Are these sites somehow hijacking or spamming me or scraping? I don't even know what the right terms are here...sorry.

Why are these pages showing up as pages linking to me according to Webmaster tools if they have no link to me at all?

Thanks!

tedster

4:26 pm on Apr 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are at least two possibilities here:

1. The report is wrong (bugs do show up in GWT more often than I am comfortable with)
2. The link is there but "cloaked" so they only show it to googlebot and not to regular browsers

You can easily check for basic "user agent" cloaking by using Firefox with the User Agent Switcher extension installed, so your browser identifies itself as Googlebot when it requests the page in the report. If they are taking the lazy way to cloak - and many do - the page you get this way would show your link.
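If you'd rather script the same check than click through hundreds of URLs, here is a minimal sketch in Python. It requests a page twice, once with a browser-style user agent and once with a Googlebot-style one, and compares whether a link to your domain shows up. The function names, the browser string, and the result labels are all my own assumptions for illustration, and like the extension it will only catch the lazy user-agent kind of cloaking, not cloaking keyed to Googlebot's IP addresses.

```python
import re
import urllib.request

# Assumed values for illustration only; any ordinary browser string will do.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch(url, user_agent, timeout=10):
    """Download a page while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def links_to(html, domain):
    """Return True if any href value in the HTML mentions the domain."""
    hrefs = re.findall(r'href\s*=\s*["\']?([^"\'\s>]+)', html, flags=re.IGNORECASE)
    return any(domain.lower() in href.lower() for href in hrefs)


def classify(browser_html, bot_html, domain):
    """Compare what a browser sees with what a Googlebot-style request sees."""
    seen_by_browser = links_to(browser_html, domain)
    seen_by_bot = links_to(bot_html, domain)
    if seen_by_bot and not seen_by_browser:
        return "possible user-agent cloaking"
    if seen_by_browser:
        return "link visible to everyone"
    return "no link found either way"


def check(url, domain):
    """Fetch the page both ways and classify the result (needs network access)."""
    return classify(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA), domain)
```

You would call something like check("http://page-from-the-report/", "yourdomain.com") for each URL in the report. A "no link found either way" result fits either a reporting bug or a cloak that this simple test can't see.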

"Scraping" would mean they took your content and are publishing it on their website. If this causes them to rank where your page normally would, then they have "hijacked" your position in Google. "Spam" or "webspam" is a more general word that means using any kind of trick to rank higher than a page would on its own. Cloaking, scraping, and hijacking are all tools that some people use to spam Google.

needhelp

6:23 am on Apr 24, 2007 (gmt 0)

10+ Year Member



Great reply, thanks! I'll try that. If I had 3 wishes: health, happiness, a world without spammers!

needhelp

9:37 pm on Apr 24, 2007 (gmt 0)

10+ Year Member



Any chance I can get help on how to create a user agent for Googlebot? I've installed the User Agent Switcher, but now I don't know how to switch it to Googlebot under the options, add user tab... thanks in any case for reading.

tedster

9:51 pm on Apr 24, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



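You can just grab the string from your server logs. Mine shows it as Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html), where the logger has written spaces as + signs. With real spaces, which is what you'd enter in the switcher, that's Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)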
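
If your logs write spaces as + signs (some server log formats do), the string needs un-escaping before it goes into the switcher or a script. A rough sketch in Python, assuming a lone + stands for a space and a doubled ++ stands for a space followed by a literal +; the function name is made up, and the rule is only a heuristic that happens to fit this particular string:

```python
import re


def log_ua_to_header(logged):
    """Turn a logged user agent with '+' for spaces back into header form.

    Heuristic: '++' becomes ' +' (space, then a literal plus);
    a single '+' becomes a space.
    """
    return re.sub(r"\+(\+?)", lambda match: " " + match.group(1), logged)


logged = "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"
print(log_ua_to_header(logged))
# prints: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
```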