
Forum Moderators: Ocean10000 & incrediBILL


unusual

   
9:52 am on Mar 15, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Anybody have a clue?
The first one I simply accepted as trash.
The second one makes me wonder, although CC has many open proxies.

I've broken the link in the UA.
This page is loosely related (at least in name) to an active news topic.

112.119.106.zz - - [15/Mar/2012:00:52:38 +0000] "HEAD /MyFolder/MySub/MyPage.html HTTP/1.1" 403 - "http:// googlenewssubmit. com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; Media Center PC 6.0; InfoPath.2; MS-RTC LM 8)"

173.12.249.zzz - - [15/Mar/2012:06:28:45 +0000] "HEAD /SameFolder/SameSub/SamePage.html HTTP/1.0" 200 - "http:// googlenewssubmit. com/how-can-it-help-me/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2; OfficeLiveConnector.1.4; OfficeLivePatch.1.3; yie8)"
4:03 pm on Mar 15, 2012 (gmt 0)



Googlenewssubmit is a commercial press release company, promising they can get you to the top of Google News using various (invalid?) methods.

They are essentially advertising themselves by sending referrer spam into your logs, hoping that you will check out their prices and buy their "services". In itself an "invalid method". :-)

In my categorizations they fall under "link_spammer".
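
For anyone who wants to tag these automatically, here is a rough sketch of how a log line could be sorted into a category such as "link_spammer". It assumes the Apache combined log format shown in the first post. The `SPAM_REFERER_HOSTS` table and the `categorize` function are made-up names for illustration, not anything the poster described using, and the space-stripping only exists because the referrers quoted in this thread were deliberately broken.

```python
import re
from urllib.parse import urlsplit

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical lookup table. Only the category name "link_spammer"
# comes from the thread; the structure is an assumption.
SPAM_REFERER_HOSTS = {"googlenewssubmit.com": "link_spammer"}


def categorize(line):
    """Return a category for the line's referrer, or None if it is unlisted."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None  # not a combined-format line
    # The lines quoted in this thread have spaces inserted to break the link.
    referer = match.group("referer").replace(" ", "")
    host = (urlsplit(referer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SPAM_REFERER_HOSTS.get(host)
```

Matching on the referrer host rather than on the IP or the UA seems the safer choice here, since the two requests above came from different addresses with different UA strings but the same referring domain.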
3:08 pm on Mar 23, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



The problem is that SEs nowadays pull URLs out of page content (not just HTML anchors), and unfortunately this thread is no exception, since it posts log records. Perhaps that's one reason they spam this way.
3:19 pm on Mar 23, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



enigma1,
Could you expand?

Are the SE's pulling out URL's that are NOT embedded links?
4:55 pm on Mar 23, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



There are rumors they do. There are several threads implying this and at least from my logs I see weird accesses.

These posts talk about Googlebot trying to interpret JS, but what's important to note is that it parses content and "interprets" it.
[webmasterworld.com...]
[webmasterworld.com...]
and various other threads I cannot recall right now.
9:36 pm on Mar 23, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Are the SE's pulling out URL's that are NOT embedded links?

Yes, there are mountains of them in gwt error pages-- sometimes even when there is a link wrapped around the damaged text.

Concrete example under "not found":

hovercraft/h..

That's quoted verbatim, dots and all. If you follow the "linked from" links you arrive eventually at a page with the world's spammiest meta tags and a list of urls, including--

Wait, I've got to do some more verbatim quoting under the vague head of "With friends like these..."

<td width="580">
<div class="msnresult">
<div style="margin-bottom:5px; padding-left: 8px;">
<a href="http://www.example.com/{directory}/{filename}.html" target="_blank" class="msneresult" rel="nofollow">{page title} - {my domain name}</a>
<div class="msnresultcnt">
{text of my meta description}</div><span class="msnresulturl">http://www.example.com/hovercraft/h...</span></div></div></td>


Notice (a) the teeny-weeny detail that the "not found" version snips off one more dot-- truncated urls on the page always have three-- and (b) they seem to have decided that "nofollow" doesn't count on this page. The dot-snipping doesn't kick in after a fixed number of characters, though it may be some physical length in pixels. I ran out of interest at this point ;)

What's notable is that the "real" link, unsnipped, is only two lines away. But that one doesn't count as a link. (I checked a different gwt page.)

Someone, somewhere, programmed a computer to make these decisions.
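
Nobody outside the search engines knows how their extraction actually works, so the following is only a guess at the kind of logic that would produce the behaviour described above. It is a minimal sketch of a parser that collects URLs from anchor `href` attributes and, separately, anything URL-shaped in the visible text. The class name `LinkAndTextUrls`, the `extract` helper, and the regular expression are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Anything that looks like an absolute http(s) URL in plain text.
URL_IN_TEXT = re.compile(r'https?://[^\s<>"]+')


class LinkAndTextUrls(HTMLParser):
    """Collect URLs from <a href> and, separately, from text nodes."""

    def __init__(self):
        super().__init__()
        self.href_urls = []
        self.text_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.href_urls.append(value)

    def handle_data(self, data):
        # Text between tags: a truncated display URL is picked up here,
        # trailing dots included.
        self.text_urls.extend(URL_IN_TEXT.findall(data))


def extract(html):
    parser = LinkAndTextUrls()
    parser.feed(html)
    parser.close()
    return parser.href_urls, parser.text_urls
```

Fed a snippet like the one quoted above, a parser of this kind returns the full address from the `href` and the dotted, truncated address from the span text as two separate candidates. A crawler that then requests the text candidate would log exactly the sort of "not found" entry described, though how the real one trims the dots is not something this sketch tries to reproduce.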
 
