Forum Moderators: Robert Charlton & goodroi


PDFs used in hacked sites - algo "vulnerable" to links in pdfs

         

Leosghost

1:04 am on Jul 13, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I haven't seen this mentioned here, so...

Sophos threat hunter Dmitry Samosseiko says attackers are hacking sites and implanting hundreds of thousands of malicious PDF files a day to build a new cloaking system that foils Google's search algorithm analysis.

Link farmers bust Google search algos
[theregister.co.uk...]

---

Mod's note: OP's original title and description lines for this thread read...
title: PDFs used in hacked sites to "poison" Google SERPS
description: Algo is "vulnerable" to links in PDFs

Since we generally don't use description lines in this particular forum, and the system has a length limit, it was necessary to shorten these to get it all in, as I feel both were important to what this thread's about. Note that "search poisoning" came from a quote in the article.


[edited by: Robert_Charlton at 4:49 am (utc) on Jul 13, 2015]

ken_b

2:38 am on Jul 13, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There was a thread about this a couple of days ago:

[webmasterworld.com...]

Robert Charlton

5:52 am on Jul 13, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



LG, thanks for posting this. The earlier thread ken_b mentions in the post above is in OpenGoogle, in Supporters, so won't be accessible to many members. It's good to have a discussion here, particularly because it clearly relates to another current thread in this forum....

Google SERPs: What does Viagra have to do with chalkboards?
https://www.webmasterworld.com/google/4756776.htm

I noted in the chalkboard thread, which is about spammy SERPs from hacked sites that were also cloaked, that my test queries were returning a lot of PDFs. Perhaps a fourth to a third of the hacked results I saw were PDFs, a much larger share than I'd expect for the basic product searches I was trying. It's now very clear that they were related to what Sophos describes in the article. The button pushers are likely different, but chances are that spamming with PDFs has been a hot topic in black hat circles.

That said, it's not at all clear to me from The Register's article, or from the Sophos blog articles that inspired it, whether Sophos established that the link juice came from links in the pdfs. (I'm not arguing that one way or the other. There have been some studies published that haven't been able to find an effect.)

IMO, it's doubtful that PDFs used on an ordinary scale, without cloaking, would be effective in any way. From all I've found on the topic, Google has so far claimed that PDFs don't pass PageRank... and there isn't even a "nofollow" attribute for individual PDF links, so up until now, at least, Google hasn't been concerned about them.

Also, on a normal scale, spam by PDFs would be quickly spotted... and you can pull up the cloaked PDFs by appending filetype:pdf to any Google search you're following.
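For anyone who wants to repeat that check across a list of queries, here is a minimal Python sketch. It only builds the search URL with filetype:pdf appended; it doesn't fetch or parse anything. The helper name pdf_only_search_url and the sample query are my own assumptions for illustration, not anything from the article.

```python
from urllib.parse import urlencode


def pdf_only_search_url(query: str) -> str:
    """Build a Google search URL for `query` restricted to PDF results.

    Hypothetical helper for illustration: it just appends the
    filetype:pdf operator and URL-encodes the resulting query string.
    """
    restricted = f"{query} filetype:pdf"
    return "https://www.google.com/search?" + urlencode({"q": restricted})


if __name__ == "__main__":
    # Sample query is an assumption; substitute whatever search you're following.
    print(pdf_only_search_url("chalkboards"))
```

Paste the printed URL into a browser to see only the PDF results for that query.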

Judging from the Register article, and from what I've seen in running my test searches for the "chalkboard" thread, we are talking about a massive number of spammed files here, hundreds of thousands. On a much, much smaller scale, comparisons I've run over the years for HTML files vs PDF files have suggested that while PDFs would always rank, they'd rank quite a bit lower than HTML equivalents of the same information.