| 7:48 am on Sep 10, 2009 (gmt 0)|
It is difficult to judge your claim of a sneaky tactic without seeing the actual URL.
But Google started using optical character recognition (OCR) to extract text from PDFs in late 2008.
Basically, it takes snapshots of the PDFs as input, runs OCR on them, and indexes the extracted text just like regular text.
If it can see the text, might it be seeing the links too?
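One rough way to check by hand whether a PDF carries links, visible or not, is to look for URI actions in the raw file. The sketch below is only an illustration in Python: the function name `find_uri_links` and the sample bytes are made up for this post, and it will miss links stored inside compressed object streams or written as hex strings.

```python
import re

# Matches a PDF URI action entry such as: /URI (http://example.com/page)
# Assumption: the target is a plain literal string without nested parentheses.
URI_PATTERN = re.compile(rb"/URI\s*\(([^)]*)\)")


def find_uri_links(pdf_bytes):
    """Return link targets found in uncompressed parts of a PDF (hypothetical helper)."""
    return [m.decode("latin-1") for m in URI_PATTERN.findall(pdf_bytes)]


# Made-up fragment of a link annotation's action dictionary, for illustration only.
sample = b"<< /Type /Action /S /URI /URI (http://example.com/page) >>"
print(find_uri_links(sample))
```

For a real file, the bytes would come from something like `open(path, "rb").read()`. An empty result does not prove there are no links, only that none are stored in plain form.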
If you want the geeky details about OCRopus, the open source OCR software that Google sponsors, refer to: [code.google.com...]
(If you have Acrobat Pro 9, you can see the option under Documents => OCR Text Recognition => Recognize Text using OCR)
| 3:10 am on Sep 14, 2009 (gmt 0)|
If your tool is showing you something that you can't verify by manual inspection, especially something that doesn't seem logical or rational...
Well, I would think about the value of that tool for a minute, and just perhaps ... I'd ask the tool maker what was up with that before I scrapped it.
| 6:56 am on Sep 14, 2009 (gmt 0)|
Hi guys, thanks for taking the time to give your thoughts on the subject. I think you're both on to something, as I can see that all the PDFs actually contain the brand name of my competitor (he has a generic name: let's say it was "stool"). So the OCR technology at least recognizes the keyword "stool" as important for the PDF. However, I don't know how the connection to my competitor's website is made (there are no visible links). Perhaps, like Claus says, it is an error (I just don't believe this, as my competitor works in the field of SEO). There must be more to it, but what?
| 8:08 am on Sep 14, 2009 (gmt 0)|
I do not know how your competitors are getting links from this, but I know for sure that such links do work. Try to get some links from internal pages of sites with high domain PR and you will see the results.
| 9:14 am on Sep 14, 2009 (gmt 0)|
Crosscheck your tool with [search.yahoo.com...]
In the following search, replace example.com with the domain name in question.
linkdomain:example.com site:.gov
Does it show links from those PDF files?
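If you need to run that check for several domains, the query string can be put together with a few lines of Python. This is just a sketch: the function names are made up here, and it only builds and URL-encodes the query text, since the full search URL is not given above.

```python
from urllib.parse import quote_plus


def build_linkdomain_query(domain, site_filter=".gov"):
    """Build the 'linkdomain:... site:...' query text (hypothetical helper)."""
    return f"linkdomain:{domain} site:{site_filter}"


def encode_query(query):
    """URL-encode the query so it can be used as a search parameter value."""
    return quote_plus(query)


query = build_linkdomain_query("example.com")
print(query)
print(encode_query(query))
```

Swap `example.com` for the domain in question, and change `site_filter` if the PDFs are hosted somewhere other than .gov sites.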