Or what about the number of visitors to each web page? Or even the most popular paths through the site, or the time spent on a web page?
If Google, for example, wants the most popular pages to have a higher ranking, these details would certainly carry some weight, yet I have not found anyone addressing this.
Now, site popularity and link relevance can be truly established IF web logs are spidered. And if they are, why not tailor web logs just the way we want them, with thousands of fictitious referrers, keywords and time-spent figures?
BTW: "log files" in this case means the access log data, i.e. lines of information in text format written by the server to record the access activity of users on a site or server.
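To make the BTW above concrete, here is a minimal Python sketch that pulls the fields out of one access log line. It assumes the server writes the Apache-style "combined" format; the thread does not say which format is in use, so treat that as a guess. The sample line, the addresses and the name parse_line are made up for illustration.

```python
import re

# Assumed layout: Apache-style "combined" log format (not confirmed by the thread).
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None

# Hypothetical log line, invented for this example.
SAMPLE = ('192.0.2.10 - - [08/Nov/2002:21:44:00 +0000] '
          '"GET /widgets.html HTTP/1.0" 200 5120 '
          '"http://www.example.com/links.html" "Mozilla/4.0"')

entry = parse_line(SAMPLE)
print(entry["path"], entry["referer"])
```

The requested page and the referrer are the two fields this discussion is really about; time spent on a page is not recorded directly and would have to be inferred from the timestamps of successive requests.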
I have no idea where you are going with this but .. if you really expect there is any value in people falsifying their own logs and getting them read by a search engine ..
Who would you want to find and visit them from that search engine?
:-)
I don’t think search engines have access to log files. If they did, then I would pretend I was a spider and do some serious study of log file data. I really like log file data.
Maybe search engines keep track of how many times your pages showed up on their SERPs, and maybe they even track when you get clicked. However, as you hinted above, it is possible to fabricate these types of results, so I don’t think it would be given much weight towards relevancy.
However, why fabricate them? To index them, Google and others would have to find a link to the log report page, and you don't want your readers seeing fabricated info, do you? So much for the credibility of your web site. The practice itself is also a form of "deceiving search engines", which Google states is a quick way of getting penalties. Not only are you deceiving search engines, you are also deceiving your readers. Why not just make a nice index or web map instead?
At best, a published log file might have a similar effect to a site map as far as the internal links are concerned. If that is what you're heading for, why not just do a real site map that also benefits your users?
The other side is the external links, which may place you in a bad neighbourhood if left unchecked. Of course, if you want to specifically link to certain sites, creating an honest links page would be the obvious solution. Again, this will have the same effect for the engines, and benefit your human visitors a lot more than a "fabricated logfile".
I think the reason why nobody is doing what you suggest is that at least the same results can be had with other and more useful tools, while avoiding some of the potential problems.
What about link popularity, or the amount of traffic your web page is getting from external links? If SEs track popularity by click-throughs, and if they do that through web logs, then fabricating the amount of traffic could be a very easy and very powerful thing to do.
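For reference, "traffic coming from external links" as it would be read out of a log is nothing more than a tally of referrer values. A rough Python sketch follows, assuming the referrer strings have already been extracted from the log; the function name external_referrers and all the hosts are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def external_referrers(referers, own_host):
    """Count hits per referring host, ignoring our own site and blank entries."""
    counts = Counter()
    for ref in referers:
        host = urlsplit(ref).hostname  # None for "-" or other non-URL values
        if host and host != own_host:
            counts[host] += 1
    return counts

# Hypothetical referrer values as they might appear in an access log.
sample = [
    "http://www.example.org/links.html",
    "http://www.example.org/resources.html",
    "http://www.example.com/index.html",            # our own site
    "-",                                            # no referrer sent
    "http://search.example.net/results?q=widgets",
]

counts = external_referrers(sample, "www.example.com")
print(counts.most_common())
```

Since these are just strings in a text file that the site owner controls, any count built this way is exactly as trustworthy as the person who wrote the file.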
No?
I believe this means that they (legitimate search engines, not spambots) will NOT recognize URLs that are not part of a hyperlink, so the mere presence of "my-favorite-somthing-site.com/something_index.hmtl(*)" in a file will NOT add page rank to that page, nor will it associate this forum with a bad neighborhood around that page.
All it will do is find this thread when someone is looking for 'super adult site in neighborhood with favorite page rank'.
(*)I had included the "http://" bit on that (fake) URL above, but it occurred to me that the forum software might turn that into a real hyperlink, which would have made my statement false. But that's the forum software, not the search engine spider.
[edited by: msgraph at 9:44 pm (utc) on Nov. 8, 2002]
[edit reason] changed url to something more friendly [/edit]
Suppose you have a log file that saves the "REFERER" field, and someone in a "bad neighborhood" (say, the index.htm file at the vicious-spam.com domain) links to you. Now your log file will contain the text string "vicious-spam.com/index.htm" -- but that is not a URL: it is merely a text string containing dots, dashes, letters, and slashes. The robot is only going to recognize a string as a URL if it begins with the magic word "http://" (which is the protocol) AND is formatted like a URL, or else if it appears in HTML context as a URL (say, in the HREF field of an A tag).
Neither of which is true of your logfile.
So the 'bot will either just index the words ("vicious", "spam", "com", "index", and "htm") or it will completely ignore them because it's looking for a recognizable URL. And when someone searches for "vicious spam com index htm" -- the search engine results may include your logfile. But if someone searches for backlinks to their site, the results will NOT include your logfile.
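The rule described above can be sketched in Python roughly as follows. This is only an illustration of the two conditions named in the post (a string that starts with "http://", or an HREF inside an A tag), not how any real search engine spider is implemented. The names LinkFinder and find_links and the sample log line are made up, and the log line stores the referrer without the protocol, as the post assumes.

```python
import re
from html.parser import HTMLParser

# Assumption from the post: bare text only counts as a URL if it starts with http://
BARE_URL_RE = re.compile(r'http://[^\s"<>]+')

class LinkFinder(HTMLParser):
    """Collect HREF values of A tags plus bare http:// strings found in text."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTML context: the HREF attribute of an A tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Plain text: only strings that begin with the protocol
        self.links.extend(BARE_URL_RE.findall(data))

def find_links(text):
    finder = LinkFinder()
    finder.feed(text)
    finder.close()
    return finder.links

# Hypothetical log line with the referrer stored as described in the post.
log_line = ('192.0.2.10 - - [08/Nov/2002:21:44:00 +0000] '
            '"GET / HTTP/1.0" 200 512 "vicious-spam.com/index.htm"')

print(find_links(log_line))
print(find_links('<a href="http://www.example.com/page.html">a page</a>'))
```

Under these assumptions the log line yields no links at all, while the A tag yields one, which is the distinction the post is drawing.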
Good point! And if that site is a good quality site, those fictitious hits going to your site would be great - no?
Hi arlin, welcome to WebmasterWorld.
BTW, to be more precise, I should restate that last sentence as: Think of this exploit as the ability to create a link on a hallway page on someone else's website. These scripts generally just link back to a single page on the offending site.
As for the quality of the link, particularly for those chasing PR, I'd think it would be very low. But think of the volume of links such a spider could create. And there are those who believe that even low-quality links add up in the algos.
Weblogs, also called blogs, are a kind of online diary. They are updated daily, contain lots of links and are heavily inter-linked in cliques and groups. They also tend to link to similar pages (person A sees the link on the site of friend B and decides to link to it on their own weblog, and so on). Blogs were in the past responsible for pretty much all so-called 'Google bombing', because that network of different pages, frequent updates, strong community and hundreds of links gave them the power to strongly influence page rank. Google has since tried to lessen their influence a bit.