I’ve heard a lot of people say that Google has lots of links pointing to it because it is a website that links to lots of people. I don’t think so.
No, I don’t mean I don’t believe that websites with lots of links end up being linked to by lots of people… More specifically, I do believe the abstract mathematical finding that if you build a digraph in such a way that the probability of linking one node to another depends on the number of links from the destination node, you get a power law in the number of nodes with each number of edges. What I don’t believe is that this tells anything close to the whole story about what goes on on the web.
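Just to make that abstract finding concrete, here is a minimal sketch of such a growth process. The details are my own assumptions for illustration, not a claim about any real crawl: each new node adds exactly one link, and the destination is picked with probability proportional to one plus the links it has already received (swap in whichever link count you prefer). The names `grow_graph` and `degree_histogram` are made up for this example.

```python
import random
from collections import Counter

def grow_graph(n_nodes, seed=0):
    """Grow a digraph one node at a time.

    Assumption: each new node creates one link to an existing node, chosen
    with probability proportional to (1 + links that node already received).
    Returns a list where entry i is the number of links received by node i.
    """
    rng = random.Random(seed)
    in_links = [0]   # node 0 starts alone, with no links
    # The urn holds every node once, plus once more per link it received,
    # so a uniform draw from the urn is a draw proportional to (1 + in-links).
    urn = [0]
    for new_node in range(1, n_nodes):
        target = rng.choice(urn)
        in_links[target] += 1
        urn.append(target)      # the target just gained a link
        in_links.append(0)
        urn.append(new_node)    # the new node enters with weight 1
    return in_links

def degree_histogram(in_links):
    """Count how many nodes have each number of received links."""
    return Counter(in_links)

in_links = grow_graph(10000)
hist = degree_histogram(in_links)
# Print the low end of the histogram; the counts fall off quickly while a
# few nodes collect many links.
for k in sorted(hist)[:8]:
    print(k, hist[k])
```

Plotting the full histogram on log-log axes is the usual way to eyeball the heavy tail. That is all this sketch shows: a heavy tail coming out of a very simple rule, and nothing about why anybody links to anything.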
Come on, do you really see that many Google links here and there? No, you don’t!! Because it makes no sense to link to Google. Whenever you link to Google it’s either to use it as a search tool inside your site, or to suggest that someone research a certain topic. These are not “normal” links! It’s not like “oh, I just read this amazing article criticising naïve CS students, you should read it.”
I don’t even think Google is a normal web page. It’s more of a meta-page. Its links are different. Google has no information, it just knows who has.
Wikipedia has information (note that false and fake information also counts here). It also has links, but often just to reinforce some information that is condensed there already.
Blogs have information. Blogs even have posts with no links at all! How about that?
Porn pages, which I believe are an important element of web browsing, can also be divided into the ones with just links and then the ones with the pictures. How about that? These pictures don’t link to any other pages, especially not to Google, but they are often the exact pages I’m looking for at the moment.
And here we come to another problem. People always talk about browsing as hopping from one page to another. That’s a lie. You hop until you find something you want. If you are satisfied, you stop, because you found it. Google does its job if you find what you want in a single link…
But of course sometimes you are just browsing. Then Google is useless, you need something more like Wikipedia, or APOD. Or reading scientific articles that cite other articles.
And we arrive at another concept I’ve been struggling with lately. People often say that if a website links to another, then they are alike… That’s not true. First of all, if you are on a page and see a link to another page that you believe will be just like that one, you wouldn’t click it. Or at least you shouldn’t, because it will be useless! If they are similar, what you need should be there already.
The dissimilar pages are precisely what we want. The pages with NEW information!…
And what about links such as “If you are looking for blahblah instead, then follow this other link”? Definitely, we must take a lot of care when we talk about what a page link means.
People oversimplify web pages and the browsing activity, trying to find a simple and cute “solution” to “the web enigma”. There is no such easy explanation or model of the web and of browsing.
I mean, look at your browser, dude! MULTIPLE TABS! If you are not into multiple tabs yet, you are stuck in the 20th century. Where do multiple tabs fit into the magical world of the Markov Web Experience?
And I want to know about the porn. Why is porn always denied when people talk about browsing? Perhaps just because people are ashamed of erotica… Or perhaps because it is a good example that doesn’t fit the model of web pages linking to similar pages that you go on switching between until you get linked back to Google again. What say you? Do porn pages carry this second subversiveness?
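For reference, the kind of memoryless model I am complaining about can be sketched in a few lines. This is only a toy under my own assumptions: the four-page `toy_web` graph is invented, `random_surfer_rank` is a hypothetical name, and the 0.85 damping value is just the commonly quoted choice. The surfer has one window, no memory and no goal: at each step it follows a random outgoing link or jumps to a random page.

```python
def random_surfer_rank(links, damping=0.85, iterations=50):
    """Power iteration for a memoryless surfer.

    With probability `damping` the surfer follows a random outgoing link of
    the current page; otherwise it jumps to a page chosen uniformly at random.
    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets the random-jump share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page with no outgoing links is treated as linking everywhere.
            targets = links[page] or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web, made up for this example.
toy_web = {
    "hub": ["article", "blog", "photo"],
    "article": ["blog"],
    "blog": ["article", "hub"],
    "photo": [],   # a page with no outgoing links at all
}

rank = random_surfer_rank(toy_web)
for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Notice what the state is: a single current page. There is no place in it for a second tab, for going back, or for stopping because you found what you wanted.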
There are multiple kinds of links, multiple intentions when the authors create links, and multiple intentions when the users are surfing. That includes following links and putting links with no clear intention. People go back and forth, people open stuff just to decide seconds later it was all crap.
And can someone tell me where Digg (and similar mechanisms) fits in all these discussions about browsing and linking? Browsing is all about trying links and only afterwards deciding if they were good or not. We don’t decide what we want and what we like at the moment we click.
We have a long road ahead studying how the web works and how we work inside it. Let’s start now by dropping these simplistic ideas about how we surf the Internet. Let’s stop and actually LOOK at what we do, at what our pages are like… I beg you, leave behind the amazement and amusement at the size of this wonderful digraph, and at the recent, forced learning of stochastic processes, and let’s move on to more sophisticated ideas and models of the web and its crawlers!
Even Google has Digg-ish mechanisms already. And people still tell the story about considering links as votes, etc. No, it’s not that simple, and it never was. Yes, it’s a beautiful model, and the other phenomena subject to it are still beautiful and fascinating. But isn’t it time for a reality check regarding the way we teach basic “web science”?…
Last updated: Sunday, 29 Mar 2009 - 06:26 UTC