Search Engines Explained
How do Search Engines Work?
With billions of web pages and over 2 billion keyword searches per day, you can begin to understand the complexity of the search engine's job and the mass of data that it has to handle.
Each search engine has software robots called “spiders” or “bots” which search the Web and visit web pages to look for words that occur in the title, subtitles, meta tags and other positions that the engine considers relevant within the site. They operate with sophisticated and often changing algorithms. The spiders also find and follow any hyperlinks that appear on the site, which lets them not only visit more pages but also “rank” your site in relation to its content.
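As a rough illustration of what a spider does on a single page, here is a minimal Python sketch that pulls out the page title and the hyperlinks it would follow next. The names PageScanner and scan_page are made up for this example, and a real spider does far more (fetching pages, obeying robots.txt, reading meta tags and headings), so treat it as a sketch rather than how any actual search engine works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScanner(HTMLParser):
    """Collects the page title and every hyperlink found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_page(base_url, html):
    """Return (title, list of absolute links) for one page of HTML."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    return scanner.title.strip(), scanner.links
```

The links returned by scan_page are the pages the spider would queue up to visit next.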
Google also caches a copy of the site on its initial “spidering” and, on subsequent visits, compares the latest version with the cached copy to see whether there have been any changes. A site with no changes looks less relevant and possibly outdated and may, subject to other variables, be “marked down” in Google's indexing system.
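How Google actually compares a page with its cached copy is not public. One simple way to picture the idea is to store a fingerprint of the page at the first visit and compare it with a fingerprint of the latest version. The function names below are invented for this sketch, and using a SHA-256 digest is an assumption made for illustration only.

```python
import hashlib


def fingerprint(page_text):
    """Return a stable digest of the page content."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def has_changed(cached_fingerprint, latest_text):
    """True if the latest page text differs from the cached version."""
    return fingerprint(latest_text) != cached_fingerprint
```

Storing only the fingerprint keeps the comparison cheap, although it can only say that something changed, not what changed.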
In the simplest example, a search engine could just store each word and the URL where it was found. In reality, though, this would make for an engine of limited use, since there would be no way of determining whether the word was used in an important or a trivial way on the page, whether the word was used once or many times, or whether the page contained links to other pages containing the same word. In other words, there would be no way of building the ranking list that tries to present the most useful pages at the top of the list of search results.
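The simplest index described above can be sketched in a few lines of Python. The function name build_simple_index and the example URLs are made up for illustration.

```python
from collections import defaultdict


def build_simple_index(pages):
    """Map each word to the set of URLs where it was found.

    pages is a dict of URL -> page text.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


pages = {
    "http://a.example": "Cheap red widgets",
    "http://b.example": "Red paint red brushes",
}
index = build_simple_index(pages)
print(sorted(index["red"]))
```

Both pages come back for "red", but the index cannot tell that the second page uses the word twice, or where on the page it appears, so there is nothing to rank the results by.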
To give more useful and relevant results, most search engines store more than just the word and URL. A search engine might store the number of times that the word appears on a page. The engine might also assign a weight to each entry, with higher values assigned to words that appear near the top of the document, in sub-headings, in links, in the meta tags or in the title of the page. Each commercial search engine has a different algorithm for assigning weight and value to the words in its index. This is one of the reasons that a search for the same word on different search engines will produce different lists, with the pages often presented in a different order.
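A weighted index of this kind might look something like the Python sketch below. The weights in FIELD_WEIGHTS are invented for the example (real engines keep theirs secret), and the field names "title", "heading" and "body" are assumptions about how a page has been split up before indexing.

```python
from collections import defaultdict

# Hypothetical weights, chosen only for illustration.
FIELD_WEIGHTS = {"title": 5.0, "heading": 3.0, "body": 1.0}


def build_weighted_index(pages):
    """pages maps URL -> {field name: text}. Returns word -> {URL: score}."""
    index = defaultdict(lambda: defaultdict(float))
    for url, fields in pages.items():
        for field, text in fields.items():
            weight = FIELD_WEIGHTS.get(field, 1.0)
            for word in text.lower().split():
                # Every occurrence adds the weight of the field it sits in.
                index[word][url] += weight
    return index


def search(index, word):
    """Return the URLs containing a word, highest score first."""
    scores = index.get(word.lower(), {})
    return sorted(scores, key=scores.get, reverse=True)
```

With these weights, a page that has a word in its title and once in its body scores 6.0 for that word, while a page that only mentions it twice in the body scores 2.0. Change the weights and the order of the results can change too, which is why two engines can rank the same pages differently.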
Search engine indexing has a single purpose: it allows information to be found as quickly as possible by a searcher. There are quite a few ways for an index to be built, but one of the most effective is to build a hash table. In hashing, a formula is applied to attach a numerical value to each word. The formula is designed to distribute the entries evenly across a predetermined number of divisions. This numerical distribution is different from the distribution of words across the alphabet, where some letters start far more words than others, and that is the key to a hash table’s effectiveness and one of the reasons that results across millions of pages can be returned so quickly.
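To show the mechanics, here is a small hand-rolled hash table in Python. The formula in bucket_for is a simple polynomial hash picked for this sketch, and NUM_BUCKETS is an arbitrary value; neither is taken from any real search engine. Python's built-in dict is itself a hash table, so in practice you would rarely write this by hand.

```python
NUM_BUCKETS = 8  # the "predetermined number of divisions", chosen arbitrarily


def bucket_for(word, num_buckets=NUM_BUCKETS):
    """Turn a word into a bucket number with a simple polynomial hash."""
    value = 0
    for ch in word:
        value = (value * 31 + ord(ch)) % 2**32
    return value % num_buckets


class HashIndex:
    """Word -> set of URLs, stored in a fixed number of buckets."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [[] for _ in range(num_buckets)]

    def add(self, word, url):
        bucket = self.buckets[bucket_for(word, len(self.buckets))]
        for stored_word, urls in bucket:
            if stored_word == word:
                urls.add(url)
                return
        bucket.append((word, {url}))

    def lookup(self, word):
        # Only one bucket is scanned, not the whole index.
        bucket = self.buckets[bucket_for(word, len(self.buckets))]
        for stored_word, urls in bucket:
            if stored_word == word:
                return urls
        return set()
```

A lookup computes the bucket number from the word and then checks only the handful of entries in that bucket, however many words the whole index holds.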
The algorithms are kept secret so that the ranking system cannot easily be identified and then manipulated by webmasters to get their sites a high ranking. Similarly, the algorithms are changed and enhanced on a regular basis, not only to ensure that relevant and fresh content is shown to searchers but also to reduce the chances of any manipulation.
Google, the top search engine with over 50% of all web traffic, is constantly striving to stay at the top of its game and at the cutting edge of the “search game”, which is why SEO is always an ongoing strategy.
That's why keeping your site high in the SERPs should be considered part of an overall ongoing strategy, one that develops and adapts as algorithms change to reflect the dynamic nature of “search” and the Internet.
Most small businesses struggle to do this, which is one of the reasons why Website SEO have introduced SEO Packages at a fixed price in order to keep you and your site at the “leading edge”.