Sunday, May 27, 2012

How to block pages from Search Engines

Sometimes you do not want some of your webpages to show up in search engines. You can arrange this easily by telling the search engine crawlers not to crawl or index those pages. There are three ways of doing this, as given below:-

First Method:- With the help of Robots.txt

Robots.txt is a special file, placed in the root directory of the website, which directs the search engine crawlers not to crawl certain webpages. Example contents:-

User-agent: *
Disallow: /cgi-bin/
Disallow: /banner/
Disallow: /~flower/
 
These directives disallow all robots from accessing the /cgi-bin/, /banner/ and /~flower/ directories.
  
User-agent: *
Disallow: /
 
These directives disallow all robots from accessing any content on the server.
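
If you want to check how a crawler would read rules like these, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are copied from the first example above; the helper name is_allowed and the sample paths are only assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the first robots.txt example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /banner/
Disallow: /~flower/
"""


def is_allowed(path, user_agent="*"):
    """Return True if the rules above let user_agent fetch path."""
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical paths, used only to show the effect of the rules.
print(is_allowed("/banner/ad.html"))
print(is_allowed("/index.html"))
```

The first call reports that the path is blocked and the second that it is allowed. Keep in mind that robots.txt is only a request: well-behaved crawlers follow it, but nothing forces a robot to obey it.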
 

Second Method:- With the help of Meta Robots tag

The Meta Robots tag, placed in the head section of a page, tells the search engine bots whether to index that page. The noindex value keeps the page out of the index. Some examples are given below:-

<meta name="robots" content="index,follow">

This tag directs the search engine robots to index the content and follow the links on the page.

<meta name="robots" content="noindex,nofollow">

This tag directs the search engine robots not to index the content and not to follow the links on the page.
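
To confirm which directives a page actually carries, you can read its Meta Robots tag with a short script. This is only a sketch built on Python's standard html.parser module; the names RobotsMetaParser and robots_directives are made up for this example, and real crawlers may apply extra rules of their own.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma separated, e.g. "noindex,nofollow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print("noindex" in robots_directives(page))
```

A page with no Meta Robots tag gives an empty set, which search engines treat the same as index,follow.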


Third Method:- IP Blocking

This method blocks certain IP addresses from accessing the server. You can set it up with the help of an .htaccess file.
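
As a rough illustration, the sketch below builds the lines you could paste into an .htaccess file. It assumes an Apache 2.4 server, where access rules are written with Require directives (Apache 2.2 uses the older Order, Allow and Deny directives instead). The function name htaccess_block_rules is hypothetical, and the addresses come from ranges reserved for documentation, so replace them with the ones you really want to block.

```python
import ipaddress


def htaccess_block_rules(addresses):
    """Build Apache 2.4 rules that allow everyone except the given IPs."""
    lines = ["<RequireAll>", "    Require all granted"]
    for address in addresses:
        address = address.strip()
        # Raises ValueError if the text is not a valid IP or CIDR range.
        ipaddress.ip_network(address, strict=False)
        lines.append(f"    Require not ip {address}")
    lines.append("</RequireAll>")
    return "\n".join(lines)


# Placeholder addresses from the documentation-only ranges.
print(htaccess_block_rules(["192.0.2.10", "198.51.100.0/24"]))
```

This prints:

<RequireAll>
    Require all granted
    Require not ip 192.0.2.10
    Require not ip 198.51.100.0/24
</RequireAll>

Note that this only helps against search engines if you know the IP addresses their crawlers use.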