The robots.txt file is one of the most important files on your site. It is a simple text file placed on your server that tells search engine spiders which sections or pages of your website or blog they should not crawl. You can create it in Notepad or any other plain text editor. The file must sit in the root directory of your site, for example seo-service-4-u.blogspot.com/robots.txt, because that is the only place robots look for it. It tells search engine robots and other robots which areas of your website or blog they are allowed to visit. Note that it controls crawling: a blocked URL can still appear in search results if other pages link to it. You can only have one robots.txt on your website or blog.
GOOD: seo-service-4-u.blogspot.com/robots.txt
BAD (won't work): seo-service-4-u.blogspot.com/directory/robots.txt
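As a small illustration of the root-directory rule, here is a minimal Python sketch that works out where a robot would look for robots.txt, starting from any page URL on the site. The function name robots_url and the http:// scheme are assumptions made for the example; the page path is made up.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host. The path is always /robots.txt at the
    # root, no matter how deep the page itself sits.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL inside a subdirectory of the blog.
print(robots_url("http://seo-service-4-u.blogspot.com/directory/page.html"))
# prints http://seo-service-4-u.blogspot.com/robots.txt
```

Whatever directory the page lives in, the result points at the root, which is why a copy placed in /directory/ is never read.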
If you are using WordPress, a sample robots.txt file would be:
User-agent: *
Disallow: /wp-
User-agent: * means that all search bots (Google, Yahoo, Bing, Ask, MSN, Alexa, GigaBlast, DMOZ Checker, Baidu and so on) should follow the instructions below it when they crawl your website.
Disallow: /wp- tells those bots not to crawl any URL whose path starts with /wp-, which covers the WordPress system folders such as /wp-admin/, /wp-includes/ and /wp-content/.
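To see how a crawler might interpret these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name allowed and the sample paths are assumptions for illustration, and real search engines may apply their own matching rules.

```python
from urllib.robotparser import RobotFileParser

# The sample WordPress rules, held in a string instead of fetched from a site.
RULES = """\
User-agent: *
Disallow: /wp-
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, bot: str = "Googlebot") -> bool:
    """Return True if the named bot may crawl the given path under RULES."""
    return parser.can_fetch(bot, path)


# Hypothetical paths: two WordPress system URLs and one ordinary post.
for path in ("/wp-admin/", "/wp-content/uploads/photo.png", "/2010/05/my-post.html"):
    print(path, "->", "allowed" if allowed(path) else "blocked")
```

Because the rule is a prefix match, everything that begins with /wp- is blocked, including uploaded images under /wp-content/, while ordinary post URLs stay crawlable.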
Web robots are sometimes referred to as web crawlers or spiders, so the process of a robot visiting your website is called "spidering" or "crawling". When we say that a search engine has spidered a website or blog, it means the search engine's robots have visited it. Each robot is known by a name and has its own IP address. The IP address is not important to us, but knowing the robot names helps when you create a robots.txt file, because rules are addressed to robots by name. This is why the file is called "robots.txt". The following is a list of popular search robots with their bot names:
Search Engine Robots
Engine ************* Bot
Google.com ************* Googlebot
Alexa.com ************* Ia_Archiver
MSN.com ************* Msnbot
Altavista.com ************* Scooter
Excite.com ************* ArchitextSpider
Euroseek.net ************* Arachnoidea
Gendoor.com ************* GenCrawler
Infoseek.com ************* UltraSeek
Hotbot.com ************* Slurp
Naver.com ************* Naverbot, Yeti
Looksmart.com ************* MantraAgent
Lycos.com ************* Lycos_Spider_(T-Rex)
Baidu.com ************* Baiduspider
Cuil.com ************* Twiceler
GigaBlast.com ************* Gigabot
Yuntis.com ************* Gulper
Teoma.com ************* Teoma_agent1
SearchHippo.com ************* Fluffy the spider
AlltheWeb.com ************* FAST-WebCrawler
Euroseek.com ************* Arachnoidea
Special-Purpose Bots
Google Image ************* Googlebot-Image
Google Mobile ************* Googlebot-Mobile
Yahoo MM ************* Yahoo-Mmcrawler
MSN PicSearch ************* Psbot
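Bot names like the ones listed above are what you put on a User-agent line when you want a rule to apply to one robot only. The sketch below, again using Python's standard urllib.robotparser, blocks only Googlebot-Image from a folder and leaves every other bot unrestricted. The /images/ folder and the file names are hypothetical, chosen just for the example.

```python
from urllib.robotparser import RobotFileParser

# One group for a single named bot, then a catch-all group for everyone else.
# An empty Disallow value means nothing is blocked for that group.
RULES = """\
User-agent: Googlebot-Image
Disallow: /images/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical checks: same file, different robots.
print(parser.can_fetch("Googlebot-Image", "/images/logo.png"))  # False
print(parser.can_fetch("Baiduspider", "/images/logo.png"))      # True
print(parser.can_fetch("Googlebot-Image", "/index.html"))       # True
```

The named group only restricts the bot it names, so the image crawler is kept out of /images/ while other robots can still crawl the whole site.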