Saturday, 26 November 2016

Companies are making Fun Robots.txt - Usefullytips

Here I am going to share a few do's and don'ts for robots.txt, along with a few examples of companies that got creative with their robots.txt files.

Robots.txt is probably the most boring subject in SEO. There is rarely an interesting problem to solve in this file; most errors come from misunderstanding the directives or from typos. The main purpose of robots.txt is simply to tell crawlers where they can and cannot go.

Some basic directives you need to know in robots.txt

User-agent — specifies which robot.
Disallow — suggests the robots not crawl this area.
Allow — allows robots to crawl this area.
Crawl-delay — tells robots to wait a certain number of seconds before continuing the crawl.
Sitemap — specifies the sitemap location.
Noindex — an unofficial directive that asks Google to remove pages from the index.
# — comments out a line so it will not be read.
* — matches any text.
$ — the URL must end here.
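
The directives above can be tried out with Python's standard `urllib.robotparser` module. This is only a minimal sketch: the rule set, the crawler name `ExampleBot` and the host `www.example.com` are made up for illustration. Note that Python's parser applies the first matching rule and does not interpret `*` or `$` inside paths, whereas Google picks the most specific rule, so the `Allow` line is placed first here. `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, written only to exercise the directives listed above.
ROBOTS_TXT = """\
# Lines starting with '#' are comments and are skipped.
User-agent: *
Allow: /private/help.html
Disallow: /private/
Crawl-delay: 10

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the hypothetical crawler `agent` may fetch `path`."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/secret.html", "/private/help.html"):
        print(path, is_allowed(path))
    print("crawl delay:", parser.crawl_delay("ExampleBot"))
    print("sitemaps:", parser.site_maps())
```

A polite crawler would call something like `is_allowed()` before every request and sleep for the reported crawl delay between requests.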

Some of the things you need to know about robots.txt

syntax for robots.txt is

Each subdomain should also have its own robots.txt; the file on one host is not the same as the file on another.

robots.txt is only advisory, so it can be ignored by some crawlers or spiders

Both URLs and the paths in robots.txt are case sensitive

Google does not honour Crawl-delay; you can manage the crawl rate setting in Google Search Console instead
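
Two of the points above, one robots.txt per host and case-sensitive paths, can be illustrated with a short Python sketch. The helper name `robots_url`, the `example.com` hosts, the crawler name `ExampleBot` and the single `Disallow` rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the robots.txt URL that governs `page_url` (same scheme and host)."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Hypothetical rule used to show that path matching is case sensitive.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /Private/"])

if __name__ == "__main__":
    # Each host resolves to its own robots.txt file.
    print(robots_url("https://www.example.com/shop/item?id=1"))
    print(robots_url("https://blog.example.com/post"))
    # Only the path with the exact same casing is blocked.
    print(parser.can_fetch("ExampleBot", "https://www.example.com/Private/a.html"))
    print(parser.can_fetch("ExampleBot", "https://www.example.com/private/a.html"))
```

Because the rule is written as `/Private/`, the lowercase `/private/` path is not covered by it, which is an easy way to leave a section crawlable by accident.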

According to Google’s Gary Illyes, you should allow CSS and JS:


    User-Agent: Googlebot
    Allow: .js
    Allow: .css
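
To get a feel for how the `*` and `$` characters behave in rules for script and stylesheet files, here is a rough Python sketch of wildcard matching. It assumes Google-style semantics (`*` matches any run of characters, a trailing `$` anchors the end of the URL, and rules match from the start of the path). The function names and the patterns in the usage section are hypothetical examples written with a leading slash; they are not the rules quoted above.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression.

    Assumes Google-style matching: '*' matches any run of characters and a
    trailing '$' anchors the end of the URL; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` matches `pattern` from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    print(matches("/*.js$", "/static/app.js"))        # ends in .js
    print(matches("/*.js$", "/static/app.js?v=2"))    # query string after .js
    print(matches("/*.css", "/theme/site.css?v=2"))   # no '$', so a prefix match
```

Leaving off the `$` is usually the safer choice for asset rules, since versioned URLs often carry a query string after the file extension.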

  • Declare your robots.txt in both Google Search Console and Bing Webmaster Tools
  • Eric Enge of Stone Temple Consulting says that Noindex in robots.txt works, whereas John Mueller (Google Webmaster Trends Analyst) advises against using it; he says it is better to use noindex via meta robots or X-Robots-Tag.
  • The maximum size of a robots.txt file that Google will process is 500KB
  • Don't block crawlers as a way of avoiding duplicate content
  • Never disallow pages that are redirected, because spiders will not be able to follow the redirect
  • Disallowing pages prevents previous versions from being shown in
  • Go to and search; you will see older versions of robots.txt
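
The size limit mentioned in the list above is easy to check before you upload a file. This is a small Python sketch, assuming that "500KB" means 500 * 1024 bytes and that the file is UTF-8 encoded; the function name `robots_size_report` is made up for the example.

```python
MAX_ROBOTS_BYTES = 500 * 1024  # assumption: "500KB" read as 500 * 1024 bytes


def robots_size_report(text, limit=MAX_ROBOTS_BYTES):
    """Return (size_in_bytes, within_limit) for robots.txt content."""
    size = len(text.encode("utf-8"))
    return size, size <= limit


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow:\n"
    print(robots_size_report(sample))
```

In practice very few files come anywhere near the limit; it mostly matters for sites that generate thousands of rules automatically.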

Funny robots.txt files from a few companies

Many companies put their brand's logo in their robots.txt file; here are some sample robots.txt files.

ASCII art and job openings: as we all know, Nike's slogan is "Just Do It", and in the same way they put the slogan "just crawl it" in their robots.txt file and also included their logo.

Seer robots.txt file

TripAdvisor robots.txt file

Fun robots

Yelp has included Asimov's Three Laws in its robots.txt file:

# As always, Asimov's Three Laws are in effect:
# 1. A robot may not injure a human being or, through inaction, allow a human
#    being to come to harm.
# 2. A robot must obey orders given it by human beings except where such
#    orders would conflict with the First Law.
# 3. A robot must protect its own existence as long as such protection does
#    not conflict with the First or Second Law.

YouTube, in its robots.txt file:

# robots.txt file for YouTube
# Created in the distant future (the year 2000) after
# the robotic uprising of the mid 90's which wiped out all humans.

Page One Power, in its robots.txt:

#This is not the droid you're looking for.
#but we are the link builders you've been looking for.

In Google's killer-robots.txt file, both Larry Page and Sergey Brin are safe from Terminators

Reddit robots.txt file

What is humans.txt?

The humans.txt project defines it this way: "It's an initiative for knowing the people behind a website. It's a TXT file that contains information about the different people who have contributed to building the website."


