The `Disallow:` directive is added in order to comply with:
* the `robots.txt` specification (http://www.robotstxt.org/), which
specifies that: "At least one Disallow field needs to be present
in a record"
* what is suggested in the documentation of most of the major search
engines, e.g.:
- Baidu: http://www.baidu.com/search/robots_english.html
- Google: https://developers.google.com/webmasters/control-crawl-index/docs/getting_started
  http://www.youtube.com/watch?v=P7GY1fE5JQQ
- Yandex: http://help.yandex.com/webmaster/controlling-robot/robots-txt.xml
Besides the addition specified above, this commit also:
* adds a comment making it clear to everyone that the directives from
the `robots.txt` file allow all content on the site to be crawled
* updates the URL to `www.robotstxt.org`, as `robotstxt.org` doesn't
quite work:
curl -LsS robotstxt.org
curl: (7) Failed connect to robotstxt.org:80; Operation timed out
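For reference, a minimal sketch of what the resulting `robots.txt` could look like after these changes (the exact comment wording and line order are illustrative, not quoted from the commit):

    # Allow crawling of all content (illustrative comment wording)
    # www.robotstxt.org/
    User-agent: *
    Disallow:

An empty `Disallow:` value excludes nothing, so together with `User-agent: *` the file explicitly allows all robots to crawl the entire site while still containing the Disallow field the specification requires.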
Close h5bp#1487.