{"id":7543,"date":"2019-11-23T11:24:26","date_gmt":"2019-11-23T11:24:26","guid":{"rendered":"https:\/\/www.brainlabsdigital.com\/?p=7543"},"modified":"2025-06-20T20:06:28","modified_gmt":"2025-06-20T20:06:28","slug":"web-based-robots-txt-parser-using-googles-open-source-code","status":"publish","type":"post","link":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/","title":{"rendered":"Web-based robots.txt parser using Google&#8217;s open-source Code"},"content":{"rendered":"<p>The punchline: I\u2019ve been playing around with a toy project recently and have deployed it as a free web-based tool for checking how Google will parse your robots.txt files, given that their own online tool does not replicate actual Googlebot behaviour. Check it out at <a href=\"https:\/\/www.realrobotstxt.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">realrobotstxt.com<\/a>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-7544\" src=\"https:\/\/brainlabdev.wpenginepowered.com\/wp-content\/uploads\/2020\/11\/free-web-based-robotstxt-parser.png\" alt=\"free web based robotstxt parser\" width=\"869\" height=\"531\" srcset=\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/free-web-based-robotstxt-parser.png 869w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/free-web-based-robotstxt-parser-300x183.png 300w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/free-web-based-robotstxt-parser-768x469.png 768w\" sizes=\"auto, (max-width: 869px) 100vw, 869px\" \/><\/p>\n<p>While preparing for my recent <a href=\"https:\/\/www.slideshare.net\/DistilledSEO\/searchlove-london-2019-will-critchlow-misunderstood-concepts-at-the-heart-of-seo-182776330\" target=\"_blank\" rel=\"noopener noreferrer\">presentation at SearchLove London<\/a>, I got mildly obsessed by the way that the deeper I dug into how robots.txt files work, the more surprising things I 
found, and the more places I found where there was conflicting information from different sources. Google\u2019s open source robots.txt parser should have made everything easy by not only complying with their <a href=\"https:\/\/webmasters.googleblog.com\/2019\/07\/rep-id.html\" target=\"_blank\" rel=\"noopener noreferrer\">newly-published draft specification<\/a>, but also by apparently being real production Google code.<\/p>\n<p>Two challenges led me further down the rabbit hole that ultimately led to me building a <a href=\"https:\/\/www.realrobotstxt.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">web-based tool<\/a>:<\/p>\n<ol>\n<li><span style=\"font-size: 1.05em;\">It\u2019s a C++ project, so needs to be compiled, which requires at least some programming \/ code administration skills, so I didn\u2019t feel like it was especially accessible to the wider search community<\/span><\/li>\n<li><span style=\"font-size: 1.05em;\">When I got it compiled and played with it, I discovered that it was <\/span><a style=\"font-size: 1.05em;\" href=\"https:\/\/twitter.com\/willcritchlow\/status\/1194282036581847040\" target=\"_blank\" rel=\"noopener noreferrer\">missing crucial Google-specific functionality<\/a><span style=\"font-size: 1.05em;\"> to enable us to see how Google crawlers like the images and video crawlers will interpret robots.txt files<\/span><\/li>\n<\/ol>\n<h2>Ways this tool differs from other resources<\/h2>\n<p>Apart from the benefit of being a web-based tool rather than requiring compilation to run locally, <a href=\"https:\/\/www.realrobotstxt.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">my realrobotstxt.com tool<\/a> should be 100% compliant with the <a href=\"https:\/\/github.com\/google\/robotstxt\/blob\/master\/protocol-draft\/draft-koster-rep-00.txt\" target=\"_blank\" rel=\"noopener noreferrer\">draft specification<\/a> that Google released, as it is entirely powered by their open source tool except for two specific changes that I 
made to bring it in line with my understanding of how real Google crawlers work:<\/p>\n<ol>\n<li><span style=\"font-size: 1.05em;\">Googlebot-image, Googlebot-video and Googlebot-news(*) should all fall back on obeying <strong>Googlebot<\/strong> directives if there are no rulesets <em>specifically<\/em> targeting their own individual user agents &#8211; we have <\/span><a style=\"font-size: 1.05em;\" href=\"https:\/\/twitter.com\/willcritchlow\/status\/1194307059292000257\" target=\"_blank\" rel=\"noopener noreferrer\">verified<\/a><span style=\"font-size: 1.05em;\"> that this is at least how the images bot behaves in the real world<\/span><\/li>\n<li><span style=\"font-size: 1.05em;\">Google has a range of bots (AdsBot-Google, AdsBot-Google-Mobile, and the AdSense bot, Mediapartners-Google) which <\/span><a style=\"font-size: 1.05em;\" href=\"https:\/\/support.google.com\/webmasters\/answer\/6062596?hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">apparently<\/a><span style=\"font-size: 1.05em;\"><strong> ignore<\/strong> User-agent: * directives and <em>only<\/em> obey rulesets specifically targeting their own individual user agents<\/span><\/li>\n<\/ol>\n<p>[(*) Note: unrelated to the tweaks I\u2019ve made, but relevant because I mentioned Googlebot-news, it is not widely known that <a href=\"https:\/\/twitter.com\/willcritchlow\/status\/1194284042885185537\" target=\"_blank\" rel=\"noopener noreferrer\">Googlebot-news is not a crawler<\/a> and apparently hasn\u2019t been since 2011. If you didn\u2019t know this, don\u2019t worry &#8211; you\u2019re not alone. I only learned it recently, and it\u2019s pretty hard to discern from the documentation, which regularly refers to it as a crawler. The only real official reference I can find is the <a href=\"https:\/\/webmasters.googleblog.com\/2011\/08\/google-news-now-crawling-with-googlebot.html\" target=\"_blank\" rel=\"noopener noreferrer\">blog post announcing its retirement<\/a>. 
I mean, it makes sense to me, because having different crawlers for web and news search opens up dangerous cloaking opportunities, but why then refer to it as a crawler\u2019s user agent throughout the docs? It <em>seems<\/em>, though I haven\u2019t been able to test this in real life, as though rules directly targeting Googlebot-news function somewhat like a <a href=\"https:\/\/twitter.com\/willcritchlow\/status\/1194284048870457346\" target=\"_blank\" rel=\"noopener noreferrer\">Google News-specific noindex<\/a>. This is very confusing, because regular Googlebot blocking does <strong>not<\/strong> keep URLs out of the web index, but there you go.]<\/p>\n<h3>I expect to see the Search Console robots.txt checker retired soon<\/h3>\n<p>We have seen a gradual move to <a href=\"https:\/\/searchengineland.com\/google-search-console-drops-preferred-domain-setting-318356\" target=\"_blank\" rel=\"noopener noreferrer\">turn off old Search Console features<\/a> and I expect that the robots.txt checker will be retired soon. 
Googlers have recently been referring to it being out of step with how their actual crawlers work &#8211; and we can see differences in our own testing:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-7545\" src=\"https:\/\/brainlabdev.wpenginepowered.com\/wp-content\/uploads\/2020\/11\/incorrect-googlebot-in-robotstxt_v2.png\" alt=\"incorrect googlebot in robotstxt\" width=\"960\" height=\"392\" srcset=\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/incorrect-googlebot-in-robotstxt_v2.png 960w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/incorrect-googlebot-in-robotstxt_v2-300x123.png 300w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/incorrect-googlebot-in-robotstxt_v2-768x314.png 768w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/><\/p>\n<p>These cases seem to be handled correctly by the open source parser &#8211; here\u2019s my <a href=\"https:\/\/www.realrobotstxt.com\" target=\"_blank\" rel=\"noopener noreferrer\">web-based tool<\/a> on the exact same scenario:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-7546\" src=\"https:\/\/brainlabdev.wpenginepowered.com\/wp-content\/uploads\/2020\/11\/robotstxt-parser-example.png\" alt=\"robotstxt parser example\" width=\"1160\" height=\"372\" srcset=\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/robotstxt-parser-example.png 1160w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/robotstxt-parser-example-300x96.png 300w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/robotstxt-parser-example-1024x328.png 1024w, https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/robotstxt-parser-example-768x246.png 768w\" sizes=\"auto, (max-width: 1160px) 100vw, 1160px\" \/><\/p>\n<p>This felt like all the more reason for me to release my web-based version, as the only official web-based tool we have 
is out of date and likely going away. Who knows whether Google will release an updated version based on their open source parser &#8211; but until they do, <a href=\"https:\/\/www.realrobotstxt.com\" target=\"_blank\" rel=\"noopener noreferrer\">my tool<\/a> might prove useful to some people.<\/p>\n<h3>I\u2019d like to see the documentation updated<\/h3>\n<p>Unfortunately, while I can make a <a href=\"https:\/\/github.com\/google\/robotstxt\/pulls\" target=\"_blank\" rel=\"noopener noreferrer\">pull request<\/a> against the open source code, I can\u2019t do the same with Google documentation. Despite implications out of Google that the old Search Console checker isn\u2019t in sync with real Googlebot, and hence shouldn\u2019t be trusted as the authoritative answer about how Google will parse a robots.txt file, references to it remain widespread in the documentation:<\/p>\n<ul>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/support.google.com\/webmasters\/answer\/6062608?hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">Introduction to robots.txt<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/developers.google.com\/search\/mobile-sites\/mobile-seo\/common-mistakes\" target=\"_blank\" rel=\"noopener noreferrer\">Avoid common mistakes<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/support.google.com\/webmasters\/answer\/6062596?hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">Create a robots.txt file<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/support.google.com\/webmasters\/answer\/6062598?hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">Test your robots.txt with the robots.txt Tester<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/support.google.com\/webmasters\/answer\/6078399?hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">Submit your updated robots.txt to Google<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" 
href=\"https:\/\/developers.google.com\/search\/docs\/guides\/debug\" target=\"_blank\" rel=\"noopener noreferrer\">Debugging your pages<\/a><\/li>\n<\/ul>\n<p>In addition, although it\u2019s natural that old blog posts might not be updated with new information, these are still prominently ranking for some related searches:<\/p>\n<ul>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/webmasters.googleblog.com\/2007\/08\/new-robotstxt-feature-and-rep-meta-tags.html\" target=\"_blank\" rel=\"noopener noreferrer\">New robots.txt feature and REP Meta Tags<\/a><\/li>\n<li><a style=\"font-size: 1.05em;\" href=\"https:\/\/webmasters.googleblog.com\/2014\/07\/testing-robotstxt-files-made-easier.html\" target=\"_blank\" rel=\"noopener noreferrer\">Testing robots.txt files made easier<\/a><\/li>\n<\/ul>\n<p>Who knows. Maybe they\u2019ll update the docs with links to <a href=\"https:\/\/www.realrobotstxt.com\" target=\"_blank\" rel=\"noopener noreferrer\">my tool<\/a> \ud83d\ude09<\/p>\n<h2>Let me know if it\u2019s useful to you<\/h2>\n<p>Anyway. I hope you find my tool useful &#8211; I enjoyed hacking around with a bit of C++ and Python to make it &#8211; it\u2019s good to have a \u201cmaker\u201d project on the go sometimes when your day job doesn\u2019t involve shipping code. If you spot any weirdness, have questions, or just find it useful, please drop me a note to let me know. You can find me <a href=\"https:\/\/twitter.com\/willcritchlow\" target=\"_blank\" rel=\"noopener noreferrer\">on Twitter<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The punchline: I\u2019ve been playing around with a toy project recently and have deployed it as a free web-based tool for checking how Google will parse your robots.txt files, given that their own online tool does not replicate actual Googlebot behaviour. Check it out at realrobotstxt.com. 
While preparing for my recent presentation at SearchLove London, [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":7436,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":true,"footnotes":""},"categories":[42],"tags":[],"class_list":["post-7543","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-seo"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Web-based robots.txt parser using Google&#039;s open-source Code - Brainlabs<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Web-based robots.txt parser using Google&#039;s open-source Code - Brainlabs\" \/>\n<meta property=\"og:description\" content=\"The punchline: I\u2019ve been playing around with a toy project recently and have deployed it as a free web-based tool for checking how Google will parse your robots.txt files, given that their own online tool does not replicate actual Googlebot behaviour. Check it out at realrobotstxt.com. 
While preparing for my recent presentation at SearchLove London, [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\" \/>\n<meta property=\"og:site_name\" content=\"Brainlabs\" \/>\n<meta property=\"article:published_time\" content=\"2019-11-23T11:24:26+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-20T20:06:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1136\" \/>\n\t<meta property=\"og:image:height\" content=\"445\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"claudiu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Brainlabs\" \/>\n<meta name=\"twitter:site\" content=\"@Brainlabs\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"claudiu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\"},\"author\":{\"name\":\"claudiu\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/d2633d055821dd28cb40492b806e23ff\"},\"headline\":\"Web-based robots.txt parser using Google&#8217;s open-source Code\",\"datePublished\":\"2019-11-23T11:24:26+00:00\",\"dateModified\":\"2025-06-20T20:06:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\"},\"wordCount\":917,\"publisher\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png\",\"articleSection\":[\"SEO\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\",\"url\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\",\"name\":\"Web-based robots.txt parser using Google's open-source Code - 
Brainlabs\",\"isPartOf\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png\",\"datePublished\":\"2019-11-23T11:24:26+00:00\",\"dateModified\":\"2025-06-20T20:06:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage\",\"url\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png\",\"contentUrl\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png\",\"width\":1136,\"height\":445},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.brainlabsdigital.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Web-based robots.txt parser using Google&#8217;s open-source Code\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#website\",\"url\":\"https:\/\/www.brainlabsdigital.com\/\",\"name\":\"Brainlabs\",\"description\":\"High-Performance Media 
Agency\",\"publisher\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.brainlabsdigital.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#organization\",\"name\":\"Brainlabs\",\"url\":\"https:\/\/www.brainlabsdigital.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2025\/04\/cropped-25-Brainlabs-Color-Logo-Icon-1-300x300.png\",\"contentUrl\":\"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2025\/04\/cropped-25-Brainlabs-Color-Logo-Icon-1-300x300.png\",\"width\":300,\"height\":300,\"caption\":\"Brainlabs\"},\"image\":{\"@id\":\"https:\/\/www.brainlabsdigital.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/Brainlabs\",\"https:\/\/www.linkedin.com\/company\/brainlabs-digital\/\",\"https:\/\/www.instagram.com\/brainlabs\/\",\"https:\/\/www.youtube.com\/@brainlabsmedia\/featured\",\"https:\/\/www.tiktok.com\/@brainlabsglobal?lang=en\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/d2633d055821dd28cb40492b806e23ff\",\"name\":\"claudiu\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/4814cc0c790a1d2fe26b7690b22344dfde27b79d9ac47148e73322e9553a795b?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/4814cc0c790a1d2fe26b7690b22344dfde27b79d9ac47148e73322e9553a795b?s=96&d=mm&r=g\",\"caption\":\"claudiu\"},\"url\":\"https:\/\/www.brainlabsdigital.com\/author\/claudi
u\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Web-based robots.txt parser using Google's open-source Code - Brainlabs","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/","og_locale":"en_US","og_type":"article","og_title":"Web-based robots.txt parser using Google's open-source Code - Brainlabs","og_description":"The punchline: I\u2019ve been playing around with a toy project recently and have deployed it as a free web-based tool for checking how Google will parse your robots.txt files, given that their own online tool does not replicate actual Googlebot behaviour. Check it out at realrobotstxt.com. While preparing for my recent presentation at SearchLove London, [&hellip;]","og_url":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/","og_site_name":"Brainlabs","article_published_time":"2019-11-23T11:24:26+00:00","article_modified_time":"2025-06-20T20:06:28+00:00","og_image":[{"width":1136,"height":445,"url":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png","type":"image\/png"}],"author":"claudiu","twitter_card":"summary_large_image","twitter_creator":"@Brainlabs","twitter_site":"@Brainlabs","twitter_misc":{"Written by":"claudiu","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#article","isPartOf":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/"},"author":{"name":"claudiu","@id":"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/d2633d055821dd28cb40492b806e23ff"},"headline":"Web-based robots.txt parser using Google&#8217;s open-source Code","datePublished":"2019-11-23T11:24:26+00:00","dateModified":"2025-06-20T20:06:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/"},"wordCount":917,"publisher":{"@id":"https:\/\/www.brainlabsdigital.com\/#organization"},"image":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage"},"thumbnailUrl":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png","articleSection":["SEO"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/","url":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/","name":"Web-based robots.txt parser using Google's open-source Code - 
Brainlabs","isPartOf":{"@id":"https:\/\/www.brainlabsdigital.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage"},"image":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage"},"thumbnailUrl":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png","datePublished":"2019-11-23T11:24:26+00:00","dateModified":"2025-06-20T20:06:28+00:00","breadcrumb":{"@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#primaryimage","url":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png","contentUrl":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2020\/11\/BL-Blog-Placeholder-V1.png","width":1136,"height":445},{"@type":"BreadcrumbList","@id":"https:\/\/www.brainlabsdigital.com\/web-based-robots-txt-parser-using-googles-open-source-code\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.brainlabsdigital.com\/"},{"@type":"ListItem","position":2,"name":"Web-based robots.txt parser using Google&#8217;s open-source Code"}]},{"@type":"WebSite","@id":"https:\/\/www.brainlabsdigital.com\/#website","url":"https:\/\/www.brainlabsdigital.com\/","name":"Brainlabs","description":"High-Performance Media 
Agency","publisher":{"@id":"https:\/\/www.brainlabsdigital.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.brainlabsdigital.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.brainlabsdigital.com\/#organization","name":"Brainlabs","url":"https:\/\/www.brainlabsdigital.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.brainlabsdigital.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2025\/04\/cropped-25-Brainlabs-Color-Logo-Icon-1-300x300.png","contentUrl":"https:\/\/www.brainlabsdigital.com\/wp-content\/uploads\/2025\/04\/cropped-25-Brainlabs-Color-Logo-Icon-1-300x300.png","width":300,"height":300,"caption":"Brainlabs"},"image":{"@id":"https:\/\/www.brainlabsdigital.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/Brainlabs","https:\/\/www.linkedin.com\/company\/brainlabs-digital\/","https:\/\/www.instagram.com\/brainlabs\/","https:\/\/www.youtube.com\/@brainlabsmedia\/featured","https:\/\/www.tiktok.com\/@brainlabsglobal?lang=en"]},{"@type":"Person","@id":"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/d2633d055821dd28cb40492b806e23ff","name":"claudiu","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.brainlabsdigital.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/4814cc0c790a1d2fe26b7690b22344dfde27b79d9ac47148e73322e9553a795b?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4814cc0c790a1d2fe26b7690b22344dfde27b79d9ac47148e73322e9553a795b?s=96&d=mm&r=g","caption":"claudiu"},"url":"https:\/\/www.brainlabsdigital.com\/author\/claudiu\/"}]}},"lang":"en","translations":{"en":7543},"pll_sync_post":[],"_links":{"self":[{"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v
2\/posts\/7543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/comments?post=7543"}],"version-history":[{"count":0,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/posts\/7543\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/media\/7436"}],"wp:attachment":[{"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/media?parent=7543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/categories?post=7543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.brainlabsdigital.com\/wp-json\/wp\/v2\/tags?post=7543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}