Google publishes a draft to formalize the Robots Exclusion Protocol specification


Google announced this morning that it had submitted a request for comments to the Internet Engineering Task Force (IETF) to formalize the Robots Exclusion Protocol (REP) specification, an informal internet standard that is 25 years old.

The announcement. Google wrote on its blog: "With the author of the protocol, webmasters, and other search engines, we documented the use of REP on the modern Web and submitted it to the IETF. The proposed REP draft reflects more than 20 years of real-life experience in using robots.txt rules, used by both Googlebot and other major crawlers, as well as about half a billion websites that rely on REP."

Nothing changes. I asked Google's Gary Illyes, who was part of this announcement, whether anything is changing, and he replied, "No, nothing at all."

So why do it? Because the Robots Exclusion Protocol has never been a formal standard, there is no official or definitive guide for keeping it up to date or for ensuring that a specific syntax is followed. Every major search engine has adopted robots.txt as a crawling directive, yet it is not an official standard. That is about to change.

Google open-sources its robots.txt parser. Alongside the draft, Google announced that it is open-sourcing the part of its system that parses robots.txt files. "We open-sourced the C++ library used by our production systems to parse and match the rules in robots.txt files," said Google. You can see this library on GitHub today if you want.
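To give a sense of what "parse and match the rules" means in practice, here is a minimal sketch in Python. It does not use Google's C++ library; it relies on the `urllib.robotparser` module from Python's standard library, whose matching behavior may differ from Google's parser in edge cases. The sample robots.txt body, the `examplebot` user agent, the `example.com` URLs, and the `is_allowed` helper are all hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: one group for every crawler,
# one group for a crawler that identifies itself as "examplebot".
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: examplebot
Disallow: /drafts/
"""


def is_allowed(robots_body: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt body.

    Illustrative helper (not part of any library): it parses the rules
    with Python's standard-library parser and asks it for a decision.
    """
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("examplebot", "https://example.com/drafts/post"),
        ("examplebot", "https://example.com/private/page"),
        ("otherbot", "https://example.com/private/page"),
        ("otherbot", "https://example.com/drafts/post"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

In this sketch a crawler that has its own group follows only that group, so `examplebot` is kept out of `/drafts/` but not `/private/`, while any other crawler falls back to the `*` group.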

Why we care. Nothing changes specifically today, but turning the protocol into a formal standard opens the door to change. Remember that the internet has used it as a de facto standard for 25 years without it ever becoming an official one. It is therefore unclear what will or may change in the future. For now, if you are building your own crawler, you can use Google's robots.txt parser to help you.


About the Author

Barry Schwartz is the news editor of Search Engine Land and owns RustyBrick, a web consulting firm based in New York. He also runs Search Engine Roundtable, a popular search blog covering SEM topics.


