What Is Robots.txt and How Do You Create It?

One of the most important tasks of search engine bots is to crawl and index websites. If you want to keep parts of your website away from search engine bots, the robots.txt file is one of the methods you can use. You need to be very careful when creating this file: a badly written robots.txt can make important parts of your website invisible to search engines, or, conversely, a wrong rule can let search engines crawl directories you never wanted crawled, or even cause your whole site to be ignored. So what should a robots.txt file look like from an SEO perspective, and what should you pay attention to? Let’s take a look together.

How to Create Robots.txt

The robots.txt file must be created according to certain standards and added to the root directory of your site. Pay attention to the following points when preparing it. The file must:
  • be located in the root directory of your website,
  • be encoded in UTF-8,
  • be reachable at your site URL followed directly by /robots.txt.
Correct:
  • Site URL: https://www.mobitek.com/
  • Robots.txt URL: https://www.mobitek.com/robots.txt
Incorrect:
  • Site URL: https://www.mobitek.com/
  • Robots.txt URL: https://www.mobitek.com/blog/robots.txt
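
A quick way to check all three points is to request robots.txt directly under your host name and confirm that it loads. Below is a minimal Python sketch of such a check (an illustration only; the mobitek.com address is simply the example URL from above):

    # Minimal sketch: confirm that robots.txt is served from the site root.
    from urllib.parse import urlsplit, urlunsplit
    from urllib.request import urlopen

    site_url = "https://www.mobitek.com/"

    # robots.txt must live directly under the host, never in a subdirectory.
    parts = urlsplit(site_url)
    robots_url = urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

    with urlopen(robots_url) as response:
        print(robots_url, "->", response.status)   # expect 200
        print(response.read().decode("utf-8"))     # the file must be UTF-8 encoded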

Commands

Create a new text document and name it robots.txt (in lowercase). The file is built from a few basic directives, whose meanings are as follows:
  • User-agent: names the search engine bot that the rules which follow apply to.
  • Allow: lets you specify which pages on your site you want to be crawled and indexed.
  • Disallow: lets you specify which pages on your site you do not want crawled and indexed.
  • User-agent: *
  • Disallow: /temp/
The rules above make up a robots.txt file that allows every user agent to access everything on the site except the /temp/ directory.
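
If you want to see how such a file is interpreted, you can feed the same two lines to a parser and query it. Below is a minimal Python sketch using the standard urllib.robotparser module (the tool choice is ours, not something robots.txt requires):

    # Minimal sketch: how the example above is read by a robots.txt parser.
    from urllib.robotparser import RobotFileParser

    rules = [
        "User-agent: *",
        "Disallow: /temp/",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # /temp/ is blocked for every bot; everything else stays crawlable.
    print(parser.can_fetch("*", "https://www.mobitek.com/temp/page.html"))   # False
    print(parser.can_fetch("*", "https://www.mobitek.com/blog/post.html"))   # True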
  • User-agent: Googlebot
  • Disallow: /images/
  • Disallow: /temp/
  • Disallow: /cgi-bin/
Here a second group has been added that sets more restrictive rules specifically for Googlebot. The danger is the order: when Googlebot starts reading this file, it first encounters the general group, which tells every user agent (including Googlebot itself) that every folder except /temp/ may be crawled. Having found rules that apply to it, it may not read the file to the end and may conclude that it is free to crawl everything except /temp/, including /images/ and /cgi-bin/, which is exactly what you wanted to prevent. As you can see, the structure of this file is simple, but it is still easy to fall into this kind of logic error, and it should definitely be avoided.
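
Before publishing a file that mixes a general group with bot-specific groups, it is worth testing which rules each crawler actually ends up with. The sketch below again uses Python's urllib.robotparser (our choice of tool, not part of the robots.txt standard); it matches each user agent to the group written for it, so you can see directly which paths Googlebot would be refused:

    # Minimal sketch: query the combined example with specific user agents.
    from urllib.robotparser import RobotFileParser

    rules = [
        "User-agent: *",
        "Disallow: /temp/",
        "",
        "User-agent: Googlebot",
        "Disallow: /images/",
        "Disallow: /temp/",
        "Disallow: /cgi-bin/",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # This parser applies the Googlebot group to Googlebot and the * group
    # to everyone else, so the two agents get different answers for /images/.
    for agent in ("*", "Googlebot"):
        for path in ("/temp/", "/images/", "/cgi-bin/", "/blog/"):
            url = "https://www.mobitek.com" + path
            print(agent, path, "allowed:", parser.can_fetch(agent, url))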