robots.txt - what it is for
If you run a website, you have probably already come across a file called robots.txt in your FTP program. In this practical tip, you can find out what is behind this text file and why it is important.
robots.txt - instructions for search engines
Every domain should have a robots.txt file. It is an important part of SEO.
- Search engines work with crawlers. These are small programs that work independently, searching the internet for content, reading websites, and indexing them.
- Because crawlers work independently, they are also called search engine bots or robots.
- Your website's robots.txt tells these crawlers which directories can and cannot be read.
- To get this information, crawlers first look for a domain's robots.txt. For this reason, the robots.txt must sit at the top level of the directory structure, i.e. in the root directory. If it is moved into a subdirectory, the bots will not find the file.
- Put simply, robots.txt gives search engine crawlers two pieces of information. The entry "User-agent:" specifies which robot - addressed in robots.txt as a user agent - the following instructions apply to.
- This is followed by the entry "allow:" or "disallow:", which lists the directories and subdirectories the bot may crawl and those it should skip when indexing.
- The entry "allow:" is less important. Anything that is not expressly excluded is indexed by the robot anyway.
- Some CMSs, such as Drupal, create a robots.txt automatically during installation. In WordPress, you can create the robots.txt with a plugin.
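The entries described above can be combined into a complete file. Here is a minimal sketch of what a robots.txt might look like; the directory names are made-up examples, not rules you should copy as-is:

```text
# Applies to all crawlers
User-agent: *
# Do not crawl these (hypothetical) directories
Disallow: /admin/
Disallow: /tmp/
# Everything else may be indexed (optional, since it is the default)
Allow: /
```

Remember that this file must be reachable at the root of the domain, e.g. example.com/robots.txt.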
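If you want to check how crawlers will interpret your rules before uploading the file, you can test them locally. A minimal sketch using Python's standard-library parser; the rules and URLs below are hypothetical examples:

```python
# Check robots.txt rules locally with Python's standard-library parser.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration
rules = """
User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "*" asks on behalf of any crawler
print(parser.can_fetch("*", "https://example.com/admin/secret.html"))  # False
print(parser.can_fetch("*", "https://example.com/index.html"))         # True
```

This only simulates well-behaved crawlers; robots.txt is a convention, not an access control, so bots are free to ignore it.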
If you receive the Google message "Unusually many requests", our next practical tip explains what you can do.