Sitemap, Meta Robot tags, and the Robots.txt file are very important for SEO.
Most people concentrate on content, site speed, design, link building, etc. (which are also important), but they forget to set up their technical SEO. If you want to rank well in Google and earn a good amount of money, you first need to get your site's pages and posts indexed properly.
Google indexes your pages and posts automatically, but the process is not perfect. You have to give it specific instructions yourself.
This is not only true of Google. All popular search engines, such as Bing, Yahoo, and Ask, work similarly.
You can see which of your posts and pages are indexed by searching Google like this:
site:yourdomain.com (replace yourdomain.com with your website's domain name)
The results list the pages from your site that Google has indexed.
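If you run this check often, the same search can be turned into a link. Here is a minimal sketch in Python, assuming the usual Google search URL with a q parameter. The helper name site_search_url is made up for this example.

```python
from urllib.parse import urlencode


def site_search_url(domain):
    """Build a Google search URL for the query site:<domain>."""
    # urlencode percent-encodes the colon in "site:" so the query arrives intact.
    return "https://www.google.com/search?" + urlencode({"q": "site:" + domain})


print(site_search_url("yourdomain.com"))
# prints https://www.google.com/search?q=site%3Ayourdomain.com
```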
The best thing is that it is easy to get indexed properly in these search engines. You can give them instructions by setting up a sitemap, meta robot tags, and a robots.txt file.
One thing to remember: you won't see indexing changes right after you set all of this up. It takes time for search engines to follow your instructions.
In this post, you will find the step-by-step procedure for creating and setting up a sitemap, meta robot tags, and a robots.txt file in the easiest way possible if you are using WordPress.
Create Sitemap, Meta Robot tags, Robots.txt file for WordPress (Easy Step-by-Step Method)
I promise that you can set up your sitemap, meta robot tags, and robots.txt file using just one plug-in. It is one of the most popular plug-ins, so you may already know it: WordPress SEO by Yoast.
If you are not using it, my suggestion is to install it right now so that we can continue setting up the sitemap, meta robot tags, and robots.txt file.
Before setting them up, we have to know what they are and what they do.
What is a Sitemap? What does a Sitemap do?
A sitemap is an XML file that helps search engines like Google learn about your pages, posts, categories, etc.
Once Google has that information, it can crawl your site faster.
If you are not familiar with the word crawling as it is used in blogging, this is what it means:
Google sends crawlers (also known as Google spiders or robots). When a crawler visits your site, it crawls your pages and posts so that Google knows which pages exist on your site.
The problem is that Google gives crawlers only a limited time to crawl your site. This time depends on the authority of your domain. So you should give those crawlers more information so that they can crawl your site as fast as possible.
A sitemap is one of the ways you can give information to crawlers.
To give a little more detail, a sitemap is like a map that search engine crawlers use to find their way around your website.
My shortest possible answers:
What is a sitemap?
A sitemap is an XML file that contains information about your site's pages, posts, etc.
What does a sitemap do?
It gives that information to search engines and helps them index your site properly.
How to Create & Set Up a Sitemap?
I am standing by my word. I promised that you can create all three, including the sitemap, using one plug-in.
If you have not downloaded it, you can download it from here.
I use the Yoast plug-in to create my sitemap for two reasons.
- Along with the sitemap, it offers many other features that are helpful for SEO, such as social settings and on-page SEO.
- It creates a better sitemap than many other sitemap generators.
After installing the Yoast SEO plug-in, you can create a sitemap from your WordPress dashboard by going to SEO > General.
Then click the Features tab at the top of the page.
Then scroll down to XML sitemaps and make sure it is On.
Finally, click the Save changes button at the bottom.
Now you can see your sitemap by adding one of these paths after your domain name:
yourdomainname.com/sitemap.xml or yourdomainname.com/sitemap_index.xml
You should see a list of sitemap links, typically one for each type of content.
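To give a rough idea of what sits behind that page, here is a small Python sketch that reads a sitemap index and lists the sitemaps it points to. The XML is a hand-written sample that follows the sitemaps.org format. The two file names are only illustrative, and your own index may list different ones.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration; not copied from a real site.
SAMPLE_SITEMAP_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://yourdomainname.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://yourdomainname.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>
"""

# Every element in a sitemap lives in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_child_sitemaps(xml_text):
    """Return the URLs of the sitemaps listed in a sitemap index."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


for url in list_child_sitemaps(SAMPLE_SITEMAP_INDEX):
    print(url)
```

Each of those child sitemaps is itself an XML file that lists the individual page or post URLs.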
If you don't see a Yoast sitemap at either URL, you may be using another plug-in to create a sitemap. Deactivate that plug-in or turn off its sitemap option. You can also keep using your old plug-in. That's up to you.
If you are using the Yoast sitemap, you can customize it from your WordPress dashboard by going to SEO > Search Appearance.
Then select the type of content from the options.
- Content types
Then select Yes or No for each type of content to choose whether it is shown in search engines.
The content types you set to Yes will be included in the sitemap.
Previous versions of Yoast had a separate Sitemap section where you selected which types of content to include in the sitemap.
Later, this was moved into the Search Appearance section.
So what you select here is what gets included in the sitemap. That is a lot better for SEO than the old version, but a bit confusing.
You have successfully created a sitemap. Now you should let Google know about it.
To do that, you have to submit your sitemap to Google.
This is how you can do it.
Go to your Google Search Console (formerly known as Google Webmaster Tools).
If you don't have an account, you can create one using your Google account.
You should also connect your Search Console to your site.
Click the Add A Property button.
Then enter your website name and click ADD.
For verification, I prefer to choose the Alternate Method.
Under the Alternate Method, select the HTML tag option.
Don't copy the full tag. Just copy the value of its content attribute.
The code is different for every site, so copy the one shown for yours.
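If it is not obvious which part of the tag is the "content part", this small Python sketch pulls it out. The token below is a made-up placeholder, and the function name is my own. Search Console shows the real tag for your site.

```python
from html.parser import HTMLParser


class VerificationTagParser(HTMLParser):
    """Remember the content value of a google-site-verification meta tag."""

    def __init__(self):
        super().__init__()
        self.code = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "google-site-verification":
            self.code = attributes.get("content")


def extract_verification_code(tag_html):
    """Return only the content value, which is what goes into the plug-in field."""
    parser = VerificationTagParser()
    parser.feed(tag_html)
    parser.close()
    return parser.code


# Placeholder token for illustration only.
SAMPLE_TAG = '<meta name="google-site-verification" content="EXAMPLE_TOKEN_123" />'
print(extract_verification_code(SAMPLE_TAG))  # prints EXAMPLE_TOKEN_123
```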
After copying that code, navigate to SEO > General, select the Webmaster Tools tab at the top, paste the code into the Google verification code field, and click Save changes.
Now click the Verify button in Search Console.
If it shows a success page, you have successfully connected your site to Search Console.
Now you can submit your sitemap to Search Console so that Google knows about it.
Navigate to the Search Console home screen and click on your site.
Now navigate to Crawl > Sitemaps.
Then select Submit, enter the sitemap URL of your site, and click OK.
By now you have successfully created your sitemap and submitted it to Google.
It's better to check your site and its errors in Search Console at least once every 2 weeks.
Now let's go to the next section: meta robot tags.
What are Meta Robot Tags? What do Meta Robot Tags do?
We already know that Google sends crawlers, and these crawlers are also called robots.
Those robots can be controlled with meta robot tags. These tags can be assigned to each and every page, post, etc.
By default, all pages are set to None. That means we are not blocking any of those pages from being indexed.
Meta robot tags are not really used to get pages indexed. They are used to block some pages from being indexed. You may be wondering why you would want to block pages from search engines.
These are some pages that are best blocked from search engines:
- Pages with duplicate content, like a blog page that lists your posts. Google doesn't like duplicate content, so it's better to keep those pages out of Google so that the remaining pages can rank better.
You can check for duplicate content at siteliner.com
- Thank-you pages.
- Any other pages you don't want indexed in Google.
My shortest possible answers:
What are Meta Robot Tags?
Meta robot tags are tags that can be assigned to each individual page.
What do Meta Robot Tags do?
They tell search engines whether or not to index that page.
How to Create & Set Up Meta Robot Tags?
Setting up meta robot tags is easy.
We can set them using the Yoast SEO plug-in, as I said before.
Open the page, post, or category you want to deindex from search engines.
Here I want to hide my blog page from search engines, as that page doesn't have any useful content of its own and also contains duplicate content.
Now scroll down to the Yoast SEO section.
Select the gear button on the left.
Then select Yes or No for whether to show the page in search engines.
So what will it do?
Selecting No also removes that page from the sitemap, which is helpful for hiding pages like thank-you pages even from the sitemap.
So no one can see that page, even in the sitemap.
The next option is whether search engines should follow the links on that page.
If you select No, search engines won't follow the links on that page. You can select No when a page has a lot of external links.
Yes: Search engines follow links on that page.
No: Search engines won't follow links on that page.
Here, mine is the blog page, and all my posts are linked from it. I want search engines to follow my post links.
So I will keep this set to Yes. You can set it according to your needs.
Now comes the main setting: changing the meta robots tags.
Before changing them, take a quick look at the tags it provides.
- None
- No Image Index
- No Archive
- No Snippet
None: You are not blocking anything on that page.
No Image Index: Search engines won't index images on that page.
No Archive: Prevents search engines from showing a cached copy of that page.
No Snippet: Prevents search engines from showing a snippet of that page.
You can select no tags, which means None, or you can select several of the others. That's up to you.
Next, click the Update button to save the page.
With that, you have finished setting up meta robot tags. Here I set them on only one page, but you can do it for all the pages you want to hide from search engines.
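Behind these settings, what ends up in the page's HTML is a robots meta tag. Here is a rough Python sketch of how the choices above map onto that tag. The function build_meta_robots is hypothetical and is not the plug-in's own code. The directive names (noindex, nofollow, noimageindex, noarchive, nosnippet) are the standard values search engines read.

```python
# The advanced options from the list above, as they appear in the tag.
ADVANCED_DIRECTIVES = {"noimageindex", "noarchive", "nosnippet"}


def build_meta_robots(index=True, follow=True, advanced=()):
    """Return the robots meta tag for the chosen settings (illustrative only)."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    for name in advanced:
        if name not in ADVANCED_DIRECTIVES:
            raise ValueError("unknown directive: " + name)
        directives.append(name)
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


# Hide the page from search results but still let crawlers follow its links.
print(build_meta_robots(index=False, follow=True))
# prints <meta name="robots" content="noindex, follow">
```

This matches the blog page example: the page is hidden from search engines, but the links to the posts are still followed.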
Now let's move to our last part: creating robots.txt.
What is a Robots.txt file? What does a Robots.txt file do?
Robots.txt is a text file. Yes, it is a normal plain text file located in the root of your domain.
When crawlers visit your site, they visit the pages in the sitemap, and your individual pages can be handled with meta robot tags.
So if you can already take care of every page and post on your site, why is the robots.txt file important?
There are some good reasons to create a robots.txt file.
Here are some of them:
- It can block your admin pages.
- It can block whole directories.
- It is the first place search engine crawlers check.
- You can decide which bot each rule applies to (separate instructions for different search engines).
- You can make your sitemap easier for search engines to find by including its URL in the robots.txt file.
- You are giving extra information to search engines, which always helps.
Actually, you can run a website without a sitemap, meta robot tags, or a robots.txt file, and such a website can still be successful.
But when you use them, your chances of success are better than those of a site that does not, because you are giving Google more instructions and getting indexed more accurately.
If you are indexed better, your duplicate content pages and high bounce rate pages can be hidden from Google, and you can get a lower bounce rate, which can help you rank higher in Google.
That's where the magic of creating them lies.
My shortest possible answers:
What is a Robots.txt file?
It is a text file located in the root of the domain, written in a syntax that search engine crawlers understand.
What does a Robots.txt file do?
Search engine crawlers visit this text file first when they start crawling your site, so it's a great place to give them instructions.
How to Create a Robots.txt file?
Previously, to create a robots.txt file, you had to create a text file on your PC and then upload it through the File Manager in cPanel.
Now the process is much easier.
You can create a robots.txt file using your Yoast plug-in.
This is how you can do it:
Navigate to SEO > Tools from your WordPress dashboard.
There you can find an option called File editor.
Click on it and you will be taken to a page where you can edit and save the robots.txt file.
If you find any pre-existing text there, you can delete it and start fresh.
To write the file, you have to know the syntax that robots can understand.
Luckily, it is easy to understand.
The basic structure of a robots.txt file uses four terms: User-agent, Disallow, Allow, and Sitemap.
You will find these terms throughout the file, and here is what they mean.
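As a rough illustration of that structure, here is a small sample file checked with the robots.txt parser that ships with Python. The folder, page, and sitemap URL are placeholders I made up. One thing to note is that Python's parser applies the first rule that matches, while Google documents that the most specific rule wins, so the Allow line is placed first here to get the same answer either way.

```python
from urllib.robotparser import RobotFileParser

# Sample file for illustration; every path and URL here is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Allow: /example-folder/public-page.html
Disallow: /example-folder/

Sitemap: https://yourdomain.com/sitemap_index.xml
"""


def load_rules(text):
    """Parse robots.txt text into a rule checker."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
print(rules.can_fetch("*", "/example-folder/secret.html"))       # False: inside the blocked folder
print(rules.can_fetch("*", "/example-folder/public-page.html"))  # True: the Allow exception
print(rules.can_fetch("*", "/blog/"))                            # True: not mentioned at all
```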
User-agent:
In the robots.txt file, you can give separate instructions for each search engine bot. The value of User-agent is the name of the search engine bot the instructions are meant for.
A star ( * ) as the user agent means you are giving instructions to all search engines.
If you mention a name like Googlebot, then the instructions that follow are read only by Google. All the remaining bots ignore those instructions.
If you want to give instructions to two or more bots, give each bot its own section: a User-agent line with the bot's name, followed by the Disallow and Allow lines for that bot.
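A sketch of two separate sections, again checked with Python's built-in parser. Googlebot and Bingbot are real bot names, but the folder names are placeholders.

```python
from urllib.robotparser import RobotFileParser

# One section per bot; the folders are made-up examples.
TWO_BOTS = """\
User-agent: Googlebot
Disallow: /example-folder/

User-agent: Bingbot
Disallow: /another-folder/
"""

parser = RobotFileParser()
parser.parse(TWO_BOTS.splitlines())

print(parser.can_fetch("Googlebot", "/example-folder/page.html"))    # False: Google's own rule
print(parser.can_fetch("Bingbot", "/example-folder/page.html"))      # True: Bing has a different rule
print(parser.can_fetch("Bingbot", "/another-folder/page.html"))      # False
print(parser.can_fetch("DuckDuckBot", "/example-folder/page.html"))  # True: no section applies to it
```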
If you want to give instructions to all search engines except one or more specific bots, you can do that too.
Write one section for all bots ( * ) and a separate section for each bot you want to treat differently.
If you write a separate section for a search engine bot, that bot will not consider the instructions given to all bots, because it has its own instructions.
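Here is a sketch of that idea: every bot is kept out of one folder, except Googlebot, which gets its own section with nothing blocked. The folder name is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Rules for everyone, plus a separate section that exempts Googlebot.
ALL_EXCEPT_ONE = """\
User-agent: *
Disallow: /example-folder/

User-agent: Googlebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(ALL_EXCEPT_ONE.splitlines())

print(parser.can_fetch("Googlebot", "/example-folder/page.html"))  # True: follows only its own section
print(parser.can_fetch("Bingbot", "/example-folder/page.html"))    # False: falls back to the * section
```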
You can find a list of user agent bot names here.
Disallow & Allow:
So you have selected a user agent, or addressed all bots, in one section.
Now you have to give them instructions.
The instructions a search engine bot understands are Disallow and Allow.
Disallow tells search engines not to crawl a page, post, file, or directory. Strictly speaking, it stops crawling rather than indexing, so to keep a page out of the search results, use the meta robot tags described earlier.
Allow tells search engines that they may crawl a page, post, file, or directory. There is no need to add it for each and every page and post. Use the Allow instruction only when you want to open up a post or page inside a disallowed directory.
If you are confused by these instructions and directories, don't worry. I will explain them in depth later in this post. For now, just understand the meaning of the Disallow and Allow instructions.
Sitemap:
You can mention your sitemap URL in the robots.txt file. That reminds bots that you also have a sitemap where they can get more information.
For example, I can mention my sitemap in the robots.txt file by adding a Sitemap: line followed by the full URL of the sitemap.
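A minimal sketch of a Sitemap line and how a parser reads it back. The URL is a placeholder, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# The sitemap URL is a placeholder; use the full URL of your own sitemap.
WITH_SITEMAP = """\
User-agent: *
Disallow:

Sitemap: https://yourdomain.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(WITH_SITEMAP.splitlines())

# site_maps() returns the listed sitemap URLs, or None when there are none.
print(parser.site_maps())
# prints ['https://yourdomain.com/sitemap_index.xml']
```

The Sitemap line is not tied to any user agent, so it can sit anywhere in the file.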
So now you can name a user agent, give instructions with Disallow and Allow, and finally remind bots about your sitemap.
The only thing left to learn is how to write the paths in your Disallow and Allow instructions.
You can work that out with the following steps:
Navigate to the cPanel of your site.
There you can find the File Manager.
In the File Manager you can find your files.
Then go into public_html.
There you can find all your files, folders, etc.
From this you can understand how to block directories.
If you block a folder, you are blocking everything in it.
Suppose I want to block the WordPress admin section, because search engines don't really need it.
I can block my wp-admin folder by adding Disallow: /wp-admin/ under the user agent in the robots.txt file.
If I want to allow something inside the blocked folder, I can add an Allow line with the path of that file or subfolder.
If you want to block a specific subfolder or file inside a folder, you can do that too by putting its full path after Disallow.
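Here is a sketch that puts those three cases together. Allowing admin-ajax.php is a common choice on WordPress sites, but treat it and the private PDF path as assumptions of mine. As before, the Allow line comes first because Python's parser stops at the first matching rule.

```python
from urllib.robotparser import RobotFileParser

# Block the admin folder, allow one file inside it, and block a single file elsewhere.
WP_RULES = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /wp-content/example-private-file.pdf
"""

parser = RobotFileParser()
parser.parse(WP_RULES.splitlines())

print(parser.can_fetch("*", "/wp-admin/options.php"))                 # False: whole folder blocked
print(parser.can_fetch("*", "/wp-admin/admin-ajax.php"))              # True: the allowed exception
print(parser.can_fetch("*", "/wp-content/example-private-file.pdf"))  # False: single file blocked
print(parser.can_fetch("*", "/wp-content/uploads/photo.jpg"))         # True: rest of the folder is open
```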
If you don't want to block anything, leave the value after Disallow: empty.
But if you put a / after Disallow, so that the line reads Disallow: /,
then you are disallowing the entire site, and search engine bots will not crawl any of your pages and posts.
So be careful while editing, or else you may create extra problems.
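The difference that one character makes can be seen in this short sketch, which runs both versions through Python's built-in parser. The page path is a placeholder.

```python
from urllib.robotparser import RobotFileParser

BLOCK_NOTHING = """\
User-agent: *
Disallow:
"""

BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_text, path):
    """Check whether any bot (*) may crawl the given path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch("*", path)


print(is_allowed(BLOCK_NOTHING, "/my-first-post/"))     # True: empty Disallow blocks nothing
print(is_allowed(BLOCK_EVERYTHING, "/my-first-post/"))  # False: a single slash blocks the whole site
```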
The best thing about the robots.txt file is that you can see any site's robots.txt file by adding /robots.txt after its domain name.
You can see my robots.txt file the same way.
You can even find Google's robots.txt file at google.com/robots.txt
After looking at a few, you will get an idea of how to set up your own robots.txt file.
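If you want to look at several sites, here is a small Python sketch that works out the robots.txt address for any page or domain. Both helper names are mine. The second function goes out to the network, so it is only defined here and not called.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_txt_url(site):
    """Return the robots.txt address for a domain or any page on it."""
    if "://" not in site:
        site = "https://" + site  # assume https when only a domain is given
    # A path starting with / replaces whatever path the address had.
    return urljoin(site, "/robots.txt")


def fetch_rules(site):
    """Download and parse a site's robots.txt (needs network access)."""
    parser = RobotFileParser()
    parser.set_url(robots_txt_url(site))
    parser.read()
    return parser


print(robots_txt_url("google.com"))                       # prints https://google.com/robots.txt
print(robots_txt_url("https://yourdomain.com/blog/post"))  # prints https://yourdomain.com/robots.txt
```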
After editing your robots.txt file, don't immediately click the Save changes to robots.txt button.
You have to check whether your robots.txt file has any errors before saving it.
For that, you can use the robots.txt tester in Google Search Console.
You can find it by navigating to Google Search Console > Crawl > robots.txt Tester.
If you find any text there already, delete it and paste in the text you edited.
Now click the Test button at the bottom.
If it says Allowed and you have 0 errors and 0 warnings, it's time to save your robots.txt file.
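As an extra check before saving, you can run a quick test of the draft on your own computer. This is only a sketch using Python's built-in parser, not a replacement for Google's tester, and the function name and paths are my own choices.

```python
from urllib.robotparser import RobotFileParser


def check_robots(text, must_allow=(), must_block=(), agent="*"):
    """Return a list of problems found in a draft robots.txt (empty list means OK)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    problems = []
    for path in must_allow:
        if not parser.can_fetch(agent, path):
            problems.append("blocked but should be allowed: " + path)
    for path in must_block:
        if parser.can_fetch(agent, path):
            problems.append("allowed but should be blocked: " + path)
    return problems


# A draft with the classic mistake: a slash that blocks the whole site.
DRAFT = """\
User-agent: *
Disallow: /
"""

for problem in check_robots(DRAFT, must_allow=["/", "/blog/"], must_block=["/wp-admin/"]):
    print(problem)
```

For this draft it reports that / and /blog/ are blocked, which is the warning sign to fix before you save.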
You can save your robots.txt file by clicking the Save changes to robots.txt button in WordPress.
That's it. You have finished creating and setting up your sitemap, meta robot tags, and robots.txt file.
Now search engines have more information about how to index your website.