Sitemaps, meta robots tags, and the robots.txt file are extremely important for SEO.

But,

Most people concentrate on content, site speed, design, link building, etc. (which are also important) but forget to set up their technical SEO. If you want to rank well in Google and earn a decent income, you first need to get your site's pages and posts indexed in Google properly.

Google indexes your pages and posts automatically, but it isn't perfect. You have to give it specific instructions manually.

Not only Google; in fact, all popular search engines like Bing, Yahoo, and Ask work similarly.

You can see which of your posts and pages are indexed in Google by searching like this:

site:yourdomain.com (replace yourdomain.com with your website's domain name)


The best part is that it's easy to get indexed in these search engines properly. You can give them instructions by setting up a sitemap, meta robots tags, and a robots.txt file.

But one thing to remember: you won't see indexing changes immediately after you set all of this up. It takes time for search engines to index your site according to your instructions.

In this post, you will find a step-by-step procedure for creating and setting up a sitemap, meta robots tags, and a robots.txt file in the easiest way possible if you are using WordPress.

Create Sitemap, Meta Robots Tags, and Robots.txt File for WordPress (Easy Step-by-Step Method)

I promise that you can set up your sitemap, meta robots tags, and robots.txt file using just one plugin. It is one of the most popular plugins, and you may already know it: Yoast SEO (WordPress SEO by Yoast).

If you are not using it, my suggestion is to install it right now so we can continue setting up the sitemap, meta robots tags, and robots.txt file.

But,

before setting them up, we should know what they are and what they do.

Firstly,

What is a Sitemap & What Does a Sitemap Do?

A sitemap is an XML file that helps search engines like Google learn about your pages, posts, categories, etc.

Once Google has that information, it can crawl your site faster.

If you are not familiar with the word "crawling" in the blogging field, this is what it means:

Google sends out crawlers (also known as Google spiders or robots). When a spider visits your site, it crawls your pages and posts so that Google learns which pages exist on your site.

But the problem is that Google gives crawlers only a limited amount of time on your site, and that time depends on the authority of your domain. So you should give those crawlers as much information as possible so they can crawl your site as quickly as possible.

The sitemap is one of the ways you can give information to crawlers.

To put it another way, a sitemap is like a map that search engine crawlers use to find their way around your website.
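For illustration, here is what a minimal sitemap entry looks like (the URL and date are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomainname.com/sample-post/</loc>
    <lastmod>2018-01-01</lastmod>
  </url>
</urlset>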

My shortest possible one-line answer:

What is a sitemap?

A sitemap is an XML file that contains information about your site's pages, posts, etc.

What does a sitemap do?

It gives that information to search engines and helps them index your site properly.

How to Create & Set Up a Sitemap

I am standing by my word: I promised you can create all three of these, including the sitemap, using one plugin.

If you have not downloaded it yet, you can download it from here.

I use the Yoast plugin to create my sitemap for two reasons:

  • Along with the sitemap, it offers many other features that are helpful for SEO, like social metadata and on-page SEO analysis.
  • It creates a better sitemap than many other sitemap generators.

After installing the Yoast SEO plugin,

You can create a sitemap from your WordPress dashboard by going to SEO > General.

Then click on the Features tab at the top of the page.


Then scroll down to XML sitemaps and make sure it is set to On.


Then click the Save changes button at the bottom.

Now you can see your sitemap by appending one of these paths to your domain name:

yourdomainname.com/sitemap.xml or yourdomainname.com/sitemap_index.xml

Your sitemap will look like an index page listing your individual sitemaps (posts, pages, categories, etc.).
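Under the hood, that index is just XML. A minimal sitemap index looks something like this (the file names are Yoast-style placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://yourdomainname.com/post-sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://yourdomainname.com/page-sitemap.xml</loc>
  </sitemap>
</sitemapindex>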

If you can't see a sitemap like this, you may be using another plugin that creates a sitemap. Deactivate that plugin or turn off its sitemap option. You can also keep your old plugin's sitemap if you prefer; that's up to you.

If you are using the Yoast sitemap,

You can customize your sitemap from your WordPress dashboard by going to SEO > Search Appearance.

Then select the type of content from the options.

  • Content types
  • Media
  • Taxonomies
  • Archives

Then select Yes or No for each type of content, choosing whether it should be indexed in search engines.
The content you set to show in search results will be included in the sitemap.


In previous versions of Yoast, there was a separate Sitemap section where you selected which types of content appeared in the sitemap.

But,

Later, they merged it into the Search Appearance section.

So whatever you select here will be included in the sitemap. SEO-wise, this is better than the old version, though a bit confusing at first.

You have successfully created a sitemap. Now you should let Google know about it, which means submitting your sitemap to Google.

Here is how you can do that.

Go to Google Search Console (also known as Google Webmaster Tools).

If you don’t have an account, you can create one using your Google account.


You also need to connect Search Console to your site.

Click on the Add A Property button.

Then enter your website URL and click ADD.

For verification, I prefer to choose the Alternate methods option.


Under Alternate methods, select the HTML tag option.

Don't copy the full tag. Just copy the value of its content attribute; you will get a unique code for your own site.
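The verification tag looks like this (the content value here is a made-up placeholder; yours will be different):

<meta name="google-site-verification" content="AbCdEf123456_example-token" />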

After copying that code, navigate to SEO > General, select the Webmaster Tools tab at the top, paste the code into the Google verification code field, and click Save changes.

Now click the Verify button in Search Console.

If it shows a success page, you have connected your site to Search Console.

Now you can submit your sitemap to Search Console so Google knows about it.

Navigate to the Search Console home screen and click on your site.

Now navigate to Crawl > Sitemaps.


Then select Submit, enter your site's sitemap URL, and click OK.

By now, you have successfully created and submitted your sitemap to Google.

It's a good idea to check your site for errors in Search Console at least once every two weeks.

Now let's move on to the next section: meta robots tags.

What Are Meta Robots Tags & What Do They Do?

We already know that Google sends crawlers; these crawlers are also called robots.

Those robots can be controlled with meta robots tags, which can be assigned to each individual page, post, etc.

By default, all pages are set to None, which means none of them are blocked from indexing.

Meta robots tags are not really used to get pages indexed; they are used to block certain pages from being indexed. You may be wondering why you would want to block pages from search engines.

These are some pages that are best blocked from search engines:

  • Pages with duplicate content, like blog archive pages. Google doesn't like duplicate content, so it's better to keep those pages out of Google so that your remaining pages can rank better.
    You can check your duplicate content at siteliner.com.
  • Thank-you pages.

There may be other pages you don't want indexed in Google, too.

My shortest possible one-line answer:

What are meta robots tags? –

Meta robots tags are tags assigned to each individual page.

What do meta robots tags do? –

They tell search engines whether or not to index that page.
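In a page's HTML, such a tag ends up looking like this (a generic example of the standard format, not Yoast's exact output):

<meta name="robots" content="noindex, nofollow" />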

How to Create & Set Up Meta Robots Tags

Setting up meta robots tags is easy.

We can set them using the Yoast SEO plugin, as I said before.

Just open the page, post, or category you want to deindex from search engines.

Here, I want to hide my blog page from search engines, as it doesn't have any unique content of its own and contains duplicate content.

Now scroll down to the Yoast SEO section.


Select the gear button on the left.


Then select Yes or No to control whether the page appears in search engine results.

So what does this do?

It also removes that page from the sitemap, which is useful for hiding pages like thank-you pages; no one will see the page even in your sitemap.

The next option is whether search engines should follow links from that page.


If you select No, search engines won't follow the links on that page. When a page has a lot of external links, you may want to select No so that search engines don't follow them.

Yes:- Search engines follow links on that page.

No:- Search engines won't follow links on that page.

In my case, this is my blog page and all my posts are linked from it, so I want search engines to follow those post links.

So I will keep this set to Yes. Choose according to your needs.

Now comes the main setting: changing the meta robots tags.

Before changing them, take a quick look at the tags it provides:


  • None
  • No Image Index
  • No Archive
  • No Snippet

None:- You are not blocking anything on that page.

No Image Index:- Search engines won't index the images on that page.

No Archive:- Prevents search engines from showing a cached copy of that page.

No Snippet:- Prevents search engines from showing a snippet of that page in search results.

You can select no tags (which means None), or you can select all four; that's up to you.
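For reference, if you combined the last three options, the resulting HTML tag would look something like this (an illustrative example of the standard format):

<meta name="robots" content="noimageindex, noarchive, nosnippet" />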

Next, click the Update button to save the page.

With that, you have finished setting up meta robots tags. I did this for only one page here; you can do the same for every page you want to hide from search engines.

Now, moving on to our last part: creating the robots.txt file.

What Is a Robots.txt File & What Does It Do?

The robots.txt file is a text file. Yes, it is a normal plain-text file located in the root of your domain.

When crawlers visit your site, they visit the pages in your sitemap, and your individual pages can be handled with meta robots tags.

So if you can already take care of every page and post on your site, why is the robots.txt file important?

There are some good reasons to create a robots.txt file. Here are a few:

  • It can block your admin pages.
  • It can block entire directories at once.
  • It is the first place search engine crawlers check.
  • You can decide which bot to customize (separate instructions for different search engines).
  • You can make your sitemap easier for search engines to find by including it in the robots.txt file.
  • You are giving extra information to search engines, which is always a good thing.

Actually, you can run a website without a sitemap, meta robots tags, or a robots.txt file, and such a website can even be successful.

But when you use them, your chances of success are better than a site that doesn't, because you are giving Google more instructions and getting indexed properly.

If you index better, your duplicate-content and high-bounce-rate pages can be hidden from Google, lowering your overall bounce rate and helping you rank higher.

That’s where the magic lies in creating them.

My shortest possible one-line answer:

What is a robots.txt file? –

It is a text file located in the root of your domain, written in a format that search engine crawlers understand.

What does a robots.txt file do? –

Search engine crawlers visit this file first when they start crawling your site, so it's a great place to give them instructions.

How to Create a Robots.txt File

Previously, to create a robots.txt file, you had to create a text file on your PC and then upload it via the File Manager in cPanel.

But,

now the process is much easier.

You can create a robots.txt file using the Yoast plugin.

Here is how:

Navigate to SEO >  Tools from your WordPress dashboard.

There you will find an option called File editor.


Click on it, and you will be taken to a page where you can edit and save the robots.txt file.

If you find any pre-existing text there, you can delete it and start fresh.

To write the file, you need to know the syntax that robots understand.

Luckily, it is very easy to understand.

The basic structure of a robots.txt file:

User-agent:
Disallow:
Allow:

Sitemap:

Understanding the basics:

You will find these terms throughout the file; here is what they mean.

User-agent:

In the robots.txt file, you can give separate instructions to each search engine bot. The value in this field is the name of the bot that the instructions below it apply to.

For example:-

User-agent: *

Here, the asterisk (*) means you are giving instructions to all search engine bots.

If you mention a name like Googlebot instead, the instructions are read only by Google; all other bots ignore the instructions you give below that line.

If you want to give instructions to two or more bots, here is how:

User-agent: Googlebot
Disallow: 

User-agent: Bingbot
Disallow: /wp-admin/
Disallow: /example-directory/

This is how you can give instructions to two or more bots by using different sections.

You can even give instructions to all search engines while singling out one or more specific bots.

Here is how that can be achieved:

User-agent: *
Disallow:

User-agent: Googlebot
Disallow: /wp-admin/

If you write a separate section for a specific search engine bot, that bot will not consider the instructions given to everyone else, because it has its own instructions.

You can find a list of user-agent bot names here.

Disallow & Allow:

So you have selected a user agent (or addressed all bots) in one section.

Now you have to give them instructions.
The instructions a search engine bot understands are Disallow and Allow.

Disallow tells search engines not to crawl that page, post, file, or directory, which keeps it out of the index.

Ex:-

User-agent: *
Disallow: /wp-admin/

Allow tells search engines that they may crawl a page, post, file, or directory. There is no need to list every page and post; you only need an Allow rule when you want something inside a disallowed directory to remain crawlable.

Ex:-

User-agent: *
Disallow: /wp-content/
Allow: /wp-content/uploads/

If the instructions and directory paths are confusing, don't worry; I will explain them in more depth later in this post. For now, just understand what Disallow and Allow mean.

Sitemap: 

You can mention your sitemap URL in the robots.txt file. That reminds bots that you also have a sitemap where they can get more information.

For example, here is how I mention my sitemap in my robots.txt file:

Sitemap: http://studentcompanion.in/sitemap_index.xml

So now you can specify a user agent, give it instructions with Disallow and Allow, and finally remind bots about your sitemap.
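Putting it all together, a minimal WordPress robots.txt file might look like this (the domain is a placeholder, and the rules are a common starting point, not a universal recommendation):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomainname.com/sitemap_index.xml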

Now the only thing left to learn is how to write the paths for your Disallow and Allow instructions.

You can figure that out with the following steps:

Navigate to your site's cPanel.
There you will find the File Manager.

Inside the File Manager, you will see your files.
Enter the public_html directory.


There you will find all of your files, folders, etc.

From there, you can work out how to block directories.

If you block any folder, you are blocking everything in it.

Say I want to block the WordPress admin section, since search engines have no use for it.

I can block my wp-admin folder.


Here's how I do that in the robots.txt file:

User-agent: *
Disallow: /wp-admin/

If I want to allow something inside a blocked folder, I can do it like this:

User-agent: *
Disallow: /wp-content/
Allow: /wp-content/uploads/

If you want to block a specific subfolder or file inside a folder, you can do that too:

User-agent: *
Disallow: /wp-content/uploads/

If you don't want to block anything, you can do this:

User-agent: *
Disallow:

But if you put a / after Disallow, like this:

User-agent: *
Disallow: /

then you are disallowing your entire site, and search engine bots will not index any of your pages or posts.
So be careful while editing, or you may create new problems for yourself.

A nice thing about robots.txt files is that you can see any site's file by adding /robots.txt after its domain.

Example:- domainname.com/robots.txt

You can see my robots.txt file here:

studentcompanion.in/robots.txt

You can even see Google's own robots.txt file at:

google.com/robots.txt

Looking at those files will give you ideas for setting up your own.

After writing your robots.txt file, don't immediately click the Save changes to robots.txt button.

You should check whether your robots.txt file has any errors before saving it.

For that, you can use the robots.txt tester in Google Search Console.

You can find it by navigating to Google Search Console > Crawl > robots.txt Tester.

If you find any text already there, delete it and paste in the version you edited.

Now click the Test button at the bottom.

If it says Allowed with 0 errors and 0 warnings, it's time to save your robots.txt file.


You can save your robots.txt file by clicking the Save changes to robots.txt button in WordPress.


That's it; you have finished creating and setting up your sitemap, meta robots tags, and robots.txt file.

Now search engines have more information about how to index your website.

Important Tip for Better Indexing

Right after you publish content, whether a blog post or a page you want indexed in Google, use the Fetch as Google option in Google Search Console to get it indexed faster.

Here is how.

Navigate to Google Search Console and select Fetch as Google in the Crawl section.

Enter the URL of your page or post, then hit the Fetch and Render button.

It takes a few minutes to complete.

Once the status shows Complete, click the Request Indexing button.


A pop-up will appear. Complete the captcha, select one of the options, and click Go.


Then select the Mobile option and repeat the entire process for the mobile version.


This lets Google know that you have new content to be indexed.

Using internal links from other pages on your site also helps your content get crawled and indexed faster.

I hope this helped you.
So, what extra techniques do you use to get your pages indexed faster?

Any queries?
Ask in the comment section.

Ravi Teja KNTS
I am the admin of Student Companion, and I am passionate about filmmaking and blogging. I started Student Companion to help others start blogging and make a living.