All About the sitemap.xml File in SEO
In my previous article on on-page SEO, I mentioned the sitemap.xml and sitemap.html files.
In this article, I am going to cover those topics in a little more depth.
What is sitemap.xml?
A sitemap.xml is a text file written in XML (Extensible Markup Language) that lists the URLs of your website along with some details about each one.
These details include when the URL was last modified, how frequently it changes, and its priority relative to your other URLs.
The purpose of the sitemap.xml file is not to improve your site’s ranking, but to increase the chances that your web pages are crawled and indexed.
It is especially useful for websites with a weak internal linking strategy.
If your website contains pages with no internal links pointing to them, or you have added a new page or removed an old one, the sitemap should reflect all of this information.
Including a sitemap doesn’t guarantee that your pages will be crawled and indexed, but it increases the chances.
Types of Sitemaps
Google recognizes several kinds of sitemaps, such as HTML, video, image, news and mobile sitemaps.
A typical sitemap file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://www.example.com</loc>
<xhtml:link rel="alternate" hreflang="en" href="https://www.example.com"/>
<xhtml:link rel="alternate" hreflang="fr" href="https://www.example.com/fr"/>
</url>
</urlset>
Tags used in this sitemap:
<urlset> : the root tag that opens the sitemap and references the current protocol standard.
<url> : parent tag for each URL entry. The remaining tags are placed inside this tag.
<loc> : contains the location of the page. It must be an absolute URL of fewer than 2,048 characters.
<lastmod> : the date the URL was last modified, in yyyy-mm-dd format.
<changefreq> : tells the search engine crawler how frequently the page is likely to change. The options are: always, hourly, daily, weekly, monthly, yearly and never. Choose the option honestly and do not try to fool the Google crawler; otherwise, the crawler may ignore it.
<priority> : priority of the URL relative to other URLs on the website. Its value ranges from 0.0 to 1.0. It tells search engine crawlers which pages are most important to you.
It does not affect the ranking of that page, but it signals to the crawler that the page matters more than your other URLs. So, it is better to give higher priority to the pages you change frequently.
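If you prefer to generate the file with a script instead of writing it by hand, the tags above map onto a few lines of Python. This is only a minimal sketch using the standard library. The function name build_sitemap, the page dictionaries, and the example.com URLs and dates are my own placeholders, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for a list of page dictionaries."""
    # Register the sitemap namespace as the default one so tags are not prefixed.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # The remaining tags are optional, so only write the ones that were given.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Placeholder pages, purely for illustration.
pages = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://www.example.com/about", "priority": 0.5},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml.decode("utf-8"))
```

The output is written on a single line, which crawlers read without trouble. A real site would fill the pages list from its CMS or database.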
What is a sitemap index file?
A single sitemap file can contain at most 50,000 URLs and must not exceed 50 MB in size. If you need more than that, you have to create separate sitemaps.
A sitemap index file contains the URLs of the sitemaps on that site.
All the sitemaps should be listed in this index file. You can use gzip to compress a sitemap, but once decompressed it should not be larger than 50 MB.
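As a rough illustration of the gzip rule, the sketch below checks the uncompressed size before compressing. It takes the 50 MB limit as 52,428,800 bytes, the figure given by the sitemap protocol. The function name compress_sitemap and the tiny sample sitemap are made up for this example.

```python
import gzip

# The size limit applies to the sitemap BEFORE compression.
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes


def compress_sitemap(sitemap_xml: bytes) -> bytes:
    """Gzip a sitemap, refusing files that are too large when uncompressed."""
    if len(sitemap_xml) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50 MB uncompressed; split it into several files")
    return gzip.compress(sitemap_xml)


# Placeholder sitemap content, purely for illustration.
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://www.example.com/</loc></url>"
    b"</urlset>"
)
compressed = compress_sitemap(sample)
print(len(sample), "bytes before gzip,", len(compressed), "bytes after")
```

The compressed bytes would typically be saved as sitemap.xml.gz and listed under that name.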
The tags used in sitemap index files are:
<sitemapindex> : the root tag of a sitemap index file; it references the protocol standard.
<sitemap> : parent tag that holds the information about each sitemap.
<loc> : the URL of the sitemap file on your host.
<lastmod> : when the sitemap file was last changed.
A sitemap index file can list at most 50,000 sitemap files.
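To make the limits concrete, here is a small sketch that splits a long list of URLs into groups of 50,000 and writes an index listing one sitemap per group. The helper names chunk_urls and build_sitemap_index, the sitemap-1.xml style file names and the example.com URLs are assumptions made only for this illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into groups small enough for one sitemap each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls, lastmod):
    """Return sitemap index XML (as bytes) listing the given sitemap files."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# 120,000 made-up page URLs do not fit in one sitemap, so they are split up.
all_urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)
sitemap_files = [f"https://www.example.com/sitemap-{i + 1}.xml"
                 for i in range(len(chunks))]
index_xml = build_sitemap_index(sitemap_files, lastmod="2024-01-15")
print(len(chunks), "sitemaps listed in the index")
```

Each group in chunks would then be written out as its own sitemap file under the matching name.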
Where to put the sitemap.xml file?
Ideally, the sitemap should be placed in the root directory, so that all the URLs of the website can be listed in it.
The location of the sitemap determines the set of URLs it can list.
If the sitemap file is placed inside a subdirectory, the URLs listed in it must start with the path of that same directory.
A URL under a different directory, such as http://abcd.com/images/, cannot be listed in it.
Sitemap for subdomain:
Subdomains are used to separate content from the main domain.
Examples are news.google.com, maps.google.com, play.google.com etc.
Google treats each of them as a separate website.
So it is better to create a separate sitemap.xml file for each subdomain.
For example, http://news.abcd.com/sitemap.xml
Remember that all the URLs within a sitemap should be from the same host as the sitemap and use the same protocol.
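The location rules (same protocol, same host, and a path at or below the sitemap's own directory) can be checked with a short function. This is a sketch under my own assumptions: can_list is a hypothetical helper name, and the /blog/ directory is an invented example path on the abcd.com host used above.

```python
import posixpath
from urllib.parse import urlsplit


def can_list(sitemap_url, page_url):
    """Return True if page_url may be listed in the sitemap at sitemap_url."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # Same protocol and same host; a subdomain counts as a different host.
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False
    # The page must sit at or below the directory that holds the sitemap.
    directory = posixpath.dirname(sitemap.path)
    if not directory.endswith("/"):
        directory += "/"
    page_path = page.path or "/"
    return page_path.startswith(directory)


# A sitemap in the root directory can list any path on its own host.
print(can_list("http://abcd.com/sitemap.xml", "http://abcd.com/images/"))
# A sitemap inside /blog/ (invented path) cannot list URLs under /images/.
print(can_list("http://abcd.com/blog/sitemap.xml", "http://abcd.com/images/"))
# URLs on a subdomain belong in that subdomain's own sitemap.
print(can_list("http://abcd.com/sitemap.xml", "http://news.abcd.com/story"))
```

Running a check like this over every URL before writing the file catches entries that search engines would otherwise discard.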
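Sitemap for images:
If your site contains lots of images that you want to rank in Google Images, it is better to create an image sitemap and submit it to Google.
Below is an example of an image sitemap.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url><loc>https://www.example.com/sample.html</loc><image:image><image:loc>https://www.example.com/image.jpg</image:loc></image:image></url>
</urlset>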
Tags used in image sitemap:
<image:image> : encloses all the information about a single image. Up to 1,000 images are allowed per page.
<image:loc> : contains the URL of the image.
<image:caption> : optional tag, the caption of the image.
<image:geo_location> : specifies the geographic location of the image.
<image:title> : the title of the image.
<image:license> : URL pointing to the license of the image.
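An image sitemap can be generated the same way as a regular one, with the image tags placed in their own namespace. The sketch below is only an illustration: build_image_sitemap is a hypothetical function name, the example.com URLs are placeholders, and the limit is applied as 1,000 images per page entry.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_PAGE = 1000


def build_image_sitemap(pages):
    """pages maps each page URL to the list of image URLs shown on that page."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        if len(image_urls) > MAX_IMAGES_PER_PAGE:
            raise ValueError(f"{page_url} lists more than {MAX_IMAGES_PER_PAGE} images")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # One <image:image> element per image, each with its own <image:loc>.
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Placeholder page and image URLs, purely for illustration.
image_xml = build_image_sitemap({
    "https://www.example.com/sample.html": [
        "https://www.example.com/image.jpg",
        "https://www.example.com/photo.jpg",
    ],
})
print(image_xml.decode("utf-8"))
```

Optional tags such as the caption or title could be added inside the same loop in the same manner.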
Sitemap for videos:
There are lots of tags used in a video sitemap. With these tags you can give search engines a lot more information about your videos.
<video:player_loc> – The URL pointing to the player for the video. If your video is embedded on your page, for example from YouTube or Vimeo, you can use this tag instead of <video:content_loc>. You can normally find this URL in the video’s embed code.
<video:duration> – The video’s length in seconds, between 1 and 28800 (8 hours). This isn’t technically required, but Google recommends it.
<video:expiration_date> – Only include this information if your video will not be available after a certain date. If you do use it, put dates in YYYY-MM-DD format, or dates with times in YYYY-MM-DDThh:mm:ss+TZD format.
<video:rating> – The video’s rating. Only values between 0.0 and 5.0 are valid.
<video:view_count> – The number of times the video has been watched.
<video:publication_date> – The date the video was first published, not the date you put it on your site.
<video:family_friendly> – If no, your video will only appear in search results when the user disables SafeSearch. Otherwise, make this yes.
<video:tag> – A very short description of a key concept related to your video. Create a separate <video:tag> element for each tag you use, up to 32 tags.
<video:category> – The broad subject your video covers, such as SEO, Digital Marketing or Advertising.
<video:restriction relationship=allow/deny> – A list of countries where the video cannot play (deny), or a list of the only countries in which users can access the video (allow), depending on the value of the relationship attribute. The list is space-delimited and uses the ISO 3166 country codes. If you don’t use this tag, it will be assumed that your video is available globally.
<video:gallery_loc> – The URL of the collection in which your video appears, if there is one. Each video can have only one gallery_loc tag. If your gallery has a title, you can add the title attribute.
<video:price currency=""> – The price to download the video. The currency= attribute is required and uses the ISO 4217 currency code. Add the optional type= attribute to specify if the download is to own or rent, and resolution= to specify if the video is in HD or SD. You can use this tag multiple times, once for each currency you accept.
<video:requires_subscription> – Allowed values are yes and no, to indicate whether or not a subscription is required to watch the video.
<video:uploader> – The name of the video’s uploader. The optional info attribute holds the URL of a page with more information about the uploader; this URL must be on the same domain as the <loc> tag.
<video:platform relationship=allow/deny> – The platforms (web, mobile and tv) where the video can or cannot be accessed. The relationship= attribute defines whether the list is inclusive or exclusive. You can have only one platform tag per video.
<video:live> – Whether or not the video is a live stream. Only yes or no are valid.
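Several of these tags accept only a limited range of values, so it can help to check them before writing the sitemap. The following is a rough sketch, not an official validator: check_video_entry and the dictionary keys are names I made up, and the duration is taken in seconds (28800 seconds is 8 hours).

```python
def check_video_entry(video):
    """Return a list of problems found in a dictionary of video sitemap values."""
    problems = []
    duration = video.get("duration")
    if duration is not None and not 1 <= duration <= 28800:
        problems.append("duration must be between 1 and 28800 seconds (8 hours)")
    rating = video.get("rating")
    if rating is not None and not 0.0 <= rating <= 5.0:
        problems.append("rating must be between 0.0 and 5.0")
    if len(video.get("tags", [])) > 32:
        problems.append("no more than 32 tags are allowed")
    # These three tags only accept the values yes and no.
    for field in ("family_friendly", "requires_subscription", "live"):
        if field in video and video[field] not in ("yes", "no"):
            problems.append(f"{field} must be yes or no")
    return problems


# Made-up entries, purely for illustration.
good = {"duration": 600, "rating": 4.5, "tags": ["seo", "sitemap"],
        "family_friendly": "yes", "live": "no"}
bad = {"duration": 30000, "rating": 7, "live": "maybe"}

print(check_video_entry(good))
for problem in check_video_entry(bad):
    print("-", problem)
```

The first call returns an empty list, while the second reports the duration, the rating and the live value.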
There are lots of online tools and plugins that can help you generate sitemap files without writing any code.
Some of them are given below:
Sitemap Plugins For CMSs:
WordPress: Yoast SEO, All in One SEO
Joomla: Sitemap Generator
Drupal: Project Sitemap
OS Commerce: Sitemap Generator
Sitemap validation tools online:
After creating the sitemap, you need to validate it before uploading it and submitting it to Google.
Some of the sitemap validators are: