Experience Sitecore ! | All posts tagged 'sitemap'

Experience Sitecore !

More than 200 articles about the best DXP by Martin Miles

Sitecore Edge considerations for sitemap

A quick one today. We recently came across interesting thoughts and concerns about using Sitecore Edge. As you might know (for example from my previous post), there are no more CD servers when publishing to Sitecore Edge - think of that as just a GraphQL endpoint serving out json.

So, how do we implement a sitemap.xml in such a case? Brainstorming brought several approaches to consider:

Approach one

  • Create a custom sitemap NextJS route
  • Use GraphQL to query Edge using the search query. Here we would have to iterate through items in increments of 10
  • Cache the result on Vercel side using SSG

Approach two

  • Create a service from CM side that will return all published items/urls
  • This service will only be accessible by Azure function which will generate a sitemap file and store it in CDN
  • Front-end would then in this case access this file and render the content of it (or similar)

Approach three

  • Generate all the sitemaps (if more than a single sitemap) on CM, then store them all in single text fields
  • Returned them via edge, using GraphQL the font-end head which handles sitemap.xml

Then I realized, there is SXA Headless boasts SEO features OOB, including sitemap.xml. Let's take a look at what they do in order to generate sitemaps.

With 10.3 of SXA, the team has revised the Sitemap feature providing much more flexibility to cover as many use cases as only possible. Looking at /Sitecore/Content/Tenant/Site/Settings/Sitemap item you'll find lots of settings for fine-tuning your sitemaps depending on your particular needs. CM crawls websites and generates sitemaps. Then they get published to Sitecore Edge as a blob and then it gets proxied by a Rendering Host via GraphQL. When search engines request sitemaps of a particular website, Rendering Host gives them exactly what has been asked. That is actually similar to the above approach three with all the invalidation and updates of sitemaps provided also OOB.

This gives out a good amount of options, depending on your particular scenario.

Creating XML Sitemap for the Helix solution

I am working on a solution that already has HTML sitemap as a part of Navigation feature. Now I got a request to add also a basic XML sitemap with common set requirements. Habitat ships with an interface template _Navigable, so let's extend this template by adding a checkbox field called

ShowInSitemap, stating whether a particular page will be shown in that sitemap:


In order to start, we need to create a handler. Having handlers in web.config is not the desired way of doing things, it will require also doing configuration transform for the deployments, so let's do things in a Sitecore way (Feature.Navigation.config file):

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
    <sitecore>
        <pipelines>
            <httpRequestBegin>
                <processor type="Platform.Feature.Navigation.Pipelines.SitemapHandler, Platform.Feature.Navigation"
                           patch:before="processor[@type='Sitecore.Pipelines.HttpRequest.CustomHandlers, Sitecore.Kernel']">
                </processor>
            </httpRequestBegin>
            <preprocessRequest>
                <processor type="Sitecore.Pipelines.PreprocessRequest.FilterUrlExtensions, Sitecore.Kernel">
                    <param desc="Allowed extensions">aspx, ashx, asmx, xml</param>
                </processor>
            </preprocessRequest>
        </pipelines>
    </sitecore>
</configuration>

We rely on httpRequestBegin pipeline and incline our new SitemapHandler from Navigation feature right before CustomHandlers processor.

SitemapHandler is an ordinary pipeline processor for httpRequestBegin pipeline, so is inherited from HttpRequestProcessor:

    public class SitemapHandler : HttpRequestProcessor
    {
        const string sitemapHandler = "sitemap.xml";

        private readonly INavigationRepository _navigationRepository;

        public SitemapHandler()
        {
            _navigationRepository = new NavigationRepository(RootItem);
        }

        public override void Process(HttpRequestArgs args)
        {
            if (Context.Site == null 
                || args == null
                || string.IsNullOrEmpty(Context.Site.RootPath.Trim()) 
                || Context.Page.FilePath.Length > 0 
                || !args.Url.FilePath.Contains(sitemapHandler))
            {
                return;
            }

            Response.ClearHeaders();
            Response.ClearContent();
            Response.ContentType = "text/xml";

            try
            {
                var navigationItems = _navigationRepository.GetSitemapItems(RootItem);
                string xml = new XmlSitemapService().BuildSitemapXML(flatItems);

                Response.Write(xml);
            }
            finally
            {
                Response.Flush();
                Response.End();
            }
        }

        private Item RootItem => Context.Site.GetRootItem();

        private HttpResponse Response => HttpContext.Current.Response;
    }

And XmlSitemapService code below:

    public class XmlSitemapService
    {
        public string CreateSitemapXml(IEnumerable<NavigationItem> items)
        {
            var doc = new XmlDocument();

            var declarationNode = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
            doc.AppendChild(declarationNode);

            var urlsetNode = doc.CreateElement("urlset");

            var xmlnsAttr = doc.CreateAttribute("xmlns");
            xmlnsAttr.Value = "http://www.sitemaps.org/schemas/sitemap/0.9";
            urlsetNode.Attributes.Append(xmlnsAttr);
            doc.AppendChild(urlsetNode);

            foreach (NavigationItem itm in items)
            {
                doc = CreateSitemapRecord(doc, itm);
            }
            return doc.OuterXml;
        }

        private XmlDocument CreateSitemapRecord(XmlDocument doc, NavigationItem item)
        {
            string link = item.Url;

            string lastModified = HttpUtility
             .HtmlEncode(item.Item.Statistics.Updated.ToString("yyyy-MM-ddTHH:mm:sszzz"));

            XmlNode urlsetNode = doc.LastChild;

            XmlNode url = doc.CreateElement("url");
            urlsetNode.AppendChild(url);

            XmlNode loc = doc.CreateElement("loc");
            url.AppendChild(loc);
            loc.AppendChild(doc.CreateTextNode(link));

            XmlNode lastmod = doc.CreateElement("lastmod");
            url.AppendChild(lastmod);
            lastmod.AppendChild(doc.CreateTextNode(lastModified));

            return doc;
        }
    }
Also, NavigationItem is a custom POCO:
 
public class NavigationItem
{
    public Item Item { get; set; }
    public string Title { get; set; }
    public string Url { get; set; }
    public bool IsActive { get; set; }
    public int Level { get; set; }
    public NavigationItems Children { get; set; }
    public string Target { get; set; }
    public bool ShowChildren { get; set; }
}


Few things to mention.
1. Since you are using LinkManager in order to generate the links, you need to make sure you have full URL path as required by protocol, not the site-root-relative path. So you'll need to pass custom options in that case:

2. Once deployed to production, you may face an unpleasant behavior of HTTPS links generated along with 443 port number (such as . That is thanks to LinkManager not being wise enough to predict such a case. However there is a setting that make LinkManager works as expected. Not obvious
var options = LinkManager.GetDefaultUrlOptions();
options.AlwaysIncludeServerUrl = true;
options.SiteResolving = true;
LinkManager.GetItemUrl(item, options);

or better option in Heliix to rely on Sitecore.Foundation.SitecoreExtensions:

item.Url(options) from


//TODO: Update the code with the recent



That's it!