A quick one today. We recently came across some interesting thoughts and concerns about using Sitecore Edge. As you might know (for example, from my previous post), there are no more CD servers when publishing to Sitecore Edge - think of it as just a GraphQL endpoint serving JSON.
So, how do we implement a sitemap.xml in such a case? Brainstorming brought several approaches to consider:
Approach one
- Create a custom sitemap NextJS route
- Use GraphQL to query Edge using the search query. Here we would have to iterate through items in increments of 10
- Cache the result on the Vercel side using SSG
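Approach one could be sketched roughly like this. The actual Edge query is left behind an injected `fetchPage` function, because the exact query and field names depend on your schema; `PageResult`, `collectUrls` and `buildSitemap` are names made up for this illustration, and the page size of 10 just mirrors the increment mentioned above. The Next.js route wiring and SSG caching are left out.

```typescript
// Hypothetical shape of one page of results; real Edge field names may differ.
interface PageResult {
  urls: string[];
  endCursor: string | null;
  hasNext: boolean;
}

// Injected page fetcher, so the sketch does not assume a specific GraphQL client.
type FetchPage = (after: string | null, pageSize: number) => Promise<PageResult>;

// Follow the cursor until the endpoint reports there are no more pages.
async function collectUrls(fetchPage: FetchPage, pageSize = 10): Promise<string[]> {
  const all: string[] = [];
  let cursor: string | null = null;
  let hasNext = true;
  while (hasNext) {
    const page: PageResult = await fetchPage(cursor, pageSize);
    all.push(...page.urls);
    cursor = page.endCursor;
    hasNext = page.hasNext && cursor !== null;
  }
  return all;
}

// Escape the five characters that are reserved in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document in the sitemaps.org 0.9 format from relative paths.
function buildSitemap(baseUrl: string, paths: string[]): string {
  const entries = paths
    .map((path) => `  <url><loc>${escapeXml(baseUrl + path)}</loc></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</urlset>",
  ].join("\n");
}
```

In a real route you would call `collectUrls` with a function that runs the search query against Edge, then return the `buildSitemap` output with an XML content type.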
Approach two
- Create a service on the CM side that will return all published items/URLs
- This service will only be accessible by an Azure function, which will generate a sitemap file and store it in a CDN
- The front end would then access this file and render its content (or similar)
Approach three
- Generate all the sitemaps (if more than a single sitemap) on CM, then store them all in single text fields
- Return them via Edge, using GraphQL, to the front-end head, which handles sitemap.xml
Then I realized that SXA Headless boasts SEO features OOB, including sitemap.xml. Let's take a look at what it does in order to generate sitemaps.
With SXA 10.3, the team has revised the Sitemap feature, providing much more flexibility to cover as many use cases as possible. Looking at the /Sitecore/Content/Tenant/Site/Settings/Sitemap item, you'll find lots of settings for fine-tuning your sitemaps depending on your particular needs. CM crawls websites and generates sitemaps. They then get published to Sitecore Edge as a blob, which the Rendering Host proxies via GraphQL. When search engines request the sitemaps of a particular website, the Rendering Host gives them exactly what was asked for. That is actually similar to approach three above, with all the invalidation and updates of sitemaps also provided OOB.
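To make the proxying idea a bit more concrete, here is a minimal sketch of what a rendering host handler might do. It is not SXA's actual implementation: `SitemapLookup`, `serveSitemap` and the `sitemap-N.xml` naming are assumptions for illustration, and the lookup function stands in for whatever GraphQL call returns the published sitemap content from Edge.

```typescript
// Hypothetical lookup: returns the published sitemap XML for a site and file name,
// or null when nothing is stored under that name.
type SitemapLookup = (site: string, fileName: string) => Promise<string | null>;

interface ProxyResponse {
  status: number;
  contentType: string;
  body: string;
}

// Accept only sitemap.xml or numbered variants such as sitemap-2.xml (assumed naming).
const SITEMAP_FILE = /^sitemap(-\d+)?\.xml$/;

// Resolve a requested path to stored sitemap content and shape the HTTP response.
async function serveSitemap(
  site: string,
  requestPath: string,
  lookup: SitemapLookup
): Promise<ProxyResponse> {
  const notFound: ProxyResponse = { status: 404, contentType: "text/plain", body: "Not found" };
  const fileName = requestPath.replace(/^\/+/, "");
  if (!SITEMAP_FILE.test(fileName)) {
    return notFound;
  }
  const xml = await lookup(site, fileName);
  if (xml === null) {
    return notFound;
  }
  return { status: 200, contentType: "application/xml", body: xml };
}
```

The point of keeping the lookup injected is that the handler only decides what was asked for and how to answer; where the content comes from stays a detail of the Edge query.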
This gives you a good number of options to choose from, depending on your particular scenario.