Sitecore XM Cloud represents a
monumental shift in how we build and deliver digital experiences. The promise of a fully managed,
composable DXP is incredibly compelling, and for good reason. However, after seeing numerous projects in
the wild and talking with fellow developers and architects, I've noticed a pattern of common,
time-consuming, and often costly mistakes. These aren't the simple "RTFM" errors; they are the insidious
problems that stem from underestimating the fundamental paradigm shift from traditional Sitecore XP to a
headless, SaaS world.
I've spent the last couple of years
deep in the trenches of XM Cloud, and I've seen what works and what causes projects to grind to a halt.
In this post, I’m going to walk you through the top 10 mistakes I see organizations making time and time
again. My goal is to save you the headaches, the late nights, and the budget overruns. Let's take a look
at them.
1. Migrating Information Architecture Without Restructuring First
This is, without a doubt, the single biggest mistake I see teams make, and it poisons the well for the
entire project. It seems logical to start by moving your content, but on XM Cloud, that's a recipe for
disaster. Failing to address Information Architecture first is the root of most of the downstream chaos.
In the world of Sitecore XP, we had
a lot of flexibility. You could have mixed templates in content folders, no clear separation between
page structure and data items, and site definitions buried deep in the content tree. XM Cloud, on the
other hand, is ruthless in its demand for a clean, predictable structure. It expects a strict hierarchy:
Tenant → Site → Content.
Why It Becomes a Nightmare
When you try to push a legacy XP content tree into XM Cloud, the entire system starts to break in subtle and
frustrating ways:
- Serialization Fails: The dotnet sitecore ser push command will throw errors about non-unique
paths or other inconsistencies that were perfectly valid in your old instance.
- Component Bindings Break: Pages might render, but components won't bind to their datasources
correctly because the expected folder structures don't exist.
- Pages Editor is Unusable: Content authors will complain about missing fields, broken
placeholders, and a generally unusable editing experience.
I cannot stress this enough: before you migrate a single piece
of content, you must first migrate your
IA. This means creating a clean, XM Cloud-native IA in your new project, and then mapping your old
content to that new structure. It feels like extra work upfront, but it will save you weeks of
painful debugging down the line.
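To make the "map old content to the new structure" step concrete, here is a minimal sketch of how I would plan it before touching any content. It assumes a hypothetical list of prefix rules; the rule shape, the function name, and every path in the usage example are illustrative, not part of any Sitecore tooling.

```typescript
// Hypothetical mapping rule: a legacy path prefix and its new home in the
// Tenant → Site → Content structure. All paths are illustrative.
interface PathRule {
  legacyPrefix: string;
  targetPrefix: string;
}

interface MappingPlan {
  mapped: { from: string; to: string }[];
  unmapped: string[]; // items that still need an IA decision
}

function planIaMapping(legacyPaths: string[], rules: PathRule[]): MappingPlan {
  const plan: MappingPlan = { mapped: [], unmapped: [] };
  // Prefer the most specific (longest) prefix when several rules match.
  const ordered = rules
    .slice()
    .sort((a, b) => b.legacyPrefix.length - a.legacyPrefix.length);

  legacyPaths.forEach((path) => {
    const rule = ordered.find(
      (r) => path === r.legacyPrefix || path.indexOf(r.legacyPrefix + "/") === 0
    );
    if (rule) {
      const rest = path.slice(rule.legacyPrefix.length);
      plan.mapped.push({ from: path, to: rule.targetPrefix + rest });
    } else {
      plan.unmapped.push(path);
    }
  });
  return plan;
}

// Usage: anything left in `unmapped` is an IA question to answer first.
const iaPlan = planIaMapping(
  ["/sitecore/content/Home/About", "/sitecore/content/Old/Thing"],
  [{ legacyPrefix: "/sitecore/content/Home", targetPrefix: "/sitecore/content/MyTenant/MySite/Home" }]
);
console.log(iaPlan.unmapped);
```

The useful output is the unmapped list: every entry in it is an item your new IA has no home for yet.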
2. Ignoring Experience Edge Rate Limits and Caching Architecture
Once you get past the initial IA migration, the next major pitfall I see is a fundamental
misunderstanding of Experience Edge. We get so excited about the idea of a globally replicated,
high-performance content delivery network that we forget it operates under specific rules and
limitations. As I covered in one of my own posts, you absolutely must know these limitations before
you start building.
Experience Edge is not a magic black box. It has hard limits that can and
will break your application in production if you don't design for them from day one. Here are the
critical ones:
| Limit Type | Constraint | Impact of Ignoring |
|---|---|---|
| API Rate Limit | 80 requests/second | HTTP 429 errors, service unavailability |
| GraphQL Query Results | 1,000 items per query | Incomplete data; requires pagination |
| GraphQL Query Complexity | ~250 (undocumented) | Queries fail with complexity errors |
| Default Cache TTL | 4 hours | Stale content for up to 4 hours |
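The 1,000-item cap means any query that can return more than one page needs a pagination loop. Below is a minimal sketch of that loop. The fetchPage callback is hypothetical (it stands in for whatever function runs your GraphQL query), and the hasNext / endCursor field names follow the common pageInfo pattern, so treat them as assumptions and adjust to your actual query.

```typescript
// Shape of one page of results. Field names are assumptions that mirror
// the usual `pageInfo { hasNext endCursor }` pattern.
interface Page<T> {
  items: T[];
  hasNext: boolean;
  endCursor: string | null;
}

// `fetchPage` is a hypothetical callback that runs one GraphQL query
// with the given cursor and returns a single page.
async function fetchAllItems<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
  maxPages = 100 // safety cap so a bad cursor cannot loop forever
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.items);
    if (!page.hasNext) return all;
    cursor = page.endCursor;
  }
  throw new Error(`Stopped after ${maxPages} pages; results may be incomplete`);
}
```

Every page is a separate request, so a loop like this counts against the rate limit too. It belongs in build-time or cached code paths, not in something that runs on every page view.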
The Architectural Imperative: Cache Everything
The 80 requests/second limit seems generous, but it's a cap on
uncached requests. A
single user visiting a server-side rendered (SSR) page with
multiple components could generate a dozen or more GraphQL queries. In a high-traffic scenario, you
will hit that limit almost instantly. This is why Sitecore's guidance, and my own experience,
dictate that you must architect your solution with a
cache-first mindset.
This means leveraging Static Site Generation (SSG) and Incremental Static Regeneration (ISR)
wherever possible.
Trying to retrofit a caching strategy onto a chatty, SSR-heavy application
after it's already built is a nightmare. You must plan your component rendering strategies from the
beginning. Ask yourself for every component: "Can this be statically rendered? Does it need to be
server-side rendered? Can we use client-side fetching for dynamic elements?" Ignoring these questions
is a direct path to production performance issues and emergency redesigns.
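A quick back-of-the-envelope calculation shows why the cache-first mindset matters. The sketch below uses made-up traffic numbers (10 page views per second, 12 queries per page, a 95% cache hit ratio); only the 80 requests/second figure comes from the limits above.

```typescript
// Rough estimate of uncached Experience Edge requests per second.
// All inputs are illustrative assumptions, not measured values.
function estimateEdgeRequestsPerSecond(
  pageViewsPerSecond: number,
  queriesPerPage: number,
  cacheHitRatio: number // 0 = nothing cached, 1 = everything cached
): number {
  return pageViewsPerSecond * queriesPerPage * (1 - cacheHitRatio);
}

const EDGE_RATE_LIMIT = 80; // requests/second, from the limits listed above

// Pure SSR with no caching: 10 page views/s x 12 queries = 120 requests/s.
const ssrOnly = estimateEdgeRequestsPerSecond(10, 12, 0);

// Same traffic with 95% served from static/ISR output: roughly 6 requests/s.
const mostlyStatic = estimateEdgeRequestsPerSecond(10, 12, 0.95);

console.log(ssrOnly > EDGE_RATE_LIMIT, mostlyStatic > EDGE_RATE_LIMIT);
```

Ten page views per second is modest traffic, and pure SSR is already over the limit at that level.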
3. Underestimating Non-SXA to Headless SXA Migration Complexity
This is a big one, and it often comes as a nasty surprise to teams migrating from older, non-SXA Sitecore
XP instances. The assumption is that you can just lift your existing components and content
structure and somehow make them work in a headless fashion. The reality is that
XM Cloud expects Headless SXA as its baseline. If
your legacy solution isn't built on SXA, you are not performing a simple migration; you are
performing a complete architectural rebuild.
I've seen teams budget for a straightforward content migration only to discover that their entire
presentation layer is fundamentally incompatible. Migrating from a non-SXA site can add an
extra month or two to a project easily, depending on the complexity.
Why It's a Rebuild, Not a Migration
Headless SXA enforces a strict, convention-based approach to site structure
and presentation that simply doesn't exist in non-SXA builds. Here’s what you’re actually signing up
for:
- Rebuilding Structure: You must manually recreate your site architecture using SXA
conventions (Tenant → Site → Page Branches → Data).
- Rebuilding Layouts: All of your existing layouts must be rebuilt as Page Designs.
- Rebuilding Shared Components: Headers, footers, and other shared elements must be recreated
as Partial Designs.
- Converting All Renderings: Every single one of your renderings must be converted into a
Headless SXA-compatible JSS component.
- Remapping Placeholders: Your old custom placeholders won't work as-is. The placeholder
mapping in XM Cloud is far more rigid and requires a complete overhaul.
Failing to account for this massive effort is one of the fastest ways to
blow your budget and timeline. If you are coming from a non-SXA background, you must treat the
project as a replatforming exercise, not a simple upgrade.
4. Assuming MVC Backend Customizations Can Be Migrated Directly
For
years, the power of Sitecore development lay in its extensible backend. We built custom
renderField pipelines,
hooked into countless processors, and used dependency injection
to create powerful, server-side logic. In XM Cloud, that world is gone. I’ve seen teams spend weeks
trying to figure out how to migrate their complex backend code, only to realize that it’s a futile
effort. If your old
solution contains backend customization, you must find a way to implement it in the head
application.
This is a fundamental paradigm shift that many seasoned Sitecore developers struggle with. XM Cloud is a
headless-first platform, which means your .NET code is running in a black box, completely decoupled
from the rendering host. There is no way to directly influence the rendered output from the
backend.
The Headless Mindset Shift
Here’s what this means in practice:
- renderField Pipelines are Obsolete: Any logic that modifies field rendering at request time must
be moved to your Next.js components.
- Controller Logic Must Be Refactored: Your MVC controller actions that process data or modify
component output must be re-implemented as stateless services or APIs that your Next.js
application can call.
- Dependency Injection is Different: Your custom services and logic can't be injected into
rendering pipelines anymore. A great alternative I've seen work well is using PageProps factory
plugins in the Next.js application to fetch and inject data into your components.
Trying to shoehorn your old MVC patterns into XM Cloud will lead to nothing
but frustration. You have to embrace the headless mindset and move your logic to where it now
belongs: the head application.
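As an illustration of what "move it to the head" can look like for renderField-style logic, here is a minimal sketch: a chain of small string transforms that a component would run on a field value before rendering it. The two transforms are hypothetical stand-ins for whatever your legacy processors did, and none of these names come from JSS or Next.js.

```typescript
// A field transform is the head-side stand-in for one renderField processor.
type FieldTransform = (value: string) => string;

// Hypothetical example: a legacy processor that marked external links.
const markExternalLinks: FieldTransform = (value) =>
  value.replace(
    /<a href="(https?:\/\/[^"]+)"/g,
    '<a href="$1" rel="noopener" target="_blank"'
  );

// Hypothetical example: a legacy processor that filled in a {year} placeholder.
const replaceYear = (year: number): FieldTransform => (value) =>
  value.split("{year}").join(String(year));

// Runs the transforms in order, like processors in a pipeline.
function applyFieldTransforms(value: string, transforms: FieldTransform[]): string {
  return transforms.reduce((current, transform) => transform(current), value);
}

// Usage: call this in the component, on the raw field value, before rendering.
const footerHtml = applyFieldTransforms(
  '<a href="https://example.com">Partner</a> © {year}',
  [markExternalLinks, replaceYear(2025)]
);
console.log(footerHtml);
```

Because the transforms are plain functions, they can be unit-tested without a Sitecore instance, which is one upside of losing the pipeline.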
5. Not Understanding Next.js Routing and Configuration Fragility
One of the most common cries for help I see on community channels is: "I just set up my project, and
every page is a 404!" This is almost always the result of underestimating the sheer fragility of the
Next.js routing and configuration setup in an XM Cloud project. A large number of moving parts all have
to work correctly before a single page will display.
This
isn't a single point of failure; it's a dozen small, interconnected dependencies that can break the
entire application. It’s a classic “needle in a haystack” problem that can burn hours, if not days,
of a developer's time.
The House of Cards: Common Failure Points
If you're hitting routing issues, here’s a checklist of the most likely culprits I've seen:
- The [[...path]].tsx file: This is the heart of Sitecore's catch-all routing. If this
file is accidentally renamed, if a merge conflict corrupts it, or if the special bracket
characters are wrong, all your Sitecore-managed routes will fail.
- Site Name Mismatches: The siteName in your .env files must perfectly match the site name
configured in Sitecore under /sitecore/content/MyTenant/MySite/Settings/Site Grouping/MySite.
A tiny typo will break everything.
- Environment Variable Loading: Remember that Next.js loads environment files in a specific
order (scjssconfig.json → .env → .env.local). A misconfigured variable in one file can
silently override the correct value in another.
- Layout Service Failures: The [[...path]].tsx file relies on a successful response from the
Layout Service. If your API key is wrong, your JSS Editing Secret doesn't match, or the
rendering host items at /sitecore/system/Settings/Services/Rendering Hosts are misconfigured,
the Layout Service will fail, and your app will render a 404.
- Special Folder Names: Next.js has special folder names like pages and app. If a git
merge accidentally creates an empty folder with one of these names in your rendering host's
file system, it can confuse the Next.js router into thinking no routes exist.
Troubleshooting this requires a methodical approach. You have to check
every single one of these connection points, from your local environment files all the way to your
CM instance configuration. There are no shortcuts.
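Part of that checklist can be turned into a small script. The sketch below covers only three of the checks (silent overrides between configuration files, the site name match, and the catch-all file name). It takes its inputs as plain values, so how you read the files and look up the Sitecore site name is left to you. The variable name holding the site name differs between starter versions, which is why it is a parameter here. Everything in this block is illustrative.

```typescript
interface EnvLayer {
  file: string;
  values: { [key: string]: string };
}

function checkRoutingConfig(
  layers: EnvLayer[],         // lowest priority first, e.g. .env then .env.local
  siteNameKey: string,        // variable that holds the site name (assumption)
  siteNameInSitecore: string, // value configured in the Site Grouping item
  routeFileName: string       // actual catch-all file name found on disk
): string[] {
  const problems: string[] = [];
  const effective: { [key: string]: { value: string; file: string } } = {};

  // Later layers win; report every override that changes a value.
  layers.forEach((layer) => {
    Object.keys(layer.values).forEach((key) => {
      const previous = effective[key];
      if (previous && previous.value !== layer.values[key]) {
        problems.push(`${key}: ${layer.file} overrides ${previous.file}`);
      }
      effective[key] = { value: layer.values[key], file: layer.file };
    });
  });

  // The comparison is exact on purpose: case differences count as a mismatch.
  const siteName = effective[siteNameKey];
  if (!siteName) {
    problems.push(`${siteNameKey} is not set in any file`);
  } else if (siteName.value !== siteNameInSitecore) {
    problems.push(`${siteNameKey} is "${siteName.value}" but Sitecore has "${siteNameInSitecore}"`);
  }

  if (routeFileName !== "[[...path]].tsx") {
    problems.push(`catch-all route file is "${routeFileName}", expected "[[...path]].tsx"`);
  }
  return problems;
}
```

An empty result does not prove routing works, since the Layout Service and rendering host checks still need a running CM instance. It does rule out the cheapest mistakes first.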
6. Overlooking Serialization Duplicate Item Issues
Serialization is the backbone of modern Sitecore development, but it has
its own set of quirks that can bring a project to a standstill. One of the most frustrating issues
I’ve seen is the "Non-unique paths cannot be serialized" error. This problem, as detailed by Brad
Fettes, is both familiar and maddeningly tedious to resolve.
This error typically occurs when you have two items with the same name under the same parent item. While
Sitecore’s database is perfectly happy with this (since it uses GUIDs for identity), the file system
is not. When the serialization process tries to create two YAML files with the exact same name in
the same directory, it fails. The most common culprit? Duplicate __Standard Values items.
How the Trap is Set
This usually happens when developers are working in parallel on different environments. Here’s a typical
scenario:
- Developer A creates a new template and its standard values on
their local machine and serializes them.
- Before Developer A commits and pushes their changes, Developer
B creates the same template and standard values on a shared development
environment.
- Developer A pushes their code. Now, the serialized YAML for the
standard values exists in the repository.
- When you try to pull these changes and push them to the shared
environment, the serialization engine sees a conflict: the GUID from the YAML file is
different from the GUID of the item that already exists in the Sitecore database, even
though the path is identical. The result: "Non-unique paths cannot be serialized."
The Painful Manual Fix
The error message suggests renaming the item, but that’s not an option for __Standard Values. The
only way to fix this is a manual, repetitive process:
- Copy the second GUID from the error message.
- Paste it into the search bar in the Content Editor of the target environment to find the offending item.
- Manually delete the duplicate item.
- Re-run the dotnet sitecore ser push command.
- Repeat the process if you get another error for a different item.
This
is a huge time sink, and it highlights the fragility of a file-based serialization strategy when
multiple developers are making content changes. It’s a stark reminder that in the XM Cloud world, a
disciplined Git workflow and clear communication about who is creating which items are more critical
than ever.
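If you can get a list of item IDs and paths from both sides (the serialized YAML and the target environment), you can spot these collisions before the push fails one item at a time. A minimal sketch follows, assuming you have already collected that list somehow; the ItemRef shape is hypothetical, and the case-insensitive comparison is a conservative assumption on my part.

```typescript
interface ItemRef {
  id: string;   // item GUID
  path: string; // full Sitecore path
}

// Returns every path that is claimed by more than one distinct GUID.
// Paths are compared case-insensitively (an assumption) and are
// reported in lowercase.
function findDuplicatePaths(items: ItemRef[]): { path: string; ids: string[] }[] {
  const byPath: { [path: string]: string[] } = {};
  items.forEach((item) => {
    const key = item.path.toLowerCase();
    const ids = byPath[key] || (byPath[key] = []);
    // The same GUID seen twice (repo and environment) is not a conflict.
    if (ids.indexOf(item.id) === -1) ids.push(item.id);
  });
  return Object.keys(byPath)
    .filter((path) => byPath[path].length > 1)
    .map((path) => ({ path, ids: byPath[path] }));
}
```

Feed it the combined list from the repository and the environment. Any entry it returns is a pair of items that will trigger the error on the next push.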
7. Trusting Undocumented Platform Behavior
One of the hardest lessons to learn when moving to a SaaS platform like XM Cloud is that you are no
longer in control of the underlying infrastructure. Things can and will change without notice. A
painful example of this surfaced in mid-2024, when deployments across numerous projects began
failing with mysterious compilation errors. The cause was an unannounced, breaking change in how the
XM Cloud build process consumed the xmcloud.build.json file.
For months, the build process had seemingly ignored the
buildTargets property, defaulting to using the solution
(.sln) file. Then, one day, Sitecore deployed a change that strictly enforced
this property. Teams that had (logically) pointed this to their specific project
(.csproj) file
suddenly found their deployments failing because the new build process couldn't restore NuGet packages correctly. The error messages about missing assemblies were
completely misleading, sending developers on a wild goose chase for a problem that wasn't in their code.
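A tiny guard in your own CI can at least flag the configuration that bit people in this incident. The sketch below assumes you have already read the buildTargets values out of xmcloud.build.json as a list of path strings. That shape is my assumption, and so is the premise that solution files remain the safe choice, so check both against the current documentation.

```typescript
// Returns the build targets that do not point at a solution (.sln) file.
// `targets` is assumed to be the list of paths from the buildTargets property.
function findNonSolutionTargets(targets: string[]): string[] {
  return targets.filter((target) => !/\.sln$/i.test(target.trim()));
}

// Usage with illustrative paths: the .csproj entry is reported.
const suspiciousTargets = findNonSolutionTargets([
  "./src/platform/Platform.csproj",
  "./MySolution.sln",
]);
console.log(suspiciousTargets);
```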
The Sobering Reality of SaaS
This incident is a perfect illustration of a new class of problems we face with XM
Cloud:
- Undocumented Breaking Changes:
The platform is constantly evolving, and not every change is going
to be announced in advance. What worked yesterday might be broken today for reasons entirely
outside your control.
- Misleading Error Messages:
The errors you see are often symptoms of a deeper platform
issue, not a problem with your own code, which makes troubleshooting incredibly
difficult.
- Reliance on Support: The only way this issue was diagnosed was through a Sitecore
support ticket. You must be prepared to engage with support and provide detailed logs to
resolve these kinds of problems.
My advice: when a previously working process suddenly breaks for no apparent reason, and your local
builds are fine, your first suspect should be an unannounced platform change. Document your
troubleshooting steps meticulously and open a support ticket sooner rather than later. Don't waste
days debugging your own code when the platform itself may have shifted under your feet.
8. Neglecting Local Development Environment Complexity
One
of the great promises of XM Cloud was a simplified local development setup. While it has improved in
many ways, I’ve found that many teams underestimate the complexity and fragility of the Docker-based
environment. It is far from a "plug and play" experience and is a significant source of friction,
especially for developers new to the project or to Docker itself.
The
official Sitecore documentation provides a good starting point for troubleshooting, but it also
reveals just how many things can go wrong.
These aren't edge cases; they are common, everyday problems that can cost your team hours of
productivity.
The Daily Hurdles of Local Development
Here are some of the recurring issues that turn local setup into a constant battle:
- The down.ps1 Ritual: If you shut down your machine without running the down.ps1
script, your containers are left in an exited state. The next time you run up.ps1, it
will fail. You have to remember to run the cleanup script every single time. It’s a small
thing, but it’s a constant papercut.
- Corporate Network and Firewall Policies: This is a huge one. I’ve seen countless hours lost
because a company's security policies block communication between containers or prevent access to
the Docker DNS. The solutions often involve switching to hyperv isolation or getting firewall
rules changed, which can be a bureaucratic nightmare.
- DNS Configuration: Sometimes the CM container will fail to authorize on startup
due to invalid DNS settings. The fix is to manually set the DNS in your Docker Desktop
settings to a public one like 8.8.8.8, but this is not documented in the standard setup guides
and is something you only discover after hours of frustration.
- The "Unhealthy" Container: Seeing your cm container in an "unhealthy" state is a rite of
passage for XM Cloud developers. Debugging it requires you to manually check the Docker logs,
exec into the container, and curl the health check endpoint to figure out what went wrong. It’s
a tedious and opaque process.
Don't treat the local development environment as a given. You need to budget time for setup,
troubleshooting, and creating internal documentation for your specific network environment. A smooth
local setup is not a luxury; it's a prerequisite for a productive team.
The good news is that since the introduction of Metadata Editing Mode in 2025, developers can take a front-end-first approach: scaffold a front-end app and develop it against a remote XM Cloud environment, avoiding the hassle of local Windows-based containers altogether.
9. Missing SXA Feature Parity Gaps
For teams that have been using SXA on-premise for years, there’s a natural assumption that the features
they rely on will be available in XM Cloud. This assumption is often wrong and can lead to
significant unplanned development work. I’ve seen this myself on projects where clients were
extensive users of features like Snippets, Scriban, Content Tokens, or Overlays, only to discover
during migration that these features simply don’t exist
in the current version of Headless SXA.
This leaves development teams with a difficult choice: either abandon the functionality or rebuild it
from scratch in a headless world.
Rebuilding What Was Once Out-of-the-Box
This isn’t a trivial task. Take Content Tokens, for example. To replicate this functionality, you can’t
just write a simple helper function. A proper implementation requires:
- A Custom GraphQL Query: You need to create a GraphQL query to fetch the token data,
similar to how the Dictionary Service works.
- A Custom Service: You need a service that fetches this data and, crucially,
caches it in memory to avoid hammering the Experience Edge API on every page load,
especially during a static build.
- Integration with PageProps: This service must be integrated into the PageProps factory
plugin to make the token data available to all your components.
This is a significant amount of work to replicate a feature that was once a standard part of SXA. It’s a
classic example of the “rebuild, don’t migrate” reality of XM Cloud. You must perform a thorough
audit of all SXA features your existing solution uses and verify their availability in XM Cloud
before committing to a migration plan. Don’t assume feature
parity.
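To give a feel for the caching service and the replacement step, here is a minimal sketch. The loadTokens callback is hypothetical (it stands in for your custom GraphQL query), and the $(name) token syntax is an assumption; change the pattern if your content uses a different one. Wiring this into a PageProps factory plugin is left out because it depends on your starter version.

```typescript
type TokenMap = { [token: string]: string };

// Wraps a loader so the underlying fetch runs at most once per TTL window.
function createCachedTokenLoader(
  loadTokens: () => Promise<TokenMap>, // hypothetical: runs your GraphQL query
  ttlMs: number,
  now: () => number = Date.now
): () => Promise<TokenMap> {
  let cached: Promise<TokenMap> | null = null;
  let expiresAt = 0;
  return () => {
    if (cached && now() < expiresAt) return cached;
    // Cache the promise itself so concurrent callers share one request.
    const pending: Promise<TokenMap> = loadTokens().catch((error) => {
      if (cached === pending) cached = null; // never keep a failed load
      throw error;
    });
    cached = pending;
    expiresAt = now() + ttlMs;
    return pending;
  };
}

// Replaces $(name) tokens; unknown tokens are left untouched.
// The $(name) syntax is an assumption about your content.
function replaceContentTokens(text: string, tokens: TokenMap): string {
  return text.replace(/\$\(([^)]+)\)/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(tokens, name) ? tokens[name] : match
  );
}
```

During a static build this means one token request per TTL window per build process instead of one per page, which is the whole reason for the in-memory cache.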
10. Failing to Plan for Workflow and Multilingual Migration Complexity
Finally, I want to touch on two areas that are often treated as
afterthoughts but are fraught with hidden complexity: workflows and multilingual content. These are
the silent killers of a migration project. The issues don’t show up as significant, loud deployment
errors; they manifest as confusing problems for content authors long after the initial migration is
supposedly “done.”
Because you no longer have direct database access, you can’t just run a SQL
script to fix things when they go wrong. You are entirely at the mercy of your serialization and
import tools.
The Silent Failures
Here’s what often goes wrong:
- Workflows Break Silently: During content import, items can be created without their
original workflow state, or XM Cloud might reject an invalid workflow ID from your old
instance. The result is that editors can’t publish migrated content, and nobody knows
why until they try. Auto-trigger actions also frequently fail due to missing commands.
The only solution is to manually map and reassign workflows to thousands of items after
the import, a tedious and error-prone task.
- Multilingual Content Gets Corrupted: Migrating multilingual sites is exponentially more complex.
I’ve seen shared fields get overwritten by an import from a different language, layout
variations between languages get lost, and media items have inconsistent versioning.
These are incredibly difficult problems to untangle without direct database access.
These issues highlight the need for meticulous planning and, most importantly,
rigorous verification. A migration isn’t successful when the import script finishes. It’s successful when you have clean
verification reports, and your content authors have confirmed that they can edit, publish, and see
their content correctly in all languages.
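A verification report does not have to be fancy. Here is a minimal sketch that compares a snapshot of the source items against a snapshot of the migrated items and lists what went missing. The ItemSnapshot shape is hypothetical, and producing the two snapshots (from an export, an API, or a report) is up to you.

```typescript
interface ItemSnapshot {
  id: string;
  languages: string[];          // language versions present on the item
  workflowState: string | null; // null when the item has no workflow state
}

interface VerificationIssue {
  id: string;
  problem: string;
}

function verifyMigration(source: ItemSnapshot[], target: ItemSnapshot[]): VerificationIssue[] {
  const issues: VerificationIssue[] = [];
  const targetById: { [id: string]: ItemSnapshot } = {};
  target.forEach((item) => {
    targetById[item.id] = item;
  });

  source.forEach((src) => {
    const migrated = targetById[src.id];
    if (!migrated) {
      issues.push({ id: src.id, problem: "missing in target" });
      return;
    }
    src.languages.forEach((lang) => {
      if (migrated.languages.indexOf(lang) === -1) {
        issues.push({ id: src.id, problem: `missing language version: ${lang}` });
      }
    });
    // State IDs may legitimately be remapped, so only flag a lost state.
    if (src.workflowState !== null && migrated.workflowState === null) {
      issues.push({ id: src.id, problem: "workflow state lost" });
    }
  });
  return issues;
}
```

An empty result here is a necessary condition, not proof of success: authors still need to confirm that editing and publishing work in every language.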
Conclusion
The move to Sitecore XM Cloud is an exciting and necessary evolution for the platform. However, it is
not a simple upgrade. It is a fundamental paradigm shift that requires a new way of thinking about
architecture, development, and deployment. The biggest mistakes I’ve seen are not the result of bad
code, but of bad assumptions - assuming that what worked in the on-premise world will work in a
composable, SaaS world.
My advice is to approach your first XM Cloud project with a healthy dose of humility. Assume nothing.
Question everything. Budget time for learning, for troubleshooting, and for the inevitable “unknown
unknowns.” The platform is powerful, but it demands respect for its complexity.
I hope this list helps you avoid some of the pitfalls I’ve seen. If you’re struggling with your XM
Cloud implementation or planning a migration, I’m always happy to share my experiences. Feel free to
drop me a message!