
Experience Sitecore !

More than 200 articles about the best DXP by Martin Miles

LTSC2022 images for Sitecore containers released: what does it mean to me?

Exciting news! Sitecore has kept its original promise and released new ltsc2022 container images for all topologies of both the 10.3 and 10.2 versions of the platform.

The biggest benefits of the new images are reduced size – almost 50% smaller than their ltsc2019 equivalents – and support for running in process isolation mode on Windows 11.

Check it yourself:

So, what does that mean for developers and DevOps?

First and foremost, running Sitecore 10.3 on Windows Server 2022 is now officially supported, so you may consider upgrading your existing solutions to benefit from the Server 2022 runtime.

Developers working on Windows 11 also get the long-awaited support: containers built from the new images can run in process isolation mode without a hypervisor, which brings performance close to bare-metal levels.
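A quick way to sanity-check a host before switching (run in PowerShell; ltsc2022 images target Windows Server 2022 build 20348, and Windows 11 reports build 22000 or higher):

# Host OS build – Windows 11 (build 22000+) can host ltsc2022 containers in process isolation
[System.Environment]::OSVersion.Version

# Default isolation mode currently reported by the Docker engine
docker info --format '{{.Isolation}}'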


Let's try it in action!

I decided to give it a try and see whether it would work, and how well. I recently purchased a new Microsoft Surface Pro 8 laptop which had Windows 11 pre-installed and had therefore been of little use for my professional purposes, so it seemed to be excellent test equipment.

After initial preparation and installing all the prerequisites, I was ready to go. For the codebase, I decided to go with the popular Sitecore Containers Template for JSS Next.js apps and the Sitecore 10.3 XM1 topology, as the most proven and well-preconfigured starter kit.

Since I initialized my codebase with the -Topology XM1 parameter, all the required container configuration is located under the /MyProject/run/sitecore-xm1 folder. We are looking for the .env file, which stores all the necessary parameters.

The main change to make here is setting these two environment variables to benefit from the ltsc2022 images:

SITECORE_VERSION=10.3-ltsc2022
EXTERNAL_IMAGE_TAG_SUFFIX=ltsc2022

The other important change in the .env file is switching to ISOLATION=process. Please note that TRAEFIK_ISOLATION=hyperv stays unchanged due to the lack of ltsc2022 support for Traefik, so sadly you still need Hyper-V installed on this machine. The difference is that it now serves only Traefik; the rest of the Sitecore containers will run in process mode.
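The relevant lines in .env end up looking like this:

ISOLATION=process
TRAEFIK_ISOLATION=hyperv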

I also made a few optional improvements, upgrading important modules to their recent versions:

MANAGEMENT_SERVICES_IMAGE=scr.sitecore.com/sxp/modules/sitecore-management-services-xm1-assets:5.1.25-1809
HEADLESS_SERVICES_IMAGE=scr.sitecore.com/sxp/modules/sitecore-headless-services-xm1-assets:21.0.583-1809

I also changed the Node.js version to reflect the recent LTS release:

NODEJS_VERSION=18.14.1

Please note that sitecore-docker-tools-assets did not change from the previous version of Sitecore (10.2), so I left it untouched.

To recap the key switch: ISOLATION=process is what makes the solution build and run in process isolation mode, changed from its default value. The rest of the .env file was correctly generated for me by the Init.ps1 script.

With all changes complete, let's run .\up.ps1 in an elevated PowerShell terminal and wait until it downloads and builds the images.
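Once the cluster is up, one quick way to confirm the containers are indeed running in process isolation is to inspect them (a PowerShell sketch):

docker ps --format '{{.Names}}' | ForEach-Object { docker inspect --format '{{.Name}}: {{.HostConfig.Isolation}}' $_ }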


Advanced Part: building Traefik with ltsc2022

Now, let's get rid of the only remaining 1809-based container, which is Traefik. Luckily, its Dockerfile is available, so I can rewrite it to consume ltsc2022 base images. In addition, I took the latest (at the time) version, 2.9.8, while the officially supported one is 2.2.0, so it makes sense to parametrize the version as well, taking the value from the .env settings.

I created a new docker\build\traefik folder and ended up with the following Dockerfile in there:

ARG IMAGE_OS
FROM mcr.microsoft.com/windows/servercore:${IMAGE_OS}

ARG VERSION
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

RUN Invoke-WebRequest \
        -Uri "https://github.com/traefik/traefik/releases/download/$env:VERSION/traefik_${env:VERSION}_windows_amd64.zip" \
        -OutFile "/traefik.zip"; \
    Expand-Archive -Path "/traefik.zip" -DestinationPath "/" -Force; \
    Remove-Item "/traefik.zip" -Force

EXPOSE 80
ENTRYPOINT [ "/traefik" ]

# Metadata
LABEL org.opencontainers.image.vendor="Traefik Labs" \
    org.opencontainers.image.url="https://traefik.io" \
    org.opencontainers.image.source="https://github.com/traefik/traefik" \
    org.opencontainers.image.title="Traefik" \
    org.opencontainers.image.description="A modern reverse-proxy" \
    org.opencontainers.image.version=${VERSION} \
    org.opencontainers.image.documentation="https://docs.traefik.io"
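To sanity-check this Dockerfile outside of compose, it can also be built standalone with the build arguments passed explicitly (a sketch; the tag simply mirrors the naming convention used below):

# run from the docker\build\traefik folder
docker build --build-arg IMAGE_OS=ltsc2022 --build-arg VERSION=v2.9.8 -t traefik:v2.9.8-servercore-ltsc2022 .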

Because of that, I also had to update the related traefik section of the docker-compose.override.yml file:

  traefik:
    isolation: ${ISOLATION}
    image: ${REGISTRY}traefik:${TRAEFIK_VERSION}-servercore-${EXTERNAL_IMAGE_TAG_SUFFIX}
    build:
      context: ../../docker/build/traefik
      args:
        IMAGE_OS: ${EXTERNAL_IMAGE_TAG_SUFFIX}
        VERSION: ${TRAEFIK_VERSION}
    volumes:
      - ../../docker/traefik:C:/etc/traefik
    depends_on:
    - rendering

What I want to draw attention to here: I am now using ${ISOLATION}, just as the rest of the containers do, instead of the dedicated TRAEFIK_ISOLATION, which can now be removed from .env.

Another thing is that I am passing a fully parametrized image name:

image: ${REGISTRY}traefik:${TRAEFIK_VERSION}-servercore-${EXTERNAL_IMAGE_TAG_SUFFIX}

I intentionally do not prefix it with ${COMPOSE_PROJECT_NAME} so that this image becomes reusable across several solutions on the same machine, which saves some disk space.

The last step is adding the .env parameter TRAEFIK_VERSION=v2.9.8 and removing the TRAEFIK_IMAGE parameter, which is no longer needed. Good to go!
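To pick up the change, you can rebuild and restart just this one service (a sketch – run it from the folder containing the compose files, and use docker-compose instead on older tooling):

docker compose build traefik
docker compose up -d traefik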


Outcomes and verdict

I tested all the important features of the platform, including Experience Editor, and it all works – and, what is especially important, works impressively fast in process isolation mode. And since all the containers are now built on ltsc2022 and run in process isolation, one doesn't need Hyper-V at all!

As for me, I ended up with a nice and powerful laptop suitable for modern Sitecore headless development.

Enjoy faster development!

Applying a vulnerability fix to containerized environments

This advice was originally proposed by Peter Nazarov (Twitter, LinkedIn), who kindly asked me to give it wider exposure.

The biggest question of the day is whether the fix has already been applied to all official Sitecore container images, so that we can just pull the new Sitecore images and rebuild our own container images to apply the patch.

The KB article offers WDP and ZIP package fixes but says nothing about containers, as if containers were not supported by Sitecore:

Critical vulnerability applicable to all Sitecore versions related to XSS. 

This issue is related to a Cross Site Scripting (XSS) vulnerability which might allow authenticated Sitecore users to execute custom JS code within Sitecore Experience Platform (XP) and Sitecore Managed Cloud.

We encourage Sitecore customers and partners to familiarize themselves with the information below and apply the Solution to all affected Sitecore instances. We also recommend that customers maintain their environments in security-supported versions and apply all available security fixes without delay.


So, below are my findings.


To apply the fix to your Docker images, you need to copy the patch files from the following Sitecore Docker assets images:

  • for XM1: scr.sitecore.com/sxp-pre/sitecore-xm1-assets:10.2.1.007064.169-10.0.17763.2366-1809
  • for XP0: scr.sitecore.com/sxp-pre/sitecore-xp0-assets:10.2.1.007064.169-10.0.17763.2366-1809
  • for XP1: scr.sitecore.com/sxp-pre/sitecore-xp1-assets:10.2.1.007064.169-10.0.17763.2366-1809

For example, if using the XM1 topology on XM 10.2.0, that would be scr.sitecore.com/sxp-pre/sitecore-xm1-assets:10.2.1.007064.169-10.0.17763.2366-1809.


Inside this Sitecore Docker assets image you will find the C:\platform\ directory, which contains subdirectories for the corresponding Docker images that you need to patch:

  • \platform\cd
  • \platform\cm
  • \platform\id (it is empty and can be ignored)
You will need to copy the content of those directories to the file system root C:\ of the corresponding container.

For your \docker\build\cm\Dockerfile you would need a couple of new lines:
...
FROM scr.sitecore.com/sxp-pre/sitecore-xm1-assets:10.2.1.007064.169-10.0.17763.2366-1809 as kb1001489
...

FROM ${BASE_IMAGE}
...
WORKDIR C:\
COPY --from=kb1001489 /platform/cm/ ./


You would need to make similar changes to your \docker\build\cd\Dockerfile, with the only difference being that you copy the CD patch files instead of the CM ones in the last line:
...
COPY --from=kb1001489 /platform/cd/ ./


Of course, you can introduce a .env variable for scr.sitecore.com/sxp-pre/sitecore-xm1-assets:10.2.1.007064.169-10.0.17763.2366-1809 and pass it to your Dockerfiles as an ARG.
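A minimal sketch of how that could look – the variable and ARG names below are just examples of mine, not anything the official templates define:

# .env
SITECORE_XM1_ASSETS_IMAGE=scr.sitecore.com/sxp-pre/sitecore-xm1-assets:10.2.1.007064.169-10.0.17763.2366-1809

# docker-compose.override.yml – cm service fragment
  cm:
    build:
      args:
        ASSETS_IMAGE: ${SITECORE_XM1_ASSETS_IMAGE}

# docker\build\cm\Dockerfile
ARG ASSETS_IMAGE
FROM ${ASSETS_IMAGE} as kb1001489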


Note: this patch changes the version of your Sitecore 10.2.0 instance to 10.2.1 – Sitecore.NET 10.2.1 (rev. 007064 PRE) (see the screenshot below). Seeing this happen, it feels that, sadly, Sitecore is unlikely to release 10.2.0 Docker images that include this patch, as it would cause versioning issues.


The example above is good for learning how to apply a Docker assets image-based patch to your containers.

However, this Cumulative fix for Sitecore XP 10.2 patch changes a lot of DLLs to new versions that are not exposed via the NuGet feed, and changes your Sitecore version to a pre-release version (which does not officially exist). This would give you several challenges. Therefore, in this specific case I would prefer to apply just the standard .zip file-based fix, as per the Notes section on the KB page:

For Sitecore XP 10.1 and later, if it is not possible to apply the cumulative fix (pre-release update), the following patch can be applied alternatively: Sitecore.Support.500712.zip.

Cumulative fix for Sitecore XP 10.2 (pre-release update):

  • Changes multiple DLL versions (they are not available in the NuGet feed)
  • Changes the Sitecore version to a pre-release version (the next version, which is not released yet)

Sitecore.Support.500712.zip:

  • Deploys a new Sitecore.Support.500712.dll
  • Overwrites the vulnerable \sitecore\shell\Applications\Content Manager\Execute.aspx page file so that it runs from Sitecore.Support.500712.dll, which contains the fix.

Things beginners get wrong about Kubernetes

When starting to play with Kubernetes, one may fall for one of the biggest delusions: assuming that K8s will work the same way in production as it does in the development or testing environment.

But it won't!

When it comes to containers in general and Kubernetes specifically, there is a big difference between occasional runs in lab-like conditions and a full production lifecycle. It is similar to the difference between simply starting an app and running it long-term with full security and reliability enabled.

This is not a Kubernetes-exclusive problem; it is true for the whole variety of containers and microservices. Spinning up a container is a relatively simple task, while scaling containers as containerized microservices in production turns out to be more complicated.

Although Kubernetes has alternatives, it has quickly become the de facto standard for orchestration. However, there is a difference between launching K8s in a sandbox and running it in a full production environment.



Delusion #1. Running containers with Kubernetes in the development or testing environment ensures that your operational needs will be satisfied.

The truth: running Kubernetes in a development or testing environment lets you cut corners, simplify things and not bother with the operational load you face when going live to production. Operations and security considerations are the major areas of difference between K8s running in production and in development/testing environments. A cluster failing under lab conditions does not cause any losses.

To me it looks like a trade-off between agility and reliability: developers use containers to gain flexibility while building and testing their code, and that serves its purpose. Operations, meanwhile, need to provide reliability, scaling, performance and security on top of a sustainable, industry-proven platform. They look for deployment automation for the clusters to ensure repeatability and consistency, which also helps when restoring the system.

Versioning is also critical for operations. As far as possible, you should enable versioning everywhere, including service deployment configuration, policies and infrastructure (applying the infrastructure-as-code approach). That makes environments repeatable. As a good practice, avoid "latest" image tags in order to avoid configuration drift.
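For illustration only, a fragment of a Deployment manifest with the image tag pinned (the names and tag are made up):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          # pinned, immutable tag instead of ":latest" – keeps environments repeatable
          image: registry.example.com/web:1.4.2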


Delusion #2. Kubernetes provides both reliability and security

In reality: when using Kubernetes in non-production environments only, reliability and security are most likely not provided, at least initially. Do not get discouraged, you will get there: it is a matter of designing the architecture before switching to production.

Obviously, performance, scaling, availability and security requirements are much higher in production environments. It is important to plan these requirements into the architecture of your K8s deployment, as well as to build scaling and security plans into Helm charts, etc.

But how could running a cluster in dev/testing environments lead to false confidence?

It is common for non-production environments to have all network connections open. There it is acceptable for any service to call any other service: open connections are the Kubernetes default. However, such an approach is bad practice for production environments and can lead to downtime. It also exposes a larger attack surface and increases threats to the business.

When it comes to containers/microservices, one needs to spend more effort on creating a highly available and reliable system. Orchestration itself helps a lot, but it isn't a "silver bullet", and the same applies to security. You will have to work hard to protect Kubernetes and reduce the attack surface. It is very important to use RBAC with minimal privileges and to enforce network policies, leaving open only those channels the services actually use – see the sketch below.
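As a minimal illustration of the network-policy part (the namespace and labels are made up), a default-deny policy forces every allowed channel to be declared explicitly:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app            # example namespace
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes:
    - Ingress                  # no ingress rules listed, so all inbound traffic is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: api                 # only pods labelled app=api are affected
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web         # only traffic from app=web pods is allowed in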

Also, vulnerabilities in container images can rapidly put operations into a critical state, while in development/testing environments this danger may be absent altogether. Pay attention to the base images used for building your containers: as far as possible, use trusted official images or build your own. The last thing you want your Kubernetes cluster doing is helping someone mine crypto coins.

It is recommended to treat container security as a ten-layer system covering the container stack (hosts and registries) as well as questions related to the container life cycle (for example, API management).


Delusion #3. Orchestration makes scaling a formality

Although Kubernetes is considered an absolutely necessary tool for scaling containers, it would be a delusion to think that orchestration immediately sorts out the scaling needs of a production environment. The volume of data in live environments is many times larger, and keep in mind that monitoring may also need scaling. With increasing volumes, everything changes.

It is impossible to verify that all K8s components implement their interfaces correctly until you spin up production: what "working normally" means for Kubernetes, and whether the API server and other control-plane components scale according to your needs, only becomes clear there.

As I said, development and testing environments are much more forgiving. In local environments it is easy to skip basics like defining the right resource requests and limits. Skipping them can collapse your production cluster later – see the fragment below.
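A hedged sketch of what those requests and limits look like in a container spec (the values are arbitrary examples, not recommendations):

    containers:
      - name: api                       # example container
        image: registry.example.com/api:1.0.0
        resources:
          requests:                     # what the scheduler reserves for the pod
            cpu: 250m
            memory: 512Mi
          limits:                       # the hard ceiling the container may consume
            cpu: "1"
            memory: 1Gi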

Scaling a cluster in both directions is a good example of a task that is easy locally but clearly complicated in production: scaling production clusters is more difficult than scaling clusters for development/testing.

While Kubernetes makes horizontal scaling relatively simple, DevOps still needs to keep some nuances in mind, especially when it comes to keeping services live while the infrastructure scales. It is crucial to ensure that the main services, as well as system monitoring and security alerting, are distributed across the cluster nodes and work with stateful volumes so that data is not lost on scale-down – one way to express that distribution is sketched below.
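Purely as an illustration (labels and values are made up), a pod template fragment that spreads replicas of a monitoring workload across nodes:

    topologySpreadConstraints:
      - maxSkew: 1                          # keep the per-node replica count as even as possible
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: monitoring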

Again, it all comes down to proper planning and available resources. You need not only to understand your scaling needs when planning but, most importantly, to test them. Your production environment must be capable of handling much higher loads.


Delusion #4. Kubernetes works the same everywhere

In reality: the differences between environments can be as big as those between running Kubernetes on a developer's laptop and on a production server. Many believe that if K8s works locally, it will work in any operational environment, but there may be serious differences depending on the vendor.

Local environments commonly lack important components required by production environments: monitoring, logging, certificate management and credential handling. You need to keep that in mind, as it is another problem arising from the difference between production and development/testing environments.

However, that isn't exclusive to Kubernetes; it applies to containers/microservices in general, especially in multi-cloud and hybrid cloud setups. Those Kubernetes implementations are more complicated than they initially seem, as many of the mandatory services, like load balancing and firewalls, are proprietary. A container that works well locally may run unprotected (or not start at all) in a cloud with a different set of tools. That is why service mesh technologies like Istio attract so much attention: they provide availability wherever your container runs, so you do not need to think about the infrastructure – which is the main reason for using containers in the first place.

I hope that, keeping the above in mind, you can build safer and more reliable production environments with Kubernetes!

Everything you wanted to ask about "Items-as-Resources" coming with new Sitecore 10.1

Sitecore 10.1 brings the new Items-as-Resources feature, which raises plenty of questions:
  • What is it used for?
  • Why did we get it at all?
  • Are there any concerns about using it?
Please find the answers below:


1. Before 10.1, the initial set of OOB items was given to you in the databases upon installation. That includes the default templates, layouts, workflows and the rest of the scaffolding items.

2. Now, with 10.1, all of these are supplied as resource files outside of the database. That's correct: these items no longer reside in the database. Yes, you still see them in your content tree as normal.

3. Do the databases come empty? Not quite – there are just two entries for the default site (page) you normally see first after a successful Sitecore instance installation, at the root URL. It was decided not to put these into resources, as most customers delete that default home page anyway.

4. Are these resources read-only? Yes, Sitecore cannot write back into those resource files. Treat them as if they were written to a CD, but with immediate access.

5. So does that mean I cannot modify the default OOB items in Sitecore anymore? No, you can actually edit them as normal after unprotecting the item from the Content Editor ribbon. What happens in that case is that Sitecore takes the delta between the initial value stored in the resource file and your changes, and stores only that delta – just the changes you've made – in the database. On item "consumption", the current state of the item gets calculated from the resource file plus that delta.

6. But you cannot delete these items: Sitecore warns that the item originates from a resource file and therefore cannot be deleted. Still good, as it leaves less potential for silly errors anyway.

7. So where are these resource files located? They live (quite predictably) within the App_Data folder – App_Data\items\<DATABASE_NAME>\items.<DATABASE_NAME>.dat by default.

8. What format are these resource files in? Protobuf (Protocol Buffers) from Google. That is a surprisingly mature format, proven over at least a decade.

9. How can I create my own resources?
Officially – you cannot. It is technically possible, but it requires a very deep dive into Protocol Buffers, raw database storage and the new data provider in Sitecore. However, Sitecore will likely start providing authors of popular modules with the toolset to create such resources with ease. So, say, for SXA you will no longer need to install the SPE + SXA packages yourself; instead, you will simply drop the resource files provided by the SXA team underneath the items folder, with no need to publish anything afterwards.

10. Why is there no need to publish? Because you copy the resource file for the web database as well – the items are already in the web database. Of course, all the items created by you still need to be published.

11. But why did Sitecore introduce this at all?
The main reason is to simplify the platform version upgrade process; the way an update is done has changed.
You may have noticed that the "Upgrade options" section on the Sitecore download page has changed: instead of Sitecore Update Packages you now get the Sitecore UpdateApp Tool, which operates against each specific version you want to upgrade from. This tool removes the default items for each particular legacy version and replaces them with the resource files; of course, it also updates the database schema with the required changes, as before.

12. The bigger reason for this change was the "think containers – think ahead" approach. With this change it becomes easier to upgrade a Sitecore version when running in containers: from now on, everything in the database is entirely the user's custom data and can be copied wholesale to a newer database, while version-specific, system-related items get updated by simply substituting a resource file.

13. Also, you may have heard that Fast Query has been deprecated. This is the exact reason why: if something isn't in the database, Sitecore cannot efficiently build the relationship graph for a fast query.


14. What is that data provider mentioned above?
That is a new one called CompositeDataProvider, which is inherited by DefaultDataProvider. The name "composite" implies that it takes care of merging items for the Sitecore tree from both the database and the resource files. In the configuration you specify it per database under the <database> section; you can also change the location of the resources by patching the <filePath> node of <protobufItems> and overriding the location.

15. For the end consumer of a DataProvider (high in the stack) nothing changes, as their code still uses DefaultDataProvider. The changes occur at the intermediate level, and those happen to be internal to Sitecore.

16. That actually opens up much wider potential for creating intermediate-level providers for things other than Protobuf and SQL databases: CRMs, DAMs, maybe other headless CMSs. In any case, this is a very important and greatly welcomed step forward for the platform!

Update: there is another great blog post from my fellow MVP Jeremy Davis on the same topic, where he also tried drilling into these resource files with the protobuf-net library.