Experience Sitecore! | All posts tagged 'Security'

Martin Miles on Sitecore

Things beginners get wrong about Kubernetes

When you start playing with Kubernetes, one of the biggest delusions you may fall into is assuming that K8S will work in production the same way it does in a development or testing environment.

But it won't!

When it comes to containers in general and Kubernetes specifically, there is a big difference between occasional runs in lab-like conditions and a full production lifecycle. It is similar to the difference between merely starting an app and running it long term with full security and reliability in place.

This is not a problem exclusive to Kubernetes; it holds for the whole variety of containers and microservices. Spinning up a container is a relatively simple task, while scaling containerized microservices in production turns out to be much more complicated.

Although Kubernetes has alternatives, it has quickly become the de facto standard for orchestration. However, there is a difference between launching K8S in a sandbox and running it in a full production environment.



Delusion #1. Running containers with Kubernetes in the development or testing environment ensures that your operational needs will be satisfied.

The truth: launching Kubernetes in a development or testing environment lets you cut corners, simplify things and not bother with the operational load that you face when going live to prod. Operations and security considerations become the major areas of difference between K8S running in prod and in development / testing environments. After all, losing a cluster in lab conditions does not cause any losses.

To me it looks like a trade-off between agility and reliability. Devs use containers to gain flexibility while developing and testing, making sure the code does what it is meant to. Ops, meanwhile, need the reliability, scaling, performance and security provided by a sustainable, industry-proven platform. They look for deployment automation for clusters to ensure repeatability and consistency, which also helps when restoring the system.

Versioning is also critical for operations. As far as possible, enable versioning everywhere, including service deployment configuration, policies and infrastructure (applying the infrastructure-as-code approach). That makes environments repeatable. As a good practice, avoid "latest" image tags in order to prevent configuration drift.
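As a rough illustration of the "avoid latest" advice, here is a minimal Python sketch that flags image references without an explicit tag or digest. The function names and the image names in the usage line are hypothetical, and the parsing is deliberately simplified; it is not a full implementation of the image reference grammar.

```python
from typing import List


def is_pinned(image: str) -> bool:
    """Return True if the image reference carries a digest or an explicit tag other than 'latest'."""
    if "@" in image:
        # A digest reference such as name@sha256:... is always pinned.
        return True
    # Only look at the part after the last '/', so a registry port
    # (for example registry.local:5000/app) is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all means 'latest' is used implicitly
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


def unpinned_images(images: List[str]) -> List[str]:
    """Return the image references that would float with 'latest'."""
    return [image for image in images if not is_pinned(image)]


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    print(unpinned_images(["nginx", "nginx:1.25.3", "registry.local:5000/app:latest"]))
```

A check like this could run in a CI step against the image names extracted from your deployment manifests, so that an unpinned image never reaches prod unnoticed.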


Delusion #2. Reliability and security come provided with Kubernetes

In reality: if you have only used Kubernetes in non-production environments, reliability and security are most likely not in place, at least initially. Do not get discouraged, you will get there: it is a matter of designing the architecture before switching to prod.

Obviously, performance, scaling, availability and security requirements are much higher in prod environments. It is important to plan for these requirements in the architecture of your K8S deployment, as well as to build scaling and security plans into Helm charts, etc.

But how could running a cluster in dev / testing environments lead to false confidence?

It is common for non-production environments to have all network connections open. There it is acceptable that any service can talk to any other service: open connections are the default in Kubernetes. However, such an approach is a bad practice for production environments and can lead to downtime. It also exposes a larger attack surface and increases the threat to the business.

When it comes to containers / microservices, building a highly available and reliable system takes more effort. Orchestration itself helps a lot but is not a "silver bullet", and the same applies to security. You have to work hard to protect Kubernetes and reduce the attack surface. It is very important to use RBAC with minimal privileges and to enforce network policies, leaving open only those channels that services actually use.
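To make the network policy part more concrete, below is a minimal Python sketch that builds two NetworkPolicy manifests as plain dictionaries: a default-deny policy for a namespace, and a policy that re-opens one channel. The namespace, policy names, `app` labels and port are assumptions made up for illustration. Note also that network policies only take effect when the cluster's network plugin enforces them.

```python
import json


def default_deny(namespace: str) -> dict:
    """Build a policy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress(namespace: str, name: str, to_app: str, from_app: str, port: int) -> dict:
    """Build a policy that lets pods labelled app=from_app reach app=to_app on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": to_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": from_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace, labels and port, for illustration only.
    manifests = [
        default_deny("prod"),
        allow_ingress("prod", "allow-web-to-api", to_app="api", from_app="web", port=8080),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

The printed JSON documents can be saved to files and applied with kubectl, which accepts JSON manifests as well as YAML. The idea is to start from "nothing is allowed" and add back only the channels that services actually use.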

Vulnerabilities in container images can also rapidly put ops into a critical state, while in development / testing environments this danger may be absent altogether. Pay attention to the base images used for building your containers: as far as possible, use trusted official images, or build your own. The last thing you want is your Kubernetes cluster helping someone mine cryptocurrency.

It is recommended to treat container security as a ten-layer system covering the container stack (host and registries), as well as questions related to the container life cycle (for example, API management).


Delusion #3. Orchestration makes scaling a formality

Although Kubernetes is considered an indispensable tool for scaling containers, it would be a delusion to think that orchestration immediately sorts out scaling needs for the production environment. The volume of data in live environments is many times larger, and keep in mind that monitoring may also need scaling. With increasing volumes, everything changes.

Until you spin up prod, it is impossible to be sure that all K8S components implement their interfaces correctly, that Kubernetes is "working normally", and that the API server and other control plane components scale according to your needs.

As I said, development and testing environments are much more forgiving. In local environments it is easy to skip basics such as defining proper resource requests and limits. Skipping them can collapse your prod later.
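One way to keep these basics from being skipped is to check manifests before they are deployed. The following Python sketch, with a hypothetical function name and a made-up pod spec, lists the containers of a pod spec that lack either `resources.requests` or `resources.limits`.

```python
from typing import List


def containers_missing_resources(pod_spec: dict) -> List[str]:
    """Return names of containers that do not define both resource requests and limits."""
    missing = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        if not resources.get("requests") or not resources.get("limits"):
            missing.append(container["name"])
    return missing


if __name__ == "__main__":
    # Hypothetical pod spec, for illustration only.
    pod_spec = {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25.3",
                "resources": {
                    "requests": {"cpu": "100m", "memory": "128Mi"},
                    "limits": {"cpu": "500m", "memory": "256Mi"},
                },
            },
            {"name": "sidecar", "image": "busybox:1.36"},
        ]
    }
    print(containers_missing_resources(pod_spec))
```

In this example only the `sidecar` container is reported, since `web` declares both requests and limits.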

Scaling the cluster in both directions is a good example of a task that goes easily locally but gets noticeably more complicated in production: scaling prod clusters is more difficult than scaling clusters for development / testing.

While Kubernetes makes horizontal scaling relatively simple, DevOps still need to keep some nuances in mind, especially when it comes to keeping services live while scaling the infrastructure. It is crucial to ensure that the main services, as well as system monitoring and security alerting, are distributed across the cluster nodes and work correctly with stateful volumes, so that data is not lost when scaling down.

Again, it all comes down to proper planning and the resources available. You need not just to understand your scaling needs when planning but, most importantly, to test them. Your production environment must be capable of handling much higher loads.


Delusion #4. Kubernetes works the same everywhere

In reality: the differences between environments can be as large as those between running Kubernetes on a developer's laptop and on a prod server. Many believe that if K8S works locally, it will work in any operational environment, but there may be serious differences depending on the vendor.

Local environments commonly lack important components required by prod environments: monitoring, logging, certificate management and credentials. Keep that in mind, as it is another problem arising from the difference between prod and development / testing environments.

Again, this is not exclusive to Kubernetes but applies to containers / microservices in general, especially in multicloud and hybrid cloud setups. Such Kubernetes implementations are more complicated than they seem initially, as many of the mandatory services, like load balancing and firewalls, are proprietary. A container that works well locally may run unprotected (or may not start at all) in a cloud with a different set of tools. That is why service mesh technologies like Istio attract so much attention. They aim to provide availability wherever your container runs, so that you do not need to think about the infrastructure, which is the main reason for using containers.

I hope that, keeping the above in mind, you can reach safer and more reliable production environments with Kubernetes!

Which certificates get installed with Sitecore 9.1 (and where)?

On a clean installation, Sitecore 9.1 installs the following certificates (the example below uses Habitat project hostnames):

1. Current User\Personal - nothing



2. Current User\Intermediate Certification Authorities - SIF

  • Sitecore Install Framework / Sitecore Install Framework



3. Local Computer\Personal - xConnect and Identity Server

  • habitat_xconnect.dev.local / DO_NOT_TRUST_SitecoreRootCert
  • habitat_IdentityServer.dev.local / DO_NOT_TRUST_SitecoreRootCert


4. Local Computer\Intermediate Certification Authorities - Sitecore and SIF

  • DO_NOT_TRUST_SitecoreRootCert / DO_NOT_TRUST_SitecoreRootCert
  • Sitecore Install Framework / Sitecore Install Framework


Hope this helps!

Field-level deny permissions in Helix based on Habitat, and how they affect your workflows

If you decide to use Habitat as a bootstrap platform for your Helix solution, you may come across the situation described below while setting up workflows. In this blog post I will try to explain what happens, why it happens, and how to make things work.

Symptoms: you are about to set up workflows for the solution and have created a role for the content editors. You then give that role read / write permissions to the site content (likely the /Home and /Global nodes under your site definition item, recursively). When logged in as a user with the Content Editor role mentioned above, you are able to Lock and Edit and later Check In an item, but the fields of that item are disabled. Weird. Doing the same on items outside your website works well (for instance, the Home item that comes with the initial Sitecore installation). Why is that?


There are a few StackOverflow questions trying to sort this out: one and two. I have left a few comments there to help others resolve the situation.


Explanation: Habitat uses an "intersection" of feature- or foundation-level permissions (also known as Functional roles) with project-level permissions (also known as organisational rights). Most Habitat modules ship with such a Functional role as part of the module, named in the following format: modules\Feature XXX Admin or modules\Foundation XXX Admin.

What Habitat does: by default it denies inheritance of write access for all the fields, and then explicitly allows write permission for that particular Functional role within a module. That is briefly explained in the official Helix documentation, but the two images below are more descriptive:



Solution: there are two potential ways of sorting this out. The first option is to keep Functional roles as part of your solution. In that case, you need to make sure your Content Editor role also inherits from these Functional roles (or from an umbrella role inheriting a combination of Functional roles).

The second option is to drop these Functional roles. In that case, you'll need to remove them from the serialization config and source control, and also perform the following for each affected field:

1. Navigate to that field in Sitecore, for example: /sitecore/templates/Feature/Navigation/_Navigable/Navigation/ShowInNavigation

2. Click the Security tab, then Assign. You'll see at least two roles available - Everyone and a Functional role for that module.

3. Select Everyone, remove the Inheritance denial for both Item and Descendants by clicking both red crosses, then save (OK).

4. Repeat that for each field of each template in each of the Feature / Foundation layers.

Users in the Content Editor role will then be able to edit all the fields.

Hope this helps!

Creating a simple workflow in Helix

This may not be as comprehensive a guide as it should be; I am using this blog post mostly to leave cheatsheet-style notes for later. There are several steps to make things happen.

1. Let's create a security domain for our website - that should typically be in a site config on a project layer (Website1.Website.config for my example):
<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
    <sitecore>
        <domainManager defaultProvider="file">
            <patch:attribute name="defaultProvider">config</patch:attribute>
            <domains>
                <domain id="website1" type="Sitecore.Security.Domains.Domain, Sitecore.Kernel">
                    <param desc="name">$(id)</param>
                    <ensureAnonymousUser>false</ensureAnonymousUser>
                </domain>
            </domains>
        </domainManager>
    </sitecore>
</configuration>


2. Create 3 roles on the same domain: website1\Editor, website1\Approver, and website1\everyone, which is shared between the first two. Make website1\Editor and website1\Approver members of website1\everyone, and website1\everyone in turn a member of sitecore\Sitecore Client Authoring. Also make website1\Approver a member of sitecore\Sitecore Client Publishing.
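The resulting role hierarchy from step 2 can be sketched in a few lines of Python. This is only a model of the memberships listed in that step, not Sitecore API code; the `MEMBER_OF` map and the `effective_roles` helper are names invented for this illustration.

```python
# Direct "is a member of" relations, as listed in step 2.
MEMBER_OF = {
    r"website1\Editor": [r"website1\everyone"],
    r"website1\Approver": [r"website1\everyone", r"sitecore\Sitecore Client Publishing"],
    r"website1\everyone": [r"sitecore\Sitecore Client Authoring"],
}


def effective_roles(role: str) -> set:
    """Return every role the given role belongs to, directly or through nesting."""
    seen = set()
    stack = [role]
    while stack:
        current = stack.pop()
        for parent in MEMBER_OF.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    for role in (r"website1\Editor", r"website1\Approver"):
        print(role, "->", sorted(effective_roles(role)))
```

Both roles end up with Sitecore Client Authoring through website1\everyone, while only the Approver additionally gets Sitecore Client Publishing.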


3. Create user accounts. I am creating them as part of the website1 domain, but they may be part of the sitecore domain if you need them instance-wide.


4. Assign users to their appropriate roles - editors or approvers. Once assigned, a user is able to log in and load Content Editor, but is not yet able to insert new items for our website or change presentation details. The user is also able to open a page in Experience Editor but again cannot edit the item, because they do not have write access to it.


The image above shows how the Launchpad looks when users log into Sitecore. However, because they do not have permissions yet, they will see the following message in Content Editor:

5. Create your workflow. I won't be original and will call the new workflow Website1 Workflow. The easiest quick start is to clone Sample Workflow and adjust states and other references to point to the corresponding items within the newly created workflow.


6. Then assign the relevant page templates to Website1 Workflow (Standard Values -> Workflow section: set Default workflow to Website1 Workflow). This ensures that all new items of this template have the Workflow field set to Website1 Workflow and the State field preselected as per the workflow's Initial state field (on the workflow definition item).



7. Now give permissions to the role:
Open Access Viewer, click Account in the top left corner and select website1\Editor. With it selected, give permissions for everything under /sitecore/content/Website1/Home and /sitecore/content/Website1/Global (but explicitly deny editing and deleting the top node itself), and do not forget the media library for that project.

Once complete - users will be able to Lock and Edit items and later submit for approval. So far so good.

8. The next step is to ensure that editors are not able to approve items. This can be done by denying the Editor role permission on the Awaiting Approval state.


That will result in the following permissions set for Editors in Access Viewer:


While the Approvers' Access Viewer shows the Awaiting Approval state as available:



9. One more thing to mention: if your Helix solution was created from Habitat, you may run into a situation where certain fields are not editable. That happens because write permission on fields of Feature-level templates is set to deny. I have written an explanation and the solution in a separate blog post.


10. Add language permissions to a shared role website1\everyone:



11. Last but not least - serialization. What should you serialize? Standard values for all templates that have now become part of the workflow; the workflow itself (as a part of your Website1 project serialization configuration); Roles; and possibly Users. Remember, however, that there is no way to serialize a user with a password: the only option is having Unicorn deserialize the user with a default password. Apart from sharing the same default password, these users will also get their Created field updated, which in turn will trigger source control changes. You also need to serialize Languages (foundation layer) with the updated permissions: .\src\Foundation\Serialization\serialization\Foundation.Serialization.Languages\Languages.yml should be serialized.


12. Basic workflow testing falls into a three-step routine:

- Firstly, log in as an Editor, create a page and provide the rest of the required content. Once done, submit it for approval. An important check is to open the Workbox and ensure that editors can see only the Draft state but not Awaiting Approval.


- Secondly, log in again as an Approver and open the Workbox. Now you should see the Awaiting Approval section and be able to approve the item from it.


- Finally, log in as an admin, switch to the web database and make sure all the content has been published. That includes related and child items.
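The expectations behind this test routine can be written down as a tiny state machine. The Python sketch below is only a model for reasoning about the checks, not Sitecore code. It assumes the cloned workflow keeps the Draft, Awaiting Approval and Approved states with Submit, Approve and Reject commands, and that Approvers can also act on Draft items; `TRANSITIONS`, `STATE_ACCESS` and `execute` are names made up for this example.

```python
# Allowed (state, command) -> next state moves; assumed to mirror the cloned Sample Workflow.
TRANSITIONS = {
    ("Draft", "Submit"): "Awaiting Approval",
    ("Awaiting Approval", "Approve"): "Approved",
    ("Awaiting Approval", "Reject"): "Draft",
}

# States each role may act on. Editors are denied on Awaiting Approval (step 8);
# the Approver's access to Draft is an assumption.
STATE_ACCESS = {
    "Editor": {"Draft"},
    "Approver": {"Draft", "Awaiting Approval"},
}


def execute(role: str, state: str, command: str) -> str:
    """Return the next workflow state, or raise if the role or command is not allowed."""
    if state not in STATE_ACCESS.get(role, set()):
        raise PermissionError(f"{role} has no access to the '{state}' state")
    if (state, command) not in TRANSITIONS:
        raise ValueError(f"'{command}' is not available in the '{state}' state")
    return TRANSITIONS[(state, command)]


if __name__ == "__main__":
    state = execute("Editor", "Draft", "Submit")   # editor submits the page
    state = execute("Approver", state, "Approve")  # approver approves it
    print(state)
```

The first two test steps map directly onto this model: the Editor can only move an item out of Draft, and any attempt to approve it as an Editor fails with a permission error.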