Experience Sitecore! | All posts tagged 'Container'

Martin Miles on Sitecore

Things beginners get wrong about Kubernetes

When you start playing with Kubernetes, one of the biggest delusions you may run into is the assumption that K8S will work in production the same way it does in a development or testing environment.

But it won't!

When it comes to containers in general, and Kubernetes specifically, there is a big difference between occasional runs in lab-like conditions and a full production lifecycle. It is similar to the difference between merely starting an app and running it long term with full security and reliability enabled.

This is not a Kubernetes-exclusive problem; it holds true for the entire variety of containers and microservices. Spinning up a container is a relatively simple task, while scaling containerized microservices in production turns out to be more complicated.

Although Kubernetes has alternatives, it has quickly become the de facto standard for orchestration. However, there is a difference between launching K8S in a sandbox and running it in a full production environment.



Delusion #1. Running containers with Kubernetes in the development or testing environment ensures that your operational needs will be satisfied.

The truth: launching Kubernetes in a development or testing environment lets you cut corners, simplify things and not bother with the operational load you will face when going live in prod. Ops and security considerations become the major areas of difference between K8S running in prod and in development/testing environments. A failed cluster in lab conditions does not cause any losses; a failed production cluster does.

To me it looks like a trade-off between agility and reliability. Devs use containers for flexibility while developing and testing, and are satisfied once the code does its job. Ops, meanwhile, need the reliability, scaling, performance and security of a sustainable, industry-proven platform. They look for deployment automation for clusters to ensure repeatability and consistency, which also helps when restoring the system.

Versioning is also critical for operations. As far as possible, enable versioning everywhere, including service deployment configuration, policies and infrastructure (applying the infrastructure-as-code approach). That makes environments repeatable. As a good practice, avoid "latest" image tags in order to prevent configuration drift.
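The advice about "latest" can be turned into a small pre-deployment check. Below is a minimal sketch in Python, assuming image references are available as plain strings; the function name is_pinned and the sample image names are hypothetical and only serve the illustration.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference has a digest or an explicit tag other than "latest"."""
    # A digest (name@sha256:...) always points at one exact image.
    if "@" in image:
        return True
    # Look for the tag only after the last "/", so that a registry port
    # such as "registry.local:5000/app" is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all means "latest" is implied
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


# Hypothetical image references, as they could appear in deployment manifests.
images = ["nginx", "nginx:latest", "nginx:1.25.3", "registry.local:5000/app:2.0"]
for image in images:
    status = "ok" if is_pinned(image) else "NOT PINNED"
    print(f"{image}: {status}")
```

A check like this could run in a CI pipeline before manifests reach the cluster, so that an unpinned image is caught long before it causes drift between environments.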


Delusion #2. Kubernetes provides both reliability and security

In reality: if you have only used Kubernetes in non-production environments, reliability and security have most likely not been taken care of, at least initially. Do not get discouraged, you will get there: it is a matter of designing the architecture before switching to prod.

Obviously, performance, scaling, availability and security requirements are much higher in prod environments. It is important to plan for these requirements in the architecture of the K8S deployment, and to build scaling and security into Helm charts, etc.

But how could running a cluster in dev/testing environments lead to false confidence?

It is common for non-production environments to have all network connections open. There it is acceptable for any service to talk to any other service: open connections are the default in Kubernetes. However, such an approach is bad practice for production environments and can lead to downtime. It also exposes a larger attack surface and increases the threat to the business.

When it comes to containers/microservices, creating a highly available and reliable system takes more effort. Orchestration itself helps a lot but is not a "silver bullet", and the same applies to security. You have to work hard to protect Kubernetes and reduce the attack surface. It is very important to use RBAC with minimal privileges and to enforce network policies, leaving open only those channels that services actually use.
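To make "leaving open only those channels that services actually use" more concrete, here is a minimal sketch that builds two NetworkPolicy manifests as Python dictionaries and prints them as JSON (which kubectl accepts alongside YAML). The namespace "shop", the app labels "web" and "api", and port 8080 are assumptions made up for the example, and the policies only take effect when the cluster's network plugin enforces NetworkPolicy.

```python
import json


def default_deny(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress(namespace: str, to_app: str, from_app: str, port: int) -> dict:
    """Re-open a single channel: from_app pods may call to_app pods on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{from_app}-to-{to_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": to_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": from_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical namespace, labels and port.
policies = [default_deny("shop"), allow_ingress("shop", "api", "web", 8080)]
for policy in policies:
    print(json.dumps(policy, indent=2))
```

The design idea is to start from "nothing is allowed" and then add one narrow policy per channel. Note that with egress denied as well, a real setup would also need matching egress rules (including DNS) for the calling pods, which this sketch leaves out.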

Vulnerabilities in container images can also rapidly bring ops into a critical state, while in development/testing environments this danger may be absent altogether. Pay attention to the base images used for building your containers: as far as possible, use trusted official images or build your own. The last thing you want is for your Kubernetes cluster to be helping someone mine cryptocurrency.

It is recommended to treat container security as a ten-level system covering the container stack (host and registries) as well as questions related to the container life cycle (for example, API management).


Delusion #3. Orchestration makes scaling a formality

Although Kubernetes is considered an essential tool for scaling containers, it would be a delusion to think that orchestration immediately sorts out the scaling needs of a production environment. The volume of data in live environments is many times larger, and keep in mind that monitoring may need scaling as well. With increasing volumes, everything changes.

Until you spin up prod, it is impossible to be sure that all K8S components implement their interfaces correctly, that is, to determine that Kubernetes is "working normally" and that the API server and other control-plane components scale according to your needs.

As I said, development and testing environments are much more forgiving. In local environments it is easy to skip basics such as defining the right resource requests and limits. Skipping them can bring your prod down later.
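One way to avoid skipping these basics is to lint pod specs before they are deployed. The following is a minimal sketch, assuming the pod spec has already been loaded into a Python dictionary shaped like the "spec" section of a Pod manifest; the function name missing_resources and the container names are hypothetical.

```python
def missing_resources(pod_spec: dict) -> list:
    """List every container that lacks CPU or memory requests/limits."""
    problems = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            for key in ("cpu", "memory"):
                if key not in resources.get(section, {}):
                    problems.append(f"{container['name']}: no {section}.{key}")
    return problems


# Hypothetical pod spec: "web" is fully specified, "sidecar" declares nothing.
pod_spec = {
    "containers": [
        {
            "name": "web",
            "image": "nginx:1.25.3",
            "resources": {
                "requests": {"cpu": "250m", "memory": "128Mi"},
                "limits": {"cpu": "500m", "memory": "256Mi"},
            },
        },
        {"name": "sidecar", "image": "busybox:1.36"},
    ]
}

for problem in missing_resources(pod_spec):
    print(problem)
```

Whether every container really needs all four values is a policy decision for your team; the sketch simply reports what is absent so that the decision is made deliberately rather than by omission.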

Scaling a cluster in both directions is a good example of a task that is easy locally but clearly complicated in production: scaling prod clusters is more difficult than scaling clusters for development/testing.

While Kubernetes makes horizontal scaling relatively simple, DevOps still need to keep some nuances in mind, especially when it comes to keeping services live while scaling the infrastructure. It is crucial to ensure that the main services, as well as system monitoring and security alerting, are distributed across the cluster nodes and work with stateful volumes, so that data is not lost when scaling down.

Again, it all comes down to proper planning and available resources. You need not just to understand your scaling needs when planning but, most importantly, to test them. Your production environment must be capable of handling much higher loads.


Delusion #4. Kubernetes works the same everywhere

In reality: the differences between one environment and another can be as large as those between running Kubernetes on a developer's laptop and on a prod server, and there may be serious differences depending on the vendor. Still, many believe that if K8S works locally, it will work in any operational environment.

Local environments commonly lack important components required by prod environments: monitoring, logging, certificate management and credentials. Keep that in mind, as it is another problem arising from the difference between prod and development/testing environments.

However, this is not exclusive to Kubernetes but applies to containers/microservices in general, especially in multicloud and hybrid cloud setups. Such Kubernetes implementations are more complicated than they seem initially, as many of the mandatory services, like load balancing and firewalls, are proprietary. A container that works well locally may run unprotected (or may not start at all) in a cloud with a different set of tools. That is why service mesh technologies like Istio attract so much attention. They aim to provide availability wherever your container runs, so that you need to think less about infrastructure, which is the main reason for using containers.

I hope that keeping the above in mind helps you reach safer and more reliable production environments with Kubernetes!

Everything you wanted to ask about "Items-as-Resources" coming with new Sitecore 10.1

Sitecore 10.1 brings a new Items-as-Resources option, which raises plenty of questions.
  • What is it used for?
  • Why did we get it at all?
  • Are there any concerns about using it?
Please find the answers below:


1. Before 10.1, the initial set of OOB items was placed into the databases upon installation. That includes default templates, layouts, workflows and the rest of the scaffolding items.

2. Now with 10.1 all of these are supplied as resource files outside of the database. That's correct: all these items no longer reside in the database. Yes, you still have them in your content tree as normal.

3. Do the databases come empty? Not quite - there are just two entries for the default site (page) you normally first see at the root URL after a successful Sitecore instance installation. It was decided not to put these into resources, as most customers delete that default home page anyway.

4. Are these resources read-only? Yes, Sitecore cannot write back into those resource files. Treat them as if they were written on a CD, but with immediate access.

5. So does that mean I cannot modify the default OOB items in Sitecore anymore? No, you can still edit them as normal after "Unprotecting item" from the Content Editor ribbon. What happens in that case is that Sitecore takes the delta between the initial value stored in the resource file and your changes, and stores that delta, containing only the changes you have made, in the database. On item "consumption" the current state of the item is calculated from the resource file and that delta.

6. But you cannot delete these items. Sitecore reports that the item originates from a resource file and therefore cannot be deleted. That is still good, as it leaves less potential for silly errors anyway.

7. So where are these resource files located? They are located (quite predictably) within the App_Data folder - App_Data\items\<DATABASE_NAME>\items.<DATABASE_NAME>.dat (by default).

8. What format are these resource files in? Protobuf (Protocol Buffers) from Google. It is a surprisingly old format that has been proven for a decade, at least.

9. How can I create my own resources?
Officially - you cannot. Well, it is technically possible but requires a very deep dive into Protocol Buffers, raw database storage and the new data provider in Sitecore. However, Sitecore will likely start providing authors of popular modules with a toolset to create such resources with ease. So, let's say for SXA you will no longer need to install the SPE + SXA packages yourself; instead you'll simply drop the resource files provided by the SXA team underneath the items folder, without even needing to publish afterwards.

10. Why is there no need to publish? That's because you copy the resource file for the web database as well - the items are already in the web database. Of course, all the items created by you will still need to be published.

11. But why did Sitecore introduce this at all?
The main reason is to simplify the platform version upgrade process. The way an update is done has changed.
You may have noticed that the "Upgrade options" section on the Sitecore download page has changed: instead of Sitecore Update Packages you now have the Sitecore UpdateApp Tool, which operates against each specific version you want to upgrade from. This tool removes the default items of that particular legacy version and replaces them with the resource files. Of course, it also applies the schema changes, something that was already available before.

12. The bigger reason for this change was the "think containers - think ahead" approach. With this change it becomes easier to upgrade the version of Sitecore when running in containers: everything in the database is from now on entirely the user's custom data and can be copied as a whole to a newer database, while version-specific, system-related items get updated by simply substituting a resource file.

13. You may also have heard that Fast Query has been deprecated. This is the exact reason why - if something isn't in the database, Sitecore cannot efficiently build the graph of relationships for Fast Query.


14. What is that data provider mentioned above?
That is a new one called CompositeDataProvider, which is inherited by DefaultDataProvider. The name "composite" suggests that it takes care of merging items for the Sitecore tree from both the DB and the resources. In the configuration you specify it for an individual database under the <database> section; you can also change the location of such resources by patching the <filePath> node of <protobufItems>.

15. For the end consumer of the DataProvider (the high level of the stack) nothing changes, as they still use DefaultDataProvider from their code. The changes occur at an intermediate level and are internal to Sitecore.

16. That actually opens up much wider potential for creating intermediate-level providers for things other than Protobuf and SQL databases: CRMs, DAMs, maybe some other headless CMSs. In any case, this is a very important and greatly welcomed step ahead for the platform!
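The behaviour described in points 5, 6 and 14 (a read-only resource layer, deltas kept in the database, no deletion of resource items) can be pictured with a toy model. The Python sketch below is purely conceptual: it is not Sitecore's implementation or API, and the class CompositeStore, its method names, the item ids and the field names are all hypothetical.

```python
class CompositeStore:
    """Toy model: read-only resource items overlaid with database content."""

    def __init__(self, resource_items: dict, database_items: dict):
        self._resources = resource_items  # item id -> fields, never written to
        self._database = database_items   # item id -> fields of user-created items
        self._deltas = {}                 # item id -> changed fields of resource items

    def get_item(self, item_id: str):
        if item_id in self._resources:
            fields = dict(self._resources[item_id])       # copy, the resource stays untouched
            fields.update(self._deltas.get(item_id, {}))  # apply the stored changes on top
            return fields
        return self._database.get(item_id)

    def save_item(self, item_id: str, fields: dict) -> None:
        if item_id in self._resources:
            base = self._resources[item_id]
            # Keep only the fields that differ from the resource file.
            self._deltas[item_id] = {k: v for k, v in fields.items() if base.get(k) != v}
        else:
            self._database[item_id] = dict(fields)

    def delete_item(self, item_id: str) -> None:
        if item_id in self._resources:
            raise PermissionError("item originates from a resource file")
        self._database.pop(item_id, None)


# Hypothetical content: one out-of-the-box item and one user-created item.
store = CompositeStore(
    resource_items={"sample-template": {"Title": "Sample", "Icon": "doc.png"}},
    database_items={"my-page": {"Title": "My page"}},
)
store.save_item("sample-template", {"Title": "Sample", "Icon": "custom.png"})
print(store.get_item("sample-template"))
```

In this model an upgrade amounts to swapping resource_items for a newer set while the database part, containing only user items and deltas, is carried over unchanged, which mirrors the upgrade motivation given in points 11 and 12.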

Update: there is another great blog post from my MVP colleague Jeremy Davis on the same topic, where he also tried drilling into these resource files with the protobuf-net library.