Experience Sitecore ! | All posts tagged 'Duplicates'

Experience Sitecore !

More than 200 articles about the best DXP by Martin Miles

Sitecore with SEO: overview and compare ways for managing duplicate content

In this blog post I decided to cover all ways of managing duplicate content in Sitecore and overview possible ways of dealing with that with emphasis on SEO. So, we have the following options to consider:

  1. Duplicates
  2. Clones
  3. Proxies
  4. Aliases
  5. IIS URL Rewrite module
  6. Sitecore Redirect Module
  7. External Reverse Proxy

1. Duplicates are commonly known and most straightforward way of creating duplicates (clones) of the items. The easiest way to perform that operation is to right click the item you'd want to copy, select Duplicate from context menu and specify new item's name.


This ends up with an entirely independent new item (and all its ancestors) located at the same level, including all field values, presentation details, permissions etc. Beware the locks and workflows - those also would be exact match of those original items have. After that, new item lives it own life and is no way synchronized to its original prototype (except Standard Values, for sure, as both new and duplicate items share the same template).

Also it's worth of mentioning Copy To - this brings similar behavior, but allows to create duplicates keeping same name but at other paths rather than original item. Copy To is available from the same context menu.


2. Clones are sort of similar to duplicates with the difference that no new item is created when using clones. To create a clone for a highlighted item from a Sitecore tree, select Configure tab, hit Clone and specify where you'd want to locate your clone.


Notice, that clones are displayed in content tree in a slightly light font color, I personally think that may create some future issues when business users may perform some actions on item without realizing that item is a clone. Why is that important to know? Let's view the way clones function on a lower level.

When you create a clone the item and the values are not physically copied. Instead, the inheritance similar to the one between Standard Values (that is sort of prototype item for a template) and real template item, is created (clone inherits not from s.v. but from original item). When you modify a filed value of original item, that would affect same field of cloned item. However the reverse process, when you modify a field value on cloned item, overrides that individual field value and it is no more tied to original item's field. Other fields of the same item will still keep the reference to their originals. Clones use the __Source field of the Advanced section from standard template to specify the cloned item:


Unlike duplicates, clones do not clone most of standard fields (those coming from standard template) like locks / workflows and statistics (created, updated, revision). But they do clone security settings, which, again, can be overridden for a clone item.

If you want to get rid of clone item - there are 2 ways to do that: just delete the clone (obvious) and unclone it. Uncloning turns cloned item into a normal item and copies field values from originals. Clones exist only in master database, when you perform publishing to CD servers - uncloning takes place there.

You also can do some crazy things like creating clones of clones - inheritances chain take place in this case; each field at each level can be overridden, for sure.

To get even more understanding on how clones work in Sitecore I recommend reading Cloning What Ifs article.


3. Proxies is another mechanism of creating and managing duplicate content in Sitecore. The are frequently used in cases similar when you have an item that you may want to be a child of multiple parent items. In order to use them you must ensure a config file setting called proxiesEnabled is true; then you create proxy items at /Sitecore/System/Proxies based on /System/Proxy template. However, proxies considered to be outdated in favor of Clones. Please do not use Proxies!


4. Aliases are the different beast. They are perfectly good for promos and campaigns as the normally specify a quick URL for campaign landing page. Aliases have out-of-box limitation that they are set only per root level and not multisite-friendly (however there is a link that explains how to implement that feature on your own).

Aliases are defined under /sitecore/system/Aliases based on the System/Alias template.

There is just one field in alias template that allows to select target item.

There are two more overheads when working with aliases - sometimes you may need to identify if an item is alias:

bool isAlias = Context.Database.Aliases.Exists(path);

Also you may need to set canonical URLs on them to improve SEO. Good way of doing that is:

public class AliasResolver : Sitecore.Pipelines.HttpRequest.AliasResolver
{
    public override void Process(HttpRequestArgs args)
    {
        base.Process(args);

        if (Context.Item != null)
        {
            args.Context.Items["CanonicalUrl"] = Context.Item.GetFullUrl(args.Context.Request.Url);
        }
    }
}

Also, do not forget to publish your aliases to content delivery databases, as they won't work until published.



5. IIS URL Rewrite module is probably most functional option, it is external to Sitecore, that means it happens before routing and before pipelines.

For the drawbacks of using IIS URL Rewrite I would mention that you'd need to have access to IIS Manager or web.config write permission on each of content delivery servers. I previously wrote a blog post IIS URL Rewrite module - few SEO tricks that can demonstrate how powerful it is.

Also I would beware you of some specific Sitecore URLs and create appropriate extensions (ex. for WebResource.axd - take from real code).



6. Sitecore Redirect Module is another good choice as it does perfect server side 301 redirect for both URLs and items. It is almost as powerful as IIS URL Rewrite, but because it is configured in Sitecore - you do not need to have CD environments access at all - just create and publish redirect rules (as you normally do with generic content) - they will take effect immediately! Module is transparent to multi-site configuration, it can do redirects from one site's URL or item to another.

One more advantage of the module - availability of source code, so functionality can be extended to any bespoke requirement, also it becomes compatible with new Sitecore versions by just rebuilding it with appropriate Sitecore.Kernel.dll and replacing updated module DLL in webroot bin folder.

The only drawback, probably, is that in default state it performs only 301 redirects (however you may implement whatever you require). Please remember, that 301 requests are cached by browser -so you you are testing it intensively - you may need to purge browser cache from time to time.


7. External Reverse Proxy can be another option. It can do not only rewrites to external websites, but also rewrite some requests to alternative internal URL and pass that to IIS as "given" and further down to Sitecore. I met such scenarios several times on projects I took part. By the way, did you know that IIS can also serve as reverse proxy?

Performing rewrites and URL resolving logic outside of Sitecore can be both advantageous and disadvantageous. What traps does it bring?

Well, imagine you are new developer who start working on a new working copy of source code. When you run locally you may have different URL patterns compared to those on production environment. Business users usually deal with external production URLs and do not know internal structure, so that is how they form tasks and change requests. If you are not enough lucky to have comprehensive documentation or senior colleagues who can explain how is that configured - you may end up in multiple puzzling hours of attempting to find and match URLs from different environments.

Also, SEO much relies on sitemaps, so if you are using dynamic sitemaps - you need to implement that custom URL resolving logic that you have on reverse proxy. Also Sitemap Module from the Market would not work for you in that case.


I hope this article helped you to understand you options are with their pros and cons and to pick up a proper implementation depending on exact scenario.