Using AI to generate Image metadata in Episerver

Using Azure's Cognitive Services Computer Vision, we can analyse images and return extra information about the contents of the image. Here are some interesting things you can do with this information...

30 July 2018

6 minute read

In this blog post, I decided it was time to play with some Artificial Intelligence. Using Azure's Cognitive Services Computer Vision, I demonstrate how to analyse images and generate some metadata.

How can AI improve your CMS Editors' experience and the quality of your content?

Imagine some of the following scenarios:

  • Generating descriptive text for images (e.g. alt text). We can use the Describe Images feature to generate a human-readable description of the image.
  • Preventing inappropriate uploads. We can use the Adult feature to identify images that are detected as Adult or Racy, which helps guard against "rogue" CMS Editors and User Generated Content.
  • Further to preventing inappropriate content, we can use the Recognize Text feature to read text embedded in an image and validate it against a list of banned words.
  • Automating style selection. Imagine a hero banner with a manually selected style or theme (text / background color). With the Color feature, Computer Vision can detect the foreground, background and accent (hex) colors of an image. We could use this to pre-select the best style or theme to complement the image.
  • Creating thumbnails and specifying focal points in images. You've likely seen tools that allow CMS Editors to select a focal point for an image. With the Thumbnails and Area Of Interest features, we can detect areas of interest and automatically set the focal point.

These are just some of the possibilities this service inspired me to consider after reading the official documentation.
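To make the banned words scenario a little more concrete, here is a minimal sketch of the validation step only. It assumes the text has already been read from the image by the Recognize Text feature and handed to us as a plain string. The BannedWordFilter class and its FindBannedWords method are hypothetical names made up for this illustration, not part of Episerver or the Computer Vision SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: not part of Episerver or the Computer Vision SDK.
public static class BannedWordFilter
{
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '"' };

    // Returns the banned words found in the recognised text.
    // Matching is case-insensitive and on whole words only,
    // so a banned "scam" does not match "scampi".
    public static IReadOnlyList<string> FindBannedWords(
        string recognisedText, IEnumerable<string> bannedWords)
    {
        var banned = new HashSet<string>(bannedWords, StringComparer.OrdinalIgnoreCase);

        return (recognisedText ?? string.Empty)
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => banned.Contains(word))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

An upload could then be rejected, or flagged for review, whenever the returned list is not empty.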

Humans vs Machines

I want to immediately dispel any fear factor. After using these services, as cool as these features are, I can confidently say the Machines aren't taking over... yet.

What I do see are a lot of opportunities to improve the CMS Editors' experience and efficiency, and to enhance the overall quality of content.

Although the Machine's description of an image is often good, it's still far inferior to the creative / targeted description that a Human can provide. But a good description is better than no description, and a Machine can really help with large amounts of data (e.g. in a Scheduled Job).

Imagine you've been audited and asked to improve your website's Accessibility and SEO by ensuring every image has alt text provided. Then imagine you have an existing media library with 20,000 images... let me show you how AI can help.

Generating Image metadata in Episerver

The remainder of this blog covers the technical implementation of Azure's Computer Vision service. I focus fairly heavily on its Description feature but touch on other features too.

  1. Configure the Computer Vision service in Azure
  2. Install the NuGet package
  3. Create a class called ImageFile : ImageData (with metadata properties)
  4. Add an Initialization Module that handles the OnSave event for ImageFile
  5. Analyze the image (stream vs URL)
  6. Set the metadata
  7. Display the description as alt text
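As a rough sketch of steps 5 and 6, the example below asks the service to describe an image by URL and picks the highest-confidence caption from the response. It deliberately calls the REST endpoint with HttpClient rather than the NuGet package listed in step 2, so treat it as an illustration of the request and response shape only. The ImageDescriber class is a hypothetical name, the API version in the path ("v3.2") and the use of System.Text.Json are assumptions, and the endpoint and subscription key must come from your own Azure resource.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper: a sketch against the Computer Vision REST API, not the NuGet SDK.
public static class ImageDescriber
{
    // Picks the highest-confidence caption from a Describe Image JSON response,
    // or returns null when the response contains no captions.
    public static string ExtractBestCaption(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            if (!doc.RootElement.TryGetProperty("description", out JsonElement description) ||
                !description.TryGetProperty("captions", out JsonElement captions))
            {
                return null;
            }

            return captions.EnumerateArray()
                .OrderByDescending(c => c.GetProperty("confidence").GetDouble())
                .Select(c => c.GetProperty("text").GetString())
                .FirstOrDefault();
        }
    }

    // Sends the image URL to the Describe endpoint and returns the best caption.
    // Assumption: "endpoint" looks like https://<your-resource>.cognitiveservices.azure.com
    public static async Task<string> DescribeByUrlAsync(
        HttpClient http, string endpoint, string subscriptionKey, string imageUrl)
    {
        var request = new HttpRequestMessage(
            HttpMethod.Post,
            endpoint.TrimEnd('/') + "/vision/v3.2/describe?maxCandidates=1");
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Content = new StringContent(
            JsonSerializer.Serialize(new { url = imageUrl }), Encoding.UTF8, "application/json");

        using (HttpResponseMessage response = await http.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return ExtractBestCaption(await response.Content.ReadAsStringAsync());
        }
    }
}
```

The returned caption is what would be written to the Description property in step 6. Analysing by URL only works when the image is publicly reachable; for unpublished media, sending the image stream instead is the likely alternative.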



In Episerver, ImageData is a base class that inherits from EPiServer.Core.MediaData and provides a Content Type to handle any type of image.

When a CMS editor uploads an image in Episerver DXC, the image is stored in Azure Blob storage and Episerver generates a thumbnail to help the CMS editor select images from the Media tab.

Adding Metadata fields

We needed to add metadata to images. To do this we created a new class that inherits from ImageData called ImageFile.

    [ContentType(GUID = "0A89E464-56D4-449F-AEA8-2BF774AB8730")]
    [MediaDescriptor(ExtensionString = "jpg,jpeg,jpe,ico,gif,bmp,png")]
    public class ImageFile : ImageData, IContentMediaMetaData
    {
        /// <summary>
        /// Description of the Image
        /// </summary>
        public virtual string Description { get; set; }

        /// <summary>
        /// File extension of the Image file
        /// </summary>
        [Editable(false)]
        public virtual string FileExtension { get; set; }

        /// <summary>
        /// File size of the Image file
        /// </summary>
        [Editable(false)]
        public virtual int FileSize { get; set; }

        /// <summary>
        /// Width of the Image
        /// </summary>
        [Editable(false)]
        public virtual int Width { get; set; }

        /// <summary>
        /// Height of the Image
        /// </summary>
        [Editable(false)]
        public virtual int Height { get; set; }
    }

You might have noticed we also implemented an interface called IContentMediaMetaData. We wanted to add some metadata fields to all our media, not just images, but you could choose to omit this interface.

    public interface IContentMediaMetaData : IContentMedia
    {
        string FileExtension { get; set; }

        int FileSize { get; set; }
    }

Once this was done, we just had to upload an image in the CMS and our metadata fields were available.

Episerver Image Metadata

Generating metadata

We're always trying to reduce unnecessary CMS editor data entry. The only field we really need a human to manually populate is the Description field. 

So we added some helper methods to automatically populate the File Size, File Extension, Width and Height fields and added an InitializationModule with CreatingContent and SavingContent event handlers to set these field values.

    using EPiServer;
    using EPiServer.Core;
    using EPiServer.Framework;
    using EPiServer.Framework.Initialization;
    using EPiServer.ServiceLocation;

    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class ContentMediaInitialization : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            // Subscribe to the events raised before content is created or saved.
            var eventRegistry = ServiceLocator.Current.GetInstance<IContentEvents>();

            eventRegistry.CreatingContent += OnCreatingContent;
            eventRegistry.SavingContent += OnSavingContent;
        }

        public void Preload(string[] parameters)
        {
        }

        private static void OnSavingContent(object sender, ContentEventArgs e)
        {
            // Only media that carries our metadata fields is of interest.
            if (e.Content is IContentMediaMetaData media)
            {
                MediaHelpers.SetFileMetaData(media);
            }
        }

        private static void OnCreatingContent(object sender, ContentEventArgs e)
        {
            if (e.Content is IContentMediaMetaData media)
            {
                MediaHelpers.SetFileMetaData(media);
            }
        }

        public void Uninitialize(InitializationEngine context)
        {
            // Detach the handlers again when the module is torn down.
            var eventRegistry = ServiceLocator.Current.GetInstance<IContentEvents>();

            eventRegistry.CreatingContent -= OnCreatingContent;
            eventRegistry.SavingContent -= OnSavingContent;
        }
    }

Now when we upload an image, or even edit one with the built-in Image Editor, our helper methods will automatically populate the metadata fields. With this automated, we decided to make these fields read-only by adding the [Editable(false)] attribute to the properties in our ImageFile class.
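The MediaHelpers class itself is not listed in this post. As a minimal sketch of the file-based half of it, the example below derives the File Extension and File Size values from a file name and a stream. The FileMetaDataReader class and its method names are hypothetical, Width and Height are left out, and the wiring into Episerver shown in the trailing comment is an assumption rather than the original helper.

```csharp
using System;
using System.IO;

// Hypothetical helper: a stand-in for part of what MediaHelpers.SetFileMetaData might do.
public static class FileMetaDataReader
{
    // Returns the lower-case extension without the leading dot, e.g. "jpg".
    // Returns an empty string when the name has no extension.
    public static string GetExtension(string fileName)
    {
        return Path.GetExtension(fileName ?? string.Empty)
            .TrimStart('.')
            .ToLowerInvariant();
    }

    // Returns the stream length in bytes, capped at int.MaxValue
    // so that it fits the int FileSize property.
    public static int GetSize(Stream content)
    {
        return (int)Math.Min(content.Length, int.MaxValue);
    }
}

// Assumed wiring inside MediaHelpers.SetFileMetaData (not from the original post):
//
//     using (var stream = media.BinaryData.OpenRead())
//     {
//         media.FileExtension = FileMetaDataReader.GetExtension(media.Name);
//         media.FileSize = FileMetaDataReader.GetSize(stream);
//     }
```

Keeping the calculation separate from the Episerver types makes it easy to check without a running site.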

Episerver Image Generated Metadata

Adding support for Vector based images

If you noticed the MediaDescriptor ExtensionString list on our ImageFile class, you may be thinking you could just add svg to the list like this... We tried this too!

    [MediaDescriptor(ExtensionString = "jpg,jpeg,jpe,ico,gif,bmp,png,svg")]

While this did allow SVGs to be handled by the ImageFile class, it did present some issues with both thumbnail generation and metadata generation.

  • Thumbnail generation
    Regardless of whether it's a Vector or Raster based image, Episerver will attempt to generate a thumbnail when you upload it. However, in the case of SVGs (vectors), the concept of a thumbnail is not relevant because they are already a scalable object. Episerver generates an empty thumbnail, but we really want it to use the original SVG.

    The following image shows JPG thumbnails compared to SVG thumbnails.
Episerver Generated Thumbnails
  • Metadata generation
    Similarly, as they are infinitely scalable, Width and Height are not relevant for an SVG. 

To solve these issues, we added another new class called VectorImageFile to specifically support vector images. Don't worry, it inherits from the ImageFile class we created previously.

    [ContentType(DisplayName = "VectorImageFile", GUID = "c0b70fd0-0d5b-4c53-83fe-e32ea8faa2d5", Description = "")]
    [MediaDescriptor(ExtensionString = "svg")]
    public class VectorImageFile : ImageFile, IContentMediaMetaData
    {
        /// <summary>
        /// Gets the generated thumbnail for this media.
        /// </summary>
        public override Blob Thumbnail
        {
            get { return BinaryData; }
        }
    }

Take note of the public override Blob Thumbnail property. As the name suggests, this overrides the base class and returns the BinaryData of the actual SVG rather than a generated thumbnail.

The only thing left to do is to prevent our helper from attempting to set the image Width and Height. This was as simple as adding a condition to our helper, ahead of the Width and Height logic, to check if the object type is VectorImageFile:

    if (media is VectorImageFile) return;

So how does all this look in the CMS?

Inside Episerver when we upload images in the Media tab of the Assets pane we now see the following...

JPGs in the Media tab

  • The Description metadata field is available for editing
  • The remaining Metadata fields have been automatically populated
Episerver JPG Metadata Thumbnails

SVGs in the Media tab

  • The Thumbnails are displayed based on the binary data of the actual SVG
  • The Description metadata field is available for editing
  • The Width and Height fields have not been automatically populated
  • The remaining Metadata fields have been automatically populated
Episerver SVG Metadata Thumbnails

Hiding or removing Width and Height fields from VectorImages

Now that the Width and Height fields are no longer being automatically populated, we wanted to hide them. We were able to simply override them in the class definition and add a [ScaffoldColumn(false)] or [Ignore] attribute to the properties.

        [ScaffoldColumn(false)]
        public override int Width { get; set; }

        [ScaffoldColumn(false)]
        public override int Height { get; set; }
Episerver SVG Metadata no dimensions

Wrapping it up

In this blog post we have covered how we improved the CMS editors' experience when working with raster and vector images in Episerver, and how we extended the ImageData base class to include and generate data for metadata fields. We also added a VectorImageFile class to allow the CMS to display SVG thumbnails.
