I recently built a machine learning model, trained it, and explored the implications of deploying it using Kubernetes. Machine learning trains a program to recognize patterns in data so that, when new data is provided, it can make predictions based on what it has learned. Kubernetes is a container orchestrator that automates the deploying and scaling of containers. Packaging a machine learning program and deploying it on Kubernetes has the potential to help our customers with their increasingly complex machine learning needs.

First, I determined what I wanted to accomplish with machine learning, the tools I needed, and how to put them together.  Teaching a machine how to recognize imagery is fascinating to me, so I focused on image recognition. I found the PlanesNet dataset on Kaggle — a dataset focused on recognizing planes in aerial imagery.  The dataset would be simple enough to implement but have good potential for further exploration.

To build a proof of concept I used TensorFlow, TFLearn, and Docker.  TensorFlow is an open-source machine learning library, and TFLearn is a higher-level TensorFlow API that lets you get started with less code.  TFLearn is a Python library, so I wrote my proof of concept in Python.  Docker was used to package the program into a container to run on Kubernetes.

Machine Learning

planes data set

The PlanesNet dataset has 32,000 color images, each at 20px by 20px.  There are 8,000 images classified as “plane” and 24,000 images classified as “no-plane.” Reading in the dataset and preparing it for processing was straightforward using Python libraries like Pillow and NumPy.  Then I broke the dataset up into a training set and a testing set using the train_test_split function in the sklearn Python library.  The training set was used to build the model weights while the testing set validated how well the model was trained.
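
To make that preparation step concrete, here is a minimal sketch of how the loading and splitting might look, assuming the PlanesNet images have been extracted into "plane" and "no-plane" folders. The folder paths and split ratio are illustrative assumptions, not the exact code from this project:

import glob
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

def load_images(pattern, label):
    """Load 20x20 RGB images matching a glob pattern and tag them with a label."""
    images, labels = [], []
    for path in glob.glob(pattern):
        img = Image.open(path).convert("RGB")
        images.append(np.asarray(img, dtype=np.float32) / 255.0)  # scale pixels to 0..1
        labels.append(label)
    return images, labels

planes, plane_labels = load_images("planesnet/plane/*.png", 1)      # hypothetical paths
others, other_labels = load_images("planesnet/no-plane/*.png", 0)

X = np.array(planes + others)   # shape: (N, 20, 20, 3)
y = np.array(plane_labels + other_labels)

# Hold out 20% of the data to validate how well the model was trained.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)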

Neural Network

One of the most complex parts was how to design my simple neural network. A neural network is broken into layers starting with the input layer, then one or more hidden layers, ending with the output layer.

Creating the input layer, I defined its shape, preprocessing, and augmentation. The shape of the PlanesNet data is a 20px by 20px by 3 color-channel matrix. The preprocessing step let me adjust how the data was prepared for both training and testing, while the augmentation let me perform operations (like flipping or rotating images) on the data during training.

Because I was working with imagery, I chose to use two convolutional hidden layers before arriving at the output layer. Convolutional layers perform a set of calculations on each pixel and its surrounding pixels, attempting to learn features (e.g., lines and curves) of an image. After my hidden layers, I applied dropout at a set rate, which helps to reduce overfitting.

Next, I added a readout layer which brought my neural network down to the number of expected outputs.  Finally, I added a regression layer to specify a loss function, optimizer, and learning rate.  With this network, I could now train it over multiple iterations, called epochs, until I reached the desired level of prediction accuracy.
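
Below is a sketch of that network in TFLearn, building on the data prepared earlier. The layer sizes, dropout rate, and hyperparameters are illustrative assumptions rather than the exact values used in this project:

import tflearn
from tflearn.data_utils import to_categorical
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

# Preprocessing (applied to training and testing data) and augmentation (training only).
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()

img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.0)

# Input layer: 20px by 20px images with 3 color channels.
network = input_data(shape=[None, 20, 20, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)

# Two convolutional hidden layers, then dropout to reduce overfitting.
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu')
network = dropout(network, 0.5)

# Readout layer (two outputs: plane / no-plane) and regression layer.
network = fully_connected(network, 2, activation='softmax')
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy', learning_rate=0.001)

# Train over multiple epochs, validating against the held-out test set.
model = tflearn.DNN(network)
model.fit(X_train, to_categorical(y_train, 2), n_epoch=50,
          validation_set=(X_test, to_categorical(y_test, 2)),
          shuffle=True, show_metric=True)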

diagram of a simple neural network

Simple Neural Network

Using this model, I could now predict whether an image contained an airplane. I loaded the image, called the predict function, and viewed the output, which gave the likelihood (as a percentage) that the image was an airplane.
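
For example, scoring a single image might look like this (the file name is a placeholder):

from PIL import Image
import numpy as np

img = Image.open("candidate.png").convert("RGB").resize((20, 20))
sample = np.asarray(img, dtype=np.float32) / 255.0
prediction = model.predict([sample])[0]          # [P(no-plane), P(plane)]
print("Likelihood of a plane: {:.1%}".format(prediction[1]))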

Exploring Scaling

Now that I had my simple neural network for training and predicting, I packaged it into a Docker container so it could be run on more than my single computer.  As I examined the details of deploying to Kubernetes, two things quickly became apparent.  First, my simple neural network would not train across multiple container instances.  Second, my prediction program would scale well if I wrapped it in a web API.

For my simple neural network, I had not implemented anything to split a dataset, train multiple containers, and bring the results back together.  Therefore, running multiple instances of my container on Kubernetes would only provide the benefit of choosing the container with the highest-accuracy model. (The dataset splitting process, along with the input augmentation, causes each container to arrive at a slightly different accuracy.)  Having multiple containers coordinate learning in my simple neural network would require further design.

diagram of the container model

My container’s prediction program executes via the command line, but wrapping it in a web API endpoint would make it easier to use.  Since each instance of the container has the trained model, Kubernetes could scale the number of running instances of my container up or down to meet the demand on the web API endpoint.  Kubernetes also provides a method for rolling out updates to my container if I further train my network model.
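
As a rough sketch of that idea, the prediction could be exposed with a small Flask endpoint inside the container, assuming "model" is the trained TFLearn network from the earlier sketch; the route, port, and field names are illustrative:

import io
import numpy as np
from flask import Flask, request, jsonify
from PIL import Image

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect an uploaded image file, resized to the 20x20 input the model expects.
    img = Image.open(io.BytesIO(request.files["image"].read()))
    img = img.convert("RGB").resize((20, 20))
    sample = np.asarray(img, dtype=np.float32) / 255.0
    plane_probability = float(model.predict([sample])[0][1])
    return jsonify({"plane_probability": plane_probability})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

Each container instance loads the same trained model at startup, so Kubernetes can add or remove identical replicas behind a service endpoint to match demand.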

Conclusion

This was an excellent exercise in building a machine learning model, training it to detect airplanes in aerial imagery, and deploying it on Kubernetes. Additional applications could include expanding the dataset to include different angles of planes, along with recognizing specific types of planes. It would also be worthwhile to redesign the neural network so that training itself could benefit from Kubernetes scaling. Using Kubernetes to deploy a prediction API based on a trained model is both beneficial and practical today.

I’m working on a project with a straightforward requirement that’s typically solved with out-of-the-box PowerApps platform features. The customer needs to capture employment application data as part of their online hiring process. In addition to entering standard employment data, applicants enter a typical month’s worth of budgeted expenses to paint a clear financial picture which serves as one factor in their suitability for employment.

The Requirements

The solution consists of an Application entity and a related Expense entity to track multiple expenses per Application. The Application entity needs to capture aggregate expense values based on expense categories. The Expense entity stores the institution or business receiving the payment, the Expense Type (lookup), the monthly payment in dollars, and the balance owed in dollars. Expense Types are associated with “Expense Categories.” An Expense Category is associated with one or more Expense Types. The Application entity tracks 15-20 aggregate expense values that need to be calculated from the individual expense entries.

For example, the expense types “Life Insurance” and “Auto Insurance” are both associated with the expense category “Insurance.” The Application entity has an aggregate field called “Total Insurance Expenses” to capture the sum of all monthly insurance-related expenses.

The basic entity model is shown below:

Diagram of entity model

To summarize all that detail (which is probably hard to follow in paragraph form), the essential requirements are:

  • Capture standard employment data for applicants
  • Capture monthly budgeted expenses associated with an expense type and category
  • Calculate 15-20 aggregate expenses based on expense category
  • The solution must accommodate the addition of new expense categories

The aggregate fields on the Application entity fall into one of four categories: 1) a monthly payment total by expense category, 2) an outstanding balance total by expense category, 3) a monthly payment total across all categories, and 4) a total outstanding balance across all categories.

The breakdown for each of the expense categories and their associated aggregations on the Application entity can be depicted as such:

Expense Category | Category Payment Rollup | Category Balance Rollup | All Payments Rollup | All Balances Rollup
Automobile | Total Automobile Expenses | Total Auto Balances Owed | Total Monthly Expenses | Total Outstanding Debt
Credit Card | Total Credit Card Expenses | Total Credit Card Balances Owed | Total Monthly Expenses | Total Outstanding Debt
Food/Clothing | Total Food/Clothing Expenses | (none) | Total Monthly Expenses | (none)
Housing | Total Housing Expenses | Total Housing Balances Owed | Total Monthly Expenses | Total Outstanding Debt
Insurance | Total Insurance Expenses | (none) | Total Monthly Expenses | (none)
Medical/Dental | Total Medical Expenses | (none) | Total Monthly Expenses | (none)
Other Debt | Total Other Debt Expenses | Total Other Debt Balances Owed | Total Monthly Expenses | Total Outstanding Debt
Utilities | Total Utility Expenses | (none) | Total Monthly Expenses | (none)

The table shows that some expense categories require a total of four aggregate calculations, whereas others only require two aggregate calculations. The calculations should occur when an Expense is created, the values for a monthly payment or balance owed change, an Expense is deactivated/activated (state change), or when an Expense is deleted. “Total Monthly Expenses” is calculated for all expense entries. Only four categories require the category balance and total outstanding balance calculations.

Platform Limitations

Maximum Number of Rollup Fields

Dynamics 365 only allows for a maximum of 10 rollup fields on any given entity — these are fields that take advantage of the “Rollup” field type customized in the solution, the values for which are automatically calculated by the platform on a predetermined interval, e.g., every 12 hours.

One option to overcome this limitation — only available in on-premises Dynamics 365 implementations — is to modify the maximum number of rollup fields per entity in the MSCRM_CONFIG database. There are rare circumstances wherein modifying table values in this database is beneficial. However, given the possibility of a disaster recovery situation in which Dynamics 365 needs to be reinstalled and/or recovered, any modifications made to the MSCRM_CONFIG database could be lost. Even if an organization has well-documented disaster recovery plans that account for these modifications, there’s always a chance the documented procedures will not be followed or that steps will be skipped.

Another consideration is the potential to move to the cloud. If the customer intends to move their Dynamics 365 application to the cloud, they’ll want to ensure their solution remains on a supported path, and eliminate the need to re-engineer the solution if that day comes.

Rollup Calculation Filters

Rollup fields in Dynamics 365 are indeed a powerful feature, but they do come with limitations that prevent their use in complex circumstances. Rollup fields only support simple filters when defining the rollup field aggregation criteria.

To keep this in the context of our requirements, note above that the Expense Type and Expense Category are lookup values in our solution. If we need to calculate the sum of all credit card expenses entered by an applicant, this is not possible given our current design, because the Expense Type is a lookup value on the expense entry. You’ll notice that when I try to use the Expense Type field in the filter criteria for the rollup field, I’m only given the choices “Does not contain data” and “Contains data.” Not only can’t I use actual values of the Expense Type, but I can’t drill down to the related Expense Category to include it in my aggregation filter.

Screenshot of test rollup screen

Alternatives

The limitations above could be overcome by redesigning our solution, for example, by choosing to configure both the Expense Type and Expense Category fields as Option Sets instead of lookups, along with some sophisticated Business Rules that appropriately set the Expense Category based on the selected Expense Type. That’s one option worth considering, depending on the business requirements with which you’re dealing. We could also choose to develop a code-heavy solution by writing plugin code to do all these calculations, thus side-stepping the limitation on rollup fields and accommodating the entity model I’ve described.

The Solution

Ultimately, however, the customer wants a solution that allows them to update their expense tracking requirements without needing developers to get the job done. For example, the organization may decide they no longer want to track a certain expense category or may want to add a new one. Choosing to create entities to store the necessary lookup values will afford them that kind of flexibility. However, that still leaves us the challenge of calculating the aggregate expense values on the Application entity.

The final solution may still require the involvement of their IT department for some of the configuration steps but ideally will not require code modifications.

Lookup Configuration

The first step toward our solution is to add four additional fields to the Expense Category entity. These four fields represent the four aggregation categories described above: 1) Category Payment Rollup, 2) Category Balance Rollup, 3) Payments Rollup (for all categories), and 4) Balances Rollup (for all categories).

These fields will allow users to define how aggregate values are calculated for each Expense Category, i.e., by identifying the target fields on the Application entity for each aggregation.

Screenshot of Expense Category in Dynamics 365

Custom Workflow Activity

The next step is to write a custom workflow activity to perform the aggregate calculations described above. Custom workflow activities offer several benefits to the customer, primarily centered on ease of configuration within the Dynamics 365 UI itself: they can run asynchronously, on demand, and/or when specific events occur on the target record type (e.g., create, update, delete, or state change). Custom workflow activities can also accept user-defined parameters configured in the workflow definition.

This means that — as you might have guessed — the custom workflow activity can be written in such a way to allow users to add new Expense Categories so that the aggregate calculations “just work” without requiring code modifications or changes to the workflow configuration in the solution.

Here’s the custom workflow activity class that runs the calculations followed by the workflow definition. As you can see below, I’ve included the Application and Expense Category fields as required input parameters for the workflow activity. (I’ll likely refactor this solution to accept the four fields as inputs, but for now, this gets the job done. Thanks to my good friend, Matt Noel, for that suggestion.) Further down, you’ll notice that for each aggregate field, we run a custom fetch query configured appropriately to perform the required calculation.

using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Client;
using Microsoft.Xrm.Sdk.Query;
using Microsoft.Xrm.Sdk.Workflow;
using System;
using System.Activities;

namespace Project.Workflow.Activities
{
    public class CalculateExpenses : CodeActivity
    {
        #region Input Parameters

        [RequiredArgument]
        [Input("Application")]
        [ReferenceTarget("new_application")]
        public InArgument<EntityReference> Application { get; set; }

        [RequiredArgument]
        [Input("Expense Category")]
        [ReferenceTarget("new_expensecategory")]
        public InArgument<EntityReference> ExpenseCategory { get; set; }

        private readonly string _categoryPaymentRollupField = "new_categorypaymentrollup";
        private readonly string _allPaymentsRollupField = "new_paymentsrollup";
        private readonly string _categoryBalanceRollupField = "new_categorybalancerollup";
        private readonly string _allBalancesRollupField = "new_balancesrollup";

        #endregion

        protected override void Execute(CodeActivityContext executionContext)
        {
            var tracer = executionContext.GetExtension<ITracingService>();
            var context = executionContext.GetExtension<IWorkflowContext>();
            var serviceFactory = executionContext.GetExtension<IOrganizationServiceFactory>();
            var orgService = serviceFactory.CreateOrganizationService(null);
            var orgContext = new OrganizationServiceContext(orgService);

            string[] _rollupFields =
            {
                _categoryPaymentRollupField,
                _allPaymentsRollupField,
                _categoryBalanceRollupField,
                _allBalancesRollupField
            };

            var expenseCategory = GetEntity(orgService, ExpenseCategory.Get(executionContext), _rollupFields);
            var applicationRef = Application.Get(executionContext);
            var application = new Entity("new_application")
            {
                Id = applicationRef.Id
            };

            // Set the Category Payment Rollup
            if (expenseCategory.GetAttributeValue<string>(_categoryPaymentRollupField) != null)
            {
                var paymentRollup = expenseCategory.GetAttributeValue<string>(_categoryPaymentRollupField);
                application[paymentRollup] = new Money(GetExpenseAggregate(orgService, application.Id, expenseCategory.Id, false, true));
            }

            // Set the rollup for all Monthly Payments
            if (expenseCategory.GetAttributeValue<string>(_allPaymentsRollupField) != null)
            {
                var allPaymentsRollup = expenseCategory.GetAttributeValue<string>(_allPaymentsRollupField);
                application[allPaymentsRollup] = new Money(GetExpenseAggregate(orgService, application.Id, expenseCategory.Id, false, false));
            }

            // Set the rollup for Category Balances
            if (expenseCategory.GetAttributeValue<string>(_categoryBalanceRollupField) != null)
            {
                var categoryBalanceRollup = expenseCategory.GetAttributeValue<string>(_categoryBalanceRollupField);
                application[categoryBalanceRollup] = new Money(GetExpenseAggregate(orgService, application.Id, expenseCategory.Id, true, true));
            }

            // Set the rollup for all Category Balances 
            if (expenseCategory.GetAttributeValue<string>(_allBalancesRollupField) != null)
            {
                var allBalancesRollup = expenseCategory.GetAttributeValue<string>(_allBalancesRollupField);
                application[allBalancesRollup] = new Money(GetExpenseAggregate(orgService, application.Id, expenseCategory.Id, true, false));
            }

            // Execute the update on the Application rollup fields 
            try
            {
                //trace 
                orgService.Update(application);
            }
            catch (Exception e)
            {
                //trace
                throw new InvalidPluginExecutionException("Error updating Application: " + e.Message);
            }
        }

        private static Entity GetEntity(IOrganizationService service, EntityReference e, params String[] fields)
        {
            return service.Retrieve(e.LogicalName, e.Id, new ColumnSet(fields));
        }

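        // Builds and executes an aggregate FetchXML query that sums either the monthly
        // payment or the balance owed across the Application's active Expense records,
        // optionally restricted to a single Expense Category via the linked entities.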
        private decimal GetExpenseAggregate(IOrganizationService service, Guid applicationId, Guid expenseCategoryId, bool balanceRollup, bool includeCategory)
        {
            var sum = 0m;

            var aggregateField = balanceRollup ? "new_balanceowed" : "new_monthlypayment";

            var fetchXML = @"<fetch distinct='false' mapping='logical' aggregate='true' >" +
                  "<entity name='new_expense' >" +
                    "<attribute name='" + aggregateField + "' aggregate='sum' alias='sum' />" +
                    "<filter type='and' >" +
                      "<condition attribute='statecode' operator='eq' value='0' />" +
                      "<condition attribute='new_applicationid' operator='eq' value='" + applicationId + "' />" +
                    "</filter>";

            var endFetch = "</entity></fetch>";

            if (includeCategory)
            {
                var categoryFetch = @"<link-entity name='new_expensetype' from='new_expensetypeid' to='new_expensetypeid' link-type='inner' alias='ExpenseType' >" +
                      "<link-entity name='new_expensecategory' from='new_expensecategoryid' to='new_categoryid' link-type='inner' alias='ExpenseCategory' >" +
                        "<filter type='and' >" +
                          "<condition attribute='new_expensecategoryid' operator='eq' value='" + expenseCategoryId + "' />" +
                        "</filter>" +
                      "</link-entity>" +
                    "</link-entity>";

                fetchXML += categoryFetch + endFetch;
            }
            else
            {
                fetchXML += endFetch;
            }

            FetchExpression fetch = new FetchExpression(fetchXML);

            try
            {
                EntityCollection aggregate = service.RetrieveMultiple(fetch);

                foreach (var c in aggregate.Entities)
                {
                    if (((AliasedValue)c["sum"]).Value is Money)
                    {
                        sum = ((Money)((AliasedValue)c["sum"]).Value).Value;
                        //tracer.Trace("Sum of payments is: {0}", sum);
                    }
                }
            }
            catch (Exception e)
            {
                //tracer.Trace(e.Message);
                throw new InvalidPluginExecutionException("Error returning aggregate value for " + aggregateField + ": " + e.Message);
            }

            return sum;
        }
    }
}

Workflow definition:

Powerapps workflow screenshot

Configuration of custom inputs:

Configuring custom inputs

Testing

After building and registering my workflow assembly, I created expense entries for all expense types ensuring that all expense categories were represented. The following images depict the successful aggregation of payments and balances:

Screenshot of expense output

Screenshot of debt output

Conclusion

Custom workflow activities are a powerful tool for balancing the need for a highly maintainable solution after deployment with complex requirements that call for a coded solution. The design gives end users the flexibility to adapt their data collection needs over time as their requirements change, and they can do so with little or no involvement from IT.

As I mentioned above, an alternative approach to this requirement could involve writing a plugin to perform the calculations. This approach would still require some entity-based configuration for flexibility but would limit the end-user configuration available if or when requirements change. I can also update the custom workflow activity to accept the four aggregate fields as optional arguments to the workflow. Doing so would enable users to run separate workflow processes for each expense type/category, giving them additional configuration control over these automated calculations.

If you’ve worked in software development or IT for any amount of time, chances are you’ve at least heard about containers…and maybe even Kubernetes.

Maybe you’ve heard that Google manages to spin up two billion containers a week to support their various services, or that Netflix runs its streaming, recommendation, and content systems on a container orchestration platform called Titus.

This is all very exciting stuff, but I’m more excited to write and talk about these things now than ever before, for one simple reason: we are finally at a point where these technologies can make our lives as developers and IT professionals easier!

And even better…you no longer have to be a Google (or one of the other giants) employee to have a practical opportunity to use them.

Containers

Before getting into orchestrators and what they actually offer, let’s briefly discuss the fundamental piece of technology that all of this depends on – the container itself.

A container is a digital package of sorts, and it includes everything needed to run a piece of software.  By “everything,” I mean the application code, any required configuration settings, and the system tools that are normally brought to the table by a computer’s operating system. With those three pieces, you have a digital package that can run a software application in isolation across different computing platforms because the dependencies are included in that package.

And there is one more feature that makes containers really useful – the ability to snapshot the state of a container at any point. This snapshot is called a container “image.” Think of it in the same way you would normally think of a virtual machine image, except that many of the complexities of capturing the current state of a full-blown machine image (state of the OS, consistency of attached disks at the time of the snapshot, etc.) are not present in this snapshot.  Only the components needed to run the software are present, so one or a million instances can be spun-up directly from that image, and they should not interfere with each other.  These “instances” are the actual running containers.

So why is that important? Well, we’ve just alluded to one reason: Containers can run software across different operating systems (various Linux distributions, Windows, Mac OS, etc.).  You can build a package once and run it in many different places. It should seem pretty obvious at this point, but in this way, containers are a great mechanism for application packaging and deployment.

To build on this point, containers are also a great way to distribute your packages as a developer.  I can build my application on my development machine, create a container image that includes the application and everything it needs to run, and push that image to a remote location (typically called a container registry) where it can be downloaded and turned into one or more running instances.

I said that you can package everything your container needs to run successfully, but the last point to make is that the nature of the container package gives you a way to enforce a clear boundary for your application and a way to enforce runtime isolation. This feature is important when you’re running a mix of various applications and tools…and you want to make sure a rogue process built or run by someone else doesn’t interfere with the operation of your application.

Container Orchestrators

So containers came along and provided a bunch of great benefits for me as a developer.  However, what if I start building an application, and then I realize that I need a way to organize and run multiple instances of my container at runtime to address the expected demand?  Or better yet, if I’m building a system comprised of multiple microservices that all need their own container instances running?  Do I have to figure out a way to maintain the desired state of this system that’s really a dynamic collection of container instances?

This is where container orchestration comes in.  A container orchestrator is a tool that helps manage how your container instances are created, scaled, managed at runtime, placed on underlying infrastructure, and allowed to communicate with each other.  The “underlying infrastructure” is a fleet of one or more servers that the orchestrator manages – the cluster.  Ultimately, the orchestrator helps manage the complexity of taking your container-based, in-development applications to a more robust platform.

Typically, interaction with an orchestrator occurs through a well-defined API, and the orchestrator takes on the tasks of creating, deploying, and networking your container instances, exactly as you’ve specified in your API calls, across any container host (the servers included in the cluster).
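
As a brief illustration of what "interacting through a well-defined API" can look like, here is a sketch using the official Kubernetes Python client; the deployment name and namespace are placeholders:

from kubernetes import client, config

config.load_kube_config()   # authenticate using the local kubeconfig

# Ask the API server which container groups (pods) are currently running.
core = client.CoreV1Api()
for pod in core.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.status.phase)

# Declare a new desired state: five replicas of a deployment. The orchestrator
# converges the actual state toward this desired state across the cluster.
apps = client.AppsV1Api()
apps.patch_namespaced_deployment_scale(
    name="my-app", namespace="default",
    body={"spec": {"replicas": 5}})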

Using these fundamental components, orchestrators provide a unified compute layer on top of a fleet of machines that allows you to decouple your application from these machines. And the best orchestrators go one step further and allow you to specify how your application should be composed, thus taking the responsibility of running the application and maintaining the correct runtime configuration…even when unexpected events occur.


Kubernetes

Kubernetes is a container orchestrator that delivers the capabilities mentioned above. (The name “Kubernetes” comes from the Greek term for “pilot” or “helmsman of a ship.”) Currently, it is the most popular container orchestrator in the industry.

Kubernetes was originally developed by Google, based in part on lessons learned from building its internal cluster management and scheduling system, Borg.  Google open-sourced Kubernetes in 2014 and later donated the project to the Cloud Native Computing Foundation (CNCF) to encourage community involvement in its development. The CNCF is a child entity of the Linux Foundation and operates as a “vendor-neutral” governance group. Kubernetes is now consistently among the top ten open-source projects based on total contributors.

Many in the industry say that Kubernetes has “won” the mindshare battle for container orchestrators, but what gives Kubernetes such a compelling value proposition?  Well, beyond meeting the capabilities mentioned above regarding what an orchestrator “should” do, the following points also illustrate what makes Kubernetes stand out:

  • The largest ecosystem of self-driven contributors and users of any orchestrator technology facilitated by CNCF, GitHub, etc.
  • Extensive client application platform support, including Go, Python, Java, .NET, Ruby, and many others.
  • The ability to deploy clusters on-premises or in the cloud, including native, managed offerings from the major public cloud providers (AWS, GCP, Azure). In fact, you can use the SAME API with any deployment of Kubernetes!
  • Diverse workload support with extensive community examples – stateless and stateful, batch, analytics, etc.
  • Resiliency – Kubernetes is a loosely-coupled collection of components centered around deploying, maintaining and scaling workloads.
  • Self-healing – Kubernetes works as an engine for resolving state by converging the actual and the desired state of the system.

Kubernetes Architecture

A Kubernetes cluster will always include a “master” and one or more “workers”.  The master is a collection of processes that manage the cluster, and these processes are deployed on a master node or multiple master nodes for High Availability (HA).  Included in these processes are:

  • The API server (kube-apiserver)
  • A distributed key-value store for persisting cluster management data (etcd)
  • The core control loops for monitoring existing state and managing desired state (kube-controller-manager)
  • The core control loops that allow cloud-platform-specific integration (cloud-controller-manager)
  • A scheduler component that places Kubernetes container groups, known as pods, onto nodes (kube-scheduler)

Worker nodes are responsible for actually running the container instances within the cluster.  In comparison, worker nodes are simpler in that they receive instructions from the master and set out serving up containers.  On the worker node itself, there are three main components installed which make it a worker node in a Kubernetes cluster: an agent called kubelet that identifies the node and communicates with the master, a network proxy for interfacing with the cluster network stack (kube-proxy), and a plug-in interface that allows kubelet to use a variety of container runtimes, called the container runtime interface.

diagram of Kubernetes architecture

Image source

Managed Kubernetes and Azure Kubernetes Service

“Managed Kubernetes” is a deceptively broad term that describes a scenario where a public cloud provider (Microsoft, Amazon, Google, etc.) goes a step beyond simply hosting your Kubernetes clusters in virtual machines to take responsibility for deploying and managing your cluster for you.  Or, more accurately, they will manage portions of your cluster for you.  I say “deceptively” broad for this reason – the portions that are “managed” vary by vendor.

The idea is that the cloud provider is:

  1. Experienced at managing infrastructure at scale and can leverage tools and processes most individuals or companies can’t.
  2. Experienced at managing Kubernetes specifically, and can leverage dedicated engineering and support teams.
  3. Able to add additional value by providing supporting services on the cloud platform.

In this model, the provider does things like abstracting the need to operate the underlying virtual machines in a cluster, providing automation for actions like scaling a cluster, upgrading to new versions of Kubernetes, etc.

So the advantage for you, as a developer, is that you can focus more of your attention on building the software that will run on top of the cluster, instead of on managing your Kubernetes cluster, patching it, providing HA, etc. Additionally, the provider will often offer complementary services you can leverage like a private container registry service, tools for monitoring your containers in the cluster, etc.

Microsoft Azure offers the Azure Kubernetes Service (AKS), which is Azure’s managed Kubernetes offering. AKS allows full production-grade Kubernetes clusters to be provisioned through the Azure portal or automation scripts (ARM, PowerShell, CLI, or a combination).  Key components of the cluster provisioned through the service include:

  • A fully managed, highly available master. There’s no need to run separate virtual machines for the master components; the service provides this for you.
  • Automated provisioning of worker nodes – deployed as Virtual Machines in a dedicated Azure resource group.
  • Automated cluster node upgrades (Kubernetes version).
  • Cluster scaling through auto-scale or automation scripts.
  • CNCF certification as a compliant managed Kubernetes service. This means it leverages the Cloud-controller-manager standard discussed above, and its implementation is endorsed by the CNCF.
  • Integration with supporting Azure services including Azure Virtual Networks, Azure Storage, Azure Role-Based Access Control (RBAC), and Azure Container Registry.
  • Integrated logging for apps, nodes, and controllers.

Conclusion

The world of containers continues to evolve, and orchestration is an important consideration when deploying your container-based applications to environments beyond “development.”  While not simple, Kubernetes is a very popular choice for container orchestration and has extremely strong community support.  The evolution of managed Kubernetes makes using this platform more realistic than ever for developers (or businesses) interested in focusing on “shipping” software.

I recently encountered an issue when trying to create an Exact Age column for a contact in Microsoft Dynamics CRM. There were several solutions available on the internet, but none of them was a good match for my specific situation. Some ideas I explored included:

  1. Creating a calculated field using the formula DiffInDays(DOB, Now()) / 365 or DiffInYears(DOB, Now()) – I used this at first, but if the calculated field is a decimal type, you end up with a fractional value for the age, which is not desirable. If the calculated field is a whole number type, then the value is always rounded. So, if the DOB is 2/1/1972 and the current date is 1/1/2019, the Age shows as 47 when the contact is actually still 46 until 2/1/2019.
  2. Using JavaScript to calculate the Age – The problem with this approach is that if the record is not saved, the data becomes stale. It also does not work in a view (i.e., if you want to see a list of client ages). The JavaScript solution is geared toward the form UI experience only.
  3. Using Workflows with Timeouts – This approach seemed a bit complicated and cumbersome to update values daily across so many records.

Determining Exact Age

Instead, I decided to plug some of the age scenarios into Microsoft Excel and simulate Dynamics CRM’s calculations to see if I could come up with any ideas.

Note: 365.25 is used to account for leap years. I originally used 365, but the data was incorrect. After reading about leap years, I decided to plug 365.25 in, and everything lined up.

Excel Formulas

Setting up the formulas above, I was able to calculate the values below. I found that subtracting the DATEDIF Rounded value from the DATEDIF Actual value produced a negative value when the month/day was after the current date (2/16/2019 at the time). This allowed me to introduce a factor of -1 when the Difference was less than or equal to 0.  Using this finding, I set up the solution in CRM.

Excel Calculations

The Solution

  1. Create the necessary fields.
    Field | Data Type | Field Type | Other | Formula
    DOB | Date and Time | Simple | Behavior: User Local | (none)
    Age Actual | Decimal Number | Calculated | Precision: 10 | DiffInDays(new_dob, Now()) / 365.25
    Age Rounded | Whole Number | Calculated | (none) | DiffInDays(new_dob, Now()) / 365.25
    Age Difference | Decimal Number | Calculated | Precision: 10 | new_ageactual – new_agerounded
    Age | Whole Number | Calculated | (none) | See below
  2. Create a business rule for DOB, setting it equal to Birthdate when Birthdate contains data. This way, when Birthdate is set, DOB is set automatically. This arrangement is necessary for the other calculated fields.
    Business Rules
  3. Set up the Age calculated field as follows (a sketch of the equivalent logic appears after this list):
    Age Calculated Field
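
Here is a sketch (in Python, purely for illustration) of the logic those calculated fields implement together; Dynamics performs the equivalent steps with DiffInDays and whole-number rounding:

from datetime import date

def exact_age(dob: date, as_of: date) -> int:
    age_actual = (as_of - dob).days / 365.25   # Age Actual (decimal)
    age_rounded = round(age_actual)            # Age Rounded (whole number)
    difference = age_actual - age_rounded      # Age Difference
    # When the rounded value has overshot (difference <= 0), subtract one year.
    return age_rounded - 1 if difference <= 0 else age_rounded

print(exact_age(date(1972, 2, 1), date(2019, 1, 1)))   # 46 (turns 47 on 2/1/2019)
print(exact_age(date(1972, 2, 1), date(2019, 2, 16)))  # 47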

Once these three steps have been completed, your new Age field should be ready to use. I created a view to verify the calculations. I happened to be writing this post very late on the night of 2/16/2019. I wrote the first part before 12:00 a.m., then refreshed the view before taking the screenshot below. I was happy to see the Age Test 3 record flip from 46 to 47 when I refreshed after 12:00 a.m.

Age Solution Results

Determining Exact Age at Some Date in the Future

The requirement that drove my research for this solution was the need to determine the exact age in the future. Our client needed to know the age of a traveler on the date of travel. Depending on the country being visited and the age of the traveler on the date of departure, different forms would need to be sent in order to prevent problems when the traveler arrived at his or her destination. The solution was very similar to the Age example above:

The Solution

  1. Here is an overview of the entity hierarchy:
    Age at Travel Entities
  2. Create the necessary fields.
    Entity | Field | Data Type | Field Type | Other | Formula
    Trip | Start Date | Date and Time | Simple | Behavior: User Local | (none)
    Contact | DOB | Date and Time | Simple | Behavior: User Local | (none)
    Trip Contact | Age at Travel Actual | Decimal Number | Calculated | Precision: 10 | DiffInDays(contact.dob, new_trip.start) / 365.25
    Trip Contact | Age at Travel Rounded | Whole Number | Calculated | n/a | DiffInDays(contact.dob, new_trip.start) / 365.25
    Trip Contact | Age at Travel Difference | Decimal Number | Calculated | Precision: 10 | new_ageattravelactual – new_ageattravelrounded
    Trip Contact | Age at Travel | Whole Number | Calculated | n/a | See below
  3. Create a business rule for Contact DOB, setting it equal to Birthdate when Birthdate contains data. This way, when Birthdate is set, DOB is set automatically. This arrangement is necessary for the other calculated fields.
    Business Rules
  4. Set up the Trip Contact’s Age at Travel calculated field as follows (a short usage example of the earlier sketch appears after this list):
    Age at Travel Calculated Field
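
Reusing the exact_age sketch from above with a future travel date reproduces the behavior described in the examples below (dates taken from those examples):

from datetime import date

print(exact_age(date(2003, 9, 29), date(2020, 8, 14)))  # 16 on the travel date, turns 17 later
print(exact_age(date(2008, 4, 12), date(2020, 8, 14)))  # 12, birthday falls before the trip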

Once these steps have been completed, your new Age at Travel field should be ready to use. I created a view to verify the calculations.

You’ll notice that in the red example, the trip starts on 8/14/2020. The contact was born on 9/29/2003 and is 16 on the date of travel but turns 17 a month or so later. In the green example, the trip is also on 8/14/2020. The contact was born 4/12/2008 and will turn 12 before the date of travel.

Age at Travel Solution Results

Conclusion

While there are several approaches to the Age issue in Dynamics CRM, this is a great alternative that requires no code and works in real time. I hope you find it useful!

Today, let’s talk about network isolation and traffic policy within the context of Kubernetes.

Network Policy Specification

Kubernetes’ first-class notion of network policy allows a customer to determine which pods are allowed to talk to other pods. While these policies are part of the Kubernetes specification, it is tools like Calico and Cilium that implement them.

Here is a simple example of a network policy:

...
  ingress:
  - from:
    - podSelector:
        matchLabels:
          zone: trusted
  ...

In the above example, only pods with the label zone: trusted are allowed to send incoming traffic to the pod.

egress:
  - action: deny
    destination:
      notSelector: ns == 'gateway'

The above example deals with outgoing traffic. This network policy ensures that outgoing traffic is blocked unless the destination carries the label ns == 'gateway'.

As you can see, network policies are important for isolating pods from each other in order to avoid leaking information between applications. However, if you are dealing with data that requires higher trust levels, you may want to consider isolating the applications at the cluster level. The following diagrams depict both logical (network policy based) and physical (isolated) clusters.

Diagram of a Prod Cluster Diagrams of Prod Team Clusters

Network Policy is NOT Traffic Routing…Enter Istio!

Network policies, however, do not allow us to control the flow of traffic on a granular level. For example, let’s assume that we have three versions of a “reviews” service (a service that returns user reviews for a given product). If we want the ability to route the traffic to any of these three versions dynamically, we will need to rely on something else. In this case, let’s use the traffic routing provided by Istio.

Istio is a tool that manages the traffic flow across services using two primary components:

  1. An Envoy proxy (more on Envoy later in the post) distributes traffic based on a set of rules.
  2. The Pilot manages and configures the traffic rules that let you specify how traffic should be routed.

Diagram of Istio Traffic Management

image source

Here is an example of Istio policy that directs all traffic to the V1 version of the “reviews” service:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1

Here is a Kiali Console view of all “live” traffic being sent to the V1 version of the “reviews” service:

Kiali console screenshot

Now here’s an example of Istio policy that directs all traffic to the V3 version of the “reviews” service:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v3

And here is a Kiali Console view of all “live” traffic being sent to the V3 version of the “reviews” service:

Kiali console screenshot v3

Envoy Proxy

Envoy is a lightweight proxy with powerful routing constructs. In the example above, the Envoy proxy is placed as a “sidecar” alongside our services (product page and reviews) and handles their outbound traffic. Envoy can dynamically route all outbound calls from the product page to the appropriate version of the “reviews” service.

We already know that Istio makes it simple for us to configure the traffic routing policies in one place (via the Pilot). But Istio also makes it simple to inject the Envoy proxy as a sidecar. The following Kubectl command labels the namespace for automatic sidecar injection:

#--> Enable Side Car Injection
kubectl label namespace bookinfo istio-injection=enabled

As you can see, each pod has two containers (the service and the Envoy proxy):

# Get all pods 
kubectl get pods --namespace=bookinfo

I hope this blog post helps you think about traffic routing between Kubernetes pods using Istio and Envoy. In future blog posts, we’ll explore the other facets of a “service mesh” – a common substrate for managing a large number of services, with traffic routing being just one facet of a service mesh.

Driving value, lowering costs, and building your organization’s future with Microsoft’s next great business technology

Lately, I’ve been helping folks understand the Microsoft Power Platform (MPP) by sharing two simple diagrams.

The first one is below and is my stab (others have made theirs) at contextualizing the platform’s various components in relation to one another.

The Common Data Service (CDS) is the real magic, I tell people. No matter which app you are using, the data lives there in that one CDS across your entire environment. (And no, folks outside your organization don’t get to use it.) This means that data available to one of your apps can be re-used and re-purposed by your other apps, no wizardry or custom integration required. I promise, it just works. Think expansively about the power of this in your organization, and you’ll come up with some cockamamie/brilliant ideas about what you can do.

These are the types of data-driving-business-function that geeks like me always dreamed of.

A diagram of Microsoft Power Platform components

Then there’s PowerApps, in purple. Most folks think of this as a low-code/no-code app development tool. It is, but it’s more. Imagine that there are three flavors of PowerApps:

  1. Dynamics 365, which in the end is a set of really big PowerApps developed by Microsoft
  2. COTS apps developed by Microsoft partners (including AIS), available for organizations to license and use
  3. Custom apps you build yourself

Point Microsoft Power BI at all of this, then mash it up with data from outside your CDS, which you reach via hundreds of out-of-the-box connectors, automate it all together with workflows in Flow…and you’ve got the Power Platform in a nutshell.

When I’m presenting this to a group, I turn to my next slide pretty quickly at this point.

A rearranged look at Microsoft Power Platform

Here I’ve essentially re-arranged the pieces to make my broader point: When we think about the Power Platform, the emphasis needs to be on the Platform bit. When your organization invests in this technology, say via working with an implementation partner such as AIS or purchasing PowerApps P1/P2 licenses, you’re not just getting a product or a one-off app solution.

What you’re getting is a platform on which to build your modern business. You’re not just extending Office 365. Instead, you’re creating a future where your organization’s data and business processes are deeply integrated with, driving, and learning intelligently from one another.

The more you leverage the platform, the higher the ROI and the lower the marginal costs of those licenses become. A central goal of any implementing partner ought to be guiding organizations on the journey of migrating legacy systems onto the platform (i.e., retiring legacy licensing + O&M costs) and empowering workers to make the platform even more valuable.

We don’t invest in one-off apps anymore: a CRM in one corner of your network where you run your sales, something in another where you manage your delivery, clunky Human Resources Management off over there where you take care of your people, etc. No, what we care about here is the platform where you integrate all of the above — not through monolithic one-size-fits-all ERP — but rather through elegant app experiences across all your users’ devices that tie back to that magical Common Data Service.

This is what I mean when I tell folks the sky’s the limit, and that thinking about your entire business is what’s called for here. It’s because the Power Platform gives us the ability to learn and grow with our customers, constituents, vendors, employees, and other stakeholders like never before.

That’s what has everyone at Microsoft so excited. I am as well.

I want to learn from you. How do you make Power Platform understandable to those who haven’t thought about it too deeply? How does your organization make it valuable as a platform rather than just a product? I love to build beautiful things, so inspire me!

The business intelligence, automation, and enterprise application landscape is changing dramatically.

In the previous incarnation of enterprise technology, line-of-business owners were forced to choose between pre-baked commercial off-the-shelf (COTS) software, which was difficult to customize and often did not truly meet the business’s unique needs, and custom solutions that (though flexible and often tailor-made to the business needs of the moment) cost more and were far riskier to develop and deploy.

Furthermore, certain classes of applications do not have a COTS answer, nor do they justify the cost of custom software development. In the chasm between the two arose a generation of quasi-apps: the homegrown Excel spreadsheets, Access databases, Google docs, and all manner of other back-of-the-napkin “systems.” End users developed these quasi-apps to fill the gaps between the big software IT provided and what users actually needed to do their jobs.

We’ve all been there: The massive spreadsheet that tracked a decade’s worth of employee travel but was always one accidental click away from oblivion. Or the quirky asset management database living on your officemate’s desktop (and still named after an employee who left the company five years ago); the SharePoint site full of sensitive HR data, or the shared network drive that had long been “shared” a bit too liberally. A generation of do-it-yourself workers grew up living on the edge of catastrophe with their quasi-apps.

Thankfully, three trends have converged to shatter this paradigm in 2019, fundamentally changing the relationship between business users, technologists, and their technology.

Connectivity of Everything

The new generation of business applications is hyper-connected to one another. They allow for connections between business functions previously considered siloed, unrelated, or simply not feasible or practical: travel plans set in motion by human resources decisions, medical procedures scheduled based on a combination of lab results and provider availability, and employee recruiting driven by sales and contracts.

Citizens’ Uprising

Business users long settled for spreadsheets and SharePoint, but new “low-code/no-code” tools empower these “citizen creators” to build professional-grade apps on their own. Airport baggage screeners can develop mobile apps that cut down on paperwork, trainers and facilitators can put interactive tools in the hands of their students, and analysts and researchers are no longer dependent on developers to “pull data” and create stunning visualizations.

New Ways of Looking at the World (& Your Data)

This isn’t just about business intelligence (BI) and data visualization tools far outpacing anything else that was recently available. It’s not even just about business users’ ability to harness and extend those tools. This is about the ability of tools like Microsoft Power BI to splice together, beautifully visualize, and help users interpret data that their organizations already own — data to which you’ve connected using one of the hundreds of native connectors to third-party services, and data generated every second of every minute of every day from the connected devices that enable the organization’s work.

It’s an exciting time. I’ve explored these trends further, plus how Microsoft’s Power Platform has become the go-to platform for organizations mastering the new landscape in my whitepaper, Microsoft’s Power Platform and the Future of Business Applications. We’re way past CRM. I hope you’ll read it and share your thoughts with me!

Accurately identifying and authenticating users is an essential requirement for any modern application. As modern applications continue to migrate beyond the physical boundaries of the data center and into the cloud, balancing the ability to leverage trusted identity stores with the need for enhanced flexibility to support this migration can be tricky. Additionally, evolving requirements like allowing multiple partners, authenticating across devices, or supporting new identity sources push application teams to embrace modern authentication protocols.

Microsoft states that federated identity is the ability to “Delegate authentication to an external identity provider. This can simplify development, minimize the requirement for user administration, and improve the user experience of the application.”

As organizations expand their user base to allow authentication of multiple users/partners/collaborators in their systems, the need for federated identity is imperative.

The Benefits of Federated Authentication

Federated authentication allows organizations to reliably outsource their authentication mechanism. It helps them focus on actually providing their service instead of spending time and effort on authentication infrastructure. An organization or service that provides authentication to its sub-systems is called an Identity Provider, and it provides federated identity authentication to the service provider/relying party. By using a common identity provider, relying applications can easily access other applications and web sites using single sign-on (SSO).

SSO provides quick accessibility for users to multiple web sites without needing to manage individual passwords. Relying party applications communicate with a service provider, which then communicates with the identity provider to get user claims (claims authentication).

For example, an application registered in Azure Active Directory (AAD) relies on it as the identity provider. Users accessing an application registered in AAD are prompted for their credentials, and upon authentication by AAD, the access tokens are sent to the application. The valid claims token authenticates the user, and the application performs any further authorization. The application itself doesn’t need additional authentication mechanisms, thanks to the federated authentication from AAD. The authentication process can be combined with multi-factor authentication as well.
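
As a small, hedged illustration of AAD acting as the identity provider for an application, here is what acquiring a token can look like with the MSAL library for Python; the client ID, tenant ID, and scopes are placeholders, and this flow uses OAuth/OpenID Connect rather than the SAML flow explored later in this post:

import msal

app = msal.PublicClientApplication(
    "00000000-0000-0000-0000-000000000000",                      # app (client) ID placeholder
    authority="https://login.microsoftonline.com/your-tenant-id")

# Prompts the user to sign in against AAD; tokens are returned on success.
result = app.acquire_token_interactive(scopes=["User.Read"])

if "access_token" in result:
    print("Authenticated. ID token claims:", result.get("id_token_claims"))
else:
    print("Authentication failed:", result.get("error_description"))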

Glossary

Abbreviation Description
STS Security Token Service
IdP Identity Provider
SP Service Provider
POC Proof of Concept
SAML Security Assertion Markup Language
RP Relying party (same as service provider) that calls the Identity Provider to get tokens
AAD Azure Active Directory
ADDS Active Directory Domain Services
ADFS Active Directory Federation Services
OWIN Open Web Interface for .NET
SSO Single sign on
MFA Multi factor authentication

OpenId Connect/OAuth 2.0 & SAML

SAML and OpenID/OAuth are the two main protocols that modern applications implement and consume as a service to authenticate their users. Both provide a framework for implementing SSO/federated authentication. OpenID Connect is an open standard for authentication and combines with OAuth for authorization. SAML is also an open standard and provides both authentication and authorization.  OpenID Connect tokens are JSON-based; OAuth 2.0 tokens can be either JSON or SAML2, whereas SAML is XML-based. OpenID/OAuth are best suited for consumer applications like mobile apps, while SAML is preferred for enterprise-wide SSO implementations.

Microsoft Azure Cloud Identity Providers

The Microsoft Azure cloud provides numerous authentication methods for cloud-hosted and “hybrid” on-premises applications, including options for either OpenID/OAuth or SAML authentication. Some of the identity solutions are Azure Active Directory (AAD), Azure AD B2C, Azure AD B2B, Azure AD pass-through authentication, Active Directory Federation Services (ADFS), migrating on-premises ADFS applications to Azure, and Azure AD Connect with federation and SAML as IdP.

Identity providers that implement the SAML 2.0 standard include Azure Active Directory (AAD), Okta, OneLogin, PingOne, and Shibboleth.

A Deep Dive Implementation

This blog post will walk through an example I recently worked on using federated authentication with the SAML protocol. I was able to dive deep into identity and authentication with an assigned proof of concept (POC): creating a claims-aware application within an ASP.NET Azure Web Application using federated authentication and the SAML protocol. I used OWIN middleware to connect to the Identity Provider.

The scope of POC was not to develop an Identity Provider/STS (Security Token Service) but to develop a Service Provider/Relying Party (RP) which sends a SAML request and receives SAML tokens/assertions. The SAML tokens are used by the calling application to authorize the user into the application.

Given that scope, I used a stub Identity Provider, so that the authentication implementation could later be plugged into a production application and communicate with other enterprise SAML Identity Providers.

The Approach

For an application to be claims aware, it needs to obtain a claims token from an Identity Provider. The claims contained in the token are then used for additional authorization in the application. Claims tokens are issued by an Identity Provider after authenticating the user. The login page for the application (where the user signs in) can be a Service Provider (Relying Party) itself, or just an ASP.NET UI application that communicates with the Service Provider via a separate implementation.

Figure 1: Overall architecture – Identity Provider Implementation

The Implementation

An ASP.NET MVC application was implemented as the SAML Service Provider, using OWIN middleware to initiate the connection with the SAML Identity Provider.

First, the communication is initiated with a SAML request from the service provider. The identity provider validates the SAML request, verifies and authenticates the user, and sends back the SAML tokens/assertions. The claims returned to the service provider are then sent back to the client application. Finally, the client application can authorize the user based on roles or other more fine-grained permissions after reviewing the claims returned from the SAML identity provider.

Sustainsys Saml2 is an open-source library that adds SAML2P support to ASP.NET web sites, letting them act as the SAML2 Service Provider (SP). For the proof of concept effort, I used the Sustainsys stub SAML identity provider to test the SAML service provider. Sustainsys also provides sample service provider implementations that work against the stub.

Implementation steps:

  • Start with an ASP.NET MVC application.
  • Add NuGet packages for OWIN middleware and SustainSys SAML2 libraries to the project (Figure 2).
  • Modify the Startup.cs (partial classes) to build the SAML request; set up all the authentication types such as cookies, default sign-in, and SAML2 (Listing 2).
  • The CreateSaml2Options and CreateSPOptions methods build the SAML request options, including the private and public certificates, the federation SAML Identity Provider URL, etc.
  • The service provider establishes the connection to the identity provider on startup and is ready to listen for client requests.
  • Cookie authentication is enabled, the default authentication type is set to “Application,” and the SAML authentication request is configured by forming the SAML request.
  • Once the SAML request options are set, the Identity Provider is instantiated with its URL and those options, and Federation is set to true. The Service Provider is then instantiated with the SAML request options pointing at the SAML identity provider. When the user signs in, the OWIN middleware issues a challenge to the Identity Provider and receives the SAML response and claims/assertions back at the service provider.
  • The OWIN middleware issues the challenge to the SAML Identity Provider with a callback method (ExternalLoginCallback(…)). The identity provider redirects back to that callback method after authenticating the user (Listing 3).
  • AuthenticateAsync returns the claims from the Identity Provider, and the user is authenticated at this point. The application can use the claims to authorize the user within the application.
  • No additional web configuration is needed for SAML Identity Provider communication, but the application config values can be persisted in web.config.

Figure 2: OWIN Middleware NuGet Packages

Listing 1:  Startup.cs (Partial)

using Microsoft.Owin;
using Owin;

[assembly: OwinStartup(typeof(Claims_MVC_SAML_OWIN_SustainSys.Startup))]

namespace Claims_MVC_SAML_OWIN_SustainSys
{
    public partial class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            ConfigureAuth(app);
        }
    }
}

Listing 2: Startup.cs (Partial)

using Microsoft.Owin;
using Microsoft.Owin.Security;
using Microsoft.Owin.Security.Cookies;
using Owin;
using Sustainsys.Saml2;
using Sustainsys.Saml2.Configuration;
using Sustainsys.Saml2.Metadata;
using Sustainsys.Saml2.Owin;
using Sustainsys.Saml2.WebSso;
using System;
using System.Configuration;
using System.Globalization;
using System.IdentityModel.Metadata;
using System.Security.Cryptography.X509Certificates;
using System.Web.Hosting;

namespace Claims_MVC_SAML_OWIN_SustainSys
{
    public partial class Startup
    {
        public void ConfigureAuth(IAppBuilder app)
        {            
            // Enable Application Sign In Cookie
            var cookieOptions = new CookieAuthenticationOptions
            {
                LoginPath = new PathString("/Account/Login"),
                AuthenticationType = "Application",
                AuthenticationMode = AuthenticationMode.Passive
            };

            app.UseCookieAuthentication(cookieOptions);

            app.SetDefaultSignInAsAuthenticationType(cookieOptions.AuthenticationType);

            app.UseSaml2Authentication(CreateSaml2Options());
        }

        private static Saml2AuthenticationOptions CreateSaml2Options()
        {
            string samlIdpUrl = ConfigurationManager.AppSettings["SAML_IDP_URL"];
            string x509FileNamePath = ConfigurationManager.AppSettings["x509_File_Path"];

            var spOptions = CreateSPOptions();
            var Saml2Options = new Saml2AuthenticationOptions(false)
            {
                SPOptions = spOptions
            };

            var idp = new IdentityProvider(new EntityId(samlIdpUrl + "Metadata"), spOptions)
            {
                AllowUnsolicitedAuthnResponse = true,
                Binding = Saml2BindingType.HttpRedirect,
                SingleSignOnServiceUrl = new Uri(samlIdpUrl)
            };

            idp.SigningKeys.AddConfiguredKey(
                new X509Certificate2(HostingEnvironment.MapPath(x509FileNamePath)));

            Saml2Options.IdentityProviders.Add(idp);
            new Federation(samlIdpUrl + "Federation", true, Saml2Options);

            return Saml2Options;
        }

        private static SPOptions CreateSPOptions()
        {
            string entityID = ConfigurationManager.AppSettings["Entity_ID"];
            string serviceProviderReturnUrl = ConfigurationManager.AppSettings["ServiceProvider_Return_URL"];
            string pfxFilePath = ConfigurationManager.AppSettings["Private_Key_File_Path"];
            string samlIdpOrgName = ConfigurationManager.AppSettings["SAML_IDP_Org_Name"];
            string samlIdpOrgDisplayName = ConfigurationManager.AppSettings["SAML_IDP_Org_Display_Name"];

            var swedish = CultureInfo.GetCultureInfo("sv-se");
            var organization = new Organization();
            organization.Names.Add(new LocalizedName(samlIdpOrgName, swedish));
            organization.DisplayNames.Add(new LocalizedName(samlIdpOrgDisplayName, swedish));
            organization.Urls.Add(new LocalizedUri(new Uri("http://www.Sustainsys.se"), swedish));

            var spOptions = new SPOptions
            {
                EntityId = new EntityId(entityID),
                ReturnUrl = new Uri(serviceProviderReturnUrl),
                Organization = organization
            };
        
            var attributeConsumingService = new AttributeConsumingService("Saml2")
            {
                IsDefault = true,
            };

            attributeConsumingService.RequestedAttributes.Add(
                new RequestedAttribute("urn:someName")
                {
                    FriendlyName = "Some Name",
                    IsRequired = true,
                    NameFormat = RequestedAttribute.AttributeNameFormatUri
                });

            attributeConsumingService.RequestedAttributes.Add(
                new RequestedAttribute("Minimal"));

            spOptions.AttributeConsumingServices.Add(attributeConsumingService);

            spOptions.ServiceCertificates.Add(new X509Certificate2(
                AppDomain.CurrentDomain.SetupInformation.ApplicationBase + pfxFilePath));

            return spOptions;
        }
    }
}

Listing 3: AccountController.cs

using Claims_MVC_SAML_OWIN_SustainSys.Models;
using Microsoft.Owin.Security;
using System.Security.Claims;
using System.Text;
using System.Web;
using System.Web.Mvc;

namespace Claims_MVC_SAML_OWIN_SustainSys.Controllers
{
    [Authorize]
    public class AccountController : Controller
    {
        public AccountController()
        {
        }

        [AllowAnonymous]
        public ActionResult Login(string returnUrl)
        {
            ViewBag.ReturnUrl = returnUrl;
            return View();
        }

        //
        // POST: /Account/ExternalLogin
        [HttpPost]
        [AllowAnonymous]
        [ValidateAntiForgeryToken]
        public ActionResult ExternalLogin(string provider, string returnUrl)
        {
            // Request a redirect to the external login provider
            return new ChallengeResult(provider, Url.Action("ExternalLoginCallback", "Account", new { ReturnUrl = returnUrl }));
        }

        // GET: /Account/ExternalLoginCallback
        [AllowAnonymous]
        public ActionResult ExternalLoginCallback(string returnUrl)
        {
            var loginInfo = AuthenticationManager.AuthenticateAsync("Application").Result;
            if (loginInfo == null)
            {
                return RedirectToAction("Login");
            }

            //Loop through to get claims for logged in user
            StringBuilder sb = new StringBuilder();
            foreach (Claim cl in loginInfo.Identity.Claims)
            {
                sb.AppendLine("Issuer: " + cl.Issuer);
                sb.AppendLine("Subject: " + cl.Subject.Name);
                sb.AppendLine("Type: " + cl.Type);
                sb.AppendLine("Value: " + cl.Value);
                sb.AppendLine();
            }
            ViewBag.CurrentUserClaims = sb.ToString();
            
            //ASP.NET ClaimsPrincipal is empty as Identity returned from AuthenticateAsync should be cast to IPrincipal
            //var identity = (ClaimsPrincipal)Thread.CurrentPrincipal;
            //var claims = identity.Claims;
            //string nameClaimValue = User.Identity.Name;
            //IEnumerable<Claim> claims = ClaimsPrincipal.Current.Claims;
          
            return View("Login", new ExternalLoginConfirmationViewModel { Email = loginInfo.Identity.Name });
        }

        // Used for XSRF protection when adding external logins
        private const string XsrfKey = "XsrfId";

        private IAuthenticationManager AuthenticationManager
        {
            get
            {
                return HttpContext.GetOwinContext().Authentication;
            }
        }
        internal class ChallengeResult : HttpUnauthorizedResult
        {
            public ChallengeResult(string provider, string redirectUri)
                : this(provider, redirectUri, null)
            {
            }

            public ChallengeResult(string provider, string redirectUri, string userId)
            {
                LoginProvider = provider;
                RedirectUri = redirectUri;
                UserId = userId;
            }

            public string LoginProvider { get; set; }
            public string RedirectUri { get; set; }
            public string UserId { get; set; }

            public override void ExecuteResult(ControllerContext context)
            {
                var properties = new AuthenticationProperties { RedirectUri = RedirectUri };
                if (UserId != null)
                {
                    properties.Dictionary[XsrfKey] = UserId;
                }
                context.HttpContext.GetOwinContext().Authentication.Challenge(properties, LoginProvider);
            }
        }
    }
}

Listing 4: Web.Config

<?xml version="1.0" encoding="utf-8"?>
<!--
  For more information on how to configure your ASP.NET application, please visit
  https://go.microsoft.com/fwlink/?LinkId=301880
  -->
<configuration>
  <appSettings>
    <add key="webpages:Version" value="3.0.0.0" />
    <add key="webpages:Enabled" value="false" />
    <add key="ClientValidationEnabled" value="true" />
    <add key="UnobtrusiveJavaScriptEnabled" value="true" />
    <add key="SAML_IDP_URL" value="http://localhost:52071/" />
    <add key="x509_File_Path" value="~/App_Data/stubidp.sustainsys.com.cer"/>
    <add key="Private_Key_File_Path" value="/App_Data/Sustainsys.Saml2.Tests.pfx"/>
    <add key="Entity_ID" value="http://localhost:57234/Saml2"/>
    <add key="ServiceProvider_Return_URL" value="http://localhost:57234/Account/ExternalLoginCallback"/>
    <add key="SAML_IDP_Org_Name" value="Sustainsys"/>
    <add key="SAML_IDP_Org_Display_Name" value="Sustainsys AB"/>
  </appSettings>
</configuration>

Claims returned from the identity provider to the service provider


On December 15th, I had the pleasure of presenting a session of “Introduction to Deep Learning” at the recently held #globalAIBootcamp (an amazing event with 68 participating locations worldwide).

This blog post captures some of the key points from my presentation. Feel free to go directly to the slides located here.

Deep Learning with Python book cover

Before I begin, I would like to thank François Chollet for his excellent book “Deep Learning with Python.” In my almost four-year quest to better understand deep learning, I found his book to be one of the best written books on this topic.

(I will totally understand if, at this point, you abandon reading this blog post and start reading François Chollet’s book…in fact, I recommend you do just that 😊)

The Importance of Deep Learning for Developers

Deep learning is such an incredible tool for developers like us. Deep learning has already helped solve complex problems such as near-human level image classification, handwriting recognition, speech recognition and more. Much of this aforementioned functionality is now available to us in the form of an easily callable API (e.g. Cognitive Services). Furthermore, many of these APIs also allow us to train the underlying models using data from our own specific domains (e.g. Custom Speech API). Finally, with the advent of automated ML tools, there is now a WYSIWYG way to get started with deep learning (e.g. Lobe.AI).

If we have all these tools available, why would we, as developers, need to code deep learning models from scratch?

To answer this question, let us review a recent quote from Andrew Ng.

“AI (Artificial Intelligence) technology is now poised to transform every industry, just as electricity did 100 years ago. Between now and 2030, it will create an estimated $13 trillion of GDP growth. While it has already created tremendous value in leading technology companies such as Google, Baidu, Microsoft and Facebook, much of the additional waves of value creation will go beyond the software sector.”

In addition to the astounding number itself, the important part of this quote is its last sentence (emphasis mine). Large software companies have already created tremendous value for themselves through speech recognition, object classification, language translation, etc., and now it is up to us as developers to take this technology to our customers beyond the software sector. Bringing deep learning solutions to these sectors may, in many cases (but not all), require developing our own custom deep learning models. Here are a few examples:

  1. Predicting patient health outcomes based on past health data
  2. Mapping a set of attributes about a product to predict manufacturing defects
  3. Mapping pictures of food items to prices and calorie count for a self-checkout kiosk
  4. Smart bots for a specific industry
  5. Summarization of a vast amount of text into an auto-generated short summary with a timeline

In order to develop these kinds of solutions, I believe that we need to have a working knowledge of deep learning and neural networks (also referred to as Artificial Neural Networks or ANNs).

What is Deep Learning?

But what is a neural network and what do I mean by “deep” in deep learning? A neural network is a layered structure that is “inspired” by the neurons in our brain.

Like the neurons in our brain, which are essentially a network of cells that get activated or inhibited (think 0 or 1) based on an incoming signal, a neural network is comprised of layers of cells that are activated or inhibited by an incoming signal. As we will see later, in the case of a neural network, the cells can store any value (instead of just binary values).

Despite the similarities, it is unclear whether neurons in our brain work in a similar way to the neurons in neural networks. Thus, I emphasized the word “inspired” earlier in this paragraph. Of course, that has not stopped the popular press from painting a magical picture around deep learning (now you know not to buy into this hype).

We still need to answer our question about the meaning of the word “deep” in deep learning. As it turns out, a typical neural network is multiple layers deep – hence the “deep” in deep learning. It’s that simple!

Let us change gears and look briefly into what we mean by “learning.” We will use the “Hello World” equivalent for neural nets as the basis for our explanation. This is a neural network that can identify a handwritten digit (such as the one shown below).

a handwritten 9

It’s All About a Layered Network

Input Layer

Let’s think of a neural network as a giant data transformation engine that takes a handwritten image as input and outputs a number between 0-9. Under the covers, a handwritten image is really a matrix of, say, 28 X 28 cells, with each cell’s value ranging from 0-255. A value of 0 represents white and a value of 255 represents black. So altogether we have 28 X 28 (784) cells in the input layer.
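To make that shape concrete, here is a tiny illustrative NumPy sketch (the image here is randomly generated rather than a real MNIST digit):

import numpy as np

# Illustrative only: a fake 28 x 28 grayscale image with pixel values 0-255
image = np.random.randint(0, 256, size=(28, 28))

# Flatten it into the 784 values that feed the input layer
input_layer = image.reshape(784)
print(input_layer.shape)  # (784,)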

Hidden Layer

Now let us, somewhat arbitrarily, add a layer underneath the input layer (because this layer sits between the input and the output rather than being directly exposed, it is referred to as a “hidden” layer). Furthermore, let us assume that we have 512 cells in the hidden layer (compared to 784 cells in the input layer). Now imagine that each cell in the input layer is connected to each cell in the hidden layer – this is referred to as densely connected layers.

Each connection, in turn, has a multiplier (referred to as a weight) associated with it that drives the value of the cell in the next layer. To help visualize this, refer to the diagram below: the lines in red show all the incoming connections into the topmost cell of the hidden layer, and the numbers decorating those connections are the associated weights that drive the value of cells in the hidden layer.

a diagram of the neural network

For example, the topmost cell in the hidden layer gets a value of .827 – we got this value by multiplying the value of each input cell by the weight on its connection and adding up the results (i.e. 1 * .612 + 1 * .215). Stated differently, we have calculated the value of the cell in the hidden layer as the weighted sum of all the incoming connections.

Weighted Sum, Bias and Activation Function

The weighted sum calculation is rather simplistic. We want this calculation to be a bit more sophisticated so that we can conduct more complex transformations as the data flows from one layer to the next (after all, we are trying to teach our network the non-trivial problem of identifying handwritten digits).

In order to make the weighted sum calculation a bit more sophisticated, we can do a couple of things to help us control how data flows to the following layer:

  1. We can introduce a notion of threshold associated with the weighted sum – if the weighted sum is below a certain threshold, we will consider the next neuron not to be activated. An easy way to incorporate the notion of the threshold in our weighted sum calculation is to add an equivalent constant value to it (referred to as a bias).
  2. We introduce the notion of an activation function. Activation functions are mathematical functions like Sigmoid and ReLU that can transform the output of the neuron. For example, ReLU is simply a max operation that returns a value of 0 if the input value is a negative number; else it simply returns the input value. Here is the C# code snippet to implement ReLU:

public double calcReLU (double x){return Math.Max(0,x);}
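And, to see how the weighted sum, bias, and activation function fit together for a single cell, here is a small illustrative sketch in Python (the input values and weights are made-up numbers, not taken from a real network):

import numpy as np

def relu(x):
    # Return 0 for negative inputs, otherwise pass the value through
    return np.maximum(0, x)

# Made-up values from two cells in the previous layer
inputs = np.array([1.0, 1.0])
# Made-up weights on the two incoming connections
weights = np.array([0.612, 0.215])
bias = -0.5  # acts like a threshold on the weighted sum

# Weighted sum of incoming connections, plus bias, passed through the activation
cell_value = relu(np.dot(inputs, weights) + bias)
print(cell_value)  # approximately 0.327 (the 0.827 weighted sum minus the 0.5 threshold)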

If you are wondering why the weighted sum calculation is simplistic and why we needed to add things like a bias and activation function, it is helpful to note that, ultimately, ML boils down to searching for the right data transformation function (consistent with our earlier statement that at the highest level, ML is really about transforming input data into output data).

For example, in the diagram below, our ML program has “learnt” to predict a value of y, given x, using the training set. The training set is made of a collection of (x,y) coordinates depicted as blue dots. The “learnt” data transformation (depicted as the red line in the diagram) is a simple linear data transformation.

diagram of the model

If the data transformation that the ML routine was supposed to learn was more complex than the example above, we would need to widen our search space (also referred to as a hypothesis space). This is where introducing concepts like bias and activation functions can help. In the diagram below, the ML program needs to learn a more complex data transformation logic.

You are probably wondering how ReLU can help here – after all, the ReLU / max operation seems like another linear transformation. This is not true: ReLU is a non-linear transformation function. In fact, you can chain multiple ReLU operations to mimic a non-linear graph, such as the one shown in the diagram above.
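As a rough illustration of that point (with weights and offsets chosen arbitrarily for the demonstration), summing a few shifted ReLUs produces a curve that bends, something no single linear function can do:

import numpy as np

def relu(x):
    return np.maximum(0, x)

x = np.linspace(-3, 3, 7)

# A single linear function can only produce a straight line...
linear = 0.5 * x

# ...but a weighted sum of shifted ReLUs changes slope at each offset,
# approximating a non-linear curve
curve = relu(x) - 0.7 * relu(x - 1) + 0.4 * relu(x + 2)
print(np.round(curve, 2))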

Recognizing Patterns

Before we leave this section, let us ponder for a moment about the weighted sum calculation and activation function. Based on what we have discussed so far, you understand that the weighted sum and the activation function together help us calculate the value of the cell in the next layer. But how does a calculation such as this help the neural net recognize handwritten digits?

The only way a neural network can learn to recognize digits is by finding patterns in the image; for example, the number “9” is a combination of a circle sitting on top of a vertical line. If we had to write a program to detect a vertical line in the center of an image, what would we do? We have already talked about representing an image as a 28 X 28 matrix, where each cell of the matrix has a value of 0-255. One easy way to determine if the image has a vertical line in its center is to create another 28 X 28 matrix with high values in the cells where we are looking for the vertical line and low values everywhere else.

Now, if we multiply the two matrices element by element and our image did indeed have a vertical line in the center, we will end up with a matrix that has high values in the cells that make up the vertical line. Hopefully, you have begun to develop an intuition for how the neural network will learn to recognize handwritten digits. (Note: we are not handing the network a cheat sheet of patterns for each digit; it discovers these patterns on its own.)
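Here is an illustrative sketch of that idea, shrunk down to a 5 x 5 image instead of 28 x 28, with a hand-made template (the pixel values are made up):

import numpy as np

# A tiny 5 x 5 "image" with a vertical line of dark pixels (255) down the center
image = np.zeros((5, 5))
image[:, 2] = 255

# A template with high values where we expect the vertical line, low everywhere else
template = np.zeros((5, 5))
template[:, 2] = 1.0

# Element-wise multiply and sum: a high score means the pattern is present
score = np.sum(image * template)
print(score)  # 1275.0 for an image with the line, 0.0 without it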

Output Layer

Finally, let’s add an output layer that has only 10 cells. Why just 10 cells? Remember that the purpose of the neural net is to predict the digit that corresponds to a handwritten image, which means we have 10 possibilities (0-9). Each cell of our output layer will contain a probability score of what our neural network thinks the handwritten digit is (with all the probabilities summing to 1). Such an output layer is referred to as a SoftMax layer.
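For the curious, here is a minimal sketch of what a SoftMax calculation looks like (the ten raw scores are made-up numbers):

import numpy as np

def softmax(scores):
    # Exponentiate (after subtracting the max for numerical stability)
    # and normalize so the ten values sum to one
    exps = np.exp(scores - np.max(scores))
    return exps / np.sum(exps)

# Made-up raw scores for digits 0-9 coming out of the last layer
raw_scores = np.array([1.2, 0.3, 0.1, 2.5, 0.0, 0.8, 0.2, 6.1, 0.4, 0.9])
probabilities = softmax(raw_scores)

print(np.round(probabilities, 3))
print(probabilities.argmax())  # 7 -- the network's best guess
print(probabilities.sum())     # ~1.0 -- the probabilities sum to one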

With the necessary layers in place, let us review the layers in our neural network model as shown in the table below. The input layer is a 28 X 28 matrix of pixels. The hidden layer is comprised of 512 cells and the output layer has 10 cells. The third column of the table depicts the total number of weight parameters that can be adjusted for the network to learn.

Table with layers and values

What is “Learning”?

Great! We now have a neural network in place! But why is it even reasonable to think that a layered structure can learn to recognize handwritten digits?

This is where learning comes in!

We will start out by initializing the weights to some random values. Then we will go about calculating the values of cells in the hidden layer, and subsequently, the values in the output layer – this process is referred to as a forward pass. Since we initialized the weights to random values, we cannot expect the output to be a meaningful value.

Of course, our predicted output is going to be wrong. But how wrong? Well… we can calculate the difference between our incorrect output and “correct” output. I suspect that you’ve heard that in order to conduct the learning, we need a set of training examples – in our case, images of handwritten digits and the “correct” answer (referred to as “labels”). Fortunately for us, folks behind the MNIST database have already created a training set that is comprised of images of 60,000 handwritten digits and the corresponding labels. Here is an image depicting sample images from the MNIST database.

a collection of handwritten digits

Note: When working on a real-world problem, you probably won’t find a readily available training set. As you can imagine, collecting the requisite sized training set is one of the key challenges you will face. More training data is almost always better when it comes to neural network learning. (Although, there are techniques like data augmentation that can allow you to get by using a smaller training set.)

Another real-world challenge is the data wrangling required to transform the data into a format that is conducive to feeding into a neural network. The process of using knowledge of your context to transform the data into a format that makes neural network learning work better is known as feature engineering. One of the strengths of neural networks is that they can conduct their own feature discovery (unlike other forms of machine learning). We don’t have to tell the neural network that the digit 9 is made up of a circle and a vertical line.

That said, any data transformation you can undertake in order to ease the learning will lead to a more optimal solution (in terms of resource and training set requirements).

Since we know the correct answer to our “identify the handwritten digit” question, we can determine how wrong we were with the output. But rather than using the simple difference between the two, let us be a bit more mathematically sound and use the square of the difference between the incorrect and correct output values; this way, negative differences don’t cancel out positive ones. This calculation (the squared difference) is referred to as a loss function.

A related concept is cost – it represents the sum of all the loss values over the entire training set, in this case the mean of the squared differences. Using the cost function, we iterate repeatedly, adjusting the weights each time, in an attempt to reach a state with the minimal possible loss. This state of the neural network (collectively representing all the adjusted weights), when the loss has reached its minimum value, is referred to as a trained neural net. Keep in mind that training involves repeating the aforementioned steps over all 60,000 images in our training set, nudging the weights a little on each iteration.
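Here is a small sketch of the loss and cost calculations described above, using made-up predicted values for a single training image:

import numpy as np

# Made-up network outputs (probabilities for digits 0-9) for one training image
predicted = np.array([0.05, 0.05, 0.6, 0.05, 0.05, 0.05, 0.05, 0.05, 0.0, 0.05])
# The "correct" answer, encoded as a one-hot vector: this image is a 2
correct = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0])

# Loss for this single example: mean of the squared differences
loss = np.mean((predicted - correct) ** 2)

# Cost over the training set: average the per-example losses
losses_over_training_set = [loss, 0.02, 0.07]  # the last two values are placeholders
cost = np.mean(losses_over_training_set)
print(loss, cost)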

Now we are ready to “show” our trained network images of handwritten digits that it has not seen before, and hopefully, it should be able to offer us a result with a certain level of accuracy. Typically, we use a test set (completely separate from the training set) to determine the accuracy of our trained network.

Adjusting the Weights

Let us get into the details of how weights get adjusted, starting with something simple. Assume one of the weights (w1) is set to 2.5. Increase w1 (to 3.5), make a forward pass, and calculate the output and the corresponding loss (l1). Now decrease w1 (to 1.5), make a forward pass, and calculate the output and the corresponding loss (l2). If l2 is less than l1, we know that we need to continue to decrease the weight, and vice versa.

Unfortunately, our simple algorithm to change the weights is horribly inefficient – two forward passes just to determine the necessary weight adjustment for a single weight! Fortunately, we can use a concept that we learned in high school – “instantaneous rate of change.” If we can calculate the rate of change in the cost at the instant when weight w1 is 2.5, all we need to do is move weight w1 in the direction that lowers the loss by the amount that is proportional to the rate of change.
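Here is a toy sketch of both approaches, using a stand-in loss function rather than a real network (the numbers are arbitrary):

# A stand-in for "run a forward pass and compute the loss" -- this toy loss
# just happens to be smallest when the weight is 4.0
def loss(w):
    return (w - 4.0) ** 2

w1 = 2.5
step = 1.0

# Naive approach: two extra forward passes just to see which direction lowers the loss
loss_up = loss(w1 + step)    # 0.25
loss_down = loss(w1 - step)  # 6.25
direction = 1 if loss_up < loss_down else -1

# Rate-of-change approach: estimate the slope at w1 and move against it,
# proportionally to how steep it is (a small learning rate scales the step)
learning_rate = 0.1
slope = (loss(w1 + 1e-5) - loss(w1 - 1e-5)) / 2e-5  # roughly -3.0
w1 = w1 - learning_rate * slope                     # moves w1 toward 4.0
print(w1)  # roughly 2.8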

This concept, dear reader, is one of the key concepts in understanding neural networks. I deliberately did not get into the math behind the instantaneous rate of change, but if you are so inclined, please review this link.

Now we know how to change an individual weight. But what about our discussion of densely connected layers earlier? How do interconnected layers of neurons impact how we change individual weights?

We know that the value stored in a given cell depends on the weighted sum of all incoming connections from cells in the previous layer. This means that changing the value of a given cell requires changing the weights on the incoming connections from the previous layer. So we need to walk backward, starting with the output layer, then through the hidden layer, and ending at the input layer, adjusting the weights at each step.

Another consideration when walking backward is to focus on the weights that will give us the most bang for our buck, so to speak. Remember, we know the instantaneous rate of change, which means we know what the impact of a weight change will be on the overall reduction of the cost function, our stated goal. We also know which cells are most consequential in getting the neural network to predict the right answer.

For example, if we are trying to teach our network to learn about the digit ‘2’, we know the cell that contains the probability of the digit being 2 (out of the 10 cells in the output layer) needs to be our focus. Overall, this technique of walking backward and adjusting the weights recursively is known as backpropagation. As before, we won’t get into the math, but rest assured that another important concept from high school calculus (the chain rule) provides the mathematical basis for calculating the overall instantaneous rate of change from all the individual instantaneous rates of change associated with each cell in the network. If you want to read more about the mathematical basis, please review this link.

Later, when we look at a code example, we will come across the use of a component called an optimizer that is responsible for implementing the backpropagation algorithm.

We now have some understanding of how the weights get adjusted. But up until now, we have talked about adjusting a handful of weights, when in fact it is not uncommon to find millions of weights in a modern neural network. It is hard for any of us to visualize the instantaneous rate of change involved with millions of weights. One way to think about this is to imagine a very large matrix that stores the weights and the instantaneous rate of change associated with each of them. Fortunately, with the parallelization power of GPUs and highly efficient matrix multiplication routines, we can process this large number of calculations in parallel, in a reasonable amount of time.
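To make that concrete, here is an illustrative sketch of one layer’s worth of calculations expressed as a single matrix multiplication (the values are random placeholders, matching the 784-cell input and 512-cell hidden layer from earlier):

import numpy as np

# 784 input cells transformed into 512 hidden cells in one matrix multiplication
inputs = np.random.rand(784)        # made-up input layer values
weights = np.random.rand(784, 512)  # made-up weight matrix: 784 x 512 individual weights
biases = np.random.rand(512)

hidden = np.maximum(0, inputs @ weights + biases)  # weighted sums plus bias, then ReLU
print(hidden.shape)  # (512,)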

Review of Deep Learning Concepts

Phew… that was a rather lengthy walkthrough of neural network essentials. Our hard work will pay off as we get into the code next. But before we get into the code, let us review the key terms we have discussed so far in this blog post.

Training Set: Collection of handwritten images and labels that our neural network learns from

Test Set: Collection of handwritten images and labels used for validating how well our neural network learnt

Loss: Difference between the predicted output and actual answer. Our network is trying to minimize this value as part of the learning

Cost: Sum of all the loss values added up over the entire training set – in our example, we used the mean of squared difference as the cost function

Optimizer: Optimizer implements a backpropagation algorithm for adjusting the weights

Densely Connected: A fully connected layer that transforms the input with the weights and outputs the results to the following layer

Activation Function: A function such as ReLU used for non-linear transformation

SoftMax: Calculates the probability for each output type. We used a 10-way SoftMax layer that calculated the probability of digit being 0-9

Forward Propagation: A forward pass that travels through the input layer & hidden layer until it produces an output

Backward Propagation: An algorithm that involves traveling backwards through the neural network to adjust the weights

Code Walkthrough

We will use the Keras library for our code sample. It turns out that the author of the book we referenced earlier (François Chollet) is also the author of this library. I really like this library because it sits on top of deep learning libraries like TensorFlow and CNTK. Not only does it hide the complexity of the underlying framework, but it also provides a first-class notion of a neural network layer as a building block that all of us developers can appreciate. As you will see in the code example below, putting together a neural network becomes a simple matter of stacking these building blocks.

Keras building blocks

Let us review an elided version of a Python program that matches the neural network we discussed above. I don’t expect you to pick up every syntactic detail associated with the entire program. My goal is to map, at a high level, the concepts we talked about, to code. For a complete listing of code see this link. You can easily execute this code by provisioning an Azure Data Science Virtual Machine (DSVM) that comes preconfigured with an AI and data science development environment. You also have the option of provisioning the DSVM with GPU support.

# Import Keras

# Load the MNIST database of training images and labels (fortunately it comes built into Keras)

# Create a neural network model and add two dense layers

# Define the optimizer, loss function and the metric

# It is not a good idea to feed the neural network large values (in our case integers
# ranging from 0-255). Let us rescale them to floating-point numbers between 0-1

# Start the training (i.e. fit our neural network to the training data)

# Now that our network is trained, let us evaluate it using the test data set

# Print the testing accuracy

# 97.86% Not bad for our very basic neural network
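For reference, here is a minimal sketch of what those elided steps might look like in Keras. It follows the 512-cell dense layer and 10-way SoftMax described above; the exact arguments in the full listing linked earlier may differ.

from keras.datasets import mnist
from keras import models, layers
from keras.utils import to_categorical

# Load the MNIST training and test images plus labels (bundled with Keras)
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Flatten the 28 x 28 images and rescale pixel values from 0-255 to 0-1
train_images = train_images.reshape((60000, 28 * 28)).astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28)).astype('float32') / 255

# One-hot encode the labels (the digit 3 becomes [0,0,0,1,0,0,0,0,0,0])
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Stack the layers: a 512-cell densely connected hidden layer and a 10-way SoftMax output
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))

# Optimizer (backpropagation), loss function, and the metric to report
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Train (fit) the network, then evaluate it on the held-out test set
network.fit(train_images, train_labels, epochs=5, batch_size=128)
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)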

Not to denigrate what we have learned so far, but please keep in mind that the neural network architecture we have discussed above is far from the state of the art. In future blog posts, we will get into some of the cutting-edge advancements in neural networks. But don’t be disheartened – the neural net we have discussed so far is quite powerful, with an accuracy of 97.86%.

Learn More About Neural Networks (But Don’t Anthropomorphize)

I don’t know about you, but it is pretty amazing to me that a few lines of code have given us the ability to recognize sloppily handwritten digits that our program has not seen before. Try writing such a program using the traditional programming techniques you know and love. Note, however, that our trained neural network does not have any understanding of what the digits mean (for example, our trained neural network will not be able to draw a digit, despite having been trained on 60,000 handwritten images).

I also don’t want to minimize the challenges involved in effectively using neural networks, starting with putting together a large training set. Additionally, the data wrangling work involved in transforming the input data into a format that is suitable for training the neural network is quite significant. Finally, developing an understanding of the various optimization techniques for improving the accuracy of our training can only come through experience (Fortunately, AutoML techniques are making the optimization much easier).

Hopefully, this blog post motivates you to learn more about neural networks as a programming construct. In future blog posts, we will look at techniques to improve the accuracy of neural networks, as well as review state of the art neural network architectures.

Azure Web Apps Background

I’ve been working with Azure Web Apps for a long time. Before the launch of Azure Web Apps for Containers (or even Azure Web App on Linux), these web apps ran on Windows Virtual Machines managed by Microsoft. This meant that any workload running behind IIS (i.e., ASP.Net) would run without hiccups — but that was not the case with workloads which preferred Linux over Windows (i.e., Drupal).

Furthermore, the Azure Web Apps that ran on Windows were not customizable. This meant that if your website required a custom tool to work properly, chances are it was not going to work on an Azure Web App, and you’d need to deploy a full-blown IaaS Virtual Machine. There was also a strict lockdown regarding tools and language runtime versions that you couldn’t change. So, if you wanted the latest bleeding-edge language runtime, you weren’t gonna get it.

Azure Web Apps for Containers: Drum Roll

Last year, Microsoft released the Azure Web Apps for Containers or Linux App Service plan offering to the public. This meant we could build a custom Docker image containing all the binaries and files, and then deploy it on the PaaS offering. After working with the product for some time, I was like..

The product was excellent, and it was clear that it had potential. Some of the benefits:

  • Ability to use a custom Docker image to run the Web App
  • Zero headaches from managing Docker containers
  • The benefits of Azure Web App on Windows like Backups, Kudu, Deployment Slots, Autoscaling (Scale up & Scale out), etc.

Suddenly, running workloads that preferred Linux or required custom binaries became extremely easy.

The Architecture

Compared to Azure Web App on Windows, the architecture implemented in Azure Web App for Containers is different.

diagram of Azure web apps architecture

Each of the above Web Apps is strictly locked down, with minimal possibility of modification. Furthermore, the backend storage was based on network file shares, which means that even if you don’t need any persistent storage (for example, when your app simply reads data from the database and displays it), the app could still perform slowly.

diagram of Azure web apps architecture

The major difference is that the Kudu/SCM site runs in a separate container from the actual web app. Both containers are connected to each other with a private network. In this case, each App Service Plan is deployed on a separate Virtual Machine and all the plumbing is managed by Microsoft. The benefits of this approach are:

  • Better isolation. If your Kudu is experiencing issues, it reduces the chance of taking down your actual website.
  • Ability to customize the actual web app container running the website.
  • Better resource utilization

Stay tuned for the next part, in which I will discuss the various storage options available in Azure Web App for Containers and their trade-offs.

Happy holidays!