When Microsoft introduced pipelines as part of its Azure DevOps cloud service offering, we received the tools to add continuous integration (CI) and continuous delivery (CD) practices to our development processes. An Azure DevOps pipeline can be created in two ways: 1) the current, generally available “classic” pipeline tooling, and 2) the new multi-stage YAML pipeline feature, which is currently in preview.

Classic Pipelines

Classic pipelines achieve CI through Azure DevOps build pipelines. A build pipeline executes before a developer integrates code changes into a code base. The pipeline does things like execute a build task, run the unit tests and/or run static code analysis. It then either accepts or rejects the new changes based on the outcome of these tasks.

CD is achieved through Azure DevOps release pipelines. After the build pipeline has produced a build artifact, a release pipeline will publish the artifact to various environments for manual functional testing, user experience testing and quality assurance. When testers have thoroughly tested the deployed artifacts, the release pipeline can then push the artifacts to a production environment.

As powerful as these classic CI/CD pipelines are, they do have their drawbacks. Firstly, the tooling for creating build and release pipelines does not provide a unified experience. CI pipelines provide an intuitive GUI to create and visualize the integration steps…

Classic build pipeline editor

… and also allow you to define those very same steps in YAML:

YAML build pipeline editor

Release pipelines also provide a GUI to create and visualize the pipeline steps. The problem is that the interface is different from that of the build pipeline and does not allow YAML definitions:

Classic release pipeline editor

Multi-stage Pipelines

To resolve these discrepancies, Microsoft introduced multi-stage pipelines. Currently in preview, these pipelines allow an engineer to define a build, a release, or a combined build and release pipeline in a single YAML document. Besides the obvious benefits gained through a unified development experience, there are many other good reasons to choose YAML over classic pipelines for both your builds and releases.

Since you commit YAML definitions directly to source control, you get the same benefits source control has been providing developers for decades. Here are the top 10 reasons (in no particular order) you should choose YAML for your next Azure DevOps pipeline:

1. History

Want to see what your pipeline looked like last month before you moved your connection strings to Azure KeyVault? No problem! Source control allows you to see every change ever made to your pipeline since the beginning of time.

2. Diff

Have you ever discovered an issue with your build but not known exactly when it started failing or why? Having the ability to compare the failing definition with the last known working definition can greatly reduce the recovery time.

3. Blame

Similarly, it can be useful to see who committed the bug that caused the failure and who approved the pull request. You can pull these team members into discussions on how best to fix the issue while ensuring that the original objectives are met.

4. Work Items

Having the ability to see what was changed is one thing but seeing why it was changed is another. By attaching a user story or task to each pipeline commit, you don’t need to remember the thought process that went into a particular change.

5. Rollback

If you discover that the pipeline change you committed last night caused a bad QA environment configuration, simply rollback to the last known working version. You’ll have your QA environment back up in minutes.

6. Everything As Code

Having your application, infrastructure and now build and release pipelines as code in the same source control repository gives you a complete snapshot of your system at any point in the past. By getting an older version of your repo, you can easily spin up an identical environment, execute the exact same pipelines and deploy the same code exactly as it was. This is an extremely powerful capability!

7. Reuse and Sharing

Sharing or duplicating a pipeline (or part thereof) is as simple as copy and paste. It’s just text so you can even email it to a colleague if desired.

8. Multiple Engineers

Modern CI/CD pipelines can be large and complex, and more than one engineer might modify the same YAML file, causing a conflict. Source control platforms solved this problem long ago and provide easy to use tools for merging conflicting changes. For better or worse, YAML definitions allow multiple engineers to work on the same file at the same time.

9. Peer Reviews

If application code peer reviews are important, so are pipeline peer reviews. The ability to submit a pull request before bringing in new changes allows team members to weigh in and provides an added level of assurance that the changes will perform as desired.

10. Branching

Have a crazy idea you want to try out? Create a new branch for it and trigger a pipeline execution from that branch. If your idea doesn’t pan out, simply delete the branch. No harm done.

Though still in preview, the introduction of fully text-based pipeline definitions that can be committed to source control provides benefits that cannot be achieved with classic GUI-based definitions, especially for larger organizations. Be sure to consider YAML for your next Azure DevOps pipeline implementation.

ACA Compliance Group needed help streamlining its communications landscape and enabling its fast-growing workforce to collaborate more effectively. AIS recommended starting small with Microsoft Teams adoption and utilizing Microsoft Planner to gain advocates, realize quick wins, and gather insights to guide the larger rollout.

Starting Their Cloud Transformation Journey

The cloud brings many advantages to both companies and their employees, including access from anywhere and seamless collaboration. However, to unleash the full power of cloud-based collaboration, a company must select the right collaboration technology for its business needs and ensure employees adopt the technology and the accompanying changes in practices and processes. This ultimately benefits the business through increased productivity and satisfaction.

In early 2019, an international compliance firm with around 800 employees contacted AIS to help migrate multiple email accounts into a single Office 365 (O365) Exchange account. They invited AIS to continue their cloud journey and help them:

  • Understand their existing business processes and pain points across multiple time zones, countries, departments, and teams.
  • Provide their employees with a secure, reliable, and integrated solution to effective communication and collaboration.
  • Increase employee productivity by improving file and knowledge sharing and problem-solving.
  • Reduce cost from licensing fees for products duplicating features already available through the company’s enterprise O365 license.

Kicking Off a Customer Immersion Experience

First, AIS provided a Microsoft Customer Immersion Experience (CIE) demonstration, which served as the foundational step to introduce all O365 tools. After receiving stakeholder feedback, needs, and concerns, we collaboratively determined the best order for rolling out the O365 applications. The client elected to move forward with Microsoft Teams adoption as the first step to implementing collaboration software in the organization.

Pilots for Microsoft Teams Adoption

Next, we conducted a pilot with two departments to quickly bring benefits to the organization without a large cost investment and to gather insights that would inform the overall Teams adoption plan and strategy for the entire organization. We confirmed with pilot study employees that they saw and welcomed the benefits that Microsoft Teams provides, including:

  • Reduced internal emails.
  • Seamless communication and collaboration among (remote) teams/departments.
  • Increased productivity, efficiency, and transparency.
  • Centralized and accessible location for files, documents, and resources in Teams.

The pilot study also found that adopting Microsoft Teams in the organization would require a paradigm shift. Many employees were used to email communication, including sending attachments back and forth that were hard to track. In addition, while some departments had sophisticated collaboration tools, a common collaboration tool across the company did not exist. For web conferencing, for example, different departments preferred different tools, such as GoToMeeting and WebEx, and most of them incurred subscription fees. Employees had to install multiple tools on their computers to collaborate across departmental boundaries.

Embracing Benefits of Microsoft Teams with Organizational Change Management (OCM)

To help employees understand the benefits of Teams, embrace the new tool, and willingly navigate the associated changes, we formed a project team for the organization-wide deployment and Microsoft Teams adoption with several roles: a Project Manager, a Change Manager, a UX Researcher, a Business Analyst, and a Cloud Engineer. Organizational Change Management (OCM), User Experience (UX), and business analysis were as critical as the technical aspects of the cloud implementation.

Building on each other’s expertise, the project team worked collaboratively and closely with technical and business leaders at the company to:

  • Guide communication efforts to drive awareness of the project and support it.
  • Identify levers that would drive or hinder adoption and plan ways to promote or mitigate.
  • Equip department leaders with champions and facilitate end-user Teams adoption best practices.
  • Guide end users on how to thrive using Teams through best practices and relevant business processes.
  • Provide data analytics and insights to support target adoption rates and customize training.
  • Use an agile approach to resolve both technical issues and people’s pain points, including using Teams for private chats, channel messages, and meetings.
  • Develop a governance plan that addressed technical and business evolution, accounting for the employee experience.

Cutting Costs & Boosting Collaboration

At the end of the 16-week engagement, AIS helped the client achieve its goals of enhanced collaboration, cost savings, and 90% Teams use with positive employee feedback. The company was well-positioned to achieve 100% by the agreed-upon target date.

Our OCM approach, grounded in the Prosci ADKAR® framework, a leading change management framework based on 20 years of research, contributed significantly to the project’s success. As Prosci describes on its website, “ADKAR is an acronym that represents the five tangible and concrete outcomes that people need to achieve for lasting change”:

  • Awareness of the need for change
  • Desire to support the change
  • Knowledge of how to change
  • Ability to demonstrate skills and behaviors
  • Reinforcement to make the change stick

The OCM approach was designed to provide busy executives, leaders, and end users with the key support and actionable insights needed to achieve each outcome necessary for efficient and effective Teams adoption.

If you would like to participate in a CIE demonstration or learn more about adopting cloud-based collaboration tools and practices in your company, we are here to help!

The Internet of Things (IoT), also called the Internet of Everything or the Industrial Internet, is a technology paradigm envisioned as a global network of machines and devices capable of interacting with each other. It is a network of internet-connected devices that communicate embedded sensor data to the cloud for centralized processing. Here we build an end-to-end solution from device to cloud (see the reference architecture diagram below), an end-to-end IoT implementation that covers aspects such as alerting an operator, shutting down a system, and more.

About the Microsoft Professional Program (MPP)

This program will teach you the device programming, data analytics, machine learning, and solution design skills needed for a successful career in IoT. It is a collection of courses that teach skills in various core technology tracks and help you keep pace with the industry’s latest trends. These courses are created and taught by experts and feature hands-on labs and engaging communities.

Azure IoT reference architecture diagram

Benefits

A.T. Kearney: The potential for the Internet of Things (IoT) is enormous. It is projected that the world will reach 26 billion connected devices by 2020, with incremental revenue potential of $300 billion in services.

McKinsey & Co: The Internet of Things (IoT) offers a potential economic impact of $4 trillion to $11 trillion a year by 2025.

For a partner, having certified people means being able to serve customers on IoT engagements, while developers get to work on these projects and explore this new area.

MPP vs Microsoft Certification

The professional program helps you gain job-ready technical skills and real-world experience through online courses, hands-on labs, and expert instruction within a specific time period. It is a good starting point to get your hands dirty with the technologies by learning through practical work, rather than the classic certification style of reading a book. In MPP you are asked questions during the modules, and you must complete all labs to be ready for the module exam, where you will have to set up a solution from scratch; only if your solution is correct will your answers be correct.

This program consists of eight different courses.

  • Getting Started with IoT
    This is a basic, generic IoT course that provides a broad perspective on the IoT ecosystem. It covers the concepts and patterns of an IoT solution, the components of an IoT architecture, and how IoT can support business needs in industries like manufacturing, smart city/building, energy, healthcare, retail, and transportation.
  • Program Embedded Device Hardware
    Here you will learn the basics of programming resource-constrained devices. In addition, you will pick up programming best practices that apply when working with embedded devices, and you will practice developing code that interacts with hardware, SDKs, and devices that connect to various kinds of sensors.
  • Implement IoT Device Communication
    Explains the cloud gateway, Azure IoT Hub, which connects and manages IoT devices and helps configure them for secure cloud communication. You will implement secure two-way communication between devices and the cloud, provision simulated devices using client tools such as the Azure CLI, and perform management tasks while examining aspects of device security, the Device Provisioning Service, and how to provision devices at scale.
  • Analyze and Store IoT Data
    Covers how to analyze and store IoT data, and how to configure the latest tools to implement the data analytics and storage requirements of an IoT solution. Explains cold storage concepts and how to set up Azure Data Lake for cold storage, as well as the analysis and concepts for warm storage, using Azure Cosmos DB as an endpoint to receive data from Azure Stream Analytics jobs. You will also explore the analytic capabilities of the Azure IoT Edge runtime and set up Stream Analytics to run on a simulated edge device, exercising its querying, routing, and analysis capabilities.
  • Visualize and Interpret IoT Data
    In this course you explore Time Series Insights, real-time streaming, predictive models, and data visualization tools: how to build visualizations with Time Series Insights and how to create secure Power BI service dashboards for the business. It covers the characteristics of time series data, how it can be used for analysis and prediction, how IoT telemetry is typically generated as time series data, and techniques for managing and analyzing it with Azure Time Series Insights so you can store, analyze, and instantly query massive amounts of time series data. It closes with a general introduction to Power BI, with specific emphasis on how Power BI can load, transform, and visualize IoT data sets.
  • Implement Predictive Analytics using IoT Data
    Covers predictive analytics for IoT solutions through a series of machine learning implementations that are common in IoT scenarios, such as predictive maintenance.
    You will learn to describe machine learning scenarios and algorithms commonly pertinent to IoT, use the IoT solution accelerator for predictive maintenance, prepare data for machine learning operations and analysis, apply feature engineering within the analysis process, choose the appropriate machine learning algorithm for a given business scenario, identify target variables based on the type of machine learning algorithm, and evaluate the effectiveness of regression models.
  • Evaluate and Design an IoT Solution
    Learn to develop business planning documents and the solution architecture for your IoT implementations. To build massively scalable Internet of Things solutions in an enterprise environment, it is essential to have tools and services that can securely manage thousands to millions of devices while at the same time providing the back-end resources required to produce useful data insights and support for the business. Azure IoT services provide the scalability, reliability, and security, as well as a host of functions, to support IoT solutions in any of the vertical marketplaces (industrial/manufacturing, smart city/building, energy, agriculture, retail, etc.). In the IoT architecture design and business planning material, you will be presented with instruction documenting approaches to design, propose, deploy, and operate an IoT architecture.
  • Final Project
    The final project evaluates the knowledge and skills you acquired by completing the other IoT courses; by the end, you should feel confident in your ability to design and implement a full IoT solution and to start an IoT architect career. Instead of learning new skills, you are assessed on what you already know, with the emphasis placed on hands-on activities. The project leverages a real-world scenario to verify that you have the skills required to implement an Azure IoT solution. The challenge activities section includes a series of tasks in which you must design and implement an Azure IoT solution that encompasses many of the technologies covered; you will need to apply many different aspects of the training within your solution in order to be successful.

Real-World Scenario

Consider simulating a weather station located at a remote location. The station sends telemetry data to the cloud, where the data is stored for long-term analysis and also monitored in real time to ensure the wind speed does not exceed safe limits. Should unsafe wind speeds be detected, the solution initiates an action that, in the real world, would send alert notifications and ensure the wind farm turbines apply rotor brakes so that the turbines do not over-rev.
The Proof of Value must satisfy the following functional requirements and constraints:

  • Every turbine in the simulated farm will leverage several sensors to provide the telemetry information relating to turbine performance and will connect directly (and securely) to a network that provides access to the internet.
  • Demonstrate the use of Time Series Insights to view the wind turbine telemetry data.
  • Route telemetry to storage appropriate for high-speed access for Business Intelligence.
  • Create a dashboard in Power BI desktop that displays telemetry as lines charts and gauges.

Time Series Insights graph demos

Azure Arc is one of the significant announcements coming out of #msignite this week. As depicted in the picture below, Azure Arc is a single control plane across multiple clouds, on-premises environments, and the edge.

Azure Arc

Source: https://azure.microsoft.com/en-us/services/azure-arc/

But we’ve seen single control planes before, no?

That is correct. The following snapshot (from 2013) shows App Controller securely connected to both on-premises and Microsoft Azure resources.

Azure App Controller in 2013

Source: https://blogs.technet.microsoft.com/yungchou/2013/02/18/system-center-2012-r2-explained-app-controller-as-a-single-pane-of-glass-for-cloud-management-a-primer/

So, what is different with Azure Arc?

Azure Arc is not just a “single-pane” of control for cloud and on-premises. Azure Arc takes Azure’s all-important control plane – namely, the Azure Resource Manager (ARM) – and extends it *outside* of Azure. In order to understand the implication of the last statement, it will help to go over a few ARM terms.

Let us start with the diagram below. ARM (shown in green) is the service used to provision resources in Azure (via the portal, Azure CLI, Terraform, etc.). A resource can be anything you provision inside an Azure subscription. For example, SQL Database, Web App, Storage Account, Redis Cache, and Virtual Machine. Resources always belong to a Resource Group. Each type of resource (VM, Web App) is provisioned and managed by a Resource Provider (RP). There are close to two hundred RPs within the Azure platform today (and growing with the release of each new service).

ARM

Source: http://rickrainey.com/2016/01/19/an-introduction-to-the-azure-resource-manager-arm/

Now that we understand the key terms associated with ARM, let us return to Azure Arc. Azure Arc takes the notion of the RP and extends it to resources *outside* of Azure. Azure Arc introduces a new RP called “Hybrid Compute”. See the details for the HybridCompute RP in the screenshot below. As you can imagine, the HybridCompute RP is responsible for managing the resources *outside* of Azure. The HybridCompute RP manages these external resources by connecting to the Azure Arc agent deployed to the external VM. The current preview is limited to Windows and Linux VMs. In the future, the Azure Arc team plans to support containers as well.

RP Hybrid Compute Screenshot

Note: You will first need to register the resource provider using the command az provider register --namespace Microsoft.HybridCompute

Once we deploy the Azure Arc agent [1] to a VM running in Google Cloud, it shows up inside the Azure Portal within the resource group “az_arc_rg” (see screenshot below). The Azure Arc agent requires connectivity to the Azure Arc service endpoints for this setup to work. All connections are outbound from the agent to Azure and are secured with SSL. All traffic can be routed via an HTTPS proxy.

deploy the Azure Arc agent [1] to a VM running in Google cloud

Since the Google Cloud hosted VM (gcp-vm-001) is now an ARM resource, it is also an object inside Azure AD. Furthermore, a managed identity can be associated with the Google VM.
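
To make this concrete, below is a minimal Python sketch of reading the Arc-connected machine through ARM just like any other Azure resource. Treat it as an illustration only: the subscription ID is a placeholder, the api-version is an assumed value you would replace with a current Microsoft.HybridCompute version, and it presumes the azure-identity and requests packages are installed.

# A minimal sketch of reading an Azure Arc connected machine as an ordinary ARM resource.
# The subscription id and api-version below are illustrative placeholders.
import requests
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"
resource_group = "az_arc_rg"
machine_name = "gcp-vm-001"
api_version = "2019-12-12"  # assumed value; substitute a current Microsoft.HybridCompute api-version

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}"
    f"/providers/Microsoft.HybridCompute/machines/{machine_name}"
)

response = requests.get(url, params={"api-version": api_version},
                        headers={"Authorization": f"Bearer {token}"})
response.raise_for_status()
machine = response.json()

# The external VM looks like any other ARM resource: it has an id, a location, and tags.
print(machine["id"], machine.get("location"), machine.get("tags"))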

Benefits of Extending ARM to Resources Outside Azure:

  • Ability to manage external VMs as ARM resources using the Azure Portal / CLI, as well as the ability to add tags, as shown below.
  • Ability to centrally manage access and security policies for external resources with Role-Based Access Control.
    Microsoft Hybrid Compute Permissions
  • Ability to enforce compliance and simplify audit reporting.

[1] Azure Arc Agent is installed by running the following script on the remote VM. This script is generated from the Azure portal:

# Download the package:
Invoke-WebRequest -Uri https://aka.ms/AzureConnectedMachineAgent -OutFile AzureConnectedMachineAgent.msi

# Install the package:
msiexec /i AzureConnectedMachineAgent.msi /l*v installationlog.txt /qn | Out-String

# Run the connect command (tenant and subscription IDs are left blank here, as generated by the portal):
& "$env:ProgramFiles\AzureConnectedMachineAgent\azcmagent.exe" connect --resource-group "az_arc_rg" --tenant-id "" --location "westus2" --subscription-id ""

This blog is a follow-up on Azure Cognitive Services, Microsoft’s offering for enabling artificial intelligence (AI) applications in daily life. The offering is a collection of AI services with capabilities around speech, vision, search, language, and decision.

In Azure Cognitive Services Personalizer: Part One, we discussed the core concepts and architecture of Azure Personalizer Service, Feature Engineering, its relevance, and its importance.

In this blog, Part Two, we will go over a couple of use cases in which Azure Personalizer Service is implemented. We will look at the features used, the reward calculation, and their test run results. Stay tuned for Part Three, where we will list out recommendations and capacities for implementing solutions using Azure Personalizer Service.

Use Cases and Results

The two use cases implemented using Personalizer involve the ranking of content for each user of a business application.

Use Case 1: Dropdown Options

Different users of an application with manager privileges would see a list of reports that they can run. Before Personalizer was implemented, the list of dozens of reports was displayed in alphabetical order, requiring most of the managers to scroll through the lengthy list to find the report they needed. This created a poor user experience for daily users of the reporting system, making for a good use case for Personalizer. The tooling learned from user behavior and began to rank frequently run reports at the top of the dropdown list. Frequently run reports would be different for different users and would change over time for each manager as they are assigned to different projects. This is exactly the situation where Personalizer’s reward score-based learning models come into play.

Context Features

In our dropdown options use case, the context features JSON (with sample data) is as follows:

{
    "contextFeatures": [
        { 
            "user": {
                "id":"user-2"
            }
        },
        {
            "scenario": {
                "type": "Report",
                "name": "SummaryReport",
                "day": "weekend",
                "timezone": "est"
            }
        },
        {
            "device": {
                "mobile":false,
                "Windows":true,
                "screensize": [1680,1050]
            }
        }
    ]
}

Actions (Items) Features

For this use case, actions were defined as the following JSON object (with sample data):

{
    "actions": [
    {
        "id": "Project-1",
        "features": [
          {
              "clientName": "Client-1",
              "projectManagerName": "Manager-2"
          },
          {

                "userLastLoggedDaysAgo": 5
          },
          {
              "billable": true,
              "common": false
          }
        ]
    },
    {
         "id": "Project-2",
         "features": [
          {
              "clientName": "Client-2",
              "projectManagerName": "Manager-1"
          },
          {

              "userLastLoggedDaysAgo": 3
           },
           {
              "billable": true,
              "common": true
           }
        ]
    }
  ]
}

Reward Score Calculation

The reward score was calculated based on which report the user actually selected from the ranked list displayed in the dropdown, using the following mapping (a short code sketch follows the list):

  • If the user selected the 1st report from the ranked list, then reward score of 1
  • If the user selected the 2nd report from the ranked list, then reward score of 0.5
  • If the user selected the 3rd report from the ranked list, then reward score of 0
  • If the user selected the 4th report from the ranked list, then reward score of -0.5
  • If the user selected the 5th report or above from the ranked list, then reward score of -1
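
Below is a minimal Python sketch of this logic, assuming the Personalizer v1.0 REST Reward endpoint; the endpoint URL, subscription key, and event ID are hypothetical placeholders, and the position-to-score mapping simply mirrors the list above.

# Sketch: map the position of the report the user actually selected to a reward score
# and send it to Personalizer. Endpoint, key, and event id are placeholders.
import requests

PERSONALIZER_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
PERSONALIZER_KEY = "<subscription-key>"  # placeholder

def dropdown_reward(selected_position: int) -> float:
    """Map the 1-based position of the report the user selected to a reward score."""
    scores = {1: 1.0, 2: 0.5, 3: 0.0, 4: -0.5}
    return scores.get(selected_position, -1.0)  # 5th or lower scores -1

def send_reward(event_id: str, score: float) -> None:
    """Post the reward for a given Rank event (assumes the v1.0 REST Reward endpoint)."""
    url = f"{PERSONALIZER_ENDPOINT}/personalizer/v1.0/events/{event_id}/reward"
    response = requests.post(url,
                             headers={"Ocp-Apim-Subscription-Key": PERSONALIZER_KEY},
                             json={"value": score})
    response.raise_for_status()

# Example: the user picked the 2nd report in the ranked dropdown.
send_reward("<event-id-from-rank-call>", dropdown_reward(2))  # sends 0.5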

Results

View of the alphabetically ordered report names in the dropdown before personalization:

alphabetically ordered report names in the dropdown before personalization

View of the Personalizer ranked report names in the dropdown for the given user:

Azure Personalizer ranked report names based on frequency

Use Case 2: Projects in Timesheet

Every employee in the company logs a daily timesheet listing all of the projects he or she is assigned to, along with other projects such as overhead. Depending on the employee’s project allocations, the timesheet table could list anywhere from a few to a couple of dozen active projects. Even when an employee is assigned to several projects, particularly at the lead and manager levels, he or she typically logs time against only 2 to 3 projects over a stretch of weeks or months.

Before personalization, the projects in the timesheet table were listed in alphabetical order, again resulting in a poor user experience. Even more troublesome, frequent user errors caused the accidental logging of time in the incorrect row. Personalizer was a good fit for this use case as well, allowing the system to rank projects in the timesheet table based on time logging patterns for each user.

Context Features

For the timesheet use case, the context features JSON object is defined as below (with sample data):

{
    "contextFeatures": [
        { 
            "user": {
                "loginid":"user-1",
                "managerid":"manager-1"
		  
            }
        },
        {
            "scenario": {
                "type": "Timesheet",
                "day": "weekday",
                "timezone": "ist"
            }
        },
        {
            "device": {
                "mobile":true,
                "Windows":true,
                "screensize": [1680,1050]
            }
        }
     ]
}

Actions (Items) Features

For the timesheet use case, the actions JSON object structure (with sample data) is as follows:

{
    "actions": [
    {
        "id": "Project-1",
        "features": [
          {
              "clientName": "Client-1",
              "userAssignedForWeeks": "4-8"
          },
          {

              "TimeLoggedOnProjectDaysAgo": 3
          },
          {
              "billable": true,
              "common": false
          }
        ]
    },
    {
         "id": "Project-2",
         "features": [
          {
              "clientName": "Client-2",
              "userAssignedForWeeks": "8-16"
          },
          {

              " TimeLoggedOnProjectDaysAgo": 2
           },
           {
              "billable": true,
              "common": true
           }
        ]
    }
  ]
}

Reward Score Calculation

The reward score for this use case was calculated based on the proximity between the ranking of projects in the timesheet returned by Personalizer and the rows where the user actually logged time, as follows:

  • Time logged in the 1st row of the ranked timesheet table, then reward score of 1
  • Time logged in the 2nd row of the ranked timesheet table, then reward score of 0.6
  • Time logged in the 3rd row of the ranked timesheet table, then reward score of 0.4
  • Time logged in the 4th row of the ranked timesheet table, then reward score of 0.2
  • Time logged in the 5th row of the ranked timesheet table, then reward score of 0
  • Time logged in the 6th row of the ranked timesheet table, then reward score of -0.5
  • Time logged in the 7th row or above of the ranked timesheet table, then reward score of -1

The above approach to reward score calculation assumes that most of the time users will not need to fill out their timesheet for more than 5 projects at a given time. Hence, when a user logs time against multiple projects, the scores can be added up and then capped to the range of -1 to 1 before calling the Personalizer Reward API, as sketched below.
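
Here is a minimal Python sketch of that summing-and-capping logic; the per-row scores mirror the list above, and the send_reward helper referenced in the final comment is the hypothetical one sketched in the earlier use case.

# Sketch: combine per-row scores when a user logs time against several projects,
# then cap the total to the [-1, 1] range Personalizer expects.
ROW_SCORES = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.2, 5: 0.0, 6: -0.5}

def timesheet_row_reward(row_position: int) -> float:
    """Map a 1-based row position in the ranked timesheet table to a score; 7th or lower scores -1."""
    return ROW_SCORES.get(row_position, -1.0)

def combined_reward(rows_logged: list[int]) -> float:
    """Sum the per-row scores and cap the total between -1 and 1."""
    total = sum(timesheet_row_reward(row) for row in rows_logged)
    return max(-1.0, min(1.0, total))

# Example: time logged against the 1st, 2nd, and 6th ranked rows.
print(combined_reward([1, 2, 6]))  # 1.0 + 0.6 - 0.5 = 1.1, capped to 1.0
# The capped value would then be sent for the event, e.g. send_reward(event_id, score).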

Results

View of the timesheet table having project names alphabetically ordered before personalization:

project names alphabetically ordered before Azure personalization

View of the timesheet table where project names are ordered based on ranking returned by Personalization Service:

timesheet table ordered by Azure Personalization Service

Testing

Unit tests proved effective for verifying the results of implementing Personalizer in our selected use cases. This method was helpful in two important respects:

  1. Injecting a large number of user interactions (learning loops)
  2. Simulating user behavior that follows a specific pattern

This provided an easy way to verify how Personalizer reflects current and changing trends in user behavior, injected via unit tests, by using reward scores and the exploration capability. It also enabled us to test the different configuration settings provided by the Personalizer service. A minimal sketch of such a simulation loop is shown below.
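
The sketch below illustrates the idea, assuming the Personalizer v1.0 Rank and Reward REST endpoints and hypothetical endpoint, key, action, and context values: it repeatedly asks Personalizer to rank two projects, behaves like a simulated user who always prefers one of them, and submits the corresponding reward.

# Sketch: simulate learning loops in which the user always prefers one project.
# Endpoint and key are placeholders; the actions and context loosely mirror the JSON shown earlier.
import uuid
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
HEADERS = {"Ocp-Apim-Subscription-Key": "<subscription-key>"}  # placeholder

def rank(actions, context_features):
    """Ask Personalizer to rank the actions for a context (assumes the v1.0 REST Rank endpoint)."""
    event_id = str(uuid.uuid4())
    body = {"eventId": event_id, "actions": actions, "contextFeatures": context_features}
    r = requests.post(f"{ENDPOINT}/personalizer/v1.0/rank", headers=HEADERS, json=body)
    r.raise_for_status()
    return event_id, r.json()["rewardActionId"]

def reward(event_id, value):
    """Report how good the chosen action was for this event."""
    r = requests.post(f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
                      headers=HEADERS, json={"value": value})
    r.raise_for_status()

actions = [{"id": "Project-A", "features": [{"common": True}]},
           {"id": "Project-B", "features": [{"common": False}]}]
context = [{"user": {"id": "user-1"}}, {"scenario": {"type": "Timesheet"}}]

# Simulated user behavior: the user always prefers Project-A.
for _ in range(200):  # number of learning loops to inject
    event_id, top_action = rank(actions, context)
    reward(event_id, 1.0 if top_action == "Project-A" else 0.0)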

Test Run 1

This first test run simulated different user choices with different exploration settings. The test results show how many learning loops it took for the rankings to reflect the user preference, first intermittently and then consistently.

Unit test scenarios, learning loops, results, and exploration settings:

  • User selection of Project-A: Personalizer Service started ranking Project-A at the top intermittently after 10 – 20 learning loops and ranked it consistently at the top after 100 learning loops, with exploration set to 0%.
  • User selection of Project-B: Personalizer Service started reflecting the change in user preference (from Project-A to Project-B) by ranking Project-B at the top intermittently after 100 learning loops and ranked it consistently at the top after 1200 learning loops, with exploration set to 0%.
  • User selection of Project-C: Personalizer Service started reflecting the change in user preference (from Project-B to Project-C) by ranking Project-C at the top intermittently after 10 – 20 learning loops and ranked it almost consistently at the top after 150 learning loops, with exploration set to 50%. Personalizer adjusted to the new user preference more quickly when exploration was utilized.
  • User selection of Project-D: Personalizer Service started reflecting the change in user preference (from Project-C to Project-D) by ranking Project-D at the top intermittently after 10 – 20 learning loops and ranked it almost consistently at the top after 120 learning loops, with exploration set to 50%.

Test Run 2

In this second test run, we observed the impact of including and removing sparse features (features with little effect).

Unit test scenarios, learning loops, results, and exploration settings:

  • User selection of Project-E: Personalizer Service started reflecting the change in user preference (from Project-D to Project-E) by ranking Project-E at the top intermittently after 10 – 20 learning loops and ranked it almost consistently at the top after 150 learning loops, with exploration set to 20%.
  • User selection of Project-F: Personalizer Service started reflecting the change in user preference (from Project-E to Project-F) by ranking Project-F at the top intermittently after 10 – 20 learning loops and ranked it almost consistently at the top after 250 learning loops, with exploration set to 20%.
  • User selection of Project-G: Two less effective (sparse) features of type datetime were removed. Personalizer Service started reflecting the change in user preference (from Project-F to Project-G) by ranking Project-G at the top intermittently after 5 – 10 learning loops and ranked it almost consistently at the top after only 20 learning loops, with exploration set to 20%.
  • User selection of Project-H: The two datetime sparse features were added back. Personalizer Service started reflecting the change in user preference (from Project-G to Project-H) by ranking Project-H at the top intermittently after 10 – 20 learning loops and ranked it almost consistently at the top after 500 learning loops, with exploration set to 20%.

Thanks for reading! In the next part of this blog post, we will look at the best practices and recommendations for implementing Personalizer solutions. We will also touch upon the capacities and limits of the Personalizer service at present.

Late last Friday, the news of the Joint Enterprise Defense Infrastructure (JEDI) contract award to Microsoft Azure sent seismic waves through the software industry, government, and commercial IT circles alike.

Even as the dust settles on this contract award, including the inevitable requests for reconsideration and protest, DoD’s objectives from the solicitation are apparent.

DOD’s JEDI Objectives

Public Cloud is the Future DoD IT Backbone

A quick look at the JEDI statement of objectives illustrates the government’s comprehensive enterprise expectations with this procurement:

  • Fix fragmented, largely on-premises computing and storage solutions – This fragmentation is making it impossible to make data-driven decisions at “mission-speed”, negatively impacting outcomes. Not to mention that the rise in the level of cyber-attacks requires a comprehensive, repeatable, verifiable, and measurable security posture.
  • Commercial parity with cloud offerings for all classification levels – A cordoned off dedicated government cloud that lags in features is no longer acceptable. Furthermore, it is acceptable for the unclassified data center locations to not be dedicated to a cloud exclusive to the government.
  • Globally accessible and highly available, resilient infrastructure – The need for infrastructure that is reliable, durable, and can continue to operate despite catastrophic failure of pieces of infrastructure is crucial. The infrastructure must be capable of supporting geographically dispersed users at all classification levels, including in closed-loop networks.
  • Centralized management and distributed control – Apply security policies; monitor security compliance and service usage across the network; and accredit standardized service configurations.
  • Fortified Security that enables enhanced cyber defenses from the root level – These cyber defenses are enabled through the application layer and down to the data layer with improved capabilities including continuous monitoring, auditing, and automated threat identification.
  • Edge computing and storage capabilities – These capabilities must be able to function totally disconnected, including provisioning IaaS and PaaS services and running containerized applications, data analytics, and processing data locally. These capabilities must also provide for automated bidirectional synchronization of data storage with the cloud environment when a connection is re-established.
  • Advanced data analytics – An environment that securely enables timely, data-driven decision making and supports advanced data analytics capabilities such as machine learning and artificial intelligence.

Key Considerations: Agility and Faster Time to Market

From its inception, with the September 2017 memo announcing the formation of the Cloud Executive Steering Group, through the release of the RFP in July 2018, DoD has been clear – they wanted a single cloud contract. They deemed a multi-cloud approach to be too slow and costly. The Pentagon’s Chief Management Officer defended a single cloud approach by suggesting that a multi-cloud contract “could prevent DoD from rapidly delivering new capabilities and improved effectiveness to the warfighter that enterprise-level cloud computing can enable”, resulting in “additional costs and technical complexity on the Department in adopting enterprise-scale cloud technologies under a multiple-award contract. Requiring multiple vendors to provide cloud capabilities to the global tactical edge would require investment from each vendor to scale up their capabilities, adding expense without commensurate increase in capabilities.”

A Single, Unified Cloud Platform Was Required

The JEDI solicitation expected a unified cloud platform that supports a broad set of workloads, with detailed requirements for scale and long-term price projections.

  1. Unclassified webserver with a peak load of 400,000 requests per minute
  2. High volume ERP system – ~30,000 active users
  3. IoT + Tactical Edge – A set of sensors that captures 12 GB of High Definition Audio and Video data per hour
  4. Large data set analysis – 200 GB of storage per day, 4.5 TB of online result data, 4.5 TB of nearline result data, and 72 TB of offline result data
  5. Small form-factor data center – 100 PB of storage with 2000 cores that is deliverable within 30 days of request and be able to fit inside a U.S. military cargo aircraft

Massive Validation for the Azure Platform

The fact that the Azure platform is the “last cloud standing” at the end of the long and arduous selection process is massive validation from our perspective.

As other bidders have discovered, much to their chagrin, the capabilities described above are not developed overnight. It’s a testament to Microsoft’s sustained commitment to meeting the wide-ranging requirements of the JEDI solicitation.

Lately, almost every major cloud provider has invested in bringing the latest innovations in compute (GPUs, FPGAs, ASICs), storage (very high IOPS, HPC), and network (VMs with 25 Gbps bandwidth) to their respective platforms. In the end, what I believe differentiates Azure is a long-standing focus on understanding and investing in enterprise IT needs. Here are a few examples:

  • Investments in Azure Stack started in 2010 with the announcement of the Azure Appliance. It took over seven years of learning to finally run Azure completely in an isolated mode. Since then, the investments in Data Box Edge, Azure Sphere, and a commitment to hybrid solutions have been a key differentiator for Azure.
  • With 54 Azure regions worldwide (available in 140 countries), including dedicated Azure Government regions – US DoD Central, US DoD East, US Gov Arizona, US Gov Iowa, US Gov Texas, US Gov Virginia, US Sec East, US Sec West – the Azure team has placed the highest priority on establishing a global footprint. Additionally, having a common team that builds, manages, and secures Azure’s cloud infrastructure has meant that even the public Azure services carry DoD CC SRG IL 2 and FedRAMP Moderate and High designations.
  • Whether it is embracing Linux or Docker, providing the highest number of contributions to GitHub projects, or open-sourcing the majority of Azure SDKs and services, Microsoft has demonstrated a leading commitment to open source solutions.
  • Decades of investment in Microsoft Research, including the core Microsoft Research Labs and Microsoft Research AI, has meant that they have the most well-rounded story for advanced data analytics and AI.
  • Documentation and ease of use have been accorded the highest engineering priorities. Case in point, rebuilding Azure docs entirely on Github. This has allowed an open feedback mechanism powered by Github issues.
Microsoft HQ in Redmond

After much anticipation, the US Department of Defense (DoD) has awarded the $10 billion Joint Enterprise Defense Infrastructure (JEDI) contract for cloud computing services to Microsoft over Amazon. This effort is crucial to the Pentagon’s efforts to modernize core technology and improve networking capabilities, and the decision on which cloud provider was the best fit was not something taken lightly.

Current military operations run on software systems and hardware from the 80s and 90s, and the DoD has been dedicated to moving forward with connecting systems, streamlining operations, and enabling emerging technologies through cloud adoption.

Microsoft has always invested heavily back into its products, which is the leading reason we went all-in on our partnership, strengthening our capabilities and participating in numerous Microsoft programs since the inception of the partner program in 1994.

In our experience, one of the many differentiators for Microsoft Azure is its global networking capabilities. Azure’s global footprint is so vast that it includes 100K+ miles of fiber and subsea cables, and 130 edge locations connecting over 50 regions worldwide. That’s more regions across the world than AWS and Google combined. As networking is a vital capability for the DoD, they’re investing heavily in connecting their bases and improving networking speeds, data sharing, and operational efficiencies, all without compromising security and compliance.

Pioneering Cloud Adoption in the DoD

We are fortunate enough to have been on the front lines of Azure from the very beginning. AIS has been working with Azure since it was started in pre-release under the code name Red Dog in 2008. We have been a leading partner in helping organizations adopt Azure since it officially came to market in 2010, with the privilege of experience in numerous large, complex projects across highly-regulated commercial and federal enterprises ever since.

When Azure Government came along for pre-release in the summer of 2014, AIS was among the few partners invited to participate and led all partners with the most client consumption. As the first partner to successfully support Azure Gov IL5 DISA Cloud Access Point (CAP) Connectivity and ATO for the DoD, we’ve taken our experience and developed a reliable framework to help federal clients connect to the DISA CAP and expedite the Authority to Operate (ATO) process.

We have led important early adoption projects to show the path forward with Azure Government in the DoD, including the US Air Force, US Army EITaaS, Army Futures Command, and the Office of the Under Secretary of Defense for Policy. Our experiences have allowed us to show proven success moving DoD customers’ Impact Level 2, 4, 5, and (soon) 6 workloads to the cloud quickly and thoroughly with AIS’ DoD Cloud Adoption Framework.

To enable faster cloud adoption and native cloud development, AIS pioneered DevSecOps and built Azure Blueprints to help automate achieving federal regulation compliance and ATO. We were also the first to achieve the Trusted Internet Connections (TIC) and DoD Cyber Security Service Provider (CSSP), among others.

AIS continues to spearhead the development of processes, best practices, and standards across cloud adoption, modernization, and data & AI. It’s an exceptionally exciting time to be a Microsoft partner, and we are fortunate enough to be at the tip of the spear alongside the Microsoft product teams and enterprises leading the charge in cloud transformation.

Join Our Growing Team

We will continue to train, mentor, and support passionate cloud-hungry developers and engineers to help us face this massive opportunity and further the mission of the DoD.

I recently had the privilege and opportunity to attend this year’s DEF CON conference, one of the world’s largest and most notable hacker conventions, held annually in Las Vegas. Deciding what talks and sessions to attend can be a logistics nightmare for a conference that has anywhere between 20,000 – 30,000 people in attendance, but I pinpointed the ones that I felt would be beneficial for myself and AIS.

During the conference, Tanya Janca, a cloud advocate for Microsoft, and Teri Radichel from 2nd Sight Lab gave a presentation on “DIY Azure Security Assessment” that dove into how to verify the security of your Azure environments. More specifically, they went into detail on using Azure Security Center and on setting scope, policies, and threat protection. With this post, I want to share what I took away from the talk that I found most helpful.

Security in Azure

Security is a huge part of deploying any implementation in Azure and of ensuring fail-safes are in place to stop attacks before they occur. Below, I break down the topics I took away that can help you better understand and perform your own security assessment in Azure, along with how to look for vulnerabilities and gaps.

The first step in securing your Azure environment is to find the scope at which you are trying to assess and protect. This could also include things external to Azure, such as hybrid solutions with on-premises. These items include the following:

  • Data Protection
  • Application Security
  • Network Security
  • Access Controls
  • Cloud Security Controls
  • Cloud Provider Security
  • Governance
  • Architecture

Second is using the tools and features within Azure to accomplish this objective. Tanya and Teri started out by listing a few key features that every Azure implementation should use. These include:

  • Turning on Multi-Factor Authentication (MFA)
  • Identity and Access Management (IAM)
    • Roles in Azure AD
    • Policies for access
    • Service accounts
      • Least privilege
    • Account Structure and Governance
      • Management Groups
      • Subscriptions
      • Resource Groups

A key item I took away from this section was allowing access at the “least privileged” level using service accounts, meaning only the required permissions should be granted when needed using accounts that are not for administrative use. Along with tightening access, it’s also important to understand at what level to manage this governance. Granting access at a management group level will cast a wider and more manageable net. A more defined level, such as a subscription level, could help with segregation of duties but this is heavily based on the current landscape of your groups and subscription model.

The Center for Internet Security (CIS)

So maybe now you understand the scope at which you want to assess the security of your Azure environment, but you do not know where to start. This is where the Center for Internet Security (CIS) can come into play. CIS is a crowd-sourced body of security best practices and threat prevention guidance whose members include corporations, governments, and academic institutions. Its guidance was initially intended for on-premises use; however, as the cloud has grown, so has the need for increased security. CIS can help you decide which best practices to follow based on known threat vectors; these include 20 critical controls broken down into the following 3 sections:

Basic Center for Internet Security Controls

Examples of these CIS control practices could be:

  • Inventory and Control of Hardware Assets by utilizing a software inventory tool
  • Controlled Use of Administrative Privileges by setting up alerts and logs

An additional feature is the CIS Benchmarks, which provide recommendations for best practices on various platforms and services, such as Microsoft SQL Server or IIS. Plus, they’re free! Another cool feature that CIS offers is within the Azure Marketplace: pre-defined system images that are already hardened according to these best practices.

CIS Offers in Azure Marketplace

The figure below shows an example benchmark control practice that gives the recommendation to “Restrict access to Azure AD administration portal.” It then outputs audit steps showing what needs to be done to come within the scope of that best practice.

Control Practice to Restrict access to Azure AD administration portal

Azure Security Center (ASC)

In this next section, I detail the features of Azure Security Center (ASC) that I took away from this presentation and how to get started using ASC. The figure below is of the dashboard. As you can see, there are a lot of options inside the ASC dashboard, including sections such as Policy & Compliance and Resource Security Hygiene. The settings inside of those can dive deeper into resources all the way down to the VM or application level.

Azure Security Center Dashboard

Making sure you have ASC turned on should be your first step when implementing the features within it. The visuals you get in ASC are very helpful, including things like subscription coverage and your security score. ASC also provides policy management, letting you use pre-defined and custom rules to keep your environment within the desired compliance levels.

Cloud Networking

Your network design in Azure plays a crucial role in securing against incoming attacks, and it involves more than just closing ports. When you build a network with security in mind, you not only limit your attack surface but also make spotting vulnerabilities easier, all while making it harder for attackers to infiltrate your systems. Using Network Security Groups (NSGs) and routes can also help by allowing only the required ports, and you can utilize Network Watcher to test these effective security rules. Other best practices include not making RDP, SSH, and SQL accessible from the internet. At a higher level, below are some more networking features and options for securing Azure:

  • Azure Firewall
    • Protecting storage accounts
    • Using logging
    • Monitored
  • VPN/Express Route
    • Encryption between on-premises and Azure
  • Bastion Host
    • Access to host using jump box feature
    • Heavy logging
  • Advanced Threat Protection
    • Alerts of threats in low, medium and high severity
    • Unusual activities, such as large numbers of storage files being copied
  • Just in Time (JIT)
    • Access hosts only when needed, within a configured time frame
    • Selected IP ranges and ports
  • Azure WAF (Web Application Firewall)
    • Layer 7 firewall for applications
    • Utilize logging and monitoring

An additional design factor to consider is the layout of your network architecture. Keeping your resources divided into tiers can be a great security practice to minimize risk to each component. An example would be utilizing a three-tier design, which divides a web application into three tiers (VNets). In the figure below you can see a separate web tier, app tier, and data tier. This is much more secure because the front-end web tier can still access the app tier but cannot directly talk to the data tier, which helps to minimize risk to your data.

Three Tier Network Architecture: web tier, app tier, and data tier

Logging and Monitoring

Getting the best data and analytics to properly monitor and log your data is an important part of assessing your Azure environment. For those in security roles, liability is an important factor in the ‘chain of custody’. When handling security incidents, extensive logging is required to ensure you understand the full scope of the incident. This includes having logging and monitoring turned on for the following recommended items:

  • IDS/IPS
  • DLP
  • DNS
  • Firewall/WAF
  • Load Balancers
  • CDN

The next possible way to gather even more analytics is the use of a SIEM (Security Information and Event Management) tool like Azure Sentinel. This adds another layer of protection to collect, detect, investigate, and respond to threats from on-premises to multi-cloud vendors. An important note here is to make sure you tune your SIEM so that you are detecting threats accurately and not diluting the alerts with false positives.

Advanced Data Security

The final point I want to dive into is Advanced Data Security. The protection of data in any organization should be at the top of its list of priorities. Classifying your data is an important first step toward knowing its sensitivity, and this is where Data Discovery & Classification can help by labeling the sensitivity of your data. Next is utilizing vulnerability assessment scanning, which helps assess the risk level of your databases and minimize leaks. Overall, these cloud-native tools are just another great way to help secure your Azure environment.

Conclusion

In closing, Azure has a plethora of tools at your disposal within Azure Security Center to do your own security assessment and protect yourself, your company, and your clients from future attacks. The ASC can become your hub for defining and maintaining a compliant security posture for your enterprise. Tanya and Teri go into great detail on the steps to take and even supply a checklist you can follow yourself to assess an Azure environment.

Checklist

  1. Set scope & only test what’s in scope
  2. Verify account structure, identity, and access control
  3. Set Azure policies
  4. Turn on Azure Security Center for all subs
  5. Use cloud-native security features – threat protection and adaptive controls, file integrity monitoring, JIT, etc.
  6. Follow networking best practices, NSGs, routes, access to compute and storage, network watcher, Azure Firewall, Express Route and Bastion host
  7. Always be on top of alerts and logs for Azure WAF and Sentinel
  8. VA everything, especially SQL databases
  9. Encryption, for your disk and data (in transit and rest)
  10. Monitor all that can be monitored
  11. Follow the Azure Security Center recommendations
  12. Then call a Penetration Tester

I hope you found this post to be helpful and make you, your company, and your clients’ experience on Azure more secure. For the full presentation, including a demo on Azure Security Center, check out this link. 

This case study was featured on Microsoft.

When GEICO sought to migrate its sales mainframe application to the cloud, it had two choices. The first was to “rehost” its applications, which involved recompiling the code to run in a mainframe emulator hosted in a cloud instance. The second was to “rebuild” its infrastructure and replace the existing mainframe functionality with equivalent features built using cloud-native capabilities.

The rehost approach is arguably the easier of the two alternatives. Still, it comes with a downside – the inability to leverage native cloud capabilities and the benefits, like continuous integration and deployment, that flow from them.

Even though a rebuild offers the benefits of native cloud capabilities, it is riskier because it adds cost and complexity. GEICO was clear that it not only wanted to move away from the mainframe codebase but also to significantly increase its agility through frequent releases. At the same time, the risks of the “rebuild” approach, which involved a million lines of COBOL code and 16 subsystems, were staring the team in the face.

This was when GEICO turned to AIS. GEICO hired AIS for a fixed-price, fixed-time engagement to “rebuild” the existing mainframe application. AIS extracted the business logic from the current system and reimplemented it from the ground up using a modern cloud architecture, baking in the principles of DevOps and CI/CD from the inception.

Together, the GEICO and AIS teams achieved the best of both worlds: a risk-mitigated “rebuild” approach to mainframe modernization. Check out the full success story on Microsoft: GEICO finds that the cloud is the best policy after seamless modernization and migration.


Azure Cognitive Services is Microsoft's offering for enabling artificial intelligence (AI) applications in daily life: a collection of AI services that currently offer capabilities around speech, vision, search, language, and decision-making. These services are easy to integrate and consume in your business applications, and they bring together powerful capabilities that apply to numerous use cases.

Azure Personalizer is one of the services in the suite of Azure Cognitive Services, a cloud-based API service that allows you to choose the best experience to show to your users by learning from their real-time behavior. Azure Personalizer is based on cutting-edge technology and research in the areas of Reinforcement Learning and uses a machine learning (ML) model that is different from traditional supervised and unsupervised learning models.

This blog is divided into three parts. In part one, we will discuss the core concepts and architecture of the Azure Personalizer Service, along with feature engineering and its relevance and importance. In part two, we will go over a couple of use cases in which the Azure Personalizer Service was implemented. Finally, in part three, we will list recommendations and capacity considerations for implementing solutions using Personalizer.

Core Concepts & Architecture

At its core, Azure Personalizer takes a list of items (e.g. a list of drop-down choices) and their context (e.g. Report Name, User Name, Time Zone) as input and returns a ranked list of items for the given context. While doing that, it also allows feedback to be submitted about the relevance and effectiveness of the ranking results returned by the service. The feedback (reward score) can be calculated automatically and submitted to the service based on the given personalization use case.

Azure Personalizer uses the submitted feedback to continuously improve its ML model. It is essential to come up with well-thought-out features that represent the given items and their context as effectively as possible for the objective of the personalization use case. Some of the use cases for Personalizer are content highlighting, ad placement, recommendations, content filtering, automatic prioritizing, UI usability improvements, intent clarification, bot traits and tone, notification content and timing, contextual decision scenarios, rapidly changing content (e.g. news, live events), etc.
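
To make the input shape concrete, here is a minimal sketch of what the actions (items to rank) and context features might look like for the drop-down example above; every ID, feature name, and value here is purely illustrative.

    # Sketch: illustrative actions (items to rank) and context features for a Rank call.
    # All IDs, feature names, and values are hypothetical.
    actions = [
        {"id": "sales-summary", "features": [{"reportType": "summary", "refreshRate": "daily"}]},
        {"id": "inventory-detail", "features": [{"reportType": "detail", "refreshRate": "hourly"}]},
        {"id": "forecast", "features": [{"reportType": "forecast", "refreshRate": "weekly"}]},
    ]

    context_features = [
        {"user": {"role": "analyst", "timeZone": "EST"}},
        {"session": {"dayPart": "morning", "device": "desktop"}},
    ]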

The Personalizer Service has a wide range of applications; in general, it fits any use case where ranking a set of options makes sense. Its application is not limited to a simple static list of items to be ranked; you are limited only by your ability to define an item and its context through feature engineering, which can range from very simple to quite complex. What makes Personalizer's scope wide and effective is:

  • The definition of items (called Actions) and their context through features
  • No dependency on prior historically labeled data
  • Real-time optimization through the consumption of feedback in the form of reward scores
  • The notion of exploitation (using the ML model's recommendation) as well as exploration, i.e. using an alternate approach (based on the epsilon greedy algorithm) to determine the item ranking instead of the ML model's recommendation (see the sketch after this list)
  • Exploration, which ensures Personalizer continues to deliver good results even as user behavior changes, and which avoids model stagnation, drift, and ultimately lower performance
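
As a purely illustrative sketch (not the service's internal implementation), epsilon greedy exploration can be pictured as flipping a biased coin on each Rank call: with probability epsilon the service tries an alternative ordering to keep learning, otherwise it exploits the model's current ranking.

    # Sketch: epsilon greedy style choice between exploiting a model ranking
    # and exploring a random alternative ordering. Illustrative only.
    import random

    def choose_ranking(model_ranking, epsilon=0.2):
        """Return the model's ranking most of the time; occasionally explore."""
        if random.random() < epsilon:
            explored = model_ranking[:]   # copy, then shuffle to explore
            random.shuffle(explored)
            return explored, "explore"
        return model_ranking, "exploit"

    ranking, mode = choose_ranking(["sales-summary", "forecast", "inventory-detail"])
    print(mode, ranking)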

The following diagram shows the architectural flow and components of the Personalizer Service, followed by a description of each labeled component.

Azure Services Personalizer

  1. The user interacts with the site/application, features related to the actions and context are sent to the Personalizer in a Rank call.
  2. Personalizer decides whether to exploit the current model or explore new choices. Explore setting defines the percentage of Rank calls to be used for exploring.
  3. Personalizer currently uses Vowpal Wabbit as the foundation for machine learning. This framework allows maximum throughput and lowest latency when calculating ranks and training the model with all events.
  4. Personalizer exploration currently uses an algorithm called epsilon greedy to discover new choices.
  5. Ranking results are returned to the user as well as sent to the EventHub for later correlation with reward scores and training of the model.
  6. The user chooses an action (item) from the ranking results, and the reward score is calculated and submitted to the service in one or more calls using the Personalizer Rewards API. The total reward score is a value between -1 and 1 (a sketch of the Rank and Reward calls follows this list).
  7. The ranking results and reward scores are sent to the EventHub asynchronously and correlated based on EventID. The ML model is updated based on the correlation results, and the inference engine is updated with the new model.
  8. Training service updates the AI model based on the learning loops (cycle of ranking results and reward) and updates the engine.
  9. Personalizer provides an offline evaluation of the service based on the past data available from the ranking calls (learning loops). It helps determine the effectiveness of features defined for actions and context. This can be used to discover more optimized learning policies.
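
Steps 1 and 6 above map to two HTTP calls. The sketch below, written against the Personalizer REST API's v1.0 rank and reward endpoints using the requests library, shows the general shape of those calls; the endpoint, key, feature values, and reward logic are assumptions for this example.

    # Sketch: a Rank call followed by a Reward call against the Personalizer REST API.
    # Endpoint, key, features, and the reward logic are placeholders/assumptions.
    import uuid
    import requests

    ENDPOINT = "https://<your-personalizer-resource>.cognitiveservices.azure.com"
    HEADERS = {"Ocp-Apim-Subscription-Key": "<your-key>", "Content-Type": "application/json"}

    actions = [
        {"id": "sales-summary", "features": [{"reportType": "summary"}]},
        {"id": "forecast", "features": [{"reportType": "forecast"}]},
    ]
    context_features = [{"user": {"role": "analyst"}}, {"session": {"dayPart": "morning"}}]

    event_id = str(uuid.uuid4())
    rank_body = {
        "eventId": event_id,
        "contextFeatures": context_features,
        "actions": actions,
        "deferActivation": False,
    }

    rank_response = requests.post(
        f"{ENDPOINT}/personalizer/v1.0/rank", headers=HEADERS, json=rank_body
    ).json()
    top_action = rank_response["rewardActionId"]

    # Later, after observing what the user actually chose, submit a reward (-1 to 1).
    user_choice = "sales-summary"   # observed behavior (assumed)
    reward = 1.0 if user_choice == top_action else 0.0

    requests.post(
        f"{ENDPOINT}/personalizer/v1.0/events/{event_id}/reward",
        headers=HEADERS,
        json={"value": reward},
    )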

Learning policy determines the specific hyperparameters for the model training. These can be optimized offline (using offline evaluation) and then used online. These can be imported/exported for future reference, re-use, and audit.

Feature Engineering

Feature engineering is the process of producing data items that better represent the underlying problem to the predictive model, resulting in improved model accuracy on unseen data; it is the turning of raw input data into something the model can understand. Estimates show that 60–70% of the time on an ML project is spent on feature engineering.

Good quality features for context and actions are the foundation that determines how effectively the Personalizer Service will perform predictions and drive the highest reward scores, so due attention needs to be paid to this aspect of implementing Personalizer. In the field of data science, feature engineering is a complete subject on its own. Good features should:

  • Be related to the objective
  • Be known at prediction-time
  • Be numeric with meaningful magnitude
  • Have enough examples
  • Bring human insight into the problem

It is recommended to define enough features to drive personalization, and for these features to be of diverse densities. High-density features help Personalizer extrapolate learning from one item to another. A feature is dense if many items are grouped into a few buckets (e.g. the nationality of a person) and sparse if items are spread across a large number of buckets (e.g. a book title). One of the objectives of feature engineering is to make features denser; for example, a timestamp down to the second is very sparse, but it can be made dense (and more effective) by classifying it into “morning”, “midday”, “afternoon”, etc., as sketched below.
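
Here is a minimal sketch of that densification step; the bucket boundaries are arbitrary choices for illustration.

    # Sketch: turn a sparse timestamp into a dense "day part" feature.
    # Bucket boundaries are arbitrary/illustrative.
    from datetime import datetime

    def day_part(ts: datetime) -> str:
        hour = ts.hour
        if 5 <= hour < 12:
            return "morning"
        if 12 <= hour < 14:
            return "midday"
        if 14 <= hour < 18:
            return "afternoon"
        return "evening"

    print(day_part(datetime(2020, 7, 1, 9, 41, 23)))   # -> "morning"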

Personalizer is flexible and adapts when some features are unavailable for some items (actions), or when features are added or removed over time. Features can be categorized and grouped into namespaces as long as they are valid JSON objects, as in the example below.
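
A minimal, hypothetical example of context features grouped into namespaces might look like this (all names and values are illustrative):

    # Sketch: context features grouped into namespaces ("user", "environment", "device").
    context_features = [
        {"user": {"role": "analyst", "tenure": "3-5yrs"}},
        {"environment": {"dayPart": "morning", "weekend": False}},
        {"device": {"formFactor": "desktop", "browser": "edge"}},
    ]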

In the next part of this blog post, we will go over a couple of use cases for which Personalizer was implemented, looking at features, reward calculation, and their test run results. Stay tuned!