BLOG Published on 2024/06/22 by Woshada Dassanayake in Tech-Tips

Improving Data Protection with Microsoft Purview



Introduction

Two pressures are reshaping information protection: security tool fragmentation and the adoption of generative AI. Microsoft recently commissioned a global survey of 800 data security professionals, who indicated that a comprehensive information protection platform can address security tool fragmentation. This is crucial as organizations consider generative AI solutions, which may expose confidential data to unauthorized users if the data is not classified and protected. Although 80 percent of organizations agree that an integrated data security platform is superior to multiple best-of-breed solutions, they currently use an average of 10 different security tools. This practice can create a false sense of security, visibility gaps, coverage issues, and increased complexity, and it takes more time and resources to manage the disparate solutions.


Data security challenges

In cybersecurity, the landscape is complex and multidimensional. The challenge extends beyond fragmented security tools to a diverse array of applications and services spanning on-premises, Cloud, and multi-cloud environments, as well as various infrastructures and device platforms. Moreover, the regulatory environment is continuously evolving, with global mandates addressing privacy and compliance issues alongside the advent of innovative technologies such as large language models and generative AI tools. A notable example is the recent legislative scrutiny in the US regarding ChatGPT. These mandates aim to curb the surge of data breaches, which cost organizations an average of $4.4 million in 2023, while protecting privacy and mitigating associated risks.

Organizations should prioritize adopting a comprehensive data security platform to address these multidimensional requirements and dynamic regulatory changes effectively. The platform's solutions should be intelligent, unified, and integrated. A key requirement is the ability to classify and protect sensitive files within the productivity tools, applications, and services the organization uses, whether they are Microsoft products or third-party applications. This protection should ideally be native to the applications themselves, ensuring no gaps in coverage and eliminating the need for frequent integration testing as new features are released.

Also, seamless integration with other security and compliance solutions within the suite is essential, supporting a unified classification and labeling scheme. The platform should offer a centralized portal for monitoring sensitive files and user activities to assess risk consistently. The same encryption and protection methodologies should apply to external emails or Teams chats. Moreover, the platform must be adaptable to multiple device platforms and infrastructures and have the computational capacity to handle classification workloads at the scale of large enterprises. It should provide comprehensive 360-degree visibility into risk across the organization and serve as an integral component of a broader data security strategy.

To enhance protection throughout your organization, consider using three data security solutions together: information protection, data loss prevention (DLP), and insider risk management.

By combining information protection and data loss prevention, you can effectively uncover and automatically classify data while preventing unauthorized usage. Pairing information protection with insider risk management enables a deeper understanding of user intent around sensitive data to identify critical data risks within your organization. Integrating insider risk management with data loss prevention allows adaptive protection, which assigns appropriate DLP policies based on user risk levels. Utilizing all three products enables a fortified data security stance, adopting a defense-in-depth approach.
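The adaptive protection idea above, where the DLP policy a user receives depends on that user's insider risk level, can be sketched in a few lines. This is a minimal illustration rather than Purview's implementation: the risk tier names, the `DLP_ACTION_BY_RISK` table, and the `dlp_action_for` function are all hypothetical.

```python
from enum import Enum


class RiskLevel(Enum):
    # Hypothetical risk tiers; real tiers come from your insider risk policies.
    MINOR = 1
    MODERATE = 2
    ELEVATED = 3


# Hypothetical mapping from user risk level to the DLP action taken
# when that user tries to share sensitive content.
DLP_ACTION_BY_RISK = {
    RiskLevel.MINOR: "audit",
    RiskLevel.MODERATE: "warn",
    RiskLevel.ELEVATED: "block",
}


def dlp_action_for(risk: RiskLevel, content_is_sensitive: bool) -> str:
    """Return the DLP action for a sharing attempt by a user at this risk level."""
    if not content_is_sensitive:
        return "allow"
    return DLP_ACTION_BY_RISK[risk]


if __name__ == "__main__":
    for level in RiskLevel:
        print(level.name, "->", dlp_action_for(level, content_is_sensitive=True))
```

The point of the design is that the same sharing attempt is handled more strictly as the user's risk level rises, without an administrator editing policies by hand.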


What is Microsoft Purview Information Protection?

Microsoft Purview Information Protection is made up of three intelligent Cloud-enabled services.

Data classification service: The data classification service provides powerful capabilities that help customers identify their most sensitive information. Microsoft Purview offers a range of techniques, from simple regular expressions to advanced machine learning models.

Sensitivity label service: The sensitivity label service enables you to create a label taxonomy that reflects how sensitivity is defined in your organization. For example, data can be labeled as public, general, or confidential.

Rights management service: The rights management service allows admins to assign rights to labels. When these labels are applied manually by users or automatically by the data classification service, access to sensitive data is enforced through encryption. This ensures that only authorized users can access the data.
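How the three services fit together can be sketched as a small pipeline: classify the content, apply a label, then check access against the rights attached to that label. This is a conceptual sketch only. The pattern, the label names, the `ALLOWED_GROUPS` table, and the function names are hypothetical, and the real service enforces access through encryption rather than a simple lookup.

```python
import re
from dataclasses import dataclass

# Classification: a deliberately simple pattern standing in for a real classifier.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Rights management: hypothetical mapping from label to the groups allowed to
# open labeled content. None means the label carries no access restriction.
ALLOWED_GROUPS = {
    "Public": None,
    "General": None,
    "Confidential": {"finance", "legal"},
}


@dataclass
class Document:
    text: str
    label: str = "General"


def auto_label(doc: Document) -> Document:
    """Raise the label to Confidential when the classifier finds a match."""
    if SSN_LIKE.search(doc.text):
        doc.label = "Confidential"
    return doc


def can_open(doc: Document, user_groups: set[str]) -> bool:
    """Check a user's groups against the rights attached to the document's label."""
    allowed = ALLOWED_GROUPS[doc.label]
    return allowed is None or bool(allowed & user_groups)


if __name__ == "__main__":
    doc = auto_label(Document("Employee SSN: 123-45-6789"))
    print(doc.label, can_open(doc, {"finance"}), can_open(doc, {"sales"}))
```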

These three capabilities address a wide range of data security needs. They are fully supported by key solutions such as Exchange, SharePoint, Teams, Power BI, and Fabric. They are integrated natively into Office and extend to endpoint DLP for Windows and Mac platforms. Moreover, the Purview Information Protection platform is compatible with third-party SaaS applications, and it supports a range of solutions spanning data security, compliance, privacy, and security with Defender for Cloud. It also provides an SDK that enables third-party partners, such as Adobe, to integrate sensitivity labels and protections natively and to expand coverage to new data types and applications.


Data classification

Microsoft has invested significantly over the past 2-3 years to enhance its data classification capabilities, which now include the following classifier types.

Sensitive info types: Microsoft Purview includes over 300 out-of-the-box sensitive information types, covering ID numbers, driver's licenses, social security numbers, and more.

Named entities: Named entities are used to identify various types of information, such as person names, addresses, medical terms and conditions, and drug names. This helps accurately detect regulatory compliance data.

Exact data match: This technique involves a lookup process to match content with unique customer data, such as product codes or healthcare records. It is scalable to large enterprise environments and supports up to 100 million rows with multiple lookup fields.

Optical Character Recognition: OCR supports over 150 languages and file sizes up to 50 megabytes. It can also extract text from images and PDFs, enabling existing DLP and auto-labeling policies to protect data that might leak from these sources.

Credential SITs: Credential SITs detect digital authentication strings, passwords, and Azure connection strings as they are stored or shared in files, chats, and emails.

Context-based classification: This new category of classifiers identifies sensitive data based not on the file's content but on its properties.

Fingerprint SITs: Fingerprinting can detect exact or partial matches of sensitive intellectual property. It converts a standard form into a sensitive information type, which can then be used in the rules of your DLP policies.

Trainable classifiers: Trainable classifiers offer over 40 pre-trained machine learning models, ready to use for quickly identifying sensitive data across nine different business categories. Additionally, customers can create custom trainable classifiers for their proprietary intellectual property. 
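To make the simplest of these techniques concrete, the sketch below shows how a pattern-based sensitive information type might work in principle: a regular expression finds candidates, a checksum filters out false positives, and supporting keywords raise the confidence. It is an illustration under stated assumptions, not Purview's detection logic. The pattern, the keyword list, the "high" and "low" confidence names, and the function names are hypothetical, and the keyword check here looks at the whole text rather than a proximity window.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")

# Hypothetical supporting keywords that raise confidence when present.
KEYWORDS = ("credit card", "card number", "visa", "mastercard")


def luhn_valid(digits: str) -> bool:
    """Validate a digit string with the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_card_numbers(text: str) -> list[tuple[str, str]]:
    """Return (digits, confidence) pairs for checksum-valid candidates."""
    has_keyword = any(k in text.lower() for k in KEYWORDS)
    results = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            results.append((digits, "high" if has_keyword else "low"))
    return results


if __name__ == "__main__":
    print(detect_card_numbers("Card number: 4111 1111 1111 1111."))
```

The checksum step is what separates a sensitive information type from a plain regular expression: most random 16-digit strings fail it, which cuts false positives considerably.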

The MIP (Microsoft Information Protection) and DLP analytics pages use these classifications to present insights and recommendations. For example, they can recommend adding advanced classifiers to a policy to improve its accuracy.
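Exact data match, described in the list above, differs from pattern matching in that a candidate only counts if it appears in the organization's own data. A rough sketch of the lookup idea follows, assuming the sensitive values are stored as salted hashes so the raw table never has to be compared directly. The salt, the product-code format, and the function names are hypothetical and do not reflect how Purview stores or matches data.

```python
import hashlib
import re

SALT = b"example-salt"  # hypothetical; a real deployment would protect this value

# Hypothetical format of the primary lookup field, e.g. a product code "PR-004217".
CANDIDATE = re.compile(r"\b[A-Z]{2}-\d{6}\b", re.IGNORECASE)


def _digest(value: str) -> str:
    """Normalize a value and return its salted SHA-256 hash."""
    normalized = value.strip().upper().encode("utf-8")
    return hashlib.sha256(SALT + normalized).hexdigest()


def build_index(primary_values: list[str]) -> set[str]:
    """Hash every sensitive value once so lookups never touch the raw data."""
    return {_digest(v) for v in primary_values}


def find_exact_matches(text: str, index: set[str]) -> list[str]:
    """Return candidates from the text whose hash is present in the index."""
    return [m.group() for m in CANDIDATE.finditer(text) if _digest(m.group()) in index]


if __name__ == "__main__":
    index = build_index(["PR-004217", "PR-118902"])
    print(find_exact_matches("Ship pr-004217 and PR-999999 today", index))
```

Because each lookup is a constant-time set membership test, this style of matching can scale to very large tables, which is consistent with the row counts quoted above.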


Sensitivity labels

SharePoint document libraries serve as the primary storage for your documents. Consider a scenario where a specific document library contains sensitive information. For example, it could be a collaboration site for a new product under development or data related to a confidential M&A transaction. If you wish to label all files in the library without relying on their content, you can use site default labels. Site default labels enable you to protect all documents within a library by designating the library itself as sensitive, eliminating the need to define classification policies. Simply configure the appropriate sensitivity label for your document libraries using the library settings in the site settings information panel. Afterward, all newly created or modified documents within that library automatically receive the library's label. This ensures that documents are secured from the start by the policies associated with that label. If the label includes an encryption policy, the protection remains with the document even after it is downloaded from the library.

Context-based classification relies on information about the file rather than its content. This includes details such as file extension, size, creator, document properties, and specific words or phrases in the file name. It is used to classify and label files in specific groups or categories quickly. One example of context-based classification is when document properties are pre-written on a file by another labeling solution or a line-of-business application that generates sensitive information. In such cases, the existing document properties serve as the basis for classification and can be mapped to sensitivity labels.
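As a rough illustration of context-based classification, the sketch below chooses a label from file metadata alone and never reads the file's content. The property name `Classification`, the legacy values, the file-name keyword, and the label names are all hypothetical examples of what a mapping might look like.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    creator: str
    properties: dict[str, str] = field(default_factory=dict)


# Hypothetical document property written by another labeling tool or a
# line-of-business application, mapped to this organization's labels.
LEGACY_PROPERTY = "Classification"
LEGACY_TO_LABEL = {"Secret": "Confidential", "Internal": "General"}


def label_from_context(info: FileInfo) -> Optional[str]:
    """Pick a label from file metadata only; return None if nothing matches."""
    legacy = info.properties.get(LEGACY_PROPERTY)
    if legacy in LEGACY_TO_LABEL:
        return LEGACY_TO_LABEL[legacy]
    # Fall back to a keyword in the file name (hypothetical rule).
    if "merger" in info.name.lower():
        return "Confidential"
    return None


if __name__ == "__main__":
    report = FileInfo("q3.xlsx", 20_480, "erp-export", {"Classification": "Secret"})
    print(label_from_context(report))
```

Because no content is scanned, rules like these are cheap to evaluate, which is why the approach suits labeling large groups of files quickly.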

To help customers get started, Microsoft offers several predefined policy templates. These templates combine classifiers to match compliance regulations accurately. For example, the GLBA template integrates machine learning models, named entities, and sensitive information types to detect sensitive information optimally. Alongside the templates, policy simulation enables you to assess the impact of your classification policies before enforcement. It lets you understand a policy's effect on your production environment and real data without affecting users. You can rerun the simulation as you refine your policies.
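The simulate-then-refine loop described above can be pictured as a dry run: evaluate the policy against real documents, report what would be labeled, and change nothing. The sketch below is a conceptual model only. The `simulate` function, the rule format, and the sample policies are hypothetical and are not a Purview API.

```python
from collections import Counter
from typing import Callable

Rule = Callable[[str], bool]


def simulate(policy: dict[str, Rule], documents: dict[str, str]) -> Counter:
    """Count, per label, how many documents the policy would label.

    Nothing is modified: this only reports what enforcement would do.
    """
    would_apply: Counter = Counter()
    for text in documents.values():
        for label, rule in policy.items():
            if rule(text):
                would_apply[label] += 1
                break  # the first matching rule wins
    return would_apply


# Two iterations of a hypothetical policy: the second widens the rule.
policy_v1 = {"Confidential": lambda t: "account number" in t.lower()}
policy_v2 = {
    "Confidential": lambda t: "account number" in t.lower()
    or "routing number" in t.lower()
}

documents = {
    "a.txt": "Account number 123",
    "b.txt": "Routing number 456",
    "c.txt": "Lunch menu",
}

if __name__ == "__main__":
    print(simulate(policy_v1, documents))
    print(simulate(policy_v2, documents))
```

Comparing the counts between runs shows how a refinement changes coverage before any user is affected.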

Several other important features are available: auto-labeling of PDF files; support for protected PDFs in SharePoint Online, OneDrive, and Teams; user-defined permissions for secure collaboration; sensitivity labels to protect Teams shared channels; DLP rules that display a warning in a pop-up dialog before an email is sent; Double Key Encryption (DKE) to protect sensitive files and emails in Microsoft 365 apps; tracking and revocation capabilities; and support for Microsoft Fabric.


Get started with Microsoft Purview

Microsoft Purview aims to address data security, governance, and compliance challenges while maintaining a commitment to data privacy. It gives customers a comprehensive view of their data assets across their entire estate and lets them manage controls from a single, unified interface. The goal is to extend the trusted Microsoft 365 protections (such as labeling, classification, and access controls) to structured, unstructured, and multi-cloud environments by unifying experiences across security, compliance, and governance. The core Purview platform offers a unified view of the data estate with consistent data classification insights and protections, and it delivers a unified policy framework that aligns with the familiar Microsoft 365 environment.

Reference:

Microsoft Ignite Sessions


Woshada Dassanayake

Technical Lead in Cloud Infrastructure and Operations

Expert in Cloud platform operations, Cloud hosting and Network operations.


Copyright © 2025 Terminalworks. All Rights Reserved