How to perform a proof of concept for automated discovery using Amazon Macie | AWS Security Blog

Amazon Macie is a data security service that discovers sensitive data using machine learning and pattern matching, provides visibility into data security risks, and enables automated protection against those risks. The core capabilities of Macie are focused on the security of your S3 buckets and on helping to identify sensitive data, including financial data, personal data, and credentials, as well as sensitive data that’s unique to your organization, such as intellectual property.

Macie offers a default collection of recommended managed data identifiers for detecting general categories and types of sensitive data while optimizing data discovery results and reducing noise. If your testing requires that Macie identify additional sensitive data types that are offered as managed data identifiers but aren’t part of the recommended list, choose the Custom option and select the managed data identifiers that you need. For data types that managed data identifiers don’t cover, you can create a few custom data identifiers to confirm that custom identifiers work for your sensitive data detection needs and that Macie can support your data discovery goals. After reviewing the managed data identifiers provided by Macie and creating the custom data identifiers needed for your POC, it’s time to stage data sets that will help demonstrate the capabilities of these identifiers and better understand how Macie identifies sensitive data.

Macie also offers an automated data discovery feature that can continually discover sensitive data within your S3 buckets. Once you understand how to use Macie to discover sensitive data, the next step in the POC is to enable automated discovery and use Macie to discover sensitive data across a larger collection of your existing S3 data. These data discovery results can assist with analysis of records over time and give you a broader sense of what data Macie has scanned and which objects did or did not contain sensitive data.
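
The recommended-versus-custom identifier choice described above also appears in the Macie2 API: `create_classification_job` accepts a `managedDataIdentifierSelector` of `RECOMMENDED`, `ALL`, `INCLUDE`, or `EXCLUDE`. As a minimal sketch, the helper below only builds the request parameters; the bucket name, account ID, and identifier IDs are placeholders for illustration.

```python
# Sketch: CreateClassificationJob parameters mirroring the console's
# "Recommended" vs "Custom" managed data identifier choice.
# Bucket, account ID, and identifier IDs are hypothetical examples.

def job_params(bucket, account_id, managed_ids=None):
    """Return Macie2 create_classification_job parameters.

    With no managed_ids, the job uses Macie's recommended managed data
    identifiers; with a list, only those identifiers are included.
    """
    params = {
        "jobType": "ONE_TIME",
        "name": f"poc-job-{bucket}",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": [bucket]}
            ]
        },
    }
    if managed_ids:
        params["managedDataIdentifierSelector"] = "INCLUDE"
        params["managedDataIdentifierIds"] = managed_ids
    else:
        params["managedDataIdentifierSelector"] = "RECOMMENDED"
    return params

# In a real account, pass the parameters to boto3, for example:
#   boto3.client("macie2").create_classification_job(**job_params(...))
recommended = job_params("my-poc-bucket", "111122223333")
custom = job_params(
    "my-poc-bucket", "111122223333",
    managed_ids=["CREDIT_CARD_NUMBER", "USA_SOCIAL_SECURITY_NUMBER"],
)
```
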
The POC steps demonstrate how you can use Macie to detect and alert you to sensitive data discovered in your AWS environment and help you determine the value of using Macie to enhance your current data protection strategies. Macie comes with over 150 managed data identifiers that are designed to identify sensitive data in your S3 objects. Because the goal is to use Macie to identify sensitive information in your S3 buckets, including examples of your own data that contain sensitive information can be a helpful way to test the capabilities of Macie.

Now that you’ve reviewed the managed data identifiers, defined custom data identifiers, and staged sample data, it’s time to run a sensitive data discovery job. As you identify the sensitive data discovery jobs to run as part of your production use of Macie, keep in mind that these jobs are immutable. You can view the findings through the Findings tab in Macie or by choosing the sensitive data type when looking at the summary detections for a bucket. Additionally, confirm that you don’t have findings for the staged objects that were not supposed to contain sensitive data, so that you can confirm how Macie handles these types of objects.

For automated discovery, Macie selects samples of the objects within S3 buckets and inspects them for the presence of sensitive data daily, providing insight into where sensitive data might reside in your overall Amazon S3 data estate. The heatmap provides point-in-time insights into the data that Macie has scanned and shows the buckets in which sensitive data has or has not been found. In addition to reviewing the bucket-level statistics that are generated by automated discovery, you can view the individual findings that were generated for each S3 object identified as having sensitive data. With the preceding POC, you should now have a more complete understanding of how Macie identifies sensitive data and how you can use the information that Macie provides about the data it identifies.
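
One way to stage test data is to generate one object that should trigger a detection and one control object that shouldn’t. The sketch below builds both as CSV text; the 4111… value is a well-known synthetic test card number, and the "credit card" column header plays the role of a nearby keyword. File names and the upload call in the comment are assumptions for illustration.

```python
import csv
import io

# Sketch: build one staged object that should produce a Macie finding
# (synthetic test card number with a keyword-style header) and one
# control object that should not. All values are synthetic.

def staged_csv(include_sensitive):
    """Return CSV text for a staged test object."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    if include_sensitive:
        writer.writerow(["name", "credit card number"])
        writer.writerow(["Test User", "4111 1111 1111 1111"])
    else:
        writer.writerow(["name", "favorite color"])
        writer.writerow(["Test User", "green"])
    return buf.getvalue()

positive = staged_csv(True)   # expect a finding for this object
negative = staged_csv(False)  # expect no finding for this object
# Upload with, for example:
#   boto3.client("s3").put_object(Bucket="my-poc-bucket",
#                                 Key="staged/cc.csv", Body=positive)
```
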
Amazon Macie helps customers identify, discover, monitor, and protect sensitive data stored in Amazon S3. Sensitive data discovery jobs provide a way to target a specific S3 bucket or group of buckets for a deep analysis of the objects in those buckets, identifying whether sensitive data is present in the objects and, if so, the type of data. Each Macie finding contains not only the details of the types of sensitive data identified, but also the location within the file where the sensitive data was found, so that you can confirm the identified data is sensitive.

Macie covers a wide range of use cases with its managed data identifiers, but some use cases need custom data identifiers for data types that aren’t included in the managed set. If your requirements for identifying sensitive data include detecting sensitive data that isn’t part of the current list of managed data identifiers, you can create custom data identifiers for those data types.

We recommend that you stage data sets that contain sensitive data as well as data sets that do not, to gain a full understanding of how Macie detects and reports on each of these situations. With the managed data identifiers that Macie offers, you should also stage data files that you believe don’t contain information that aligns to the managed data identifiers. This can help you understand how Macie handles data that you believe doesn’t contain sensitive information. Macie’s analysis records provide an audit of every object that Macie attempted to scan, including objects that didn’t contain sensitive data.

After a POC with Macie, you can set the scope of how you will use Macie in production by deciding which buckets don’t need to be evaluated and so can be excluded, such as buckets used for AWS logs and buckets deemed not in scope for sensitive data identification.
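
A custom data identifier is essentially a regex plus optional keywords and a maximum match distance. The sketch below defines one for a hypothetical employee ID format ("EMP-" followed by six digits); the name, regex, keywords, and distance are all assumptions, arranged to mirror the fields of the Macie2 `create_custom_data_identifier` API, and the regex is sanity-checked locally with Python’s `re`.

```python
import re

# Sketch: a custom data identifier for a hypothetical employee ID
# format. The field names mirror Macie2's CreateCustomDataIdentifier
# request; the values themselves are illustrative.

employee_id_identifier = {
    "name": "poc-employee-id",
    "description": "Company employee IDs (hypothetical format)",
    "regex": r"EMP-\d{6}",
    "keywords": ["employee id", "employee number"],
    "maximumMatchDistance": 50,  # keyword within 50 characters of the match
}

# Sanity-check the pattern locally before creating the identifier with:
#   boto3.client("macie2").create_custom_data_identifier(**employee_id_identifier)
pattern = re.compile(employee_id_identifier["regex"])
match = pattern.search("employee id: EMP-123456")
```
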
In this post, we show you how to define and run a proof of concept (POC) to validate using Macie and automated discovery to enhance your current data protection strategies.

Keywords are an important component for Macie to be able to detect sensitive data. Many managed data identifiers require keywords to be in proximity of the data for Macie to be able to detect findings. When preparing data to stage, keep in mind the keyword requirements for many of the Macie managed data identifiers.

After automated discovery starts producing results, you will start seeing data in the Automated Discovery section of the Macie summary page in the console. This information includes the sensitivity score of each bucket, a summary of the types of sensitive data found in the bucket, which objects within the bucket have been sampled, statistics on the data that has been scanned and the data still to be scanned, and other information about the bucket. If a bucket is blue, that means only that automated data discovery hasn’t identified sensitive data up to the point in time of the last scan, not that there is no sensitive data in the bucket. By using automated data discovery, you can focus your resources on deeper investigations of the security of buckets identified as having sensitive data.

Macie creates an analysis record for each S3 object that’s in scope for a data discovery job or an automated discovery scan. Objects in which Macie found sensitive data are presented as findings in the Macie console, and each object where Macie found sensitive data is listed as a single finding. Investigating sensitive data with findings has detailed guidance on locating sensitive data from Macie findings, retrieving the sensitive data, and the schema for sensitive data locations.

This post outlined how you can use a POC to better understand how Amazon Macie can help meet your data discovery and classification needs.
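
To build intuition for the keyword-proximity behavior when designing test data, the toy matcher below flags a pattern only when a keyword appears within a fixed character distance before it. This is a deliberate simplification for staging purposes, not Macie’s actual matching algorithm.

```python
import re

def proximity_match(text, keyword, pattern, max_distance):
    """Return True when `pattern` matches within `max_distance` characters
    after an occurrence of `keyword` (case-insensitive).

    A rough analogue of a keyword requirement, not Macie's real logic."""
    low = text.lower()
    for kw in re.finditer(re.escape(keyword.lower()), low):
        window = text[kw.end(): kw.end() + max_distance]
        if re.search(pattern, window):
            return True
    return False

# A match right after the keyword is detected; the same value far from
# any keyword is not.
near = proximity_match("account number: 1234-5678",
                       "account number", r"\d{4}-\d{4}", 30)
far = proximity_match("account number." + " x" * 100 + " 1234-5678",
                      "account number", r"\d{4}-\d{4}", 30)
```

When staging files, this suggests keeping keyword-style column headers or labels close to the values you want detected.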
Customers use Amazon S3 for a variety of use cases and store various types of data in S3 buckets, including sensitive data. However, it’s important that customers evaluate and test the capabilities of Macie to verify that it can meet their specific data identification and protection goals. This POC is intended to help you gain an understanding of what Macie is capable of and how you can use it to achieve your data discovery goals.

Examples of Macie managed data identifiers include credit card numbers, AWS secret access keys, and national identification numbers. You can also refine the managed data identifiers that are required for detecting sensitive data. If you want to use a customer managed AWS KMS key to encrypt the S3 data at rest, follow the instructions in Allowing Macie to use a customer managed AWS KMS key to give Macie access to decrypt the data in the bucket.

Automated data discovery is intended to help customers who have large numbers of S3 buckets and large amounts of data better understand where sensitive data might be stored without having to scan all of it. Note that it takes 24–48 hours for Macie to perform the first scan after the feature is enabled. The summary includes metrics for the total number of buckets eligible for discovery, counts of buckets where sensitive data was or was not found, and how many of these buckets are public. The heatmap view provides information on each organizational member account and insight into the sensitive data within each bucket in the account. Red indicates that some type of sensitive data has been found in the bucket, while blue indicates that no sensitive data has been identified.

After the POC is complete, evaluate the results to determine how much using Macie can strengthen your organization’s data protection program.
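
For the customer managed KMS key case, the linked documentation adds a statement to the key policy granting the Macie service-linked role use of the key. A sketch of such a statement is below; the account ID is a placeholder, and you should follow the linked instructions for the exact policy your account needs.

```json
{
  "Sid": "Allow the Macie service-linked role to use the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::111122223333:role/aws-service-role/macie.amazonaws.com/AWSServiceRoleForAmazonMacie"
  },
  "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
  "Resource": "*"
}
```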
Prior to beginning your POC, review the list of managed data identifiers and determine which ones you feel will be necessary for your data discovery requirements. For example, customers might need to identify sensitive data that’s specific to their company, such as an employee ID or project number.

Stage data that contains information that’s representative of the data you would want to detect using custom data identifiers, and stage data files that don’t contain sensitive information. When you’re staging your data, reference the keywords that are supported for the managed data identifiers you’re using to help ensure that the data can be identified in your POC tests. Various public repositories, often composed of publicly available data sets or created to help with testing machine learning models or sensitive data detection, can also be useful sources of staging data.

After the job completes, it’s time to review what Macie found in the data. Choose each of the findings that was produced and review the details to confirm what sensitive data was identified and whether it was discovered as you expected. If multiple types of sensitive data were found in an object, each type of sensitive data and a count will be included in the details.

As part of your POC, it’s recommended that you investigate buckets that are reported to contain sensitive data. In the heatmap, each square represents a bucket in an account, and the color of the square indicates whether sensitive data was discovered in that bucket. Over time, this heatmap might change as automated data discovery continues sampling the data in each bucket. Continuously monitoring these buckets for the presence of sensitive data is a vital part of a data protection strategy, and will help ensure that the remediation steps for identified sensitive data are directed to the correct parties.

To avoid incurring additional charges, disable Macie while you evaluate the value of the additional data protection it provides.
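
When reviewing findings, the per-type counts live in the finding’s classification details. The sketch below flattens a trimmed-down version of that structure into a type-to-count summary; the dict shown keeps only a few fields of a real finding’s `sensitiveData` section, and the counts are invented for illustration.

```python
# Sketch: summarizing per-type detection counts from a Macie finding.
# `finding_result` is a heavily trimmed stand-in for the
# classificationDetails.result.sensitiveData section of a real finding.

finding_result = {
    "sensitiveData": [
        {
            "category": "FINANCIAL_INFORMATION",
            "detections": [
                {"type": "CREDIT_CARD_NUMBER", "count": 3},
            ],
        },
        {
            "category": "CREDENTIALS",
            "detections": [
                {"type": "AWS_CREDENTIALS", "count": 1},
            ],
        },
    ]
}

def detection_counts(result):
    """Flatten a finding's sensitiveData section into {type: count}."""
    counts = {}
    for group in result["sensitiveData"]:
        for det in group["detections"]:
            counts[det["type"]] = counts.get(det["type"], 0) + det["count"]
    return counts

summary = detection_counts(finding_result)
```

A summary like this makes it easy to compare what Macie reported against what you staged in each object.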
There’s an exponential growth of digital data, and organizations are grappling with not only managing it but also determining where their sensitive data exists. A successful POC of Macie includes understanding what data Macie can detect. It’s important to first understand the available managed data identifiers and which ones align with the use cases you want to address. Additionally, identify which managed data identifiers that are applicable to your POC fall outside of the default list of identifiers, and make sure that the recommended managed data identifiers are part of the custom list that you construct. You can create custom data identifiers to help meet your data detection needs if necessary.

Understanding the keywords that are used as part of sensitive data detection is important when it comes to building test data for a POC. Similar to managed data identifiers, custom data identifiers have keyword requirements; Defining detection criteria for custom data identifiers provides details on the types of data that require keywords.

Stage data files of your own data that contain sensitive information. There are also various repositories staged with information that could be used for sensitive data detection. Staged data must be in file formats that Macie supports.

If you created custom data identifiers, review the findings for the objects that included the custom data you want to detect, to confirm that the data was detected. For tabular data, for example, a finding shows the row and column where the sensitive data was found. For your investigation, validate whether the identified data is sensitive based on your organization’s data classification policy. If the findings are true positives, make sure that the bucket has the right level of security configurations and permissions based on the data stored in the bucket. Most customers use automated data discovery to get sample scans instead of adjusting the sampling depth for individual jobs.
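
For tabular objects, the location information arrives in a detection’s `occurrences` section as a list of cells with row and column positions. The sketch below formats those into readable locations; the record is abbreviated and its values invented, so treat it as a stand-in for the full sensitive data locations schema.

```python
# Sketch: extracting row/column locations from a detection's
# "occurrences" section, the part of a finding that says where in a
# CSV or spreadsheet each match was found. The record below is an
# abbreviated, invented example.

occurrences = {
    "cells": [
        {"row": 2, "column": 4, "columnName": "credit card number"},
        {"row": 7, "column": 4, "columnName": "credit card number"},
    ]
}

locations = [
    f"row {cell['row']}, column {cell['column']} ({cell['columnName']})"
    for cell in occurrences["cells"]
]
```
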
Amazon Web Services (AWS) customers of various sizes across different industries are pursuing initiatives to better classify and protect the data they store in Amazon Simple Storage Service (Amazon S3). Data security is a broad concept that revolves around protecting digital information from unauthorized access, corruption, theft, and other forms of malicious activity throughout its lifecycle. A well thought-out and implemented POC can provide valuable early insights and help you develop a more thorough understanding of what your data discovery and classification strategy should be.

To determine which managed data identifiers have keyword requirements, see Managed data identifiers by type. Building custom data identifiers has a thorough explanation of how to define a custom data identifier. When configuring a job, select the recommended managed data identifiers and choose the custom data identifiers that you want to be used in the job.

Data discovery results are written to an S3 bucket that you own and where you control the data retention. The data discovery scan results are stored as JSON Lines files. There’s a 30-day free trial for automated data discovery, which is covered later in this post; there is no free trial for running targeted data discovery jobs.
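
Because the results land as JSON Lines (one analysis record per scanned object), they are easy to post-process with the standard library. The sketch below filters completed records that reported sensitive data; the two inline records use a simplified shape with invented field values, so consult the discovery results schema for the real record layout.

```python
import json

# Sketch: reading Macie data discovery results from your results bucket.
# The records below are simplified, invented stand-ins for real analysis
# records (the actual schema has more fields and nesting).

jsonl = "\n".join([
    json.dumps({"bucketArn": "arn:aws:s3:::my-poc-bucket",
                "objectKey": "staged/cc.csv",
                "status": {"code": "COMPLETE"},
                "classificationDetails": {"result": {"sensitiveDataCount": 3}}}),
    json.dumps({"bucketArn": "arn:aws:s3:::my-poc-bucket",
                "objectKey": "staged/plain.csv",
                "status": {"code": "COMPLETE"},
                "classificationDetails": {"result": {"sensitiveDataCount": 0}}}),
])

def objects_with_findings(lines):
    """Return keys of completed records that reported sensitive data."""
    hits = []
    for line in lines.splitlines():
        rec = json.loads(line)
        if (rec["status"]["code"] == "COMPLETE"
                and rec["classificationDetails"]["result"]["sensitiveDataCount"] > 0):
            hits.append(rec["objectKey"])
    return hits

flagged = objects_with_findings(jsonl)
```

This kind of pass over the audit records is how you can confirm, object by object, that the staged non-sensitive files produced no detections.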

Originally published on aws.amazon.com. Publication date: Tue, 01 Oct 2024 20:43:05 +0000

