MC2: Secure Collaborative Analytics and Machine Learning

Machine learning (ML) has gained prominence in recent years because it applies across dozens of industries and effectively solves complex problems. Yet research shows that nearly 90% of AI/ML models never reach production. The main challenge is that ML/AI models require massive amounts of high-quality, accurate, and timely data to be effective, but organizations have long been reluctant to share sensitive information due to security and privacy concerns.

Personal data is becoming more prevalent, which has led to increased privacy concerns. As a result, global data protection laws are becoming more stringent, and organizations face increasingly higher non-compliance risks. Alleviating these concerns and taking AI/Machine Learning to the next level requires a new approach to collaboration – secure collaborative learning.

Secure collaborative learning enables multiple parties to jointly build powerful ML models without sharing sensitive data with each other. With this technology, banks can use these models to detect financial crimes and money laundering. Healthcare organizations can improve clinical insights from multiple patient data sets without revealing sensitive information, and mobile operators can predict fluctuations in call rates by collectively analyzing their traffic data.

After years of extensive research on this model at the University of California, Berkeley's RISELab, co-creators Raluca Ada Popa and Rishabh Poddar developed MC2, an open source platform, to meet this major challenge of multi-party collaboration.

MC2 (Multi-party Collaboration and Coopetition) enables rich analytics and machine learning to be performed on encrypted data, ensuring that data remains hidden even while it is processed. With a “black box” approach via secure enclaves, the data in use remains confidential even from the server running the job. This may sound paradoxical, but it is true: multiple data owners can jointly run analytics or train ML models on their collective data without revealing that data to anyone else. This alleviates concerns about offloading confidential workloads to untrusted third parties or cloud service providers. MC2 resolves the tension between expanding adoption of the cloud, the need to share data, and the growing concern about data privacy.

The rest of this article details the key technical aspects of this popular open source project, which charts a path toward secure, collaborative ML and AI.

A software package that fortifies secure enclaves

Secure enclaves enable the creation of a Trusted Execution Environment (TEE), an area where multiple parties can collaborate on confidential data on an untrusted machine. Previous approaches dump data into the TEE and provide access to everyone who needs it to collaborate, but this opens the door to hidden risks and third-party leakage that companies cannot afford in this regulatory climate.

With secure enclaves, each enclave has access to a restricted portion of the system’s memory, and data or programs placed inside the enclave are encrypted and isolated from the rest of the system. This creates an extra layer of security that protects against intrusion, even from the system itself. Going further, secure enclaves support remote attestation, which lets users cryptographically verify that the enclave is running trusted, unmodified code.
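The verification step described above, in which a user checks cryptographically that the enclave is running trusted, unmodified code (commonly called remote attestation), can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the names `HARDWARE_KEY`, `make_report`, and `verify_report` are hypothetical, and a shared-key HMAC stands in for the hardware-rooted signatures that real enclave platforms use. It is not MC2's implementation.

```python
import hashlib
import hmac

# Hypothetical stand-in for a hardware-protected attestation key.
# Real platforms use asymmetric signatures rooted in the hardware vendor.
HARDWARE_KEY = b"demo-hardware-key"


def measure(code: bytes) -> str:
    """Return the SHA-256 digest (the "measurement") of the enclave code."""
    return hashlib.sha256(code).hexdigest()


def make_report(loaded_code: bytes) -> dict:
    """Enclave side: report the measurement of the loaded code plus a MAC."""
    measurement = measure(loaded_code)
    tag = hmac.new(HARDWARE_KEY, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "tag": tag}


def verify_report(report: dict, expected_code: bytes) -> bool:
    """User side: check the MAC, then compare with the expected measurement."""
    expected_tag = hmac.new(
        HARDWARE_KEY, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected_tag, report["tag"]):
        return False  # the report was forged or altered in transit
    return hmac.compare_digest(report["measurement"], measure(expected_code))
```

A party would call `verify_report` before uploading any data: if the enclave loaded code that differs from the agreed version by even one byte, the measurement no longer matches and the check fails.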

MC2 runs common analytics and machine learning frameworks (Apache Spark, XGBoost, etc.) seamlessly, securely, and efficiently inside enclaves, sparing the end user the complexity of writing enclave code. In addition, MC2 handles partitioning so that the components that compute directly on sensitive data are automatically loaded into the enclave.

Finally, MC2 fortifies enclave components using cryptographic techniques in two ways:

  1. MC2 contains built-in routines that check the integrity of jobs that require distributed execution.
  2. Because developers using secure enclaves would otherwise still need to monitor and handle side-channel leaks and attacks themselves, MC2 uses data-oblivious techniques in the enclave code to ensure that side-channel information is not leaked via memory access patterns.

MC2 provides both software and hardware data protection. This dual security reduces the risk of side-channel attacks, which are a major enclave vulnerability.
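To make the second point concrete, here is a minimal Python sketch of one common way to keep memory access patterns independent of secret data (often called a data-oblivious lookup). The function name `oblivious_lookup` is hypothetical, and this is a textbook illustration rather than MC2's enclave code. Python itself offers no constant-time guarantees, so the sketch only shows the access pattern, not a hardened implementation.

```python
from typing import Sequence


def oblivious_lookup(values: Sequence[int], secret_index: int) -> int:
    """Return values[secret_index] while reading every element in order.

    A direct values[secret_index] touches one memory location that depends
    on the secret. Here the sequence of reads is the same for any index.
    """
    result = 0
    for i, value in enumerate(values):
        # mask is 1 only at the secret position and 0 elsewhere, so the
        # secret never decides which element gets read.
        mask = int(i == secret_index)
        result += mask * value
    return result
```

The cost is a full scan per lookup instead of a single read, which is the usual trade-off of oblivious techniques: more work in exchange for an access pattern that reveals nothing about the secret index.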

MC2 in practice

At the beginning of the collaboration, each organization prepares the script that will run the computation. The script is the same for each organization and is agreed upon in advance.
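One simple way to confirm that every organization holds the same script is to compare cryptographic digests of each copy. The Python sketch below assumes this approach purely for illustration; `script_digest` and `all_parties_agree` are hypothetical names, and the source does not describe how MC2 itself enforces the agreement.

```python
import hashlib
from typing import Dict


def script_digest(script_text: str) -> str:
    """Return the SHA-256 digest of a party's copy of the script."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


def all_parties_agree(scripts: Dict[str, str]) -> bool:
    """True only if every party submitted a byte-identical script."""
    digests = {script_digest(text) for text in scripts.values()}
    return len(digests) == 1
```

Parties can exchange digests instead of full scripts, and any difference, however small, produces a different digest and blocks the job.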

As the encrypted data is uploaded to the server, MC2 receives many local updates. The software trains a decision tree model on the encrypted data, and that model is used to make predictions. By aggregating the local updates, MC2 produces a final model based on the analysis of the encrypted data collected from each party.

Once training is finished, each organization downloads the results generated from the encrypted data set. This global model is what provides analytical insight. Even at this point, no party can see data from the other organizations. Each has access only to the collective analysis, which it can then apply to its own dataset.
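The data flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MC2's API: `enclave_job`, `train_stump`, and `predict` are hypothetical names, a plain function stands in for the enclave, encryption is omitted, and a one-level decision tree (a decision stump) replaces the full model to keep the example short.

```python
from typing import Dict, List, Tuple

Row = Tuple[float, int]  # (feature value, label 0 or 1)


def train_stump(rows: List[Row]) -> float:
    """Pick the threshold with the fewest errors when predicting 1 for x > t."""
    candidates = sorted({x for x, _ in rows})
    best_threshold, best_errors = candidates[0], len(rows) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in rows)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def enclave_job(party_data: Dict[str, List[Row]]) -> Dict[str, float]:
    """Stand-in for the enclave: pool all rows, train, return only the model."""
    pooled = [row for rows in party_data.values() for row in rows]
    return {"threshold": train_stump(pooled)}


def predict(model: Dict[str, float], x: float) -> int:
    """Each party applies the shared model to its own data locally."""
    return int(x > model["threshold"])
```

The point of the sketch is the return value of `enclave_job`: the parties receive the trained model, never one another's rows.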

It may sound simple in practice, because it is! MC2 makes multi-party collaboration on encrypted data possible for anyone.

The next wave of confidential analytics

Yes, personal data is becoming more prevalent, privacy concerns are growing daily, and subsequent data protection laws are becoming more stringent. However, at the same time, organizations are realizing the huge benefits of being able to share their data with each other – banks can collaborate to detect financial crimes, health institutions can collaborate on medical studies, and so on.

Over $300 billion worth of the world’s most valuable data remains untapped for lack of a secure processing environment in which ML can be applied. Gartner expects that by 2025, more than 50% of organizations will adopt privacy-enhancing computation to process sensitive data and perform multi-party analytics, underscoring the importance of secure access to encrypted data.

The confidential computing space isn’t going to slow down any time soon, and now is the time for organizations to embrace this technology. With the amount of sensitive data increasing every day, the need for the MC2 platform has never been greater. Confidential computing and analytics on encrypted data will soon become a must for all industries looking to collaborate on sensitive information.
