Commit ce51c8e

Update README and installation guide (#104)
1 parent f48974b commit ce51c8e

3 files changed, with 138 additions and 18 deletions

.travis.yml (+1 -1)

@@ -21,7 +21,7 @@ before_install:
  - wget -qO - https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
  - sudo apt-get -qq update
  # Install dependencies
- - sudo apt -y install clang-7 libssl-dev gdb libsgx-enclave-common libsgx-enclave-common-dev libprotobuf10 libsgx-dcap-ql libsgx-dcap-ql-dev az-dcap-client open-enclave libmbedtls-dev
+ - sudo apt -y install clang-7 libssl-dev gdb libsgx-enclave-common libsgx-enclave-common-dev libprotobuf10 libsgx-dcap-ql libsgx-dcap-ql-dev az-dcap-client open-enclave=0.9.0 libmbedtls-dev
  - wget https://github.com/Kitware/CMake/releases/download/v3.15.6/cmake-3.15.6-Linux-x86_64.sh
  - sudo bash cmake-3.15.6-Linux-x86_64.sh --skip-license --prefix=/usr/local
  - export PATH=/usr/local:/usr/local/bin:$PATH
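Note on the `open-enclave=0.9.0` pin above: after installation it can be useful to confirm that the pinned version is the one actually present. The following is a minimal Python sketch, assuming a Debian-based system where `dpkg-query` is available. The helper names `installed_version` and `satisfies_pin` are hypothetical and not part of this repository, and the comparison ignores Debian epochs.

```python
import subprocess


def installed_version(package):
    """Return the installed Debian package version, or None if it cannot be determined."""
    try:
        out = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", package],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # dpkg-query is missing, so this is probably not a Debian-based system.
        return None
    version = out.stdout.strip()
    return version if out.returncode == 0 and version else None


def satisfies_pin(installed, pinned):
    """True when the installed version equals the pin or is a Debian revision of it."""
    if installed is None:
        return False
    return installed == pinned or installed.startswith(pinned + "-")


if __name__ == "__main__":
    version = installed_version("open-enclave")
    print("open-enclave:", version, "| matches pin:", satisfies_pin(version, "0.9.0"))
```

A check like this could run as an extra CI step so that a silently upgraded SDK fails fast instead of surfacing later as a build error.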

README.md (+129 -15)

@@ -1,26 +1,140 @@
  # Secure XGBoost

- ## Introduction
-
- Secure XGBoost is a library that enables **collaborative training and inference of [XGBoost](https://github.com/dmlc/xgboost) models on encrypted data**. In addition to offering the same efficiency, flexibility, and portability that vanilla XGBoost provides, Secure XGBoost enables privacy-preserving model training and inference by leveraging hardware enclaves and data-oblivious algorithms.
+ [![Build Status](https://travis-ci.org/mc2-project/secure-xgboost.svg?branch=master)](https://travis-ci.org/mc2-project/secure-xgboost)
+ [![Documentation Status](https://readthedocs.org/projects/secure-xgboost/badge/?version=latest)](https://secure-xgboost.readthedocs.io/en/latest/?badge=latest)
+ ![Contributions welcome](https://img.shields.io/badge/contributions-welcome-orange.svg)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

+ Secure XGBoost is a library that leverages secure enclaves and data-oblivious algorithms to enable the **collaborative training of and inference using [XGBoost](https://github.com/dmlc/xgboost) models on encrypted data**.

- In a nutshell, data owners can use Secure XGBoost to train a model on a remote server _without_ revealing their data contents to the remote server. Furthermore, multiple data owners can use the library to _collaboratively_ train a model on their collective data, without revealing their individual data to each other.
+ Data owners can use Secure XGBoost to train a model on a remote server, e.g., the cloud, _without_ revealing the underlying data to the remote server. Collaborating data owners can use the library to jointly train a model on their collective data without exposing their individual data to each other.
+ ![Alt Text](doc/images/workflow.gif)

  This project is currently under development as part of the broader [**MC<sup>2</sup>** effort](https://github.com/mc2-project/mc2) (i.e., **M**ultiparty **C**ollaboration and **C**oopetition) by the UC Berkeley [RISE Lab](https://rise.cs.berkeley.edu/).

- **NOTE:** The Secure XGBoost library is a research prototype, and has not yet received independent code review. Please feel free to reach out to us if you would like to use Secure XGBoost for your applications. We also welcome contributions to the project.
+ **NOTE:** The Secure XGBoost library is a research prototype, and has not yet received independent code review.

- [![Build Status](https://travis-ci.org/mc2-project/secure-xgboost.svg?branch=master)](https://travis-ci.org/mc2-project/secure-xgboost)
- [![Documentation Status](https://readthedocs.org/projects/secure-xgboost/badge/?version=latest)](https://secure-xgboost.readthedocs.io/en/latest/?badge=latest)
- [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
+ ## Table of Contents
+ * [Background](#background)
+ * [Installation](#installation)
+ * [Usage](#usage)
+ * [Documentation](#documentation)
+ * [Getting Involved](#getting-involved)

- ## Documentation
+ ## Background
+ ### Secure Enclaves
+ Secure enclaves are a recent advance in computer processor technology that enables the creation of a secure region of memory (called an enclave) on an otherwise untrusted machine. Any data or software placed within the enclave is isolated from the rest of the system. No other process on the same processor – not even privileged software such as the OS or the hypervisor – can access that memory. Examples of secure enclave technology include Intel SGX, ARM TrustZone, and AMD Memory Encryption.
+
+ Moreover, enclaves typically support a feature called remote attestation. This feature enables clients to cryptographically verify that an enclave in the cloud is running trusted, unmodified code.
+
+ Secure XGBoost builds upon the Open Enclave SDK – an open source SDK that provides a single unified abstraction across different enclave technologies. The use of Open Enclave enables our library to be compatible with many different enclave backends, such as Intel SGX and OP-TEE.
+
+ ### Data-Oblivious Algorithms
+ On top of enclaves, Secure XGBoost adds a second layer of security that additionally protects the data and computation against a large class of attacks on enclaves.
+
+ Researchers have shown that attackers may be able to learn sensitive information about the data within SGX enclaves by leveraging auxiliary sources of leakage (or “side-channels”), even though they can’t directly observe the data. Memory access patterns are an example of such a side-channel.
+
+ In Secure XGBoost, we design and implement data-oblivious algorithms for model training and inference. At a high level, our algorithms produce an identical sequence of memory accesses, regardless of the input data. As a result, the memory access patterns reveal no information about the underlying data to the attacker.
+
+ Unfortunately, the extra security comes at the cost of performance. If such attacks fall outside the users’ threat model, they can disable this extra protection.
+
+ ## Installation
+ 1. Install the Open Enclave SDK (0.9.0) and the Intel SGX DCAP driver by following [these instructions](https://github.com/openenclave/openenclave/blob/master/docs/GettingStartedDocs/install_oe_sdk-Ubuntu_18.04.md). In Step 3 of the instructions, install Open Enclave version 0.9.0 by specifying the version:
+
+ ```sh
+ sudo apt -y install clang-7 libssl-dev gdb libsgx-enclave-common libsgx-enclave-common-dev libprotobuf10 libsgx-dcap-ql libsgx-dcap-ql-dev az-dcap-client open-enclave=0.9.0
+ ```
+
+ 2. Configure the required environment variables.
+
+ ```sh
+ source /opt/openenclave/share/openenclave/openenclaverc
+ ```
+
+ 3. Install CMake and other Secure XGBoost dependencies.
+
+ ```sh
+ wget https://github.com/Kitware/CMake/releases/download/v3.15.6/cmake-3.15.6-Linux-x86_64.sh
+ sudo bash cmake-3.15.6-Linux-x86_64.sh --skip-license --prefix=/usr/local

- To get started with the library, please refer to the [documentation](https://secure-xgboost.readthedocs.io/en/latest/).
+ sudo apt-get install -y libmbedtls-dev python3-pip
+ pip3 install numpy pandas sklearn numproto grpcio grpcio-tools
+ ```
+
+ 4. Clone Secure XGBoost.
+
+ ```sh
+ git clone https://github.com/mc2-project/secure-xgboost.git
+ ```
+
+ 5. Before building, you may choose to configure the [build parameters](https://secure-xgboost.readthedocs.io/en/latest/build.html#building-the-targets) in `CMakeLists.txt`, e.g., whether to perform training and inference obliviously. In particular, if running Secure XGBoost on a machine without enclave support, you'll have to set the `SIMULATE` parameter to `ON`.
+
+ 6. Build Secure XGBoost and install the Python package.
+
+ ```sh
+ cd secure-xgboost
+ mkdir build
+
+ cd build
+ cmake ..
+ make -j4
+
+ cd ../python-package
+ sudo python3 setup.py install
+ ```
+
+ ## Usage
+ To use Secure XGBoost, replace the XGBoost import.
+
+ ```python
+ # import xgboost as xgb
+ import securexgboost as xgb
+ ```
+
+ For ease of use, the Secure XGBoost API mirrors that of XGBoost as much as possible. While the below block demonstrates usage on a single machine, Secure XGBoost is meant for the client-server model of computation. More information can be found [here](https://secure-xgboost.readthedocs.io/en/latest/about.html#system-architecture).
+
+ **Note**: If running Secure XGBoost in simulation mode, pass in `verify=False` to the `attest()` function.
+
+ ```python
+ # Generate a key and use it to encrypt data
+ KEY_FILE = "key.txt"
+ xgb.generate_client_key(KEY_FILE)
+ xgb.encrypt_file("demo/data/agaricus.txt.train", "demo/data/train.enc", KEY_FILE)
+ xgb.encrypt_file("demo/data/agaricus.txt.test", "demo/data/test.enc", KEY_FILE)
+
+ # Initialize client and connect to enclave
+ xgb.init_client(user_name="user1",
+     sym_key_file="key.txt",
+     priv_key_file="config/user1.pem",
+     cert_file="config/user1.crt")
+ xgb.init_server(enclave_image="build/enclave/xgboost_enclave.signed")
+
+ # Remote attestation to authenticate enclave
+ # If running in simulation mode, pass in `verify=False` below
+ xgb.attest(verify=True)
+
+ # Load the encrypted data and associate it with your user
+ dtrain = xgb.DMatrix({"user1": "demo/data/train.enc"})
+ dtest = xgb.DMatrix({"user1": "demo/data/test.enc"})
+
+ params = {
+     "objective": "binary:logistic",
+     "gamma": "0.1",
+     "max_depth": "3"
+ }
+
+ # Train a model
+ num_rounds = 5
+ booster = xgb.train(params, dtrain, num_rounds)
+
+ # Get encrypted predictions and decrypt them
+ predictions, num_preds = booster.predict(dtest)
+ ```
+
+ ## Documentation
+ For additional tutorials and more details on build parameters and usage, please refer to the [documentation](https://secure-xgboost.readthedocs.io/en/latest/).

- ## Contact
- If you would like to know more about our project or have questions, please contact us at:
- * Rishabh Poddar ([email protected])
- * Chester Leung ([email protected])
- * Wenting Zheng ([email protected])
+ ## Getting Involved
+ * [email protected]: For questions and general discussion
+ * [GitHub Issues](https://github.com/mc2-project/secure-xgboost/issues): For bug reports and feature requests.
+ * [Pull Requests](https://github.com/mc2-project/secure-xgboost/pulls): For code contributions.
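Note on the "Data-Oblivious Algorithms" section this commit adds to README.md: the idea of producing an identical sequence of memory accesses regardless of the input can be illustrated with a linear-scan lookup. This is a minimal Python sketch of the general technique, not Secure XGBoost's implementation. The function name `oblivious_lookup` and the `trace` parameter are hypothetical, and Python itself gives no constant-time guarantees, so the sketch only demonstrates the access pattern.

```python
def oblivious_lookup(values, secret_index, trace=None):
    """Return values[secret_index] while touching every slot in the same order.

    `values` must contain integers. `trace`, if given, records which slots
    were read so that the access pattern can be inspected.
    """
    result = 0
    for i, v in enumerate(values):
        if trace is not None:
            trace.append(i)  # every slot is read, whatever secret_index is
        # mask is -1 (all bits set) for the wanted slot and 0 otherwise,
        # so the selection uses arithmetic instead of a data-dependent branch.
        mask = -(i == secret_index)
        result |= v & mask
    return result


if __name__ == "__main__":
    data = [7, 3, 9, 4]
    first, second = [], []
    print(oblivious_lookup(data, 0, first), oblivious_lookup(data, 2, second))
    print("same access pattern:", first == second)
```

A direct `values[secret_index]` read touches one slot whose address depends on the secret, whereas the scan reads all slots for every query. That extra work is an instance of the performance cost the README mentions.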

doc/build.rst (+8 -2)

@@ -28,8 +28,14 @@ to open an issue on `GitHub <https://github.com/mc2-project/secure-xgboost/issue
  Installing the Open Enclave SDK
  *******************************

- 1. Install the Open Enclave SDK (v0.8 or higher) and the Intel SGX DCAP driver.
- If building on an SGX-enabled machine, follow the instructions `here <https://github.com/openenclave/openenclave/blob/master/docs/GettingStartedDocs/install_oe_sdk-Ubuntu_18.04.md>`_.
+ 1. Install the Open Enclave SDK (v0.9.0) and the Intel SGX DCAP driver.
+ If building on an SGX-enabled machine, follow the instructions `here <https://github.com/openenclave/openenclave/blob/master/docs/GettingStartedDocs/install_oe_sdk-Ubuntu_18.04.md>`_.
+
+ **Note**: In step 3 of the instructions, make sure that you install Open Enclave version 0.9.0 by specifying the version
+
+ .. code-block:: bash
+
+ sudo apt -y install clang-7 libssl-dev gdb libsgx-enclave-common libsgx-enclave-common-dev libprotobuf10 libsgx-dcap-ql libsgx-dcap-ql-dev az-dcap-client open-enclave=0.9.0

  .. note:: You may also build the SDK in "simulation mode" on a machine without SGX support (e.g., for local development and testing). To build in simulation mode, follow the instructions `here <https://github.com/openenclave/openenclave/blob/master/docs/GettingStartedDocs/install_oe_sdk-Simulation.md>`_ instead. Notably, these instructions require that you skip the driver installation step.
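Note on simulation mode: the README in this commit says to pass `verify=False` to `attest()` when running in simulation mode. One way to keep that choice in a single place is sketched below. It assumes a hypothetical environment variable `SXGB_SIMULATE` that is not defined by Secure XGBoost, and the helper names are illustrative only. The only library detail relied on is the `verify` keyword of `attest()` shown in the README.

```python
import os


def simulation_requested(env=None):
    """Report whether the (hypothetical) SXGB_SIMULATE variable is switched on."""
    env = os.environ if env is None else env
    return env.get("SXGB_SIMULATE", "").lower() in {"1", "on", "true"}


def attest_kwargs(simulate):
    """Keyword arguments for attest(): skip verification only in simulation mode."""
    return {"verify": not simulate}


# Usage (requires the securexgboost package built from this repository):
#   import securexgboost as xgb
#   xgb.attest(**attest_kwargs(simulation_requested()))
```

Keeping verification on by default means an unset variable never silently disables remote attestation on real hardware.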
