Commit da60a42

GrahamMThomas (Graham Thomas) and scbedd authored
Initial Creation of azure-health-deidentification Dataplane SDK (Azure#36041)
* Initial commit of Health.Deidentification dataplane
* Use MI instead of SAS
* Regenerates with Plaintext
* Adds rest of tests
* First attempt patch
* Patch Attempt #2
* Patch Attempt #3
* Creates base recordings
* Fixes sanitizers; Test replay functioning
* Creates all sync samples
* Creates all async samples
* Adds description in readme
* Adds tsplocation
* Checkpoint
* Executes test recording migration
* Adds pipeline yamls
* Updates ci.yml triggers
* Removes ArtifactName from ci.yaml
* Fixes analysis failures
* Fixes analysis failures 2
* Update sdk/healthdataaiservices/ci.yml

  Co-authored-by: Scott Beddall <[email protected]>
* Updates test.yml
* Uniquifier default to false for pipelines
* Updates from feedback
* Updates from feedback 2

---------

Co-authored-by: Graham Thomas <[email protected]>
Co-authored-by: Scott Beddall <[email protected]>
1 parent cf6238a commit da60a42

64 files changed: +8935 -0 lines changed

.github/CODEOWNERS

+4
Original file line numberDiff line numberDiff line change
@@ -412,6 +412,10 @@
412412
# PRLabel: %Cognitive - Text Analytics
413413
/sdk/textanalytics/ @quentinRobinson @wangyuantao
414414

415+
# ServiceLabel: %Health Deidentification
416+
# PRLabel: %Health Deidentification
417+
/sdk/healthdataaiservices/ @GrahamMThomas @danielszaniszlo
418+
415419
# AzureSdkOwners: @YalinLi0312
416420
# ServiceLabel: %Cognitive - Form Recognizer
417421
# ServiceOwners: @bojunehsu @vkurpad

.vscode/cspell.json

+14
@@ -244,6 +244,7 @@
244244
"guids",
245245
"hanaonazure",
246246
"hdinsight",
247+
"healthdataaiservices",
247248
"heapq",
248249
"hexlify",
249250
"himds",
@@ -402,6 +403,7 @@
402403
"unpad",
403404
"unpadder",
404405
"unpartial",
406+
"uniquifier",
405407
"unredacted",
406408
"unseekable",
407409
"unsubscriptable",
@@ -440,6 +442,7 @@
440442
"BUILDID",
441443
"documentdb",
442444
"chdir",
445+
"radiculopathy",
443446
"reqs",
444447
"rgpy",
445448
"swaggertosdk",
@@ -1840,6 +1843,17 @@
18401843
"words": [
18411844
"dcid"
18421845
]
1846+
},
1847+
{
1848+
"filename": "sdk/healthdataaiservices/azure-health-deidentification/**",
1849+
"words": [
1850+
"deid",
1851+
"deidservices",
1852+
"deidentification",
1853+
"healthdataaiservices",
1854+
"deidentify",
1855+
"deidentified"
1856+
]
18431857
}
18441858
],
18451859
"allowCompoundWords": true
@@ -0,0 +1,9 @@
1+
# Release History
2+
3+
## 1.0.0b1 (1970-01-01)
4+
5+
- Initial version
6+
7+
### Features Added
8+
9+
- Initial Code
@@ -0,0 +1,21 @@
1+
Copyright (c) Microsoft Corporation.
2+
3+
MIT License
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.
@@ -0,0 +1,7 @@
1+
include *.md
2+
include LICENSE
3+
include azure/health/deidentification/py.typed
4+
recursive-include tests *.py
5+
recursive-include samples *.py *.md
6+
include azure/__init__.py
7+
include azure/health/__init__.py
@@ -0,0 +1,108 @@
1+
2+
3+
# Azure Health Deidentification client library for Python
4+
Azure.Health.Deidentification is a managed service that enables users to tag, redact, or surrogate health data.
5+
6+
## Getting started
7+
8+
### Install the package
9+
10+
```bash
11+
python -m pip install azure-health-deidentification
12+
```
13+
14+
#### Prerequisites
15+
16+
- Python 3.8 or later is required to use this package.
17+
- You need an [Azure subscription][azure_sub] to use this package.
18+
- An existing Azure Health Deidentification instance.
19+
#### Create with an Azure Active Directory Credential
20+
To use an [Azure Active Directory (AAD) token credential][authenticate_with_token],
21+
provide an instance of the desired credential type obtained from the
22+
[azure-identity][azure_identity_credentials] library.
23+
24+
To authenticate with AAD, you must first [pip][pip] install [`azure-identity`][azure_identity_pip].
25+
26+
After setup, you can choose which type of [credential][azure_identity_credentials] from azure.identity to use.
27+
As an example, [DefaultAzureCredential][default_azure_credential] can be used to authenticate the client:
28+
29+
Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables:
30+
`AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET`
31+
32+
Use the returned token credential to authenticate the client:
33+
34+
```python
35+
>>> from azure.health.deidentification import DeidentificationClient
36+
>>> from azure.identity import DefaultAzureCredential
37+
>>> client = DeidentificationClient(endpoint='<endpoint>', credential=DefaultAzureCredential())
38+
```
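The environment-variable step above can be checked before the client is constructed. The following is a minimal sketch, assuming only the three variable names listed in this README; `missing_credential_env_vars` is a hypothetical helper and is not part of `azure-identity`. Note that `DefaultAzureCredential` can also fall back to other sign-in mechanisms, so an empty result here is sufficient for the service-principal path but not strictly necessary for authentication in general.

```python
import os
from typing import List, Mapping, Optional

# The three variables this README names for an AAD application.
REQUIRED_ENV_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_credential_env_vars(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; it is not part of azure-identity.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Usage: report anything that still needs to be set in the current process.
missing = missing_credential_env_vars()
if missing:
    print("Not set: " + ", ".join(missing))
```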
39+
40+
## Key concepts
41+
42+
**Operation Modes**
43+
- Tag: Returns the offset, length, and PHI category of each detected text span.
44+
- Redact: Returns the output text with each detected span replaced by a placeholder, e.g. `[name]`.
45+
- Surrogate: Returns the output text with each detected span replaced by a synthetic value.
46+
  - Input: `My name is John Smith`
47+
  - Output: `My name is Tom Jones`
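To make the relationship between the Tag and Redact modes concrete, here is a small local sketch that turns tagged spans into redacted text. It does not call the service. The `Span` class is a hypothetical stand-in rather than the SDK's model type, and it assumes offsets and lengths count Python string characters and that spans do not overlap; the service may use different units.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for one tagged entity; not the SDK's model type."""

    offset: int
    length: int
    category: str


def redact(text: str, spans: Iterable[Span]) -> str:
    """Replace each tagged span with a `[category]` placeholder."""
    result = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for span in sorted(spans, key=lambda s: s.offset, reverse=True):
        placeholder = "[" + span.category.lower() + "]"
        result = result[: span.offset] + placeholder + result[span.offset + span.length :]
    return result


text = "My name is John Smith"
spans = [Span(offset=11, length=10, category="Name")]
print(redact(text, spans))  # prints: My name is [name]
```

Surrogate mode would substitute a synthetic value in place of the placeholder; generating realistic surrogates is the service's job and is not sketched here.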
48+
49+
**Job Integration with Azure Storage**
50+
Instead of sending text, you can send an Azure Storage Location to the service. We will asynchronously
51+
process the list of files and output the deidentified files to a location of your choice.
52+
53+
Limitations:
54+
- Maximum file count per job: 1000 documents
55+
- Maximum file size per file: 2 MB
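The two limits above can be checked locally before a job is submitted. This is a sketch under stated assumptions: `check_job_limits` is a hypothetical helper, not an SDK function, and "2 MB" is interpreted as 2 * 1024 * 1024 bytes because the README does not specify the unit more precisely.

```python
from typing import Dict, List

MAX_FILES_PER_JOB = 1000
# Assumption: "2 MB" is read as 2 * 1024 * 1024 bytes; the README does not say which.
MAX_FILE_SIZE_BYTES = 2 * 1024 * 1024


def check_job_limits(file_sizes: Dict[str, int]) -> List[str]:
    """Return human-readable problems; an empty list means the batch fits the limits.

    `file_sizes` maps a file name to its size in bytes.
    """
    problems: List[str] = []
    if len(file_sizes) > MAX_FILES_PER_JOB:
        problems.append(f"{len(file_sizes)} files exceeds the limit of {MAX_FILES_PER_JOB}")
    for name, size in sorted(file_sizes.items()):
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{name} is {size} bytes, over the {MAX_FILE_SIZE_BYTES}-byte limit")
    return problems
```

The service remains the authority on what it accepts; a check like this only avoids submitting a job that is already known to be too large.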
56+
57+
## Examples
58+
59+
```python
60+
>>> from azure.health.deidentification import DeidentificationClient
61+
>>> from azure.identity import DefaultAzureCredential
62+
>>> from azure.core.exceptions import HttpResponseError
63+
64+
>>> client = DeidentificationClient(endpoint='<endpoint>', credential=DefaultAzureCredential())
65+
>>> try:
66+
...     pass  # write test code here
67+
... except HttpResponseError as e:
68+
...     print('service responds error: {}'.format(e.response.json()))
69+
70+
```
71+
72+
## Next steps
73+
74+
- Found a bug or have feedback? Raise an issue with the "Health Deidentification" label.
75+
76+
77+
## Troubleshooting
78+
79+
- **Unable to Access Source or Target Storage**
80+
  - Ensure you create your deid service with a system-assigned managed identity.
81+
  - Ensure your storage account has granted permissions to that managed identity.
82+
83+
## Contributing
84+
85+
This project welcomes contributions and suggestions. Most contributions require
86+
you to agree to a Contributor License Agreement (CLA) declaring that you have
87+
the right to, and actually do, grant us the rights to use your contribution.
88+
For details, visit https://cla.microsoft.com.
89+
90+
When you submit a pull request, a CLA-bot will automatically determine whether
91+
you need to provide a CLA and decorate the PR appropriately (e.g., label,
92+
comment). Simply follow the instructions provided by the bot. You will only
93+
need to do this once across all repos using our CLA.
94+
95+
This project has adopted the
96+
[Microsoft Open Source Code of Conduct][code_of_conduct]. For more information,
97+
see the Code of Conduct FAQ or contact [email protected] with any
98+
additional questions or comments.
99+
100+
<!-- LINKS -->
101+
[code_of_conduct]: https://opensource.microsoft.com/codeofconduct/
102+
[authenticate_with_token]: https://docs.microsoft.com/azure/cognitive-services/authentication?tabs=powershell#authenticate-with-an-authentication-token
103+
[azure_identity_credentials]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/identity/azure-identity#credentials
104+
[azure_identity_pip]: https://pypi.org/project/azure-identity/
105+
[default_azure_credential]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/identity/azure-identity#defaultazurecredential
106+
[pip]: https://pypi.org/project/pip/
107+
[azure_sub]: https://azure.microsoft.com/free/
108+
@@ -0,0 +1,6 @@
1+
{
2+
"AssetsRepo": "Azure/azure-sdk-assets",
3+
"AssetsRepoPrefixPath": "python",
4+
"TagPrefix": "python/healthdataaiservices/azure-health-deidentification",
5+
"Tag": "python/healthdataaiservices/azure-health-deidentification_a8eed6d322"
6+
}
@@ -0,0 +1 @@
1+
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore
@@ -0,0 +1 @@
1+
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore
@@ -0,0 +1,26 @@
1+
# coding=utf-8
2+
# --------------------------------------------------------------------------
3+
# Copyright (c) Microsoft Corporation. All rights reserved.
4+
# Licensed under the MIT License. See License.txt in the project root for license information.
5+
# Code generated by Microsoft (R) Python Code Generator.
6+
# Changes may cause incorrect behavior and will be lost if the code is regenerated.
7+
# --------------------------------------------------------------------------
8+
9+
from ._client import DeidentificationClient
10+
from ._version import VERSION
11+
12+
__version__ = VERSION
13+
14+
try:
15+
    from ._patch import __all__ as _patch_all
16+
    from ._patch import *  # pylint: disable=unused-wildcard-import
17+
except ImportError:
18+
    _patch_all = []
19+
from ._patch import patch_sdk as _patch_sdk
20+
21+
__all__ = [
22+
    "DeidentificationClient",
23+
]
24+
__all__.extend([p for p in _patch_all if p not in __all__])
25+
26+
_patch_sdk()
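The `__all__.extend(...)` line in this file merges the hand-written exports from `_patch.py` into the generated export list without duplicating names. A standalone sketch of that merge follows; `merge_exports` and the name `ExtraHelper` are hypothetical, since the contents of `_patch.py` are not visible in this commit view.

```python
from typing import List


def merge_exports(generated: List[str], patched: List[str]) -> List[str]:
    """Return generated names followed by patched names not already present.

    Mirrors the list comprehension used in the package __init__; the function
    itself is a hypothetical illustration, not code from the package.
    """
    merged = list(generated)
    merged.extend([name for name in patched if name not in merged])
    return merged


# "ExtraHelper" is an invented name used only to show the de-duplication.
print(merge_exports(["DeidentificationClient"], ["DeidentificationClient", "ExtraHelper"]))
```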
@@ -0,0 +1,103 @@
1+
# coding=utf-8
2+
# --------------------------------------------------------------------------
3+
# Copyright (c) Microsoft Corporation. All rights reserved.
4+
# Licensed under the MIT License. See License.txt in the project root for license information.
5+
# Code generated by Microsoft (R) Python Code Generator.
6+
# Changes may cause incorrect behavior and will be lost if the code is regenerated.
7+
# --------------------------------------------------------------------------
8+
9+
from copy import deepcopy
10+
from typing import Any, TYPE_CHECKING
11+
from typing_extensions import Self
12+
13+
from azure.core import PipelineClient
14+
from azure.core.pipeline import policies
15+
from azure.core.rest import HttpRequest, HttpResponse
16+
17+
from ._configuration import DeidentificationClientConfiguration
18+
from ._operations import DeidentificationClientOperationsMixin
19+
from ._serialization import Deserializer, Serializer
20+
21+
if TYPE_CHECKING:
22+
    # pylint: disable=unused-import,ungrouped-imports
23+
    from azure.core.credentials import TokenCredential
24+
25+
26+
class DeidentificationClient(
27+
    DeidentificationClientOperationsMixin
28+
):  # pylint: disable=client-accepts-api-version-keyword
29+
    """DeidentificationClient.
30+
31+
    :param endpoint: Url of your De-identification Service. Required.
32+
    :type endpoint: str
33+
    :param credential: Credential used to authenticate requests to the service. Required.
34+
    :type credential: ~azure.core.credentials.TokenCredential
35+
    :keyword api_version: The API version to use for this operation. Default value is
36+
     "2024-07-12-preview". Note that overriding this default value may result in unsupported
37+
     behavior.
38+
    :paramtype api_version: str
39+
    :keyword int polling_interval: Default waiting time between two polls for LRO operations if no
40+
     Retry-After header is present.
41+
    """
42+
43+
    def __init__(self, endpoint: str, credential: "TokenCredential", **kwargs: Any) -> None:
44+
        _endpoint = "https://{endpoint}"
45+
        self._config = DeidentificationClientConfiguration(endpoint=endpoint, credential=credential, **kwargs)
46+
        _policies = kwargs.pop("policies", None)
47+
        if _policies is None:
48+
            _policies = [
49+
                policies.RequestIdPolicy(**kwargs),
50+
                self._config.headers_policy,
51+
                self._config.user_agent_policy,
52+
                self._config.proxy_policy,
53+
                policies.ContentDecodePolicy(**kwargs),
54+
                self._config.redirect_policy,
55+
                self._config.retry_policy,
56+
                self._config.authentication_policy,
57+
                self._config.custom_hook_policy,
58+
                self._config.logging_policy,
59+
                policies.DistributedTracingPolicy(**kwargs),
60+
                policies.SensitiveHeaderCleanupPolicy(**kwargs) if self._config.redirect_policy else None,
61+
                self._config.http_logging_policy,
62+
            ]
63+
        self._client: PipelineClient = PipelineClient(base_url=_endpoint, policies=_policies, **kwargs)
64+
65+
        self._serialize = Serializer()
66+
        self._deserialize = Deserializer()
67+
        self._serialize.client_side_validation = False
68+
69+
    def send_request(self, request: HttpRequest, *, stream: bool = False, **kwargs: Any) -> HttpResponse:
70+
        """Runs the network request through the client's chained policies.
71+
72+
        >>> from azure.core.rest import HttpRequest
73+
        >>> request = HttpRequest("GET", "https://www.example.org/")
74+
        <HttpRequest [GET], url: 'https://www.example.org/'>
75+
        >>> response = client.send_request(request)
76+
        <HttpResponse: 200 OK>
77+
78+
        For more information on this code flow, see https://aka.ms/azsdk/dpcodegen/python/send_request
79+
80+
        :param request: The network request you want to make. Required.
81+
        :type request: ~azure.core.rest.HttpRequest
82+
        :keyword bool stream: Whether the response payload will be streamed. Defaults to False.
83+
        :return: The response of your network call. Does not do error handling on your response.
84+
        :rtype: ~azure.core.rest.HttpResponse
85+
        """
86+
87+
        request_copy = deepcopy(request)
88+
        path_format_arguments = {
89+
            "endpoint": self._serialize.url("self._config.endpoint", self._config.endpoint, "str"),
90+
        }
91+
92+
        request_copy.url = self._client.format_url(request_copy.url, **path_format_arguments)
93+
        return self._client.send_request(request_copy, stream=stream, **kwargs)  # type: ignore
94+
95+
    def close(self) -> None:
96+
        self._client.close()
97+
98+
    def __enter__(self) -> Self:
99+
        self._client.__enter__()
100+
        return self
101+
102+
    def __exit__(self, *exc_details: Any) -> None:
103+
        self._client.__exit__(*exc_details)
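The client stores its base URL as the template `"https://{endpoint}"` and fills in `endpoint` when a request is sent, which suggests the `endpoint` argument is expected to be a host name without a scheme. The sketch below is a simplified stand-in for that formatting step, written for illustration only; `resolve_url` is hypothetical and is not azure-core's implementation, and the `/jobs` path and `example.invalid` host are invented placeholders.

```python
from urllib.parse import urlparse

# Same template string as `_endpoint` in DeidentificationClient.__init__.
BASE_URL_TEMPLATE = "https://{endpoint}"


def resolve_url(request_url: str, endpoint: str) -> str:
    """Simplified stand-in for the URL formatting step; not azure-core's code."""
    parsed = urlparse(request_url)
    if parsed.scheme and parsed.netloc:
        # Already an absolute URL: leave it unchanged.
        return request_url
    # Relative URL: fill the template and join it with the request path.
    base = BASE_URL_TEMPLATE.format(endpoint=endpoint).rstrip("/")
    return base + "/" + request_url.lstrip("/")


print(resolve_url("/jobs", "example.invalid"))  # prints: https://example.invalid/jobs
```

Under this reading, passing an endpoint that already starts with `https://` would produce a doubled scheme, so it is worth confirming the expected endpoint format against the service documentation.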
