
[Perfstress][Storage] Added Blobs perf tests #15833

Merged 16 commits on Feb 10, 2021
79 changes: 79 additions & 0 deletions sdk/storage/azure-storage-blob/tests/perfstress_tests/README.md
@@ -0,0 +1,79 @@
# Blob Performance Tests

In order to run the performance tests, the `azure-devtools` package must be installed. This is done as part of the `dev_requirements`.
Start by creating a new virtual environment for your perf tests. This will need to be a Python 3 environment, preferably >=3.7.
Note that tests for T1 and T2 SDKs cannot be run from the same environment, and will need to be set up separately.

### Setup for test resources

These tests will run against a pre-configured Storage account. The following environment variable will need to be set for the tests to access the live resources:
```
AZURE_STORAGE_CONNECTION_STRING=<live storage account connection string>
```
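On a bash-style shell the variable can be exported before launching the tests; the connection string value below is a placeholder, not a real account (on Windows `cmd`, use `set` instead of `export`):

```shell
# Placeholder value -- substitute the connection string of your own storage account.
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net"

# Confirm the variable is visible to child processes; the tests read it from the environment.
[ -n "$AZURE_STORAGE_CONNECTION_STRING" ] && echo "connection string is set"
```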

### Setup for T2 perf test runs

```cmd
(env) ~/azure-storage-blob> pip install -r dev_requirements.txt
(env) ~/azure-storage-blob> pip install -e .
```

### Setup for T1 perf test runs

```cmd
(env) ~/azure-storage-blob> pip install -r dev_requirements.txt
(env) ~/azure-storage-blob> pip install -r tests/perfstress_tests/T1_legacy_tests/t1_test_requirements.txt
```

## Test commands

When `azure-devtools` is installed, you will have access to the `perfstress` command line tool, which will scan the current module for runnable perf tests. Only a specific test can be run at a time (i.e. there is no "run all" feature).

```cmd
(env) ~/azure-storage-blob> cd tests
(env) ~/azure-storage-blob/tests> perfstress
```
Using the `perfstress` command alone will list the available perf tests found. Note that the available tests discovered will vary depending on whether your environment is configured for the T1 or T2 SDK.

### Common perf command line options
These options are available for all perf tests:
- `--duration=10` Number of seconds to run as many operations (the "run" function) as possible. Default is 10.
- `--iterations=1` Number of test iterations to run. Default is 1.
- `--parallel=1` Number of tests to run in parallel. Default is 1.
- `--no-client-share` Whether each parallel test instance should use its own client, rather than sharing a single client. Default is False (shared client).
- `--warm-up=5` Number of seconds to spend warming up the connection before measuring begins. Default is 5.
- `--sync` Whether to run the tests synchronously. Default is False (async). This flag must be used for the Storage legacy tests, which do not support async.
- `--no-cleanup` Whether to keep newly created resources after test run. Default is False (resources will be deleted).
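These flags behave like ordinary `argparse` options. A hypothetical sketch of the equivalent parser wiring (the real definitions live in `azure-devtools`, and the exact spellings there may differ):

```python
import argparse

# Illustrative parser mirroring the documented common options and their defaults.
parser = argparse.ArgumentParser()
parser.add_argument("--duration", type=int, default=10)
parser.add_argument("--iterations", type=int, default=1)
parser.add_argument("--parallel", type=int, default=1)
parser.add_argument("--no-client-share", action="store_true", default=False)
parser.add_argument("--warm-up", type=int, default=5)
parser.add_argument("--sync", action="store_true", default=False)
parser.add_argument("--no-cleanup", action="store_true", default=False)

# Unspecified options fall back to the documented defaults.
args = parser.parse_args(["--duration=30", "--parallel=4", "--sync"])
print(args.duration, args.parallel, args.sync, args.no_cleanup)  # 30 4 True False
```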

### Common Blob command line options
These options are available for all Blob perf tests:
- `--size=10240` Size in bytes of data to be transferred in upload or download tests. Default is 10240.
- `--max-concurrency=1` Number of threads to concurrently upload/download a single operation using the SDK API parameter. Default is 1.
- `--max-put-size` Maximum size of data to upload in a single HTTP PUT. Default is 64*1024*1024.
- `--max-block-size` Maximum size of data in a block within a blob. Default is 4*1024*1024.
- `--buffer-threshold` Minimum block size to prevent full block buffering. Default is 4*1024*1024+1.
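To see how these sizes interact: a payload at or below `--max-put-size` goes up in a single PUT, while a larger payload is committed as blocks of at most `--max-block-size` bytes. A rough stdlib sketch of that split (illustrative arithmetic only, not SDK code):

```python
import math

def plan_upload(size, max_put_size=64 * 1024 * 1024, max_block_size=4 * 1024 * 1024):
    """Return how an upload of `size` bytes would be split, given the documented defaults."""
    if size <= max_put_size:
        return {"single_put": True, "blocks": 0}
    return {"single_put": False, "blocks": math.ceil(size / max_block_size)}

print(plan_upload(10240))              # the default --size fits in one PUT
print(plan_upload(256 * 1024 * 1024))  # 256 MiB exceeds max_put_size: 64 blocks of 4 MiB
```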

#### List Blobs command line options
This option is only available for the List Blobs tests (T1 and T2).
- `--num-blobs` Number of blobs to list. Defaults to 100.

### T2 Tests
The tests currently written for the T2 SDK:
- `UploadTest` Uploads a stream of `size` bytes to a new Blob.
- `UploadFromFileTest` Uploads a local file of `size` bytes to a new Blob.
- `UploadBlockTest` Uploads a single block of `size` bytes within a Blob.
- `DownloadTest` Downloads a stream of `size` bytes.
- `ListBlobsTest` Lists a specified number of blobs.

### T1 Tests
The tests currently written for the T1 SDK:
- `LegacyUploadTest` Uploads a stream of `size` bytes to a new Blob.
- `LegacyUploadFromFileTest` Uploads a local file of `size` bytes to a new Blob.
- `LegacyUploadBlockTest` Uploads a single block of `size` bytes within a Blob.
- `LegacyDownloadTest` Downloads a stream of `size` bytes.
- `LegacyListBlobsTest` Lists a specified number of blobs.

## Example command
```cmd
(env) ~/azure-storage-blob/tests> perfstress UploadTest --parallel=2 --size=10240
```
@@ -0,0 +1,51 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import uuid

from azure_devtools.perfstress_tests import PerfStressTest

from azure.storage.blob import BlockBlobService

class _LegacyServiceTest(PerfStressTest):
service_client = None
async_service_client = None

def __init__(self, arguments):
super().__init__(arguments)
connection_string = self.get_from_env("AZURE_STORAGE_CONNECTION_STRING")
> **Reviewer (Contributor):** Another minor nit: could we mention that it's fetching the connection string from an env variable in the README?
>
> **Author:** Great catch - forgot all about that!

if not _LegacyServiceTest.service_client or self.args.no_client_share:
_LegacyServiceTest.service_client = BlockBlobService(connection_string=connection_string)
_LegacyServiceTest.service_client.MAX_SINGLE_PUT_SIZE = self.args.max_put_size
_LegacyServiceTest.service_client.MAX_BLOCK_SIZE = self.args.max_block_size
_LegacyServiceTest.service_client.MIN_LARGE_BLOCK_UPLOAD_THRESHOLD = self.args.buffer_threshold
self.async_service_client = None
self.service_client = _LegacyServiceTest.service_client

@staticmethod
def add_arguments(parser):
super(_LegacyServiceTest, _LegacyServiceTest).add_arguments(parser)
parser.add_argument('--max-put-size', nargs='?', type=int, help='Maximum size of data uploading in single HTTP PUT. Defaults to 64*1024*1024', default=64*1024*1024)
parser.add_argument('--max-block-size', nargs='?', type=int, help='Maximum size of data in a block within a blob. Defaults to 4*1024*1024', default=4*1024*1024)
parser.add_argument('--buffer-threshold', nargs='?', type=int, help='Minimum block size to prevent full block buffering. Defaults to 4*1024*1024+1', default=4*1024*1024+1)
parser.add_argument('-c', '--max-concurrency', nargs='?', type=int, help='Maximum number of concurrent threads used for data transfer. Defaults to 1', default=1)
parser.add_argument('-s', '--size', nargs='?', type=int, help='Size of data to transfer. Default is 10240.', default=10240)
parser.add_argument('--no-client-share', action='store_true', help='Create one ServiceClient per test instance. Default is to share a single ServiceClient.', default=False)


class _LegacyContainerTest(_LegacyServiceTest):
container_name = "perfstress-legacy-" + str(uuid.uuid4())

def __init__(self, arguments):
super().__init__(arguments)

async def global_setup(self):
await super().global_setup()
self.service_client.create_container(self.container_name)

async def global_cleanup(self):
self.service_client.delete_container(self.container_name)
await super().global_cleanup()
@@ -0,0 +1,34 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

from azure_devtools.perfstress_tests import get_random_bytes, WriteStream

from ._test_base_legacy import _LegacyContainerTest


class LegacyDownloadTest(_LegacyContainerTest):
def __init__(self, arguments):
super().__init__(arguments)
self.blob_name = "downloadtest"
self.download_stream = WriteStream()

async def global_setup(self):
await super().global_setup()
data = get_random_bytes(self.args.size)
self.service_client.create_blob_from_bytes(
container_name=self.container_name,
blob_name=self.blob_name,
blob=data)

def run_sync(self):
self.download_stream.reset()
self.service_client.get_blob_to_stream(
container_name=self.container_name,
blob_name=self.blob_name,
stream=self.download_stream,
max_connections=self.args.max_concurrency)

async def run_async(self):
raise NotImplementedError("Async not supported for legacy T1 tests.")
@@ -0,0 +1,29 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

from ._test_base_legacy import _LegacyContainerTest


class LegacyListBlobsTest(_LegacyContainerTest):

async def global_setup(self):
await super().global_setup()
for i in range(self.args.num_blobs):
self.service_client.create_blob_from_bytes(
container_name=self.container_name,
blob_name="listtest" + str(i),
blob=b"")

def run_sync(self):
for _ in self.service_client.list_blobs(container_name=self.container_name):
pass

async def run_async(self):
raise NotImplementedError("Async not supported for legacy T1 tests.")

@staticmethod
def add_arguments(parser):
super(LegacyListBlobsTest, LegacyListBlobsTest).add_arguments(parser)
parser.add_argument('--num-blobs', nargs='?', type=int, help='Number of blobs to list. Defaults to 100', default=100)
@@ -0,0 +1 @@
azure-storage-blob==2.1.0
@@ -0,0 +1,28 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import uuid

from azure_devtools.perfstress_tests import RandomStream

from ._test_base_legacy import _LegacyContainerTest


class LegacyUploadTest(_LegacyContainerTest):
def __init__(self, arguments):
super().__init__(arguments)
self.blob_name = "blobtest-" + str(uuid.uuid4())
self.upload_stream = RandomStream(self.args.size)

def run_sync(self):
self.upload_stream.reset()
self.service_client.create_blob_from_stream(
container_name=self.container_name,
blob_name=self.blob_name,
stream=self.upload_stream,
max_connections=self.args.max_concurrency)

async def run_async(self):
raise NotImplementedError("Async not supported for legacy T1 tests.")
@@ -0,0 +1,28 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import uuid

from azure_devtools.perfstress_tests import get_random_bytes

from ._test_base_legacy import _LegacyContainerTest


class LegacyUploadBlockTest(_LegacyContainerTest):
def __init__(self, arguments):
super().__init__(arguments)
self.blob_name = "blobblocktest-" + str(uuid.uuid4())
self.block_id = str(uuid.uuid4())
self.data = get_random_bytes(self.args.size)

def run_sync(self):
self.service_client.put_block(
container_name=self.container_name,
blob_name=self.blob_name,
block=self.data,
block_id=self.block_id)

async def run_async(self):
raise NotImplementedError("Async not supported for legacy T1 tests.")
@@ -0,0 +1,41 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import os
import tempfile
import uuid

from azure_devtools.perfstress_tests import get_random_bytes

from ._test_base_legacy import _LegacyContainerTest


class LegacyUploadFromFileTest(_LegacyContainerTest):
temp_file = None

def __init__(self, arguments):
super().__init__(arguments)
self.blob_name = "containertest-" + str(uuid.uuid4())

async def global_setup(self):
await super().global_setup()
data = get_random_bytes(self.args.size)
with tempfile.NamedTemporaryFile(delete=False) as temp_file:
LegacyUploadFromFileTest.temp_file = temp_file.name
temp_file.write(data)

async def global_cleanup(self):
os.remove(LegacyUploadFromFileTest.temp_file)
await super().global_cleanup()

def run_sync(self):
self.service_client.create_blob_from_path(
container_name=self.container_name,
blob_name=self.blob_name,
file_path=LegacyUploadFromFileTest.temp_file,
max_connections=self.args.max_concurrency)

async def run_async(self):
raise NotImplementedError("Async not supported for legacy T1 tests.")
Empty file.
@@ -0,0 +1,78 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import uuid

from azure_devtools.perfstress_tests import PerfStressTest

from azure.core.exceptions import ResourceNotFoundError
from azure.storage.blob import BlobServiceClient as SyncBlobServiceClient
from azure.storage.blob.aio import BlobServiceClient as AsyncBlobServiceClient


class _ServiceTest(PerfStressTest):
service_client = None
async_service_client = None

def __init__(self, arguments):
super().__init__(arguments)
connection_string = self.get_from_env("AZURE_STORAGE_CONNECTION_STRING")
kwargs = {}
kwargs['max_single_put_size'] = self.args.max_put_size
kwargs['max_block_size'] = self.args.max_block_size
kwargs['min_large_block_upload_threshold'] = self.args.buffer_threshold
if not _ServiceTest.service_client or self.args.no_client_share:
_ServiceTest.service_client = SyncBlobServiceClient.from_connection_string(conn_str=connection_string, **kwargs)
_ServiceTest.async_service_client = AsyncBlobServiceClient.from_connection_string(conn_str=connection_string, **kwargs)
self.service_client = _ServiceTest.service_client
self.async_service_client = _ServiceTest.async_service_client

async def close(self):
await self.async_service_client.close()
await super().close()

@staticmethod
def add_arguments(parser):
super(_ServiceTest, _ServiceTest).add_arguments(parser)
parser.add_argument('--max-put-size', nargs='?', type=int, help='Maximum size of data uploading in single HTTP PUT. Defaults to 64*1024*1024', default=64*1024*1024)
parser.add_argument('--max-block-size', nargs='?', type=int, help='Maximum size of data in a block within a blob. Defaults to 4*1024*1024', default=4*1024*1024)
parser.add_argument('--buffer-threshold', nargs='?', type=int, help='Minimum block size to prevent full block buffering. Defaults to 4*1024*1024+1', default=4*1024*1024+1)
parser.add_argument('-c', '--max-concurrency', nargs='?', type=int, help='Maximum number of concurrent threads used for data transfer. Defaults to 1', default=1)
parser.add_argument('-s', '--size', nargs='?', type=int, help='Size of data to transfer. Default is 10240.', default=10240)
parser.add_argument('--no-client-share', action='store_true', help='Create one ServiceClient per test instance. Default is to share a single ServiceClient.', default=False)


class _ContainerTest(_ServiceTest):
container_name = "perfstress-" + str(uuid.uuid4())

def __init__(self, arguments):
super().__init__(arguments)
self.container_client = self.service_client.get_container_client(self.container_name)
self.async_container_client = self.async_service_client.get_container_client(self.container_name)

async def global_setup(self):
await super().global_setup()
await self.async_container_client.create_container()

async def global_cleanup(self):
await self.async_container_client.delete_container()
await super().global_cleanup()

async def close(self):
await self.async_container_client.close()
await super().close()


class _BlobTest(_ContainerTest):
def __init__(self, arguments):
super().__init__(arguments)
blob_name = "blobtest-" + str(uuid.uuid4())
self.blob_client = self.container_client.get_blob_client(blob_name)
self.async_blob_client = self.async_container_client.get_blob_client(blob_name)

async def close(self):
await self.async_blob_client.close()
await super().close()