Asset Sync (Retrieve External Test Recordings)

The test-proxy optionally offers integration with other git repositories for storing and retrieving recordings. This enables the proxy to work against repositories that do not emplace their test recordings directly alongside their test implementations.

Colloquially, any file that is stored externally using the asset-sync feature of the test-proxy is called an asset.

asset sync block diagram

In the context of a monorepo, this means that we store FAR less data per feature. To update recordings, the only change alongside the source code is to update the targeted tag.

With the addition of asset-sync capabilities, the test-proxy now responds to a new key in the initial /Record/Start or /Playback/Start POST request.

The x-recording-assets-file key, sent in the request body, contains the location of the assets.json within the language repo, expressed as a relative path, e.g. sdk/tables/assets.json.

The combination of the assets.json context and the original test-path allows the test-proxy to restore a set of recordings to a path, then load the recording from that newly gathered data. The path to the recording file within the external assets repo can be predictably calculated and retrieved given just the location of the assets.json within the code repo, the requested file name during playback or record start, and the properties within the assets.json itself. The diagram above has colors to show how the paths are used in context.
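As a rough illustration of the request described above, the sketch below builds (but does not send) a start request whose JSON body carries the assets.json reference. It assumes the proxy is listening on http://localhost:5000 and that the recording itself is identified by an x-recording-file body key; neither detail is stated in this document, and the recording path is hypothetical.

```python
import json
import urllib.request

# Assumption: address of a locally running test-proxy.
PROXY_URL = "http://localhost:5000"


def build_start_request(mode, recording_file, assets_file=None):
    """Build (but do not send) the POST for /Record/Start or /Playback/Start."""
    if mode not in ("Record", "Playback"):
        raise ValueError("mode must be 'Record' or 'Playback'")
    # Assumption: 'x-recording-file' is the body key naming the recording.
    body = {"x-recording-file": recording_file}
    if assets_file is not None:
        # Relative path to the assets.json within the language repo.
        body["x-recording-assets-file"] = assets_file
    return urllib.request.Request(
        url=f"{PROXY_URL}/{mode}/Start",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request(
    "Playback",
    "sdk/tables/tests/recordings/test_example.json",  # hypothetical recording path
    assets_file="sdk/tables/assets.json",
)
print(request.full_url)
print(request.data.decode("utf-8"))
# urllib.request.urlopen(request) would send it to a running proxy.
```

Leaving assets_file unset produces the request shape used when recordings live alongside the tests.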

The assets.json and how it enables external recordings

An assets.json contains targeting information for use by the test-proxy when restoring (or updating) recordings "below" a specific path.

For the azure-sdk team specifically, engineers are encouraged to place their assets.json files under a path of form sdk/<service>/<package>/assets.json

An assets.json takes the form:

{
  "AssetsRepo": "Azure/azure-sdk-assets",
  "AssetsRepoPrefixPath": "python",
  "TagPrefix": "python/core",
  "Tag": "python/core_<10-character-commit-SHA>"
}

| Property | Description |
| --- | --- |
| AssetsRepo | The full name of the external github repo storing the data. EG: Azure/azure-sdk-assets |
| AssetsRepoPrefixPath | The assets repository may want to place the content under a specific path in the assets repo. The default should be the language that the assets belong to. EG: python, net, java etc. |
| TagPrefix | <Language>/<ServiceDirectory> or <Language>/<ServiceDirectory>/<Library> or deeper if things are nested in such a manner. This exists purely for ease of recognizing your tags. |
| Tag | Initially empty. After the first push, the tag will be <TagPrefix>_<10-character-commit-SHA> |

Comments within the assets.json are allowed and maintained by the tooling. Feel free to leave notes to yourself. They will not be eliminated.
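A minimal sketch of a sanity check over these properties follows. It works on an already-parsed dictionary, since Python's json module rejects comments, so a file with comments would need a comment-tolerant parser first. The function name is hypothetical, and treating the 10-character SHA as hexadecimal is an assumption.

```python
import re

REQUIRED_KEYS = ("AssetsRepo", "AssetsRepoPrefixPath", "TagPrefix", "Tag")


def check_assets(assets):
    """Return a list of problems found in a parsed assets.json (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in assets]
    if problems:
        return problems
    tag, prefix = assets["Tag"], assets["TagPrefix"]
    # An empty Tag is valid: it stays empty until the first push.
    # Assumption: the 10-character commit SHA is hexadecimal.
    if tag and not re.fullmatch(re.escape(prefix) + r"_[0-9a-fA-F]{10}", tag):
        problems.append(f"Tag {tag!r} is not of the form {prefix}_<10-character-commit-SHA>")
    return problems


fresh = {
    "AssetsRepo": "Azure/azure-sdk-assets",
    "AssetsRepoPrefixPath": "python",
    "TagPrefix": "python/core",
    "Tag": "",
}
print(check_assets(fresh))
print(check_assets({**fresh, "Tag": "python/tables_0123456789"}))
```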

As one can see in the example image above, the test-proxy does the heavy lifting for push and pull of files to and from the assets repository.

The Tag "commit SHA" is literally the SHA of the tag being pushed. This allows us limited restore capabilities in the case of non-GC-ed accidentally-deleted tags.

How does the test-proxy relate to the assets.json?

The assets.json contains targeting information about WHERE to get recordings, but how do those recordings actually end up available on disk?

The test-proxy uses the information contained within the assets.json to restore files that are contained within the targeted tag. These files are downloaded to the user machine, under the .assets folder. A restore operation is invoked in one of two possible ways:

  1. The user explicitly calls test-proxy restore -a <path-to-assets.json>.
  2. The user's test framework provides an additional key in the BODY of the record/start or playback/start request.

Only in the above two scenarios will assets be restored. Scenario #2 is discussed in a section below.

Restore, push, reset when proxy is waiting for requests

Interactions with the external assets repository are accessible when the proxy is actively serving requests. These are available through routes:

| Route | Description |
| --- | --- |
| /Playback/Restore | Retrieve files from external git repo as targeted in the Tag from assets.json |
| /Playback/Reset | Discard pending changes and reset to the original Tag from targeted assets.json. |
| /Record/Push | Push pending changes to a new tag as targeted by assets.json. After the operation, the new recordings will be pushed and the target assets.json will be automatically updated with the new target tag. |

Each of these routes takes an assets.json argument that identifies which assets the operation should act on in the external repository.

A note about using the test-proxy on Windows + WSL

When using a Windows machine, it is technically possible to invoke tests from WSL against a windows clone. That path would appear under /mnt/c/path/to/your/repo. This is not a supported scenario, as the test-proxy shells out to git for the push/restore actions. Running a git push/pull from linux against a repo that was cloned down using a windows git client can have unexpected results. Better to avoid the situation entirely and use an entirely separate clone for work on WSL.

test-proxy CLI (asset) commands

The test-proxy also offers interactions with the external assets repository as a CLI. Invoking test-proxy --help will show the available list of commands. test-proxy <command> --help will show the help and options for an individual command. The options for a given command all take the form --<option>, for example --assets-json-path. Each option also has a single-dash abbreviation, shown in the help. For example, the abbreviation for --assets-json-path is -a.

Please note that all test-proxy asset commands should be invoked in the context of the language repository itself.

The following CLI commands are available for manipulation of assets

Restore

A restore operation is merely a test-proxy-encapsulated clone or pull operation. A given assets.json provides the target Tag and AssetsRepo.

test-proxy restore --assets-json-path <assetsJsonPath>

Reset

Reset discards local changes to the files restored for a targeted assets.json and returns the local copy to the version targeted by the Tag in that assets.json. Reset is useful when the assets were already restored, then modified (for example, re-recorded during library development), and need to go back to their original state. If there are no pending changes, reset is a no-op. Otherwise, the following prompt will be displayed:

There are pending git changes, are you sure you want to reset? [Y|N]

  • Selecting N will leave things as they are.
  • Selecting Y will discard pending changes and reset the locally cloned assets to the Tag within the targeted assets.json.

test-proxy reset --assets-json-path <assetsJsonPath>

Push

After assets have been restored and then modified (re-recorded etc.), a push will update the assets in the AssetsRepo. After the push completes, the Tag within the targeted assets.json will be updated with the new Tag. The updated assets.json will need to be committed into the language repository along with the code changes.

test-proxy push --assets-json-path <assetsJsonPath>
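For scripting, the three asset commands above share one shape, which the sketch below wraps. It uses only the verbs and the --assets-json-path option documented in this section. The helper names are hypothetical, and it assumes test-proxy is available on PATH.

```python
import subprocess

ASSET_VERBS = ("restore", "reset", "push")


def asset_command(verb, assets_json_path):
    """Return the argument list for one of the test-proxy asset commands."""
    if verb not in ASSET_VERBS:
        raise ValueError(f"unknown asset verb: {verb}")
    return ["test-proxy", verb, "--assets-json-path", assets_json_path]


def run_asset_command(verb, assets_json_path, repo_root):
    """Run the command from the language repo root, as this document advises."""
    # stdin is inherited, so the reset confirmation prompt remains answerable.
    return subprocess.run(asset_command(verb, assets_json_path), cwd=repo_root, check=True)


restore_argv = asset_command("restore", "sdk/tables/assets.json")
print(" ".join(restore_argv))
```

A typical cycle would then be restore, re-record tests, and push, with reset available to abandon local changes.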

Config Commands

When a client provides the additional body key x-recording-assets-file to /Record/Start or /Playback/Start, the test-proxy will invoke that test using external assets.

It's great that recordings are externalized, but this adds some complexity as these recordings don't live directly next to their test code anymore. The test-proxy provides the config verb to offer easy insight into interactions with the assets.json to assist with these complexities.

Currently there are two sub-verbs for test-proxy config:

  • locate
  • show

locate: Dumps which folder under the .assets folder contains your recordings. See config usage in layout section below.

show: Dumps the contents of the targeted assets.json.

# from C:/repo/azure-sdk-for-python, the root of the python language repository
test-proxy config locate --assets-json-path sdk/keyvault/azure-keyvault-keys/assets.json
# from C:/repo/azure-sdk-for-js, the root of the js language repository
test-proxy config show -a sdk/tables/data-tables/assets.json

Using asset-sync for azure sdk development

Where are my files?

Test-Proxy maintains a separate clone for each assets.json that it has interacted with. By default, these clones are located in the .assets folder just under your repo root.

+-------------------------------+
|  azure-sdk-for-python/        |
|    sdk/                       |
|      storage/                 |
| +------assets.json            |
| |    appconfiguration/        |
| | +----assets.json            |
| | |  keyvault/                |
| | |    azure-keyvault-secrets |
| | |      assets.json-------+  |
| | |    azure-keyvault-keys |  |
| | |      assets.json---+   |  |
| | |                    |   |  |
| | |.assets/            |   |  |
| | +--AuN9me8zrT/       |   |  |
| |      <sparse clone>  |   |  |
| +----5hgHKwvMaN/       |   |  |
|        <sparse clone>  |   |  |
|      AuN9me8zrT--------+   |  |
|        <sparse clone>      |  |
|      BSdGcyN2XL------------+  |
|        <sparse clone>         |
+-------------------------------+

As you run tests in recording or playback mode, the test-proxy automatically checks out the appropriate tag in each local assets repo. After running tests in record mode, the newly updated recordings will be sitting within the appropriate assets repository.

To view the changes before pushing, use one of the one-liners defined below.

I'm starting entirely fresh with no recordings, what should I do first?

All new packages in azure-sdk must externalize their recordings, so begin by creating an assets.json at your package root.

You could...

Given the relative lack of complexity present in an assets.json, manual generation is recommended unless recordings already exist, in which case a later section has you covered.

Once an assets.json with a blank Tag value is present, run your tests in Record mode as normal. Given that there is no tag present in the assets.json, the main branch will be restored from azure-sdk-assets. From there, after a successful record-mode test run, the assets will be in a test-proxy push-able state. After that first push, your tag will be populated!
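A minimal sketch of generating that starting assets.json, using the property values described earlier in this document. The function name and the python/tables example are illustrative only.

```python
import json


def new_assets_json(language, service_directory):
    """Return the contents of a fresh assets.json with an empty Tag."""
    return {
        "AssetsRepo": "Azure/azure-sdk-assets",
        "AssetsRepoPrefixPath": language,
        "TagPrefix": f"{language}/{service_directory}",
        "Tag": "",  # populated by the first test-proxy push
    }


starter = new_assets_json("python", "tables")
print(json.dumps(starter, indent=2))
```

The printed JSON can be saved as assets.json at the package root.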

My tests don't use the test-proxy at all currently, how do I externalize my recordings?

You don't. Your first step is to integrate your test framework with the test-proxy.

Refer to:

I'm a dev who uses the test-proxy currently, how do I externalize my recordings?

First, ensure that your language-specific "shim" supports the automatic addition of the x-recording-assets-file key to the test-proxy /Record/Start and /Playback/Start requests.

Use the transition script and follow the readme!

In summary, once an assets.json is present, the shim must be updated to actually send a reference to that assets.json inside the record/start or playback/start requests!
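One way a shim might discover which assets.json to reference is to walk upward from the test file until it finds one, since an assets.json governs the recordings "below" its path. The sketch below is an assumption about how such a lookup could work, not a description of any existing shim. The function name and demo layout are hypothetical.

```python
import tempfile
from pathlib import Path


def find_assets_json(test_file, repo_root):
    """Return the repo-relative path of the nearest assets.json above test_file, or None."""
    root = Path(repo_root).resolve()
    current = Path(test_file).resolve().parent
    if root != current and root not in current.parents:
        return None  # the test file is outside the repo
    while True:
        candidate = current / "assets.json"
        if candidate.is_file():
            return candidate.relative_to(root).as_posix()
        if current == root:
            return None  # reached the repo root without finding one
        current = current.parent


# Demo against a throwaway directory tree standing in for a language repo.
with tempfile.TemporaryDirectory() as repo:
    package = Path(repo, "sdk", "tables")
    (package / "tests").mkdir(parents=True)
    (package / "assets.json").write_text("{}")
    found = find_assets_json(package / "tests" / "test_example.py", repo)
    print(found)
```

The returned relative path is the value the shim would send as x-recording-assets-file.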

request with and without assets.json

What does this look like in practice?

Layout within a language repo

Once a package or service has an assets.json and a targeted tag, each language-repo test framework will automatically provide this assets.json alongside the recording file path. This will allow the test-proxy to automatically restore recordings as necessary.

In practice, this is what a code repo will look like after running playback-mode for a couple packages. Specifically, this is for the python repo.

assets store

One can see the automatically restored assets repos within the .assets folder. Each of the top-level folders within the .assets folder contains a single slice of the assets repo.

The below diagram illustrates how an individual assets.json, language repo, and assets repo relate to each other.

assets diagram

A user can use the config verb to find the location of their assets on disk! Using the assets diagram directly as a reference, we can work an example:

# from the root of the azure-sdk-for-net repo, run:
test-proxy config locate -a "sdk/confidentialledger/Azure.Security.ConfidentialLedger/assets.json"
# returns -> path/to/azure-sdk-for-net/.assets/2Km0Z8755m/net/

The config verb offers various interactions with an input assets.json path, with locate just being one of them! In all cases, all interactions with the config verb should be made in the context of the language repository, where the source for a given package resides.

A few details about context directory

The test-proxy starts in a "storage context", that is, the directory where it starts looking for recordings or assets.json files. Any recording or assets.json paths passed to it should be relative in a way that makes sense from that directory.

C:/repo/sdk-for-python/>test-proxy push -a sdk/tables/assets.json

test-proxy was not given a "context" argument. It'll use CWD as the "context" directory. When passed a relative path to an assets.json, it attempts to find that assets.json by joining the current context-directory with the relative path. CI implementations simply pass the additional argument of --storage-location=<repo root> to ensure that current directory doesn't matter.

So with the above invocation, the actual location of the assets.json would be: C:/repo/sdk-for-python/sdk/tables/assets.json.

When calling the tool, absolute paths are also supported. In that case, context directory does not matter at all.

test-proxy push -a C:/repo/sdk-for-python/sdk/tables/assets.json
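The two cases above follow ordinary path-joining rules, which the sketch below models with Python's pathlib. It is only an illustration of the behavior described here, not the test-proxy's own implementation. The function name and the C:/somewhere/else directory are hypothetical.

```python
from pathlib import PureWindowsPath


def resolve_assets_json(context_directory, assets_json_path):
    """Join a context directory with an assets.json path, Windows-style."""
    # Joining onto an absolute right-hand path discards the left-hand side,
    # so the context directory only matters for relative paths.
    return (PureWindowsPath(context_directory) / assets_json_path).as_posix()


relative = resolve_assets_json("C:/repo/sdk-for-python", "sdk/tables/assets.json")
absolute = resolve_assets_json(
    "C:/somewhere/else", "C:/repo/sdk-for-python/sdk/tables/assets.json"
)
print(relative)
print(absolute)
```

Both calls resolve to the same file, which is why the context directory does not matter when an absolute path is given.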

Fortunately, the test-proxy takes care of restore operations automatically (both record and playback mode). This means that users only really need to understand the storage context when push-ing new recordings.

Pushing new recordings

After running tests in record mode:

  1. Confirm lack of secrets (as always with recordings).
  2. test-proxy push -a <path-to-assets-json>

An additional note about using test-proxy push in codespaces

The test-proxy can be (and is) used to run tests in github codespaces. However, there is a wrinkle when pushing from a default codespaces configuration to the assets repository.

A dev (@timovv) on the azure-sdk-for-js team succinctly states the problem:

GitHub grants minimal permissions to a Codespace when it is created through creating a personal access token (PAT). By default, this PAT only grants write access to the repo that the Codespace was created from. This causes permissions issues when pushing assets to the Azure/azure-sdk-assets repo since the PAT does not grant write permission to that repo. Fortunately, we can request additional permissions through the devcontainer.json, which will give the Codespace write access to the Azure/azure-sdk-assets repo.

CodeSpaces reference about this topic.

The azure-sdk team has chosen to address this difficulty by applying the following customization to devcontainer.json for each language repo. This means that codespaces created off of the upstream repo will automagically have the correct permissions to push to azure-sdk-assets.
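The customization itself is not reproduced in this text. As a hedged sketch, the snippet below adds a repository permission request in the shape GitHub documents for Codespaces (customizations, then codespaces, then repositories) to a parsed devcontainer.json. The choice of "contents": "write" as the permission is an assumption, and the function name is hypothetical.

```python
import json

ASSETS_REPO = "Azure/azure-sdk-assets"


def with_assets_write_access(devcontainer):
    """Return a copy of a parsed devcontainer.json that requests write access to the assets repo."""
    updated = json.loads(json.dumps(devcontainer))  # deep copy of plain JSON data
    repositories = (
        updated.setdefault("customizations", {})
        .setdefault("codespaces", {})
        .setdefault("repositories", {})
    )
    permissions = repositories.setdefault(ASSETS_REPO, {}).setdefault("permissions", {})
    # Assumption: pushing recordings needs write access to repository contents.
    permissions["contents"] = "write"
    return updated


example = with_assets_write_access({"name": "example"})
print(json.dumps(example, indent=2))
```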

Note: Codespaces created on forks do not magically gain write permissions to azure-sdk-assets.

To push from a codespace on a fork, devs will need to set GIT_TOKEN themselves to a PAT that has write access to azure-sdk-assets.

Recordings Growth

The test-proxy has no context or knowledge of which files are present in each tag. It only knows how to restore an assets.json and attempt to start playback given a relative path. With this being the case, azure-sdk devs should pay attention to the contents of these assets directories, as there is no mechanism to clean up unused recordings.

Use test-proxy config locate -a path/to/assets.json from the base of your language repo to discover the folder under .assets where recordings will be stored:

C:/repo/azure-sdk-for-python [hotfix/resolve-failing-nightly-datalake]|>test-proxy config locate -a ./sdk/storage/azure-storage-blob
Running proxy version is Azure.Sdk.Tools.TestProxy 20240610.1
git --version
C:/repo/azure-sdk-for-python/./.assets/yoPImn7QKL/python # <-- cd here to find all test recordings

Most devs will only update one or two recordings as they adjust features, meaning that it can be difficult to tell which recordings are actually utilized by current test code. The easiest way to find un-utilized files is to cd into the assets directory and delete every recording prior to a clean re-run of all tests in record mode.
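To see exactly which recordings disappeared after such a clean re-record, one option is to snapshot the file list before deleting and compare it with the list afterwards. The sketch below assumes recordings are .json files under the directory reported by config locate. The helper names and example file names are hypothetical.

```python
from pathlib import Path


def list_recordings(assets_directory):
    """Return the relative paths of all .json files under the assets directory."""
    root = Path(assets_directory)
    return {
        path.relative_to(root).as_posix()
        for path in root.rglob("*.json")
        if ".git" not in path.parts  # skip git metadata in the clone
    }


def stale_recordings(before, after):
    """Recordings that existed before the clean re-record but were not regenerated."""
    return sorted(before - after)


# before: snapshot taken prior to deleting; after: snapshot following the re-record.
before = {"recordings/test_create.json", "recordings/test_removed_feature.json"}
after = {"recordings/test_create.json"}
print(stale_recordings(before, after))
```

In practice both snapshots would come from list_recordings called on the located assets directory.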

I am getting weird errors out of my test-proxy operations

If you think that the test-proxy has somehow gotten itself into a weird "in-between" state that it can't automatically dig itself out of, you have a couple options.

Reset it

The first, most foolproof, and most destructive of options.

test-proxy reset --assets-json-path <assetsJsonPath>

This will force the locally cloned assets to align with the assets.json that has been targeted.

Attempt to manually resolve

A new tag is pushed with each test-proxy push invocation. There should be no such thing as merge conflicts when automatically pushing up a new tag. However, if you wish to manually resolve instead of discarding current state, cd into the assets repo using the config locate command discussed above.

Once there, use standard git operations to resolve your issue.

For help with this external to Microsoft, file an issue against this repo with the question label. Within Microsoft, please ping the test-proxy teams channel for additional context and assistance.