Most clients are likely to use the official Python or TypeScript SDKs, which should be closely aligned with the spec. We should add some e2e tests to ensure that an example application (a test 'host') can connect and list/use tools, especially with some custom headers.
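A minimal sketch of what such a test host could look like with the Python SDK, assuming the server is reachable over SSE. The URL, header name, and tool name in the usage comment are hypothetical placeholders, and `missing_tools` is a helper introduced here for illustration only.

```python
import asyncio


def missing_tools(expected, available):
    """Return the expected tool names that the server did not advertise."""
    return sorted(set(expected) - set(available))


async def list_tool_names(url, headers):
    """Connect over SSE with custom headers and return the advertised tool names."""
    # Imported lazily so the pure helper above works without the SDK installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(url, headers=headers) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


# Usage (hypothetical endpoint, header, and tool name):
#   names = asyncio.run(
#       list_tool_names("http://localhost:8000/sse", {"X-Example-Header": "value"})
#   )
#   assert missing_tools(["example_tool"], names) == []
```

Keeping the comparison in a plain function means the assertion logic can be unit-tested without a running server, while the connection code stays a thin wrapper around the SDK.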
We could also use this for evaluation with LangEval (related to LangWatch): take some standard questions, run them against multiple models with MCP tools enabled, and assert that we obtain the correct answers within X tokens or API calls.
TODO:
Require contributor approval before running actions on PRs
Create a new cloud instance for tests
Create a tests subdirectory
Create a Python package in there
Add langeval
Add tests
Run in CI
Workflows to test:
[docker-compose] what are the most recent log lines from Grafana?
[docker-compose] what dashboard should I look at to see container CPU?
[cloud] who is on call for team X?
[cloud] what incidents are active?
(add more)
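A rough sketch of how these workflows could be written as data-driven cases with token and API-call budgets. This is plain Python, not the LangEval API; `EvalCase`, `RunResult`, `check`, and the expected substrings are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    environment: str          # "docker-compose" or "cloud"
    question: str
    expected_substring: str   # hypothetical marker of a correct answer


@dataclass
class RunResult:
    answer: str
    total_tokens: int
    api_calls: int


# Questions come from the list above; expected substrings are placeholders.
CASES = [
    EvalCase("docker-compose", "what dashboard should I look at to see container CPU?", "dashboard"),
    EvalCase("cloud", "what incidents are active?", "incident"),
]


def check(case, result, max_tokens, max_api_calls):
    """Return a list of failure reasons; an empty list means the case passed."""
    failures = []
    if case.expected_substring.lower() not in result.answer.lower():
        failures.append("answer does not contain expected text")
    if result.total_tokens > max_tokens:
        failures.append("token budget exceeded")
    if result.api_calls > max_api_calls:
        failures.append("API call budget exceeded")
    return failures
```

Returning reasons instead of raising on the first problem makes it easier to compare models, since one run can fail on correctness and budget at the same time.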
It might be nice to test SSE as well as stdio (probably not worth parametrizing all our tests, but a single smoke test to make sure both work would be good).
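One way that single smoke test could be sketched with the Python SDK: a small context manager selects the transport, and the same check runs over each. `open_transport` and `smoke` are hypothetical helpers, and the server command and URL in the usage comment are placeholders.

```python
import asyncio
from contextlib import asynccontextmanager

TRANSPORTS = ("stdio", "sse")


@asynccontextmanager
async def open_transport(kind, target, args=()):
    """Yield (read, write) streams for the chosen transport.

    For "stdio", target is the server command; for "sse", it is the URL.
    """
    if kind == "stdio":
        from mcp import StdioServerParameters
        from mcp.client.stdio import stdio_client

        params = StdioServerParameters(command=target, args=list(args))
        async with stdio_client(params) as streams:
            yield streams
    elif kind == "sse":
        from mcp.client.sse import sse_client

        async with sse_client(target) as streams:
            yield streams
    else:
        raise ValueError(f"unknown transport: {kind!r}")


async def smoke(kind, target, args=()):
    """Initialize a session over the given transport and count the listed tools."""
    from mcp import ClientSession

    async with open_transport(kind, target, args) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return len((await session.list_tools()).tools)


# Usage (hypothetical command and URL):
#   assert asyncio.run(smoke("stdio", "example-server-binary")) > 0
#   assert asyncio.run(smoke("sse", "http://localhost:8000/sse")) > 0
```

Only the smoke check runs over both transports; the rest of the suite can stay on whichever transport is cheapest to start.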