-
Notifications
You must be signed in to change notification settings - Fork 176
Gateway-in-server early prototype #1718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit implements most of the reverse proxying logic for gateway-in-server. It also includes a dependency injection mechanism that will allow the new gateway app to work with different repo (storage) implementations in-server and remotely. For this prototype, gateway-in-server duplicates remote gateways, i.e. all services are available both on a remote gateway and on gateway-in-server. This will be changed later. Behind the GATEWAY_IN_SERVER feature flag.
if "Upgrade" in request.headers: | ||
raise fastapi.exceptions.HTTPException( | ||
fastapi.status.HTTP_400_BAD_REQUEST, "Upgrading connections is not supported" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean websocket is not supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will be supported later
Which isn't coming soon necessarily, right? I think it's crucial to cover proxy() since it adds non-trivial logic that's going to be a pain to maintain/develop without tests. |
|
||
async def stream_response( | ||
response: httpx.Response, replica_id: str | ||
) -> AsyncGenerator[bytes, None]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think AsyncIterator[YieldType]
could be used in place of AsyncGenerator[YieldType, None]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I subjectively prefer more generic return types in public interfaces and more specific return types in private ones. AsyncGenerator
is more specific than AsyncIterator
here
@@ -166,6 +174,10 @@ def register_routes(app: FastAPI, ui: bool = True): | |||
app.include_router(gateways.router) | |||
app.include_router(volumes.root_router) | |||
app.include_router(volumes.project_router) | |||
if FeatureFlags.GATEWAY_IN_SERVER: | |||
app.include_router( | |||
service_proxy.router, prefix="/gateway/services", tags=["gateway-in-server"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/gateway/ in the path would be misleading when the server is proxying the requests. gateway-in-server should remain an internal concept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I remember we agreed that we need two separate concepts: gateways and service proxies, so I'd expect not to see "gateway-in-server" in the code as well. Currently, it's unclear what gateway is since the term is used for different things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gateway is a component of dstack used to access services and models. It can run in two modes: in-server and remotely. Each mode has it's own specifics, e.g. remote gateways need to be created, while gateway-in-server exists always, remote gateways can serve services at subdomains, while gateway-in-server can't, etc.
I think this is easier for users to understand than a new term. "Service proxy" isn't a suitable term because this component will also be used for LLMs, not just services. So it will need a more generic name, like "Proxy". And having two similar terms like "Gateway" and "Proxy" can cause confusion.
Overall, I don't see a good reason to have a common repository interface (BaseGatewayRepo) and the injection mechanism unless we're going to move remote gateway from nginx to Python proxying (which I don't think is a good idea). The server and remote gateways would need different repo interfaces, although they might share some methods. |
Even if we keep Nginx for remote gateways, two gateway modes will still have much shared logic that will benefit from a common repo interface: retrieving model mappings for the OpenAI endpoint, service and replica details for restarting SSH tunnels after gateway restarts, user tokens for authentication, backend credentials for #1631, and any other data for future features. Some repo methods will be useless for one of the gateway operation modes, so these methods will have to be empty in the respective repo implementation. This may be a complication, but the alternatives are hard repo dependencies in gateway code with lots of if-else statements, or completely different applications for remote gateways and gateway-in-server. Both seem like a greater complication to me, so I'd rather go with BaseGatewayRepo. If it turns out to be inconvenient, it won't be difficult to remove it at a later stage. |
41751db
to
4cd5911
Compare
This commit implements most of the reverse
proxying logic for gateway-in-server. It also
includes a dependency injection mechanism that
will allow the new gateway app to work with
different repo (storage) implementations in-server
and remotely.
For this prototype, gateway-in-server duplicates
remote gateways, i.e. all services are available
both on a remote gateway and on gateway-in-server.
This will be changed later.
Behind the GATEWAY_IN_SERVER feature flag.
Part of #1595
Note: unit tests will likely follow later, as it should be
easier to test with the in-memory repo implementation
that will be added later for remote gateways.