Concept and FAQs¶
There are a few terms used in mosec.
- worker: a Python process that executes the forward method (inherits from mosec.Worker)
- stage: one processing unit in the pipeline; each stage contains several worker replicas
  - also known as Runtime in the code
  - each stage retrieves the data from the previous stage and passes the result to the next stage
  - retrieved data will be deserialized by the Worker.deserialize_ipc method
  - data to be passed will be serialized by the Worker.serialize_ipc method
- ingress/egress: the first/last stage in the pipeline
  - ingress gets data from the client, while egress sends data to the client
  - data will be deserialized by the ingress Worker.deserialize method and serialized by the egress Worker.serialize method
- pipeline: a chain of processing stages that is registered to an endpoint (default: /inference)
  - a server can have multiple pipelines; check the multi-route example
- dynamic batching: batch requests until either the max batch size or the max wait time is reached
- controller: a Rust tokio thread that:
  - reads from the previous queue to get new tasks
  - sends tasks to the ready-to-process worker via the Unix domain socket
  - receives results from the worker
  - sends the tasks to the next queue
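The dynamic batching rule above (stop at the max batch size or the max wait time, whichever comes first) can be illustrated with a small standard-library sketch. This is not mosec's implementation, which lives in the Rust controller; the function name collect_batch, the parameter max_wait_seconds, and the choice to start the timer when the first request arrives are all assumptions made for illustration.

```python
import queue
import time


def collect_batch(tasks, max_batch_size, max_wait_seconds):
    """Gather up to max_batch_size items, waiting at most max_wait_seconds.

    Assumption: the wait timer starts once the first item has arrived.
    """
    batch = [tasks.get()]  # block until there is at least one request
    deadline = time.monotonic() + max_wait_seconds
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # max wait time reached
        try:
            batch.append(tasks.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived before the deadline
    return batch


# Five pending requests, batched with a size limit of four.
pending = queue.Queue()
for request_id in range(5):
    pending.put(request_id)

first = collect_batch(pending, max_batch_size=4, max_wait_seconds=0.01)
second = collect_batch(pending, max_batch_size=4, max_wait_seconds=0.01)
```

Here the first batch is closed by the size limit, while the second holds only the leftover request and is closed by the wait time.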
FAQs¶
How to raise an exception?¶
Use the raise keyword with an exception from mosec.errors. Raising any other exception will be treated as a “500 Internal Server Error”.
If a request raises any exception, the error is returned to the client directly, without going through the remaining stages.
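As a minimal sketch of this, a worker can raise ValidationError from mosec.errors inside forward when the input is not acceptable. The worker name CheckInput and the "text" field are hypothetical, and the fallback classes in the except branch are stand-ins added only so the snippet can be run without mosec installed.

```python
try:
    from mosec import Worker
    from mosec.errors import ValidationError
except ImportError:
    # Stand-ins (not part of mosec) so the sketch runs without the library.
    class Worker:
        pass

    class ValidationError(Exception):
        pass


class CheckInput(Worker):
    """Hypothetical worker that rejects requests without a "text" field."""

    def forward(self, data):
        if "text" not in data:
            # Raised with a mosec.errors exception, so the client gets a
            # specific error rather than a generic 500 response.
            raise ValidationError("missing required field: text")
        return data["text"]
```

Raising a plain ValueError in the same place would instead fall under the "500 Internal Server Error" case described above.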
How to change the serialization/deserialization methods?¶
Let the ingress/egress worker inherit a suitable mixin, such as MsgpackMixin.
Note
The inheritance order matters in Python. Check multiple inheritance for more information.
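To see why the order matters, here is a plain-Python sketch with no mosec dependency. BaseWorker and CustomMixin are hypothetical stand-ins for a worker base class and a serialization mixin; the point is only that Python resolves serialize from the leftmost base class that defines it.

```python
class BaseWorker:
    """Hypothetical stand-in for a worker base class with a default serializer."""

    def serialize(self, data):
        return f"default:{data}"


class CustomMixin:
    """Hypothetical stand-in for a serialization mixin."""

    def serialize(self, data):
        return f"custom:{data}"


class MixinFirst(CustomMixin, BaseWorker):
    pass  # CustomMixin is earlier in the MRO, so its serialize is used


class MixinLast(BaseWorker, CustomMixin):
    pass  # BaseWorker is earlier in the MRO, so the mixin is never reached


mixin_first_result = MixinFirst().serialize("x")
mixin_last_result = MixinLast().serialize("x")
mro_names = [cls.__name__ for cls in MixinFirst.__mro__]
```

By the same reasoning, a mixin such as MsgpackMixin would presumably need to be listed before the worker base class for its methods to take effect.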
You can also implement the serialize/deserialize methods directly in your ingress/egress worker.
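A minimal sketch of the direct approach, assuming a single-stage pipeline where one worker acts as both ingress and egress. The class name PlainTextEcho and the plain UTF-8 text format are illustrative choices, and the fallback Worker class is a stand-in added only so the snippet runs without mosec installed.

```python
try:
    from mosec import Worker
except ImportError:
    # Stand-in (not part of mosec) so the sketch runs without the library.
    class Worker:
        pass


class PlainTextEcho(Worker):
    """Hypothetical worker that exchanges plain UTF-8 text with the client."""

    def deserialize(self, data: bytes) -> str:
        # Ingress side: raw request body -> Python object.
        return data.decode("utf-8")

    def forward(self, data: str) -> str:
        return data.upper()

    def serialize(self, data: str) -> bytes:
        # Egress side: Python object -> raw response body.
        return data.encode("utf-8")
```

In a multi-stage pipeline, only deserialize on the first stage and serialize on the last stage would be involved in client traffic; the stages in between exchange data through the IPC serialization methods.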