Supporting streaming deployments with Qwak
![Supporting streaming deployments with Qwak](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d8d8_supporting-streaming.png)
Qwak now supports deploying machine learning (ML) models with an event-driven streaming architecture, using Apache Kafka to support high-throughput predictions.
This new capability lets data scientists deploy an ML model with a single click as an endpoint that receives data as a stream and outputs predictions as a stream.
Real-time ML inference at scale has become an essential part of modern applications. Although we started our deployment service with support for real-time predictions served from a web server, we see high demand among our customers for streaming-based ML predictions.
Streaming inference is useful in the following cases:
- When inference requests should be triggered by an existing stream of messages
- When you would like to decouple the caller from the model
- When you need to handle prediction service failures caused by high prediction traffic
How it works
Once a model is deployed to Qwak with the Streaming option, it is triggered whenever feature or inference-request messages arrive on the producer topic, and it pushes each prediction to a consumer topic.
![](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d4f5_Pq9gnipRqX1Kahlo6Mc2DEGwIsfVklvG07KVfnkdkuma_aW-GaQE7-SGpdmijCm-Yk6mKtd4yeoSCewYyIK6Hcw0KG3Nj-xKRKiBAzVHVLXI8A-cj14GO0vPkpPP1BndQf_cVJOd.png)
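The consume, predict, and publish flow can be illustrated with a minimal sketch. It is not Qwak's implementation: in-memory queues stand in for the Kafka topics so the example runs without a broker, and the names `serve_stream`, `predict`, `request_id`, and `features` are assumptions made for illustration.

```python
import json
import queue


def predict(features):
    # Hypothetical stand-in model: sums the numeric feature values.
    return sum(features.values())


def serve_stream(requests_topic, predictions_topic, model):
    """Drain the request queue and publish one result message per request."""
    processed = 0
    while True:
        try:
            raw = requests_topic.get_nowait()
        except queue.Empty:
            break  # nothing left to consume
        request = json.loads(raw)
        try:
            result = {
                "request_id": request["request_id"],
                "prediction": model(request["features"]),
            }
        except Exception as exc:
            # A failed prediction becomes an error message instead of
            # stopping the loop, so later requests are still served.
            result = {"request_id": request.get("request_id"), "error": str(exc)}
        predictions_topic.put(json.dumps(result))
        processed += 1
    return processed


# In-memory stand-ins for the two topics (assumption: JSON-encoded messages).
requests_topic = queue.Queue()
predictions_topic = queue.Queue()

requests_topic.put(json.dumps({"request_id": "a1", "features": {"x": 1.5, "y": 2.5}}))
requests_topic.put(json.dumps({"request_id": "a2", "features": {"x": 1, "y": "bad"}}))

processed = serve_stream(requests_topic, predictions_topic, predict)
first = json.loads(predictions_topic.get_nowait())
second = json.loads(predictions_topic.get_nowait())
print(processed, first, second)
```

Because the caller only writes to one queue and reads from another, it never talks to the model directly, which is the decoupling that streaming inference provides. In a real deployment the queues would be replaced by a Kafka consumer and producer.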
Once the model is deployed, you can track service health metrics such as error percentage, average throughput, consumed messages, consumer lag, processing lag, and errors over time, and be alerted when a metric rises above or falls below a certain threshold.
![](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d4df_OThAqDPsZRHq8HsiV28r9U9aSjP391ScbI7IM8rIdWk-Ls7QGOjoy_rEP_Kij9LRc9PO_H0UT2zJzGzpdsYSyclNUVWzV7ewmDpC1ghIkvjcxAEuWQwDcfoXowLy55NQicjamZBc.png)
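To make two of these metrics concrete, here is a small sketch of how error percentage and consumer lag could be computed and checked against thresholds. It assumes the usual Kafka definition of consumer lag (the newest offset in each partition minus the consumer group's committed offset). The `StreamHealth` class, the `check_alerts` function, and the threshold values are hypothetical and do not describe how Qwak computes or configures its alerts.

```python
from dataclasses import dataclass


@dataclass
class StreamHealth:
    consumed_messages: int
    errors: int
    latest_offsets: dict     # partition -> newest offset in the topic
    committed_offsets: dict  # partition -> last offset committed by the consumer

    @property
    def error_percentage(self):
        if self.consumed_messages == 0:
            return 0.0
        return 100.0 * self.errors / self.consumed_messages

    @property
    def consumer_lag(self):
        # Sum over partitions of messages written but not yet consumed.
        return sum(
            max(end - self.committed_offsets.get(partition, 0), 0)
            for partition, end in self.latest_offsets.items()
        )


def check_alerts(health, max_error_percentage, max_consumer_lag):
    """Return a list of human-readable alerts for metrics above their limits."""
    alerts = []
    if health.error_percentage > max_error_percentage:
        alerts.append(
            f"error percentage {health.error_percentage:.1f}% "
            f"exceeds {max_error_percentage:.1f}%"
        )
    if health.consumer_lag > max_consumer_lag:
        alerts.append(
            f"consumer lag {health.consumer_lag} exceeds {max_consumer_lag}"
        )
    return alerts


# Illustrative numbers only.
health = StreamHealth(
    consumed_messages=200,
    errors=5,
    latest_offsets={0: 1200, 1: 950},
    committed_offsets={0: 1150, 1: 950},
)
alerts = check_alerts(health, max_error_percentage=2.0, max_consumer_lag=100)
print(alerts)
```

With these numbers the error percentage (2.5%) is over its limit while the lag (50 messages) is not, so only one alert is raised.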
Getting started with Qwak Streaming deployment
Using Qwak Management console
![](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d5f2_CU05IfaSwPuBVsrZ4TVFAAYa0xIXlkCt3GVNhH4DnVrftermjVgRNqPU5Yq_UF4q9UVpJWj9Cn5rdB1uWdG3QftPCZVtz1cyQsW5H_g4qVuLkIFMj2R0eR5qOm1tjBqaob3JXN7T.png)
![](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d530_3O0wg1_t5zZtTvTk59-7St-ts-qTzExKS5dBr3bhXA1BkT3ncQapYq3S5XPVqwyy7O8DRfBoesGGNoHveHO2aXuANtV584yUGMmh-Z-EIRwTXq_MD-wdf7HPj83xYiAvWQ21Erqr.png)
Choose the number of pods and the CPU and memory size of each pod, then add the address of the Kafka bootstrap server and the consumer and producer topic names.
![](https://cdn.prod.website-files.com/64b3ee21cac9398c75e5d3ac/64b3ee21cac9398c75e5d610_DwgQL1-hdCyYOrEsnnI4YzA7YYgmTyB_hj2n-RcNDEPyuw8P-sJU3jOXnQTWl_fYtSCbx-7vifj7NazvCymRNfFf2PWAx9l5GuWxyav9pog2u0LhY-R8JOVDm104C2_5Rr-Mg3-a.png)
Qwak CLI command
Qwak streaming deployment is perfect for event-based predictions that require high-throughput, low-latency, and fault-tolerant environments.
Get started for free today!