Announcing Qwak Feature Store support for MongoDB data sources

Exciting news: Qwak Feature Store now supports MongoDB data sources!
Pavel Klushin
Head of Solution Architecture at Qwak
August 3, 2021

Qwak Feature Store provides a unified store for features during training and real-time inference without the need to write additional code or create manual processes to keep features consistent. 

Qwak Feature Store supports several ways to ingest features, including batch, streaming, and non-materialized features. We recently added support for MongoDB as a batch data source.

The Data Source connectors provide you with a consistent data source interface for any database, and they create a standard way to combine stream and batch data sources for Feature Transformations.

Integrating MongoDB and Qwak Feature Store

Defining a Feature Set enables you to create features from your analytical data: when calculating feature values, Qwak will simply read from the underlying data source.

For example, in a fraud detection model use case, we might have two values to pull from the MongoDB data source:

  • Average transaction amount per customer - avg_amount
  • Standard deviation of the transaction amount per customer - stddev_amount

And two from streaming events in Kafka:

  • Last transaction amount
  • Last transaction time
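
To make the four features concrete, here is a small plain-Python sketch of what each one means. It is only an illustration, not how Qwak computes them: the sample records, the function names, and the output keys for the two streaming features are all hypothetical, and the standard deviation is assumed to be the sample standard deviation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

# Hypothetical transaction records: (user_id, amount, timestamp).
transactions = [
    ("u1", 10.0, datetime(2021, 8, 1, 9, 0)),
    ("u1", 20.0, datetime(2021, 8, 1, 12, 30)),
    ("u1", 30.0, datetime(2021, 8, 2, 8, 15)),
    ("u2", 5.0, datetime(2021, 8, 1, 10, 0)),
    ("u2", 15.0, datetime(2021, 8, 2, 11, 45)),
]


def batch_features(rows):
    """Per-customer aggregates, as a batch job over historical data would produce."""
    amounts = defaultdict(list)
    for user_id, amount, _ in rows:
        amounts[user_id].append(amount)
    return {
        user_id: {
            "avg_amount": mean(values),
            # Sample standard deviation needs at least two data points.
            "stddev_amount": stdev(values) if len(values) > 1 else None,
        }
        for user_id, values in amounts.items()
    }


def streaming_features(rows):
    """Latest transaction per customer, as a streaming source would keep updated."""
    latest = {}
    for user_id, amount, timestamp in rows:
        current = latest.get(user_id)
        if current is None or timestamp > current["last_transaction_time"]:
            latest[user_id] = {
                "last_transaction_amount": amount,
                "last_transaction_time": timestamp,
            }
    return latest


batch = batch_features(transactions)
stream = streaming_features(transactions)
print(batch["u1"], stream["u1"])
```

A fraud model can then compare the latest amount against the customer's historical average and spread, which is why the batch and streaming values are useful together.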

Architecture

[Architecture diagram: MongoDB and Qwak Feature Store]

How to configure MongoDB as a data source

MongoDB data source connector definition:


from qwak.feature_store.sources.data_sources import MongoSource

users_table = MongoSource(
    name='mongo_source',
    description='a mongo source description',
    date_created_column='insert_date_column',  # field holding the insertion time of each record
    hosts='',  # MongoDB host(s) to connect to
    username_secret_name='qwak-mongodb-user',  # key used to obtain the actual username from Qwak secrets
    password_secret_name='qwak-mongodb-pass',  # key used to obtain the actual password from Qwak secrets
    database='db_name',
    collection='collection_name',
    connection_params='authSource=admin',
)
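
If you are wondering how these fields relate to a regular MongoDB connection, the sketch below assembles them into a standard MongoDB connection URI. This is an assumption about how the pieces fit together, not a description of what Qwak does internally: `build_mongo_uri` is a hypothetical helper, and the host, username, and password values are placeholders. In Qwak the credentials are resolved from secrets rather than passed in directly.

```python
from urllib.parse import quote_plus


def build_mongo_uri(hosts, username, password, database, connection_params=""):
    """Assemble a standard MongoDB connection URI from its parts (hypothetical helper)."""
    # Credentials must be percent-encoded so characters like '@' or '/' are safe.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    uri = f"mongodb://{credentials}@{hosts}/{database}"
    if connection_params:
        # Options such as authSource go into the query string.
        uri += f"?{connection_params}"
    return uri


# Placeholder values for illustration only.
uri = build_mongo_uri(
    hosts="mongo.example.com:27017",
    username="feature_reader",
    password="p@ss/word",
    database="db_name",
    connection_params="authSource=admin",
)
print(uri)
```

Here `authSource=admin` tells MongoDB to authenticate the user against the `admin` database, even though the data is read from `db_name`.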

Register a batch feature set using the MongoDB connector:


# BatchFeatureSet, SqlFunction, and expect_column_values_to_be_between
# come from the Qwak SDK; their imports are omitted here.
BatchFeatureSet(
    name="batch_transaction_features",
    data_sources=["mongodb_transactions_history"],  # names of registered data sources
    scheduling_policy="daily",
    validations=[
        expect_column_values_to_be_between(column="amount", min_value=0, max_value=None)
    ],
    function=SqlFunction(
        """
        SELECT User_ID,
               avg(Amount) AS avg_amount,
               stddev(Amount) AS stddev_amount
        FROM mongodb_transactions_history
        GROUP BY User_ID
        """
    ),
)
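
The validation in this feature set requires `amount` to be at least 0 and sets no upper limit. The plain-Python sketch below shows that rule on a handful of values. It assumes inclusive bounds, with `None` meaning "no bound on this side", which is how the similarly named Great Expectations check behaves. `values_outside_range` is a hypothetical helper, and what Qwak does when a validation fails is not covered here.

```python
def values_outside_range(values, min_value=None, max_value=None):
    """Return the values that violate min_value <= v <= max_value (None = unbounded)."""
    violations = []
    for value in values:
        too_low = min_value is not None and value < min_value
        too_high = max_value is not None and value > max_value
        if too_low or too_high:
            violations.append(value)
    return violations


# Hypothetical transaction amounts; the negative one should be flagged.
amounts = [12.5, 0.0, 99.9, -3.0]
violations = values_outside_range(amounts, min_value=0, max_value=None)
print(violations)
```

A check like this catches bad records, such as a negative transaction amount, before they distort the averages and standard deviations computed by the SQL function.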

Qwak Feature Store helps ensure that models make accurate predictions by making the same features available for both training and for inference. 

Chat with us to see the platform live and discover how we can help simplify your AI/ML journey.
