**docs/getting-started/architecture-and-components/offline-store.md**
Please see the [Offline Stores](../../reference/offline-stores/) reference for more details on configuring offline stores.
Please see the [Push Source](../../reference/data-sources/push.md) documentation for details on how to push features directly to the offline store in your feature store.
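For context, the offline store is selected in `feature_store.yaml`. A minimal sketch, assuming the local provider with the file-based offline store (swap the `type` and connection options for your warehouse):

```yaml
project: my_project
registry: data/registry.db
provider: local
offline_store:
  type: file
online_store:
  type: sqlite
  path: data/online_store.db
```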
**docs/how-to-guides/running-feast-in-production.md**
## Overview
After learning about Feast concepts and playing with Feast locally, you're now ready to use Feast in production.
This guide aims to help with the transition from a sandbox project to a production-grade deployment in the cloud or on-premises.
An overview of a typical production configuration is given below:
{% hint style="success" %}
**Important note:** We're trying to keep Feast modular. With the exception of the core, most of the Feast blocks are loosely connected and can be used independently. Hence, you are free to build your own production configuration.
For example, you might not have a stream source and, thus, no need to write features in real-time to an online store.
Or you might not need to retrieve online features.
Furthermore, there's no single "true" approach. As you will see in this guide, Feast usually provides several options for each problem.
To keep your online store up to date, you need to run a job that loads feature data from your feature view sources into your online store. In Feast, this loading operation is called materialization.
### 2.1. Manual materializations
The simplest way to schedule materialization is to run an **incremental** materialization using the Feast CLI:
```text
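# Illustrative commands; the feature view name and timestamps are hypothetical.
# Incremental materialization loads feature values up to the given end timestamp:
feast materialize-incremental 2022-04-02T00:00:00

# Or materialize a specific feature view over an explicit interval:
feast materialize -v driver_hourly_stats 2022-04-01T00:00:00 2022-04-02T00:00:00
```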
The timestamps above should match the interval of data that has been computed by the data transformation system.
### 2.2. Automate periodic materializations
It is up to you which orchestration/scheduler to use to periodically run `$ feast materialize`.
Feast keeps the history of materialization in its registry, so your scheduler can be as simple as a [unix cron util](https://en.wikipedia.org/wiki/Cron).
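For example, a crontab entry along these lines (purely illustrative; the schedule and repository path depend on your setup) runs an incremental materialization every hour:

```text
# Materialize new feature values at the top of every hour (illustrative).
0 * * * * cd /path/to/feature_repo && feast materialize-incremental $(date -u +\%Y-\%m-\%dT\%H:\%M:\%S)
```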
```python
feature_refs = [
    # ...
]

training_df = fs.get_historical_features(
    entity_df=entity_df,
    features=feature_refs,
).to_df()
```
This approach is the most convenient way to keep your infrastructure minimal and avoid deploying extra services.
The Feast Python SDK will connect directly to the online store (Redis, Datastore, etc), pull the feature data, and run transformations locally (if required).
The obvious drawback is that your service must be written in Python to use the Feast Python SDK.
A benefit of using a Python stack is that you can enjoy production-grade services that integrate with many existing data science tools.
To integrate online retrieval into your service, use the following code:
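The snippet below is a representative sketch; the repository path, feature references, and entity keys are illustrative and should be adapted to your project.

```python
from feast import FeatureStore

# Point the SDK at your feature repository (path is illustrative).
store = FeatureStore(repo_path="production/")

features = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```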
### 4.3. Java based Feature Server deployed on Kubernetes
For users with very latency-sensitive and high QPS use-cases, Feast offers a high-performance Java feature server.
Besides the benefits of running on the JVM, this implementation also provides a gRPC API, which guarantees good connection utilization and a small request/response body size (compared to JSON).
You will need the Feast Java SDK to retrieve features from this service. This SDK wraps all the gRPC logic for you and provides more convenient APIs.
The Java-based feature server can be deployed to a Kubernetes cluster via Helm charts in a few simple steps.
Alternatively, if you want to ingest features directly from a broker (e.g., Kafka or Kinesis), you can use the "push service", which will write to an online store and/or offline store.
This service will expose an HTTP API, or, when deployed on serverless platforms like AWS Lambda or Google Cloud Run, it can be connected directly to Kinesis or PubSub.
**docs/reference/data-sources/kafka.md**
## Description
Kafka sources allow users to register Kafka streams as data sources. Feast currently does not launch or monitor jobs to ingest data from Kafka. Users are responsible for launching and monitoring their own ingestion jobs, which should write feature values to the online store through [FeatureStore.write_to_online_store](https://rtd.feast.dev/en/latest/index.html#feast.feature_store.FeatureStore.write_to_online_store). An example of how to launch such a job with Spark can be found [here](https://github.com/feast-dev/feast/tree/master/sdk/python/feast/infra/contrib). Feast also provides functionality to write to the offline store via `FeatureStore.write_to_offline_store`.
Kafka sources must have a batch source specified. The batch source will be used for retrieving historical features. Thus users are also responsible for writing data from their Kafka streams to a batch data source such as a data warehouse table. When using a Kafka source as a stream source in the definition of a feature view, a batch source doesn't need to be specified in the feature view definition explicitly.
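As a rough sketch of what such a job does per micro-batch (the feature view name and columns are illustrative; a real job would build the DataFrame from the consumed Kafka records):

```python
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# A micro-batch of rows decoded from the Kafka topic (columns are illustrative).
batch_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": [pd.Timestamp.utcnow(), pd.Timestamp.utcnow()],
    "created": [pd.Timestamp.utcnow(), pd.Timestamp.utcnow()],
    "conv_rate": [0.85, 0.91],
})

# Make the fresh values available for low-latency serving.
store.write_to_online_store("driver_hourly_stats", batch_df)

# Optionally append the same rows to the batch source backing the feature view.
store.write_to_offline_store("driver_hourly_stats", batch_df)
```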
## Stream sources
Streaming data sources are important sources of feature values. A typical setup with streaming data looks like:
**docs/reference/data-sources/kinesis.md**
## Description
Kinesis sources allow users to register Kinesis streams as data sources. Feast currently does not launch or monitor jobs to ingest data from Kinesis. Users are responsible for launching and monitoring their own ingestion jobs, which should write feature values to the online store through [FeatureStore.write_to_online_store](https://rtd.feast.dev/en/latest/index.html#feast.feature_store.FeatureStore.write_to_online_store). An example of how to launch such a job with Spark to ingest from Kafka can be found [here](https://github.com/feast-dev/feast/tree/master/sdk/python/feast/infra/contrib); by using a different plugin, the example can be adapted to Kinesis. Feast also provides functionality to write to the offline store via `FeatureStore.write_to_offline_store`.
Kinesis sources must have a batch source specified. The batch source will be used for retrieving historical features. Thus users are also responsible for writing data from their Kinesis streams to a batch data source such as a data warehouse table. When using a Kinesis source as a stream source in the definition of a feature view, a batch source doesn't need to be specified in the feature view definition explicitly.
## Stream sources
Streaming data sources are important sources of feature values. A typical setup with streaming data looks like:
**docs/reference/data-sources/push.md**
## Description
Push sources allow feature values to be pushed to the online store and offline store in real time. This allows fresh feature values to be made available to applications. Push sources supersede the
Push sources can be used by multiple feature views. When data is pushed to a push source, Feast propagates the feature values to all the consuming feature views.
Push sources must have a batch source specified. The batch source will be used for retrieving historical features. Thus users are also responsible for pushing data to a batch data source such as a data warehouse table. When using a push source as a stream source in the definition of a feature view, a batch source doesn't need to be specified in the feature view definition explicitly.
## Stream sources
Streaming data sources are important sources of feature values. A typical setup with streaming data looks like:
4. Write stream 2 values to an online store for low latency feature serving
5. Periodically materialize feature values from the offline store into the online store for decreased training-serving skew and improved model performance
Feast allows users to push features previously registered in a feature view to the online store for fresher features. It also allows users to push batches of stream data to the offline store by directing the push there; the data is written to the offline store declared in the repository configuration used to initialize the feature store.
## Example
### Defining a push source
Note that the push schema needs to also include the entity.
```python
from feast import PushSource, ValueType, BigQuerySource, FeatureView, Feature, Field
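from feast.types import Float32  # value types used in Field definitions

# Illustrative sketch only -- names and exact constructor arguments vary by Feast release.

# Batch source backing the push source; used for historical retrieval.
driver_stats_batch = BigQuerySource(table="my_project.my_dataset.driver_stats")

# The push source itself; the pushed schema must include the entity column (driver_id).
driver_stats_push_source = PushSource(
    name="driver_stats_push_source",
    batch_source=driver_stats_batch,
)

# A feature view consuming the push source (in some releases the parameter is
# `stream_source` rather than `source`, and entities may need to be Entity objects).
fv = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    schema=[Field(name="conv_rate", dtype=Float32)],
    source=driver_stats_push_source,
)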
```
### Pushing data
Note that the `to` parameter is optional and defaults to the online store, but you can specify any of these options: `PushMode.ONLINE`, `PushMode.OFFLINE`, or `PushMode.ONLINE_AND_OFFLINE`.
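A representative sketch of pushing a batch of rows through the Python SDK (the push source name and columns are illustrative):

```python
import pandas as pd
from feast import FeatureStore
from feast.data_source import PushMode

store = FeatureStore(repo_path=".")

event_df = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": [pd.Timestamp.utcnow()],
    "created": [pd.Timestamp.utcnow()],
    "conv_rate": [0.93],
})

# Push to both stores; use PushMode.ONLINE or PushMode.OFFLINE to target only one.
store.push("driver_stats_push_source", event_df, to=PushMode.ONLINE_AND_OFFLINE)
```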
**docs/reference/feature-servers/python-feature-server.md**
### Pushing features to the online and offline stores
You can push data corresponding to a push source to the online and offline stores (note that timestamps need to be strings):
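A representative request against a locally running feature server (the push source name, columns, and port are illustrative; the `to` field is optional and defaults to `"online"`):

```bash
curl -X POST "http://localhost:6566/push" \
  -d '{
    "push_source_name": "driver_stats_push_source",
    "df": {
      "driver_id": [1001],
      "event_timestamp": ["2022-05-13 10:59:42"],
      "created": ["2022-05-13 10:59:42"],
      "conv_rate": [0.93]
    },
    "to": "online_and_offline"
  }' | jq
```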
You can also define a push mode to push stream or batch data to the online store, the offline store, or both. The feature server will throw an error if the online/offline store doesn't support the push API functionality.
The request defines the push mode via a string parameter `to`, where the options are: `["online", "offline", "online_and_offline"]`.
Feast supports registering streaming feature views and Kafka and Kinesis streaming sources. It also provides an interface for stream processing called the `Stream Processor`. An example Kafka/Spark StreamProcessor is implemented in the contrib folder. For more details, please see the [RFC](https://docs.google.com/document/d/1UzEyETHUaGpn0ap4G82DHluiCj7zEbrQLkJJkKSv4e8/edit?usp=sharing).
Please see [here](https://github.com/feast-dev/streaming-tutorial) for a tutorial on how to build a versioned streaming pipeline that registers your transformations, features, and data sources in Feast.