Fauna’s query model relies on indexes to support all query patterns that do not involve looking up a document directly by its Reference. Indexes are also key to searching and sorting results. Therefore, an understanding of index creation and usage is crucial for effective Fauna development.
An index, once it is created, is considered to be ‘active’ when it contains all of its source collection’s documents, including each document’s events.
To check if an index is active, run the following query. The index is active and is ready for queries if the output is true.
Select("active", Get(Index("index_name")))
When an index is created, documents are immediately indexed if the associated collection contains up to 128 events (which include document creation, updates, and deletions).
For collections with more than 128 events, indexing is handled by a background task, and you may have to wait a short period before the index returns values. Until the indexing task is complete, the index is "inactive".
Once an index is active, all write transactions on the source collection will include writing to the index as well.
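As a minimal sketch of the creation step described above, the query below creates an index with one term and two values. The index name users_by_email, the collection users, and the fields email and name are hypothetical placeholders, not names from this article.

```
// Hypothetical names: "users_by_email", "users", and the data fields.
CreateIndex({
  name: "users_by_email",
  source: Collection("users"),
  // terms: the fields you search on
  terms: [{ field: ["data", "email"] }],
  // values: the fields returned (and sorted on) for each match
  values: [{ field: ["data", "name"] }, { field: ["ref"] }]
})
```

If the users collection already holds more than 128 events, the new index would be built in the background, so the active check shown earlier is worth running before you query it.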
When will my index be ready for queries?
The short answer: it depends.
The time it takes for an index to be active depends on a number of factors:
- number of documents in the collection
- size of the documents
- number of terms and values defined in the index
- operations involved in any defined index bindings
- amount of history kept for the documents
While the first four points are self-explanatory, it is worth taking a closer look at how and why the amount of document history matters for index build time:
Whenever a document is created or updated, Fauna stores a new version of the document along with the current transaction timestamp. Fauna indexes also retain the history of the documents for history_days amount of time (history_days is configured for the source Collection).
This is necessary for the temporality feature, which allows you to query an index at a specific timestamp with the At function.
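For illustration, a temporal read through an index might look like the sketch below. The timestamp and the term "some_term" are placeholders, and the index is assumed to have a single term. The timestamp would need to fall within the history retained for the collection.

```
// Read the index as it existed at the given (placeholder) timestamp.
At(
  Time("2021-06-01T00:00:00Z"),
  Paginate(Match(Index("index_name"), "some_term"))
)
```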
Any time a change to a document would result in the index being updated (term or value fields changed), the history of the index is updated. An index needs to track all of the history for its associated documents to find the latest version of the correct document.
As more history is accumulated per document, it takes longer to read the document and update index(es) referencing the document. The time to read further increases as the size of the collection grows.
What can I do to speed things up?
There are a few things to consider to optimize index/query performance:
- When creating an index, fields that change frequently are not a great choice to use as terms of the index. (Each update to the term requires an update to the index.)
- Set the history_days setting for your collections as low as possible. history_days is a collection field that specifies the number of days of document history that should be maintained, for all documents within the collection (default is 0 days). Once the specified number of days has elapsed, document history is removed, but the document itself is retained.
If you don't need to use the temporality features, then consider setting it to 0.
WARNING: setting history_days to null means infinite history retention. If you want no history, make sure you set it to 0.
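As a sketch of the history_days advice above, the following query sets the field to 0 on an existing collection. The collection name collection_name is a placeholder.

```
// Keep no document history for this (placeholder) collection.
// Use 0 here, not null: null means history is retained indefinitely.
Update(Collection("collection_name"), { history_days: 0 })
```

To see the current setting before changing it, Get(Collection("collection_name")) returns the collection's document, which includes history_days.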
Further Reading: