Storage options | Multinode

In any distributed system, processes must communicate with each other. Examples of this in the multinode framework include functions returning outputs directly to parent processes, and jobs storing outputs of individual executions.

Sometimes, however, it is more appropriate to store these outputs in a data structure that is not tied to any single function call or job execution. In these situations, you can use either multinode's built-in dict class or an external managed data store.

Multinode's dict

If you need a key-value store that can be accessed by any process in the application, then the multinode dict is the path of least resistance.

Here is an example, taken from the section on scheduled tasks.

from datetime import timedelta

import multinode as mn
from fastapi import FastAPI
import uvicorn

# This dict is accessible from the scheduled task AND the service.
summaries_dict = mn.get_dict(name="tweet_summaries")

@mn.scheduled_task(period=timedelta(days=1))
def extract_summaries_from_last_day():
    tweets = scrape_tweets()
    for tweet in tweets:
        summary = summarise(tweet)
        summaries_dict[summary.subject] = summary.content

app = FastAPI()

@app.get("/tweet_summaries")
def get_summary_of_most_recent_tweet(subject):
    return summaries_dict[subject]

@mn.service(port=80)
def api():
    uvicorn.run(app, host="0.0.0.0", port=80)

This dict behaves very much like a Python dict.

my_dict = mn.get_dict(name="name")

my_dict["a"] = 1
my_dict["b"] = 2
my_dict["b"] = 3
my_dict["c"] = 4

print(my_dict["c"])  # 4

del my_dict["a"]

print("a" in my_dict)  # False
print("b" in my_dict)  # True

for key in my_dict.keys():
    print(key)  # "b", "c"

print(my_dict.get("a"))  # None
print(my_dict["a"])  # raises KeyError

It also supports certain atomic update operations.

my_dict["x"] = 4
my_dict["x"] += 2  # increments atomically

# Only updates if the current value is 6
my_dict.update("x", new_value=9, expected_current_value=6)

# Only sets a value if the key "y" does not currently exist
my_dict.set_if_absent("y", value=10)
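
To make the semantics of these conditional operations concrete, the sketch below mimics them with a small in-process class. It is only an illustration: `LocalAtomicDict` is a hypothetical stand-in, it is not how multinode implements its dict, and the boolean return values are an assumption made for this sketch rather than documented behaviour.

```python
import threading


class LocalAtomicDict:
    """Hypothetical in-process stand-in; NOT multinode's implementation."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value

    def update(self, key, new_value, expected_current_value):
        # Compare-and-set: write only if the stored value matches the expectation.
        # Returning a bool is an assumption made for this sketch.
        with self._lock:
            if key in self._data and self._data[key] == expected_current_value:
                self._data[key] = new_value
                return True
            return False

    def set_if_absent(self, key, value):
        # Write only if the key does not exist yet; the check and the write
        # happen under one lock, so no other thread can slip in between.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


d = LocalAtomicDict()
d["x"] = 6
first = d.update("x", new_value=9, expected_current_value=6)   # applied: x was 6
second = d.update("x", new_value=1, expected_current_value=6)  # skipped: x is now 9
created = d.set_if_absent("y", value=10)  # "y" did not exist, so it is set
ignored = d.set_if_absent("y", value=99)  # "y" exists, so it keeps its value
```

The point of both operations is that the check and the write happen as one step, so two processes racing on the same key cannot both believe their write was applied.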

External data stores

If the multinode dict is insufficient for your use case, then consider using a managed storage solution from an external vendor.

Postgres: Supabase, Amazon Aurora
Mongo: MongoDB Atlas
Kafka: Confluent
Object storage: Amazon S3

When working with external data stores, remember to include the access credentials in the environment variables for your multinode deployment.
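
As a rough sketch of what this can look like in application code, the function below reads Postgres connection settings from environment variables and fails early if any are missing. The variable names (`POSTGRES_HOST` and so on) and the helper `load_postgres_settings` are hypothetical; use whatever names you configure for your own deployment.

```python
import os


def load_postgres_settings(env=os.environ):
    """Collect connection settings from environment variables.

    The variable names are hypothetical examples, not a multinode convention.
    """
    required = ["POSTGRES_HOST", "POSTGRES_USER", "POSTGRES_PASSWORD"]
    missing = [name for name in required if name not in env]
    if missing:
        # Fail at startup rather than on the first query.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env["POSTGRES_HOST"],
        "port": int(env.get("POSTGRES_PORT", "5432")),  # optional, defaults to 5432
        "user": env["POSTGRES_USER"],
        "password": env["POSTGRES_PASSWORD"],
    }
```

The resulting settings can then be passed to the client library of your choice. Reading credentials from the environment keeps them out of the source code.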

Coming soon

In future, we may provide utilities to help you deploy some of the more popular managed data stores more seamlessly.