Cosmos DB: 4 Things You Should Know

How It Began

About 6 months ago, I began building a web API to track widgets through their manufacturing process. I had played around with Cosmos DB before, and was looking for an opportunity to really dig into the nitty-gritty details of exactly what a production-grade, Cosmos DB-backed application looked like, and how it performed.

Suffice it to say, it was a wild (and stressful) adventure from start to finish. Coming from a primarily SQL-dominated background, I faced a steep learning curve with the NoSQL, document-oriented database; I could rely on almost none of my prior experience. I decided to write this article in an attempt to help guide developers curious about making the leap to one of Azure’s newest offerings.

As a preface, if you are not familiar with Cosmos DB, I’d suggest this introduction before continuing. I’d also like to stress that since this was my first foray into not only NoSQL development, but also development with Cosmos DB, this is absolutely not an expert review; it is simply a reflection on my own experiences. Your mileage may vary.

TL;DR:

  • Cosmos DB is a NoSQL, partitioned, JSON document-storing database; you will need to change your paradigm in order to effectively build an application using it. Think in terms of partitions, queries, and data propagation (denormalization).
  • Partition keys are your friend; build with the anticipation they might change (they shouldn’t, but you don’t always know what the best partitioning strategy for your dataset is up front), and come up with a data migration solution in case they do. Additionally, anticipate performing all of your work within a partition, for best performance.
  • When it comes to data propagation, you have two main tools at hand: Change Feed Function App listeners, and stored procedures/triggers. Use them correctly; don’t make the mistakes I did.
  • Stored procedures can be used in more ways than one, to solve more than one type of business problem; since they are partition bounded, they can also be used for bulk data importing, as well as atomic transactions moving multiple parts around. Beware though; here there be dragons.

1. Change Your Paradigm

If you are like me, you’re most likely coming from a primarily SQL-dominated workplace and background. You’re used to building schemas with data models, focusing on normalization of data, and performing migrations when the data model or schema changes.

Throw all that out the window. You’re in NoSQL land now.

Since Cosmos DB stores JSON documents, when you save an entity to a collection, you store the entity with all of its child entity properties (watch out for circular references when serializing your entities before storage). This also means no more normalization of data. While that may make you wince, consider the performance you gain by no longer making expensive joins across tables.

Since this NoSQL database no longer cares about relationships, there are no longer any constraints (beyond having a partition key, and any unique key you decide to specify on creation of the collection). So your data migration is simply an update of the model used to serialize data before storage. No more SQL database schema migrations! Your model can now adapt and evolve much faster than before with changing business requirements.

Developer beware: this means all properties on your models must be nullable, as fields can go missing with a single model update. Your front end developer/API consumer must be made aware of each model update so they don’t get caught out when they attempt to build their UI around your models.

I’ll also say that although you can create multiple collections in a database, you typically don’t want to dedicate a collection to a single entity type. Store multiple entity types in one collection, use discriminators to distinguish between them, and use partition keys to segment groups of entities, to really harness the kind of performance Cosmos DB is capable of.
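
For example, here’s roughly what a denormalized widget document could look like, with a discriminator, a dedicated partition key property, and an embedded child entity. The field names and values here are illustrative, not from my actual models:

// An illustrative widget document; names and values are made up
var widgetDocument = {
    id: "Widget|AcmeCo|1042",   // discriminator|partitionKey|unique id
    discriminator: "Widget",    // distinguishes entity types in the collection
    partitionKey: "AcmeCo",     // dedicated partition key property
    name: "Flux Capacitor",
    stage: {                    // embedded (denormalized) child entity
        id: "Stage|AcmeCo|7",
        name: "Assembly"
    }
};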

It’s not all roses in NoSQL land, though. There is a constraint, one that I overlooked at first, and it came back to bite me in the end…

2. Partition Keys: A Warning

I did some cursory investigation into partition keys when I first delved into Cosmos DB, right after I discovered the EnableCrossPartitionQuery option for querying data in a collection. That one option gave me considerable pause. Looking back, it was a red flag, a part of the map labelled “Here, there be dragons”. Little did I know how much it would come back to bite me in the end.

Here are my key takeaways when it comes to partition keys.

  • Make the partition key field distinct from the rest of the data model (e.g. a dedicated partitionKey property). You can’t change the partition key for a collection, so if you decide to partition on something else, it’s MUCH easier to simply start storing different data in your partition key property.
  • Use part of the partition key in your entity/document ID. I started out using EF Core to interface with Cosmos DB, and one strategy it uses is to concatenate the entity discriminator (usually the object’s class name), the partition key, and a unique identifier for the entity, like so: Entity|<partitionKey>|<id> (see the sketch after this list).
  • Don’t use GUIDs for your partition keys/IDs if you can help it. I used GUIDs at first, and it made debugging, troubleshooting and acceptance testing very painful and slow. I ended up using names, code names, and identifiers coming from outside the system; these were much easier to read, and it was faster to debug/troubleshoot issues with data propagation.
  • Make sure the partition key is set on the document before you attempt to save it. If not, you’ll get weird errors back from Cosmos DB about the headers in the response not matching the request.
  • There are a billion articles out there on choosing a partition key. The short and sweet version: design it around the queries you perform most. Do you find that most of your queries filter on a particular field? Use that. For me, it was the company code of the entities I was storing; since the collection was supposed to support multiple companies, every single query was restricted to the company making it, so it made sense to partition that collection by company. Later on, I made collections entirely centered around storing widgets to be filtered in some form or fashion; these were partitioned by both company and whatever the heavily used filter was.
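
To make the ID strategy above concrete, here’s a minimal sketch. The helper is hypothetical (EF Core builds something equivalent internally):

// Hypothetical helper illustrating the Entity|<partitionKey>|<id> scheme
function buildDocumentId(discriminator, partitionKey, uniqueId) {
    return discriminator + "|" + partitionKey + "|" + uniqueId;
}

// buildDocumentId("Widget", "AcmeCo", "1042") => "Widget|AcmeCo|1042"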

3. Data Propagation: How & Why

Since there’s no data normalization or concept of relationships, we need a way to propagate changes from one entity to all of its related entities. There are two ways to do this:

  • Change Feed Listener via Azure Function App
  • Stored Procedures, Triggers (Pre and Post)

I initially went with a Change Feed Listener/Function App. This turned out to be a mistake; the latency between when changes occurred and when those changes were propagated was not a big deal at first, but it eventually began to cascade, and large operations caused unpredictable behavior until all the propagations finished. This is when I learned my first lesson with Cosmos DB and data propagation:

  • Use the Change Feed Listener in a Function App when you need to propagate changes across collections.
  • Use stored procedures/triggers when propagating changes within collections (and within a single atomic transaction).

If you have the same widgets in multiple collections for various reasons, and you want to propagate changes to a widget in one collection to another collection, use the Change Feed Listener. Otherwise, if you want to propagate changes to a widget to its owner, use a stored procedure.
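
To illustrate the cross-collection case, here’s a minimal sketch of a Change Feed Listener written as a JavaScript Azure Function; the connection, database, and collection names are placeholders. First, the cosmosDBTrigger binding in function.json:

{
    "bindings": [
        {
            "type": "cosmosDBTrigger",
            "name": "documents",
            "direction": "in",
            "connectionStringSetting": "CosmosDBConnection",
            "databaseName": "WidgetDb",
            "collectionName": "CompanyWidgetCollection",
            "leaseCollectionName": "leases",
            "createLeaseCollectionIfNotExists": true
        }
    ]
}

Then the function body, which receives batches of changed documents:

// index.js -- placeholder logic; the real propagation writes to the
// target collection via an output binding or the @azure/cosmos SDK
module.exports = async function (context, documents) {
    if (documents && documents.length > 0) {
        context.log("Propagating " + documents.length + " changed document(s)");
        // ...upsert the updated copies into the other collection here
    }
};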

4. Stored Procedures and Triggers: Development, Use Cases and Best Practices

After I realized my mistake with intra-collection data propagation, I delved into Cosmos DB stored procedures, and was mildly surprised to find they are written in JavaScript. I’ll quickly go over the different types of server-side procedures you can build:

  • Stored procedures can take arguments and perform operations within a partition in a single atomic transaction, as long as the transaction takes less than ~5 seconds. Otherwise it bails out, and your procedure can return a continuation token so the originator can pick back up and continue executing.
  • Post-triggers execute immediately after an entity/document is updated. I used these heavily for propagating changes to an entity to its relationships within a single atomic transaction.
  • Pre-triggers execute immediately before an entity/document is updated. I didn’t mess with these too much, and there are much better articles out there on how to use these procedures.
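
For completeness, here’s a minimal pre-trigger sketch. This is a hypothetical example (not from my codebase) that validates and stamps the incoming document before it is written:

// A hypothetical pre-trigger: reject documents missing a partition key
// and stamp a modification timestamp before the write happens
function validateWidgetPreTrigger() {
    var context = getContext();
    var request = context.getRequest();
    var documentToCreate = request.getBody();

    if (!documentToCreate.partitionKey) {
        throw new Error("Document is missing a partitionKey.");
    }

    documentToCreate.modifiedAt = new Date().toISOString();
    // Hand the modified body back to the request before it is persisted
    request.setBody(documentToCreate);
}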

Here are some things I wish were emphasized more by the docs:

  • Cosmos DB JavaScript stored procedures/triggers are formatted in a very particular way. Your stored procedure must reside in a single file, and the main function (the one matching your stored procedure ID) must be at the top, like so:
// updateEntity.sproc.js
// The main function comes first; its name should match the stored procedure ID
function updateEntity() {
    var context = getContext();
    var request = context.getRequest();
    var collection = context.getCollection();
    updateEntityImpl(request, collection);
}

// Helper functions live below the main function, in the same file
function updateEntityImpl(request, collection) {
    // Code goes here...
}
  • Since the entire stored procedure must reside in a single file, if you want to import functionality from other libraries or files, you must use Rollup or Gulp to bundle and minify the stored procedure into a single file, with the above formatting (a gulpfile sketch follows below). No imports allowed.
  • Even though the documentation says it supports ES2016, you need to polyfill features like the spread operator, and you’ll run into bugs that can otherwise ONLY be found in Internet Explorer. It’s almost as if they simply reused the engine from their (now deprecated) browser in Cosmos DB server side programming. Absolutely baffling.
  • Deployment of stored procedures (and triggers) is typically done via REST API, which is not exactly conventional, to say the least. I ended up using PowerShell scripts in an Azure DevOps release pipeline to deploy my stored procedures and triggers after a build runs. Caveat: if you read in your JavaScript stored procedures/triggers via PowerShell, you can run into a problem with newline characters. Since you are technically sending a string to the REST API, if the newline characters aren’t correctly formatted, the stored procedure will technically exist but won’t show up in the Azure Portal. I believe this is a bug with the Azure Portal, but I’m not entirely certain. The only way to resolve it is to delete the stored procedure/trigger by name via PowerShell (not via the Azure Portal) and attempt to redeploy.
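
As a reference point for the bundling bullet above, here’s a minimal gulpfile sketch using gulp-concat. The file names are placeholders, and the main-function file is listed first so it lands at the top of the bundle, as Cosmos DB expects:

// gulpfile.js -- a minimal bundling sketch with placeholder file names
const { src, dest } = require("gulp");
const concat = require("gulp-concat");

function bundleTrigger() {
    // The main-function file must come first in the concatenated output
    return src([
        "src/updateWidgetTrigger.main.js",
        "src/updateWidgetTrigger.helpers.js"
    ])
        .pipe(concat("updateWidgetTrigger.trigger.js"))
        .pipe(dest("dist"));
}

exports.default = bundleTrigger;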

Here is my sample deployment script, with the assistance of this blog:

. ./CosmosDBServerScriptDeployment.ps1

$sprocsFolder = Join-Path ((Get-Location).Path) "dist"
$database = "{{Database}}"
$account = "{{Account}}"
$collection = "{{Collection}}"
$primaryKey = "{{PrimaryKey}}"

# Deploy every bundled stored procedure (*.sproc.js) in the dist folder
Get-ChildItem $sprocsFolder -Filter *.sproc.js |
ForEach-Object {
    $fullPath = $_.FullName
    $sprocId = $_.BaseName.Split(".")[0]
    try {
        DeployStoredProcedure `
            -AccountName $account `
            -AccountKey $primaryKey `
            -DatabaseName $database `
            -CollectionName $collection `
            -StoredProcedureName $sprocId `
            -SourceFilePath $fullPath
    }
    catch {
        Write-Host $_
    }
}

# Deploy every bundled trigger (*.trigger.js) in the dist folder
Get-ChildItem $sprocsFolder -Filter *.trigger.js |
ForEach-Object {
    $fullPath = $_.FullName
    $triggerId = $_.BaseName.Split(".")[0]
    try {
        DeployTrigger `
            -AccountName $account `
            -AccountKey $primaryKey `
            -DatabaseName $database `
            -CollectionName $collection `
            -TriggerName $triggerId `
            -TriggerType 'Post' `
            -TriggerOperation 'All' `
            -SourceFilePath $fullPath
    }
    catch {
        Write-Host $_
    }
}
  • Use post-triggers for propagating changes to a single entity; use stored procedures for managing more complex data updates and propagation. Caveat: you can’t call triggers from stored procedures, so you’ll need to perform the data propagation manually inside the stored procedure. Here’s a sample post-trigger I used for propagating changes on a widget to the stage it belonged to:
// updateWidgetTrigger.trigger.js
function updateWidgetTrigger() {
    var context = getContext();
    var request = context.getRequest();
    var collection = context.getCollection();
    updateWidgetImpl(request, collection);
}

function updateWidgetImpl(request, collection) {
    if (request.getOperationType() == "Delete") {
        // Only 'Create' and 'Replace' operations need propagation;
        // we can ignore deletes in this trigger
        return;
    }
    var document = request.getBody();

    handleStagePropagation(collection, document);

    var inserted = collection.upsertDocument(collection.getSelfLink(), document);
    if (!inserted) {
        throw new Error("Could not insert widget document."
                        + JSON.stringify(document));
    }
}

function handleStagePropagation(collection, document) {
    // With the updated widget document, we need to propagate changes to
    // its relationships:
    if (!document.stage) return;
    var stageQuery =
        `SELECT * FROM CompanyWidgetCollection AS c WHERE c.id = '${document.stage.id}'`;
    var accepted = collection.queryDocuments(collection.getSelfLink(),
                                             stageQuery,
                                             updateStageCallback);
    if (!accepted) throw new Error("Stage query was not accepted, abort");

    function updateStageCallback(err, items, responseOptions) {
        if (err) throw new Error("Error" + err.body);
        if (items.length == 0) {
            throw new Error("Unable to find stage document"
                            + JSON.stringify(document));
        }
        // Keep a slimmed-down copy of the widget on its owning stage
        var ownedWidget = {
            id: document.id,
            discriminator: document.discriminator,
            name: document.name,
            partitionKey: document.partitionKey
        };

        var stage = items[0];
        // Remove any stale copy of this widget, then add the fresh one
        // (the Cosmos DB engine chokes on optional chaining, so guard manually)
        stage.widgets = (stage.widgets || []).filter(function (ws) {
            return ws.id != document.id;
        });
        stage.widgets.push(ownedWidget);

        var accept = collection.upsertDocument(collection.getSelfLink(),
            stage, function (err, itemReplaced) {
                if (err) throw new Error("Unable to update stage, abort");
            });
        if (!accept) throw new Error("Unable to update stage, abort");
    }
}
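
One more caveat worth calling out: triggers never fire automatically; the client has to opt in on every request. With the JavaScript SDK (@azure/cosmos), that looks roughly like this (the connection, database, and container names are placeholders):

const { CosmosClient } = require("@azure/cosmos");

const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING);
const container = client
    .database("WidgetDb")
    .container("CompanyWidgetCollection");

async function saveWidget(widgetDocument) {
    // The post-trigger only runs because we name it in the request options
    await container.items.upsert(widgetDocument, {
        postTriggerInclude: ["updateWidgetTrigger"]
    });
}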
  • Additionally, a not-so-common use for Cosmos DB stored procedures is bulk imports. One business problem I needed to solve was that although the emphasis was on querying thousands of entries quickly, there was also an insane amount of pressure to be able to import thousands of entries quickly. Cosmos DB stored procedures let you pass arrays of entities to a stored procedure, which can then load them within a single transaction (see the sketch after this list). Depending on your provisioned RUs, you can load thousands of entries very quickly, then scale your RU limit back down to minimize costs.
  • Stored procedures and triggers execute within a bounded context; if the transaction takes longer than ~5 seconds, it bails out. You need a continuation model to process longer-running transactions to completion. See here for more about handling continuation.
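
Tying the last two bullets together, here’s a minimal bulk-import stored procedure sketch, adapted from the common Azure sample pattern rather than my production code. It creates documents one at a time and reports the count back in the response body, so the caller can resend the remainder if the procedure gets cut off at the time bound:

// A minimal bulkImport sproc: docs is an array of documents to create
function bulkImport(docs) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var count = 0;

    if (!docs) throw new Error("The docs array is undefined or null.");
    if (docs.length == 0) {
        getContext().getResponse().setBody(0);
        return;
    }

    tryCreate(docs[count], callback);

    function tryCreate(doc, cb) {
        var accepted = collection.createDocument(collectionLink, doc, cb);
        // If we're out of time, createDocument returns false; report how
        // far we got so the caller can resume with the remaining docs
        if (!accepted) getContext().getResponse().setBody(count);
    }

    function callback(err, doc, options) {
        if (err) throw err;
        count++;
        if (count >= docs.length) {
            // Everything was created within this single transaction
            getContext().getResponse().setBody(count);
        } else {
            tryCreate(docs[count], callback);
        }
    }
}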

Final Review

I went in excited to tinker with a new type of database that I thought would let me iterate on my data model quickly as business problems evolved, and came out slightly jaded and cynical. Cosmos DB felt a little janky; I was disappointed by the development experience for its server side programming. I thoroughly enjoy TypeScript/JavaScript, but I kept running into issues, from struggling to bundle functionality into a single file, to frustration at the lack of support for several key features of ES2016 (like the spread operator) despite the documented support for it. Then the REST API deployment strategy gave me very confusing issues where I couldn’t see the stored procedure/trigger in the UI even though it was technically there, just formatted incorrectly.

Additionally, they really push the Change Feed listener for data propagation without really telling the developer it’s strictly for cross-collection propagation, not intra-collection propagation within collections that store multiple types of entities.

On the other hand, it was absolutely a breeze to set up the Cosmos DB Emulator and spin up Function Apps to build and develop locally. I thoroughly enjoyed building on top of Cosmos DB, all things considered. I’ll probably try out MongoDB next to see how it stacks up to Cosmos DB in the NoSQL landscape.

Feel free to check out my Github or LinkedIn!