Storing Data with Dapr (Dapr Series #3)

Dapr! No, not dapper, dapr.

If you haven't read my past articles about dapr, that's ok. Let me try to boil this big project down to a sentence:

👋 Dapr is a process you run that does all the hard stuff for your app you'd rather not do

If you want to read more about it, see the intro blog post.

And with that, let's talk about (and demo!) probably the hardest thing that most (any?) apps have to deal with: storing data.

🙋‍♀️ Dapr does other things besides storage too

Why It Matters

Dealing with data stores is hard. You have to get the right SDK, mock out your data store for unit tests, write integration tests for round-tripping data, make sure you deal with migrations, failover, and more...

Dapr helps with all that, but it also comes with some limitations too. Before we go demo the thing, some pros and cons for storage:

➕ You read/write data with a pretty simple REST API (or gRPC!)
➕ You get a lot of useful features (retries anyone?)
➕ You can swap out implementations (Dapr comes with a bunch of different implementations of the data API across clouds and open/closed source.)
➖ You have to fit your data model into key/value. Not all apps fit into this model
➕ and ➖ You have to do a bit of reading to take advantage of some useful storage features:
- Strong consistency
- Optimistic Concurrency
- Retry policies

ℹ In Dapr land, the data storage API isn't called a database on purpose because it gives you a key/value storage API. It's not SQL or any other kind of rich query language, so keep that in mind when deciding what to use.

If you want, you can check out more details on storage here.

🏄‍♀️ Surf's Up, Let's Demo This Thing

Dapr comes with a bunch of different implementations of the data API across clouds and open/closed source. We'll use the Redis implementation here.

ℹ You'll need to have Docker installed and running for this demo to work

1️⃣ - Installing 🛠

I followed the installing instructions, using the Linux script (since I'm using WSL2):

$ wget -q https://raw.githubusercontent.com/dapr/cli/master/install/install.sh -O - | /bin/bash

I had to provide my password because the script sudos things, so if you're not into letting a random script from GitHub get superuser on your computer, you can download binaries instead.

It was installed and I could test things out:

$ dapr --help

         __
    ____/ /___ _____  _____
   / __  / __ '/ __ \/ ___/
  / /_/ / /_/ / /_/ / /
  \__,_/\__,_/ .___/_/
              /_/

======================================================
A serverless runtime for hyperscale, distributed systems

Usage:
  dapr [command]

Available Commands:
  components     List all Dapr components
  configurations List all Dapr configurations
  help           Help about any command
  init           Setup dapr in Kubernetes or Standalone modes
  invoke         Invokes a Dapr app with an optional payload (deprecated, use invokePost)
  invokeGet      Issue HTTP GET to Dapr app
  invokePost     Issue HTTP POST to Dapr app with an optional payload
  list           List all Dapr instances
  logs           Gets Dapr sidecar logs for an app in Kubernetes
  mtls           Check if mTLS is enabled in a Kubernetes cluster
  publish        Publish an event to multiple consumers
  run            Launches Dapr and (optionally) your app side by side
  status         Shows the Dapr system services (control plane) health status.
  stop           Stops a running Dapr instance and its associated app
  uninstall      Removes a Dapr installation

Flags:
  -h, --help      help for dapr
      --version   version for dapr

Use "dapr [command] --help" for more information about a command.

It even has ascii art 🤡!

2️⃣ Hello, World 👋

Let's get something running. The hello world tutorial actually goes much farther than I expected it would.

You get an app up and talking to the storage API (backed by Redis), but then you get a second app up and running and have it call an API on the first one. It's actually showing off service invocation too 🎉

I first got the hello world sample code:

$ git clone https://github.com/dapr/samples.git && cd samples/1.hello-world

The sample code has a Python app and a Node app. Both has a REST API in them.

The Node App 🗼

First, the Node app. The demo README first goes through the Node code a bit.

The really interesting bit I saw was that we're doing a POST request with the fetch API to the local Dapr API to store some data. What was cool was you can also just return some JSON and it'll be automatically stored.

{
    "state": [{
        "key": "nomnomnom",
        "value": "pizza"
    }]
}

I really like that part because you don't have to manually do anything to get data into the database. It just happens.

Magic

On to running the thing. You do the standard thing to install the JS dependencies:

$ npm install
npm WARN node_server@1.0.0 No repository field.

added 55 packages from 41 contributors and audited 55 packages in 0.626s
found 0 vulnerabilities



   ╭────────────────────────────────────────────────────────────────╮
   │                                                                │
   │      New patch version of npm available! 6.14.4 → 6.14.5       │
   │   Changelog: https://github.com/npm/cli/releases/tag/v6.14.5   │
   │               Run npm install -g npm to update!                │
   │                                                                │
   ╰────────────────────────────────────────────────────────────────╯

But you then run it with the dapr CLI, not npm or node. This spits out a lot of log lines, and that's normal. Here's approximately what it looks like:

$ dapr run --app-id nodeapp --app-port 3000 --port 3500 node app.js
ℹ️  Starting Dapr with id nodeapp. HTTP Port: 3500. gRPC Port: 42491
== DAPR == time="2020-05-18T16:26:52.702694-07:00" level=info msg="starting Dapr Runtime -- version 0.6.0 -- commit e99f712-dirty" app_id=nodeapp instance=DESKTOP-DQP07VM scope=dapr.runtime type=log ver=0.6.0

<snip>

== APP == Node App listening on port 3000!

== DAPR == time="2020-05-18T16:26:52.7568756-07:00" level=info msg="application discovered on port 3000" app_id=nodeapp instance=DESKTOP-DQP07VM scope=dapr.runtime type=log ver=0.6.0

<snip>

ℹ️  Updating metadata for app command: node app.js
✅  You're up and running! Both Dapr and your app logs will appear here.

Ok, now it's running. You use the dapr CLI to also make calls to the web service. I like the simplicity of a single CLI.

$ dapr invoke --app-id nodeapp --method neworder --payload '{"data": { "orderId": "41" } }'
✅  App invoked successfully

You can also use curl for this if you want

And verifying that the request actually went through, and the data was stored, we have logs:

== APP == Got a new order! Order ID: 41

== APP == Successfully persisted state.

And we can also call the Node App's API to get the new data back out of the datastore

$ curl http://localhost:3500/v1.0/invoke/nodeapp/method/order
{"orderId":"41"}

This and the above dapr invoke call are using one of the features in Dapr's service discovery building block

The Python App 🐍

Now for the grande finale! The Python code calls the Node app in an infinite loop, so we should be able to see the Node app responding to requests and also see the datastore fill up at the same time.

First, we set up the Python dependency:

$ pip3 install requests

<snip>

Successfully installed certifi-2020.4.5.1 chardet-3.0.4 idna-2.9 requests-2.23.0 urllib3-1.25.9

And then I ran it with the same dapr run command we used for the Node app, except in a new terminal tab. I cut out all the verbose log lines this time!

$ dapr run --app-id pythonapp python3 app.py
ℹ️  Starting Dapr with id pythonapp. HTTP Port: 43891. gRPC Port: 42163
ℹ️  Updating metadata for app command: python3 app.py
✅  You're up and running! Both Dapr and your app logs will appear here.

Now, back in the Node app logs, storage is starting to fill up!

== APP == Got a new order! Order ID: 1

== APP == Successfully persisted state.

== APP == Got a new order! Order ID: 2

== APP == Successfully persisted state.

<snip>

That's It!

From experience, getting storage and microservice-to-microservice calls working from scratch takes time, carefully crafted code, and effort.

I wanted to compare this Node/Python codebase to compare my experience, and was pleasantly surprised. Let's break it down:

87 lines of Node code, including comments
- We could save a lot of lines if we returned the "data" JSON dictionary instead of using fetch calls
23 lines of Node code, including comments

From experience with Go, I can say that a battle tested storage layer for this kind of thing would be about 50 lines. That's an apples to oranges comparison, but I know for sure that all the battle-tested bits are now behind that slick storage API the Node app is using. And funny enough, that part is written in Go 💪.

P.S. don't forget to clean up: dapr stop --app-id nodeapp && dapr stop --app-id pythonapp

Bonus: Behind the Scenes

I mentioned above that you need to have Docker installed in order to get this working. That's all because Dapr needs two images running in the background for this demo to work.

cb722412a5d3        redis               "docker-entrypoint.s…"   4 weeks ago         Up 6 hours          0.0.0.0:6379->6379/tcp     dapr_redis

The important one is ... Redis! By default, all the storage API calls the Node app is making get routed into Redis.

Wrap Up

So we've seen the storage API and a bit of the service invocation API. Honestly, those two pieces alone can get you a long way just on basic usage alone.

As you get more advanced, you'll probably want to learn more about and use things like concurrency management and retry policies.

Any way you go, the docs repository and particularly the state management section are great references.

That's all for today. I hope you go forth and enjoy dapr-ing!

🤘🚀