Architecting a Scalable Notification Service

Ardy included in Architecture System-Design

2020-02-04 2679 words 13 minutes

This post can serve as a reference document that highlights some of the best practices in architecting a Notification Service infrastructure.

Why a Notification Service?

I will be using “Notification Service” as an umbrella term to cover the different components that we will need to run a fully functioning notification service in production.

Notification services are widely used in products today. They are useful whether you’d like to be notified of a change in price (ideally a price drop), of the availability of a product that you are interested in, or of a new job specification that matches the search criteria you have specified.

With the advent of mobile apps, getting notified to take action through our mobile devices’ push notifications is convenient on top of receiving good old emails.

For simplicity, we will be focusing on the product price change use case.

A basic notification service.

After a user views a product, the next option is to either buy the product or to “watch” the product in order to get updates if the price changes in the future.

Use cases

The architecture we will come up with in this post will be able to cover the price change and the availability use cases. However, it is designed in such a way that it can be modified to cover other use cases of your choice.

Price change

When you are looking for an item on an e-commerce website and you are not sure yet whether to buy the item at its current price, you might like to be notified if the price has dropped so you can buy within the price range that you desire. We are going to scope down our example in this post to cover this use case. The system will be designed such that this use case can be changed according to the product requirements.

Price change notification sequence.

Item Availability

If the item that you are looking for is out of stock, you’d like to be notified when it is available. Depending on the implementation, the same Price change logic can be used for this use case. For example, if the Latest Price is Null, then it means that the product is not available.
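
The shared price and availability logic can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name and the returned labels are hypothetical, and it assumes a null price (None) always means "out of stock".

```python
from typing import Optional


def classify_change(previous: Optional[float], latest: Optional[float]) -> str:
    """Classify what happened between two price observations.

    Assumption: None stands in for a null price, read here as "out of stock".
    """
    if latest is None:
        return "unavailable"
    if previous is None:
        # The product had no price before and has one now.
        return "back_in_stock"
    if latest < previous:
        return "price_drop"
    if latest > previous:
        return "price_rise"
    return "unchanged"
```

A caller would typically notify on "price_drop" and "back_in_stock" and ignore the other labels.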

Requirements

Product requirements

Logged-in users should be able to create a notification to “watch” (or follow) a product and get notified if there’s a price change. The “watch” functionality is available on the product page. A product page is usually linked from the product search results page of an e-commerce website.
The service should be able to send an email or a push notification to users if there is a price change to the “watched” product.
The user should receive a list of product notifications in the email. For push, there is only one notification at a time based on a set of rules. For example, if a user is watching multiple products, we can choose to send a push notification only about the product with the biggest drop in price.
The notification service should work even if the product includes price comparisons from multiple third-party pricing data sources.
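
As a rough sketch of the push rule mentioned in the requirements (send only the product with the biggest price drop), something like the following could work. The PriceChange type and the pick_push_candidate name are hypothetical, and "biggest drop" is assumed to mean the largest absolute difference rather than the largest percentage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PriceChange:
    product_id: str
    previous_price: float
    latest_price: float

    @property
    def drop(self) -> float:
        # Positive when the price went down, negative when it went up.
        return self.previous_price - self.latest_price


def pick_push_candidate(changes: List[PriceChange]) -> Optional[PriceChange]:
    """Return the change with the largest absolute price drop, or None if nothing dropped."""
    drops = [c for c in changes if c.drop > 0]
    if not drops:
        return None
    return max(drops, key=lambda c: c.drop)
```

The email channel would instead send the whole list of changes, so this selection only applies to push.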

Design goals

The notification service can be integrated into any existing services e.g. e-commerce or job search product.
The notification service should be isolated into its own set of components from its main service.

Main component: Notification API

This will be a vital component of the entire Notification Service. We should be able to create and delete a notification entry via an API. The existing service should call this Notification API to create or delete a notification entry.

Notification API high level design.

Note the “Notification DB Access Library” which will be handy later when we will need another service to connect to the Notification Database.

Notification criteria can be the product search parameters or the selected product. Each criteria entry is saved in the database. The created entry will help the system understand when we should notify the user.

The API schema could roughly look like the following:

createNotification(APIKey, ProductID, userID)
deleteNotification(APIKey, NotificationID)
getNotifications(APIKey, userID)

The database schema will require the following:

NotificationID. Unique notification ID. Can be a UUID or an Integer.
UserID. Logged-in user who created the alert.

The schema should also include notification parameters like the following, depending on the product or service. In this example, we will consider an e-commerce website:

ProductID. Product that we are interested in.
LatestPrice. Most recent and cheapest price found for the searched product.
PreviousPrice. Optional field to keep track of the previously recorded price. Useful for debugging and investigating potential pricing issues raised.
LastUpdated. Timestamp of when the entry was last updated.
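
To make the API and schema more concrete, here is a minimal in-memory sketch in Python. It only mirrors the three calls and the fields listed above. The class names are hypothetical, API key checks are left out, and a dictionary stands in for the real datastore.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Notification:
    user_id: str
    product_id: str
    latest_price: Optional[float] = None
    previous_price: Optional[float] = None
    # A UUID is used here; an integer key would work as well.
    notification_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    last_updated: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class InMemoryNotificationStore:
    """Stand-in for the Notification API and its database (API key checks omitted)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Notification] = {}

    def create_notification(self, product_id: str, user_id: str) -> Notification:
        entry = Notification(user_id=user_id, product_id=product_id)
        self._rows[entry.notification_id] = entry
        return entry

    def delete_notification(self, notification_id: str) -> bool:
        # Returns False when the ID is unknown, so repeated deletes are harmless.
        return self._rows.pop(notification_id, None) is not None

    def get_notifications(self, user_id: str) -> List[Notification]:
        return [n for n in self._rows.values() if n.user_id == user_id]
```

Note that the entry only stores the UserID, not the user's name or email.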

Implementation

This can be a RESTful or a gRPC service written in Java, Go, or C#, backed by a NoSQL datastore and a caching system. We expect mostly creates and deletes for each notification row. Reads are expected if our product allows users to see the list of notifications they have created.

We will cover the different trade-offs when choosing a database for our Notification API in the later sections.

Optional: User data component

Ideally, the Notification API should only keep a reference to the UserID. Users’ PII (Personally Identifiable Information) such as their names, emails, and phone numbers should come from another API for a clear separation of responsibilities.

Different approaches

Now what? There are different solutions that we can pursue, each with its own pros and cons. These design solutions can be combined to achieve optimal results.

Solution #1 works if you have full control of your product prices, while solutions #2 and #3 work well if you are getting your prices from third-party providers or if you do not have full control over the prices.

Solution #1: Manual price update event

The basic feature of an e-commerce store is to have an administration panel where we can update the prices of our products. Every time there is a change in price for a product, our system will send a notification to the users who are “watching” that product.

Pros:

Relatively simpler to implement compared to other solutions.
Price changes are fully controlled internally.

Cons:

Only works if you are in full control of your product prices. Does not work when you rely on third parties for price changes.

What we will need: Communications, Price change event, and Notification services.

Manual price update with RDBMS database high level design.

If you’re using a NoSQL database, you will need an indexing service. This indexing service will help us map the product to notification IDs. See figure below.

Manual price update with NoSQL database high level design.

Indexing service

We will not need an Indexing service if we are using an RDBMS database since the Notifications API can execute an SQL query like the following:

SELECT * FROM NOTIFICATIONS WHERE PRODUCT_ID='xxx' AND ANOTHER_COL='xxx'
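
To try this lookup out, here is a small sketch using Python's built-in sqlite3 module. The table layout, index name, and sample rows are assumptions made for illustration. The product ID is passed as a bound parameter instead of being pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications ("
    " notification_id TEXT PRIMARY KEY,"
    " user_id TEXT NOT NULL,"
    " product_id TEXT NOT NULL)"
)
# An index on product_id keeps the lookup from scanning the whole table.
conn.execute("CREATE INDEX idx_notifications_product ON notifications (product_id)")
conn.executemany(
    "INSERT INTO notifications VALUES (?, ?, ?)",
    [("n1", "u1", "p1"), ("n2", "u2", "p1"), ("n3", "u1", "p2")],
)


def notifications_for_product(connection: sqlite3.Connection, product_id: str) -> list:
    """Return the notification IDs watching a product, using a bound parameter."""
    rows = connection.execute(
        "SELECT notification_id FROM notifications"
        " WHERE product_id = ? ORDER BY notification_id",
        (product_id,),
    )
    return [row[0] for row in rows]
```

With the sample rows, notifications_for_product(conn, "p1") returns the two notifications that watch product p1.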

But, for NoSQL, we will need an index.

The index key will be the product ID, and its fields will contain an array of Notification IDs of that product.

'xxx-xxx-xxx': {'notifications': ['xxx-xxx-xxx', 'xxx-xxx-xxx', 'xxx-xxx-xxx']}

The product ID will be in the payload of the price change event, then the Notification API will get the list of notification IDs using the product ID as the key from the Indexing service database.
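
A minimal in-memory version of such an index could look like the sketch below. The ProductIndex class and its method names are hypothetical. A real deployment would put this behind a service or in a store like Redis.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ProductIndex:
    """In-memory map from product ID to the notification IDs watching it."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = defaultdict(set)

    def add(self, product_id: str, notification_id: str) -> None:
        self._index[product_id].add(notification_id)

    def remove(self, product_id: str, notification_id: str) -> None:
        watchers = self._index.get(product_id)
        if watchers is None:
            return
        watchers.discard(notification_id)
        if not watchers:
            # Drop empty keys so the index does not keep growing.
            del self._index[product_id]

    def lookup(self, product_id: str) -> List[str]:
        # .get avoids creating an empty entry for unknown products.
        return sorted(self._index.get(product_id, ()))
```

The index has to be updated whenever a notification is created or deleted, otherwise lookups will drift from the database.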

Rebuilding the index

In case the Indexing service goes down, we should be able to rebuild it. We can also have a back-up service available in case the main one goes down. There should also be a way to update the index periodically to ensure that our product and notifications mapping are up-to-date.

Implementation tip: This indexing service can be a simple RESTful service that builds an in-memory index. An in-memory data store like Redis will do the job as well, as long as we have a way to rebuild the index.
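
Rebuilding can be as simple as replaying the notification rows from the database. The sketch below assumes the rows arrive as (notification ID, product ID) pairs, which is a made-up shape for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def rebuild_index(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Rebuild the product -> notification IDs map from (notification_id, product_id) rows."""
    index: Dict[str, List[str]] = {}
    for notification_id, product_id in rows:
        index.setdefault(product_id, []).append(notification_id)
    return index
```

The same function can serve the periodic refresh: build a fresh index on the side, then swap it in.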

Price change event service

This service monitors for any product price change event triggered by the admin panel. It gets the affected notifications and updates the affected products’ prices via the Notification API.

After getting the list of affected notifications, the service checks when the notification was “last sent,” sends the request to the communication service, then updates the LastUpdated field if the request is successfully completed.
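
The flow above could be sketched as follows. All names here are hypothetical, the send callable stands in for the communication service, and LastUpdated doubles as the "last sent" timestamp, which is one reading of the description rather than the only possible one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class PriceChangeEvent:
    product_id: str
    new_price: float


@dataclass
class Watch:
    notification_id: str
    user_id: str
    product_id: str
    latest_price: Optional[float] = None
    previous_price: Optional[float] = None
    last_updated: Optional[datetime] = None


def handle_price_change(
    event: PriceChangeEvent,
    watches: List[Watch],
    send: Callable[[Watch], bool],
    now: datetime,
    min_interval: timedelta = timedelta(hours=24),
) -> List[str]:
    """Update prices for affected watches and notify those not contacted recently."""
    notified = []
    for watch in watches:
        if watch.product_id != event.product_id:
            continue
        # Always record the new price, even when no message goes out.
        watch.previous_price = watch.latest_price
        watch.latest_price = event.new_price
        recently_sent = (
            watch.last_updated is not None and now - watch.last_updated < min_interval
        )
        if recently_sent:
            continue
        if send(watch):
            # Only stamp the entry when the send request succeeded.
            watch.last_updated = now
            notified.append(watch.notification_id)
    return notified
```

The 24-hour default for min_interval is an arbitrary placeholder and should be tuned per product.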

Implementation tip: AWS Lambda or a similar service that can monitor events emitted from another service will be suitable.

Communication service

This is a separate system that is worthy of its own post. Let’s treat this as a black box that serves as an abstraction between our systems and third-party services that allows us to send emails and push notifications.

Solution #2: Use batch jobs

A straightforward approach if your service relies on third-parties for pricing is to have a batch job that runs periodically to check if there are changes to the price of the product a user is “watching.” In order to check the cheapest price, the batch job needs to call an internal API that returns the price of the product.

Pros:

Good coverage. It will cover all the search criteria stored in the database.

Cons:

More notification entries mean more API calls to your internal API. Consider optimising the internal API that will be called by the batch job.
Requires a partial or full database scan. Regardless, it’s best to run the scan during off-peak hours.
Requires more running components.

What we will need: Scheduler, Data Pipeline, Message Queue, Batch Job, and Communications service.

Batch jobs high level design.

Scheduler

This is a component that we can use to run our data pipeline, batch processing, and other services periodically following a specific schedule, e.g. daily or weekly.

The start time of the data pipeline and batch processing depends on the following:

Off-peak time. We’d like to do the Notification service database scan when there is less traffic.
User’s time zone. Schedule the run for when the user is likely to be awake to receive the notification. This most likely gives the user a better chance of seeing the notification, especially for push.
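
As a rough illustration of the time zone check, the helper below decides whether a user is within "waking hours". The function name and the 09:00 to 21:00 window are assumptions. It takes a fixed UTC offset for simplicity, so it ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def is_waking_hour(
    now_utc: datetime,
    utc_offset_hours: float,
    start_hour: int = 9,
    end_hour: int = 21,
) -> bool:
    """Return True if the user's local time falls within [start_hour, end_hour)."""
    local = now_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return start_hour <= local.hour < end_hour
```

For example, 02:00 UTC is 10:00 for a user at UTC+8, so that user would be eligible while a user at UTC+0 would not.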

Scheduling strategies to consider:

All notifications are covered per day.
Alternate schedule between regions or availability zones (AWS) per day. For example, if your services are deployed in four AWS regions, you can alternate the regions covered per day, with ap-southeast-1 and ap-northeast-1 on one day, then eu-west-1 and us-east-1 on another day.
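
The alternating schedule can be expressed as a simple rotation over region groups. This is only a sketch: the grouping follows the example above, while the function name and the choice of rotating by calendar day are assumptions.

```python
from datetime import date
from typing import List, Sequence

REGION_GROUPS: Sequence[Sequence[str]] = (
    ("ap-southeast-1", "ap-northeast-1"),
    ("eu-west-1", "us-east-1"),
)


def regions_for(day: date, groups: Sequence[Sequence[str]] = REGION_GROUPS) -> List[str]:
    """Pick the region group to scan on a given day by rotating through the groups."""
    # toordinal() counts days from a fixed epoch, so consecutive days alternate groups.
    return list(groups[day.toordinal() % len(groups)])
```

With two groups, each region is covered every other day. Adding a third group would stretch that to every third day.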

Implementation tip: Good old Cron Job or an AWS CloudWatch Event.

Data pipeline

The scheduler tells the data pipeline when to start. It scans the database, groups the notifications according to a set of criteria, and publishes these notifications to a queue.

This component reads from the Notification API Database.

These are some strategies to consider when reading from the Notification API Database:

Full database scan. The batch job will read all the rows in the Notification API database. This is easier to implement for cloud-based storage like AWS DynamoDB that allows you to adjust the read capacity when needed, though it can incur costs. An RDBMS will do the job, too, but will be better off with a read-only replica that is meant solely for the batch job.
Partial database scan grouped by regions. The batch job reads the entries based on the region they belong to. For example, if the scheduled job is in ap-southeast-1, then all notifications from users in Singapore, Malaysia, the Philippines, and others within the ASEAN region will be processed.

After reading the notification entries in the database, the data pipeline will then publish these entries to a message queue.
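
The scan, group, and publish steps can be sketched with Python's standard queue module standing in for the message queue. The row shape, the "region" grouping key, and the message format are all assumptions for illustration.

```python
import queue
from collections import defaultdict
from typing import Dict, Iterable, List


def publish_grouped(
    rows: Iterable[Dict[str, str]],
    out: "queue.Queue",
    group_key: str = "region",
) -> int:
    """Group notification rows by one field and publish one message per group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row["notification_id"])
    # Publish in a stable order so runs are easy to compare.
    for key in sorted(groups):
        out.put({group_key: key, "notification_ids": groups[key]})
    return len(groups)
```

In production the queue would be a real broker, and large groups would likely be split into smaller messages.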

Implementation tip: Use AWS Data Pipeline, Spark, or similar services that can read from a database and group large amounts of data.

Batch jobs

Batch jobs are subscribed to the notification entries topic in the message queue. These jobs will pick up the messages from the notification queue.

The batch job will call the Internal Pricing or Availability API to get the latest prices based on the notification criteria gathered from the database.

Compare the recent price from the API with what was recorded in the database.
Based on the price difference or whether the product is available or not, decide whether to notify the user or not. Notifying the user means sending a request to the Communications component.
After we have successfully completed the request to the Communications component, we will update the notification entry in the database. The LastUpdated and price fields (LatestPrice and, if used, PreviousPrice) will need to be updated at this point.
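
The three steps above could be sketched as a single function per notification entry. The Entry type and the fetch_price and notify callables are hypothetical stand-ins for the database row, the internal Pricing API, and the Communications component. The sketch notifies only on a price drop or when a product becomes available again.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Entry:
    notification_id: str
    product_id: str
    latest_price: Optional[float]  # None is read as "not available"
    previous_price: Optional[float] = None
    last_updated: Optional[datetime] = None


def process_entry(
    entry: Entry,
    fetch_price: Callable[[str], Optional[float]],
    notify: Callable[[Entry, Optional[float]], bool],
    now: datetime,
) -> bool:
    """Run the compare, decide, notify, update steps for one entry; return True if notified."""
    current = fetch_price(entry.product_id)
    dropped = (
        current is not None
        and entry.latest_price is not None
        and current < entry.latest_price
    )
    back_in_stock = current is not None and entry.latest_price is None
    if not (dropped or back_in_stock):
        return False
    if not notify(entry, current):
        return False  # leave the entry untouched so a later run can retry
    entry.previous_price = entry.latest_price
    entry.latest_price = current
    entry.last_updated = now
    return True
```

Because the entry is only updated after a successful send, a failed request is retried naturally on the next scheduled run.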

Implementation tip: Use AWS Data Pipeline, Spark, or similar services that can read from a message queue and process large amounts of data.

Solution #3: Use the internal API response logs

The main problem with using batch jobs is that the more notifications our database has, the more API calls there will be to the internal Pricing API, which is probably not the most efficient way to scale up your service. Using the API response logs instead is a good fit for most modern services that already stream their logs with, for example, Kafka plus an ELK stack.

Before committing to this solution, compare the live product views with the products in the notifications database to know how much coverage there is. That will help you decide whether it is worth the effort to continue with this solution.

Pros:

Price updates are more “real-time” than batch jobs.
Reduces internal API calls.

Cons:

Requires log streaming.
Requires fine tuning of the frequency logic of sending notifications.
Coverage is dependent on popular products or API queries.

Aside from ensuring good internal API implementation like using a cache, we can use the live API response logs from our Pricing API by matching the notification criteria entries with what the live users are requesting.

User search with API log stream high level design.

API response logs

When a user views a product, both request and response are logged. Product and price will be included in the logs.

One way to achieve this is to use Kafka to stream the API request and response, then a Samza job will process the streamed API request and response.

The overall architecture of this solution will depend on the decision we’ve made on our notifications database, whether it is an RDBMS or a NoSQL database.

RDBMS solution high level design.

With RDBMS, we can easily do a select statement based on the search query to match the rows in your notifications database.

NoSQL solution high level design.

The NoSQL solution needs an Indexing service, same as in Solution #1.

The Samza job will match the live user requests with the notifications that users are interested in. When a match is found, check if there is a price difference or if an item is available. Send a request to the communication component so that it will notify the user.

After successfully completing the request to the API, the price and other information from the response will be used to update the latest price or availability details in the database.
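
Independent of Kafka or Samza, the matching logic itself can be sketched in plain Python. The event shape, the dictionaries standing in for the index and the notifications database, and the notify callable are all assumptions. A stored price of None is read as "not available".

```python
from typing import Callable, Dict, Iterable, List, Optional


def match_log_events(
    events: Iterable[Dict[str, object]],
    watchers_by_product: Dict[str, List[str]],
    last_price: Dict[str, Optional[float]],
    notify: Callable[[str, str, float], None],
) -> int:
    """Match logged product views against watched products; return notifications sent."""
    sent = 0
    for event in events:
        product_id = str(event["product_id"])
        price = event.get("price")
        watchers = watchers_by_product.get(product_id)
        if not watchers or not isinstance(price, (int, float)):
            continue  # nobody is watching, or the log line carries no usable price
        known = last_price.get(product_id)
        # Notify on a drop, or when a product recorded as unavailable has a price again.
        if product_id in last_price and (known is None or price < known):
            for notification_id in watchers:
                notify(notification_id, product_id, float(price))
                sent += 1
        last_price[product_id] = float(price)
    return sent
```

Only products that live users actually view ever reach this function, which is why coverage depends on product popularity.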

Implementation tip: Use Kafka for log stream. Use Samza to process the data coming in from the log stream.

What to watch out for

Make sure to skip notifications that were already updated within a specified period, otherwise there could be a lot of duplicate notifications especially for popular API queries.
Fine-tune the frequency of how often your service is sending the notification to each user. We do not want to send “too many” emails and push notifications to a single user per day.
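
One possible way to cap the send frequency per user is a small rolling-window limiter like the sketch below. The SendLimiter class and its defaults (three messages per day) are made-up values for illustration. A real service would keep this state in a shared store rather than in process memory.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List


class SendLimiter:
    """Caps how many notifications each user receives within a rolling window."""

    def __init__(self, max_per_window: int = 3, window: timedelta = timedelta(days=1)) -> None:
        self.max_per_window = max_per_window
        self.window = window
        self._sent: Dict[str, List[datetime]] = defaultdict(list)

    def allow(self, user_id: str, now: datetime) -> bool:
        """Record and allow a send unless the user already hit the cap in the window."""
        # Keep only the sends that are still inside the window.
        recent = [t for t in self._sent[user_id] if now - t < self.window]
        if len(recent) >= self.max_per_window:
            self._sent[user_id] = recent
            return False
        recent.append(now)
        self._sent[user_id] = recent
        return True
```

The service would call allow() right before sending a request to the communication component and skip the notification when it returns False.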

Combining all solutions

These solutions have their own pros and cons. Having all of them side by side will deliver the best results. I would recommend starting with Solution #1. Then, as your product scales and expands to getting external pricing, you can start looking into Solution #2. Then, when your notifications database starts to reach a point where it’s causing too much load to your internal API, you can consider Solution #3.

Technical notes

Services communicate via HTTPS calls.
Redundancy is not covered in detail here. You can easily add a load balancer in front of the services that make up your Notification Service and run them in multiple regions.
