At AppNexus we provide our clients with a RESTful API that allows people to manage all of their data in our adserving echosystem. The API is built on a LAMP stack with memcache sprinkled in to help out with session management and cacheing some commonly used objects. We have the typical cloud backend with various servers behind a load balancer with all of our servers housed in our East Coast data center.
When the time came for global expansion, we needed to spool up instances in our West Coast and European data centers.
Using our cloud backend, we could just spool up some virtual machines for Apache, memcache, and MySQL slaves replicating from the master in the East Coast. GET requests to the API will use the local MySQL slave and memcaches while POST/PUT/DELETE calls will go to the master database in the East Coast. Pretty damn simple, right?
Wrong! You didn’t think it would be that easy, did you?
The caveat with our API compared to some popular social media APIs is that if a client sets an incorrect value through AppNexus, it could cost them big advertising bucks within minutes – which is a lot worse than accidentally checking in to the same location twice.
There were two major issues that impeded a simple solution because of our consistency requirements:
1. Slave replication can be slow. The internet tubes can get filled with dump trucks causing data to transfer slowly from the East Coast to Europe and the West Coast.
2. Memcache doesn’t scale globally, and a local cache with data that is not up to date and incorrect can cause bad issues. We don’t want clients working with inconsistent data.
I understand those problems are a little vague and somewhat obvious. Let me give you some examples of how they could have bitten us in the bum:
1. Our validations on new or updated objects can be intense. Like I said, we don’t want our clients to lose money (think about a client changing the maximum allowed budget on the user level – that can have unforeseen effects on specific advertising campaigns that are children objects of the user). To prevent this we sometimes have to query the database dozens of times per API write. If a local slave is lagging during this process, our validations would be useless since it would be verifying actions on stale / incorrect data. And doing validations against the master database would be slow because the distance to the master database adds latency on each query making a poor user experience.
2. The authentication / session scheme we use is write through cacheing with memcache and mysql as a persistent store. The mixture of slave lag and inconsistency between memcache instances would lead to recently authenticated users seeming to be not logged in and unable to actually make API calls correctly.
3. If two users in accessing different data centers want to update the same object, we have a race condition. It probably sounds like a crazy case but we have a client support team in our office helping out clients all over the world. To further complicate it, our clients have many offices around the world and could have multiple people working on the same object.
So how did we manage to scale our API globally without crashing the whole system? Our super-awesome(ish) solutions to these problems:
1. All POST/PUT/DELETE calls proxy through to the East Coast data center. This solves issues with validation by running all queries against the master database, which is local and up-to-date. Sure there is some latency in proxying, but the size of the of the JSON input and output is more or less the same as the SQL inserts, so this just makes life easier
2. Use memcache and the replication delay of a slave to figure out if a slave is close enough to being up to date. Our system operartions guys set up a job that runs every second to update a field in a row in a table with the current timestamp. On a slave we can do something like “select now() – timestamp_field from table” to determine how man seconds behind the master that specific slave is. Also, the most recent timestamp for each object type for each user is stored in memcache. Using simple math, it is easy to determine if slave’s lag has any effect on a specific object type. On GET calls, if an object of the same type was altered at a point that a slave would not know about yet, the API will use the master database.
3. Created a protocol to allow both guaranteed current and eventual consistency between memcache instances. We built an application in node js that will route set, increment, decrement, and delete commands to various memcaches either synchronously or asynchronously. This application uses a memcache-style protocol so it was easy to utilize in our PHP code. Upon authentication, sessions are synced between all memcaches before the user receives their token. This ensures that they can access any instance in any datacenter and be logged in. The same thing is done for the object update timestamps. However, for static data caches, we just want eventual consistency because the cache values just make the calls slightly faster
4. Opt in data integrity check for PUT calls. If the user opts in, they must pass the last modified timestamp that they received when first getting the object. If the timestamp isn’t in sync with what we have in our master database, we reject the update since the changes might conflict with more recent changes.
Currently, this system is rolled out in production but only for selected users to test. We will be releasing it to every West Coast user in a week or so, which will make the API experience much snappier. Our frontend team is working on updating our client-facing website to utilize the data integrity checks. Like I always say, it’s all in the hips.