Orbus Gameworks

articles

Stupid Database Tricks

Mar 12, 06:51 PM

So I’ve been catching up on T=Machine, Adam Martin’s excellent blog on MMO tech (among other things). He posted a lot of notes from GDC sessions I didn’t manage to attend. One session he took notes on was called SQL Considered Harmful, . A provocative title, to be sure.

Here’s an excerpt of Adam’s notes from the session, which was folks from Cryptic talking about City of Heroes development.

On COH we had this idea that we’d store statistics on each individual player – how many times did they pick stuff up, how many times did they kill people, how long they spent online, the number of times they trash-talked in chat, etc. We made a table where there was one row-insert per player PER piece of data that had changed. Some of those stats change all the time, like “how long have they been online”. So the traffic to the database was extremely high, because each thing had to be updated constantly. So … we threw out all the tranditional DB concepts, and made one column per stat, with no actual names for columns, instead just have “Stat 1 value”, “stat 2 value”, etc. In the interests of performance, we were twisting our data, either in code, or in the DB, or in both places. This became a huge impedance to our development process.

This kind of thing gets me pretty worked up. First of all, why is it that a developer’s natural inclination is to put all of their damn data in a single table? It might be because it’s programmers who are in charge of the design of the statistics system. While I love programmers and would not be able to run this company without them, programmers are not data analysts. They do not design for ease of querying and data stored in a useful way. And why should they? That’s not their job. But it’s still a terrible thing to do, putting a bunch of barely contextually related events in a single table.

At least their first table actually had columns that meant something. When they switched to a system where columns did not actually signify anything, they fell prey to the flexibility siren. Do you realize how hard it is to query off of a table where there’s one INT that tells you what kind of event happened and then that changes the meaning of Stat Column 1? It’s not impossible, but you end up with extremely unwieldy queries. If you want to do relatively real-time metrics reporting, you are screwed.

Put contextually different events in different tables, people. Don’t worry: with modern technology, you can perform queries on more than one table, let me think of a good word for this, “joining” them, perhaps? (Sorry, had to snark.)

As for the database being hit way too hard with messages, this is why the way you handle metrics data on your game should be through contextual events, not individual pieces of data that spam the DB. Instead of a system where combat resolution sends 10 different messages to the DB about the stats that changed, package them up into one contextual call to a stored procedure that performs a single INSERT on a 10-column table. (Alright, to toot my own horn a little, this is the way we do it with our metrics software).

— Darius Kazemi

Comment

  1. So, have you looked at the slides yet? (http://doublebuffered.com/2008/03/07/sql-considered-harmful-presentation-slides/)

    Matthew Weigel · Mar 13, 05:12 PM · #

  2. Wow – it sounds like Cryptic could really use some help on their architecture, both on the HW and the database side. I mean one solution that suggests itself is this:

    Gather the data in a fast, lightweight format and get it off your game play servers fast. Then use this mid level tier to hold the data and massage it for insertion into your tables. Note the plural. (:

    oh and then, for data analysis and long term storage consider rejoining and denormalizing using one of the many data warehousing methodologies for overlaying on top of a RDBMS.

    Sometimes maybe we should look outside of the game industry rather than making it all up as we go along…

    Bill Crosbie · Mar 14, 11:17 AM · #

  3. The biggest problem is that your intermediate layer is where all of the performance and reliability problems can occur. It is NOT acceptable for players to lose changes because your lightweight, intermediate layer went down. So you need an entire database for the middle layer. That DB is CrypticDB. Other game devs have ad-hoc, unreliable, and slow solutions for this intermediate layer (like the CoH one).

    Massaging the data from the CrypticDB for later analysis is definitely in our plans, but that’s not essential to the project while storing and accessing it quickly is.

    Ben Zeigler · Mar 14, 09:26 PM · #

Commenting is closed for this article.

recently

Blog

Links