- By Mohannad B.
Watch out! This is one of the first (of many) technical posts from the sweetIQ Engineering Team.
Here at sweetIQ, we deal with the usual startup problems: rapidly changing requirements, ever-increasing demands on features and scale, and limited (human) resources. We use a great piece of Open Source software called Graphite, because it helps bring all of our systems together in one place.
What is Graphite?
Graphite is a real-time metric collection system. Made to be highly scalable by the boffins at Orbitz, it’s used by companies like Google, Etsy, Canonical (the company behind Ubuntu), Vimeo, and many others. Graphite gives you the back-end system to collect metrics from all the different parts of your systems, bring them together “under one roof”, and view them. This is great because it lets you see relationships (and potential issues) between parts of your system that you might not think impact each other, when they actually do. If you want to know more about Graphite, have a look at the project’s (unfortunately slightly out-of-date) website.
Why we use Graphite
As our number of paying clients has increased over the last few months, our processing requirements have increased exponentially: one new client might mean 500 new locations that we need to be able to process on a regular basis. Processing locations involves gathering data from all over the Internet, analyzing it, indexing it, aggregating it, and finally presenting it in a usable format.
Our server-to-nerd (sorry, engineer) ratio is currently over 30:1, and climbing. We use AWS for our processing servers (but that’s a whole ‘nother blog post…), and while that means there’s effectively no upper limit to our scale, getting visibility and control over all those servers is an ongoing battle. Graphite helps us identify the state of many different systems in one place, and gives us trends of all the data over time, which is priceless for getting a feel for the load on the system, its throughput, and whether there are any problems (happening or about to happen).
Graphite is flexible enough to do a lot. Regardless of the language or purpose of your application, if it has a network connection, it can talk to Graphite. Given who else uses Graphite, it’s safe to say that for most people, scale and load are not going to be an issue. Unfortunately, the cost of all that flexibility is that you need to know how to configure it correctly.
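To give a feel for how little it takes to "talk to Graphite", here is a minimal Python sketch of Carbon's plaintext protocol: one line per data point, in the form `<metric path> <value> <unix timestamp>`, sent over TCP. It assumes a Carbon listener on `localhost` at port 2003 (the usual default for the plaintext listener, but check your own config). The function names and the metric name in the usage note are made up for illustration.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of the plaintext protocol: '<path> <value> <timestamp>\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Open a TCP connection to Carbon and send a single data point.

    host and port are assumptions; adjust them to match your Carbon setup.
    """
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

With a Carbon daemon running, something like `send_metric("example.locations.processed", 42)` would record one data point. Any language that can open a socket and write a line of text can do the same.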
This post is the first in a series which will guide you through the (previously undocumented) steps of getting Graphite set up with the new Ceres back-end. The steps are broken up for ease of reference (any which aren’t links yet will be released soon):
- Using Ceres as the back-end database to Graphite: The new Ceres database is very different from the old Whisper database. The biggest change is that metrics are not pre-allocated space on disk, which means long-stored or sparse metrics take up a lot less space than before.
- Carbon Daemon Writer Setup in Megacarbon: The new back-end is required for Ceres. This involves a change to how the carbon daemons are started and how configuration files are stored, but most of the options are the same (if you’ve set up Graphite before, you’ll be familiar with them).
- Graphite on Nginx and uWSGI: My preferred stack for the Graphite web application is Nginx and uWSGI (for speed!). It was hard to find a nice step-by-step that was up to date, so here’s my version.
- Statsd for Graphite: We use Etsy’s statsd daemon to give us a few extra features like gauge metrics, and additional timer stats, with Upstart to make sure it runs when we want it to.
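As a small taste of the statsd step in the list above, here is a hedged Python sketch of what sending a gauge or a timer to a statsd daemon looks like. It relies on the statsd line format `<name>:<value>|<type>` (with `g` for gauges, `ms` for timers and `c` for counters) sent as a UDP datagram. The host, the port 8125 (statsd's usual default) and the helper names are assumptions for illustration, not our production setup.

```python
import socket

# Map of readable names to statsd type codes.
STAT_TYPES = {"counter": "c", "timer": "ms", "gauge": "g"}


def format_stat(name, value, kind):
    """Build a statsd datagram payload: '<name>:<value>|<type>'."""
    return f"{name}:{value}|{STAT_TYPES[kind]}"


def send_stat(name, value, kind, host="localhost", port=8125):
    """Fire-and-forget a single stat over UDP.

    host and port are assumptions; adjust them to match your statsd daemon.
    """
    payload = format_stat(name, value, kind).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Because it is UDP, the sender does not wait for a reply, so instrumenting an application this way adds very little overhead. The trade-off is that a datagram can be lost without the sender noticing.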
If you have any comments or feedback, please leave them in the comments or get in touch with me.