Sunday, September 7, 2014

Running ServiceStack / RavenDb API on Ubuntu

I've been playing about with Docker and Vagrant a lot recently, and I find them to be really amazing tools. With all the OS envy stemming from my little foray into Docker in particular, I'm trying to move the Sonatribe API over to run on Mono inside a Docker container.

ServiceStack runs great self-hosted on Mono, but I have been having issues with RavenDb: async calls seem to fail, and things like bulk inserts fail with this:

If anyone has any help or pointers on these I'd be massively grateful!

Wednesday, July 30, 2014

Sonatribe conversations platform - CQRS knockout - part 2: GetEventStore

Following on from part 1 (CQRS Journey), this is my implementation spike using GetEventStore, assessing which CQRS setup we should use at sonatribe for the new conversations feature.

The code can be found here.

This implementation is quite a bit different, so I'll discuss where the two branch off from each other and try to figure out the pros and cons.

Using EventStore as the persistence mechanism for storing event streams takes a lot of faff out of the process. For instance, you don't need a service bus sitting between the API, the domain service and the read model unless you want one. However, EventStore is a lower-level technology, and because of that you don't get as much out of the box as you do with CQRS Journey.

With this implementation I decided to keep the domain service in a separate process at the other end of a RabbitMQ queue. So, looking at the API, you won't see a huge amount of difference in terms of complexity (except maybe that RabbitMQ is heaps simpler to get up and running).

Moving on from there, I have a console app (as opposed to a worker role) running with TopShelf (for easy promotion to a Windows Service when required). The code for this is here: it's a really simple way of wiring up the command handlers using Castle Windsor.

Looking at the one and only command handler, you can see how easy it is to save an aggregate root into the event store using the GetEventStoreRepository. This is one area the developer is left to figure out for themselves, and finding this repository was a great help.

You can see in the AR that when I create a new conversation it fires a new ConversationStarted event, which is listened to in the read model. The wiring up for the read model you can see here: again, there is some work to do to turn the received events into something that can be worked with, and there is also some (overly simple) code to dispatch the events to the event handlers.

Inside the event handlers I am denormalizing the data to a RavenDb read model.

Aside from the added work of figuring out the event dispatcher, I find the GES architecture much, much simpler than the Azure-based CQRS Journey code. There seems to be much less ceremony, even after I reduced it drastically in my refactoring of CQRS Journey. I haven't got any perf figures for either implementation, but Greg claims GES will natively handle 20,000 messages a second on a single node, so I doubt perf is an issue for either.

Next up I'll be adding some real features using each of these infrastructures. I like to work from an emergent-design, BDD-style approach, so I'll be looking at which of these works best under a test-first dev cycle.

Monday, July 28, 2014

Sonatribe conversations platform - CQRS knockout - part 1: CQRS Journey

I'm about to start hacking away at a new part of the sonatribe API - a feature we've been calling Conversations. I've been a fan of CQRS/ES for a while now, and the fact that a conversation is naturally a stream of events makes it a very good fit for some event store goodness.

Whereas I used JOliver's EventStore in the previous iteration of sonatribe (we attempted to build the whole site using CQRS/ES at one point but struggled with infrastructure and complexity), things have moved on since then. Most notably, Microsoft have done a whole Patterns & Practices project on CQRS/ES, and there is now code available that can be used in projects like this. And perhaps more exciting is the fact that GetEventStore is out in the wild and ready for use!

Before running in and developing the platform with one or the other, I decided to take them both for a spin. Starting with the MS P&P CQRS Journey code, I put together a spike so I could assess which would be better suited to our needs.

I guess the main concerns for me are:

  • Simple dev cycle - as I mentioned above, complexity was a major time burner for us before, and I want this implementation to be drastically simpler.
  • Testable - this is a blatant must. Previously this was quite hard, and there was very little guidance here either.
  • It needs to be able to run on my laptop on a train! I do a lot of this work during my commute!
So, taking the CQRS Journey code, I hacked away and took out the bits I didn't like, the bits I didn't need, and added some bits I felt I wanted. Mainly, I don't like Unity IoC - I feel like you have to hold its hand far too much. I initially tried swapping in Funq, but it was too limited in features, so I ended up with Castle Windsor - my old-time favourite, though it might be a bit bloaty for some.

I only managed to port over the SQL bits - the Azure ServiceBus part is not working yet and probably won't be, for reasons I'll get into in my next post.

I also decided to use RavenDb as my read model - Table Storage is OK but RavenDb is absolutely nuts in terms of speed and simplicity. 

My opinion after all this: it's nice to have all the code, especially for the event store, the publishers, the receivers and so on. But there is a _lot_ of config and messing about in the original code - I tried to simplify this by registering the command and event handlers by convention, but it still feels very ceremony-heavy.

Next up I'll be taking GetEventStore for a spin - a very quick and simple dive into getting a command through an API and eventually handling the command to maintain a read model - exactly the same scenario as the CQRS Journey try-out, just using GetEventStore as my event store...

Saturday, July 19, 2014

Stop Checking in the Settings!!

Ideally, configuration hangs off the build configuration or (better still) the deployment process takes care of it. If you're using Puppet or Octopus to deploy your code then this is super easy, but having that sort of infrastructure isn't always feasible, especially on smaller projects with smaller teams. The project I'm working on at the moment is one such codebase. We share settings.xml files - there are many of them, and each developer needs to configure these files for their own environment as well as for the deployed environments, per feature. This is a problem we are very aware of, and we have scheduled tickets to take care of it, but in the meantime I'm going to look into how we can use Git to limit the pain.

I am a massive culprit for checking in settings files - everybody does it, especially the noob on the team (me)... It's really annoying for the poor person who does an update afterwards and then has to rework their settings files. So here are a couple of ways we can fix this (this is very basic Git stuff, so if you're well versed in Git you'll be well aware of the following!):

Open up a command prompt and navigate to where you keep your git repos. Mine are normally in d:\git

Let's start with a clean slate; run the following:

We've just created a brand new git repo for us to play with. Very straightforward. We don't want to work in master, so let's create a new branch:
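Those two steps might look something like the following (a sketch - the directory is a throwaway temp dir so it runs anywhere):

```shell
# work in a throwaway directory and create a brand new repo
cd "$(mktemp -d)"
git init

# we don't want to work in master, so branch straight off to dev
git checkout -b dev
```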

Now that we have the dev branch we can make our settings file.

Add some dummy dev environment config to the settings.xml file. Now we can add that to the dev branch and commit it.
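A sketch of that step (the first few lines just recreate the repo and dev branch from above so the snippet stands alone; the config content and commit message are made up):

```shell
# recreate the repo and dev branch from the previous steps
cd "$(mktemp -d)"
git init
git config user.name "Dev"; git config user.email "dev@example.com"
git checkout -b dev

# add some dummy dev-environment config and commit it to the dev branch
echo '<settings><environment>dev</environment></settings>' > settings.xml
git add settings.xml
git commit -m "Add dev environment settings"
```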

Now that that's safe and sound we can create our feature branch where we will want to do some work and manage our own config:
Put some code in test.cs so we can commit that into the wayne branch:

Now let's customize the settings.xml file:

Make the dev-specific settings feature-specific so you'll know whether this works when we come to do the switcheroo. Now we want to commit the feature-specific settings into the feature branch:
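Sketched out, the feature-branch steps look like this (again the opening lines recreate the earlier state so the snippet stands alone; file contents and commit messages are illustrative):

```shell
# recreate the dev branch with its settings commit
cd "$(mktemp -d)"
git init
git config user.name "Dev"; git config user.email "dev@example.com"
git checkout -b dev
echo '<settings><environment>dev</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Add dev environment settings"

# branch off dev for the feature work and commit some code
git checkout -b wayne
echo '// feature work' > test.cs
git add test.cs; git commit -m "Add test.cs"

# make the settings feature-specific, in a commit all of its own
# (keeping it separate is what lets us revert it cleanly later)
echo '<settings><environment>wayne</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Feature-specific settings"
```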
Now we'll continue to do some more work:
And commit that into the feature branch
OK, so now we have a dev branch with some settings in; we branched from dev to create a feature branch, did some work in that branch, configured the settings file to make it specific to that branch and then did even more work - big day in the office. What we want to do now is merge the work in the feature branch into the dev branch - but we definitely do not want the settings from the feature branch to show up in dev; that's just going to wind everyone up.
To do this we can revert the commit we made to push the settings into the feature branch. We can check the log to find the commit and revert it. Run the following:

This will yield something like the following (type q to exit the log):
As you can see, the commit with ID "965f41809225abbe2ed2b74d6e162e27c9c18a72" is the one that added the settings, so to revert it we can issue the following command:
And then we can safely merge our feature branch into dev:
Open up settings.xml and verify it's as expected - dev settings, not feature settings.
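The whole revert-then-merge dance can be sketched end to end like this (the first block recreates the repo state from the previous steps so it runs standalone; your commit ID will differ from mine, so here it's pulled out of the log rather than hard-coded):

```shell
# recreate the repo state from the previous steps
cd "$(mktemp -d)"
git init
git config user.name "Dev"; git config user.email "dev@example.com"
git checkout -b dev
echo '<settings><environment>dev</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Add dev environment settings"
git checkout -b wayne
echo '// feature work' > test.cs
git add test.cs; git commit -m "Add test.cs"
echo '<settings><environment>wayne</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Feature-specific settings"
echo '// more feature work' > more.cs
git add more.cs; git commit -m "More feature work"

# check the log for the commit that introduced the feature settings
git log --oneline
# revert it (grabbing the ID from the log instead of typing it in)
SETTINGS_SHA=$(git log --format="%H %s" | grep "Feature-specific settings" | cut -d' ' -f1)
git revert --no-edit "$SETTINGS_SHA"

# now the feature branch can be merged into dev
# without dragging the feature settings along
git checkout dev
git merge wayne

# settings.xml should show dev settings, not feature settings
cat settings.xml
```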
So that was one way of making sure we don't merge stuff we don't want to merge. If we want to do some more work in the feature branch and get the settings back, we can use the same approach as above to undo the commit made by git revert - git simply creates a compensating commit to undo the one being reverted, so we can just revert the commit created by the compensating action (which will create yet another compensating commit - git is just event sourcing after all!). Another approach is to cherry-pick commits from the feature branch. This applies a single commit to the branch you are merging into. Here is the general gist of it:
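A minimal cherry-pick sketch (setup lines recreate the feature branch from the earlier steps so it runs standalone; file contents and commit messages are illustrative):

```shell
# recreate a dev branch and a feature branch with two commits:
# one we want in dev (the code) and one we don't (the settings)
cd "$(mktemp -d)"
git init
git config user.name "Dev"; git config user.email "dev@example.com"
git checkout -b dev
echo '<settings><environment>dev</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Add dev environment settings"
git checkout -b wayne
echo '// feature work' > test.cs
git add test.cs; git commit -m "Add test.cs"
echo '<settings><environment>wayne</environment></settings>' > settings.xml
git add settings.xml; git commit -m "Feature-specific settings"

# pick just the work commit onto dev, leaving the settings commit behind
WORK_SHA=$(git log --format="%H %s" | grep "Add test.cs" | cut -d' ' -f1)
git checkout dev
git cherry-pick "$WORK_SHA"
```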

Git is powerful stuff - cherry-picking is basically the converse of the first approach, and it shows how useful doing small, discrete commits can be. In general there are only a small handful of git commands that I use frequently - it's ridiculous how much you can do with a small number of commands. But there are some extremely useful, more "advanced" features which really bring Git into its own. I'm still learning Git, but one thing I have found is that using it properly really pays off, and there is always a sane solution to whatever insane mess I find myself in!

Sunday, May 25, 2014

Regression testing ServiceStack services with RavenDB embedded

One of the great things about using RavenDB and ServiceStack on Sonatribe is the ability to spin the whole system up inside a unit test. ServiceStack offers AppHostHttpListenerBase, which can run in-process, and RavenDB offers the embedded database, which can run in-memory. Because these two run in-process and in-memory, regression testing Sonatribe's REST services is simple and fast - almost as fast as unit testing!

To be able to run regression tests like the following:

I use the AppHostHttpListenerBase in my BaseTest class, which is roughly the same as the following:

Now, when I inherit from SelfHostedBaseTest, each test spins up a new host and a new in-memory RavenDB instance. This means I have a clean slate for each test - ideal for running everything in isolation. Not only that, but I am also testing my services on the exact same stack they run on in production!

Saturday, May 24, 2014

CQRS without CQRS and RavenDB

Following on from my previous post about sonatribe, one of the hard decisions I had to make on this iteration was refactoring out the CQRS code to simplify the dev cycle and improve velocity. I'm a big proponent of CQRS - it answers a lot of questions in traditional dev, and while there is some upfront complexity, it is far outweighed by avoiding the accidental complexity apparent in traditional software projects.

One of my favorite features of a CQRS system is the denormalized read layer, where the data stored in the read store maps directly to the DTOs consumed by the UI. When I refactored the code, I really wanted to keep this sort of functionality. Denormalized read models have been around a long, long time - SQL Server has views, where the developer can specify some SQL, normally making use of JOIN and UNION, to present a readily available denormalized view over the data.

In RavenDb we can achieve this using TransformResults. In a document database you model your data differently - rather than joins and lookup tables, you can store reference data inside the main document (aggregate root). A good rule of thumb: if you ever need to load part of the data alone, independent of other data, then that data is a document of its own, with its own ID. Documents can still reference other documents, however, using the Denormalized Reference pattern. Denormalized references can become a chore when the referenced data gets updated - in that case you will need to PATCH the referencing documents to reflect any changes.

One of the cases where we have used TransformResults is to present the event profile (a festival's main page on the website) - in order to cut down the requests needed to gather the data for this view (tags, images, users who are going, campsites, lineup info etc.) we can aggregate all of that information in one TransformResult:

While this looks quite the beast, it's actually quite simple - we're pre-compiling all of the associated data for each event. This means we can pull all of the JSON needed for an event profile in a single call!

It's ridiculously simple, and without caching turned on we can return an event profile from the REST API in less than 100 milliseconds - with caching turned on and tuned we can smash that right down, but that kind of optimization comes much later.

Now, this isn't _really_ anything like CQRS - but it gives me a good-enough-for-now alternative. At some point I will be bringing CQRS back into sonatribe, most likely using Greg Young's GetEventStore - previously I used JOliver's EventStore and CommonDomain (which, to my understanding, is used in Greg Young's implementation). But for now I'm happy with this setup and its simplicity.

Sonatribe technical bits and pieces

The sonatribe project makes use of a broad range of technology, from PHP to Microsoft Azure and a lot in between. We host the main site on Ubuntu, but the backend is hosted on Azure and uses ServiceStack to build the REST API. The main components of the project are:

  • PHP UI - running on an Azure Ubuntu VM
  • ServiceStack REST API - running on an Azure Windows VM
  • RavenDB database
  • EXTJS admin application
  • 2 x Azure worker roles used to process event and artist information
  • Azure ServiceBus
  • Azure Cache
  • Azure CDN
  • SignalR backed chat server
  • Xamarin iOS app
  • Xamarin Android app
  • Facebook canvas
Sonatribe has always been developed in our (stu, chris and myself) spare time - so it's no wonder it's taken ~3 years to develop and get the alpha out the door! The initial build of sonatribe (the site is currently on its third rebuild!) used a CQRS backend, and although I maintain this is the best architecture for a project like sonatribe, the complexity in the development/test cycle slowed progress too much. For such a small team trying to get a working "something" out the door, I had to make the hard decision to refactor the code and simplify it.

Both the main website (PHP UI) and the Android & iOS apps use the same API for authentication and data. This is a great design as it means all of the business logic stays in one place - and when you're building a system that relies on multi-axis permissions/roles with a range of business logic and views over the data, that's a big advantage.

The Extjs admin app allows us to import data from clashfinder. The JSON coming from clashfinder is very basic and only specifies the act name, stage and time of the set. In order to build on that data - to be able to provide Spotify playlists, artist bios, images etc. - we have an import pipeline which also allows us to update the data (quite complex, as the clashfinder data has no ID information). Importing information for each act is quite a long-running process: we gather as much information as we can about each artist, download images relevant to each artist and generate renditions for use in the site. Because of this, the import follows this flow (ignoring clashfinder updates - that's another story!):
  • Initial import fired from admin app
  • The REST service picks up the import ID and pulls the data from clashfinder
  • The JSON is deserialized into POCO
  • A RavenDb bulk insert operation is started
  • The locations (stages) are pulled out of the data and are added to our database
  • For each of the artist sets we create a new listing event - a simple document which specifies the act name, stage, start and end
  • The bulk insert is committed to the DB - this process can take ~ 1 sec for a large festival with ~2000 acts and ~190 stages
  • For each listing event that was created we push a message onto the Azure ServiceBus
  • The worker process at the end of this queue picks up each message and processes it asynchronously.
  • We check to see if we can find any information about the act by name - we check Spotify, MusicBrainz, Wikipedia etc.
  • If we can find the artist we download their spotify track Ids, artist bio, images etc and create a new artist in our DB linking all of these assets.
  • The artist is added to the listing event and saved
  • If we can't find the artist by the act name alone, we try to break the name down, searching for substrings - perhaps the act is a collaboration.
  • For each artist we find in the act name we add them to the listing event and save it away
The admin site has its own schedule designer using the Bryntum scheduler, which we can use to create and edit lineups. The preferred method at the moment is clashfinder imports, due to the communal collaboration that goes into creating them.

Using Xamarin to develop the Android and iOS apps means we can share a lot of the code base between the apps, as well as make use of SignalR to provide chat functionality. ServiceStack also provides a great PCL client - it's a no-brainer. The apps are currently in the early stages of development - at the moment you can log in and view festivals and their lineups. We aim to have very basic implementations out in the app stores before Glasto. The apps will be free :)

We've recently released the first alpha of sonatribe, and while there are a few rough edges we're very pleased with the result. The platform differs from anything out there at the moment because of the social aspect of the site, as well as the fact that we will be offering native Android and iOS apps. While the alpha is a very bare-bones (and in some places quite clumsy) implementation, we have a great platform to build on. This year our aim is to stabilize the platform, introduce the mobile apps and react to feedback. Our next big undertaking is tackling conversations, to build on the social aspect of the site. We're planning on swerving the traditional "forum software" implementation and intend to build a conversation platform from the ground up, based on the usage of our demographic - something we can grow and improve upon that will be suitable for the next 10 years, not extinct 10 years ago. We've always found forums to be too 1990s, and we've always said Facebook waters down the semantics too much for something like this (the whole reason sonatribe exists).

We've got a great set of features and a lot of work has gone into getting to where we are, we're listening to user feedback and looking forward to the next set of great features. There's a hell of a lot left to do though!