Thursday, October 24, 2013

Installing Solr in Apache Tomcat on Windows

I've been messing about with Solr a lot recently and, although the finished setup is great at what it does, getting it set up in the first place is a complete nause - especially on Windows. Anything to do with Java is going to be an XML configuration nightmare, so to help myself in the future, and whoever else stumbles into the same issues I had, I'll document the process here.

First up, make sure you have the latest version of the Java SDK (JDK) installed. Grab it from Oracle and do a standard install: http://www.oracle.com/technetwork/java/javase/downloads/index.html

Next up, you want to get Apache Tomcat installed as a Windows service. I installed the latest available at the time of writing (Tomcat 8.0.0-RC5 (alpha)) - grab that from here: http://tomcat.apache.org/download-80.cgi and follow the instructions.

I'm not going to go into the details of installing the packages above as they are standard installations and you should accept the defaults in most cases.

Now that we have Tomcat installed and running, we can use it as the container to host the Solr instance. Grab the latest build of Solr from their website (http://lucene.apache.org/solr/). This is where the install gets a little interesting. Getting Solr running for your requirements will mean altering a lot of XML config files. It's obviously very handy to have those XML files under source control, so keeping the config files inside the default locations within Solr is not going to work. In this setup we'll point Solr to a home directory other than the default so we can keep the config files under Git's control.

Once Solr has downloaded, unzip it, grab the Solr***.war file (where *** is the version number) from the solr-4.5.0\solr-4.5.0\dist folder and place it inside Tomcat's lib folder (C:\Program Files\Apache Software Foundation\Tomcat 8.0\lib) - rename it so that it is called Solr.war. Now grab all of the jar files from the solr-4.5.0\solr-4.5.0\dist\solrj-lib and solr-4.5.0\solr-4.5.0\dist folders and place them inside the same lib folder in Tomcat. It might not be necessary to install all of the jar files, but adding them incrementally by deciphering Solr's exceptions takes an age and I doubt the extras have any negative impact.

As I mentioned before, we want to store the config outside of the normal Solr installation so that it can live in the main repo folder along with whatever other supporting code you may have. To do this we need to tell Tomcat where to find our Solr home, which means placing some XML inside Tomcat's conf directory (mine is at C:\Program Files\Apache Software Foundation\Tomcat 8.0\conf\Catalina\localhost). Here is an example which points Solr to c:\solr:
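
A minimal sketch of such a context file. I'm assuming it is saved as solr.xml in that conf\Catalina\localhost directory (Tomcat takes the context path, /solr, from the file name) and that Solr.war sits in Tomcat's lib folder as described above. Adjust both paths if your layout differs.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Assumed file name: solr.xml, so that the webapp is served under /solr -->
<!-- docBase points at the renamed war copied into Tomcat's lib folder -->
<Context docBase="C:\Program Files\Apache Software Foundation\Tomcat 8.0\lib\Solr.war">
  <!-- JNDI entry that Solr reads to locate its home directory -->
  <Environment name="solr/home"
               type="java.lang.String"
               value="c:\solr"
               override="true"/>
</Context>
```

The solr/home environment entry is what moves the config out of the Solr install: Solr looks it up via JNDI on startup and reads solr.xml and the core folders from whatever directory it names.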



Now restart the Tomcat service. If everything has gone well, navigating to http://localhost:8080 should bring up the Tomcat homepage. Navigating to http://localhost:8080/solr/ will bring up an error complaining about collection1 not being available. This is to be expected as we haven't given Solr the core config yet.

At this point, if you're seeing other issues, you're going to need to start digging through Tomcat's logs directory to debug them. If you have reached the same point as me then you can carry on with the installation as follows.

Inside the Solr download folder there is an example installation. Grab the solr folder from inside the example folder (\solr-4.5.0\solr-4.5.0\example\) and copy all of its files to c:\solr - or wherever you have decided your config home directory is. You'll need to open solr.xml and replace the contents with this:
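
A minimal sketch of that solr.xml, written in the legacy format that Solr 4.x still accepts. The only things it needs to agree with are the core name and folder, test-solr, used in the next step. The persistent and adminPath values are the usual ones for this format and are my assumption here. Note this is Solr's own solr.xml inside c:\solr, not a Tomcat context file.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Legacy-style solr.xml: cores are listed explicitly -->
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="test-solr">
    <!-- instanceDir is relative to the Solr home (c:\solr in this setup) -->
    <core name="test-solr" instanceDir="test-solr" />
  </cores>
</solr>
```

With this in place Solr expects a c:\solr\test-solr folder containing the core's conf directory, which is why the example's collection1 folder gets renamed.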


Rename the collection1 folder to test-solr, as that is what is specified in the solr.xml. Restart Tomcat and reload http://localhost:8080/solr. You should now see Solr's homepage (albeit with a warning like: Error Instantiating SearchComponent, solr.clustering.ClusteringComponent failed to instantiate)!!

The clustering component is a contrib to the Solr project. For this example I have no need to go into getting that working so I'll just remove it from the config. Edit the solrconfig.xml file to look like this: https://gist.github.com/wayne-o/7140242 (linked, not previewed, as it's huge).

Now restart Tomcat and refresh the Solr homepage. You should now be able to select the new test-solr core in the UI!

As a final note - unless you want to keep the Solr index under version control (you don't), I would add the data folder to the .gitignore file - or svn ignore it - whatever source control you use.