Symfony2: Service Container Compiler Passes

In this post I am going to look at compiler passes. This is not something that you will often need to worry about when making an app with Symfony2. They do come in useful for some needs though and in particular for releasable bundles.

What are Compiler Passes

The lifecycle of the Symfony2 service container is roughly along these lines. The configuration is loaded from the various bundles' configurations, and the app level config files are processed by the Extension classes in each bundle. Various compiler passes are then run against this configuration before it is all cached to file. This cached configuration is then used on subsequent requests. To read more about the loading of config files by a bundle’s extension class and processing the configuration see my posts Symfony2: Controller as Service and Symfony2: Writing a Dependency Injection Extension Configuration.

The compilation of the configuration is itself done first using a compiler pass. Further compiler passes are then used for various tasks to optimise the configuration before it is cached. For example, private services and abstract services are removed, and aliases are resolved.

Many of these compiler passes are part of the Dependency Injection component but individual bundles can register their own compiler passes. A common use is to inject tagged services into that bundle’s services. This functionality allows for services to be defined outside of a bundle’s config but still be used by that bundle. The bundle is not aware of these services in its own static config, and a fresh container is passed to the Extension class, meaning it does not have access to any other services. Compiler passes, which are executed after all the configuration has been loaded, are therefore necessary to inject tagged services. By using a compiler pass you know that all the other service config files have already been processed.

Creating a Compiler Pass

To create a compiler pass, write a class that implements the Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface interface. The interface has a single process method, which is passed the ContainerBuilder.
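A minimal sketch of such a class might look like this. The Acme\DemoBundle namespace and the CustomCompilerPass name are placeholders for illustration only, not anything from a real bundle:

```php
<?php

namespace Acme\DemoBundle\DependencyInjection\Compiler;

use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;

// Hypothetical pass: the namespace and class name are placeholders.
class CustomCompilerPass implements CompilerPassInterface
{
    public function process(ContainerBuilder $container)
    {
        // All bundle configuration has been loaded by this point, so the
        // service definitions can be inspected or modified here.
    }
}
```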

It is standard practice to put the compiler passes in the DependencyInjection/Compiler folder of a bundle. They are not automatically registered though. To add the pass to the container, override the build method of the bundle definition class:
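As a sketch, assuming a hypothetical AcmeDemoBundle with a pass class called CustomCompilerPass (both names are placeholders), the registration could look like this:

```php
<?php

namespace Acme\DemoBundle;

use Acme\DemoBundle\DependencyInjection\Compiler\CustomCompilerPass;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\Bundle\Bundle;

// Hypothetical bundle class: only the build() override matters here.
class AcmeDemoBundle extends Bundle
{
    public function build(ContainerBuilder $container)
    {
        parent::build($container);

        // Register the pass so it runs when the container is compiled.
        $container->addCompilerPass(new CustomCompilerPass());
    }
}
```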

Uses for Compiler Passes

You can implement tags for your own services to allow other people to create services to be injected into your bundle. This of course only makes sense for shared bundles which are released for other people to use. For bundles only used in a single application you have full control over the config and can just inject the services there. There is a cookbook article in the Symfony docs on writing a compiler pass that makes use of tags.
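To give an idea of what a tag-based pass involves, here is a rough sketch. The acme_demo.transport_chain service id, the acme_demo.transport tag and the addTransport method are all invented for the example; only the ContainerBuilder and Definition calls are part of the component:

```php
<?php

namespace Acme\DemoBundle\DependencyInjection\Compiler;

use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Reference;

// Hypothetical pass: service id, tag name and method name are assumptions.
class TransportPass implements CompilerPassInterface
{
    public function process(ContainerBuilder $container)
    {
        // Do nothing if the bundle's own collecting service is not defined.
        if (!$container->hasDefinition('acme_demo.transport_chain')) {
            return;
        }

        $chain = $container->getDefinition('acme_demo.transport_chain');

        // Keys are the ids of every service carrying the tag.
        foreach ($container->findTaggedServiceIds('acme_demo.transport') as $id => $attributes) {
            // Have the container call addTransport() with each tagged service.
            $chain->addMethodCall('addTransport', array(new Reference($id)));
        }
    }
}
```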

Basically a compiler pass gives you an opportunity to manipulate the service definitions that have been compiled. This is why it is not something needed in everyday use: in most cases the service definitions in the config can simply be changed instead.

There are other uses such as providing more complicated checks on the configuration than is possible during configuration processing. For example, the configuration builder does not provide a way of checking that a value is required only when a certain service is present. The following example from AsseticBundle shows checking if a parameter has been set in the container if a particular service has been defined:
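The sketch below is modelled on AsseticBundle's check for the YUI CSS filter, but it is written from memory rather than copied from the bundle, so the service id, the parameter name and the exception message should all be treated as assumptions. The Acme namespace is a placeholder:

```php
<?php

namespace Acme\DemoBundle\DependencyInjection\Compiler;

use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;

// Sketch only: ids and parameter names are assumed, not verified against the bundle.
class CheckYuiFilterPass implements CompilerPassInterface
{
    public function process(ContainerBuilder $container)
    {
        // The jar parameter is only required when the filter service is defined.
        if ($container->hasDefinition('assetic.filter.yui_css')
            && !$container->getParameter('assetic.filter.yui_css.jar')
        ) {
            throw new \RuntimeException(
                'The "assetic.filters.yui_css" configuration requires a "jar" value.'
            );
        }
    }
}
```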

In my next post on this topic I will look at how to manipulate service container definitions from within a compiler pass.

Symfony2: elasticsearch autocomplete queries

This is the fifth in a series of posts about using elasticsearch with Symfony2:

  1. Symfony2: Integrating elasticsearch
  2. Symfony2: improving elasticsearch results
  3. Symfony2: elasticsearch custom analyzers
  4. Symfony2: elasticsearch custom repositories
  5. Symfony2: elasticsearch autocomplete queries

In this post I will take a quick look at how we can create another query for the indexed site entities to provide results for an autocomplete field. I am not going to look at the actual implementation of the autocomplete box or even the formatting of the response since that will depend on the autocomplete implementation.

At the moment the sort of results we are retrieving from the search index will not be suitable for use with autocomplete. For a start, whilst we have used stemming we will not get results when searching for the first part of a word in the name. For example, if we have added a site with Lime Thinking as the name then it will not be returned for a search for L, Li, Lim etc as we will need for autocomplete.

Fortunately this is very easy to remedy with elasticsearch by making our query a text_phrase_prefix query. We can also simplify the query if we assume we are only searching on the name field. So let’s make another method in the custom repository for our new query and add the type to the query object:
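A sketch of such a repository method, assuming the pre-namespace Elastica classes used elsewhere in this series. The findByPartialName method name and the SearchRepository namespace are my own placeholders:

```php
<?php

namespace LimeThinking\ExampleBundle\SearchRepository;

use FOQ\ElasticaBundle\Repository;

class SiteRepository extends Repository
{
    // Hypothetical method name for the autocomplete query.
    public function findByPartialName($searchTerm)
    {
        $nameQuery = new \Elastica_Query_Text();
        $nameQuery->setFieldQuery('name', $searchTerm);
        // Setting the type turns the text query into a phrase prefix query.
        $nameQuery->setFieldParam('name', 'type', 'phrase_prefix');

        return $this->find($nameQuery);
    }
}
```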

This will now return the Lime Thinking site for searches for L, Li, Lim etc as we needed.

It is worth being aware of how this query is processed so that the results make sense. If we search for multiple words then this phrase is not split up but searched for whole. So a search for lime thi will return the Lime Thinking site but a search for lim thi will not find it. The search phrase does not need to match the start of the name though, so a search for thi will return the Lime Thinking entity.

Additionally if we were also to index a site with the name Lime Pickle then we would get both sites for lim but only Lime Thinking on a search for lime thin which differs from the search from previous posts where both would be found for a search for lime thin.

Symfony2: elasticsearch custom repositories

This is the fourth in a series of posts about using elasticsearch with Symfony2:

  1. Symfony2: Integrating elasticsearch
  2. Symfony2: improving elasticsearch results
  3. Symfony2: elasticsearch custom analyzers
  4. Symfony2: elasticsearch custom repositories

A new feature of the FOQElasticaBundle is the provision of repositories similar to those for Doctrine queries. The first advantage of this is that instead of having to use the finder service specific to a particular entity you can use the same manager service for all mapped entities and use the entity class name to get a repository to run queries against. So using the query we have built up in the previous posts:
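Roughly, the controller action might look like the sketch below. For brevity it only builds the name part of the query. The controller class, the action name and the template name are placeholders, and I am assuming the manager service id is foq_elastica.manager:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class SiteController extends Controller
{
    public function searchAction($searchTerm)
    {
        // Shortened query: only the name field is searched in this sketch.
        $nameQuery = new \Elastica_Query_Text();
        $nameQuery->setFieldQuery('name', $searchTerm);
        $nameQuery->setFieldParam('name', 'analyzer', 'snowball');

        // One manager service gives access to a repository per mapped entity.
        $repositoryManager = $this->get('foq_elastica.manager');
        $repository = $repositoryManager->getRepository('ExampleBundle:Site');
        $sites = $repository->find($nameQuery);

        // Template name is a placeholder.
        return $this->render('ExampleBundle:Site:search.html.twig', array('sites' => $sites));
    }
}
```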

Note that the use of the short syntax ExampleBundle:Site is only available in master. If you are using the branch compatible with 2.0.x Symfony releases then you will need to use the fully qualified class name e.g. LimeThinking\ExampleBundle\Entity\Site.

A much bigger advantage of using this functionality is that we can create a custom repository to encapsulate our query which will clean up our controller method and make it easy to reuse elsewhere. So our custom repository looks like this:
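Something along these lines, combining the name, keywords and url queries from the earlier posts. The findBySearchTerm method name and the SearchRepository namespace are placeholders of mine:

```php
<?php

namespace LimeThinking\ExampleBundle\SearchRepository;

use FOQ\ElasticaBundle\Repository;

class SiteRepository extends Repository
{
    // Hypothetical method name wrapping the query from the previous posts.
    public function findBySearchTerm($searchTerm)
    {
        $boolQuery = new \Elastica_Query_Bool();

        // Each field is searched with the analyzer it was indexed with.
        $analyzers = array(
            'name'     => 'snowball',
            'keywords' => 'snowball',
            'url'      => 'url_analyzer',
        );

        foreach ($analyzers as $field => $analyzer) {
            $fieldQuery = new \Elastica_Query_Text();
            $fieldQuery->setFieldQuery($field, $searchTerm);
            $fieldQuery->setFieldParam($field, 'analyzer', $analyzer);
            $boolQuery->addShould($fieldQuery);
        }

        // find() is provided by the base repository and uses the injected finder.
        return $this->find($boolQuery);
    }
}
```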

The custom repository must extend the base repository in order to be able to use the relevant finder service, which is automatically injected in. We also need to specify the custom repository class. In master this can be done using an annotation:
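As far as I recall the annotation lives in the bundle's Configuration namespace and is used as sketched below. Treat the exact annotation class as an assumption to check against the bundle's README. The repository class name is the placeholder used for illustration here:

```php
<?php

namespace LimeThinking\ExampleBundle\Entity;

use Doctrine\ORM\Mapping as ORM;
use FOQ\ElasticaBundle\Configuration\Search;

/**
 * @ORM\Entity
 * @Search(repositoryClass="LimeThinking\ExampleBundle\SearchRepository\SiteRepository")
 */
class Site
{
    // Fields and accessors as before.
}
```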

For the 2.0 branch we need to specify it in our config, which now looks like this:

Our controller method can now be simplified to this:
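A sketch of the slimmed-down action, assuming the custom repository exposes a method called findBySearchTerm (a placeholder name) and that the manager service id is foq_elastica.manager:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class SiteController extends Controller
{
    public function searchAction($searchTerm)
    {
        $repository = $this->get('foq_elastica.manager')->getRepository('ExampleBundle:Site');

        // The query details now live in the custom repository.
        $sites = $repository->findBySearchTerm($searchTerm);

        // Template name is a placeholder.
        return $this->render('ExampleBundle:Site:search.html.twig', array('sites' => $sites));
    }
}
```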

The controller method is now much cleaner with the implementation of the query moved to the repository, and the query can now also easily be reused elsewhere.

Symfony2: elasticsearch custom analyzers

In my previous posts I looked at integrating elasticsearch into a Symfony2 app and at how to use an alternative analyzer.

One thing we did not do last time was indexing the url field of the Site entity. The reason for this is that if you index urls and email addresses using the default settings they will not be split up for indexing, meaning that you cannot search on part of them. For example if we have indexed http://www.limethinking.co.uk then a search for limethinking will not return the indexed document.

The reason for this is the way that the strings are analyzed to decide how to index them. An analyzer is a combination of a tokenizer and filters. The tokenizer decides how to split up the string to be indexed into individual tokens, and the filters are then applied to the tokens before they are indexed. The default Standard Analyzer uses the Standard Tokenizer which, using language specific rules, splits up the string using whitespace and punctuation. It has the Lowercase Token filter, which lower cases the token to avoid search being case dependent, the Stop Token filter, which removes a specified set of words so that words like and, or etc. are not indexed, and the Standard Token filter.

Unfortunately for us this does not work very well for the urls we want to index because the standard tokenizer only splits on dots that are followed by whitespace, so urls are not split up but indexed whole. Fortunately for us it is easy to change the analyzer used in the config:

So we are defining a custom analyzer under the settings section, using the name url_analyzer. Under the site type we have added url to the mappings, specifying that it should use this analyzer. The analyzer uses the Lowercase Tokenizer, which splits on any non-letter symbol as well as lower casing the resulting tokens. We are also using a couple of filters with this: a custom stop filter which removes http and https so they are not matched in searches, as well as the standard stop filter. By removing http and https from the index for urls we can still get meaningful results if we search for these terms and have added sites with http or https in the title.

We now need to add searching the url field to our query from the previous post:
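As a sketch, the extra query can be added to the existing boolean query like this. The addUrlQuery function is just a wrapper I have invented to keep the example self-contained:

```php
<?php

// Hypothetical helper: adds a url search to an existing boolean query.
function addUrlQuery(\Elastica_Query_Bool $boolQuery, $searchTerm)
{
    $urlQuery = new \Elastica_Query_Text();
    $urlQuery->setFieldQuery('url', $searchTerm);
    // Analyze the search term the same way the url field was indexed.
    $urlQuery->setFieldParam('url', 'analyzer', 'url_analyzer');

    $boolQuery->addShould($urlQuery);

    return $boolQuery;
}
```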

This adds another Text query searching the url field using the custom url_analyzer. Now all three fields of the Site entity are being searched on.

Symfony2: improving elasticsearch results

In my previous post I looked at integrating elasticsearch into a Symfony2 app using Elastica and the FOQElasticaBundle. By the end we were indexing a Site entity and performing basic searches against the index. In this post I will look at improving how we index and search the Site entities.

We can improve the indexing of the name and keywords by switching to a different analyzer. Currently we are only going to find whole word matches. For example, if we index Lime Thinking as a site name then it will be found by a search for thinking, but not think or thinks. We can change this by instead using the snowball analyzer. This is a built in analyzer which is the same as the standard analyzer but with the addition of the snowball filter, which stems the tokens. This means that words are indexed as their stems, so thinking will be indexed as think. We can then find it with searches for words such as think, thinks and thinkings. I will have a more detailed look at analyzers and filters in a future post.

We just need to make a small config change to start using this analyzer for indexing:

We need to make some further changes though to get the benefits of this. We also need to make sure that the search terms are analyzed with the same analyzer as the indexed field. If this does not happen we will only get matches if we search for the stemmed token e.g. think will find Lime Thinking but thinking will not. Our simple query does not specify which field we are searching, which means it searches the built in _all field which, unsurprisingly, contains all the fields. This means we cannot use different analyzers for searching different fields. We are going to want to add the url at some point using a different analyzer so we need to specify each field we want to search separately.

So we now need to split up our query into several parts. For this we need to use Elastica’s query builder objects. To search on a specific field we can use a Text query, so to search on the name field we use:
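A sketch of this, using the pre-namespace Elastica class names from the time. The searchSitesByName function is only a wrapper for the example, and $finder stands for the finder service from the previous post:

```php
<?php

// Hypothetical wrapper: $finder is the bundle's finder service for the site type.
function searchSitesByName($finder, $searchTerm)
{
    $nameQuery = new \Elastica_Query_Text();
    // Search only the name field rather than the built in _all field.
    $nameQuery->setFieldQuery('name', $searchTerm);

    return $finder->find($nameQuery);
}
```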

Notice that we pass the query object into the same method on the finder as before. This method accepts both simple search strings and queries built through objects. According to the elasticsearch documentation the analyzer will default to the field specific analyzer or the default one, which to me suggests that the above query will automatically use the analyzer set for the field. However this does not work for me. Fortunately it is easy to specify the analyzer to use for the field:
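Roughly, this is one extra call on the query object. The buildNameQuery function name is a placeholder:

```php
<?php

// Hypothetical helper returning the name query with an explicit analyzer.
function buildNameQuery($searchTerm)
{
    $nameQuery = new \Elastica_Query_Text();
    $nameQuery->setFieldQuery('name', $searchTerm);
    // Analyze the search term with the same analyzer used for indexing.
    $nameQuery->setFieldParam('name', 'analyzer', 'snowball');

    return $nameQuery;
}
```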

Our current query will of course only search the name field, what we want to do is search the name field and the keywords field using the snowball analyzer. This is done by creating another query as above for the keywords field and then using a boolean query to combine the two individual queries into one query:
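A sketch of the combined query, again with a placeholder function name. Using addShould means a site is returned if either field matches:

```php
<?php

// Hypothetical helper combining the name and keywords queries.
function buildNameAndKeywordsQuery($searchTerm)
{
    $nameQuery = new \Elastica_Query_Text();
    $nameQuery->setFieldQuery('name', $searchTerm);
    $nameQuery->setFieldParam('name', 'analyzer', 'snowball');

    $keywordsQuery = new \Elastica_Query_Text();
    $keywordsQuery->setFieldQuery('keywords', $searchTerm);
    $keywordsQuery->setFieldParam('keywords', 'analyzer', 'snowball');

    // A document matching either sub-query is included in the results.
    $boolQuery = new \Elastica_Query_Bool();
    $boolQuery->addShould($nameQuery);
    $boolQuery->addShould($keywordsQuery);

    return $boolQuery;
}
```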

Whilst this looks complicated, each constituent part is simple and this is a good way to build more complicated queries.

A really helpful recent inclusion to the bundle is logging to the web profiler toolbar so you can see the parsed JSON query that is sent to elasticsearch. The combined query from above looks like this:

We have seen the Text query and the Boolean query here, but these are just two of the available query types. There is more information on each in the elasticsearch documentation. There is little in the way of documentation for the Elastica objects for creating these query types but the test suite provides quite a lot of examples of putting them to use.

Symfony2: Integrating elasticsearch

Over a short series of posts I am going to have a look at using elasticsearch with Symfony2.

Elasticsearch is built on top of Lucene and indexes data as JSON documents in a similar way to the way MongoDB stores data. This means as with Mongo that it is schemaless and creates fields on the fly. It is queried over HTTP using queries which are themselves defined in JSON. I am not going to go into details about using elasticsearch in this way, there is plenty of information in its online documentation.

Reading through the documentation makes it look as though there is a steep learning curve to getting started with elasticsearch. What I want to do is look at how you can avoid having to deal with issuing JSON queries over HTTP from a Symfony2 app and actually get started using elasticsearch in a very simple way. This is possible by using Elastica, a PHP library which abstracts the details of the queries, along with the FOQElasticaBundle which integrates Elastica into Symfony2 applications. This is not just a basic wrapper though to make Elastica into a Symfony2 service, the integration with Doctrine to make indexing of ORM entities or ODM documents is fantastic and what I am going to look at here.

To get started you need to install elasticsearch itself, as well as installing Elastica and the FOQElasticaBundle in the usual way.

As an example of how easy the integration is I will look at a very basic application for bookmarking sites and searching for them. For simplicity’s sake we are just going to have a single entity to model each site, it is just a name, the URL and some keywords stored as a comma separated list. So here it is as a Doctrine entity class:
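A sketch of what such an entity could look like. The namespace matches the class name mentioned in the later posts, but the column definitions are my own guesses:

```php
<?php

namespace LimeThinking\ExampleBundle\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity
 */
class Site
{
    /**
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    protected $id;

    /** @ORM\Column(type="string", length=255) */
    protected $name;

    /** @ORM\Column(type="string", length=255) */
    protected $url;

    /** @ORM\Column(type="string", length=255) Comma separated list. */
    protected $keywords;

    public function getId() { return $this->id; }

    public function getName() { return $this->name; }
    public function setName($name) { $this->name = $name; }

    public function getUrl() { return $this->url; }
    public function setUrl($url) { $this->url = $url; }

    public function getKeywords() { return $this->keywords; }
    public function setKeywords($keywords) { $this->keywords = $keywords; }
}
```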

We can then set up the bundle to index the fields of our entity. By choosing to use the integration with doctrine we can make this very simple:

Whilst there are quite a few settings here it is fairly straight forward. The client just sets the port to use for the http communication. The bookmarks setting under indexes is the name of the index we will create. Within each index you can have types for each of your entity types, we just have the one type (site) here at the moment.

We have specified that we are using the ORM, the entity class and which fields to map, for now just the name and keywords (I will return to indexing in the url in my next post). That is enough to get any existing Sites stored in the database into the search index. Running the following console command will do this:

It is as easy as that! All the sites already stored in the database are now indexed without the need for even writing any code, just a small amount of configuration. Great as that is, it would be even better if we could automatically index any new entities, as well as updating and removing entities as they are updated and removed from the database without having to rerun the console command. This is just as easy to achieve with only one extra item added to the configuration:

This enables the bundle’s built in doctrine event listeners which will then do just that, keep the search index up to date with any changes we make to the entities, again without any additional code needed in typical CRUD controllers.

Before looking at searching the index there is one more bit of config which can be added to make integration easy:

By adding the finder line we activate the support for returning the search results as Doctrine entities, so the bundle will do the work of fetching the relevant entities from the database after querying the elasticsearch index.

So how do we query the index? The bundle dynamically creates a service you can request from the container with the format foq_elastica.finder.index-name.type-name. These match the values in our config, so the service we need is foq_elastica.finder.bookmarks.site. We can now issue queries using this service:
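For instance, in a controller extending the base controller it might look like the sketch below. The controller, action and template names are placeholders:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class SiteController extends Controller
{
    public function searchAction($searchTerm)
    {
        $finder = $this->get('foq_elastica.finder.bookmarks.site');

        // With the finder config enabled this returns Site entities.
        $sites = $finder->find($searchTerm);

        // Template name is a placeholder.
        return $this->render('ExampleBundle:Site:search.html.twig', array('sites' => $sites));
    }
}
```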

Elastica provides an OO query builder for creating more complicated queries but I will leave that for another day. Hopefully I have shown just how straightforward it is to get started using elasticsearch with a Symfony2 app. As always, it is not limited to such simplicity and you can override these built in services to provide your own providers, finders and listeners if you have more complex requirements.

Symfony2: Obtaining the Request Object

In this quick post I am looking at how to obtain the Request object in a Symfony2 controller as a service. This is actually covered in the docs but in one of the earlier introductory pieces and so is easily missed or forgotten about by the time you actually need to know it. I know I missed this and have previously not shown the correct way to do this.

If you are extending the base controller then its getRequest() method will retrieve the Request object from the container. If you are making your controllers services then you will generally inject the services you would otherwise have requested from the container, either into the constructor or via a setter method. This is not the correct way to work with the Request object, as it has a narrower scope than the controller. This basically means that the controller can be used for multiple requests whereas a Request object should only be used for the single request it represents. A fresh Request object should therefore be injected each time an action of the controller is called rather than just when the controller is created.

The base controller uses its injected container to get a new Request object each time the action is called, so one way would be to inject the container itself into the controller. There are many reasons not to do this, which I have covered elsewhere. Fortunately there is a simple way to avoid having to do this, thanks to a cool feature of the framework: Symfony2 will automatically inject the Request object into your actions each time they are called, if you ask for it in the following way:
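A minimal sketch, with a hypothetical controller class and action that simply echo a query string parameter back:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

// Hypothetical controller defined as a service; it does not extend the base controller.
class SiteController
{
    public function searchAction(Request $request)
    {
        // A fresh Request is passed in for every call because of the type hint.
        $searchTerm = $request->query->get('q');

        return new Response(htmlspecialchars((string) $searchTerm));
    }
}
```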

The key here is the type hint of Symfony\Component\HttpFoundation\Request. As long as this is in place the Request object will be injected in as a method argument. This avoids any scope issues without having to inject the container. This will work for controllers whether they are services or not and regardless of whether there are any other method arguments. So, for example, you can still combine this with annotated parameter conversion:
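For example, something like this sketch, where the Site entity and the updateAction name are placeholders and the converter comes from SensioFrameworkExtraBundle:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use LimeThinking\ExampleBundle\Entity\Site;
use Sensio\Bundle\FrameworkExtraBundle\Configuration\ParamConverter;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class SiteController
{
    /**
     * @ParamConverter("site", class="ExampleBundle:Site")
     */
    public function updateAction(Site $site, Request $request)
    {
        // $site is loaded from the id in the route, $request from the type hint.
        if ('POST' === $request->getMethod()) {
            $site->setName($request->request->get('name'));
        }

        return new Response(htmlspecialchars($site->getName()));
    }
}
```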

Symfony2: Routing to Controller as Service with Annotations

A very quick post as I couldn’t find anything documenting this yet (I will submit this to the docs also). If you want to use the combination of annotations for routing and also make your controllers into services then you can by specifying the service id in the class route:
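A sketch of how this looks, where example.site_controller is a made-up service id that would need to match the id the controller is registered under in the container:

```php
<?php

namespace LimeThinking\ExampleBundle\Controller;

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;
use Symfony\Component\HttpFoundation\Response;

/**
 * The service attribute tells the router to fetch this controller from the container.
 *
 * @Route("/site", service="example.site_controller")
 */
class SiteController
{
    /**
     * @Route("/{id}", name="site_show")
     */
    public function showAction($id)
    {
        return new Response('Site ' . (int) $id);
    }
}
```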

Whether you should make controllers services is a whole different matter and I am going to stay away from that can of worms (at least for today).

Symfony2: Doctrine Migrations and ACL tables

Edit: This is an issue with Symfony 2.0.x and looks to have been resolved for 2.1, please see Denderello’s comment below for more details

If you use Doctrine migrations in a Symfony2 app then one difficulty you may run into is with database tables that do not relate to your entities. The doctrine:migrations:diff command will make sure that the current database schema is updated to match the schema generated from the entities in an app. Whilst this is an excellent way of managing database versions across multiple instances of the app, I have run into a problem with this. If you are also using the ACL feature of Symfony2 security then this uses database tables that do not have corresponding entities. This means that whenever you generate diffs they will include the SQL to remove the ACL tables.

I have tried various solutions to this. The easiest would be if there was a way to specify that certain tables should be ignored when running the diff command. As far as I can see there is no way to do this at the time of writing, someone please correct me if I am wrong.

I tried creating entities for the ACL tables by creating the tables using the init:acl command in a fresh version of the database and then following the procedure to generate entities for the tables. Unfortunately there were some differences between the entities and the tables so that running doctrine:migrations:diff still created SQL relating to these tables. It may not have been significant but I do not know enough about how the ACL part of the security component works to want to run these. Additionally it felt like a very hacky workaround anyway.

The successful attempt was just to move these tables to a different database which turned out to be a pretty simple procedure. So here is how to move the ACL tables to a different database:

You need to set up an additional connection in config.yml with the details of the second database:

The only difference between the connections is the database name in this case. The new database name parameter acl_database_name needs adding to parameters.ini. You also need to specify that the new connection is to be used for ACL in security.yml:

Then setting up the ACL tables with the usual command will add them to the new database:

Now if you run the doctrine:migrations:diff command it will just look in the default connection’s database and ignore the other connection’s database.

This all assumes you are still at a development stage where you can just start over with a fresh database. If not then you will need to actually move the existing tables to the new database.

Symfony2: Testing with Behat and Mink

I have been looking at functional testing using Behat and Mink and their associated Symfony2 bundles. Having used Selenium for this sort of testing in the past and found the writing of the tests to be a long and torturous process, Behat is a huge improvement.

There are quite a few articles about getting started with Behat for web application testing as well as its own great documentation. For example, http://www.whiteoctober.co.uk/blog/2011/06/15/getting-into-behat-with-symfony2/ and http://techportal.ibuildings.com/2011/07/27/behaviour-driven-development-in-php-with-behat/.

I am not going to go over installation and the basics in this post but just look at a couple of specific things I have found useful myself.

Mink’s Built In Steps

One of the most useful features for me has been the built in steps provided when using Mink with Behat. These are available when you extend MinkContext for the FeatureContext class and allow you to write steps to perform common interactions with a website. For example clicking on a link:

These already have the behind the scenes action written so you do not need to write them yourself. There are numerous actions and assertions included. You can find a list of these with this command:

As well as really appreciating not having to write the code for actually controlling the browser session for these, I liked the way that they match multiple ways of selecting a link, button, field etc. So instead of having to know the id of a field to fill it out, the step will match on any of its id, name, label or value.

Another couple of useful steps included are great for finding why a feature is not passing:

Adding these to a feature will allow you to see the last output from the browser, which helps to find out what is wrong with a test. Then print last response will display the last output in stdout. To use Then show last response you need to add the following to the Mink config:

This will then open the last response in the specified browser, which can be more helpful as you will see the rendered page and not the page source.

Adding users using the FOSUserBundle

Here is a quick snippet for adding users if you are using the FOSUserBundle. If you have a Given step, effectively a precondition for a test, that requires certain users to be added e.g.

Then this step in the FeatureContext will add them:
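A sketch of such a step definition, assuming the Behat 2 style BehatBundle context of the time. The step wording and the username, email and password column names are my own invention, so adjust them to match your feature:

```php
<?php

namespace LimeThinking\ExampleBundle\Features\Context;

use Behat\BehatBundle\Context\MinkContext;
use Behat\Gherkin\Node\TableNode;

class FeatureContext extends MinkContext
{
    /**
     * Hypothetical step, matching a feature line such as:
     *
     *   Given the site has the following users:
     *     | username | email             | password |
     *     | richard  | richard@localhost | secret   |
     *
     * @Given /^the site has the following users:$/
     */
    public function theSiteHasTheFollowingUsers(TableNode $table)
    {
        $userManager = $this->getContainer()->get('fos_user.user_manager');

        // getHash() returns one associative array per table row.
        foreach ($table->getHash() as $row) {
            $user = $userManager->createUser();
            $user->setUsername($row['username']);
            $user->setEmail($row['email']);
            $user->setPlainPassword($row['password']);
            $user->setEnabled(true);

            $userManager->updateUser($user);
        }
    }
}
```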

Cleaning the database

The documentation’s recommendation is not to use Doctrine fixtures for testing but to do this work in the FeatureContext class in order to make it easy to see the starting state from the scenario. You can clear database tables like this:
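For example, a hook along these lines would empty the table for the Site entity before each scenario. The entity class is the one from the elasticsearch posts and simply stands in for whichever entities you need to clear. The context base class is assumed to be the BehatBundle one:

```php
<?php

namespace LimeThinking\ExampleBundle\Features\Context;

use Behat\BehatBundle\Context\MinkContext;

class FeatureContext extends MinkContext
{
    /**
     * @BeforeScenario
     */
    public function clearSites()
    {
        // getEntityManager() is the Symfony 2.0 era accessor on the doctrine service.
        $em = $this->getContainer()->get('doctrine')->getEntityManager();

        // A DQL delete removes every row for the entity.
        $em->createQuery('DELETE FROM LimeThinking\ExampleBundle\Entity\Site s')->execute();
    }
}
```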

In the above code you are using an Entity with the Entity Manager to clear the database. If you are using Symfony2’s ACL for security then this also needs clearing but does not have associated entities. The ACL tables can still be cleared by using the DBAL connection rather than the entity manager like this:
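A sketch of this, assuming the default ACL table names and that the ACL tables are reachable through the default connection. The delete order is chosen so that rows referencing other tables go first:

```php
<?php

namespace LimeThinking\ExampleBundle\Features\Context;

use Behat\BehatBundle\Context\MinkContext;

class FeatureContext extends MinkContext
{
    /**
     * @BeforeScenario
     */
    public function clearAcl()
    {
        $connection = $this->getContainer()->get('doctrine')->getConnection();

        // Default ACL table names; child tables are emptied before their parents.
        $tables = array(
            'acl_entries',
            'acl_object_identity_ancestors',
            'acl_object_identities',
            'acl_security_identities',
            'acl_classes',
        );

        foreach ($tables as $table) {
            $connection->executeUpdate('DELETE FROM ' . $table);
        }
    }
}
```

If the ACL tables live on a separate connection, its name would need passing to getConnection().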