David Négrier CTO

This article is about experimenting with the possibility of standardizing how we register things (usually services) into containers. More specifically, I'm targeting only containers that can be compiled.

Why is it important?

Let's assume you are a PHP package author. If you want your classes to be easily usable in framework A or framework B, you must write a "meta-package" for each framework. This meta-package usually contains "service definitions": it explains to the framework how to build your objects. So the job of a PHP package developer is to write the package, and then to write X meta-packages (one for each framework to target). This is a problem, as X is quite large in the PHP world.

Why am I doing this?

I am fundamentally interested in everything connected to dependency injection containers and interoperability between these containers. I've written a DI container myself (Mouf), and I'm one of the authors of container-interop, which will eventually become PSR-11, a standard regarding how to fetch values from a container. As part of this standardisation work, I've had numerous comments (sometimes harsh, but most of them constructive) about how I'm looking at the problem from the wrong side. Some people told me that instead of standardizing how things are fetched from a container, we should standardize how things are put in it.

To be quite clear, I'm pretty sure we need to standardize how things are fetched from a container first (I've written a lot about it here and here), because it is the only thing that is common to every container. This being said, for a subset of containers (containers that have a compiler), I find that standardizing how things are put in the container can be quite interesting. Unlike PSR-11, this topic is quite new and I think it has seldom been discussed.

This article is my first attempt at exploring the subject. I'm not saying that I found the ultimate solution (actually, I did not), but I want to share my thoughts with people who might be interested in this very subject.

First prototype

My first prototype is based on Bernhard Schussek's idea (expressed on the PHP-FIG group) that we should try to find a common interface for the object descriptions (the objects that are consumed by the compiler to generate a container). Each object usually represents an entry in the container, and has attributes like the class name of the object to be created, the list of setters to call with associated parameters...

This is my first attempt, and I've decided to prototype a very simple interface with one single method: toPhpCode. Why did I choose this approach? The idea here is to shift the responsibility of generating PHP code from the compiler to the definition object. Why? Because there are numerous ways to define container entries (creating instances with the "new" keyword, with a callback, with a factory, using lazy services, and so on...). Rather than listing all the possible ways of creating an entry (I'm not even sure there is a finite number of ways to do this), I thought we should delegate this responsibility to the definition object itself. This way, anyone can come up with a new way of doing things.

Pros:

  • Very flexible
  • Simple to understand

Cons:

  • The compiler cannot optimize descriptions as it cannot understand them (I'm discussing alternatives at the end of this article)

interface DefinitionInterface
{
    /**
     * Returns a string of PHP code generating the container entry.
     *
     * The PHP code MUST be a closure or a PHP expression that evaluates to the value of the item.
     *
     * If the PHP code is a closure, then that closure MUST take one argument that is an
     * Interop\Container\ContainerInterface object.
     * The function MUST return the entry generated.
     *
     * For instance, this is a valid PHP string:
     *
     * "function(Interop\Container\ContainerInterface $container) {
     *     $service = new MyService($container->get('my_dependency'));
     *     return $service;
     * }"
     *
     * If the PHP code is a PHP expression, then the PHP expression must evaluate to the value returned for the
     * container entry.
     *
     * These are valid PHP expressions:
     *
     * "'localhost'" (a string)
     * "CONST_VAR" (a constant)
     * "array(42, 12)" (an array)
     * "12 + 32" (any valid PHP expression that evaluates to something)
     *
     * @return string
     */
    public function toPhpCode();
}

The toPhpCode method returns a string of valid PHP code. Most of the time, it is a callback taking one parameter: the container itself. I chose to pass the container as a parameter of the function (rather than using $this) because this makes it straightforward to implement the "delegate lookup" pattern described in PSR-11. Also, for optimisation purposes, I decided that the toPhpCode method could also return PHP expressions. This is useful to store values that are not objects in the container. These are sometimes called "parameters" in other containers.
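To make the idea concrete, here is a minimal sketch of how two definitions and a "dumb" compiler could interact through this first interface. SketchParameterDefinition, SketchNewDefinition and compileEntries are hypothetical names invented for this illustration, and the container type hint is omitted from the generated closure so that the snippet stays self-contained.

```php
<?php
// Sketch only: the class and function names below are hypothetical,
// they are not part of any published package.

interface DefinitionInterface
{
    /** @return string PHP code: a closure or an expression */
    public function toPhpCode();
}

// A parameter: rendered as a plain PHP expression.
class SketchParameterDefinition implements DefinitionInterface
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function toPhpCode()
    {
        return var_export($this->value, true);
    }
}

// A service: rendered as a closure that receives the container.
class SketchNewDefinition implements DefinitionInterface
{
    private $className;
    private $dependencyIds;

    public function __construct($className, array $dependencyIds = array())
    {
        $this->className = $className;
        $this->dependencyIds = $dependencyIds;
    }

    public function toPhpCode()
    {
        $args = array();
        foreach ($this->dependencyIds as $id) {
            $args[] = '$container->get(' . var_export($id, true) . ')';
        }
        return 'function($container) { return new ' . $this->className
            . '(' . implode(', ', $args) . '); }';
    }
}

// The "dumb" compiler: it only concatenates the strings it is given.
function compileEntries(array $definitions)
{
    $code = "array(\n";
    foreach ($definitions as $id => $definition) {
        $code .= '    ' . var_export($id, true) . ' => ' . $definition->toPhpCode() . ",\n";
    }
    return $code . ')';
}

$definitions = array(
    'db.host' => new SketchParameterDefinition('localhost'),
    'my_service' => new SketchNewDefinition('MyService', array('my_dependency')),
);
$compiled = compileEntries($definitions);
```

The compiler never looks inside the strings it receives: it cannot tell a parameter from a service, which is exactly the trade-off listed in the cons above.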

Demo time

To check if this idea works, I wrote the following packages:

common-container-definitions contains the following classes:

  • InstanceDefinition: the typical container entry. It is instantiated using the new keyword. You can add public property assignments and method calls (to call setters on the newly created object).
  • ClosureDefinition: you pass a closure to this object and the code of the closure will be copied to the compiler. It has some limitations due to the nature of the operation performed: the closure cannot use a context (the use keyword) or the $this object.
  • AliasDefinition: this generates a simple alias to another entry
  • ParameterDefinition: used to generate parameters (simple values stored in the container)
  • ...
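The ClosureDefinition limitation mentioned in the list above (no use keyword, no $this) can be detected with PHP's reflection API. The following is only a sketch of one way such a check might look; isCopyableClosure is a hypothetical helper, and this is not necessarily how the package performs the check.

```php
<?php
// Hypothetical helper, not the package's actual implementation.
function isCopyableClosure(Closure $closure)
{
    $reflection = new ReflectionFunction($closure);

    // For closures, getStaticVariables() also lists the variables
    // imported with the "use" keyword.
    if (count($reflection->getStaticVariables()) > 0) {
        return false;
    }

    // A closure bound to an object could rely on $this.
    if ($reflection->getClosureThis() !== null) {
        return false;
    }

    return true;
}

$prefix = 'db.';

// Static closures are never bound to $this.
$portable = static function ($container) {
    return new ArrayObject();
};

// This one captures $prefix, so its source code alone is not enough.
$contextual = static function ($container) use ($prefix) {
    return $prefix . 'host';
};
```

Note that this check is conservative: a closure that declares its own static variables would be rejected too.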

A few interesting facts:

  • common-container-definitions can generate optimized code by inlining some definitions. Inlining is the process of putting an entry declaration and its dependency in the same function call. This is possible if the dependency is only used by this entry. Here is a sample:

    use Mouf\Container\Definition\InstanceDefinition;

    // We declare the dependency
    $dependencyDefinition = new InstanceDefinition(null, "MyDependency");

    // We declare the main instance
    $instanceDefinition = new InstanceDefinition("instanceName", "MyClass");
    $instanceDefinition->addConstructorArgument($dependencyDefinition);

This code will generate an instance using this PHP code:

function(ContainerInterface $container) {
    $a = new MyDependency();
    $instance = new MyClass($a);
    return $instance;
}

As you can see, the MyDependency instance is declared and used in the same closure.

  • However, there is a limitation. I can do this only because the addConstructorArgument method consumes another InstanceDefinition. If I were passing a DefinitionInterface to this method, I could not apply this inlining. At first, I was pretty sure this was not an issue, but there are some corner cases where it might be interesting to inline definitions coming from different packages.

To overcome that limitation, we have to modify the signature of the toPhpCode method. From my first attempt, I learned a few things I'll try to share here. When you try to inline some code, the code you write is split in 2 parts:

  • a number of PHP statements (i.e. full PHP lines ending with ; that are used to set up some variables)
  • a PHP expression that represents the value of your entry

Let's take the example above:

function(ContainerInterface $container) {
    $a = new MyDependency();
    $instance = new MyClass($a);
    return $instance;
}

In this example, $a = new MyDependency(); is the PHP statement of the inlined definition (there can be many statements), and $a is the PHP expression that represents the value. So the toPhpCode method must really return 2 things: the PHP statements, and the PHP expression. Additionally, the toPhpCode method will probably need to create variables. In the example above, $a is a variable created on the fly. Of course, if there are many inlined definitions in the same closure, we must ensure that variable names do not conflict with each other.
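The variable naming problem can be sketched as follows. pickVariableName is a hypothetical helper, and the numeric suffix scheme is only an assumption about how free names could be chosen.

```php
<?php
// Hypothetical helper: returns the first variable name that is not taken yet.
function pickVariableName($base, array $usedVariables)
{
    $candidate = $base;
    $suffix = 1;
    while (in_array($candidate, $usedVariables, true)) {
        $candidate = $base . $suffix;
        $suffix++;
    }
    return $candidate;
}

$used = array('$container');

$first = pickVariableName('$a', $used);   // '$a' is still free
$used[] = $first;

$second = pickVariableName('$a', $used);  // '$a' is taken, a suffix is added
$used[] = $second;

// The statements set the variables up...
$statements = $first . " = new MyDependency();\n"
    . $second . ' = new MyOtherDependency();';

// ...and the expression is what the consuming definition embeds.
$expression = 'new MyClass(' . $first . ', ' . $second . ')';
```

Two inlined definitions asking for the same base name end up with distinct variables, so their statements can safely share one closure.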

Second prototype

Based on these facts, here is an updated possible interface:

/**
 * Objects implementing the DefinitionInterface represent a definition of a container entry.
 * They can be "rendered" to PHP code using the toPhpCode() method.
 */
interface DefinitionInterface
{
    /**
     * Returns the identifier for this object in the container.
     * If null, classes consuming this definition should assume the definition must be inlined.
     *
     * @return string|null
     */
    public function getIdentifier();

    /**
     * Returns an InlineEntryInterface object representing the PHP code necessary to generate
     * the container entry.
     *
     * @param string $containerVariable The name of the variable that allows access to the container instance. For instance: "$container", or "$this->container"
     * @param array $usedVariables An array of variables that are already used and that should not be used when generating this code.
     * @return InlineEntryInterface
     */
    public function toPhpCode($containerVariable, array $usedVariables = array());
}

The DefinitionInterface is not only consumed by the compiler but also by the other definitions. I decided to add the getIdentifier method to this interface. An entry must have an id, unless we want it to be inlined in another entry. The toPhpCode method now returns objects implementing the InlineEntryInterface.

/**
 * Objects implementing this interface represent PHP code that can be used to create an entry.
 */
interface InlineEntryInterface
{
    /**
     * Returns a list of PHP statements (ending with a ;) that are necessary to
     * build the entry.
     * For instance, these are valid PHP statements:
     *
     * "$service = new MyService($container->get('my_dependency'));
     * $service->setStuff('foo');"
     *
     * Can be null or empty if no statements need to be returned.
     *
     * @return string|null
     */
    public function getStatements();

    /**
     * Returns the PHP expression representing the entry.
     * This must be a string representing a valid PHP expression,
     * with no ending ;
     *
     * For instance, "$service" is a valid PHP expression.
     *
     * @return string
     */
    public function getExpression();

    /**
     * Returns the list of variables used in the process of creating this
     * entry definition. These variables should not be used by other
     * definitions in the same scope.
     * @return array
     */
    public function getUsedVariables();

    /**
     * If true, the entry will be evaluated when the `get` method is called (this is the default)
     * If false, the entry will be evaluated as soon as the container is constructed. This is useful
     * for entries that contain only parameters like strings, or constants.
     *
     * If false, a call to `getStatements` MUST return null.
     *
     * @return bool
     */
    public function isLazilyEvaluated();
}
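As an illustration, here is a sketch of a definition implementing these two interfaces and inlining its anonymous dependencies. The interfaces are abridged copies of the ones above; InlineEntry and SketchInstanceDefinition are hypothetical names. The sketch also assumes that getUsedVariables() returns the variables received as input plus the ones created, which is one possible reading of the contract.

```php
<?php
// Sketch only: abridged interfaces, hypothetical class names.

interface InlineEntryInterface
{
    public function getStatements();
    public function getExpression();
    public function getUsedVariables();
    public function isLazilyEvaluated();
}

interface DefinitionInterface
{
    public function getIdentifier();
    public function toPhpCode($containerVariable, array $usedVariables = array());
}

class InlineEntry implements InlineEntryInterface
{
    private $statements;
    private $expression;
    private $usedVariables;

    public function __construct($statements, $expression, array $usedVariables)
    {
        $this->statements = $statements;
        $this->expression = $expression;
        $this->usedVariables = $usedVariables;
    }

    public function getStatements() { return $this->statements; }
    public function getExpression() { return $this->expression; }
    public function getUsedVariables() { return $this->usedVariables; }
    public function isLazilyEvaluated() { return true; }
}

class SketchInstanceDefinition implements DefinitionInterface
{
    private $identifier;
    private $className;
    private $arguments;

    // $arguments holds other DefinitionInterface objects.
    public function __construct($identifier, $className, array $arguments = array())
    {
        $this->identifier = $identifier;
        $this->className = $className;
        $this->arguments = $arguments;
    }

    public function getIdentifier() { return $this->identifier; }

    public function toPhpCode($containerVariable, array $usedVariables = array())
    {
        $statements = '';
        $expressions = array();
        foreach ($this->arguments as $argument) {
            if ($argument->getIdentifier() !== null) {
                // Named entry: fetch it from the container at runtime.
                $expressions[] = $containerVariable . '->get('
                    . var_export($argument->getIdentifier(), true) . ')';
                continue;
            }
            // Anonymous entry: inline its statements and reuse its expression.
            $inlined = $argument->toPhpCode($containerVariable, $usedVariables);
            $statements .= $inlined->getStatements();
            $expressions[] = $inlined->getExpression();
            // Assumption: the returned list includes the names passed in.
            $usedVariables = $inlined->getUsedVariables();
        }

        // Find a free variable name for the object built here.
        $index = 0;
        do {
            $variable = '$v' . $index++;
        } while (in_array($variable, $usedVariables, true));
        $usedVariables[] = $variable;

        $statements .= $variable . ' = new ' . $this->className
            . '(' . implode(', ', $expressions) . ");\n";

        return new InlineEntry($statements, $variable, $usedVariables);
    }
}

$logger = new SketchInstanceDefinition('logger', 'MyLogger');
$dependency = new SketchInstanceDefinition(null, 'MyDependency', array($logger));
$main = new SketchInstanceDefinition('instanceName', 'MyClass', array($dependency));
$entry = $main->toPhpCode('$container');
```

Because the definition only relies on getIdentifier() and toPhpCode(), the inlined dependency could come from any package implementing the interface, which was not possible with the first prototype.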

What I like about this new interface:

  • It allows a greater level of interoperability between definitions
  • It makes it easy to inline the definitions (easily increase performance)
  • Overall, the code I implemented using this interface looks cleaner and more straightforward

What I don't like about this interface:

  • It is not easy to understand (the previous interface was much easier to grasp)
  • It is not optimal to implement a definition based on closures
  • I'm not a huge fan of the getUsedVariables() method. We could instead parse the PHP string and find all declared variables... but that would probably come with a high performance cost. Compilation performance is important, as sometimes, you want to recompile the container on every request (for instance in development mode)
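For reference, the parsing alternative mentioned in the last point could be sketched with PHP's tokenizer extension. findVariables is a hypothetical helper. It lists every variable appearing in the code (not only the declared ones), which is a safe superset for avoiding name conflicts.

```php
<?php
// Sketch: list the variables appearing in a fragment of PHP code
// using the tokenizer extension (token_get_all).
function findVariables($phpCode)
{
    $variables = array();

    // token_get_all() only tokenizes PHP code after an opening tag.
    foreach (token_get_all('<?php ' . $phpCode) as $token) {
        if (is_array($token) && $token[0] === T_VARIABLE) {
            // Used as a key to remove duplicates while keeping order.
            $variables[$token[1]] = true;
        }
    }

    return array_keys($variables);
}

$code = '$service = new MyService($container->get(\'my_dependency\'));' . "\n"
    . '$service->setStuff(\'foo\');';

$found = findVariables($code);
```

Every statement string has to be tokenized again at each compilation, which is where the performance concern comes from.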

And here are the matching packages:

(Lo and behold! 100% test coverage!)

Missing features

Looking at existing containers, the definition interfaces described above cover almost all use cases. There are still 2 features that I cannot easily implement:

Lazy service

Lazy services are entries that are not directly served by the container. Instead, the container serves a proxy to the object and the object is actually instantiated only if it is used. Rather than adding an isLazy method to the interface (that would force compilers to support lazy services), a great option would be to wrap a definition into a ProxyDefinition class. Just imagine you could do:

$lazyService = new ProxyDefinition($service);

// $lazyService will generate PHP code that creates a proxy to $service.

Now, the problem is that to generate a proxy, you need to know the type of the class you are proxying. But with the current interface, there is no way to know what kind of object will be generated by the toPhpCode() method. To solve this issue, we might want to add 2 methods to the interface:

/**
 * Returns the type of the service (as would be returned by the PHP `gettype` function)
 * @return string
 */
public function getType();

/**
 * Returns the fully qualified class name of the object returned (if this is an object)
 * Returns null otherwise.
 * @return string|null
 */
public function getClass();
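Here is a rough sketch of what such a ProxyDefinition could look like once getClass() is available. To stay short, it uses a simplified, hypothetical TypedDefinitionInterface in which toPhpCode() returns array(statements, expression) instead of an InlineEntryInterface object. The $proxyFactory->createProxy() call that appears in the generated code is also hypothetical, and the sketch assumes the container variable is a plain local variable such as $container.

```php
<?php
// Sketch only: simplified hypothetical interface, see the assumptions above.

interface TypedDefinitionInterface
{
    public function getClass();
    public function toPhpCode($containerVariable, array $usedVariables = array());
}

class SketchServiceDefinition implements TypedDefinitionInterface
{
    private $className;

    public function __construct($className)
    {
        $this->className = $className;
    }

    public function getClass()
    {
        return $this->className;
    }

    public function toPhpCode($containerVariable, array $usedVariables = array())
    {
        return array('$service = new ' . $this->className . '();', '$service');
    }
}

class ProxyDefinition implements TypedDefinitionInterface
{
    private $wrapped;

    public function __construct(TypedDefinitionInterface $wrapped)
    {
        $this->wrapped = $wrapped;
    }

    public function getClass()
    {
        return $this->wrapped->getClass();
    }

    public function toPhpCode($containerVariable, array $usedVariables = array())
    {
        // The wrapped code runs inside its own closure, so it starts
        // with a fresh variable scope.
        list($statements, $expression) = $this->wrapped->toPhpCode($containerVariable);

        // Assumes $containerVariable is a local variable that "use" can import.
        $initializer = 'function () use (' . $containerVariable . ') { '
            . $statements . ' return ' . $expression . '; }';

        // Without getClass(), the class name passed here would be unknown.
        $code = '$proxyFactory->createProxy('
            . var_export($this->getClass(), true) . ', ' . $initializer . ')';

        return array('', $code);
    }
}

$service = new SketchServiceDefinition('MyService');
$lazyService = new ProxyDefinition($service);
list($statements, $expression) = $lazyService->toPhpCode('$container');
```

The wrapped definition stays untouched: laziness becomes a decorator, and compilers that know nothing about proxies still only see PHP code.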

Tags

Some frameworks (Symfony) support the notion of tags. In Symfony, tags are directly applied to service definitions. This is a feature that is not covered at all by the interface I've designed so far. The notion is very powerful, and since it is not (yet) covered, we might want to add these methods to the DefinitionInterface:

/**
 * Returns the list of tags applied to this entry.
 * @return string[]
 */
public function getTags();
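With such a method, a compiler (or a compiler pass) could collect entries by tag. A minimal sketch, where TaggedDefinitionInterface, SketchTaggedDefinition and groupIdentifiersByTag are hypothetical names and the tag values are made up for the example:

```php
<?php
// Sketch only: hypothetical names, abridged interface.

interface TaggedDefinitionInterface
{
    /** @return string|null */
    public function getIdentifier();

    /** @return string[] */
    public function getTags();
}

class SketchTaggedDefinition implements TaggedDefinitionInterface
{
    private $identifier;
    private $tags;

    public function __construct($identifier, array $tags)
    {
        $this->identifier = $identifier;
        $this->tags = $tags;
    }

    public function getIdentifier()
    {
        return $this->identifier;
    }

    public function getTags()
    {
        return $this->tags;
    }
}

// Groups entry identifiers by tag.
function groupIdentifiersByTag(array $definitions)
{
    $byTag = array();
    foreach ($definitions as $definition) {
        foreach ($definition->getTags() as $tag) {
            $byTag[$tag][] = $definition->getIdentifier();
        }
    }
    return $byTag;
}

$definitions = array(
    new SketchTaggedDefinition('twig.extension.text', array('twig.extension')),
    new SketchTaggedDefinition('twig.extension.intl', array('twig.extension')),
    new SketchTaggedDefinition('mailer', array()),
);
$byTag = groupIdentifiersByTag($definitions);
```

Untagged entries such as mailer simply do not appear in the result.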

Alternatives

Of course, there are other alternatives to the interfaces I proposed here. I've tried to make the compiler as dumb as possible, but we could also push the code generation into the compiler (this is the way most compilers work). In this case, the interfaces should DESCRIBE the way the entries must be built. There must be one interface per technique of instantiation (using constructor, using factory, using closure...)

What I don't like about this technique

The interfaces would certainly look like this:

/**
 * Entry definitions implementing this interface are creating entries using the class constructor.
 */
interface ConstructorBasedEntryDefinitionInterface
{
    /**
     * Returns the fully qualified class name to be instantiated
     * @return string
     */
    public function getClass();

    /**
     * An array of arguments. Can be a scalar, an array, or another DefinitionInterface
     * @return array
     */
    public function getConstructorArguments();

    /**
     * @return MethodCallInterface[]
     */
    public function getMethodCalls();
}

/**
 * Entry definitions implementing this interface are creating entries using factories.
 */
interface FactoryBasedEntryDefinitionInterface
{
    /**
     * Returns the fully qualified class name of the factory
     * @return string
     */
    public function getFactoryClass();

    /**
     * Returns the method name of the factory
     * @return string
     */
    public function getFactoryMethod();

    /**
     * An array of arguments passed to the factory method. Can be a scalar, an array, or another DefinitionInterface
     * @return array
     */
    public function getFactoryArguments();

    /**
     * @return MethodCallInterface[]
     */
    public function getMethodCalls();
}

/**
 * Represents a method call on a newly created entry (e.g. setter, etc...)
 */
interface MethodCallInterface
{
    /**
     * The name of the method to call.
     *
     * @return string
     */
    public function getMethodName();

    public function getArguments();
}

// ...
// There are many many more interfaces to describe here
// ...
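To show where the code generation would live in this approach, here is a sketch of a "clever" compiler function consuming the constructor-based interface. The interface is an abridged copy (method calls are left out); SketchConstructorDefinition and compileConstructorEntry are hypothetical names.

```php
<?php
// Sketch only: abridged interface, hypothetical class and function names.

interface ConstructorBasedEntryDefinitionInterface
{
    public function getClass();
    public function getConstructorArguments();
}

class SketchConstructorDefinition implements ConstructorBasedEntryDefinitionInterface
{
    private $className;
    private $arguments;

    public function __construct($className, array $arguments = array())
    {
        $this->className = $className;
        $this->arguments = $arguments;
    }

    public function getClass()
    {
        return $this->className;
    }

    public function getConstructorArguments()
    {
        return $this->arguments;
    }
}

// Here the compiler, not the definition, decides what the PHP code looks like.
function compileConstructorEntry(ConstructorBasedEntryDefinitionInterface $definition)
{
    $arguments = array();
    foreach ($definition->getConstructorArguments() as $argument) {
        if ($argument instanceof ConstructorBasedEntryDefinitionInterface) {
            // Nested definition: the compiler is free to inline it.
            $arguments[] = compileConstructorEntry($argument);
        } else {
            // Scalar or array: export it as a PHP literal.
            $arguments[] = var_export($argument, true);
        }
    }

    return 'new ' . $definition->getClass() . '(' . implode(', ', $arguments) . ')';
}

$dependency = new SketchConstructorDefinition('MyDependency', array('localhost', 3306));
$main = new SketchConstructorDefinition('MyClass', array($dependency));
$code = compileConstructorEntry($main);
```

The instanceof test is the weak point: every additional way of building an entry means another branch in every compiler.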

Each compiler would need to support ALL interfaces and in the future, we would be limited to the list of interfaces described here. This design is not extensible. Furthermore, there is an infinite number of ways to create instances. Just have a look at this code from the Puli documentation:

$factoryClass = PULI_FACTORY_CLASS;
$factory = new $factoryClass();

$repo = $factory->createRepository();

How could we write an interface that supports this kind of code? We have 2 solutions:

  1. We choose to support this kind of code. This means modifying the interfaces so that the class name can be a constant (and not only a string). If we choose this approach, we would probably end up having very complex interfaces as what we are trying to do is essentially write interfaces that can generate any kind of PHP code.
  2. We choose to keep this out-of-scope (and therefore we decide in the interface that some use cases will be out-of-scope forever). The interface dictates the list of supported features.

None of those options is good in my opinion.

What I like about this technique

Surprisingly enough, there are plenty of things I do like in this technique. Most of all, if definitions are descriptive, they can be analyzed. It means we can have efficient compiler passes modifying the objects (if we add setters) and, above all, we can do reflection on the entries. As the author of Mouf (a dependency injection framework based on visual editing), I would very much like to be able to have reflection on all definitions. That would allow me to display the relationships between the entries of a container in a graphical way.
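As an example of such an analysis, the sketch below extracts a dependency graph from descriptive definitions, which is the kind of data a graphical tool would need. It reuses an abridged copy of the constructor-based interface; SketchConstructorDefinition and buildDependencyGraph are hypothetical names, and entries are assumed to be stored in an array indexed by identifier.

```php
<?php
// Sketch only: abridged interface, hypothetical class and function names.

interface ConstructorBasedEntryDefinitionInterface
{
    public function getClass();
    public function getConstructorArguments();
}

class SketchConstructorDefinition implements ConstructorBasedEntryDefinitionInterface
{
    private $className;
    private $arguments;

    public function __construct($className, array $arguments = array())
    {
        $this->className = $className;
        $this->arguments = $arguments;
    }

    public function getClass()
    {
        return $this->className;
    }

    public function getConstructorArguments()
    {
        return $this->arguments;
    }
}

// Returns, for each entry id, the ids of the entries it depends on.
function buildDependencyGraph(array $definitions)
{
    $graph = array();
    foreach ($definitions as $id => $definition) {
        $graph[$id] = array();
        foreach ($definition->getConstructorArguments() as $argument) {
            // Strict search: we look for the very same definition object.
            $dependencyId = array_search($argument, $definitions, true);
            if ($dependencyId !== false) {
                $graph[$id][] = $dependencyId;
            }
        }
    }
    return $graph;
}

$connection = new SketchConstructorDefinition('MyConnection', array('localhost'));
$repository = new SketchConstructorDefinition('MyRepository', array($connection));
$controller = new SketchConstructorDefinition('MyController', array($repository, $connection));

$graph = buildDependencyGraph(array(
    'connection' => $connection,
    'repository' => $repository,
    'controller' => $controller,
));
```

Nothing comparable is possible with a toPhpCode string, short of parsing the generated PHP code.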

Conclusion

After working on two prototypes of interfaces (with a dumb compiler) and looking at possible alternatives of descriptive interfaces with a clever compiler, it strikes me that this problem is not easily solvable. Both approaches (dumb vs clever compiler) have their benefits and their drawbacks. A strategy might be to work with descriptive interfaces (clever compiler) with a possible fallback on a toPhpCode interface for edge cases. That would also represent the most work for compiler implementors.

In the end, after several weeks of various tests, I cannot really decide what the best strategy is to design a definition interface shareable by all compilers. Even worse, I'm starting to wonder if there is a point in designing such an interface. There are not many compilers out there: only 3 or 4 that I'm aware of.

An alternative strategy to a common interface could be an interop package. For instance, something like thecodingmachine/common-container-definitions could become a common package with multiple adapters for different frameworks. We could write a Symfony compiler pass to put these definitions into Symfony, etc... And for frameworks that do not have the notion of compiler passes, it would still be possible to use Yaco to get a PSR-11 compatible container and plug it into any other container. In this scenario, we would not have a PSR dedicated to putting things into a container. Instead, we would have a special package for interop (exactly like Puli is an interop package for managing files, without any PSR on that topic needing to exist).

I'll probably keep working on the topic in the coming months. Now that I have coded a compiler, it would be foolish not to integrate it within Mouf. I'll also work a bit more on a possible integration with Puli, and of course, this topic will stay on my mind as I try to push PSR-11 forward.

As I haven't reached any hard conclusion, I would be very interested in gathering comments, advice and feedback from the community at large. Do not hesitate to post a comment or add a post in the dedicated thread on the PHP-FIG mailing list.

About the author

David is CTO and co-founder of TheCodingMachine. He is the co-editor of PSR-11, the standard that provides interoperability between dependency injection containers. David is the lead developer of Packanalyst, a website that references all PHP classes/interfaces ever stored on Packagist. He is also the lead developer of Mouf, the only graphical dependency injection framework, and is currently working on another PSR, regarding standardizing service providers (more containers goodness!).