Flag of Ukraine
SymfonyCasts stands united with the people of Ukraine

Joining Across a ManyToMany + EXTRA_LAZY Fetch

Keep on Learning!

If you liked what you've learned so far, dive in!
Subscribe to get access to this tutorial plus
video, code and script downloads.

Start your All-Access Pass
Buy just this tutorial for $12.00

With a Subscription, click any sentence in the script to jump to that part of the video!

Login Subscribe

On the genus list page, I want to add a new column that prints the number of scientists each Genus has. That should be simple!

Open the genus/list.html.twig template. Add the new th for number of scientists:

... lines 1 - 2
{% block body %}
<table class="table table-striped">
<thead>
<tr>
... lines 7 - 8
<th># of scientists</th>
... line 10
</tr>
</thead>
... lines 13 - 26
</table>
{% endblock %}

Then down below, add the td, then say {{ genus.genusScientists|length }}:

... lines 1 - 2
{% block body %}
<table class="table table-striped">
... lines 5 - 12
<tbody>
{% for genus in genuses %}
<tr>
... lines 16 - 21
<td>{{ genus.genusScientists|length }}</td>
... line 23
</tr>
{% endfor %}
</tbody>
</table>
{% endblock %}

In other words:

Go out and get my array of genus scientists and count them!

And, it even works! Each genus has three scientists. Until we delete one, then only two scientists! Yes!

The Lazy Collection Queries

But now click the Doctrine icon down in the web debug toolbar to see how the queries look on this page. This is really interesting: we have one query that's repeated many times: it selects all of the fields from user and then INNER JOINs over to genus_scientist WHERE genus_id equals 29, then, 25, 26 and 27.

When we query for the Genus, it does not automatically also go fetch all the related Users. Instead, at the moment that we access the genusScientists property, Doctrine queries all of the User data for that Genus. We're seeing that query for each row in the table.

Fetching EXTRA_LAZY

Technically, that's a lot of extra queries... which could impact performance. But please, don't hunt down potential performance problems too early - there are far too many good tools - like NewRelic and Blackfire - that are far better at identifying real performance issues later.

But, for the sake of learning... I want to do better, and there are a few possibilities! First, instead of querying for all the user data just so we can count the users, wouldn't it be better to make a super-fast COUNT query?

Yep! And there's an awesome way to do this. Open Genus and find the $genusScientists property. At the end of the ManyToMany, add fetch="EXTRA_LAZY":

... lines 1 - 14
class Genus
{
... lines 17 - 71
/**
* @ORM\ManyToMany(targetEntity="User", inversedBy="studiedGenuses", fetch="EXTRA_LAZY")
* @ORM\JoinTable(name="genus_scientist")
*/
private $genusScientists;
... lines 77 - 195
}

That's it. Now go back, refresh, and click to check out the queries. We still have the same number of queries, but each row's query is now just a simple count.

That's freaking awesome! Doctrine knows to do this because it realizes that all we're doing is counting the scientists. But, if we were to actually loop over the scientists and start accessing data on each User - like we do on the genus show page - then it would make a full query for all the User data. Doctrine is really smart.

Joining for Less Queries

Another way to optimize this would be to try to minimize the number of queries. Instead of running a query for every row, couldn't we grab all of this data at once? When we originally query for the genuses, what if we joined over to the user table then, and fetched all of the users immediately?

That's totally possible, and while it might actually be slower in this case, let's find out how to do join across a ManyToMany relationship. Open GenusController and find listAction(). Right now, this controller calls a findAllPublishOrderedByRecentlyActive() method on GenusRepository to make the query:

... lines 1 - 13
class GenusController extends Controller
{
... lines 16 - 57
public function listAction()
{
... lines 60 - 61
$genuses = $em->getRepository('AppBundle:Genus')
->findAllPublishedOrderedByRecentlyActive();
... lines 64 - 67
}
... lines 69 - 146
}

Go find that method! Here's the goal: modify this query to join to the middle genus_scientist table and then join again to the user table so we can select all of the user data. But wait! What's the number one rule about ManyToMany relationships? That's right: you need to pretend like the middle join table doesn't exist.

Instead, leftJoin() directly to genus.genusScientists. Alias that to genusScientist:

... lines 1 - 7
class GenusRepository extends EntityRepository
{
... lines 10 - 12
public function findAllPublishedOrderedByRecentlyActive()
{
return $this->createQueryBuilder('genus')
... lines 16 - 19
->leftJoin('genus.genusScientists', 'genusScientist')
... lines 21 - 23
}
}

When you JOIN in Doctrine, you always join on a relation property, like $genusScientists. Doctrine will automatically take care of joining across the middle table and then over to the user table.

To select the user data: addSelect('genusScientist'):

... lines 1 - 7
class GenusRepository extends EntityRepository
{
... lines 10 - 12
public function findAllPublishedOrderedByRecentlyActive()
{
return $this->createQueryBuilder('genus')
... lines 16 - 19
->leftJoin('genus.genusScientists', 'genusScientist')
->addSelect('genusScientist')
... lines 22 - 23
}
}

Ok, go back and refresh again! Woh, one query! And that query contains a LEFT JOIN to genus_scientist and another to user. Because we're fetching all the user data in this query, Doctrine avoids making the COUNT queries later.

If Doctrine JOINS are still a bit new to you, give yourself a head start with our Doctrine Queries Tutorial.

Leave a comment!

9
Login or Register to join the conversation

Hey Ahmedbhs,

Good question! Yes, it should be automatic. Actually, you can check it yourself, just run:

$ bin/console doctrine:schema:update --dump-sql

Do it first instead of passing --force and you will see the query(queries) that will be executed. You will see that some queries will add indexes. Or, use Doctrine migrations :)

But if you want to add more indexes (custom indexes) for your custom columns - you can totally do it, just check the docs:
https://www.doctrine-projec...

I hope this helps!

Cheers!

Reply

Hey there, thank you for the screencast, I have some questions:

1/ In wish case it's better to add ->addSelect('genusScientist') then having separate queries, because I think adding "addSelect()" with a join is slower than having separate queries?

2/ What is the recommended and the fastest fetch mode for ManyToOne relations (imagine that all the field's relation will be selected and serialized)

3/ Why, most of the time Eager mode or ManyToMany relation, is a performance response time killer ?

4/ If Eager mode is enabled, then there's no reason to add ->addSelect('genusScientist') to our query ?

5/ Between Layzy and EAGER what is the mode that save the response time, and the one that save memory usage ?

Cheers!

Reply

Hey Ahmedbhs,

Woh, a lot of questions :) Well, mostly it depends, and better to profile your specific case to understand what optimization will better fit you and what strategy to choose. I'd suggest you another our course about Blackfire.io - that tool helps to do some profiling and gives you a lot of data and hints about you specific case - that would be the best I think. You can find the course here: https://symfonycasts.com/sc...

1. addSelect() may increase memory usage, but without it you will increase query count, so it depends what will best in your specific case, but as I said, better to use a profiler tool that will help you to understand your specific case better

2. Fastest way is probably when you do not hydrate the result into objects, i.e. when you just return an array instead of hydrated entities. But it will be less cool to use array instead of objects, and it may be overkill. Also, you can always use profiling tool to record time that was taken and then compare what fits better for you :)

3. As always, it depends, especially depends on how you will use the result data. Even Symfony Web Debug Toolbar (WDT) may be helpful here, you can check the timeline and queries that was executed during the page rendering and see some bottlenecks :)

4. That mode tells Doctrine how to behave by default, but if you will explicitly specify addSelect() in your query - it will apply it of course. So, it makes sense to add it if you need/want in your specific case.

5. I'd not recommend you blindly chase numbers, better to try to see the full picture with a profiler. Any of those modes may fit better in your specific case and worse in another, that's why profiling tools exist and help.

So, don't do premature optimization, especially if you don't need it yet. Even if something is not optimized it still may work and save you more time from doing some complex optimization, that definitely will be great, or even it may save you time in the future when you will need to change/modify that part of code later. Every optimization will complicate your code and make, and so will require more time. And so, unless you really need to optimize - you better don't do it. But in case you want to do it just in sake of science - yeah, totally valid, but you should be aware that it may complicate things later when you will want to change it. And anyway, better to use profiling tools to see the full picture for your specific case and compare the results.

I hope this helps!

Cheers!

1 Reply
Default user avatar

Hello,
About < fetch="EXTRA_LAZY" >, something is not clear for me.
Should we use this always (when we need joins)? When we should avoid this? For example, if you put on a property, you will use this property many times in different parts of app. Is this fine? So, when to use it (or better to add joins instead of it) ?

Reply

Yo axa!

Yea, this is a really interesting question. Because, to me, fetch="EXTRA_LAZY" basically seems like the *best* value... in pretty much all cases! There is one case where it is not ideal: if you count the number of items in a collection relationship and then loop over it, having EXTRA_LAZY will cause you to have 2 queries instead of 1 (a count query, and then another query to loop over them). Because of that, I would use this rule: if you know that you want to count the items in a relationship sometimes, but NOT actually loop over them, use EXTRA_LAZY.

And when to use joins? Well, that solves a different problem. EXTRA_LAZY gives us a *really* simple way of counting the items in a collection in the most efficient way (a simple COUNT query). If you use a fetch join, then you will grab all the data at once. If you then simply ONLY count the collection, well, then EXTRA_LAZY doesn't help you: you already fetched all the data... even though you didn't really need it. But if you *will* loop over the collection, then of course doing a fetch join is always a little bit faster. But honestly, I usually write my queries a bit sloppy, and then optimize later with Blackfire.io :). I do often find that a page is slow because I'm querying for a lot of data, but I much less often find that the fix is a join.

Cheers!

1 Reply
Default user avatar

Thanks Ryan for explanation.

Reply
Default user avatar
Default user avatar jian su | posted 5 years ago

Hi guys:

When you join query
->leftJoin('genus.genusScientists', 'genusScientist')
->addSelect('genusScientist')
is slower than no join. it took 66.68ms for a join query, no join with 3.56ms. I guess joining do pay some price?

Reply

Hey Jian,

Yes! JOINs always slows your queries, sometimes it will be enough to add indexes for columns by which you are joining, but sometimes it makes sense do not use JOINs at all and make another 2nd query. So the simple answer is yes, you pay for JOINs :)

Cheers!

Reply
Default user avatar

Hi Victor: Thank you. it makes sense now :)

Reply
Cat in space

"Houston: no signs of life"
Start the conversation!

This course is built on Symfony 3, but most of the concepts apply just fine to newer versions of Symfony.

What PHP libraries does this tutorial use?

// composer.json
{
    "require": {
        "php": "^7.1.3",
        "symfony/symfony": "3.4.*", // v3.4.49
        "doctrine/orm": "^2.5", // 2.7.5
        "doctrine/doctrine-bundle": "^1.6", // 1.12.13
        "doctrine/doctrine-cache-bundle": "^1.2", // 1.4.0
        "symfony/swiftmailer-bundle": "^2.3", // v2.6.7
        "symfony/monolog-bundle": "^2.8", // v2.12.1
        "symfony/polyfill-apcu": "^1.0", // v1.23.0
        "sensio/distribution-bundle": "^5.0", // v5.0.25
        "sensio/framework-extra-bundle": "^3.0.2", // v3.0.29
        "incenteev/composer-parameter-handler": "^2.0", // v2.1.4
        "composer/package-versions-deprecated": "^1.11", // 1.11.99.4
        "knplabs/knp-markdown-bundle": "^1.4", // 1.9.0
        "doctrine/doctrine-migrations-bundle": "^1.1", // v1.3.2
        "stof/doctrine-extensions-bundle": "^1.2" // v1.3.0
    },
    "require-dev": {
        "sensio/generator-bundle": "^3.0", // v3.1.7
        "symfony/phpunit-bridge": "^3.0", // v3.4.47
        "nelmio/alice": "^2.1", // v2.3.6
        "doctrine/doctrine-fixtures-bundle": "^2.3" // v2.4.1
    }
}
userVoice