Tip
This element was an h1 tag when we recorded the video. Now the site name is in an a tag inside the .wds-community-header__sitename block. The code below was updated.
Back on the weird Wiki page, run an inspect element on the navigation. There's a hidden h2 tag inside of the WikiHeader and WikiNav elements. Let's try to find this and print out its text.
To do that use the find() function: pass it css as the first argument and then use your css selector: .WikiHeader .WikiNav h2:
// ...
$header = $page->find('css', '.wds-community-header__sitename a');
// ...
Surprise! This is actually the fourth - and final - important object in Mink. You start with the page, but as soon as you find a single element, you now have a NodeElement. What's cool is that this new object has all the same methods as the page, plus a bunch of extras that apply to individual elements.
Let's print $header->getText():
// ...
echo "The wiki site name is: ".$header->getText()."\n";
// ...
And re-run the mink file. Now it prints "Jurassic Park Wiki Navigation" - so finding by CSS is working.
Let's do something harder and see if we can find the "wiki activity" link in the header by drilling down into the DOM twice. First, find the parent element by using its subnav-2 class. So I'll say $page->find('css', '.subnav-2'). Oh and don't forget your dot!
// ...
$subNav = $page->find('css', '.wds-tabs');
var_dump($subNav->getHtml());
// ...
Now, var_dump() this element's HTML to make sure we've got the right one. Run mink.php:
php mink.php
Great - it prints out all the stuff inside of that element, including the WikiActivity link that we're after.
To find that, we need to find the li and a tags that are inside of the .subnav-2.
We could do that by just modifying the original selector. But instead, once you have an individual element you can use find() again to look inside of it. So we can say $subNav->find() and use css to go further inside of it with li a:
// ...
$subNav = $page->find('css', '.wds-tabs');
$linkEl = $subNav->find('css', 'li a');
// ...
The find() method returns the first matching element.
Dump this element's text and check things:
// ...
echo "The link text is: ". $linkEl->getText() . "\n";
// ...
Yes! It returns Wiki Activity!
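Since find() hands back only the first match, and null when nothing matches, it can help to guard the result and to reach for findAll() when you want every match. Here's a minimal sketch of that idea. The helper name printLinkTexts() is made up for illustration (it's not part of Mink), and it assumes you pass in the same $page object used in mink.php:

```php
<?php

/**
 * Hypothetical helper (not part of Mink): prints the text of every link
 * inside the first element that matches $containerCss.
 *
 * @param \Behat\Mink\Element\TraversableElement $page
 * @param string                                 $containerCss
 */
function printLinkTexts($page, $containerCss)
{
    // find() returns the first matching element, or null if there is none
    $container = $page->find('css', $containerCss);
    if (null === $container) {
        echo "Nothing matched: ".$containerCss."\n";

        return;
    }

    // findAll() returns an array with every matching element
    foreach ($container->findAll('css', 'li a') as $linkEl) {
        echo "Link text: ".$linkEl->getText()."\n";
    }
}

// Usage, with the $page from mink.php:
// printLinkTexts($page, '.wds-tabs');
```

The null check is the important part: calling getText() on a null result is a fatal error, which is an easy trap when a site changes its markup.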
In addition to CSS, there's one more important way to find things using Mink: it's called the named selector. I'm going to paste in some code here: please do not write this -- it's ugly code -- I'll show you a better way.
Instead of passing css to find(), this passes named along with an array that says we're looking for a "link" whose text is "Wiki Activity". The named selector is all about finding an element by its visible text. To see if this is working let's var_dump($linkEl->getAttribute('href')):
// ...
$selectorsHandler = $session->getSelectorsHandler();
$linkEl = $page->find(
    'named',
    array(
        'link',
        $selectorsHandler->xpathLiteral('Books')
    )
);
echo "The link href is: ". $linkEl->getAttribute('href') . "\n";
That should come back as the URL to the activity section. Try it out.
php mink.php
It works! The named selector is hugely important because it lets us find elements by their natural text, instead of technical CSS classes. In this case, we're using the text of the anchor tag. But the named selector also looks for matches on the title attribute, on the alt attribute of an image inside of a link and several other things. It finds elements by using anything that a user or a screen reader thinks of as the "text" of an element.
And instead of using this big ugly block of code, you'll use the named selector via $page->findLink(). Pass it "Wiki Activity":
// ...
$linkEl = $page->findLink('Books');
echo "The link href is: ". $linkEl->getAttribute('href') . "\n";
This should work just like before.
The named selector can find 3 different types of elements: links, fields and buttons. To find a field, use $page->findField(). Pass it "Description", for example, and it finds a label that matches the word "Description" and then the field associated with that label. To find a button, use $page->findButton(). Oh, and the named selector is "fuzzy" - so it'll match just part of the text on a button, field or link.
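To make findField() and findButton() a bit more concrete, here's a rough sketch that fills in one field and presses one button, both located by the text a user would see. The fillAndSubmit() helper is hypothetical (it's not part of Mink), and the field value and the "Save" button text in the usage line are made-up examples, so swap in whatever your page really has:

```php
<?php

/**
 * Hypothetical helper (not part of Mink): fills one field and presses one
 * button, both found with the named selector.
 *
 * @param \Behat\Mink\Element\TraversableElement $page
 * @param string                                 $fieldLabel
 * @param string                                 $value
 * @param string                                 $buttonText
 *
 * @return bool false if either element could not be found
 */
function fillAndSubmit($page, $fieldLabel, $value, $buttonText)
{
    // Both finders return null when nothing matches
    $field = $page->findField($fieldLabel);
    $button = $page->findButton($buttonText);

    if (null === $field || null === $button) {
        return false;
    }

    // setValue() and press() are methods on the NodeElement objects
    $field->setValue($value);
    $button->press();

    return true;
}

// Usage, with the $page from mink.php (value and button text are assumptions):
// fillAndSubmit($page, 'Description', 'Big, green and hungry', 'Save');
```

Looking both elements up before touching either one means a missing button doesn't leave you with a half-filled form.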
Ok! Let's finally click this link! Once you have a NodeElement, just use the click() method:
// ...
$linkEl->click();
echo "Page URL after click: ". $session->getCurrentUrl() . "\n";
Run the script:
php mink.php
You can see it pause as it clicks the link and waits for the next page to load. And then it lands on the Special:WikiActivity URL.
When you have a single element, there are a lot of things you can do with it, and each is a simple method call. We've got focus, blur, dragTo, mouseOver, check, uncheck, doubleClick and pretty much everything you can imagine doing to an element.
Head back up to the GoutteDriver part - that was important object number 1. The driver is used to figure out how a request is made. The Goutte driver uses cURL. If we wanted to use Selenium instead, we only need to change the driver to $driver = new Selenium2Driver():
// ...
//$driver = new GoutteDriver();
$driver = new Selenium2Driver();
// ...
Tip
By default, the Selenium2 driver uses Firefox. But recent versions may not work correctly with Selenium server. If you have any issues, try using Google Chrome instead:
// ...
$driver = new Selenium2Driver('chrome');
That's it! Oh and make sure you have $session->start() at the top:
// ...
$session->start();
// ...
I should have had this before, but Goutte doesn't require it. Similarly, at the bottom, add $session->stop():
// ...
$session->stop();
That closes the browser.
In our terminal, I still have the Selenium JAR file running in the background. Run php mink.php.
The browser opens... but just hangs. Check out the terminal. It died!
Status code is not available from Behat\Mink\Driver\Selenium2Driver.
The cause is the $session->getStatusCode() line:
// ...
echo "Status code: ". $session->getStatusCode() . "\n";
echo "Current URL: ". $session->getCurrentUrl() . "\n";
// ...
Different drivers have different super powers. The Selenium2 driver can process JavaScript: a pretty sweet super power. But it also has its own kryptonite: it can't get the current status code.
The driver you'll use depends on what functionality you need, which is why Mink made it so easy to switch from one driver to another. Remove the getStatusCode() line and re-run the script:
// ...
//echo "Status code: ". $session->getStatusCode() . "\n";
echo "Current URL: ". $session->getCurrentUrl() . "\n";
// ...
Other than this annoying Firefox error I started getting today, it works fine. The browser closes, and we're now dangerous with Mink.
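If you'd rather keep the status code line around while you flip between drivers, one option is to catch the exception instead of commenting the line out. This is only a sketch: getStatusCodeOrNull() is a made-up helper name, and it assumes the driver signals the missing feature with Mink's UnsupportedDriverActionException, which is what the "Status code is not available from" message suggests:

```php
<?php

use Behat\Mink\Exception\UnsupportedDriverActionException;

/**
 * Hypothetical helper (not part of Mink): returns the status code, or null
 * when the current driver cannot report one.
 *
 * @param \Behat\Mink\Session $session
 *
 * @return int|null
 */
function getStatusCodeOrNull($session)
{
    try {
        return $session->getStatusCode();
    } catch (UnsupportedDriverActionException $e) {
        // Assumption: this is the exception behind
        // "Status code is not available from ...\Selenium2Driver"
        return null;
    }
}

// Usage, with the $session from mink.php:
// $statusCode = getStatusCodeOrNull($session);
// echo "Status code: ".(null === $statusCode ? 'n/a' : $statusCode)."\n";
```

With this in place the same script should run under both drivers and simply print "n/a" when the driver can't answer.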
Let's put this all together!
Yea, it would be great for this - especially if you needed those bots to interact with pages that use JavaScript - that's a lot more powerful than just a parser that parses links and follows them :).
How do I use CSS selectors when using Behat with Drupal? I don't see a mink.php file in the root when installing, but behat.yml has Drupal\DrupalExtension\Context\MinkContext
Hey Johnatas,
I'm 99% sure it's the same! Actually, mink.php in this tutorial is just a playground. First of all, you need to get the session, and then you can get the page from it. Try this in your Behat context file:
$page = $this->getSession()->getPage();
And then you can call the find() method on $page as we do in this screencast.
Cheers!
Actually the page is completely different now. I'm getting the 1st link however clicking it via Selenium driver fails with "PHP Fatal error: Uncaught WebDriver\Exception\ElementNotVisible: element not visible" error.
I just think the page is refreshing because of ads which is causing the error.
Instead I started from the Wiki Activity page (no ads) and looked for the first link in the content area since it was still complaining about the secondary navigation links not being visible.
Hey Maria,
Ah, what a bummer! We're sorry for the inconvenience, but it wasn't caused by anything on our side directly. We're working on a fix for this and will update the code soon. Thanks for letting us know! Actually, it's not a big deal: you just need to use a different link instead of "Wiki Activity", because that one is hidden (it's shown only when you hover over the "Explore" menu item). So, to follow our example, you can look for the first link in the menu, which is "Books" and is visible by default. You can easily find it with the ".wds-tabs li a" selector, or just use:
// Use:
$subNav = $page->find('css', '.wds-tabs');
$linkEl = $subNav->find('css', 'li a');
// Or just:
$linkEl = $page->findLink('Books');
Also, the latest versions of Firefox (which is used by default) may not work properly. That's why I recommend using Google Chrome instead with the Selenium2 driver. You can specify it as the first argument to the Selenium2Driver class:
$driver = new Selenium2Driver('chrome');
Cheers!
I've updated the script + code blocks for this (code block will show up soon when its cache refreshes).
Thanks Pietrino!
// composer.json
{
    "require": {
        "php": ">=5.4.0, <7.3.0",
        "symfony/symfony": "^2.7", // v2.7.4
        "twig/twig": "^1.22", // v1.22.1
        "sensio/framework-extra-bundle": "^3.0", // v3.0.16
        "doctrine/doctrine-bundle": "^1.5", // v1.5.1
        "doctrine/orm": "^2.5", // v2.5.1
        "doctrine/doctrine-fixtures-bundle": "^2.2", // v2.2.1
        "behat/symfony2-extension": "^2.0" // v2.0.0
    },
    "require-dev": {
        "behat/mink-extension": "^2.0", // v2.0.1
        "behat/mink-goutte-driver": "^1.1", // v1.1.0
        "behat/mink-selenium2-driver": "^1.2", // v1.2.0
        "phpunit/phpunit": "^4.8" // 4.8.18
    }
}
I'm facing a weird problem with find. I have the following piece of code:
$page = $this->getSession()->getPage();
var_dump($page->getContent());
var_dump($page->find('css', 'html'));
The best thing is that the first var_dump prints the HTML content of the page (with <html></html> tags), but the latter returns null. When I change the function to findAll, an empty array is returned. I am using behat/behat 3.5.0, behat/gherkin 4.6.0, behat/mink 1.7.1, behat/mink-browserkit-driver 1.3.3, behat/mink-extension 2.3.1, and seleniumServer 3.141.59 with ChromeDriver 75.0.3770.100