merged branch matthijsvandenbos/matthijsvandenbos/basetag-href-fix (PR #7178)

This PR was submitted for the master branch but it was merged into the 2.1 branch instead (closes #7178).

Commits
-------

c41f640 [DomCrawler] Fixed handling absent href attribute in base tag

Discussion
----------

[DomCrawler] Fixed handling absent href attribute in base tag

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets |
| License       | MIT
| Doc PR        |

# Description
The HTML5 spec states that the href attribute is optional for the
base tag. The current code causes an exception on conforming HTML
that omits it. This PR fixes Crawler::addHtmlContent() to support
that case and adds a unit test covering it.

# Explanation
Currently, if the base tag doesn't have an href attribute, the Crawler's uri gets set to an empty string. This is incorrect behaviour, and it breaks Crawler::links(): the Symfony\Component\DomCrawler\Link objects it creates expect a non-empty string in their constructor arguments and throw an InvalidArgumentException otherwise.
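
The intended behaviour can be illustrated with a minimal standalone sketch that does not use the component itself. The function name `resolveBaseUri()` and its parameters are hypothetical and exist only for this illustration; it uses PHP's built-in DOM extension, and it compares against the empty string rather than using `empty()` as the patch does.

```php
<?php

/**
 * Hypothetical helper (not part of Symfony): returns the href of the first
 * <base> element when it is non-empty, otherwise the fallback URI.
 */
function resolveBaseUri(string $html, string $fallbackUri): string
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $base = $xpath->query('descendant-or-self::base')->item(0);

    if ($base instanceof DOMElement) {
        // getAttribute() returns an empty string when the attribute is absent.
        $href = $base->getAttribute('href');
        if ($href !== '') {
            return $href;
        }
    }

    // No <base> element, or one without a usable href: keep the original URI.
    return $fallbackUri;
}

// A <base> tag without href leaves the fallback URI untouched.
echo resolveBaseUri(
    '<html><head><base target="_top"></head><body><a href="/contact"></a></body></html>',
    'http://symfony.com'
), "\n";

// A <base> tag with href overrides it.
echo resolveBaseUri(
    '<html><head><base href="http://example.com/"></head><body></body></html>',
    'http://symfony.com'
), "\n";
```

With the fallback preserved, relative links such as `/contact` can still be resolved against the URI the crawler was created with.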

# References
http://www.w3.org/TR/html-markup/base.html#base.attrs.href
http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#the-base-element

Fabien Potencier 2013-02-26 17:49:54 +01:00
commit 0c09b9392a
2 changed files with 15 additions and 2 deletions

@@ -145,8 +145,9 @@ class Crawler extends \SplObjectStorage
         $base = $this->filterXPath('descendant-or-self::base')->extract(array('href'));
-        if (count($base)) {
-            $this->uri = current($base);
+        $baseHref = current($base);
+        if (count($base) && !empty($baseHref)) {
+            $this->uri = $baseHref;
         }
     }

@@ -80,6 +80,18 @@ class CrawlerTest extends \PHPUnit_Framework_TestCase
         $this->assertEquals('Tiếng Việt', $crawler->filterXPath('//div')->text());
     }
+    /**
+     * @covers Symfony\Component\DomCrawler\Crawler::addHtmlContent
+     */
+    public function testAddHtmlContentInvalidBaseTag()
+    {
+        $crawler = new Crawler(null, 'http://symfony.com');
+        $crawler->addHtmlContent('<html><head><base target="_top"></head><a href="/contact"></a></html>', 'UTF-8');
+        $this->assertEquals('http://symfony.com/contact', current($crawler->filterXPath('//a')->links())->getUri(), '->addHtmlContent() correctly handles a non-existent base tag href attribute');
+    }
     /**
      * @covers Symfony\Component\DomCrawler\Crawler::addHtmlContent
      */