Scrapy: convert an HTML string to an HtmlResponse object

First of all, if it is for debugging or testing purposes, you can use the Scrapy shell:

$ cat index.html
<div id="test">
    Test text
</div>

$ scrapy shell index.html
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
u'Test text'

The shell exposes several ready-made objects for the duration of the session, such as response and request.


Alternatively, instantiate HtmlResponse directly and pass the HTML string as body (url is a separate argument, so any placeholder address will do):

>>> from scrapy.http import HtmlResponse
>>> response = HtmlResponse(url="http://example.com", body='<div id="test">Test text</div>', encoding='utf-8')
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
u'Test text'

alecxe's answer is right, but if all you need is a Selector, this is the way to instantiate one from text in Scrapy:

>>> from scrapy.selector import Selector
>>> body = '<html><body><span>good</span></body></html>'
>>> Selector(text=body).xpath('//span/text()').get()
'good'