XPath in Python


9 points

I have a Python application in which I am scraping data from a particular website, and I want to use XPath. How do I use an XPath library in Python?



15 points

Use the lxml (http://codespeak.net/lxml/) package, which supports XPath. You can also try BeautifulSoup, which is widely used for parsing pages in Python scraping applications, though it does not support XPath itself.
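In case it helps, here is a minimal sketch of the lxml route for a scraping task. The HTML snippet, the `extract_links` helper and the `book` class name are all made up for illustration; a real scraper would download the page first and adjust the XPath expression to the site's actual markup.

```python
from lxml import html

# Hypothetical page content; a real scraper would fetch this from the website.
PAGE = """
<html><body>
  <ul id="books">
    <li class="book"><a href="/b/1">First title</a></li>
    <li class="book"><a href="/b/2">Second title</a></li>
  </ul>
</body></html>
"""


def extract_links(page_source):
    """Return (text, href) pairs for every link inside an li with class "book"."""
    tree = html.fromstring(page_source)
    # xpath() returns a list of matching elements for a node-selecting expression.
    anchors = tree.xpath('//li[@class="book"]/a')
    return [(a.text_content().strip(), a.get("href")) for a in anchors]


links = extract_links(PAGE)
for title, href in links:
    print(title, href)
```

An expression that selects attributes, such as `//li[@class="book"]/a/@href`, returns plain strings instead of elements, which is often all a scraper needs.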

For the libxml2 Python bindings, as per their website, a basic test of the XPath wrappers:

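import sys

import libxml2

# Parse the document and create an XPath evaluation context for it.
doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()

# "//*" selects every element; the checks below expect exactly two, doc and foo.
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print("xpath query: wrong node set size")
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print("xpath query: wrong node set value")
    sys.exit(1)

# The bindings need explicit cleanup: free the context first, then the document.
ctxt.xpathFreeContext()
doc.freeDoc()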

Created by Anonymous
