<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Michael Jay Lissner</title><link href="https://michaeljaylissner.com/" rel="alternate"></link><link href="https://michaeljaylissner.com/feeds/tag/lxml" rel="self"></link><id>https://michaeljaylissner.com/</id><updated>2012-05-20T15:48:06-07:00</updated><entry><title>New tool for testing lxml XPath queries</title><link href="https://michaeljaylissner.com/posts/2012/05/20/new-tool-for-testing-lxml-xpath-queries/" rel="alternate"></link><updated>2012-05-20T15:48:06-07:00</updated><author><name>Mike Lissner</name></author><id>tag:michaeljaylissner.com,2012-05-20:posts/2012/05/20/new-tool-for-testing-lxml-xpath-queries/</id><summary type="html">&lt;p&gt;I got a bit frustrated today and decided to build a tool to fix my frustration. The problem was that we&amp;#8217;re using a lot of XPath queries to scrape various court websites, but there was no tool that could be used to test XPath expressions&amp;nbsp;efficiently.&lt;/p&gt;
&lt;p&gt;There are a couple of tools that are quite similar to what I just built: there&amp;#8217;s one called Xacobeo, Eclipse has one built in, and even Firebug has a tool that does something similar. Unfortunately, each of these operates on a different &lt;span class="caps"&gt;DOM&lt;/span&gt; interpretation than the one that lxml&amp;nbsp;builds.&lt;/p&gt;
&lt;p&gt;So while these tools helped, I consistently found that when the &lt;span class="caps"&gt;HTML&lt;/span&gt; got nasty, they&amp;#8217;d start falling&amp;nbsp;over.&lt;/p&gt;
&lt;p&gt;No more! Today I built &lt;a href="https://github.com/mlissner/lxml-xpath-tester/"&gt;a quick Django app&lt;/a&gt; that can be run locally or on a server. It&amp;#8217;s quite simple. You input some &lt;span class="caps"&gt;HTML&lt;/span&gt; and an XPath expression, and it will tell you the matches for that expression. It has syntax highlighting, and a few other tricks up its sleeve, but it&amp;#8217;s pretty basic on the&amp;nbsp;whole.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d love to get any feedback I can about this. It&amp;#8217;s probably still got some bugs, but it&amp;#8217;s small enough that they should be quite easy to stamp&amp;nbsp;out.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; I got in touch with the developer of Xacobeo. There&amp;#8217;s an &lt;code&gt;--html&lt;/code&gt;
flag that you can pass to it at startup if you intend to work with &lt;span class="caps"&gt;HTML&lt;/span&gt;. With that flag,
it does indeed use the same &lt;span class="caps"&gt;DOM&lt;/span&gt; parser that my tool does. Sigh. Affordances
are important, especially in a &lt;span class="caps"&gt;GUI&lt;/span&gt;-based&amp;nbsp;tool.&lt;/p&gt;</summary><category term="Python"></category><category term="lxml"></category><category term="juriscraper"></category><category term="CourtListener"></category></entry></feed>