File:  [mozdev] / bugxula / www / bugxula.log
Fri Nov 21 23:29:56 2003 UTC by myk
new version of Bugxula with fixes for a bunch of problems Asa reported yesterday

2003 November 19:

So, what we have is a set of data that we want to create UI for.
The data is stored as RDF, although that may not be the best format for it.
But perhaps we don't have to tackle that decision right now.
Actually, the data is not quite RDF, but it's stored inside an RDF data source.

We could use some sort of template to generate the UI.
The template options are XUL templates, XSLT, and something home-grown.
One of the problems is that besides inserting the data into a template
we also have to generate lists in the template (of fields and operators).
The operators can probably be a straight list, but the fields probably
need to be generated based on what the installation supports.

In theory we can do that with a template inside another template.
In practice I'm not so sure this is actually possible.

I wonder if it's possible for an XSLT stylesheet to process two separate
XML files, since the field list generally comes from a different place
than the criteria.

I guess I'm starting to think that RDF is too complicated and overkill
for most if not all of Bugxula.  RDF's graph representation is very powerful
and can be made to emulate many other data structures (tables, trees, etc.)
but at the cost of unnecessary complexity.  Even the current RDF data sources
in Mozilla generally don't store their data as RDF; they merely make it available
as RDF so that it can be processed by Mozilla into various UI.

So the question is, what do I do about it?  Keeping in mind that I really want
to finish up this code today or soon and not keep churning about looking for
a solution.  At the same time I don't want to implement something that sucks
and wastes my time down the road (i.e. rewriting or hacking around the limitations
of it).

One note: I should expect that the data generated by Bugzilla installations,
although RDF, has an internal DOM structure which is the same as its external
XML structure.  In other words, using something like XSLT, I should be able
to walk the DOM tree of a Bugzilla installation's RDF output and get an accurate
representation of the data.

The same is not true of the data generated by Mozilla, however, since Mozilla's
RDF serializer doesn't necessarily generate RDF that follows a tree arrangement
(which makes sense, since RDF doesn't represent trees).

What I have now:

1. I have RDF lists of fields.
2. I have a hard-coded list of operators.
3. I have URL query strings representing queries.
4. I have UI code that uses XUL templates to generate the field menu.

What's the simplest thing I could do right now to make this work?

1. use the existing XUL template to generate the field menus (if it works);
2. use the existing hard-coded list of operators;
3. use the existing URL query string data storage format, and write some code that parses it to a JavaScript data structure and thence to a UI representation and back;
4. create a "template" (a chunk of XUL code that is hidden) and build UI with it programmatically.
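
Item 3 above (query string to JavaScript data structure and back) could be sketched
roughly like this.  It is a minimal sketch, assuming the stored query is an ordinary
name=value&name=value string.  The function names parseQuery and toQueryString and
the sample parameters are made up for illustration, and it uses the standard
URLSearchParams class rather than anything Mozilla-specific.

```javascript
// Hypothetical helpers: parse a URL query string into an object that maps
// each parameter name to an array of its values, and serialize it back.
function parseQuery(queryString) {
  const result = Object.create(null);   // no inherited keys to collide with
  for (const [name, value] of new URLSearchParams(queryString)) {
    if (!(name in result)) result[name] = [];
    result[name].push(value);
  }
  return result;
}

function toQueryString(query) {
  const params = new URLSearchParams();
  for (const name of Object.keys(query)) {
    for (const value of query[name]) params.append(name, value);
  }
  return params.toString();
}

// Sample parameter names are illustrative only.
const parsed = parseQuery("product=Bugzilla&bug_status=NEW&bug_status=ASSIGNED");
console.log(parsed.bug_status);       // both bug_status values, in order
console.log(toQueryString(parsed));   // the original string again
```

Keeping every value in an array means repeated parameters survive the round trip,
which matters for multi-select fields.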

How could I modify this in the future to work better?

1. store queries in individual XML query files rather than in one big RDF file;
2. store an XML representation of the criteria rather than a URL query string representation of it;
3. generate XUL on the fly via an XSLT stylesheet that processes a query file and returns its XUL UI equivalent;
4. when the user changes the criteria, make those changes to the XML file and process it again with the stylesheet.

Hmm, I'm somewhat conflicted about this whole thing now.  I'm thinking so many things:

1. should I really be abandoning XUL templates and RDF, which are
   relatively mature and come with lots of features for managing
   the tree, like built-in sorting (for non-content trees)?
2. should I really be trying to do this with XSLT, which hasn't
   been used much to generate XUL from the look of it?
3. do I really want to be making yet another XML output for Bugzilla?
4. if I do want to adopt this new XML-based architecture, should I do it now?
5. what does this mean for the searches/bugs tree/pane?

I think the thing to do here is experiment with various ways of doing things
so that I can figure out what the best way is.  It doesn't seem to make much sense
to jump into something right away without knowing how it works.

I guess the truth is I'm very excited about the prospect of using XSLT
because it seems like it could solve a lot of the problems I've had
with XUL templates in the past, namely that they have relatively limited features
for matching (something must equal something else, there's no way to specify
that something is greater than or less than something else).

In theory the XML <-> XSLT <-> XUL approach looks like it could provide
the cleanest MVC architecture for doing this work of storing the data
and generating UI from it.  Perhaps it would even be possible for an XSLT
template to generate back the XML when the XUL changes, although that
doesn't make much sense.

[Question: Can XSLT process a document fragment rather than a whole document?]

Here's what the XML representation might look like:

  <search id="mybugs-1" name="My Bugs">
      <filter field="product" value="Bugzilla"/>
      <filter field="op_sys"/>
      <criterion relation="initial" field="assignee" operator="eq" value=""/>
      <criterion relation="and" field="status" operator="ne" value="ASSIGNED"/>
      <criterion relation="or" field="creation_date" operator="gt" value="2003-10-11"/>
      <criterion relation="new" field="assignee" operator="eq" value=""/>
  </search>

If I store searches as XML, then how do I generate the searches list?  Do I use XSLT for that too?
And in that case, do I give up whatever benefits I get from the graph representation of RDF?
(f.e. if we start putting bugs in the list, and a bug gets listed twice in two separate places,
is there anything I lose because those two occurrences are no longer inherently linked
as they are in an RDF graph representation?)

The other issue is that XML's requirement that the data fit into a serial tree
makes it harder to use as a database.  With RDF, you just put things into the data source
and don't worry about how they look internally.  In fact, with RDF the data store
could theoretically be a binary file; there's no requirement that the store
be a serialized text file.  That's not the case with XML, which must be serialized.

Thus overall it seems like RDF is the better solution as a data store, anyway.

The problem is that it's hard for me to go from RDF to XUL via XSLT because the RDF
isn't guaranteed to be in proper tree form.  I suppose I could write an RDF serializer
whose output was guaranteed to be in appropriate tree format for my code.

Then my data store could be RDF and I could still use XSLT to generate the templates.

So then my code would look something like this:

RDF backend datastore -> serializes to XML fragment -> processed by XSLT -> XUL

I could also have shortcut methods for manipulating the data in the RDF database
so I wasn't constantly having to do all that work, since it's pretty expensive.

In order for this to work I will need to ensure that I can hand the XSLT processor
a text string and have it process that text string (the only examples I've seen
so far are ones where it's processing some document located at some URL).

[I may also be able to use XSLT to generate the URL query string.]

I'm starting to like this idea of using RDF as the data store but XSLT as the template language.
First of all, it means I don't have to throw away my current investment in RDF.
Second of all, it means I can use what looks like the right tool for each job:

XSLT as template language
RDF as data store (to some degree)

RDF in a sense is a complicated system for representing objects in the real world.
It's complex because the world is complex; the trick is presenting simpler interfaces
on top of it for programmers to deal with.  That'll be my trick too: making APIs
to my RDF that make it easy for me to write code that hacks it.

Ok, so new plan:  if I'm going to be using RDF as the data store I can keep many things
the way they are, but I need a structure for storing a search in the RDF data store.
Here's my first shot:

<RDF:Description about="urn:bugxula:root">
  <bz:queries resource="urn:queries"/>
</RDF:Description>
<RDF:Seq about="urn:queries">
  <RDF:li resource="urn:query1"/>
  <RDF:li resource="urn:query2"/>
</RDF:Seq>

<bz:query about="urn:query1" bz:name="My Bugs">
    <bz:filter bz:field="product" bz:value="Bugzilla"/>
    <bz:filter bz:field="op_sys"/>
    <bz:criteria>
      <bz:criterion                   bz:field="assignee"       bz:operator="eq" bz:value=""/>
      <bz:criterion bz:relation="and" bz:field="status"         bz:operator="ne" bz:value="ASSIGNED"/>
      <bz:criterion bz:relation="or"  bz:field="creation_date"  bz:operator="gt" bz:value="2003-10-11"/>
      <bz:criterion bz:relation="new" bz:field="assignee"       bz:operator="eq" bz:value=""/>
    </bz:criteria>
</bz:query>

Note that this example doesn't show multiple boolean charts, but if I wanted to implement those
I could do it with multiple <bz:criteria> tags, since each one is independent of the others
and it doesn't matter what order they appear.  I guess I could also implement a Bag at some point
if I wanted to.

My primary question is how to represent the relationships between the query criteria.
First of all, Bugzilla's boolean charts feature doesn't let you do (foo AND bar) OR (baz AND buz),
it only lets you do (foo OR bar) AND (baz OR buz).  This makes it easier, because the parentheses
are then implicit in the order of the criteria and don't need to be stored, but I still have the problem
of how to represent the data.  The RDF above stores it in the "bz:relation" attribute of each criterion
but the first.  Somehow that doesn't seem exactly right to me.
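
To make the "parentheses are implicit in the order" point concrete, here is a minimal
sketch of turning a flat, ordered list of criteria into the (foo OR bar) AND (baz OR buz)
shape.  The function name groupCriteria and the plain-object criteria are hypothetical,
and only the "and" and "or" relations are handled, not "initial" or "new".

```javascript
// Turn a flat, ordered list of criteria into an AND-of-ORs structure.
// "or" joins the criterion to the current group; anything else
// (no relation on the first criterion, or "and") starts a new group.
function groupCriteria(criteria) {
  const groups = [];
  for (const criterion of criteria) {
    if (criterion.relation === "or" && groups.length > 0) {
      groups[groups.length - 1].push(criterion);
    } else {
      groups.push([criterion]);
    }
  }
  return groups;
}

const flat = [
  { field: "foo" },
  { relation: "or", field: "bar" },
  { relation: "and", field: "baz" },
  { relation: "or", field: "buz" },
];
// Two OR groups that get ANDed together: (foo OR bar) AND (baz OR buz)
console.log(groupCriteria(flat).map(group => group.map(c => c.field)));
```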

One thing I could do is store the criteria as an actual tree, ala:

  clause: operator
    criterion: field operator value
    criterion: field operator value
  clause: operator
    criterion: field operator value
    criterion: field operator value
  clause: operator
    criterion: field operator value
    clause: operator
      criterion: field operator value
      criterion: field operator value

A query is made up of one or more criteria, all of which must be true for a given bug in order
for that bug to appear in the result set.  A criterion can be either a field-operator-value triplet
or a set containing multiple triplets and/or other clauses.  Take the following criteria for example:

(foo AND bar) OR baz AND (buz OR biz) AND bez

Since there's always some precedence, this could be written as:

((foo AND bar) OR baz) AND (buz OR biz) AND bez

So there are three main sections:

((foo AND bar) OR baz)
(buz OR biz)
bez

And that gets further broken down into:

clause: AND
  clause: OR
    clause: AND
      criterion: foo
      criterion: bar
    criterion: baz
  clause: OR
    criterion: buz
    criterion: biz
  criterion: bez
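
The same breakdown can be written as nested JavaScript objects.  This is only a sketch
to check that the tree really encodes the parenthesized expression; the property names
(type, children, name) and the function toExpression are assumptions of the example,
not part of any existing Bugxula code.

```javascript
// The tree from the outline above, as nested plain objects.
// Clause nodes carry an operator and children; leaves are criteria.
const tree = {
  type: "AND",
  children: [
    { type: "OR", children: [
      { type: "AND", children: [{ name: "foo" }, { name: "bar" }] },
      { name: "baz" },
    ] },
    { type: "OR", children: [{ name: "buz" }, { name: "biz" }] },
    { name: "bez" },
  ],
};

// Render a node back to text, parenthesizing every nested clause.
function toExpression(node, isRoot = true) {
  if (!node.children) return node.name;
  const inner = node.children
    .map(child => toExpression(child, false))
    .join(" " + node.type + " ");
  return isRoot ? inner : "(" + inner + ")";
}

console.log(toExpression(tree));
// → ((foo AND bar) OR baz) AND (buz OR biz) AND bez
```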

The question is how to describe this using RDF.  Perhaps let's start with some thinking.

What kinds of things are there?  There are criteria, clauses, and the query criteria/definition/conditions.
The query definition comprises all criteria and clauses and looks like this:


We might have an easier time if we think of the definition as not also being a clause, which means there will
always be a top-level clause right under the definition.  I.e. every query has a definition which is a clause
containing all the query criteria.  So perhaps the definition can be as simple as:

  <bz:clause bz:type="AND">

Now, filling that out a little, we'd get:

  <bz:clause bz:type="AND">
    <bz:clause bz:type="OR">
    <bz:clause bz:type="OR">
    <bz:criterion bez>

And a little more:

  <bz:clause bz:type="AND">
    <bz:clause bz:type="OR">
      <bz:clause bz:type="AND">
      <bz:criterion baz>
    <bz:clause bz:type="OR">
      <bz:criterion buz>
      <bz:criterion biz>
    <bz:criterion bez>

And one more level:

  <bz:clause bz:type="AND">
    <bz:clause bz:type="OR">
      <bz:clause bz:type="AND">
        <bz:criterion foo>
        <bz:criterion bar>
      <bz:criterion baz>
    <bz:clause bz:type="OR">
      <bz:criterion buz>
      <bz:criterion biz>
    <bz:criterion bez>

Ick.  It's complicated, but that's what it should look like I guess.  The good news
is that because each clause contains only one logical operator (AND or OR), it doesn't matter
the order of the items inside it, so we don't have to use containers.  Of course, we could
use containers to store the values (f.e. Bags to store ANDs and Alts to store ORs).
We'll have to experiment and see what works.

In really proper RDF, then, the snippet above should look like this:

    <bz:clause RDF:about="urn:clause1"> <!-- clause: AND -->
            <bz:clause RDF:about="urn:clause2"> <!-- clause: OR -->
                    <bz:clause RDF:about="urn:clause3"> <!-- clause: AND -->
                            <bz:criterion bz:field="foo" bz:operator="foo" value="foo"/>
                            <bz:criterion bz:field="bar" bz:operator="bar" value="bar"/>
                    <bz:criterion bz:field="baz" bz:operator="baz" value="baz"/>
            <bz:clause RDF:about="urn:clause4"> <!-- clause: OR -->
                    <bz:criterion bz:field="buz" bz:operator="buz" value="buz"/>
                    <bz:criterion bz:field="biz" bz:operator="biz" value="biz"/>
            <bz:criterion bz:field="bez" bz:operator="bez" value="bez"/>

Yikes, what a mess.  It turns out that I actually did need to use containers,
and I needed clauses to have definitions, because RDF requires triplets
(resource -> property -> value), and I can't have a value be its own property
(i.e. I can't say the definition of this query is this clause and the clause's
clause is a clause).  This is obviously a complete mess, but only for human eyes.
In theory with the appropriate API it will be easy to manipulate programmatically.

What should that API look like?

Should there be a query object that represents the query in memory?
If so, then it will have to be synchronized with the RDF version.
Hmm, but that should be ok.  In fact it could be the case that it just wraps
the RDF version and doesn't duplicate data.  Hmm, something to think about.

Boy, thinking about all this stuff is much better for complex systems like this one
than just flailing around trying to figure out the right architecture.  I wish
I had someone to review this though and tell me their opinion about it.

So, a query object that we create that represents the current query.
It should hold a reference to the data source and a reference to the resource
within the data source which represents the query.

It should have a serialize method that generates an XML DOM document suitable
for transformation by XSLT into the query form.  Or perhaps I should just wrap
everything up into a single function that goes from RDF to XUL.

[Aside: I can parse XML into a DOM document with nsIDOMParser.parseFromString.]

There should be an addClause method that adds a clause to the query and a removeClause
method that removes a clause from the query.  Should a clause be a separate object?
Perhaps.  One that also has a definition and addClause and removeClause methods.

There should also be addCriterion and removeCriterion methods.  Perhaps a criterion
should also be an object.  This needs to be figured out.
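
One possible shape for that API, as a rough sketch only.  The class names Query,
Clause and Criterion are hypothetical, and the objects here live purely in memory;
the references to the data source and to the query resource described above are
left out to keep the example short.

```javascript
// Hypothetical in-memory objects; nothing here talks to a real RDF data source.
class Criterion {
  constructor(field, operator, value) {
    this.field = field;
    this.operator = operator;
    this.value = value;
  }
}

class Clause {
  constructor(type) {
    this.type = type;      // "AND" or "OR"
    this.children = [];    // nested Clause and Criterion objects
  }
  addClause(clause) { this.children.push(clause); return clause; }
  addCriterion(criterion) { this.children.push(criterion); return criterion; }
  removeClause(clause) { return this._remove(clause); }
  removeCriterion(criterion) { return this._remove(criterion); }
  // Returns true if the child was found and removed, false otherwise.
  _remove(child) {
    const index = this.children.indexOf(child);
    if (index === -1) return false;
    this.children.splice(index, 1);
    return true;
  }
}

class Query {
  constructor(name) {
    this.name = name;
    this.definition = new Clause("AND");   // the top-level clause
  }
}

const query = new Query("My Bugs");
const either = query.definition.addClause(new Clause("OR"));
either.addCriterion(new Criterion("status", "ne", "ASSIGNED"));
const dated = either.addCriterion(new Criterion("creation_date", "gt", "2003-10-11"));
either.removeCriterion(dated);
console.log(either.children.length);   // one criterion left
```

Making the clause a separate object with its own add/remove methods keeps the query
object itself tiny: it is just a name plus a top-level clause.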

We'll also need to figure out how we're going to uniquely ID these objects
(in the RDF data store as well as in the JavaScript object).  Since Bugzilla
currently uses the x-x-x notation where the first digit indicates the chart,
the second the AND clause and the third the OR clause, we could number them
the same way as Bugzilla, or more accurately number them according to the name
of the query plus the number, ala my-query1#0-1-0.
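
A sketch of that numbering, assuming a single chart given as an array of OR groups
that are ANDed together.  The helper names criterionId and assignIds are made up
for the example.

```javascript
// Build an ID such as "my-query1#0-1-0" from a query name and the
// chart / AND-clause / OR-clause positions (all zero-based).
function criterionId(queryName, chart, andIndex, orIndex) {
  return queryName + "#" + [chart, andIndex, orIndex].join("-");
}

// Assign IDs to every criterion of one chart, given as an array of
// OR groups (the groups themselves are ANDed together).
function assignIds(queryName, orGroups, chart = 0) {
  return orGroups.map((group, andIndex) =>
    group.map((criterion, orIndex) => ({
      ...criterion,
      id: criterionId(queryName, chart, andIndex, orIndex),
    }))
  );
}

const ids = assignIds("my-query1", [
  [{ field: "foo" }],
  [{ field: "bar" }, { field: "baz" }],
]);
console.log(ids[1][0].id);   // → my-query1#0-1-0
```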

To what should we serialize?  In theory we could serialize to hierarchical RDF,
but perhaps it would be easier to serialize to something much simpler and easier
to understand, f.e. the following format:

  <clause id="urn:clause1">
    <clause id="urn:clause2">
      <clause id="urn:clause3">
        <criterion field="foo" operator="foo" value="foo"/>
        <criterion field="bar" operator="bar" value="bar"/>
      <criterion field="baz" operator="baz" value="baz"/>
    <clause id="urn:clause4">
      <criterion field="buz" operator="buz" value="buz"/>
      <criterion field="biz" operator="biz" value="biz"/>
    <criterion field="bez" operator="bez" value="bez"/>
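
Producing that simpler format from an in-memory tree might look like the sketch
below.  It builds a text string rather than a DOM document, and the node shape
(id and children on clauses, field/operator/value on criteria) and the function
names are assumptions of the example.

```javascript
// Escape the characters that are special inside XML attribute values.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Serialize a clause tree to indented XML text.  Clause nodes have an
// id and children; criterion nodes have field, operator and value.
function serialize(node, indent = "") {
  if (!node.children) {
    return indent + '<criterion field="' + escapeAttr(node.field) +
      '" operator="' + escapeAttr(node.operator) +
      '" value="' + escapeAttr(node.value) + '"/>';
  }
  const lines = [indent + '<clause id="' + escapeAttr(node.id) + '">'];
  for (const child of node.children) lines.push(serialize(child, indent + "  "));
  lines.push(indent + "</clause>");
  return lines.join("\n");
}

const clause = {
  id: "urn:clause4",
  children: [
    { field: "buz", operator: "buz", value: "buz" },
    { field: "biz", operator: "biz", value: "biz" },
  ],
};
console.log(serialize(clause));
```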

Note that the kinds of clauses we can generate in practice will be much simpler,
since we won't be using multiple charts (we'll implement multi-query unions/intersections for that),
and because Bugzilla only allows one to query for an intersection of union clauses
(i.e. one or more OR clauses that get ANDed together).  So this is the most we can have:

foo AND (bar OR baz) AND (biz OR buz OR bez)

The UI will reflect this by not letting people do anything else, but the code itself
will support more complicated scenarios since Bugzilla itself may support them one day
(and when it does I want to implement a command line interface that lets you actually
type in foo AND (bar OR baz) etc. and be able to parse that back into a query).
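
For the command line idea, a small recursive-descent parser would probably be enough.
The sketch below is hypothetical (parseExpression and the node shape are made up),
and it assumes OR binds more tightly than AND, to match the precedence used earlier
in this log; bare words stand in for whole criteria.

```javascript
// Split the input into parentheses and words.
function tokenize(text) {
  return text.match(/\(|\)|[^\s()]+/g) || [];
}

function parseExpression(text) {
  const tokens = tokenize(text);
  let pos = 0;

  // An atom is a bare word or a parenthesized sub-expression.
  function parseAtom() {
    const token = tokens[pos++];
    if (token === undefined) throw new Error("unexpected end of input");
    if (token === "(") {
      const inner = parseAnd();
      if (tokens[pos++] !== ")") throw new Error("expected )");
      return inner;
    }
    if (token === ")" || token === "AND" || token === "OR") {
      throw new Error("unexpected " + token);
    }
    return { name: token };
  }

  // Parse one or more operands separated by `operator`; a single
  // operand is returned as-is rather than wrapped in a clause.
  function parseList(operator, parseOperand) {
    const children = [parseOperand()];
    while (tokens[pos] === operator) {
      pos++;
      children.push(parseOperand());
    }
    return children.length === 1 ? children[0] : { type: operator, children };
  }

  const parseOr = () => parseList("OR", parseAtom);
  function parseAnd() { return parseList("AND", parseOr); }

  const tree = parseAnd();
  if (pos < tokens.length) throw new Error("unexpected " + tokens[pos]);
  return tree;
}

console.log(JSON.stringify(parseExpression("foo AND (bar OR baz)")));
```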

Ok, so to summarize, I think I want the following architecture:

1. RDF datastore (model) in which queries are stored as resources
   as they are now but with the addition of a <definition> property
   that specifies the query conditions;
2. XUL UI (view) for editing query conditions;
3. JavaScript object(s) (controller) that access the RDF datastore
   and convert query data into a UI so that users can specify the query;
   when the users make changes the controller saves them back to the model
   and updates the view;

Tasks 1:

1. build controller shell;
2. enable controller to access datastore and retrieve query;
3. enable controller to serialize query conditions into XML;
4. make XSLT stylesheet that converts XML to XUL;
5. hook it all up and make sure that it works;

Tasks 2:

1. add code to controller to change datastore when user modifies existing criterion;
2. add code to controller to change datastore when user adds criterion;
3. add code to controller to change datastore when user removes criterion;
4. add code to controller to add new query to datastore;

One thing that we'll have to work out is that users may want to run queries that they don't save.
We'll need to work out the user interaction for that scenario and also how we store it internally.

So, the user starts Bugxula, and upon starting Bugxula they are confronted with a query form.
They enter a query into the form and run it.  At this point the query is not saved.  If they
modify the form then their old query is lost.  The user can run query after query this way
and no query they run will ever be saved or be retrievable (actually, in the long run we should
keep a history of these queries so that we can retrieve them if the user decides they were interested
in one of those queries after all).

At some point the user runs a query and then saves it.  The query shows up in the queries list
as the currently selected query.  Now, if the user makes modifications in the query form
those modifications will be reflected in the query.  At this point we have to decide whether or not
the query should automatically get those changes or we should prompt the user or what.  Perhaps
the answer is that we should auto-save changes if the user actually runs the query again with the changes,
but if they change the query and then don't run it and move on to something else we should throw away
their changes.  Hmm, something to think about.

Then again we could put up a tabbed interface where when they run the query it opens up a tab
in the application for the query.  Then it would be obvious when the user was on a saved query
and when they were on an unsaved query.  Hmm, something to think about.

Hmm, here's what I have now:

      <RDF:Bag about="urn:foo1#clause1">
          <RDF:Alt about="urn:foo1#clause2">
            <RDF:li><bz:condition bz:field="foo" bz:operator="foo" bz:value="foo"/></RDF:li>
            <RDF:li><bz:condition bz:field="bar" bz:operator="bar" bz:value="bar"/></RDF:li>
          <RDF:Alt about="urn:foo1#clause3">
            <RDF:li><bz:condition bz:field="baz" bz:operator="baz" bz:value="baz"/></RDF:li>
            <RDF:li><bz:condition bz:field="biz" bz:operator="biz" bz:value="biz"/></RDF:li>
        <RDF:li><bz:condition bz:field="baz" bz:operator="baz" bz:value="baz"/></RDF:li>

Basically, since containers are first-class resources in Mozilla (and will be in RDF with the next syntax update),
I can use them as the clauses without having to add extra bz:clause tags.  And since there are several types
of containers, I can use Alts to signify unions (ORs) and Bags to signify intersections (ANDs).

[Note: Bugxula-specific stuff should have its own namespace under something like "bx:".]

I guess the remaining question is whether it would be better to create union and intersection tags
instead of using the generic Alt and Bag containers.  That would delineate things nicely,
and it would also prevent the order of the criteria from changing in case Mozilla changes in the future
and no longer maintains Alts and Bags as ordered lists (which they aren't in reality).

I guess it would look something like this:

      <RDF:Seq about="urn:foo1#clause1">
            <RDF:Seq about="urn:foo1#clause2">
              <RDF:li><bz:condition bz:field="foo" bz:operator="foo" bz:value="foo"/></RDF:li>
              <RDF:li><bz:condition bz:field="bar" bz:operator="bar" bz:value="bar"/></RDF:li>
            <RDF:Seq about="urn:foo1#clause3">
              <RDF:li><bz:condition bz:field="baz" bz:operator="baz" bz:value="baz"/></RDF:li>
              <RDF:li><bz:condition bz:field="biz" bz:operator="biz" bz:value="biz"/></RDF:li>
        <RDF:li><bz:condition bz:field="baz" bz:operator="baz" bz:value="baz"/></RDF:li>

The problem is that this isn't valid RDF.  The intersection is ok--it has turned into a property
of the query rather than a resource the query is related to.  The unions, however, have become
properties of properties (which is what the RDF:lis are).  I'm not sure how this would work at all.
If it's really a problem to use Alts and Bags, which I'm not sure that it is, I could change
them to Seqs and add custom properties to the containers to identify them as unions or intersections.
I strongly suspect that's possible.

Anyway, I'm done for the day.  So far I've got the code that serializes the RDF to XML
and then transforms it into XUL.  Now I need to write the stylesheet that actually does
the transformation, since right now it's transforming into a barebones XUL fragment.
After that it's a matter of hooking up code that notices when things change in the UI
and directing those changes back to the RDF datastore.

Then I have to wrap my brain around this issue of a new query that hasn't been saved yet
and figure out what I'm going to do about that.  Store it in the datastore anyway?
Or just create the UI and mass save the query back to the datastore if the user decides
to save the query?  Perhaps mass saving is the right way to handle any change, actually,
since queries are small enough that there shouldn't be a performance penalty in wiping out
the existing RDF and replacing it with new RDF.  Something to think about.
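
The mass-save idea, sketched against a toy triple store instead of a real RDF data
source.  The TripleStore class is hypothetical, and it assumes that every part of a
query uses a URI of the form "<query URI>#..." (as in urn:foo1#clause1 above).

```javascript
// A toy triple store: each statement is [subject, property, value].
class TripleStore {
  constructor() { this.triples = []; }

  add(subject, property, value) {
    this.triples.push([subject, property, value]);
  }

  // Drop every statement whose subject is the query or one of its parts.
  removeQuery(queryUri) {
    this.triples = this.triples.filter(
      ([subject]) => subject !== queryUri && !subject.startsWith(queryUri + "#")
    );
  }

  // Mass save: wipe the old statements, then write the new ones.
  replaceQuery(queryUri, newTriples) {
    this.removeQuery(queryUri);
    for (const [subject, property, value] of newTriples) {
      this.add(subject, property, value);
    }
  }
}

const store = new TripleStore();
store.add("urn:foo1", "bz:name", "My Bugs");
store.add("urn:foo1#clause1", "bz:field", "foo");
store.add("urn:foo2", "bz:name", "Other");

store.replaceQuery("urn:foo1", [
  ["urn:foo1", "bz:name", "My Bugs"],
  ["urn:foo1#clause1", "bz:field", "bar"],
]);
console.log(store.triples.length);   // the other query's statement is untouched
```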

After that I'll want to figure out how to convert the RDF conditions into a query string,
perhaps using another XSLT template.  Then I'll want to make sure I handle the filters
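
For comparison with the XSLT route, here is what the conversion to a query string
could look like in plain JavaScript.  This is a sketch under assumptions: the
field/type/value parameter prefixes are how I understand Bugzilla's boolean chart
parameters to be named and should be checked against a real installation, the
operator codes are passed through unchanged, and toChartQueryString is a made-up name.

```javascript
// Convert an AND-of-ORs list of criteria into boolean-chart style
// parameters, numbering each criterion chart-AND-OR (x-x-x).
function toChartQueryString(orGroups, chart = 0) {
  const params = new URLSearchParams();
  orGroups.forEach((group, andIndex) => {
    group.forEach((criterion, orIndex) => {
      const suffix = [chart, andIndex, orIndex].join("-");
      // Assumed parameter names; verify against the target installation.
      params.append("field" + suffix, criterion.field);
      params.append("type" + suffix, criterion.operator);
      params.append("value" + suffix, criterion.value);
    });
  });
  return params.toString();
}

const groups = [
  [{ field: "status", operator: "ne", value: "ASSIGNED" }],
  [{ field: "creation_date", operator: "gt", value: "2003-10-11" }],
];
console.log(toChartQueryString(groups));
```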

At some point I'll want to hook up custom fields back into the UI.  I suspect that I
will drop them for now and go back to a hard-coded list of fields in the next working version.
Then I should probably also create an XSLT template to transform the list of installations
into a set of checkboxes so people really can search multiple installations.

I should add to the future feature list a way in the UI to add and remove installations.

2003 November 20:

<Asa_> myk: bugxula (version avail at mozdev) doesn't have Firebird or Thunderbird in the product select. Is that not dynamically pulled from bmo?
<myk> Asa: yes, but b.m.o's version isn't dynamic
<myk> Asa: it will be in a few weeks when we upgrade the server
<myk> Asa: in the meantime, i'll add them
<Asa_> thanks.
<Asa_> I'm trying to move over to using it as my primary interface.
<Asa_> myk: should I report bugs to you at mozdev, through email or IRC?
<myk> Asa_: ok, fixed; you might have to clear your cache to pick up the changes, however. on my copy of mozilla with bugxula installed the cache manager doesn't seem to notice that the file has been updated and is pulling it from cache
<Asa_> cool.
<myk> Asa_: ah, cool!  any of those methods will work.  you can always ping me on IRC about something if i'm around, or email me if i'm not
<myk> Asa_: if i can fix something quickly and spin a new build i will, otherwise i may ask you to file a bug to make sure it doesn't slip my mind
<myk> Asa_: the next thing i'm working on is a better query form which is essentially the quick search toolbar but with multiple criteria (ala the boolean chart)
<Asa_> ahh. cool. a bug I found is that loading a saved query doesn't set all the fields in the search toolbar back correctly.
<myk> that'll nail down the basic feature set, then it's on to the icing + a lot of work on UI usability
<myk> hmm, which fields does it miss?
<Asa_> my "flag is equal to foopy" query loads up "bug# is equal to foopy"
<myk> hmm, that seems wrong
<Asa_> also, sorting by votes seems to be quite wrong (at least for the queries I have defined)
<Asa_> it's like most of them are in order but a few with lots of votes are scattered through the list.
<myk> do you get query results when searching for flag is equal to foopy?
<Asa_> sorry, not "foopy".  I have several queries for flag. try equal to blocking1.6? or approval1.6?
<Asa_> those return results fine for me.
<Asa_> also, it would be cool if you added bugxula to the Firebird extension panel in preferences so it could be disabled.
<Asa_> and if you installed to the profile directory so I didn't have to reinstall each day.
<myk> ah, found the flag problem
<Asa_> cool.
<myk> i'll fix the installer too so it installs to the profile directory and spin you a new build
<myk> i'll try to get the extension panel fix in as well
<myk> i'll get you a new build by tomorrow
<myk> so let me know if there's anything else you want fixed by then
<Asa_> OK!!
<Asa_> thanks.
<Asa_> OK. I've got another feature request, might be too large to do now. Renaming, reordering and foldering of searches in the saved searches sidebar
<Asa_> any of those
<Asa_> and for some reason, after installing it on firebird on OS X, I don't see it listed in the menu.
<Asa_> seems fine on Seamonkey on OS X
<Asa_> and it shows up in the menu on Firebird windows.
<Asa_> and the windows XP progress meter is blank during searches.
<Asa_> also, it seems as though on Windows I have a query that fails but which works on Mac. If I search for all bugs reportedby it fails on bugxula firebird and seamonkey on windows (both builds a few days old) but does fine on Mac seamonkey.
<myk> i don't think i'll be able to get to renaming/reordering/foldering/ by tomorrow
<myk> i'm not sure what the deal is on OS X firebird; i'll have to look into that when i can sit in front of a mac this weekend
<myk> (i've tested on firebird linux and seamonkey linux, and it shows up in both of those)
<myk> yeah, i've been meaning to figure out why that progress bar won't fire up; i should be able to knock that down by tomorrow
<myk> hmm, that query works for me on firebird linux
<myk> (0.7)
<myk> but i got another report of query failure recently; can you open a console on windows and send me the URL that gets printed to it when you run that query?

2003 November 21:

As it says in index.html:

  <li>fixes a bug where the field menu wouldn't always show the right value after you clicked on a saved search;</li>
  <li>installs to the profile directory so you don't have to reinstall every time you get a new build;</li>
  <li>adds itself to the Firebird extensions list so you can disable it;</li>
  <li>fixes a bug where the item for running Bugxula didn't appear in the Tools menu on Firebird for Mac OS X;</li>
  <li>fixes a bug where the progressmeter wasn't working;</li>

I also fixed the "votes don't sort right" problem with bug 226469 (a Bugzilla bug):
