User Guide: Searching

  • General

    Not every search pattern is accepted by CLucene. An invalid query causes an error to be thrown. To avoid this, follow these rules:
    • queries must not start with a wildcard
    • queries should be longer than 2 characters
    • only US-ASCII characters (8-bit) are supported
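
    The three rules above can be checked before a query is handed to the index. The following is a minimal sketch, not part of the extension: the function name isAcceptableQuery is hypothetical, and the ASCII check assumes the conservative 7-bit range (0x00 to 0x7F).

```javascript
// Hypothetical helper (not part of the extension): pre-checks a query
// string against the three rules listed above.
function isAcceptableQuery(query) {
    // Rule 1: the query must not start with a wildcard.
    var first = query.charAt(0);
    if (first === "*" || first === "?") {
        return false;
    }
    // Rule 2: the query must be longer than 2 characters.
    if (query.length <= 2) {
        return false;
    }
    // Rule 3: only US-ASCII characters.
    // Assumption: the conservative 7-bit range 0x00..0x7F is used here.
    for (var i = 0; i < query.length; i++) {
        if (query.charCodeAt(i) > 0x7F) {
            return false;
        }
    }
    return true;
}
```

    With this sketch, isAcceptableQuery("eigenf*") passes, while "*faces", "?" and "ab" are rejected before they can trigger an error.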
  • Queries

    The indexer is based on CLucene, the C++ port of Apache's Lucene project. It therefore supports the full Lucene query syntax as well. The following wildcards are the most useful:

    '*': Matches terms with 0 or more characters in place of the wildcard.

    Example (works):

    eigenf* or eigenf*s

    Example (doesn't work):

    *faces or *

    '?': Matches terms with exactly one arbitrary character in place of the wildcard.

    Example (works):

    eigenf?ces or eigenface?

    Example (doesn't work):

    ?igenfaces or ?

    The query parser offers many more operators for building an efficient search request. For more details, see the query parser documentation.
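
    To illustrate what the two wildcards match, the sketch below translates a wildcard pattern into a JavaScript regular expression and tests it against a single term. This only illustrates the matching rules described above; it is not how CLucene evaluates queries, and the function name wildcardToRegExp is hypothetical.

```javascript
// Hypothetical helper: converts a wildcard pattern into a RegExp that
// matches a whole term. '*' stands for 0 or more characters, '?' for
// exactly one character. All other regex metacharacters are escaped.
function wildcardToRegExp(pattern) {
    var source = pattern.replace(/[.*+?^${}()|[\]\\]/g, function (ch) {
        if (ch === "*") return ".*";
        if (ch === "?") return ".";
        return "\\" + ch;
    });
    return new RegExp("^" + source + "$");
}

// Usage: check the example patterns against the term "eigenfaces".
var term = "eigenfaces";
var matchesStar = wildcardToRegExp("eigenf*s").test(term);
var matchesQuestion = wildcardToRegExp("eigenf?ces").test(term);
```

    Both matchesStar and matchesQuestion are true for the term "eigenfaces", whereas "eigenf?ces" does not match "eigenfces", because '?' requires exactly one character.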
  • Address Bar Support

    Since version 1.05, bookmark tools supports searching from the address bar. In practice, this means that you can enter any pattern (see the section above) directly into the urlbar text field. Note that index results are marked with a magnifier icon.

  • Finding relevant results

    Each query returns a list of search results sorted by score in descending order. The score describes the relevance of a CLucene document. In other words, each returned bookmark carries a value that indicates how relevant the result is to the query. Internally this value is a float in the range [0..1], but the extension displays the score as an integer from 0 to 100. A value such as 0.0012312 therefore results in a displayed score of 0 and is not shown.

    Note that the urlbar hides all score values by default.
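
    The mapping from the internal float to the displayed integer can be sketched as follows. This is an assumption-laden illustration, not the extension's actual code: the function names toDisplayScore and isShown are hypothetical, and rounding to the nearest integer is assumed (the text above only states that 0.0012312 becomes 0).

```javascript
// Hypothetical helper: maps a raw score in [0..1] to the integer 0..100
// shown by the extension. Assumption: rounding to the nearest integer.
function toDisplayScore(rawScore) {
    // Clamp defensively so out-of-range inputs cannot leave 0..100.
    var clamped = Math.min(1, Math.max(0, rawScore));
    return Math.round(clamped * 100);
}

// Hypothetical helper: a score is only shown when it is greater than 0.
function isShown(rawScore) {
    return toDisplayScore(rawScore) > 0;
}
```

    Under these assumptions, toDisplayScore(0.0012312) yields 0, so isShown(0.0012312) is false, while a raw score of 0.5 is displayed as 50.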
  • Sidebar features

    The extended sidebar helps you find bookmarks in your folder tree. When a search result is selected, it is displayed in the book tab below. Note that in case of duplicates, the first entry that equals the selected bookmark is shown.




HowTo use the index component (uses clucene-core-0.9.20)

Writing::


    // Initialize the file object (location: profile directory)
    var file = Components.classes["@mozilla.org/file/directory_service;1"]
        .getService(Components.interfaces.nsIProperties)
        .get("ProfD", Components.interfaces.nsIFile);

    file.append("index"); // folder named 'index'

    // Create the index folder if it does not exist yet.
    // parseInt("0777", 8) replaces the legacy octal literal 0777,
    // which is rejected in strict mode.
    if (!file.exists() || !file.isDirectory()) {
        file.create(Components.interfaces.nsIFile.DIRECTORY_TYPE, parseInt("0777", 8));
    }

    // Initialize the XPCOM writer component
    var writer = Components.classes["@bookmarktools.mozdev.org/search/indexWriter;1"].createInstance();
    writer = writer.QueryInterface(Components.interfaces.ICLWriter);

    // Set the index path
    writer.setPath(file.path);

    // Create the index document
    var cl_document = {
        primaryKey: "key0",        // this should be a unique key
        name: "first document",    // document name
        content: "Hello World!"    // some content
    };

    // Append the document to the index
    writer.appendDocument(
        cl_document.primaryKey,
        cl_document.name,
        cl_document.content
    );

    // Optimize the index data
    writer.optimize();
									
								

Searching::


    // Initialize the XPCOM reader component
    var reader = Components.classes["@bookmarktools.mozdev.org/search/indexReader;1"].createInstance();
    reader = reader.QueryInterface(Components.interfaces.ICLReader);

    // Set the index path ('file' is created in section 'Writing')
    reader.setPath(file.path);

    if (reader.exists()) { // check for index data

        var query = "Hell*";
        var results = Components.classes["@mozilla.org/array;1"]
            .createInstance(Components.interfaces.nsIMutableArray);
        reader.search(query, results);

        var e = results.enumerate();
        while (e.hasMoreElements()) {
            var resultNode = e.getNext().QueryInterface(Components.interfaces.IResultNode);

            // Object: resultNode
            // -> resultNode.primaryKey
            // -> resultNode.score
            alert(resultNode.primaryKey + "\n" + resultNode.score);
        }

    } else {
        // no index data found: display a message or do nothing
    }
									
								

Lookup::


    // Initialize the XPCOM reader component
    var reader = Components.classes["@bookmarktools.mozdev.org/search/indexReader;1"].createInstance();
    reader = reader.QueryInterface(Components.interfaces.ICLReader);

    // Set the index path ('file' is created in section 'Writing')
    reader.setPath(file.path);

    var results = Components.classes["@mozilla.org/array;1"]
        .createInstance(Components.interfaces.nsIMutableArray);
    reader.lookup(results);

    var e = results.enumerate();
    while (e.hasMoreElements()) {
        var lookupNode = e.getNext().QueryInterface(Components.interfaces.ILookupNode);

        // Object: lookupNode
        // -> lookupNode.primaryKey
        // -> lookupNode.name
        alert(lookupNode.primaryKey + "\n" + lookupNode.name);
    }

								

Delete documents::


    // Initialize the XPCOM writer component
    var writer = Components.classes["@bookmarktools.mozdev.org/search/indexWriter;1"].createInstance();
    writer = writer.QueryInterface(Components.interfaces.ICLWriter);

    // Set the index path ('file' is created in section 'Writing')
    writer.setPath(file.path);

    // Delete the documents with index 0, 5 and 2
    var docs = [0, 5, 2];
    writer.deleteDocument(docs, docs.length);

								

HowTo use the tidy component (uses ctidy version 060405)

Cleanup::


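    // Initialize the XPCOM tidy component
    var tidy = Components.classes["@bookmarktools.mozdev.org/repair/tidy;1"].createInstance();
    tidy = tidy.QueryInterface(Components.interfaces.ITidy);

    // Deliberately broken markup: unclosed <body tag and a bare '&'.
    // A JavaScript string literal cannot span several lines, so the
    // source is concatenated line by line.
    var src =
        "<html>\n" +
        "    <head></head>\n" +
        "    <body\n" +
        "        broken body tag & no entity\n" +
        "    </body>\n" +
        "</html>";

    // result contains the repaired html/xhtml/xml source;
    // otherwise an error message is returned
    var result = tidy.cleanup(src);

    if (result === "Tidy: no output") {
        alert("error returned");
    } else {
        alert(result);
    }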