The Search News Object enables full-text searching of news content. The DayPI search engine is based on Lucene. As such, the DayPI supports a subset of Lucene's search syntax.
The Search News Object accepts queries composed of up to 32 terms. Each term is a word, with a space acting as the word boundary. Logical operators are not counted toward these 32 terms.
Search News Object method calls must have their query strings URL-encoded. For example, the query string iraq war must be converted to either iraq+war or iraq%20war:
http://freeapi.daylife.com
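As a minimal sketch of the encoding step described above, the following uses Python's standard urllib.parse module. The helper name encode_query is hypothetical and not part of the DayPI; the sketch only shows how a raw query string becomes one of the two encoded forms.

```python
from urllib.parse import quote, quote_plus


def encode_query(query: str, plus_for_space: bool = True) -> str:
    """URL-encode a raw search query (hypothetical helper, not a DayPI call)."""
    if plus_for_space:
        # Spaces become "+", other reserved characters become %XX escapes.
        return quote_plus(query)
    # Spaces become "%20"; safe="" also escapes "/" for consistency.
    return quote(query, safe="")


print(encode_query("iraq war"))                        # iraq+war
print(encode_query("iraq war", plus_for_space=False))  # iraq%20war
```

Quoted phrases and the title: prefix are escaped in the same pass, so the query should be built in full first and encoded once at the end.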
A query is broken up into terms and Boolean operators (discussed below). There are two types of terms: Single Terms and Phrases.
A Single Term is a single word such as "test" or "hello".
A Phrase is a group of words surrounded by double quotes such as "hello dolly".
Multiple terms can be combined together with Boolean operators to form a more complex query (see below). Note that all single terms and phrases are case insensitive.
Calls to the Search API are limited to 32 terms in total.
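The 32-term limit can be checked on the client before a call is made. The sketch below follows the counting rule stated above (each space-separated word is a term, logical operators are not counted). The names count_terms and within_limit are hypothetical, and treating each word inside a quoted phrase as its own term is an assumption based on that rule; the service may count differently.

```python
# Operator tokens excluded from the count (assumed to be the full set).
OPERATORS = {"AND", "OR", "NOT", "&&", "||", "!"}
MAX_TERMS = 32


def count_terms(query: str) -> int:
    """Count space-separated words, skipping Boolean operator tokens."""
    return sum(1 for token in query.split() if token not in OPERATORS)


def within_limit(query: str) -> bool:
    """Return True if the query stays within the 32-term limit."""
    return count_terms(query) <= MAX_TERMS


print(count_terms('"libby perjury" AND "Valery Plame"'))  # 4
```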
You can refine your search by limiting the application of a particular term to the headline or title of an article. This works like so:
title:"Global Warming"
You can combine this with a search against the body of the article, for example:
"jail time" AND title:Libby
This finds articles that contain the phrase "jail time" and have the word Libby in the headline.
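For illustration, a headline restriction can be built with a small string helper. The function title_term is hypothetical; it only reproduces the title: syntax shown above, quoting the text when it contains more than one word.

```python
def title_term(text: str) -> str:
    """Restrict a word or phrase to the headline (hypothetical helper)."""
    text = text.strip()
    if len(text.split()) > 1:
        # Multi-word text must be a quoted phrase to stay one title term.
        return f'title:"{text}"'
    return f"title:{text}"


print(title_term("Global Warming"))                 # title:"Global Warming"
print('"jail time" AND ' + title_term("Libby"))     # "jail time" AND title:Libby
```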
The DayPI (via Lucene) supports modifying query terms to provide a wide range of searching options.
Daylife searches determine the relevance level of matching documents based on the terms found. To boost a term, append the caret symbol, "^", followed by a boost factor (a number) to the end of the term you are searching for. The higher the boost factor, the more relevant the term will be.
Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for
cheney wiretapping
and you want the term "cheney" to be more relevant, boost it by placing the ^ symbol and the boost factor directly after the term. You would type:
cheney^4 wiretapping
This will make documents with the term cheney appear more relevant. In effect, this changes the sort order of the documents returned to you when you query using sort=relevance.
You can also boost Phrase Terms as in the example:
"cheney wiretapping"^4 "Alberto Gonzales"
By default, the boost factor is 1. The boost factor must be positive, and it can be less than 1 (e.g. 0.2).
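The boosting rules above (caret plus a positive factor, placed at the end of a term or phrase) can be sketched as a small helper. The name boost is hypothetical, and rejecting non-positive factors on the client is an assumption about how a caller might enforce the rule.

```python
def boost(term: str, factor: float) -> str:
    """Append a boost factor to a term or quoted phrase (hypothetical helper)."""
    if factor <= 0:
        # The documentation requires a positive boost factor.
        raise ValueError("boost factor must be positive")
    # The "g" format drops a trailing ".0", so 4.0 renders as "4".
    return f"{term}^{factor:g}"


print(boost("cheney", 4) + " wiretapping")      # cheney^4 wiretapping
print(boost('"cheney wiretapping"', 4))         # "cheney wiretapping"^4
print(boost("cheney", 0.2))                     # cheney^0.2
```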
Boolean operators allow terms to be combined through logic operators. The DayPI supports AND, "+", OR, NOT and "-" as Boolean operators. Note: Boolean operators must be in ALL CAPS.
The AND operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the AND operator is used.
You can use up to 32 terms in any given call to the Search API.
The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbols && can be used in place of the token AND.
To search for documents that contain "libby perjury" and "Valery Plame" use the query:
"libby perjury" "Valery Plame"
or
"libby perjury" AND "Valery Plame"
The OR operator links two terms and finds a matching document if either of the terms exists in the document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.
To search for documents that contain either "libby perjury" or just "plame" use the query:
"libby perjury" OR plame
The "+" or required operator requires that the term after the "+" symbol exist somewhere in the text of a retrieved document.
To search for documents that must contain "libby" and may contain "perjury" use the query:
+libby perjury
The NOT operator excludes documents that contain the term after the NOT operator. This is equivalent to a difference using sets. The symbol ! can be used in place of the token NOT.
To search for documents that contain "libby perjury" but not "Valery Plame" use the query:
"libby perjury" NOT "Valery Plame"
Note: The NOT operator should not be used with just one term. For example, the following search will return no results:
NOT "Valery Plame"
The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.
To search for documents that contain "libby perjury" but not "Valery Plame" use the query:
"libby perjury" -"Valery Plame"
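The set analogies used above (AND as intersection, OR as union, NOT as difference) can be illustrated with Python sets over a toy collection. The documents and the matching helper are invented for illustration only; this is not how the DayPI evaluates queries internally.

```python
# Toy document collection, keyed by a made-up document id.
docs = {
    1: "libby perjury trial begins",
    2: "plame testifies before congress",
    3: "libby and plame named in report",
}


def matching(word: str) -> set:
    """Return the ids of documents that contain the given word."""
    return {doc_id for doc_id, text in docs.items() if word in text.split()}


libby = matching("libby")
plame = matching("plame")

print(sorted(libby & plame))  # libby AND plame -> [3]
print(sorted(libby | plame))  # libby OR plame  -> [1, 2, 3]
print(sorted(libby - plame))  # libby NOT plame -> [1]
```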
You can use parentheses to group clauses to form sub queries. This allows specific control over how Boolean logic is applied to a query.
To search for documents that contain "indictment" together with either "libby" or "cheney" use the query:
(libby OR cheney) AND indictment
This retrieves articles in which either "libby" or "cheney" appears along with "indictment".
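To see why grouping matters, the sketch below evaluates two different groupings of the same three terms over a toy collection using Python sets. The documents and the matching helper are invented for illustration, and the sketch makes no claim about how the DayPI parses an ungrouped query.

```python
# Toy document collection, keyed by a made-up document id.
docs = {
    1: "libby indictment announced",
    2: "cheney gives speech",
    3: "cheney faces indictment questions",
    4: "libby appears on television",
}


def matching(word: str) -> set:
    """Return the ids of documents that contain the given word."""
    return {doc_id for doc_id, text in docs.items() if word in text.split()}


libby, cheney, indictment = (matching(w) for w in ("libby", "cheney", "indictment"))

# (libby OR cheney) AND indictment
grouped_left = (libby | cheney) & indictment
# libby OR (cheney AND indictment)
grouped_right = libby | (cheney & indictment)

print(sorted(grouped_left))   # [1, 3]
print(sorted(grouped_right))  # [1, 3, 4]
```

The second grouping also returns document 4, which mentions "libby" without "indictment", so explicit parentheses are the safest way to state the intended logic.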