The DayPI supports a mechanism we call source filtering as a means to limit API results to a specific set of sources, or to eliminate a specified set of sources from returned results.
This recipe describes the two existing methods for invoking source filters in DayPI calls.
In general, a source filter can act either as a whitelist, which limits returned results to a specified set of sources, or as a blacklist, which suppresses results from a particular set of sources.
In addition, source filters can be defined in two ways: as a predefined set, or as a runtime-defined set, which you provide in the context of your DayPI call.
In the currently available production version of the DayPI, Daylife must set up a predefined source filter for you.
Predefined source filters are defined per access key; only you will be able to access a source filter defined for you.
The mechanism is straightforward, and is initiated with an email to us that includes the following information:
We'll set up the source filter on our end, and return to you a source filter ID that you can add to your DayPI calls to achieve the filtering you've defined.
You can have several source filters going at once, each of which you can choose to invoke at appropriate points in your code. For instance, if you're building an application for a newspaper, you might build one source filter that whitelists only your publication, then another that blacklists sources that are direct competitors or whose viewpoints you wish to spare your readers. You could then build a page that shows news first from your publication, then from a sanitized set of sources from around the web.
You invoke a predefined source filter by including its source filter ID as a parameter in your DayPI call.
Without waiting on us, you can specify a whitelist or blacklist for sources at runtime by using on-the-fly source filters.
Using an on-the-fly source filter, you can limit the returned results to a set of sources, or exclude a particular set of sources from your results.
You can put an on-the-fly source filter in place by adding a parameter to your call:
source_whitelist=<sourcename | sourceID>
To define a blacklist, you'd use the following:
source_blacklist=<sourcename | sourceID>
For instance, the following call gets the past week’s worth of CNN articles that match the search term “Iraq”:
Note that if you use a source name for the source_whitelist value, it must exactly match the name we use. A completely reliable way to get results from a particular publication is to use that source's source_id:
(0blKbZP140g1Z is the source_id for CNN)
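As a rough illustration of the two options above, the following Python sketch builds the filter portion of a query string, once with a source name and once with a source_id. Only the source_whitelist and source_blacklist parameter names and the CNN source_id come from this recipe; the helper name build_filtered_query and the query parameter are hypothetical placeholders, and a real call would also need the method URL, your access key, and whatever other parameters that method requires.

```python
from urllib.parse import urlencode

# The source_id for CNN, as given in this recipe.
CNN_SOURCE_ID = "0blKbZP140g1Z"


def build_filtered_query(search_term, source, mode="whitelist"):
    """Return a URL-encoded query string with one on-the-fly source filter.

    `source` may be a source name or a source_id.
    `mode` is either "whitelist" or "blacklist".
    NOTE: the "query" parameter name is an assumption made for this sketch;
    check the documentation of the DayPI method you are calling.
    """
    if mode not in ("whitelist", "blacklist"):
        raise ValueError("mode must be 'whitelist' or 'blacklist'")
    params = [
        ("query", search_term),        # hypothetical search parameter
        (f"source_{mode}", source),    # source_whitelist or source_blacklist
    ]
    return urlencode(params)


# Filtering by name relies on the name matching exactly.
by_name = build_filtered_query("Iraq", "CNN")
# Filtering by source_id avoids any name-matching problems.
by_id = build_filtered_query("Iraq", CNN_SOURCE_ID)
```

Using urlencode rather than string concatenation means source names containing spaces or punctuation are escaped correctly.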
If you wish to have several sources in your filter, specify the source_whitelist or source_blacklist multiple times. For instance:
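One way to produce the repeated parameter is sketched below in Python. The helper name repeat_source_param and the second source value are placeholders invented for this example; only the parameter names and the CNN source_id are taken from this recipe.

```python
from urllib.parse import urlencode


def repeat_source_param(param_name, sources):
    """Return a query-string fragment that repeats `param_name` once per source.

    `param_name` should be "source_whitelist" or "source_blacklist".
    `sources` is an iterable of source names or source_ids.
    """
    if param_name not in ("source_whitelist", "source_blacklist"):
        raise ValueError("unexpected parameter name: " + param_name)
    # A list of (name, value) pairs lets the same name appear several times.
    return urlencode([(param_name, source) for source in sources])


# "Example Times" is a made-up source name used only to show escaping.
fragment = repeat_source_param(
    "source_whitelist", ["0blKbZP140g1Z", "Example Times"]
)
```

The resulting fragment can be appended to the rest of the query string for the call.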
You can download the list of sources here.