
This news indexer sample is what we use to power the indexer widgets for Treehugger and CNN's Meme-o-meter.
The indexer takes a list of queries as input and generates a weekly index of the number of mentions of each query term in the news over the last 2 weeks. It also generates a list of 2 stories for each query term that were in the news in the last 2 weeks.
You specify the list of query terms either as a semicolon-separated list or as a URL to an input file that adheres to this format. The input XML file format has a list of
The index creator outputs XML that you can use to render your own indexer. Here is how you can test the indexer, using either an input file or a list of query terms.
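If you build the semicolon-separated list of query terms in your own code, it may help to normalize it before sending it. The following is a minimal Python sketch of that step, not the indexer's own parsing code. The function name `parse_query_terms` and the sample terms are hypothetical, and dropping empty and duplicate terms is an assumption about what you would want, not documented indexer behavior.

```python
from typing import List


def parse_query_terms(raw: str) -> List[str]:
    """Split a semicolon-separated string into a clean list of query terms.

    Assumption: surrounding whitespace, empty entries, and exact duplicates
    are not meaningful, so they are removed. Input order is preserved.
    """
    terms: List[str] = []
    for piece in raw.split(";"):
        term = piece.strip()
        if term and term not in terms:
            terms.append(term)
    return terms


def join_query_terms(terms: List[str]) -> str:
    """Rebuild the semicolon-separated form from a list of terms."""
    return ";".join(terms)


if __name__ == "__main__":
    # Hypothetical terms, for illustration only.
    cleaned = parse_query_terms("solar power; wind energy ;;solar power")
    print(cleaned)
    print(join_query_terms(cleaned))
```

Keeping the list short matters here, since the indexer slows down as the number of terms grows.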
This index creator is not the fastest, and its performance degrades as the number of input terms grows. To implement a high-performance application, we suggest that you fetch the response XML every 30 minutes or so and cache it on your side, then serve the cached copy to your live application.
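The caching suggestion above can be sketched as a small time-based cache. This is one possible approach in Python, not part of the PHP sample. The class name `CachedIndex` and the `fetch` callable are hypothetical: `fetch` stands for whatever code you write to request the response XML from the index creator and return it as a string.

```python
import time
from typing import Callable, Optional

# Refresh interval taken from the suggestion above: roughly every 30 minutes.
CACHE_TTL_SECONDS = 30 * 60


class CachedIndex:
    """Serve a cached copy of the index XML, refetching only when it is stale."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical: your function that calls the indexer
        self._ttl = ttl
        self._clock = clock          # injectable so the cache can be tested without waiting
        self._xml: Optional[str] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> str:
        """Return the cached XML, calling fetch() first if the copy is missing or expired."""
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            self._xml = self._fetch()
            self._fetched_at = now
        return self._xml
```

Your live application would then call `get()` on every request, and the slow index creator is contacted at most once per interval. If a fetch raises an exception, this sketch lets it propagate; a production version might prefer to keep serving the previous copy.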
The indexer is written in PHP, and you can download the code here.