Clean Run Magazine Online Search
04 Sep 2006
I regularly read the Clean Run agility email list, and there are always requests from magazine readers to locate the issue(s) containing various articles. The Clean Run staff usually search their indices and reply promptly. Clean Run’s website also provides Microsoft Word documents containing indexes to individual articles for free download, and the online store, where customers can order back issues, gives a brief description of each magazine. I figured that by combining that information I could put together a database of descriptions and indexes and provide an online search page for other Clean Run readers.
I first contacted Monica Percival at Clean Run to see if they were planning on providing search in the near term and if they would let me use their data for that purpose. Monica said they weren’t currently and kindly gave me permission to copy and reuse the data. So I downloaded the data and got to work.
Right now the search page supports searching each issue’s summary description and the article index from Jan 1997 to the latest data on the Clean Run website. I have written some scripts to “crunch” the data, but there is still a fair amount of manual work on the article index MS Word files to make them fit the format I need. So there will be a slight lag in keeping the database up to date as new issues are published. A checkbox on the search page lets you choose whether to search the index or the description information.
Here’s the link to the search page.
Technology
Not a lot of technology is involved. I found “free-search.cgi”, a simple Perl CGI script for searching flat file databases, created by CNC Technology. It had the basic functionality and didn’t require configuring and populating a database; after all, there isn’t that much data to search.
I had to modify the script to sort the output (newest issues first), provide flexibility in styling and placing the previous and next result buttons, reroute the no results page back to the search template page, and populate the search box with the previous search keywords. Including designing the page layouts, I probably spent about 8 hours getting this to work as I needed.
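To give a feel for the "newest issues first" change, here is a minimal sketch in Python rather than the Perl of the actual script. It assumes each result carries an issue label such as "Jan 1997"; that label format and the function name sort_newest_first are my own illustration, not what free-search.cgi uses.

```python
from datetime import datetime

def sort_newest_first(rows):
    """Sort (issue_label, text) rows so the most recent issue comes first.

    Assumes issue labels look like "Jan 1997" (abbreviated English month, year).
    """
    return sorted(
        rows,
        key=lambda row: datetime.strptime(row[0], "%b %Y"),
        reverse=True,
    )

# Hypothetical search results, for illustration only.
results = [
    ("Jan 1997", "first match"),
    ("Sep 2006", "second match"),
    ("Mar 2001", "third match"),
]

for issue, text in sort_newest_first(results):
    print(issue, "-", text)
```

Parsing the label into a real date matters here: sorting the labels as plain text would put "Sep 2006" ahead of "Jan 2007".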
I’ll send my changes back to CNC Technology in case they want to incorporate them into their script. Their copyright doesn’t allow me to redistribute my changes.
There are two flat file “databases”: one containing a line for the text summary of each issue and another containing a line for each article in each issue. Extracting and formatting that data is the time-consuming part.
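The idea of searching two flat files can be sketched roughly as follows, in Python rather than Perl. The pipe-delimited layout, the sample lines, and the function names are all assumptions made for illustration; the real files and the real script may well differ.

```python
# Hypothetical sample lines; the real file layout is not shown in this post.
DESCRIPTIONS = [
    "Jan 1997|Issue summary mentioning contacts and weave poles",
    "Sep 2006|Issue summary mentioning jumping and handling",
]
ARTICLES = [
    "Jan 1997|Teaching weave poles",
    "Sep 2006|Handling front crosses",
]

def search_lines(lines, query):
    """Return the lines that contain every keyword in the query, ignoring case."""
    keywords = query.lower().split()
    if not keywords:
        return []  # an empty search box matches nothing
    return [
        line for line in lines
        if all(word in line.lower() for word in keywords)
    ]

def search(query, use_index=True):
    """Search the article index or the issue descriptions, like the checkbox."""
    return search_lines(ARTICLES if use_index else DESCRIPTIONS, query)

print(search("weave"))
print(search("jumping", use_index=False))
```

With one record per line, a search is just a scan over the file, which is why no database server is needed for this amount of data.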
For the article database I’ve also added a keyword column. It isn’t populated yet, but I was thinking it could be used to add keywords to each article to help in locating articles whose titles don’t contain all the words that could identify them. I figured I could enter them as I read each article in future issues. Maybe I can get other enthusiasts to volunteer to provide keywords for the back issues.
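Here is a small sketch of how a keyword column could help, again in Python and again under assumptions: the three-field pipe-delimited layout (issue, title, keywords), the sample records, and the helper names are made up for illustration.

```python
def parse_article(line):
    """Split an 'issue|title|keywords' line into a record (assumed layout)."""
    issue, title, keywords = line.split("|")
    return {"issue": issue, "title": title, "keywords": keywords.split()}

def matches(article, query):
    """True if every query word appears in the title or the keyword column."""
    searchable = article["title"].lower().split()
    searchable += [keyword.lower() for keyword in article["keywords"]]
    return all(word in searchable for word in query.lower().split())

# Hypothetical records: the second has an empty keyword column.
with_keywords = parse_article("Mar 2001|Getting reliable contacts|dogwalk teeter a-frame")
without_keywords = parse_article("Jan 1997|Teaching weave poles|")

print(matches(with_keywords, "teeter"))     # found only through the keywords
print(matches(without_keywords, "teeter"))  # not in the title, no keywords
```

An empty keyword column simply contributes nothing to the search, so the back issues keep working as they do now until someone fills the keywords in.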
I hope you find this search tool helpful. Let me know if you have any suggestions or comments.