Tuesday, November 29, 2011

Such Seriousness

It's 6:50 (EST) in the morning and in the last 10 minutes Citation Machine has cited:

 7,138 books, 
   841 encyclopedia articles, 
10,738 journal articles, 
 1,946 magazine articles, 
 2,230 newspaper articles, 
   630 chapters from compiled works, 
 1,468 government or corporate documents,  
   546 interviews, 
    96 conference papers, 
22,731 web pages, 
 1,185 web-based media objects, 
   745 blog entries, 
    65 online discussion comments, 
   586 documents from databases, 
   240 radio or TV programs, 
   690 films or videos, and 
   195 lectures.  

So many very serious people out there.

Monday, November 21, 2011

Progress Report

New advertising to pay for additional RAM and the new uber server coming in December
This is a blog post that is way overdue.  A combination of economic calamity, rising young edtech stars, and a scheduled move to scale back my public speaking activities has kept me in my home office -- mostly working on Citation Machine.  Some associations with teams running similar web services have bent my attention toward improving this popular tool, which has gone a long time without proper attention and TLC.

I started out wanting and needing to do a total redesign of CM and how it worked, which inspired some rather enthusiastic resistance from commenters, who asked why something that works so well should be changed so much.  Although I still believe that the changes made the tool more efficient, efficiency is a personal thing, and practice plays a big part in what makes something work well.  So I lamented and went back to the old design.

Since then, I have spent some time adding sources to and filling out the Chicago style section, and making corrections to the other three citation formats.  I've also, for a long time, been interested in a way to automate some of the format building.  One way was to tie in to Google's ISBN book lookup API, enabling researchers to simply type in a book's ISBN and have the available information plugged into the citation template -- automatically.  That worked well until its use far exceeded Google's limits on how many lookups were allowed per web service.

I'd also been interested in creating an automated way of doing Web address lookups and tried my hand at page scraping, which is a highly technical and occasionally successful way of writing software that looks at a web page and pulls pertinent information from its text.  This, surprisingly, worked far better than I'd expected, but not well enough to consistently make CM more efficient -- and it cost way too much computer processing for CM's web servers.  So I abandoned it for another solution.
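
For the curious, here is a minimal sketch of what page scraping can look like, written in Python with only the standard library.  It is not CM's actual code.  The class name, the function name, and the choice of which tags to read (the page title plus the "author" and "og:site_name" meta tags) are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class CitationScraper(HTMLParser):
    """Collects the <title> text and any <meta> name/content pairs."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            # Pages label their metadata with either name= or property=.
            name = (attrs.get("name") or attrs.get("property") or "").lower()
            content = attrs.get("content")
            if name and content:
                self.meta[name] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def scrape_citation_fields(html):
    """Return the citation fields that could be found in the page source.

    Hypothetical helper: the set of fields is an assumption, and any field
    the page does not supply comes back as None.
    """
    parser = CitationScraper()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside the title.
    title = " ".join("".join(parser.title_parts).split())
    return {
        "title": title or None,
        "author": parser.meta.get("author"),
        "site_name": parser.meta.get("og:site_name"),
    }
```

A sketch like this finds something only when a page bothers to label its own title and author, which goes some way toward explaining the "occasionally successful" part.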

Going back to the ISBN issue, I decided to take a leap and to start archiving book citations that included ISBN numbers.  This has quickly generated a database of, at this writing, 45,244 books.  So, if you have the ISBN of a book today, you can enter it at the opening CM screen (or APA or MLA book template pages), select either MLA or APA style, and there's a pretty good chance that the following template form will be at least partially filled out by the database.  This seemed such a good idea that I started archiving Web sites as well, by URL.  At this writing there are considerably more Web sites in the database than books, showing 859,668.  So entering a Web address (with http:// included) gives you a fairly good chance of saving some time with at least partially completed template forms.
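
The archiving idea can be sketched in a few lines of Python.  This is only an illustration under my own assumptions, not CM's real database: the class, its method names, and the field names are hypothetical, and an in-memory dictionary stands in for the database.

```python
class CitationArchive:
    """Remembers citation fields by ISBN so later forms can be pre-filled."""

    def __init__(self):
        self._books = {}  # normalized ISBN -> dict of citation fields

    @staticmethod
    def normalize_isbn(isbn):
        # Keep digits and the X check character; drop hyphens and spaces.
        return "".join(ch for ch in isbn if ch.isdigit() or ch in "xX").upper()

    def save_book(self, isbn, fields):
        """Store whatever non-empty fields a researcher typed in."""
        key = self.normalize_isbn(isbn)
        if not key:
            return
        stored = self._books.setdefault(key, {})
        stored.update({name: value for name, value in fields.items() if value})

    def prefill_book(self, isbn, template_fields):
        """Return a value for every template field, blank where unknown."""
        stored = self._books.get(self.normalize_isbn(isbn), {})
        return {name: stored.get(name, "") for name in template_fields}


archive = CitationArchive()
archive.save_book("0-306-40615-2", {"title": "An Example Book", "publisher": ""})
form = archive.prefill_book("0306406152", ["title", "publisher", "year"])
# form["title"] is filled in; "publisher" and "year" are left blank to type in.
```

The same pattern would work for Web sites, with the full URL as the key instead of the ISBN.  Fields nobody has supplied yet stay blank, which is why a form may come back only partially completed.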

This, too, costs CPU power, so we had to double the RAM on one of the servers ($), and have concluded that we need to upgrade to a new uber server during the December holidays ($$$).  This is the reason for the additional advertising.  But increasing the size of CM's pages to make room for a 200x300 pixel ad also provided more space for instructions.  So for each CM source there are now some fairly detailed instructions on each form element to be entered in the template form.

The thing that got me writing so early this morning is that I've done most of this in silence, in the closed confines of my man-cave office.  So I'm going to try to be more open about what I'm doing, not just here in my Blogger blog, but also through the Facebook page and perhaps even a new Twitter account, posting periodic updates on what I'm doing with CM and why.

So pay attention!

Oh!  And then there's the squirrel.  But we'll talk more about him later ;-)