Insight Horizon

Text Analyzer - April 2017

Hi Jessica,

Thanks for your quick reply. I had ducked into a meeting. So a few quick observations:

· Love the interface starting with the drag and drop. I could do it all day!
· Love the progress bar
· Love how the results are presented on the right, with the links out to the documents. I haven’t tried it extensively yet, but most of my result hits resolve to the full document, since I am inside the MIT libraries and we have access to nearly everything.
· I assume the prevalence of JSTOR hits is based in part on the availability of that text for mining?
· Love the sliders for the prioritized terms and the click-to-add interface for the identified terms. I can see great use in both of those especially if the target corpus is substantial enough. In other words, if my interest is in visual acuity and you have a critical mass of related information, then I can use the two filtering mechanisms to my heart’s content.

The results range from excellent to a little curious. I am attaching two journal articles that I tested. Both are in the area of composition theory/rhetoric and are by the same author, Peter Elbow, a major figure with a very long CV. The file “fulltext_stamped” is a less significant article of his; the second one, “fulltext_stamped (1),” is heavily cited. Not sure if those things are at all related, but the results for the lesser-known article are mostly off-target, while the results for the better-known article are on-target. The results for the lesser-known work seem to be skewed by an over-reliance on “oral” and “literature,” interpreting those terms as related to storytelling and literary studies, when the article is actually about how students use oral language in the course of creating their written work. (This topic is a staple in Elbow’s work.)

The results for the better-known article are much better. Without any filtering, the first three results* are highly relevant, as are a number in the first 10 or so. (One is the original document itself, though under its original citation rather than as a reprint.) My results get even better when I move the slider for “singers” to off, and improve further when I add terms such as “writing,” “writing instruction,” and “written composition.” With all of those changes, the full result set is relevant. I would want to just keep playing with it at that point.

So I am not sure if my two examples are instructive or not for your own purposes but they are pretty striking to me.

Oh, and one small point, though perhaps I am missing this in the interface—it would be nice if the results page referred to the original document by file name and/or article title. In fact, if the analyzer took a shot at the bibliographic citation at that point, it might be kind of cool.

Great work. So fun to look at. Let me know if I can be of any help.

* One interesting point in the results--your tool heavily associated the second article with the work of Joseph Harris. The article does indeed center on some of Harris’s work, and the results point to one of the specific citations in the article (his book, Voice), but they do not end up listing (at least in the early search results) a specific article cited, “The Plural Text/The Plural Self: Roland Barthes and William Coles.”

--email follow up from academic publisher in Massachusetts, April 26, 18:25