Good Feng Shui? A Course Evaluation

We started this blog as an experiment. I was expecting this to be just a writing assignment, but it turned out to be more than that.

I like the blog assignment because it helps me keep up with some of the concepts we learn in class and apply them to my own interactions with the Web (apparently you capitalize the Web). For example, I was a Pinterest addict. When I first started using it I knew something was different about it, yet I couldn’t quite put my finger on it. Of course Pinterest could be analyzed in relation to almost any topic we covered in class this year. I chose to examine it as an information retrieval system. Not only am I more critical of what a website has to offer (is it Web 2.0? is it Web 3.0 yet?!) but I am also equipped with a better vocabulary to discuss it.

Similarly, writing this blog encouraged me to try out new tools and give them a test run. I learned how to use Altmetric, Wordle, Voyant Tools, TAGS, and more. Many of these are in their first stages of production. The developers from Old Bailey Online were particularly excited to hear the feedback from our class. I’m not so sure I will be continuing on for a PhD in the future (probably not ever) or will need to use these tools in the workplace, but I’m sure they will be helpful when I’m conducting my thesis research. Again, I feel like I have developed stronger analytical skills. I’m not going to kid myself: I have not gained any computer engineering skills in this course. Instead, we gained the ability to ‘see’ websites and consider how they are organized. We can’t build websites, but we sure can comment on them.

I found this class important and relevant because the future is online. Text and data mining are here to stay because of the sheer volume of information made available on the Web. I would have liked to learn more about the Semantic Web because that’s another development of the Web that’s still in progress. I think I have gained relevant skills to be able to participate in the next fundamental change in the way information is structured online.


Old Bailey

The Old Bailey was London’s criminal court between the 17th and 20th centuries. A combined effort by two universities and several grant-funding bodies makes thousands of proceedings accessible online. The website is set up for both simple and complex searches. This makes it a great tool for researchers and for students learning about digital technologies, like myself.

The simple search is found on the website’s home page. Its use is for generic searching or just browsing. If you navigate to the search page instead, you have a wider variety of search options. See the respective pictures below.

Screen Shot 2014-11-30 at 10.16.56 PM Screen Shot 2014-11-30 at 10.16.41 PM

Both these searches are limited by the documents you are able to search and by the types of queries you are able to make from those documents. You can search the proceedings and the Ordinary’s Accounts dating from 1669-1772. Oliver Twist inspired my search terms: I was looking for keyword ‘bread’, offense ‘theft’, and punishment ‘public whipping’. Only 15 results came back and, unsurprisingly, no Artful Dodgers.

The API lets the user explore their results differently from the simple search. The simple search just produces a list of results, while the API search allows you to explore those results. One way is to essentially search within the results that were produced by an original simple search. They call this process ‘undrilling’, or breaking the results down by subcategory. You can also explore these results further with other tools, as the API lets you export your results into Voyant. Probably because of the amount of traffic caused by our lab session, I wasn’t able to export my data. It was taking too long.
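To make the idea of breaking results down by subcategory a bit more concrete, here is a minimal sketch in Python. It does not call the Old Bailey API at all. The records, the field names and the two helper functions are all made up by me for illustration, so treat it as a picture of the idea rather than how the site actually works.

```python
from collections import Counter

# Invented records standing in for a list of search results.
# These are NOT real Old Bailey data; the field names are assumptions.
results = [
    {"offence": "theft", "subcategory": "pocketpicking", "punishment": "whipping"},
    {"offence": "theft", "subcategory": "shoplifting", "punishment": "whipping"},
    {"offence": "theft", "subcategory": "pocketpicking", "punishment": "transportation"},
    {"offence": "theft", "subcategory": "burglary", "punishment": "whipping"},
]


def break_down(records, field):
    """Count how many records fall under each value of `field`."""
    return Counter(record[field] for record in records)


def narrow(records, field, value):
    """Keep only the records whose `field` equals `value` (a search within results)."""
    return [record for record in records if record[field] == value]


by_subcategory = break_down(results, "subcategory")
whipped = narrow(results, "punishment", "whipping")
print(by_subcategory)
print(break_down(whipped, "subcategory"))
```

The point is simply that a second question is asked of the results you already have, rather than starting a fresh search.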



Textual Analysis…Analysis

When I hear the phrase ‘word cloud’, a memory from the HBO show Weeds surfaces in my mind (Season 6, Episode 12). The anti-hero, Nancy, is threatened by an undercover journalist getting dangerously close to the truth. Nancy scoffs when presented with a word cloud, but is then on her guard when she hears that the top 5 adjectives for her son, Shane, have helped the journalist to correctly guess that Shane is Pilar’s murderer! Drama!

While wildly entertaining, this is a scene of pure fiction. I doubt word clouds will be protocol for investigations anytime soon. I also doubt that word clouds will be used in serious academic writing. New York Times journalist Jacob Harris considers the tool to be “the crudest sorts of textual analysis” for simply using size to indicate the frequency of words used. That is a strong opinion considering it comes from a guy who specializes in data journalism. On the other hand, author Julie Meloni would say the Wordle tool is simple and useful. Her evidence, though, is firmly based in literary examples. Creating a word cloud is appropriate for single pieces of text like poems, novels, or speeches because you are often looking for themes or patterns of rhetoric. Textual analysis in an academic setting is meant to search large numbers of texts, not just one.

I experimented with making a word cloud of my own. First, I used Altmetric to gather articles, all published in the last 6 months, from the journals David Bawden suggested for our RECS assignment (listed at the end). I exported the data into a .csv file and opened it in Excel. Next, I simply copied and pasted all the titles into the Wordle text box.

Prescribed Titles Word Cloud

I noticed that Wordle automatically removes stop words (common words that don’t mean anything by themselves, like conjunctions or prepositions). This is a convenient feature, but it doesn’t give you any way to customize the stop words. The only alterations the user can make are superficial: things like layout, font and color. This website is a great tool for visualization, but not such a great tool for analysis.
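For comparison, this is roughly what counting words with a stop list you control could look like. It is a minimal sketch in Python and not how Wordle itself works. The tiny stop list, the function name and the three sample titles are my own inventions for illustration; they are not the titles from my Altmetric export.

```python
import re
from collections import Counter

# A tiny, illustrative stop list (an assumption, not Wordle's actual list).
DEFAULT_STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "on", "with"}


def word_frequencies(text, extra_stop_words=()):
    """Count words in `text`, skipping the default and any extra stop words."""
    stop_words = DEFAULT_STOP_WORDS | {word.lower() for word in extra_stop_words}
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stop_words)


# Invented titles, used only to show the effect of a custom stop word.
titles = """Information literacy in the public library
The future of the academic library
Information seeking and the digital library"""

default_counts = word_frequencies(titles)
custom_counts = word_frequencies(titles, extra_stop_words=["library"])
print(default_counts.most_common(3))
print(custom_counts.most_common(3))
```

Being able to add a word like ‘library’ to the stop list matters for a set of library science titles, where that word would otherwise dominate the cloud without telling you much.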

Another website, Voyant Tools, also includes a visualization of your text, along with a wide variety of useful statistical tools. I’ll also mention that if you hover over a word in the word cloud with your cursor, the number of times that word is used will appear.

Screen Shot 2014-11-21 at 12.32.07 AM


I’ll admit that I probably wouldn’t use the frequency chart very often, even though it looks very analytical. Let’s just say it doesn’t speak to me. However, I would use the ‘key word in context’ tool. This tool lists the sentence in which a selected word appears, thus eliminating the problem Harris described of separating signifiers from what they signify.
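The idea behind a ‘key word in context’ listing is simple enough to sketch in a few lines of Python. This is my own rough version under simple assumptions (sentences end in a full stop, exclamation mark or question mark), and Voyant’s actual tool almost certainly works differently. The function name and the sample text are made up for illustration.

```python
import re


def keyword_in_context(text, keyword):
    """Return every sentence in `text` that contains `keyword` as a whole word."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]


# Invented sample text for illustration.
sample = "Word clouds show frequency. They hide context! Context is what a reader needs."
matches = keyword_in_context(sample, "context")
for sentence in matches:
    print(sentence)
```

Seeing the whole sentence puts the word back next to whatever it was describing, which is exactly what a word cloud throws away.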

Screen Shot 2014-11-21 at 12.39.57 AM


In a very brief conclusion, Voyant has much more to offer than simple Wordle.

List of journals for Altmetric data set:

Journal of Librarianship and Information Science

Library Trends

Library Review

Journal of Documentation

Journal of Information Science

Journal of the Association for Information Science and Technology

Information Research