Google continues to be the most popular search engine, but it is not resting on its laurels. Improvements in the Google search algorithm and more powerful modes of image search may offer important new capabilities for Chemistry. Science applications of Big Data are now being used for genomic sequencing, climate science, electron microscopy, oceanography and physics. Chemical applications of Big Data have been slower to appear, but these techniques are now proving valuable for chemical literature searches and Quantitative Structure–Activity Relationship (QSAR) models. In addition, the special software that is being used to analyze Big Data is beginning to appear in traditional chemical searches, like Chemical Abstracts. danah boyd (sic) argues that Big Data has emerged as a system of knowledge that is already changing the objects of knowledge that we work with every day.
Recent Developments in Search Engines and Applications to Chemical Search
Over the years, one of the recurring themes of this column has been the question, “What is the best search engine for chemists?” For some time Google has been the most popular engine, and this continues to be the case. Despite the millions of dollars that Microsoft has spent to advertise Bing, Google’s market share has only dropped from 65% to 64.8%. Bing has gained some market share, but most of the increase has been at the expense of the third-place engine, Yahoo. It is estimated that the efforts to overtake Google are costing Microsoft almost a billion dollars a quarter (http://www.dailytech.com/Bing+Loses+Nearly+1B+per+Quarter+for+Microsoft/article22817.htm). Microsoft’s latest marketing ploy is the “Bing It On Challenge” (http://www.bingiton.com/), which asks people which search results they would choose, Bing or Google, in a “side-by-side, blind-test comparison.” Microsoft claims that people choose Bing by a two-to-one ratio. I tried it with chemical terms and picked Google four out of five times, with the other result being a draw. Similarly, a professor from Yale Law School recruited people to make the comparison and found that Google was the preferred engine unless the search terms were selected by Microsoft (http://www.slate.com/blogs/future_tense/2013/10/02/bing_vs_google_professor_fact_checks_bing_it_on_challenge_microsoft_loses.html). Microsoft claims that the professor’s experiment was flawed, but identifies no specific problems.
Google continues to improve its search engine. In my last column I talked about the Knowledge Graph, the sidebar on some search results that helps to disambiguate searches when more than one possible term corresponds to the query (http://www.ccce.divched.org/P1Fall2012CCCENL). Google announced last month that it had quietly rolled out a new search algorithm, called "Hummingbird" (http://www.infoworld.com/t/search-engines/googles-new-hummingbird-algorithm-about-queries-not-just-seo-227732). Google constantly makes small improvements to the search algorithm, but Hummingbird is billed as a more significant change.
One new feature is the comparison tool. For example, if you search for “butter vs. olive oil,” the first result is a convenient table comparing the two. Apparently, this is still being implemented, because a chemical topic like “benzene vs. cyclohexane” does not display a similar table (although it does return sites that will make this comparison). For this kind of comparison, the Wolfram Alpha search engine still provides much better results for chemists (http://www.wolframalpha.com/).
Many users pose search queries as natural language questions, but in the past search engines have responded by parsing search phrases one word at a time. The new Hummingbird algorithm is an artificial intelligence program tuned to parsing entire questions, not individual words. For many searchers, especially those who do not construct sophisticated query strings, this should give far better results. Hummingbird is also supposed to see a search in the context of a previous search. For example, if you search for "pictures of Great Danes" and then "common health problems," Google says it will understand that you're looking for common health problems associated with Great Danes (http://www.entrepreneur.com/article/228620#ixzz2gNwHUfee).
Google has announced a new and more sophisticated way to search for images (http://www.google.com/insidesearch/features/images/searchbyimage.html). There was not enough time to explore this facility in any detail, but to get an idea of how it might work, do the following:
This may be a useful capability if you are looking for suggestions of compounds that have a structure similar to a known compound.
The fact that Google is still popular does not tell the whole story. Because chemists search for unusual terms, the crucial question that determines the best engine is the size of the search engine index. Remember that search engines do not directly search the WWW. Instead, automated programs called netbots move around the web and collect information, which is then stored in an index. When a chemist does a search, he or she is really searching that index. The larger the index, the more likely it is that it will contain the scientific information that the chemist is searching for.
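The idea that a search runs against a stored index, not against the live web, can be illustrated with a toy inverted index. This is only a minimal sketch: the page addresses and text below are hypothetical stand-ins for what a netbot might collect, and real search engine indexes are far more elaborate.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the addresses of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical crawled pages; the addresses and text are made up for illustration.
pages = {
    "example.org/a": "synthesis of sitagliptin analogs",
    "example.org/b": "benzene and cyclohexane properties",
    "example.org/c": "sitagliptin inhibits DPP4",
}
index = build_index(pages)
hits = search(index, "sitagliptin DPP4")
print(hits)
```

A term that no netbot has collected simply returns nothing, which is why a larger index matters for unusual chemical terms.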
Maurice de Kunder estimates that as of September 2013, the indexed Web maintained by Google contains at least 3.84 billion pages (http://www.worldwidewebsize.com/). This is also probably a good indication of the size of the accessible web. The total size of the WWW is difficult to guess because much of it is dark, that is, it isn’t easily accessible by search engine netbots. Tammy Everts estimated that in 2012, the average web page was over a megabyte in size (http://www.webperformancetoday.com/2012/05/24/average-web-page-size-1-mb/). This suggests that the accessible web is on the order of a few petabytes (3.84 billion pages at roughly 1 megabyte each comes to about 4 petabytes). De Kunder estimates that Google uses an index about four times as large as Bing, which is another reason chemists should expect better results from Google.
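The size estimate above is a one-line multiplication, sketched here for readers who want to vary the inputs. The page count and page size are the figures quoted in the text; the use of decimal units (1 megabyte = 10^6 bytes, 1 petabyte = 10^15 bytes) is an assumption. Taking both figures at face value gives a few petabytes, so the result is best read as an order-of-magnitude estimate.

```python
PAGES_INDEXED = 3.84e9       # de Kunder's estimate of indexed pages, as quoted
BYTES_PER_PAGE = 1e6         # about 1 megabyte per page (Everts); decimal units assumed
BYTES_PER_PETABYTE = 1e15    # decimal petabyte, assumed

def accessible_web_petabytes(pages=PAGES_INDEXED, page_bytes=BYTES_PER_PAGE):
    """Estimate the size of the indexed web in petabytes."""
    return pages * page_bytes / BYTES_PER_PETABYTE

estimate_pb = accessible_web_petabytes()
print(f"Estimated accessible web: {estimate_pb:.2f} PB")
```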
Looking at de Kunder’s data makes it clear that search engines must deal with huge amounts of data. The following comparisons may help to clarify the size of a petabyte (http://groups.yahoo.com/neo/groups/SUNNETmedia/conversations/topics/816). One petabyte of data would fill 250,000 DVDs or 20 million four-drawer file cabinets. This seems like an impossibly large amount of data, but Google currently processes about 20 petabytes of data per day and Facebook currently stores about 100 petabytes of data. The entire written works of mankind from the beginning of recorded history in all languages are estimated to represent about 50 petabytes.
Even a petabyte is small compared with the vast amounts of information in circulation. According to researchers at UC San Diego, Americans consumed about 3.6 zettabytes of information in 2008, where a zettabyte is 1 × 10^21 bytes of data. David Weinberger suggests another way to visualize these very large units. A digital version of War and Peace would be about 2 megabytes, or 1296 pages in a traditional book, so one zettabyte would equal 5 × 10^14 copies of War and Peace. It would take a photon of light travelling at 186,000 miles per second 2.9 days to go from the top to the bottom of a stack of that many books.
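The War and Peace comparison can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the text and assumes decimal units for the megabyte; the stack height is simply the distance light covers in the quoted 2.9 days.

```python
ZETTABYTE_BYTES = 10**21            # one zettabyte, as defined in the text
WAR_AND_PEACE_BYTES = 2 * 10**6     # about 2 megabytes (decimal units assumed)
LIGHT_MILES_PER_SECOND = 186_000    # speed of light quoted in the text
SECONDS_PER_DAY = 86_400

def copies_per_zettabyte():
    """Number of 2-megabyte books that fit in one zettabyte."""
    return ZETTABYTE_BYTES // WAR_AND_PEACE_BYTES

def light_travel_miles(days):
    """Distance light travels in the given number of days, in miles."""
    return days * SECONDS_PER_DAY * LIGHT_MILES_PER_SECOND

copies = copies_per_zettabyte()
stack_miles = light_travel_miles(2.9)
print(f"{copies:.1e} copies; stack height about {stack_miles:.2e} miles")
```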
Part of the reason for this incredible accumulation of data is that computing resources are becoming much cheaper and more powerful. In 1980 a terabyte of disk storage cost $14 million; now it costs about $30. Not too long ago, only a few individuals or companies had access to supercomputers; now Amazon or Google will “rent” a cloud-based supercomputer cluster for only a few hundred dollars an hour. Social networks are also contributing massive amounts of data. Twitter generates more than 7 terabytes (TB) a day; Facebook more than 10 TB; and some enterprises already store data in the petabyte range. (A terabyte is about 1 × 10^12 bytes.) This leads to the question of how search engines deal with these huge quantities of data.
The problem is not just the amount of data. IBM describes the current information environment as the Three V’s: volume, variety, and velocity. The sheer volume of stored data is exploding. IBM predicts that there will be 35 zettabytes stored by 2020, and the amount of data available is growing by about 60% every year. The velocity of data depends on not just the speed at which the data is flowing but also the pace at which it must be collected, analyzed, and retrieved. And finally, this data comes in a bewildering variety of structured and unstructured formats. Whereas traditional data warehouses use a relational database (like Excel rows and columns), search engines must also handle unstructured data in non-relational databases, sometimes called NoSQL. Unstructured data is often text-based and may include large amounts of metadata. The combination of the three V’s is often described as “Big Data.”
Search engine providers long ago realized that new computational methods are needed to analyze Big Data. In order to power its searches, Google developed a strategy called MapReduce. A search is divided into many small components, each of which is assigned to one of the available processors for execution (Google has over a million servers) and then the results from each processor are recombined. The most popular software to accomplish this is a program called Hadoop, first developed in 2005 by the Apache Software Foundation. Doug Cutting, who was one of the developers, named the program after his son's toy elephant. There are currently six different freeware versions of Hadoop available. Hadoop is designed to collect data, even if it does not fit nicely into tables, distribute a query across a large number of separate computer processors, and then combine the results into a single answer set in order to deliver results in almost real time.
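The divide, distribute, and recombine pattern described above can be sketched in a few lines. This is a single-machine illustration of the MapReduce idea using a word count, not Hadoop's actual API; the function names and sample documents are hypothetical. In a real cluster each call to the map step would run on a separate processor.

```python
from collections import defaultdict

def map_phase(document):
    """Emit (word, 1) pairs for one document; each call is independent of the others."""
    return [(word, 1) for word in document.lower().split()]

def shuffle(mapped_chunks):
    """Group the emitted values by key across all map outputs."""
    grouped = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input, split into pieces that could be handled by different processors.
documents = ["benzene ring", "cyclohexane ring", "benzene vapor"]
mapped = [map_phase(doc) for doc in documents]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because no map call depends on another, the work can be spread over as many processors as are available, and only the grouping and reduce steps need to see results from more than one piece.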
Are Big Data procedures relevant for Chemists?
Although chemical applications don’t usually involve data sets that are as big as those described above, Big Data tools can be useful even for moderately large data sets. For example, some chemical literature searches and Quantitative structure–activity relationship (QSAR) models can benefit from the application of Hadoop and similar techniques. The Royal Society of Chemistry has digitized all of the articles from their journals going back to 1841 and plans to make this database available (http://www.slideshare.net/AntonyWilliams/digitally-enabling-the-rsc-archive?from_search=1). The goal is to extract reactions, spectra, data, and as many small molecules as possible. It is hoped that processing this information with Big Data tools will promote the discovery of new relationships that had previously been overlooked.
Chemical search services, like Chemical Abstracts, are now beginning to use Hadoop to work with datasets that are too large for conventional methods. Anthony Trippe writes that those who work with chemical patents have had to deal with Big Data even before the name became popular (http://www.patinformatics.com/blog/first-look-new-stn-big-data-creates-chemistry-without-limits/). He points out that, “The universe of available patent documents, worldwide, is well over 80 million, and in the CAS world of chemistry, the running count of known organic, and inorganic molecules currently stands at over 73 million substances, not including an additional nearly 65 million sequences. These types of numbers, as well as the interconnectedness of the data, certainly allow patent, and chemical information to qualify as sources of Big Data.”
In July of this year, FIZ Karlsruhe and Chemical Abstracts Service (CAS) announced the launch of Version One of a new STN platform. Trippe reports that this new version of STN is powered by Hadoop. Until recently, data analysts sometimes found it difficult to run a broad structure search for patents dealing with compounds of interest, but Trippe says the new platform “. . . puts the entire collection of chemical, and patent data at their fingertips, and allows them to manipulate it at will.”
To demonstrate the power of the new STN powered by Hadoop, Trippe describes how a pharmaceutical company could search for compounds that might function like Januvia, a member of a new class of anti-diabetic agents that inhibit dipeptidyl peptidase-IV (DPP4). This requires a search for all compounds that are structurally similar to sitagliptin (the free base of Januvia) and have been studied in conjunction with DPP4. Sitagliptin has two ring systems, one a phenyl and the other a 6,5 system with four nitrogen atoms. Any compound that involves this basic skeleton would be of interest. This search involved more than 10 million substances, and all of the references related to these items. A query of this magnitude would have been very difficult to run on the old version of STN. It runs in less than 30 seconds on the new STN, and produces almost 2.5 million structures. Trippe concludes by writing, “Combining the breadth, and depth of the collections available, with the deep indexing that has been created by the database producers, generates a powerful combination that opens the door to exploring chemistry, and patents in a way that has never existed before.”
Search engines have become such a ubiquitous part of everyday life that it is easy to assume that there are few changes occurring. This is far from the truth. Google is leading the way towards new methods for search, and when these become fully functional they may be very useful to chemists. Beyond this, the techniques that are being developed for search are opening new avenues for science. In particular, the application of ideas from Big Data is already playing a significant role in Chemistry, and may become even more important in the future.
Big Data is important not just because of size but also because of how it connects data, people, and information structures. It enables us to see patterns that weren’t visible before. Major companies, like Google, Amazon, and Netflix, are already using large data sets to offer new services to their customers, and the National Security Agency is also using big data for less user-friendly purposes. danah boyd (sic) summarizes the potential of Big Data by writing, “Just as Ford changed the way we make cars – and then transformed work itself - Big Data has emerged as a system of knowledge that is already changing the objects of knowledge, while also having the power to inform how we understand human networks and community (http://softwarestudies.com/cultural_analytics/Six_Provocations_for_Big_Data.pdf).” boyd adds, “And finally, how will the harvesting of Big Data change the meaning of learning and what new possibilities and limitations may come from these systems of knowing?”
1. Weinberger, D., Too Big to Know: Rethinking Knowledge Now That the Facts Aren't the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room. 2012, New York, NY: Basic Books.
2. Zikopoulos, P. C., et al., Understanding Big Data. 2012, New York, NY: McGraw-Hill.
This is the twentieth anniversary of the ConfChem online conferences, which may make them the oldest ongoing online conference in chemical education, or possibly the chemical sciences. This paper will give an overview of the evolution of ConfChem and this related newsletter. We will also introduce you to new CCCE efforts to resurrect the archives and introduce a folksonomy that can connect past papers to present and future papers.
The Twentieth Anniversary of ConfChem Online Conferences:
Past, Present and Future
The CCCE, the Committee on Computers in Chemical Education, is a standing committee of the ACS Division of Chemical Education. We are all volunteers with the majority of our members being faculty at academic institutions. In this paper we will provide an overview of our Newsletter and online ConfChem conferences, from a past, present and future perspective. We will look at how our communications have evolved from paper-based media to online publications utilizing social web technologies. In this paper we will introduce our latest project, a folksonomy indexing of our archives. We can provide you with access to our development site, http://confchem.ccce.us/, and are seeking your input on this project before we move it to the production servers at divched.org. It is our hope that this new functionality will be part of next year’s ConfChem and Newsletter articles.
Part 1: The Past
The Printed CCCE Newsletter (1978-2000)
The CCCE Newsletter started as a printed communication, with the oldest one we have so far found being the March 1985 issue (fig. 1). As this is Volume VIII, Number 1, we can deduce that the first Newsletter was published in 1978. We are currently seeking old editions and the ones we have found are posted online at http://www.ccce.divched.org/Newsletter. Would you please contact Dr. Robert Belford (firstname.lastname@example.org) if you have access to any old Newsletters which are not posted at the above site?
Fig. 1. March 1985 CCCE Newsletter and order form.
Currently, these old newsletters are only available as PDF files, but we are undertaking a project to scan them with OCR software, separate the individual articles and post them in a Web 2.0 content management system. It is our intention for these articles to become taggable and part of our future folksonomic indexing (see Part 3 of this article). When this project is completed the old printed Newsletters will have the same level of web presence as the new ones. In 2000 the Newsletter went online and was run like a ConfChem, but there was no conference theme, with each paper being on a current topic related to the use of computers in chemical education.
Origins of ConfChem: ChemConf (1993-1998)
Around 1991 Bill Halpern started the CHEMED-L list at the University of West Florida, and in 2011 CHEMED-L moved to Google Groups (http://groups.google.com/group/chemed-l). This list is actively used today by the chemical education community to discuss a wide variety of chemical education related topics. ConfChem actually evolved out of early CHEMED-L discussions.
“The idea originated in 1992, on the CHEMED-L listserv. Members of that listserv had been using it to announce upcoming meetings, conferences, and symposia. At one point, several interesting meetings were announced in a short period of time at different locations; I commented that it would be difficult to attend all of them and I mused that it would be nice if these types of conferences, with presented papers and question-and-answer periods, could be conducted over the nascent Internet. Don Rosenthal was quick to support the idea and he and I organized the first "ChemConf".” – Tom O’Haver (private email correspondence)
The idea was to run an online conference by posting papers and discussing them on a list server. It was decided not to use CHEMED-L, and a new list, CHEMCONF@UMDD.UMD.EDU, was created for the sole purpose of discussing ConfChem papers. The first ConfChem, called ChemConf, was run in 1993 and organized by Donald Rosenthal and Thomas O’Haver. This basic model, which is what we are using to discuss this newsletter, has not substantially changed since the first ConfChem. This year is the 20th anniversary of ConfChem, which we believe makes it the oldest ongoing online conference in the chemical sciences. To put this in perspective, let’s look at a few comments from the 1993 survey that was given at the end of the first ConfChem.1
“This was my first experience using e-mail, so I was using it as a learning experience. I found it very rewarding, and expect to continue using e-mail when it makes sense.”
“E-mail puts everyone on equal footing: I can ask questions or state opinions without fear that they will be ignored because of my relative inexperience”
“The face to face contact is lost, but for someone who is new to teaching it provided remarkable access to discussion. At a usual conference I would not have known who to introduce myself to and whose conversations to eavesdrop on.”
“Many of the discussion comments and ideas in the papers will be passed on to our Dean for consideration, as well as shared with other faculty. The fact that it is in writing, rather than just notes from a conference, makes it easier to organize”
“I didn't participate in any discussions but kept up with them daily by reading all messages that came through. I picked up some good information and communicated by private e-mail with a few of the participants. In this way, the conference was invaluable. I now have e-mail addresses of a wide variety of people that I can call on with questions, etc.”
In the pre-web 1993 ConfChem the papers were ASCII text files delivered by anonymous ftp, and the images had to be downloaded separately. The instructions are still available; they were not platform independent and required the use of UNIX commands, which would obviously have been a challenge for many of the participants. Here is an excerpt from the “Instructions for Participants”:2
“There are three steps involved in viewing these figures: (1) "down-loading" the figure to your personal computer; (2) converting it into a binary file; and (3) opening the resulting file in a GIF viewer.”
The first conference in the chemical sciences to use the World Wide Web was the Electronic Computational Chemistry Conference organized by Steven Bachrach in 1994,3 which was followed by the first ConfChem to use internet browsers in 1995 (organized by Arlene Russell and Michael Pavelich). This was a major advance: through the use of Netscape Navigator the pictures were seamlessly embedded into the papers, and separate instructions for different platforms were no longer needed. Both the list and the papers of the early ConfChems were maintained by Tom O’Haver at the University of Maryland. In 1998, the name changed from ChemConf to ConfChem and the list moved from the University of Maryland to Clarkson University, where it was moderated by Donald Rosenthal, while Brian Tissue at Virginia Tech became the webmaster and was responsible for posting the papers. In 2006 John Penn of West Virginia became the webmaster, running the website until 2010, while Bob Belford took over moderating the list, which was moved to the University of Arkansas at Little Rock in 2008. Over these years the CCCE posted papers on three different websites, with additional papers posted on the authors’ own web sites, many of which have now been lost.
Part 2: The Present
Drupal 6 & Web 2.0
The need to consolidate ConfChems became apparent and a decision was made to host the online conferences on a Drupal 6 Web 2.0 site run by the ACS Division of Chemical Education, http://www.ccce.divched.org/. Jon Holmes was instrumental in this work, and the new site was first used for the Spring 2010 ConfChem, and this is the site we are using in this Newsletter. The most obvious improvement was to connect the list server to the comment feature of each paper, thereby associating the discussions with the papers instead of storing them in a remote archive. So now, instead of replying to emails, responses are pasted below the comment or article being responded to, and that triggers an email to the list. We simply subscribe the ConfChem list to the paper being discussed. We also started creating a PDF of each paper and uploading that as a supporting file. The Drupal content management system also made it easy to embed YouTube videos, images, and even embedded applets that were run off an author’s servers.
The other major change was behind the scenes, in how we actually post the papers and conferences. In 2011 we created two new content types, ConfChem-Conference and ConfChem-Article. A web content type is a specific kind of web page with pre-designated features; blogs, forums and wikis are typical content types. The publicly viewable webpage extracts material from a database, and we started using taxonomies to generate the viewable content. This article you are reading is of the ConfChem-Article content type and the Abstracts page is of the ConfChem-Conference content type. What you see on this Newsletter’s abstract page are three fields (title, author and abstract) extracted from articles of the ConfChem-Article content type that are tagged “Fall2013Newsletter” and sorted by a fourth field, the date field. If we changed the tag to Fall2012Newsletter, the article would appear in the abstracts page of last year’s Newsletter.
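The way an abstracts page is generated from tagged articles can be sketched as a filter followed by a sort. This is an illustration of the logic only, not Drupal's actual mechanism; the Article class, the abstracts_page function, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Article:
    """A stand-in for one ConfChem-Article record; field names are assumptions."""
    title: str
    author: str
    abstract: str
    posted: date
    tag: str

def abstracts_page(articles, tag):
    """Return (title, author, abstract) for articles carrying the tag, sorted by date."""
    selected = sorted((a for a in articles if a.tag == tag), key=lambda a: a.posted)
    return [(a.title, a.author, a.abstract) for a in selected]

# Hypothetical records for illustration.
articles = [
    Article("Paper B", "Author Two", "Second abstract.", date(2013, 10, 25), "Fall2013Newsletter"),
    Article("Paper A", "Author One", "First abstract.", date(2013, 10, 21), "Fall2013Newsletter"),
    Article("Paper C", "Author Three", "Older abstract.", date(2012, 10, 15), "Fall2012Newsletter"),
]
fall_2013 = abstracts_page(articles, "Fall2013Newsletter")
for title, author, abstract in fall_2013:
    print(title, "|", author, "|", abstract)
```

Changing the tag on a record is all it takes to move that article to a different abstracts page, which is the behavior described above.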
JCE ConfChem Feature
Clearly, the CCCE, being a group of volunteers, does not have adequate resources for long-term archiving, so we made an arrangement with the Journal of Chemical Education to bundle 800-word communications on each ConfChem paper into the printed journal. These communications are abstracted, and JCE provides a citable DOI. PDFs of the original ConfChem papers are uploaded as supporting information, and thus the JCE version is a better source to cite. The last three ConfChems have now been published in JCE, with each ConfChem being on a specific topic of relevance to the chemical education community.
Part 3: The Future
What is a Folksonomy? Why a Folksonomic Index of the ConfChem Archives?
A folksonomy, a taxonomy created by the “folks,” is a method for classifying a set of articles or comments using a set of user-generated tags. The term was coined in 2004 by Thomas Vander Wal (http://vanderwal.net/folksonomy.htm). Vander Wal originally defined this as a “user-created bottom-up categorical structure development with an emergent thesaurus.” A more modern definition (from Wikipedia) is, “A folksonomy is a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content” (http://en.wikipedia.org/wiki/Folksonomy).
Folksonomies have gained popularity because they have been used on popular web sites like Del.icio.us and Flickr. Although popular, this method of information retrieval has several apparent disadvantages. The query responses are not prioritized as they would be by a search engine, and the metacontent, i.e. the tags, are often not developed by information specialists who would use a controlled list of terms for the classification. Thus, folksonomies are often considered less useful for seeking specific information than traditional search methods. Sinclair and Cardew-Hall4 have compared the usefulness of folksonomies with traditional search and conclude that, “…tag clouds are useful when the information-seeking task is nonspecific. That is, tag clouds support browsing or serendipitous discovery.”
Instead of an open tagging process for the Newsletter and ConfChems, we will only allow participants within the ConfChem community access to tagging permissions, which will allow the emergence of new vocabularies and terms as they gain traction in the community of chemical educators. Since the papers in the ConfChem database span a broad spectrum of topics, it is also hoped that this will lead to the discovery of fortuitous relationships that would be unlikely to result from more traditional methods.
In 2012 the CCCE was awarded a small ACS Innovative Projects grant to develop the folksonomy-indexed ConfChem archive. We created our first cloud-based development site with Drupal 7 at http://ccce.us/ using the Bluehost cloud service, and are using the ConfChem subdomain, http://confchem.ccce.us/, for this project. As mentioned earlier, we are now using taxonomies to publish our ConfChem papers and Newsletters. These taxonomies are a closed vocabulary that defines the particular Newsletter or ConfChem conference a paper belongs to. But what if we used an open vocabulary and allowed the community to tag the papers? Then you could create the equivalent of the abstract page of this newsletter, showing all articles selected by a given tag filter; in effect, a Newsletter that extracts articles from the archives based on the folksonomy. Please feel free to join us on our development site as we refine the user interface to make this as useful as possible to the community.
Figures 2 through 4 give screenshots of the development site. Figure 2 is the display that an anonymous user who does not log in would see. Note on the right the filter options and a list of tags that users can apply to bundle papers from different conferences. Once we are done, you will be able to relate papers from as far back as 1985 to the most current papers and, for example, see how ConfChem and Newsletter articles on “visualizations” have changed over the last 28 years.
Fig. 2: Screenshot of a ConfChem papers sort list at confchem.ccce.us showing social tags and filter options.
Fig. 3: Two screenshots of ConfChem articles with the tag option. The right screenshot shows tags a current paper has, and provides an autocomplete form for adding a tag.
Note, only authenticated users will see the tags option in figure 3, but everyone can use the tags. Once we move to the divched.org server your ConfChem login will give you the ability to tag items, and you can tag both past papers and current papers you are actively commenting on. Only members of the ConfChem community can tag papers, but everyone can use them.
Fig. 4: Screenshot giving the titles of papers which have been tagged with either safety, undergraduate or visualization.
This view is open to the public and on the left block are several options we have set up for extracting ConfChem papers. We are still experimenting around and seeking input on useful interfaces. Figure 4 shows the Sortable Article List, where you can sort by date or title of all ConfChem and Newsletters, or just ConfChems or Newsletters, or a particular ConfChem or Newsletter. The right block allows you to use the folksonomy to extract papers (fig. 5).
Fig. 5: Screenshot showing where papers tagged by highlighted terms in the right block are extracted from the database.
In the right frame of figure 5 we have two ways of extracting content from the folksonomy. The top block is an alphabetical scroll bar where you can choose one or more tags and extract titles, authors and dates of papers that have been tagged with the highlighted word. You have several options, like “is one of”, or “is all of,” or even “is none of”. The second block is a hyper-linked list of the top 12 tags, with the number of papers tagged by each tag. You can expand this list to see all tags.
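The three filter options and the top-tags list can be expressed with ordinary set operations. The sketch below is an illustration of the logic and not the code behind the development site; the function names, paper titles, and tag assignments are hypothetical.

```python
from collections import Counter

def filter_papers(papers, chosen, mode):
    """Return sorted titles whose tags satisfy the chosen filter option."""
    chosen = set(chosen)
    tests = {
        "is one of": lambda tags: bool(tags & chosen),      # at least one chosen tag
        "is all of": lambda tags: chosen <= tags,           # every chosen tag present
        "is none of": lambda tags: not (tags & chosen),     # no chosen tag present
    }
    keep = tests[mode]
    return sorted(title for title, tags in papers.items() if keep(tags))

def top_tags(papers, n=12):
    """Return up to n (tag, number of papers) pairs, most used first."""
    counts = Counter(tag for tags in papers.values() for tag in tags)
    return counts.most_common(n)

# Hypothetical papers and tags for illustration.
papers = {
    "Paper 1": {"safety", "undergraduate"},
    "Paper 2": {"visualization"},
    "Paper 3": {"undergraduate", "visualization"},
    "Paper 4": {"assessment"},
}
print(filter_papers(papers, ["safety", "visualization"], "is one of"))
print(filter_papers(papers, ["undergraduate", "visualization"], "is all of"))
print(filter_papers(papers, ["safety", "visualization"], "is none of"))
print(top_tags(papers))
```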
Our objective is for the chemical education community to be able to extract from this corpus of literature content defined by this community’s tagging lexicon, thereby relating prior and current work in new ways. We intend to go public and port this to divched.org in the Spring of 2014. Please contact Bob Belford, email@example.com, if you would like to assist in the test phase, and we will set you up with an account at the development site, confchem.ccce.us.
This work has been graciously supported by an ACS Divisional Innovative Projects Grant.
1. http://terpconnect.umd.edu/~toh/ChemConference/ (last accessed Oct. 21, 2013)
2. http://terpconnect.umd.edu/~toh/ChemConference/ParticipantInstructions.txt (last accessed Oct. 25, 2013)
3. Bachrach, S. M., Electronic Conferencing on the Internet: The First Electronic Computational Chemistry Conference. J. Chem. Inf. Comput. Sci., 1995, 35, 431-441.
4. Sinclair, J., Cardew-Hall, M., The folksonomy tag cloud: when is it useful? J. Info. Sci., 2008, 34 (1), 15-29.
Lab3D is an online resource of animated organic chemistry reactions. The resource is targeted at undergraduate students in their first or second year of organic chemistry. The reactions featured, nucleophilic substitutions and eliminations, are taught in the first-year curriculum. Following ‘best use’ guidelines for instructional animations (Burke, Greenbowe, and Windschitl 1998), the animations in Lab3D are short, accurate, interactive, and accessible outside of the classroom on the web. In addition, Lab3D is unique in displaying synchronized 2-D and 3-D animations simultaneously. The split screen video display is intended to help students intuitively connect the sub-micro and symbolic levels of molecular representation and construct more comprehensive and dynamic mental models of chemical reactions.
Lab3D was created toward the requirements of an MSc in Biomedical Communications (BMC.erin.utoronto.ca, University of Toronto).
Visualizations are symbolic constructions used to codify information in order to make it meaningful to learners (Kleinman, Griffin, and Kerner 2005). Visualizations (graphs, tables, illustrations, animations) are valuable in educational settings because they “help make complex information accessible and cognitively tractable”, and “help us think in visual rather than abstract, symbolic terms” (Uttal and Doherty 2008). In chemistry education, where the principal actors cannot be seen by eye, visualizations are of even greater significance.
In the history of molecular representation (Perkins 2006b, 2006a), numerous visualizations have been created to represent the different properties of a molecule. These can be generally divided into 2-D symbolic and 3-D particulate or sub-micro “representational levels” (Gilbert 2008). 2-D chemical structures (wedge-dash, Haworth, Fischer and Newman projections, etc.) are generally preferred for coding atom connectivity and stereochemistry, while 3-D representations (ball-and-stick, CPK, electron isodensity surface, etc.) are preferred for coding atom spatial arrangement, size and other molecular properties such as electrostatic potential, hydrophobicity and so on. The ability to switch rapidly between the representational levels and relate the complementary information that they offer can provide a deeper understanding of chemical reactivity.
Linking 2-D and 3-D representational levels is a challenge for undergraduate chemistry students (Gilbert 2008). That these visualizations are significant abstractions from reality (colour and value are meaningless at the molecular level) and cannot be directly compared to their referent makes it difficult to develop 'representational insight' (Uttal and Doherty 2008), the understanding that the representation stands for something else, and that different representations can stand for the same thing (Fig. 1). In addition, a molecular visualization can demand a knowledge of topics in chemistry that the student might not yet have mastered, and consequently require greater cognitive processing. To be adept consumers of chemistry visualizations and take full advantage of their benefits, students would benefit from a “visual education” in what the different levels show and how they are related (Gilbert 2008).
Fig. 1. Various representations of glucose.
Ainsworth (2008) suggests that a visual education can be provided by showing multiple representations simultaneously. Firstly, a student can draw on the representation with which they are more familiar to inform their understanding of a newer or more complex representation: the more-familiar representation constrains inappropriate interpretations of the less-familiar. Secondly, cross-comparison of the representations highlights their shared, invariant features. This would enable students to translate more easily from one representation to another. At the same time, cross-comparison also clarifies which features are unique to each representation and hints at the type of information that can be gleaned from it. Multi-representational displays may also serve as a tacit reminder that these are merely different ways of ‘dressing’ the molecular data and help to build representational insight.
Web applications are particularly well suited as a medium for chemistry visual education since rich media, such as 2-D and 3-D displays and interactivity, can be integrated. Interactive features allow students to interrogate information at their own pace, which can reduce the burden on cognitive processing (Lowe 2004). Interactive movies can show reaction dynamics explicitly, further reducing the burden on the student, since he or she would otherwise need to perform the transformations mentally (Lowe 2004). Digital media may also increase visuo-spatial reasoning. The use of molecular software contributed to significantly improved student performance in stereochemical tests over the textbook and hand-held model groups (Abraham, Varghese, and Tang 2010). The researchers proposed that the visuals generated by the software reinforced the natural method by which we form and manipulate mental models of molecules: first, by forming a mental image of the molecule, then performing a mental transformation (Abraham, Varghese, and Tang 2010). A stationary image is also initially presented in Lab3D, which the student then has the opportunity to freely manipulate.
The objective of Lab3D (Fig. 2) is to provide a learning resource for undergraduate students that facilitates comparison of the two representational modalities of chemistry visuals by showing side-by-side, synchronized 2-D and 3-D interactive movies. For each reaction, there are two organic reaction viewer windows: the 3-D animation viewer, where the 3-D scene can be rotated, translated and scaled as the animation plays, and the 2-D animation viewer, where a symbolic animation is shown. In addition, the type of 3-D representation (ball-and-stick, etc.) can be toggled using buttons to the right of the 3-D viewer. An overview of the Lab3D user interface is shown in Fig. 2.
Fig. 2. Overview of the Lab3D user interface. A) 3-D Animation viewer. As the animation plays, the scene can be rotated, translated, and scaled. B) The overall reaction equation is shown by default, but can also be hidden from view. C) 2-D Animation viewer. D) The animations are controlled through the media controls and slider. Play, pause, advance or rewind by a frame, move to the start or end, and scrub through the animation at your own speed. The slider also functions as a reaction coordinate, showing the location of transition state(s). E) From a list of general reaction categories, select a set of conditions. F) Toggle stick, ball-and-stick, and CPK representations. G) Toggle atom labels. H) Play a movie of the reaction with a surface representation. I) For smaller screens, click and drag to adjust the height of the 3-D viewer and bring the 2-D viewer “above the fold” of the browser window. J) Curated, contextual links are provided for additional information about each reaction.
The website was created in a multi-stage process:
1) 3-D data collection. Reactions were modelled in Spartan Student ‘10 (Wavefunction Inc.) using a coordinate driven approach at the B3LYP/6-31G* level of theory.
2) 3-D data ‘work-up’. Energy vs. constraint (internuclear) length was plotted and used to identify the transition state. Bonding information was updated to match and the resulting sequence of structures exported as a MDL SD File. Isodensity surfaces showing electrostatic potential maps were calculated and a movie was generated.
i) 3-D molecular viewer. The 3-D viewer uses native Web technologies to load and display molecular graphics data. ChemDoodle Web Components (iChemLabs) was implemented to parse the molecular data (a MDL SD File is retrieved from the server) and generate 3-D WebGL representations.
ii) Reaction equation. Reaction equations were drawn in the ChemDoodle Web Sketcher and exported for further editing. Additional features were generated using a custom library (Lab3D.js). Finally, the ChemDoodle ViewerCanvas class was used to display the reaction equation in the browser.
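Step 2 above locates the transition state from a plot of energy against the constrained internuclear distance. As a rough illustration only, the same idea can be sketched as picking the highest-energy structure along the scan. This is an assumption about how the plot was read, not the authors' actual workflow, and the distances and energies below are invented for the example.

```python
def locate_transition_state(lengths, energies):
    """Return (index, length, energy) of the highest-energy scan point.

    In a coordinate-driven scan, the energy maximum along the constrained
    coordinate is a common first estimate of the transition state.
    """
    if not lengths or len(lengths) != len(energies):
        raise ValueError("lengths and energies must be non-empty and equal in size")
    index = max(range(len(energies)), key=energies.__getitem__)
    return index, lengths[index], energies[index]


# Hypothetical scan: constraint length in angstroms, relative energy in kcal/mol.
lengths = [3.0, 2.8, 2.6, 2.4, 2.2, 2.0, 1.8]
energies = [0.0, 4.1, 11.7, 19.3, 15.2, 3.4, -8.9]

ts_index, ts_length, ts_energy = locate_transition_state(lengths, energies)
barrier = ts_energy - energies[0]  # estimated barrier relative to the first frame
```

If the maximum falls on the first or last point of the scan, the transition state has probably not been bracketed and the scan range would need to be extended. The frame index found this way is also the natural position for a transition-state marker on an animation slider.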
Following expansion of the reactions library and further testing and debugging, we hope to find partners in chemistry education who would be interested in incorporating the tool in the classroom and providing feedback on its use, on changes in misconceptions (whether it resolves previous misconceptions or creates new ones), and on achievement.
I would like eventually to see 3-D interactive chemistry animations incorporated within digital textbooks. While visuals have so far been constrained by the limitations of the print medium, "rich media" such as audio, video and 3-D objects can be incorporated within e-textbooks. For example, some biochemistry titles available on the Inkling platform showcase 3-D rotatable molecules. The movement of educational publishers towards digital publishing – despite perhaps slow adoption by students and teachers (Greenfield 2013) – creates opportunities to revisit and potentially elevate the quality of visuals in chemistry textbooks.
Abraham, M., V. Varghese, and H. Tang. 2010. "Using Molecular Representations To Aid Student Understanding of Stereochemical Concepts." Journal of Chemical Education no. 87 (12):1425-1429.
Ainsworth, S. 2008. "The educational value of multiple-representations when learning complex scientific concepts." Visualization: Theory and practice in science education:191-208.
Burke, K. A., T. J. Greenbowe, and M. A. Windschitl. 1998. "Developing and using conceptual computer animations for chemistry instruction." Journal of Chemical Education no. 75 (12):1658-1661.
Gilbert, J K. 2008. "Visualization: An emergent field of practice and enquiry in science education." Visualization: Theory and practice in science education:3-24.
Greenfield, Jeremy. Students, Professors Still Not Yet Ready for Digital Textbooks 2013. Available from http://www.digitalbookworld.com/2013/students-professors-still-not-yet-r....
Kleinman, R., H. Griffin, and N. K. Kerner. 2005. "Images in chemistry."1-5 and references therein.
Lowe, Richard. 2004. "Interrogation of a dynamic visualization during learning." Learning and Instruction no. 14 (3):257-274.
Perkins, James A. A History of Molecular Representation Part 2: The 1960s - Present, Feb 17 2006a.
———. A History of Molecular Representation Part One, Feb 17 2006b.
Uttal, D. H., and K. O. Doherty. 2008. "Comprehending and Learning from ‘Visualizations’: A Developmental Perspective." Visualization: Theory and practice in science education:53-72.
The educational benefits of students performing simulated chemistry laboratory experiments in the 3D, immersive, virtual world of Second Life (SL) are being investigated at Texas A&M University by students enrolled in General Chemistry II Laboratory, with funding provided by a 3 year NSF grant. This fall, 90 students have completed two weeks of lab activities in Second Life while 400 other students complete the same experiments in a real laboratory. In Spring 2014, 100 students will perform SL experiments and over 2000 students will participate in the control group. This will be repeated for the 2014-2015 school year.
This project will answer the following research questions:
• How does the laboratory environment (Second Life or the real world) affect students’ ability to achieve the learning goals of the laboratory experiment, including content knowledge and kinesthetic skills?
• How does the laboratory environment affect students’ attitudes towards learning chemistry in the laboratory and performing laboratory work?
Our assessment methods include surveys, focus groups, pre/post lab quizzes, lab reports and a practical exercise in which students assemble parts of a laboratory apparatus. Differences in student outcomes due to academic background or demographic characteristics will be analyzed.
This study is the first to evaluate students’ learning and attitudes in a Second Life chemistry laboratory. If we find that SL experiments lead to better student attitude and academic performance in the lab, the information would be most useful for (1) designing new on-line distance learning science lab experiments and (2) creating a viable alternative for schools which do not offer chemistry laboratory courses.
What Can Students Learn from Virtual Labs?
Virtual worlds offer chemical educators an interesting new platform for faculty and students to interact in order to augment or even replace existing classroom and laboratory sessions. Virtual worlds provide a visually rich, three dimensional environment in which users interact with each other and virtual objects. Each user controls an avatar, the user’s representation in the virtual world. By creating content for the world, educators can design new learning activities that would not be possible in the real world.
Second Life is the most widely used and well-known virtual world and is maintained by Linden Lab. Users communicate with each other through audio using a headset and microphone or through instant messages, and they interact with their surroundings by clicking on objects with the mouse pointer. Access to Second Life is free (with some age restrictions for young users). The success of Second Life is due to Linden Lab providing the platform for the virtual world but allowing users to create their own content (similar to YouTube maintaining the website but users uploading their own videos). Users can write programming code within Second Life to create objects and control their properties using Linden Scripting Language, a language similar to C++ and Java.
Second Life is not a game - there are no predetermined goals, scoring or inherent competition. Instead, Second Life is designed to promote socialization, communication and exploration among avatars. Second Life has its own economy based on the Linden dollar (L$), which users can exchange for real world currencies. Users purchase Linden dollars so that they can buy items for their avatar. Users can purchase land in Second Life (server space) in order to own and maintain their own section of the world. Land owners can modify land features and control access to their property. This is important to educators who might only want students to enter their part of Second Life.
Use of Second Life in education is growing, as is the research showing its effectiveness. Students can feel more comfortable attending class in Second Life compared to a real classroom.(1) Since many students attend class online, Second Life can provide a sense of “presence”, which is important in distance learning classes.(2-4) Many studies report that students respond positively to learning within virtual worlds when it is added to an existing course(2,5-8) and that it influences their grades as well.(6-8) Educators in information systems,(5) computer science,(6) biology,(7) medicine(9, 10) and chemistry(11-15) have all used Second Life for their courses. Two reviews explain how chemists use Second Life in the classroom.(11, 16)
Dr. Wendy Keeney-Kennicutt, project Co-PI, completed an extensive study of using Second Life to teach students about 3D molecular shapes and Valence Shell Electron Pair Repulsion (VSEPR) theory. She employed a quasi-experimental pre-/post-test control group research design in her two Texas A&M general chemistry lecture classes (a total of 480 students). The experimental group performed activities with 3D molecules in Second Life while the control group did the same activities using 2D images which were screen shots of SL images. Ultimately, she found that students working in the 3-D environment showed a subtle but significant increase in their ability to interpret routine 2D presentations of 3D chemical structures drawn with solid lines, dashed lines and wedges.(13, 14, 15, 17)
Although STEM educators use Second Life in a variety of ways, no virtual laboratory experiments are available. More importantly, it is not clear how well such lab experiments might compare to real world experiments in terms of students’ learning and attitudes.
A year ago, we received a 3 year NSF TUES grant entitled “Evaluating Students’ Learning and Attitudes in a Virtual Chemistry Laboratory.” The first year was spent developing assessment tools and two laboratories that were as identical as possible to two laboratories that were part of the curriculum in second semester general chemistry laboratory at Texas A&M University; we used two professional SL programmers in the development process. Our goals were to measure students’ attitudes towards the real world and virtual experiments, their ability to achieve the learning goals of both types of experiments and the students’ development of kinesthetic skills during the experiments.
Here is a link to a 20 minute video tour of the facilities:
Figures 1-3 show the Second Life environment.
Figure 1. The virtual laboratory building on Chemistry World Island
Figure 2. The virtual lab room.
Figure 3. Super-sized equipment for the first SL experiment behind Dr. Keeney-Kennicutt’s avatar
The lab experiments were designed to mimic the actual lab experiments as closely as possible. In Second Life, students assemble equipment and perform the experiment by clicking on chemicals and pieces of laboratory equipment, and use menus to select other options. They wear headsets with microphones to communicate with their lab partner and TA. Students record their own data and their results depend on their actions, just like in a real chemistry laboratory experience. A student’s mistake in performing the procedure or inattention to details affects the experimental results. Although the mathematical equations that are a part of the experiment’s programming code provide perfectly precise results, the code also introduces a small degree of randomness into the results so that the data “looks real.” Just as in a real classroom, no two sets of data from the SL experiments are exactly the same. Students still have to read volumes in graduated cylinders, graduated pipets and burets. The first SL experiment, Experiment 2: Molar Mass Determination, involved collecting gas over water and using the ideal gas law to determine the molar mass of the gas in a butane lighter. The second SL experiment, Experiment 3: Precipitation Titrations, involved 7 argentometric titrations to determine the salinity of 2 San Antonio Bay water samples taken at one location in the bay at two different times. Each pair of students had samples from a different part of the bay.
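The calculation behind the molar mass experiment, together with the idea of adding a little randomness to otherwise exact results, can be sketched as follows. This is an illustration in Python under stated assumptions, not the Linden Scripting Language code used in the virtual lab: the function names, the 1% noise level, and all sample measurements are hypothetical.

```python
import random

R_L_TORR = 62.364  # gas constant in L·torr/(mol·K)


def molar_mass_over_water(mass_g, volume_l, temp_k, p_atm_torr, p_water_torr):
    """Molar mass (g/mol) of a gas collected over water, from PV = nRT.

    The partial pressure of the dry gas is the atmospheric pressure minus
    the vapor pressure of water at the collection temperature.
    """
    p_gas = p_atm_torr - p_water_torr
    if p_gas <= 0 or volume_l <= 0 or temp_k <= 0:
        raise ValueError("pressure, volume and temperature must be positive")
    moles = p_gas * volume_l / (R_L_TORR * temp_k)
    return mass_g / moles


def add_noise(value, rel_sd, rng):
    """Perturb an exact value with Gaussian noise of relative size rel_sd.

    This mimics the 'small degree of randomness' idea; the real noise
    model used in the SL experiment is not described in the text.
    """
    return value * (1.0 + rng.gauss(0.0, rel_sd))


# Hypothetical measurements: 0.230 g of gas released, 98.0 mL collected
# at 22 °C and 760 torr (vapor pressure of water at 22 °C is about 19.8 torr).
exact = molar_mass_over_water(0.230, 0.0980, 295.15, 760.0, 19.8)

rng = random.Random(0)  # seeded so the example is reproducible
noisy_volume = add_noise(0.0980, 0.01, rng)
noisy = molar_mass_over_water(0.230, noisy_volume, 295.15, 760.0, 19.8)
```

With these sample numbers the exact result comes out at roughly 58 g/mol, close to the 58.12 g/mol of butane, while the perturbed volume gives each student pair a slightly different answer.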
Fall 2013 Pilot Study
This fall was our pilot study. Four experienced teaching assistants were chosen who would teach one section in SL and the other as normal. Their schedules determined which sections were chosen as the experimental group. On the first day of lab during week 1, students signed their IRB consent forms and took an on-line survey. Four sections (69 students) were in the experimental group and 19 sections (371 students) were in the control group. Here were the student demographics:
Major:
27% engineering, 18% education, 16% agriculture, 14% science, 11% biomedical science plus 14% in geoscience, liberal arts, business and general studies.
Classification:
50% sophomore, 21% freshman, 18% junior and 11% senior
Semesters at TAMU:
30% less than 1 semester, 27% 2 semesters, 20% 3 semesters
Age:
98% 18-24 with 2% under 18 and 1% 25-34
Gender:
56% female and 44% male
Ethnicity:
66% white (non-Hispanic), 15% Hispanic, 10% Asian, 5% mixed and 3% black.
Among other facts, we found that 99% of students had access to computers or laptops, with 94% using them daily, and 34% owned or had access to a tablet. The top 5 uses for computers were email, doing homework, using social media, watching videos and doing research. However, 87% had little to no experience with online virtual worlds. Training is critical for a student’s success in a virtual world. During week 1, the experimental group was introduced to SL on the laboratory computers. Each student created their avatar and was able to find their way within the program to the area where they would be professionally trained. Our trainer is located out-of-state. She and the co-PI met with 12 groups of students the following week within Second Life for a 30 – 40 minute training period to give proper lab attire and lab goggles to their avatars, teach them to read a buret and graduated cylinder, give them access to the study area on the Chemistry World island, and help them find the classroom. There is a learning curve to Second Life. Students need to feel comfortable controlling their avatars in SL so that they can concentrate on the experiment rather than the software. As a note, the first lab was short, so students who missed training were able to get trained as well as complete the lab.
During week 2, all students did the first lab as normal. For weeks 3 and 4, the experimental group met at their normal time at a nearby computer lab and the control group met in their regular lab room to do the second and the third labs. See Figures 4, 5, 6, and 7. At the start of each period, all students would take a 5 question multiple choice quiz on the procedure. Then the TA either in SL or the normal lab would show a PowerPoint presentation explaining the lab and the lab would begin. As students finished, they would take the identical quiz again. During week 5, a practicum on the week 3 lab procedure was given to the 8 sections taught by the experienced TAs: 4 SL sections and 4 control sections. During week 5, the 4 TAs took an on-line survey and in week 6, they participated in a focus group. In week 8, the control group took an on-line attitudinal survey during lab and the experimental group took a lengthier similar survey in lab. At the end of the week, volunteers from the experimental group took part in a focus group. A subset of both the experimental and control groups took an additional on-line survey to help the assessment team better understand and interpret the attitudinal survey results from Week 8. The assessment team also received all student lab report grades for the two labs under study.
Figure 4a. Prelab lecture in the SL classroom.
Figure 4b. Prelab lecture in the computer lab.
Figure 5a. TA giving prelab lecture
Figure 5b. Students listening to prelab lecture.
Figure 6a. Students in SL lab preparing to begin their experiments.
Figure 6b. The actual lab at TAMU
Figure 7a,b. Students completing Exp. 2 in SL.
Here are links to videos demonstrating the Second Life activities in the computer lab.
Preliminary Results and Discussion
Here are some of our findings for the Fall 2013 Pilot Study:
No significant differences were seen between the SL group and the control group in
We did see some intriguing differences between the two groups:
The data is still being analyzed for this pilot study. We still have the TA survey, the TA focus group, the student focus group and the students’ written comments to examine. We hope to glean more interesting findings as we move forward with the study for 3 more semesters.
1. Lamoureux, E. “Teaching Field Research in a Virtual World” In R. Smith (Ed.), 2007 NMC summer conference proceedings, Austin, Texas: The New Media Consortium, 2007, pp. 105-110.
2. Sanchez, Joe “Pedagogical applications of Second Life” Libr. Technol. Rep. 2009, 45(2), 21-28.
3. Feldon, D. F.; Kafai, Y. B. “Mixed Methods for Mixed Reality: Understanding Users’ Avatar Activities in Virtual Worlds” Educ. Technol. Res. Dev. 2008, 56(5-6), 575-593.
4. Edirisingha, P.; Nie, M.; Pluciennik, M.; Young, R. “Socialization of Learning at a Distance in a 3-D Multi-User Virtual Environment” Brit. J. Educ. Technol. 2009, 40(3), 458-479.
5. Dreher, Carl; Reiners, Torsten; Dreher, Naomi “Virtual Worlds as a Context Suited for Information Systems Education: Discussion of Pedagogical Experience and Curriculum Design with Reference to Second Life” J. Info. Sys. Educ. 2009, 20(2), 211-224.
6. Wang, Yuanqiong; Braman, James “Extending the Classroom through Second Life” J. Inform. Sys. Educ. 2009, 20(2), 235-247.
7. Cobb, Stephanie; Heaney, Rose; Corcoran, Olivia; Henderson-Begg, Stephanie “The Learning Gains and Student Perceptions of a Second Life Virtual Lab” Biosci. Educ. 2009, 13, 1-9.
8. Hew, Khe Foon; Cheung, Wing Sum “Use of Three-Dimensional (3-D) Immersive Virtual Worlds in K-12 and Higher Education Settings: A Review of the Research” Brit. J. Educ. Technol. 2010, 41(1), 33-55.
9. Salmon, Gilly; Nie, Ming; Edirisingha, Palitha “Developing a Five-Stage Model of Learning in Second Life” Educ. Res. 2010, 52(2), 169-182.
10. Delwiche, Aaron “Massively Multiplayer Online Games (MMOs) in the New Media Classroom” Educ. Technol. Soc. 2006, 9(3), 160-172.
11. Bradley, Jean-Claude; Lang, Andrew S.I.D. “Chemistry in Second Life” Chem. Central J. 2009, 3, 1-20.
12. Lang, Andrew S.I.D. and Kobilnyk, D. C. “Visualizing Atomic Orbitals Using Second Life” J. Virtual Worlds Res. 2009, 2(1), 4-8.
13. Merchant, Z., Goetz, E.T., Keeney-Kennicutt, W., Kwok, O., Cifuentes, L., Davis, T.J., The Learner Characteristics, Features of Desktop 3D Virtual Reality Environments, and College Chemistry Instruction: A Structural Equation Modeling Analysis, Computers & Education (2012) doi: 10.1016/j.compedu.2012.02.004
14. Merchant, Z., Goetz, E.T., Keeney-Kennicutt, W., Kwok, O., Cifuentes, L., Davis, T.J., (2013) Exploring 3-D Virtual Reality Technology for Spatial ability and Chemistry Achievement, Journal of Computer Assisted Learning. 12 JUN 2013, DOI: 10.1111/jcal.12018
15. Keeney-Kennicutt, W.L. & Merchant, Z. “Virtual Worlds and Their Uses in Chemical Education” in Pedagogic Roles of Animations and Simulations in Chemistry Courses ACS Symposium Series 1142, Jerry Suits and Kimberly Pacheco (Eds). 2013; New York: Oxford University Press, pp 181-204.
16. Winkelmann, K. “Virtual Worlds and Their Uses in Chemical Education” in Pedagogic Roles of Animations and Simulations in Chemistry Courses ACS Symposium Series 1142, Jerry Suits and Kimberly Pacheco (Eds). 2013; New York: Oxford University Press, pp 161-179.
17. 12th Man Island, location of Dr. Keeney-Kennicutt’s VSEPR project in Second Life, http://maps.secondlife.com/secondlife/12th%20Man/221/235/26, accessed Nov. 12, 2013.
18. Bauer, Christopher F. “Beyond ‘Student Attitudes’: Chemistry Self-Concept Inventory for Assessment of the Affective Component of Student Learning” J. Chem. Educ. 2005, 82(12), 1864-1870.
19. Chatterjee, Suparna; Williamson, Vickie M.; McCann, Kathleen; Peck, Larry M. “Surveying Students’ Attitudes and Perceptions toward Guided-Inquiry and Open-Inquiry Laboratories” J. Chem. Educ. 2009, 86(12), 1427-1432.
During my summer course, I began to integrate ChemDraw for iPad into my undergraduate organic chemistry lecture. There was an obvious increase in classroom participation and engagement with the material as a result. I will show the types of problems students worked on in class and model how Flick-to-Share works to exchange information. Successes and difficulties in integrating ChemDraw for iPad into the course will be discussed as well as how some difficulties have been addressed and future development needs.
Using ChemDraw for iPad and Flick-to-Share to Increase Engagement in Organic Chemistry
My organic course, teaching from iPad
I began teaching from an iPad during Fall semester 2012 for my organic chemistry 1 course. I write on the iPad much as I would on a whiteboard, using AirServer to connect to the computer in the room and project onto a screen so the class can follow along. I use Camtasia Relay to record the screen and a room microphone to record what I am saying as well as any class discussion. These lectures are then posted on Blackboard for students to review while studying.
Drawing and Flick
ChemDraw for iPad can be used to draw nearly all chemical interactions I use in organic chemistry, from reactions and mechanisms, to stereochemistry via wedges, sawhorses, Newman projections, Fischer projections and Haworth projections as well as molecular orbitals. Flick-to-Share can be used to send any of these drawings in real time to other users that have been added as contacts. It can also be used to share with a group of contacts at once, where one flick can send the material to a number of users, which is ideal for in-class use.
During spring of 2013, I was contacted by McGraw-Hill Education about pilot testing the new ChemDraw for iPad that would be released by PerkinElmer during the summer of 2013. I was immediately interested in finding ways to help our students become more familiar with ChemDraw as I felt it was being under-used on our campus. PerkinElmer lent 25 iPads for my summer organic chemistry students to be able to take with them for the entire course. This allowed them to work on ChemDraw for iPad outside of class time and become more familiar with using the app.
As I prepared to use ChemDraw in class, I considered what it could be used for. In addition to the students becoming more familiar with ChemDraw, I immediately saw the value in the embedded Flick-to-Share functionality. This allows people to easily exchange their drawings with others by simply flicking the drawing to another user’s ID. During most class periods, I used this to flick a problem out to the students and have them flick their answers back to me. Since I began teaching from the iPad, I have been walking around the room while I am teaching, and I noticed that when I gave students a problem to work on during class, one-third to one-half of the class did not attempt the problem, but simply waited until I later solved it to write anything down. To me, this defeated the pedagogical purpose of giving in-class problems, which is to help students determine if they understand how to apply the concepts we are discussing. If students attempt to solve a problem and get it right, it shows they have a good understanding, while if they try and get it wrong, it shows they need to spend more time studying the material outside of class. If they wait for me to solve the problem, they have missed this opportunity to self-assess. By using ChemDraw and Flick-to-Share, I was able to get all the students in class to engage with the problem-solving by giving daily points for sending me answers via Flick. I felt it was important not to penalize students for giving wrong answers, as I believed that engaging with the problems would encourage self-assessment; therefore, I gave students points whether their answers were correct or not.
Fig. 1 An example of an in-class problem. The black drawings were flicked to the class, the red is an example student response.
There are many types of problems that I have been able to use ChemDraw for iPad to have students solve during class including:
One unexpected outcome I found while walking around the room as students worked on their problems was that I could quickly identify mistakes being made in mechanisms and structures. I attribute this to the more standardized nature of drawing with ChemDraw than drawing by hand. This has led to being able to correct these mistakes as they are happening during class.
Fig. 2 A common error drawing E2 reactions without having the base remove the proton alpha to the leaving halogen: ChemDraw for iPad allowed me to quickly show why this mechanism pathway is not possible.
I also experimented with using ChemDraw on two exams during the summer. I flicked partial reactions to the students and asked them to answer the problems and flick the responses to me. I required them to complete this part of the exams before handing out the paper portion as ChemDraw could be used to answer some of the other exam questions (such as configuration of stereocenters, as stereochemical labeling can be turned on or off on each user’s ChemDraw app). This worked well for the early exams before the students were told that it was possible for them to flick to each other (at the time, they only knew they could flick to me as the instructor). I did not have the students use the app on the final exam for this reason.
Fig. 3 Average student responses to questions relating to their use of ChemDraw for iPad during summer courses at UIS and SLU. The scores were on a 7 point Likert scale with 1=low and 7=high, SLU n=21, UIS n=7.
PerkinElmer sent a user interface and usability expert, Jennifer McCormick from User Experience, to run a focus group with the students from my summer class. She used a combination of survey and interviews to assess student attitudes toward the use of ChemDraw for iPad. McGraw-Hill Education sent Patrick Diller to do the same, but to assess the educational aspects of using the app in class. The feedback from these two sessions was very important in increasing the usability of ChemDraw for iPad in the classroom. The low score for ease of sharing structures at UIS had two easily identifiable reasons. First, group flicks were not yet possible, which meant that I had to flick everything individually to each student in class. This functionality was added to Flick-to-Share in a subsequent update. Second, the low sharing score helped us to identify that the students wanted to use the app outside of class while studying with each other and wished they could flick drawings to one another. After learning this, I immediately told the students that they could flick to each other and showed them how to do so. In hindsight, I feel that students’ use of their iPads while studying is much more valuable than using them on exams (which is the reason I had not told them they could flick to one another). The comments relating to ease of drawing structures showed that many students (as well as the instructors, though we were not included in the survey data) desired a text tool that would allow for labeling of items on the drawings. This was also subsequently added to ChemDraw for iPad, along with chair, Newman and Fischer structure templates, all of which have expanded the usability of the app. Many of the comments relating to overall satisfaction or perceived usefulness concerned the awkwardness of switching between writing notes on paper and using the iPad for problem solving.
The main element that would improve the classroom usefulness of ChemDraw for iPad would be a way to integrate with some type of class response program. This would allow for immediate analysis of responses which would aid instructors in identifying and correcting misconceptions during class.
ChemDraw for iPad has been used to increase engagement during organic chemistry lectures by utilizing Flick-to-Share to send out problems and receive answers from the students. There have been several improvements in the usability of ChemDraw for iPad since the summer pilot that make it more versatile and have fixed some of the main issues students experienced. I am continuing to use it with a larger class and will again survey this class to determine how the improvements will affect student attitudes and engagement.
This work has been supported by PerkinElmer, with special recognition to Hans Keil and Phil McHale for their oversight of the pilot studies. Thanks also go to Patrick Diller at McGraw-Hill Education for insight into pedagogical developments relating ChemDraw for iPad to teaching and for survey data; to Jennifer McCormick from User Experience for survey creation and implementation; and to Dr. Michael Lewis of Saint Louis University, who also participated in the pilot study with his 2013 summer organic chemistry course.
Online learning environments have shown that students work at all hours. To provide assistance to students at all hours, these systems frequently offer hints, feedback, solutions, videos, eBooks, and similar problems. This article discusses a new, multi-level assistance framework (Deep Help) based on principles from knowledge space theory, the zone of proximal development, and cognitive load theory.
Homework usage stats:
Unlike analog homework, students can access online learning environments at any hour of the day and analytics are available to track student usage patterns. As shown in Figure 1, student usage peaks in the evening, when instructors tend to be unavailable.
Figure 1. Relative usage of an online learning system over the course of a day
Online learning environments often incorporate various tools to provide real-time assistance to the student, including hints, feedback, solutions, videos, eBooks, and access to example problems. Students in introductory-level courses frequently come from diverse educational backgrounds and consequently require different types and levels of assistance. This article describes the new Deep Help system, which provides on-demand, multi-level assistance to learners when and where they run into difficulty. The system was designed to serve each student well by applying principles from the zone of proximal development, knowledge space theory, and cognitive load theory. Applied together, these theories provide the foundation for the innovative design of the Deep Help system.
Zone of Proximal Development
Vygotsky’s Zone of Proximal Development concept describes the range of tasks that a learner cannot perform independently but can perform with assistance. The zone of proximal development, represented by the blue center area in Figure 2, is the gap between the tasks a learner can do without help and the tasks the learner cannot do even with assistance.
Figure 2. Graphical representation of the zone of proximal development.
The role of a teacher is to provide guidance and assistance so that the learner can accomplish tasks in the center section. Teachers can use online learning systems as an extension of their role to provide additional assistance when they are not available. Organizing which tasks fall into each of these segments is further described by the next framework.
Knowledge Space Theory
Knowledge space theory considers the dependent relationship between subsets of knowledge. For instance, the concept of the balanced equation for a reaction is a necessary prerequisite for the concept of stoichiometry. This relationship is also why it is rare to assess student understanding of stoichiometry with a problem involving a 1:1 ratio, since that could be solved without meeting the prerequisites.
Figure 3a. Example of a portion of a knowledge space.
For instance, in Figure 3a, an understanding of B requires a prerequisite understanding of A. Similarly, D depends on B and C, indicating that to understand D, a learner must already comprehend A, B, and C. Thus a knowledge state of ACE, FACE, or CAB would be possible, but FEB or DEC would not.
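The rule for which knowledge states are possible can be sketched in a few lines of Python. This is only an illustration: the prerequisite edges are the ones stated in the text (B needs A; D needs B and C), while treating E and F as having no prerequisites is an assumption, since the full diagram is not reproduced here. The names PREREQS and is_valid_state are hypothetical.

```python
# Prerequisites stated in the text: B needs A; D needs B and C.
# ASSUMPTION: E and F have no prerequisites in this fragment of the diagram.
PREREQS = {
    "A": set(),
    "B": {"A"},
    "C": set(),
    "D": {"B", "C"},
    "E": set(),
    "F": set(),
}


def is_valid_state(state):
    """Return True if every item in the state has all of its prerequisites in the state."""
    known = set(state)
    return all(PREREQS[item] <= known for item in known)


for word in ["ACE", "FACE", "CAB", "FEB", "DEC"]:
    print(word, is_valid_state(word))
```

Checking only the direct prerequisites is enough, because each prerequisite that appears in the state is itself checked in turn. Under these assumptions ACE, FACE, and CAB pass, FEB fails because B appears without A, and DEC fails because D appears without B.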
Applying the Zone of Proximal Development to a knowledge space diagram requires that the three regions be aligned in ways allowed by the dependencies in the diagram. A hypothetical example of this is illustrated in Figure 3b, with a learner able to accomplish A, B, C, E, and F without assistance, and D and G with assistance, but unable to perform H. The dependencies make clear that if, for instance, F required assistance, then G would require assistance as well.
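The alignment requirement can be expressed as a simple check: no item may sit in an easier region than any of its prerequisites. The Python sketch below is illustrative only. The edge from F to G follows from the text, but the prerequisites assumed for H (D and G) are a guess made for the sake of the example, and all names are hypothetical.

```python
INDEPENDENT, ASSISTED, OUT_OF_REACH = 0, 1, 2

# Edges A->B, B->D, C->D, and F->G follow the text.
# ASSUMPTION: H depends on D and G; the text does not state H's prerequisites.
PREREQS = {
    "A": set(), "B": {"A"}, "C": set(), "D": {"B", "C"},
    "E": set(), "F": set(), "G": {"F"}, "H": {"D", "G"},
}


def inconsistent_items(levels):
    """List items placed in an easier region than one of their prerequisites."""
    return sorted(
        item for item, needs in PREREQS.items()
        if any(levels[p] > levels[item] for p in needs)
    )


# The learner described in the text: A, B, C, E, F unaided; D, G assisted; H out of reach.
learner = {"A": 0, "B": 0, "C": 0, "E": 0, "F": 0, "D": 1, "G": 1, "H": 2}
print(inconsistent_items(learner))  # []

# If F needs assistance but G is still marked as independent, G is flagged.
moved = dict(learner, F=ASSISTED, G=INDEPENDENT)
print(inconsistent_items(moved))  # ['G']
```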
Figure 3b. Application of the zone of proximal development to a knowledge space diagram
Cognitive Load Theory
The central idea of cognitive load theory is that a learner has a finite cognitive load capacity. This capacity is spread among the intrinsic, extraneous, and germane aspects of the activity being performed. Intrinsic cognitive load comes from the difficulty and complexity of the concept, extraneous cognitive load relates to the means through which a concept is presented, and germane load addresses the construction of schemas. The theory indicates that, for students to use the majority of their cognitive abilities for learning, it is important to avoid extraneous tasks and distractions.
Deep Help Framework
The Deep Help framework was designed to provide stepped tutorials for prerequisite information. Students can dive deeper into the provided extra support as needed, until they fully understand all elements required to perform the original problem. The instructor has full control over student access to tutorials and can configure Deep Help to always be available or available only after a specified number of answer submissions. While many interactive tutorials and other help tools are associated at the question level in a student’s assignment, the Deep Help system is associated with individual steps in a tutorial (Figure 4).
Figure 4. Help systems associations
Cognitive load theory suggests a “just in time” paradigm: making help easy to find while the student is learning and limiting the number of options so that less cognitive load is expended on choosing among them. Rather than having four tutorials to choose from, in the Deep Help system each step of the tutorial (highlighted in orange) has only one or two options for a learner to choose. The multi-step approach used in the tutorial helps students see which step they are having difficulty with and easily identify what Deep Help exists.
From the perspective of knowledge space theory and the zone of proximal development, we presume that instructors would assign questions that their students are able to do with assistance. However, we recognize that there are cases where this is not practical, such as before-class assignments or when students have missed class due to illness or other reasons. To use the example shown in Figure 3, the Deep Help system allows students to backtrack to activities they can accomplish with the assistance of the system (blue), which can expand their capabilities to tackle the original question. It is important to note that students would not usually access a large portion of the Deep Help available for a given question, but would dive as deep as needed in an area in which they are having trouble. For students who are completely lost, the tutorial offers a step-by-step breakdown of the problem.
A student who is working on a problem has easy access to the relevant reference materials, lowering cognitive demands, as well as a link to the tutorial, which leads to the Deep Help (Figure 5). A basic demo can be reviewed at: http://www.webassign.net/info/demo_assignment.html?deployment=2934
Figure 5. Student view of a question with part of the associated tutorials and Deep Help
By applying these learning theories in the design of the Deep Help system, instructors are able to offer additional support to students through an online instructional system that assists students in identifying where they are having trouble, provides multi-level assistance in those areas, and is always available.
Sapling Learning is an education company (saplinglearning.com) that provides online homework and instruction for the science disciplines. In addition to its learning platform, the company develops interactive labs that support the inquiry process. Each lab comes with suggested homework and clicker questions that probe student understanding of the concepts. This article will offer examples of integrating Sapling Learning's interactive labs with instruction to engage students inside and outside of the classroom. The article will also describe the principles that guide the design of the labs, with particular emphasis on design considerations for touch interfaces (1).
Sapling Learning’s mission is to engage students and empower educators. We aim to empower educators by providing a course management system that automatically monitors student progress. We aim to engage students by providing online homework that gives instant feedback on their understanding of course content. The screenshot below shows how a question appears to students. Our library includes over 10,000 such questions.
The goal of this question is for students to compare different representations of molecules. They can rotate the 3D models, and they have the option to view a hint. Each molecule contains 4 or 5 bonds. The labels include distractors, such as shapes with fewer than 4 or more than 5 bonds. The question is also randomized so that some students will see SiH4, for example, while others will see SiF4 or SiCl4. This encourages students to collaborate meaningfully on their homework.
One feature of online homework is that students receive immediate feedback on their understanding after they submit an answer. This feature has been associated with improved student performance (2). At Sapling Learning, we have two modes of feedback for incorrect answers: specific and general. For the specific feedback, our authors predict common student errors and provide targeted responses. The image below shows an example of the specific feedback for the question above.
We include general feedback for those errors we cannot anticipate. This ensures that all students get some form of assistance. As shown in the image below, the general feedback for the question above includes a table that contains the formula of each molecule along with 2D and 3D models. This gives students more guidance toward the correct shape. Did students actually label the 2D model of CF4 as square planar? Read on for the results.
Our products span the range from assessment to instruction. We recently launched a series of eBooks for high school science that include videos, 3D animations, and interactive labs. We used HTML5 to develop the content to enable use on multiple platforms. To date, we have created approximately 50 interactive labs for the subjects of physics, chemistry, and biology. Below is an example of one of the chemistry labs.
The goal of this lab is for students to examine the effect of concentration on the pH of a solution. Students can add a common liquid, such as coffee or juice, to the beaker and use the probe to measure the pH. They can add water or open the drain and observe the effect on the color of the liquid. The tick marks along the side of the beaker also enable students to quantify the effect of dilution. In what follows, we describe our design process for the labs (3).
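The relationship students can quantify with the tick marks may be illustrated with a short Python sketch. It is not the model used in the lab, which is not described here. It assumes an idealized acidic solution whose hydrogen ion concentration simply scales with dilution (no buffering, and water's own ionization ignored), so it is only reasonable well below pH 7. The function name and the numbers are hypothetical.

```python
import math


def ph_after_dilution(initial_ph, initial_volume_ml, added_water_ml):
    """Estimate the pH after adding water, assuming [H+] scales with dilution.

    ASSUMPTION: idealized acid, no buffering, water autoionization ignored.
    """
    h_initial = 10 ** (-initial_ph)
    h_diluted = h_initial * initial_volume_ml / (initial_volume_ml + added_water_ml)
    return -math.log10(h_diluted)


# A tenfold dilution (100 mL brought to 1000 mL) raises the pH by one unit.
print(round(ph_after_dilution(3.0, 100.0, 900.0), 2))  # 4.0
```

In this simplified picture, opening the drain removes solution without changing its concentration, so the pH stays the same, whereas adding water moves the pH toward neutral.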
The design of each interactive lab is guided by two types of learning goals: content and process. Our content goals are for students to develop a conceptual understanding of the science topic. Our process goals are to engage students in science by giving them an opportunity to ask their own questions and test their own hypotheses.
For the high school labs, our design goals also align with the Texas Essential Knowledge and Skills (TEKS), the state standards. Below is an excerpt from one of the science concepts (4).
TEKS Chemistry 10F: “The student is expected to investigate factors that influence solubilities and rates of dissolution such as temperature, agitation, and surface area.”
We developed two labs to address this standard: one for solubility, and one for dissolution rate. In both labs, students can dissolve fine or coarse salt or sugar in water, and they can use a hot plate to examine the effects of heating or stirring the solution. In the Solubility lab, students are given measuring spoons to compare the amounts required to reach saturation. In the Rate of Dissolution lab, students are given a timer. We performed the actual experiment ourselves to obtain relative dissolution times.
Our design goals are also informed by research: We consult the chemistry education literature for insight into student ideas. Our set of electrochemistry labs provides an example. In one of the labs, students can build a voltaic cell using half-cells and a salt bridge. We included an option to animate the flow of electrons from the anode to the cathode to confront the student ideas about current flow reported by Sanger and Greenbowe (5).
We apply the idea of implicit scaffolding (6) to create environments that enable students to ask their own questions. This allows us to guide students while giving them a sense of autonomy. A common example of this idea comes from door design: A door that people must push to open should not have a handle that people can pull. Below we use the Specific Heat lab to illustrate our use of affordances and constraints.
The goal of this lab is for students to plan a procedure to determine the specific heat of a metal. Students can drag the cup, the metal block, and the thermometer. Students can also select and identify a mystery metal. They can use the reset button to repeat an experiment.
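The calculation behind such a procedure can be sketched in Python as follows. The sketch assumes ideal calorimetry: all heat lost by the hot metal is gained by the water, with no loss to the cup or surroundings. The function name and the sample measurements are made up for illustration and are not taken from the lab.

```python
WATER_SPECIFIC_HEAT = 4.184  # J/(g*degC), standard value for liquid water


def specific_heat_of_metal(metal_mass_g, metal_temp_c,
                           water_mass_g, water_temp_c, final_temp_c):
    """Solve m_metal * c_metal * (T_metal - T_final) = m_water * c_water * (T_final - T_water).

    ASSUMPTION: no heat is lost to the cup or the surroundings.
    """
    heat_gained_by_water = water_mass_g * WATER_SPECIFIC_HEAT * (final_temp_c - water_temp_c)
    return heat_gained_by_water / (metal_mass_g * (metal_temp_c - final_temp_c))


# Hypothetical data: 50 g of metal at 100 degC dropped into 100 g of water at 20 degC,
# with the mixture settling at 23.5 degC.
c_metal = specific_heat_of_metal(50.0, 100.0, 100.0, 20.0, 23.5)
print(round(c_metal, 3))  # 0.383
```

A student could compare the result with tabulated specific heats to identify the mystery metal.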
This lab affords certain actions. The water dropper is poised above the cup to cue students to add water. Likewise, the metal block is poised above the burner to cue students to heat the metal. The balance and the thermometer cue students to measure mass and temperature. Even the dropdown menu cues students to compare the metals.
This lab also constrains certain actions. For example, students can only add water and heat once per experiment. Students are also not able to drag the metal block out of the cup. Part of the reason is to simplify the model, but the main reason is to encourage productive experimentation.
We use two types of user experience testing to ensure that an interface is intuitive: hallway and online. For hallway testing, we literally walk around the Sapling Learning office and ask coworkers to think aloud while using the lab. We occasionally use a screen-capture program to enable us to revisit the tests. After about three users, we begin to observe common interface issues and interpretation errors.
For online testing, we employ a service (UserTesting.com) that provides on-demand usability testing. We create the test and they recruit the testers. Within an hour, we can watch videos of people using the lab and read their responses to our follow-up questions. Here we use the Atom Builder lab to give an example of the feedback.
In this lab, students can build atoms with protons, neutrons, and electrons. They can examine the effect of each particle on the identity of the atom, the charge, and the mass number. The nucleons shake when students build an unstable nucleus. The electrons move so rapidly in the play area that they are hard to locate. Students can click outside the nucleus to remove an electron from the play area. Our representation of the atom was inspired by a Nature article on hollow atoms (7).
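The bookkeeping that relates the particles to the readouts can be summarized in a short Python sketch. It covers only identity, mass number, and charge; nuclear stability follows no simple rule and is not modeled. The element table is truncated to the first six elements, and the function name is hypothetical.

```python
# Truncated lookup of element names by atomic number, for illustration only.
ELEMENTS = {1: "Hydrogen", 2: "Helium", 3: "Lithium",
            4: "Beryllium", 5: "Boron", 6: "Carbon"}


def describe_atom(protons, neutrons, electrons):
    """Return the isotope label and net charge for a particle count."""
    name = ELEMENTS[protons]           # identity depends only on the protons
    mass_number = protons + neutrons   # the number shown after the name
    charge = protons - electrons       # net charge in units of e
    return f"{name}-{mass_number}", charge


print(describe_atom(6, 6, 6))  # ('Carbon-12', 0)
print(describe_atom(3, 4, 2))  # ('Lithium-7', 1)
```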
For the online user testing, we set up the scenario: “You are a student in an introductory science class. Your teacher has asked the class to use an online lab for homework.” The first task was to explore the lab for a few minutes. The next few tasks were more specific. In the video clip below, one of the testers is using the lab to answer: “What changes the number after the name?”
Click on the image to open a user testing video.
Note how she uses the lab to test her predictions. “Is it the number of electrons? Let’s try it.” The immediate feedback in the lab allows her to develop a rule for the mass number. “I think the number shows how many protons and neutrons are in the atom.” Then she uses the lab to demonstrate the rule. This example suggests that students can learn from the lab without explicit guidance.
We use the results of user testing to make changes to the lab. For example, in the hallway testing for the Atom Builder lab, we saw that some users thought the way to make the nucleus stable was to add the nucleons in the correct order. The image below shows that the locations of the nucleons no longer depend on the order in which they are added to the nucleus.
In the online testing, we saw that some users equated stable and neutral. As shown in the image below, we added the word “nucleus” to the stability readout. We also changed a phrase in the Help text from “build as many stable, neutral atoms as possible” to “stable and neutral atoms”.
In both forms of interface testing, we saw that users were hesitant to click on the Help button. This was particularly the case for male users. The image below shows that the Help button is now an “Info” button, but we are still exploring other ways to make our Help buttons less intimidating.
We also try to address issues in the questions that we write for the lab. Below is a screenshot of one of the questions for the Atom Builder lab. It asks students to classify each atom description as stable and/or neutral to confront the idea that stable and neutral are equivalent.
Note that we encourage students to use the lab to answer the question, as they are not expected to know which combinations of nucleons result in a stable nucleus. The items are also randomized to promote meaningful collaboration. In summary, user testing informs both the design of the lab and the questions that we ask about the lab.
Engage students in lecture
The open design of the interactive labs enables use in a variety of educational settings. Below are two examples of using the Conductivity lab with clicker questions in a lecture environment. The goal of this lab is for students to compare different types of solutions. They can drag the electrodes into a solution to measure the conductivity and view the particles in solution.
One type of clicker question that lends itself well to an interactive lab is a prediction question. For example, an instructor can ask students to predict which light bulb below shows the result when the electrodes are placed in one of the solutions. The instructor can collect student responses and then use the lab to show students the result.
Another type of clicker question is one that generates critical discussion of the interactive lab. For example, water molecules are not shown in the Conductivity lab. An instructor can ask students to select the zoom view below that shows the best representation of water. The first option shows a macroscopic representation, the second shows water molecules floating in water, and the third shows water molecules tightly packed. The instructor can collect student responses and then facilitate a classroom discussion about particulate models of solutions.
Engage students in lab
Our interactive labs are designed to support the inquiry process. For example, the goal of the lab below is for students to investigate precipitation reactions. Students can mix two solutions of ionic compounds and observe whether a precipitate forms. They can also use the results to identify an unknown solution. An instructor can use this lab to ask students to construct the solubility rules rather than to confirm the rules.
Each interactive lab comes with suggested questions. Many of the questions that we write for the labs ask students to collect and analyze data from the lab. An instructor can use this resource to prepare students for the experience of a wet lab. The eBooks for high school also include lab videos. Below is a still from a video about precipitation reactions. An instructor can use this resource to ask students to contrast the physical reaction with its virtual representation.
The interactive labs provide a safe environment for experimentation. Some of our labs contain reactions with safety or waste issues. Others allow procedures that would require equipment that a high school science department may not own. In this way, our interactive labs can enhance and extend the wet lab experience.
Engage students at home
Not surprisingly, an instructor can use our interactive labs with online homework. One strategy is to introduce a concept at home so that students are prepared to apply the concept in class. Many of our questions ask students to notice relationships in the lab. Other questions, like the example for the Atom Builder lab, prompt students to construct a working definition of science terms. Below is a screenshot of a question for the pH lab that asks students to examine how adding water or opening the drain affects the pH of each solution.
We include a link to the interactive lab in the question stem. Since the interactive labs are not randomized, we randomize the questions associated with the labs to encourage students to collaborate. In the example above, the solutions in the table are randomized so that each student is likely to get a different set. We also encourage students to use the lab to answer the question. An instructor can use the same strategy with other online resources and other homework systems.
Engage students anywhere
We develop the interactive labs in HTML5 to enable students to use the labs on any device. This means that we must ensure that our labs work on laptops and tablets, in multiple browsers and platforms. This also means that we must consider touch interfaces in the design of the labs.
The pH lab provides one example of a design consideration. We often use cursor changes to signify when an object is interactive. As shown in the image below, when a student hovers over the button on the water bottle, the mouse cursor changes from a pointer to a hand. We occasionally show a tooltip, such as “add water”, when a student hovers over an interactive object. Neither of these cues is possible on touch interfaces, so we must rely on artistic effects.
The Density lab provides an example of another design consideration. The goal of this lab is for students to design an experiment to determine the density of a spherical object. They can use the balance to measure the mass, and they can use the ruler or water displacement to determine the volume. Students can compare two objects of the same material with different sizes or two objects of different materials with the same size. They can also identify a mystery object.
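The two measurement routes can be compared with a short Python sketch. The function names and the sample measurements are hypothetical and are not values from the lab.

```python
import math


def density_from_ruler(mass_g, diameter_cm):
    """Density of a sphere from its mass and measured diameter."""
    radius_cm = diameter_cm / 2
    volume_cm3 = (4 / 3) * math.pi * radius_cm ** 3
    return mass_g / volume_cm3


def density_from_displacement(mass_g, initial_volume_ml, final_volume_ml):
    """Density from the rise in water level (1 mL = 1 cm^3)."""
    return mass_g / (final_volume_ml - initial_volume_ml)


# Hypothetical measurements for the same 11.3 g sphere.
print(round(density_from_ruler(11.3, 2.0), 2))                 # 2.7
print(round(density_from_displacement(11.3, 20.0, 24.2), 2))   # 2.69
```

The small difference between the two results reflects the limited precision of each measurement, which is itself a useful point for class discussion.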
We used a tablet for hallway testing and saw that it was difficult for users to measure the diameter of the small objects because their fingertips covered the objects. In the next iteration, we made the ruler transparent and added the ability to drag the ruler over the objects.
We did a horizontal flip of the interface for another lab after we saw that users’ hands covered the features below their fingers. Touch interfaces are important to consider early in the design process, and user testing on tablets often reveals usability issues.
For teachers: Customization
Sapling Learning pairs each instructor with a “Tech TA”, a subject expert who provides support throughout the semester. One way that our Tech TAs support instructors is by editing our existing content. We also plan to offer customization of the interactive labs. For example, we can add or remove chemicals from a lab to better align with a particular experiment.
Below is a version of the Density lab with objects removed. In this version, the objects are made of the same material. The material is one that has a range of density values. The largest rock does not fit inside the graduated cylinder, and it also maxes out the balance. Students can use the ruler and the average density of the other rocks to determine the mass of the largest rock. Another version of this lab could include rocks with other shapes.
Below is a version of the Density lab with objects added. This version includes an object that floats in water and an object with an irregular shape. Students can use the ruler to determine the volume of the wood ball, and they can use water displacement to determine the volume of the gold nugget.
Another way that our Tech TAs support instructors is by developing new content to address specific learning goals. We plan to provide the same level of support for our interactive labs.
For researchers: Data
Over 200,000 students are using Sapling Learning this semester. This means that many of our homework questions are attempted by thousands of students. Our homework system tracks every incorrect response. We already use this data to gauge difficulty and to ensure that students are getting our specific feedback. Many instructors use this data to shape their lectures.
An education researcher can use this data to uncover common student ideas or to compare student ideas before and after a learning experience. Below is a screenshot of a question that an instructor may assign during a unit on density. The question asks students to use the intensive nature of density to predict the behavior of a small metal block.
More than 1000 students have attempted this question. With the targeted feedback, nearly all students correctly place the metal at the bottom of the beaker. The student paths are even more interesting. The image below shows the five patterns followed by 98% of the students. We see that only 84% of the students were correct on their first attempt. We also see that students were more likely to place the metal in the middle of the beaker than to place it on the surface of the water.
Here we return to the opening question: Did students label the 2D model of CF4 as square planar? Of the roughly 1000 students who attempted this question, only 6% selected this on their first attempt. On the other hand, 19% of students labeled SF4 as tetrahedral on their first attempt. Given the specific feedback that the central atom has one lone pair, most students went on to successfully complete the question.
We are currently working on math labs for a new high school product in addition to developing new interactive labs for science. We are also exploring ways to offer free access to the interactive labs for students and teachers. One plan is to form a community where we can share labs in progress and get feedback on the design. We hope that you are able to apply one of the ideas in this article in your own work, and we look forward to your comments.
(1) This article is based on a presentation given by the author:
Lancaster, K. Engaging students with Sapling interactives. Presented at the 246th ACS National Meeting & Exposition, Indianapolis, IN, September 8-12, 2013; Paper CHED 404.
(2) For an independent case study using Sapling Learning, see:
Parker, L.L.; Loudon, G.M. Case Study Using Online Homework in Undergraduate Organic Chemistry: Results and Student Attitudes. J. Chem. Educ. 2013, 90, 37-44.
(3) For more on challenges unique to the design of interactive chemistry simulations, see:
Lancaster, K.; Moore, E.B.; Parson, R.; Perkins, K.K. Insights from Using PhET’s Design Principles for Interactive Chemistry Simulations. In Pedagogic Roles of Animations and Simulations in Chemistry Courses; Suits, J.P., Sanger, M.J., Eds.; ACS Symposium Series, Vol. 1142; American Chemical Society: Washington, DC, 2013; pp 97-126.
(4) Texas Administrative Code, Title 19, Part II; Chapter 112. Texas Essential Knowledge and Skills for Science; Subchapter C. High School.
(5) Sanger, M.J.; Greenbowe, T.J. Students’ Misconceptions in Electrochemistry: Current Flow in Electrolyte Solutions and the Salt Bridge. J. Chem. Educ. 1997, 74, 819-823.
(6) For more on the application of implicit scaffolding in simulation design, see:
Podolefsky, N.S.; Moore, E.B.; Perkins, K.K. Implicit scaffolding in interactive simulations: Design strategies to support multiple educational goals. Submitted to J. Sci. Educ. Technol.
(7) Van Noorden, R. Bohr’s model: Extreme atoms. Nature 2013, 498, 22-25.
The author would like to acknowledge the content and art teams at Sapling Learning. In particular, she would like to thank Jeff Sims, our interactive developer and animator. The author finds it a funny coincidence that she used to design “sims” for the PhET project and that Jeff lives in Lancaster, PA.
Since 1996 the CCCE has organized five intercollegiate OnLine Chemistry Courses (OLCCs). These have enabled colleges and universities to provide classes to their students that would not normally have been offered. In contrast to a MOOC, an OLCC is really a hybrid course involving two types of faculty: local facilitators (instructors of record who meet face-to-face with students) and online guest lecturers (subject domain experts, who may not be educators). We are currently organizing an OLCC in Cheminformatics and are seeking teaching faculty who would like to offer this course. This paper will describe the OLCC and explain why we feel it is important to develop one in Cheminformatics.
The first phase of this project involves the use of participatory web tools to bring together teaching faculty with professional cheminformaticians and chemical librarians to identify missing cheminformatics/information science skill sets in the curriculum. We will then collaboratively create a course curriculum to address these competencies, and generate Teaching and Learning Objects (TLOs) that can be used both inside and outside of the course to address them. We are creating a Web 2.0 content management strategy designed to allow schools to customize the material presented to their students as they interact with the cheminformatics lecturers. This is an international project involving multiple institutions and chemical societies. This project targets the needs of undergraduate students, but graduate classes are welcome. We are actively seeking input from teaching faculty in PUIs (Primarily Undergraduate Institutions), and further information can be obtained at the Cheminformatics OLCC development site, http://olcc.ccce.us/
An OLCC, or OnLine Chemistry Course, is a collaboratively taught hybrid (online/face-to-face) intercollegiate course that enables academic institutions to offer their students a class in a chemistry subject area for which they may otherwise lack the resources and expertise. The CCCE has been organizing OLCCs since 1996, and we are currently seeking faculty who would like to participate in a Cheminformatics OLCC. This project is not limited to schools within the US, and although we are seeking to address needs within the undergraduate curriculum, graduate-level classes are welcome. In this paper we address several questions. What is an OLCC, and what are the issues we are tackling with respect to online collaborative intercollegiate teaching, learning, and curriculum development? And why an OLCC in cheminformatics? Our development site is http://olcc.ccce.us/. Please contact the authors if you are interested in participating in this project.
Why would a chemistry faculty member who teaches undergraduate students and is not formally trained in cheminformatics want to teach a course in cheminformatics? Part of the answer to this question deals with the evolving nature of today’s digital information landscape, the emergence of e-science, and the role cheminformatics will play in the practice of traditional science in the 21st century. The web has superseded the traditional library as our primary source of information and yet like the traditional library, today’s web is fundamentally document-centric. That is, the major interface for obtaining information over the web is a web-page, like this page that you are now reading, and which could be bound in a book and placed in a traditional library. All this is logical, as we effectively practice science in a document-centric world of communication. But is this the only way to use the web in the practice of science? Clearly there is a role for data-centric interfaces and emergent e-science technologies.
Few academic institutions are equipped today to teach the latest and most advanced cheminformatics techniques, even though many of these techniques are employed in the chemical industries, and it would be an asset for our students to have skills in and cognizance of these technologies when they graduate and seek gainful employment. Sure, they can gain much of this vital training after graduation, but we need to ask ourselves, are we providing our students with the most useful education possible? We need to ask, is it important for our students to understand data standards? To understand and utilize electronic lab notebooks, smart spreadsheets, web APIs and mobile devices? To be able to perform science in a world where software agents interact with each other and databases to instantly provide scientists with the information most germane to the task at hand? Is something missing in today’s undergraduate curriculum? These are questions we need to ask and discuss in an honest and open manner.
Our objective in this project is to bring together academic and non-academic chemists, educators, librarians and cheminformaticians to develop a curriculum that can help us provide our students with the best skills in, and cognitions of, this new and evolving information landscape. But also, we are at a very challenging time in education, as our students are growing up in a world where they natively use cognitive artifacts that are foreign to many of our faculty. In 2009, Julie Evans of Project Tomorrow identified a new type of student in their “Speak Up” data set (which today represents data from over 3 million K-12 students, educators and parents)1. These were middle school digital native “free agent learners”2, kids who were using ICTs (Information and Communication Technologies) to develop new problem solving schema outside of the traditional curriculum. These kids who grew up in a mobile device driven world of instant communication and information are now beginning to enroll in our colleges and universities, and are bringing to the classroom a new set of skills and expectations the traditional curriculum may not be prepared to handle.
The Cheminformatics OLCC is in essence an experiment in curriculum development and dissemination, and if successful, will not only provide faculty and students with skills that will be of value in this evolving information landscape, but also provide specific modules that can introduce modern cheminformatic techniques into the traditional areas of chemistry. So it is vital that we have chemists who teach organic, inorganic, physical, analytical and biochemistry facilitate classes of this course at their home institutions, and collaboratively develop Teaching and Learning Objects (TLOs) that can be used outside of the OLCC and in their traditional classes.
OLCCs: A Brief History
OLCCs are intercollegiate courses hosted by the ACS DivCHED CCCE. To date, there have been five OLCCs, with the first being held in the Spring of 1996 and the last in the Fall of 2004. Unfortunately, we have lost access to the content of all but the Fall 2004 OLCC, which has been preserved by Scott Van Bramer at Widener3. We do have an early Spring 1996 CCCE Newsletter article on the first OLCC written by Donald Rosenthal.4 To date, OLCCs were held at the following times on the following topics:
Being an intercollegiate course, an OLCC involves multiple classes at multiple institutions with students interacting with both multiple online lectures and local faculty, and to avoid confusion we need to define some terminology that we will use in this paper and discussion.
Eight schools participated in the Fall 2004 OLCC on chemical hygiene (fig. 1), including the University of Arkansas at Little Rock, where Bob Belford, co-author of this paper, was a facilitator. Each week the students would interact with a new lecturer in much the same format as a ConfChem is run. The lecturer would post a paper that would be discussed over a listserv. For the OLCC there were two listservs, OLCC-FAC for the faculty, and OLCC-STU for both students and faculty. Figure 2 shows the lecture topics for the third and fourth weeks of the course, when George Wahl and Jay Young respectively interacted with students from all 8 campuses on the topics of “Exposure to Chemicals” and an “Introduction to Toxicology”. Just as in a ConfChem, multiple experts could participate in the discussion. This provided the students with a rich exposure to content that Belford could not have offered if he attempted to teach this course on his own.
Fig.1 List of schools offering a class in the 2004 OLCC
Fig. 2 Topics of the Fall 2004 OLCC syllabus on Chemical Health and Safety
Classroom Issues for an OLCC
One of the biggest classroom challenges for an OLCC results from the dynamics of collaborative teaching involving guest lecturers. In a normal classroom you have two basic types of human interactions, student-student and student-teacher. As the class progresses through the semester the student-teacher interactions become refined as they become conditioned to each other, and come to understand their respective needs and expectations. When a guest lecturer is invited to a classroom this prior experience is missing and often [read hopefully] the first question asked is “tell me about your students”. That is, the guest lecturer needs to identify the students’ background knowledge and predispositions, and then create content appropriate to their needs and abilities. How can that be done in an OLCC when each class has a unique set of students, with different needs, different background knowledge and different expectations?
This is one of the challenges this project is tackling, and we are creating what could effectively be called an intercollegiate course management system that enables different classes to teach the same course with different content. Just as the old OLCCs followed the ConfChem model, so will the new one. Each class will have its own homepage, created in a similar manner to how ConfChem conferences are created. This will be done through taxonomies, the same way this Newsletter is distinguished from last year’s Newsletter (described in the next article of this Newsletter, “The Twentieth Anniversary of ConfChem Online Conferences: Past, Present and Future”). That is, this project will create multiple TLOs (Teaching and Learning Objects) that, through class-based taxonomies and content tagging, allow the facilitators to customize the content of their individual class homepage to the needs of their students. Each TLO will be discussed like a ConfChem paper is discussed, and if multiple facilitators tag the same TLO, there will be intercollegiate student-student interactions across multiple institutions.
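The tagging idea can be illustrated with a minimal sketch. This is not the Drupal implementation; the class names, TLO titles and function names below are hypothetical and serve only to show how class-based tags could select the content of a class homepage and reveal which TLOs would be discussed across institutions.

```python
from dataclasses import dataclass, field


@dataclass
class TLO:
    """A Teaching and Learning Object (illustrative fields only)."""
    title: str
    tagged_by: set = field(default_factory=set)  # classes whose facilitator tagged it


def class_homepage(tlos, class_name):
    """Return the titles of the TLOs that a given class has tagged."""
    return [t.title for t in tlos if class_name in t.tagged_by]


def shared_discussions(tlos):
    """Map each TLO tagged by more than one class to the classes sharing it."""
    return {t.title: sorted(t.tagged_by) for t in tlos if len(t.tagged_by) > 1}


# Hypothetical example data: two classes, three TLOs.
tlos = [
    TLO("InChI layers, short version", {"Class A", "Class B"}),
    TLO("InChI layers, extended version", {"Class B"}),
    TLO("Searching by structure", {"Class A"}),
]

print(class_homepage(tlos, "Class A"))
print(shared_discussions(tlos))
```

In this sketch only the first TLO is tagged by both classes, so it is the only one that would generate an intercollegiate discussion.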
Assessment of student learning is another challenge for the OLCC model. Students are conditioned to be examined by the person providing the lecture, and teachers are conditioned to present the material they expect their students to know and to provide emphasis in line with their expectations. Although the students meet face-to-face on a weekly basis with the facilitator, the person delivering the course content (lecturer) is not the person responsible for the grades. This is further complicated when there are different lecturers each week, all teaching to multiple classes at different schools. This is not a case of “lecturing to the exam”, but facilitators need control over the lecture content if they are expected to provide the grades. During the development phase, when lecturers interact with facilitators to create TLOs, effort needs to go into generating assessment material, like ancillary test item files, that both facilitators and lecturers can contribute to and pool across campuses. But there is more.
In many ways OLCCs are ideal for project-based learning and assessment, where the external lecturers can function as mentors with respect to student projects and assignments. During the 2004 OLCC on Chemical Hygiene, students at UALR went into their research labs and identified activities for which they had no SOPs (Standard/Safe Operating Procedures), and then developed them as part of the course. They had the expertise of the lecturers to consult in their fulfillment of the project, and today many of those SOPs are incorporated into the department’s Chemical Hygiene Plan. Cheminformatics is an ideal subject for project-based learning and assessment.
Three Phases of the Cheminformatics OLCC
Phase 1: Curriculum Content Development
The initial curriculum development phase is expected to take 6-8 months. Lecturers will generate lesson plans and post modules in essentially the same way they did in the old OLCCs. These will be posted on a private development site and only available to faculty associated with the project. Facilitators and lecturers will then discuss these lesson plans much the same way we are discussing this paper, with the idea of generating multiple, single concept Teaching and Learning Objects (TLOs) based on those discussions. These are what we are calling Derivative TLOs, being derived from phase 1 lecturer-facilitator interactions. Figure 3 provides a flow chart for this process.
One model we are considering for generating TLOs is a short video type “show and tell” screen capture interview. Consider an initial module on the representation of chemicals on computers with a section on InChI (topic of paper 3 in this Newsletter). One facilitator states the material on the InChI layers is too complicated and we need a simpler version, while another states there is not enough and we need to expand that section. Yet another wants to relate this to a specific type of compound central to an organic class, while another wants to relate it to a specific type of compound related to an analytical class. And another needs…. We could then generate multiple derivative TLOs, based on the various needs of these facilitators. Once created, any facilitator can tag them and use them in their class. This enables all classes to customize the course to the needs of their students and institutions.
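As an illustration of what a "simpler version" of the InChI layers material might look like, the following sketch splits an InChI string into its slash-separated layers. It is a teaching simplification and not a validating parser: the function name and the short table of layer prefixes are our own choices for this example, it assumes the second segment is the formula layer (which is not true of every InChI), and it does not handle prefixes that can recur in longer identifiers. Ethanol is used only because its identifier is short.

```python
# Standard InChI for ethanol, used here only as a short, familiar example.
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

# Prefix letters of a few common layers (not an exhaustive list).
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protons",
}


def split_inchi_layers(inchi):
    """Split an InChI string into a dict of layers (simplified sketch)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Assumption: the segment after the version is the chemical formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Each later layer starts with a one-letter prefix.
            layers[LAYER_NAMES.get(part[0], part[0])] = part[1:]
    return layers


for name, value in split_inchi_layers(ETHANOL).items():
    print(f"{name}: {value}")
```

A facilitator wanting the expanded version could build on the same example by adding further layers to the table and discussing what each one encodes.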
Phase 1 of the project clearly has components of an online faculty workshop, where faculty who are not experts in Cheminformatics interact with cheminformaticians and other information experts to create the curriculum content. This not only gives faculty exposure to advanced cheminformatic techniques, but also gives them a form of ownership of the material they will use in the classroom, which is very important for adoption. There is also another aspect that we are trying to develop, and that is the “repurposability” of the TLOs. We realize that many PUIs (Primarily Undergraduate Institutions) will not be able to continue to offer a course in Cheminformatics once the OLCC is over, and thus it is important that the TLOs can be used in other courses. So we are seeking facilitators who teach the traditional core chemistry subjects (analytical, organic….) and who, as they interact with the cheminformaticians, will try to develop cheminformatic TLOs that can also be repurposed into the classes they normally teach.
Fig. 3 Flow chart showing interdisciplinary lecturer/facilitator interactions in the development of TLOs.
Phase 2: Curriculum Dissemination
We intend to offer the class twice, with a year between sessions to revise the modules, develop new TLOs and learn from our experiences. We also hope the course will attract more schools the second time around, and want to give new faculty the opportunity to participate in the phase 1 aspect.
As different schools have different academic calendars the first and last modules will be of variable length, depending on each school’s calendar, with the middle modules all synchronized to a common schedule. The first module will be on chemical literacy and involve activities with the local library, while the final module will be a class dependent final project, like creating a smart spreadsheet for a laboratory notebook. The following bullet list outlines the basic structure of the intercollegiate course dissemination during the synchronized component of the course.
Phase 3: Repurposable Archives
All TLOs will be open access with Creative Commons licenses that will allow others to repurpose them for needs outside of the OLCC. Our organizing committee includes members involved with the development of the XCITR (eXplore Chemical Information Teaching Resources) resource, which is now hosted by the RSC and was jointly created by ACS CINF and the GDCh CIC. This is an ideal repository for both TLOs and some project-based student assignments, where students would work with their local librarian to create tutorials on resources within their schools that could be of value to both future students and other users of those resources.
The organizing committee also has membership from the CHEMWIKI project at UC-Davis that is part of the STEMWiki Hyperlibrary project. CHEMWIKI currently receives 2.4 million pageviews per month and is developing a core Cheminformatics section. Their involvement not only ensures visibility of the archives, but also offers additional options for hosting the course in the event that scale-up issues occur. Although we are creating our own course management system within the Drupal content environment that DivCHED hosts our site on, we need to ensure that this strategy can be transferred to other platforms, like the CHEMWIKI.
With respect to the CCCE’s own archives, we will preserve the OLCC in the same site where we host the ConfChem and Newsletter archives, which are reported in the next article of this Newsletter, “The Twentieth Anniversary of ConfChem Online Conferences: Past, Present and Future”. Note that in the ensuing article we introduce the future open-tag capabilities for our ConfChem and Newsletter articles, and these will also be used during the OLCC. This will give students, lecturers and facilitators the ability to tag the content as they take the course and generate an additional taxonomy, a folksonomy. This folksonomy can then use the collective intelligence of the course participants to relate and extract TLOs from different sections of the course that were taught by different lecturers. This can potentially bring forth new relationships from within the curriculum content.
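A minimal sketch of how such a folksonomy could relate content is given below. It assumes that each open tag is recorded as a (participant, TLO, tag) triple; that record format, the function name and the example data are hypothetical and are not a description of the site's actual tagging system.

```python
from collections import defaultdict


def related_by_tag(tag_events):
    """Group TLOs under each free-form tag and keep tags linking two or more TLOs."""
    by_tag = defaultdict(set)
    for _participant, tlo, tag in tag_events:
        # Normalize case and whitespace so "InChI" and "inchi" count as one tag.
        by_tag[tag.strip().lower()].add(tlo)
    return {tag: sorted(items) for tag, items in by_tag.items() if len(items) > 1}


# Hypothetical tagging activity from different weeks of a course.
events = [
    ("student1", "Week 3: Chemical identifiers", "InChI"),
    ("facilitator1", "Week 7: Database searching", "inchi"),
    ("student2", "Week 7: Database searching", "search"),
]

print(related_by_tag(events))
```

Here the shared tag links two TLOs from different weeks, which is the kind of cross-lecturer relationship the folksonomy is meant to surface.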
Why an OLCC in Cheminformatics?
For students to learn the skills they need to excel as chemists they not only need to know the foundations and lab skills traditionally held central to the science, they also need to know how to communicate the results of their work, how to acquire and review the results of others' work, and increasingly how to use computational and informatics techniques as part of their scientific discovery process. Over the past several decades advances in Cheminformatics have been so rapid that the traditional undergraduate curriculum has not kept up, and it will be a significant competitive asset for our students to be cognizant of these new technologies when they graduate.
Cheminformatics is defined as the field of study of all aspects of the representation, management, integration, interchange, analysis and modeling of chemical and related biological information on computers. Extending chemical literacy skills with cheminformatic techniques can directly impact the success of practicing chemists. Acknowledging that this computational discipline encompasses a wide range of subjects, we seek to define a subset of the field that we consider essential knowledge for the graduating 21st century chemist.
We are putting together an organizing committee of cheminformaticians, undergraduate chemical educators, practicing chemists and librarians to identify the components of cheminformatics that can be considered critical knowledge for graduating chemistry students. This committee is tasked with developing the course syllabus. Please contact the authors if you are interested in contributing to this charge.
The initial modules will involve the local libraries and address competencies outlined in the Wikibook “Information Competencies for Chemistry Undergraduates: the elements of information literacy” of the Chemistry Division of the Special Libraries Association and the ACS Division of Chemical Information. The amount of time a class spends in this module is contingent on its academic calendar. Schools that start the semester earlier will spend more time than those that start later. After all schools have spent at least two weeks on this module we will start the synchronized lectures on cheminformatics. Cheminformatic topics and competencies the organizing committee is evaluating with respect to added value in the undergraduate curriculum include:
This work is supported by NSF TUES grant 1140485, and any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
1 http://www.tomorrow.org/speakup/pr/SU12_June_PR.html (last accessed October 22, 2013)
2 http://www.tomorrow.org/speakup/ComingToCampus.html (last accessed October 22, 2013)
3 http://science.widener.edu/svb/olcc_safety/ (last accessed October 22, 2013)
4 http://www.ccce.divched.org/sites/www.ccce.divched.org/files/cccenls1996.pdf , p. 15 (last accessed October
Recent attempts by science research consortiums to validate results from contemporary publications have exposed a tremendously low percentage (~10%) of reproducible results. This astoundingly poor reproducibility can partly be attributed to flaws in static publishing format, namely lack of detail in describing complex methodologies. As new technologies emerge, science research and education have adapted to include these new offerings. However, there has been no accompanying evolution in publication format. The scientific publishing model initially introduced in the mid seventeenth century is the same model used to present today’s research findings. Articles are still published as text heavy tomes with the occasional scheme, diagram, or table to demonstrate a point. Why is the science community still employing ancient text standards? How should the publication template be modified to accurately communicate advanced techniques? This presentation describes a new trend in publication format, namely video publication, and its ability to accurately capture complex techniques, thereby increasing reproducibility of results. Recently conducted case studies will be shown in support of video publications as a valid communication venue in scientific publishing.
State of Publishing
Science research is presented using largely the same publication model introduced in the mid seventeenth century. With the exception of color printing and electronic dissemination (cataloguing and searching functions), not much has changed. While these advances have improved user accessibility, they have not improved the ease of application or utility of the presented findings to other areas of research.
Conversely, as new technologies emerge and are applied to various disciplines, research and education are advancing at a rapid rate. Despite these advances, both new research and education models are presented as text heavy documents with the occasional diagram or graph of the results obtained. Why hasn’t the presentation of these new reports evolved alongside the concomitant advances in research and education?
Reproducibility on the Decline
As a severe consequence of the current stagnant state of research communication platforms, reproducibility in science is on the decline. Irreproducibility of research has surfaced as a serious concern in recent years, with numerous studies dedicated to reproducing the findings of contemporary publications. Each of these studies has exposed an alarming percentage of scientific findings that cannot be reproduced.
Two years ago, Bayer Health called attention to the issue internally.1 Interested in quantifying reproducibility from their labs, they selected 67 published experiments and repeated them. Comparing their new results to those previously published, they found that a staggering 64.2% could not be replicated. A mere 20.9% were fully replicated, 11.9% were partially replicated, and the remainder were not applicable.
A separate study conducted by C. Glenn Begley while he served as the head of global cancer research at Amgen Inc. calls attention to reproducibility in cancer research.2 Begley’s research group identified 53 papers from high profile journals and tested their claims. Similar to Bayer’s findings, Begley’s group found that 89% of these selected works could not be replicated.
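For readers who want to check these figures, the short sketch below works through the arithmetic. The percentages and the total of 67 experiments come from the text above; the counts per category are back-calculated by rounding and are therefore an inference, not numbers reported in the study.

```python
TOTAL_EXPERIMENTS = 67

# Percentages as quoted in the text.
reported_pct = {
    "not replicated": 64.2,
    "fully replicated": 20.9,
    "partially replicated": 11.9,
}

# Share left over for the "not applicable" category.
remainder_pct = round(100.0 - sum(reported_pct.values()), 1)

# Inferred counts: nearest whole number of experiments for each percentage.
counts = {k: round(p * TOTAL_EXPERIMENTS / 100) for k, p in reported_pct.items()}
not_applicable = TOTAL_EXPERIMENTS - sum(counts.values())

print(counts)
print(f"not applicable: {not_applicable} experiments ({remainder_pct}%)")
```

Under these assumptions the three quoted percentages account for 65 of the 67 experiments, leaving about 3% in the "not applicable" category.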
Deleterious Effects of Irreproducibility
Obvious negative impacts of irreproducible published results are numerous: contamination of the literature with potentially false findings, assumptions based on these reports, and a backward trajectory of science advancement, among others. As a more downstream consequence, research budgets allocated through government grants and independent industries are at risk. As the number of studies investigating the reproducibility of results in published research climbs, funding agencies are beginning to apply more stringent reproducibility guidelines to their applications.
Echoing this concern, an article in the Journal of the American Medical Association provides a breakdown of the biomedical research budget and funding.3 The report reveals a recent decline in budget trends despite the steady increase in the number of researchers. An updated study using recent fiscal calendars shows that research budgets continue to decline.4
The authors of the study posit a high failure rate of new technologies as one possible cause for the budget decrease. Failings in technology can be traced back to the poor reproducibility of published research and the inability to translate basic research findings into advanced applications. Though some of these failings may be due in part to complications of scaling factors, lack of reproducibility certainly plays a large role.
Considering that funding availability directly impacts science progress, science will suffer greatly if reproducibility is not addressed and remains an instigator of declining research budgets. To mitigate this serious and imminent threat, the science community must acknowledge the problem and alter the way in which information is shared for increased reproducibility.
Revolution in Science Publishing
The reproducibility epidemic has not gone without attention from the publishing groups responsible for distributing research reports. As the number of reproducibility investigations and published reports climbs, awareness of irreproducibility has increased. Just this past spring, Nature published an editorial commenting on ways in which researchers can alter their published works to increase reproducibility.5 Among these suggestions were increased methodological detail and specific reagent information, as the lack of sufficient detail in published reports was implicated as a main contributor to the irreproducibility problem.
What was not addressed in the Nature editorial was how researchers could provide a greater level of detail when describing their recent findings. If publication methods, or the ways in which information is presented to the community, have not changed in centuries, how can scientists be expected to present their advanced findings with the level of detail needed for replication? Publishers must adapt and provide new outlets for sharing methods and results with the science community. These new outlets must facilitate detail-oriented presentation.
Here at JoVE, the Journal of Visualized Experiments, we have approached this problem and developed a novel publication format, peer reviewed video publication. Starting from the traditional text manuscript model with all standard sections present (Abstract, Introduction, Protocol, Representative Results, Discussion, and References), JoVE converts the text into a script. Next, JoVE films the entire contents of the manuscript in the researcher’s own lab or clinic. Dynamic video presentation provides a visual level of detail that mirrors that of an in-person training experience. By observing how recent findings are obtained and analyzed, researchers can drastically decrease the time and resources spent learning and reproducing new published methods while also increasing the reproducibility of results.
JoVE publishes 70 video articles a month in both the physical and life sciences with sections in Chemistry and Applied Physics, among others. In addition to citations, the video articles receive several thousand views with total monthly web traffic typically reaching 300,000 visitors. The high number of web visitors shows that the science community is interested in video publication.
Recently published video article in JoVE Chemistry
Does Video Publication Deliver?
Given the high usage statistics, we were interested in determining how usage translates to increased utilization of published methods. To this end, we interviewed several researchers at various institutions to inquire about their interaction with JoVE. A postdoctoral researcher at Baylor College of Medicine, Nikolaos Giagtzoglou, shared his JoVE experience with us.6 While developing a new application of his research, Dr. Giagtzoglou looked to the literature to learn three techniques for working with Drosophila (fruit flies). After spending time reading several traditional text publications, Dr. Giagtzoglou was unable to learn from these reports and utilize the techniques in his experiment.
The obvious next step was to contact the author of the paper and request a training session. Unfortunately, when Dr. Giagtzoglou reached out to the authors, he found that “it can be hard to coordinate busy schedules to travel and learn the method.6” During his search for more literature sources, Dr. Giagtzoglou stumbled upon a JoVE article presenting the technique in video format and immediately recognized its value. Of the JoVE video article, Dr. Giagtzoglou told us, “I really had no starting point to learn these techniques, and JoVE was invaluable...Watching a JoVE video-article is so much more helpful than reading just materials or methods, which can have grammatical mistakes, bad syntax, or may be hard to interpret.6”
Similarly, Dr. Casey, an Assistant Professor at Purdue University, found a JoVE article when searching for methods describing how to dissect the suprachiasmatic nucleus in mice, a complicated neuroscience procedure.6 About her JoVE experience, she said, “I had a collaborator in Buffalo who knew the SCN surgery, and I’ve seen it done before. By using the JoVE video, we saved money in travel costs to go to Buffalo repeatedly to learn the technique.6” A cost analysis showed that Dr. Casey saved over six thousand dollars that would have been spent on travel expenses and reagents because the JoVE video article provided the necessary visual training needed to learn and perform the procedure independently.
In addition to saving valuable time, money and resources, video articles have the potential to help build the foundation for new ideas and research by providing the necessary instructions for non-experts to learn and implement new techniques in their own laboratory. In fact, Dr. Casey’s testimony reflects just that, “I’ve been doing research for 20 years, and having JoVE makes things so much easier. You can educate yourself on research other scientists are doing around you and get familiarized on a technique before you try it. I like to watch techniques and refresh myself on experiments I haven’t conducted in 18 years but need now.6” In this way, video articles facilitate the transfer of knowledge from one research lab to another from anywhere around the globe without the need to travel.
Researchers have also used video articles to validate results. After publishing novel results in a high impact journal, Dr. Jonathan T. Butcher at Cornell University explains that he received numerous inquiries from researchers in the field questioning the validity of the results, since “these other labs were not able to reproduce our results using the written instructions in the methods section of our novel research paper.6” Frustrated with having to defend his research in response to these claims, Dr. Butcher published this same method in JoVE and no longer receives correspondence disputing the validity or reproducibility of the lab’s results. Dr. Butcher believes this is because “the video format conveys complicated methods significantly better than text alone and helped to validate our novel results.6” Dr. Butcher’s testimony suggests that traditional text manuscripts are no longer sufficient to describe the complexity of contemporary research methods.
Video Publication for the Advancement of Science
Despite a decrease in reproducibility, science journals are publishing an increasing number of articles. These opposing trends suggest that peer reviewed journals are not providing the research community with the tools necessary to accurately present complex research. While science continues to advance at a rapid rate, publishing has not kept pace with these advances.
The first major change to science publishing in 350 years, JoVE utilizes video technology to capture complex research in both physical and life sciences. These video accounts provide viewers with the level of detail necessary for understanding, learning, and reproducing the research presented. Increased transparency through visualization also serves to validate methods and their results. Building confidence in research through increased validation will encourage funding agencies to grant larger budgets to research labs and institutions. The culmination of these myriad benefits will provide a springboard for the advancement of science.