Comparison: CETIS Blogging vs EdTech Bloggers Generally (Jan 2011-Feb 2012)

This is an uninterpreted, automatically generated report showing how term usage varies between two sets of blogs: CETIS Blogs and EdTech Blogs in TELMap Mediabase (data obtained from the TELMap Mediabase created by RWTH Aachen University). A separate subsection for each set shows the terms that appear more frequently in that set's blog posts. Terms are selected on the basis of statistical significance: the probability that the difference in frequency is due to pure chance must be less than 0.1%, and further criteria must also be met for a term to be selected as dominant (see "technicalities").

All plots will open in a new window/tab as 1000x1000 pixel images if clicked on. The "Wordle" is 1024x768.

Overview of Selected Terms

Only middle-frequency words are considered; the comparison is between terms that are neither very common nor very rare in the aggregate of all blogs being analysed.
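As an illustration of the middle-frequency idea, the following is a minimal sketch and not the report's actual code. It assumes each blog post is already a list of term tokens, and the function name `middle_frequency_terms` and the `low`/`high` cut-offs are hypothetical; the real selection criteria are described on the GitHub wiki.

```python
from collections import Counter

def middle_frequency_terms(docs, low, high):
    """Return terms whose share of all tokens in `docs` lies in [low, high].

    `low` and `high` are hypothetical cut-offs chosen for illustration only.
    """
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {term for term, n in counts.items() if low <= n / total <= high}

# Toy data: the two sets are pooled before filtering ("the aggregate of all blogs").
cetis_docs = [["widget", "the", "standards"], ["the", "widget", "the"]]
other_docs = [["the", "mooc", "the"], ["the", "learning", "the", "the"]]
selected = middle_frequency_terms(cetis_docs + other_docs, low=0.1, high=0.4)
print(selected)
```

In this toy example "the" is too common and the single-occurrence terms are too rare, so only "widget" survives the filter.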

Word cloud of compared terms
Word cloud for both sets combined: word sizes indicate frequency.

 

CETIS Blogs

Frequency Plot

CETIS Frequencies and Significance
Frequency = the fraction of all terms in the set accounted for by this term
Significance = -log10 of the probability that the difference in frequency between the two sets of blogs is "pure chance" (e.g. 3 is 1 in 1,000, 4 is 1 in 10,000)
Docs/1000 = the number of documents per thousand in the set that contain the term (colour code and area of square)
Also available as hi-res pdf.
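The three legend quantities can be sketched as follows. This is an assumption-laden illustration, not the report's method: the report does not name its statistical test here, so a two-proportion z-test is swapped in, frequency is taken to mean the term's share of all tokens in a set, and the function name `term_stats` is hypothetical.

```python
import math

def term_stats(term, docs_a, docs_b):
    """Return (frequency, significance, docs_per_1000) for `term` in set A.

    Significance compares set A with set B using a two-proportion z-test,
    which is an assumption; the report's actual test may differ.
    """
    count_a = sum(doc.count(term) for doc in docs_a)
    count_b = sum(doc.count(term) for doc in docs_b)
    n_a = sum(len(doc) for doc in docs_a)
    n_b = sum(len(doc) for doc in docs_b)
    freq_a, freq_b = count_a / n_a, count_b / n_b

    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        significance = 0.0  # term absent from (or the only term in) both sets
    else:
        z = (freq_a - freq_b) / se
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        # Guard against floating-point underflow for extreme z values.
        significance = -math.log10(p_value) if p_value > 0 else math.inf

    docs_per_1000 = 1000 * sum(term in doc for doc in docs_a) / len(docs_a)
    return freq_a, significance, docs_per_1000

# Synthetic data: 100 ten-token posts per set; "widget" is in 30 of A and 5 of B.
docs_a = [["widget"] + ["filler"] * 9] * 30 + [["filler"] * 10] * 70
docs_b = [["widget"] + ["filler"] * 9] * 5 + [["filler"] * 10] * 95
freq, sig, per_1000 = term_stats("widget", docs_a, docs_b)
print(freq, sig, per_1000)
```

A significance above 3 corresponds to the report's threshold of a less than 0.1% probability that the difference is due to chance.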

 

Term Co-occurrence Graph

CETIS Term Co-occurrence
Node size = significance
Node colour is according to grouping
Edge (connector) size = number of documents containing both connected terms.
NB: only the most-connective 90% of edges are shown
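The edge size defined in the legend above can be computed along these lines. This is a minimal sketch assuming tokenised posts; `cooccurrence_edges` is a hypothetical name, and the filtering to the most-connective 90% of edges is not reproduced because its exact rule is not given here.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs, terms):
    """Map each unordered pair of selected terms to the number of
    documents that contain both terms."""
    wanted = set(terms)
    edges = Counter()
    for doc in docs:
        present = sorted(set(doc) & wanted)  # each term counted once per doc
        edges.update(combinations(present, 2))
    return edges

docs = [["widget", "standards", "oer"], ["widget", "oer"], ["standards"]]
edges = cooccurrence_edges(docs, {"widget", "standards", "oer"})
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

Sorting the terms present in each document keeps every pair in a single canonical order, so ("oer", "widget") and ("widget", "oer") are never counted as separate edges.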

 

EdTech Blogs in TELMap Mediabase

Frequency Plot

non-CETIS Frequencies and Significance
Frequency = the fraction of all terms in the set accounted for by this term
Significance = -log10 of the probability that the difference in frequency between the two sets of blogs is "pure chance" (e.g. 3 is 1 in 1,000, 4 is 1 in 10,000)
Docs/1000 = the number of documents per thousand in the set that contain the term (colour code and area of square)
Also available as hi-res pdf.

 

Term Co-occurrence Graph

non-CETIS Term Co-occurrence
Node size = significance
Node colour is according to grouping
Edge (connector) size = number of documents containing both connected terms.
NB: only the most-connective 90% of edges are shown

 

Information

Source Code, Data and Technicalities

Source code for processing and formatting is available on GitHub.

Raw results are available in pairs, one of each kind being the data behind the two sections above. Gephi files are available separately for CETIS and non-CETIS. All are under the same licence terms as this report.

The log file contains run parameters.

The technicalities of the method and explanatory notes on the content of the above downloads may be found on the GitHub wiki. These notes explain the term-selection criteria.

Copyright, Licence and Credits

This work was undertaken as part of the TEL-Map Project; TEL-Map is a support and coordination action within EC IST FP7 Technology Enhanced Learning.

This work, its images and original text are ©2012 Adam Cooper, Institute for Educational Cybernetics, University of Bolton, UK.
Adam Cooper has licensed it under a Creative Commons Attribution 3.0 Unported License.