While doing today’s reading about the growing popularity of the internet, I was reminded of the old Usenet threads that one can find on Google Groups. Google Groups is a Google service that has kept Usenet bulletin boards alive by saving a huge chunk of them, some of which go back to 1981, and it allows people (and spambots) to continue to contribute to them. Since Usenet was certainly a colorful place of discussion, as it was used as the first online social network (no socializing was allowed on ARPANET), these old Google Groups discussions could come in handy for historians who choose to study how computer-savvy people thought about things like culture and politics in the 1980s and 1990s. It is also a good example of how data in the natural sciences and the social sciences can be similar. In the Usenet archives, as in many natural science databases, there is a lot of bad data (spam) and there is too much data to simply read through it all. Both of these concerns could potentially be dealt with by using software and hardware dedicated to the task of sifting through it.
Much like what the Wayback Machine has done for the HTML era of network computing, Google Groups has clear potential to do the same for an older era, and perhaps to do it better, since Wayback Machine pages don’t always load. Unfortunately, Google Groups doesn’t really seem designed that way. It is built around the Google search engine, so although one can do an advanced search, it would certainly be difficult to retrieve all the available information on a given subject (to say that quite a few things were posted on Usenet would be an understatement). It seems to be more of a way to keep Usenet threads alive and to let those who still prefer the old ways of doing things keep doing them.
Many thoughts and ideas were shared on Usenet which may only be rediscovered through proper examination of the archives. For example, if someone wanted to see how the evolution of “phone phreaking” into full-scale computer hacking was portrayed among the computer-savvy population, going through the archives would help, but searching and clicking through them by hand might create problems. Links could be missed, and relevant discussions that use different terms might never turn up in a search. Furthermore, there is simply too much in these archives to read through everything and pick out what is important.
For Google Groups to be a reliable history tool (which would certainly be beneficial), it needs two things. The first is a serious attempt by Google to emphasize its historical capabilities by working on the search engine. The second is the ability to data-mine Google Groups so that the spam can be weeded out and the necessary data can be easily retrieved. Of course, someone other than Google could take on the historical side instead by mining the data and building a new database.
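To make the data-mining idea a little more concrete, here is a rough sketch in Python of what weeding out spam and retrieving posts on a topic might look like. Everything in it is an assumption made for illustration: the `Post` fields, the `SPAM_PATTERNS` list, the `looks_like_spam` and `search` helpers, and the sample posts are all invented, and it does not use any real Google Groups interface. A serious project would need a much better spam filter than a handful of keyword patterns.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    """A hypothetical archived post; real archives carry many more fields."""
    group: str
    year: int
    subject: str
    body: str


# Invented spam markers, for illustration only.
SPAM_PATTERNS = [r"https?://", r"\bbuy now\b", r"\bfree money\b", r"\bclick here\b"]


def looks_like_spam(post, threshold=2):
    """Flag a post when at least `threshold` spam markers appear in it."""
    text = f"{post.subject} {post.body}".lower()
    hits = sum(1 for pattern in SPAM_PATTERNS if re.search(pattern, text))
    return hits >= threshold


def search(posts, keywords, start_year, end_year):
    """Return non-spam posts in the year range that mention any keyword."""
    wanted = [k.lower() for k in keywords]
    results = []
    for post in posts:
        if not (start_year <= post.year <= end_year):
            continue
        if looks_like_spam(post):
            continue
        text = f"{post.subject} {post.body}".lower()
        if any(k in text for k in wanted):
            results.append(post)
    # Oldest first, so the results read as a rough timeline.
    return sorted(results, key=lambda p: p.year)


# Made-up sample data, not real Usenet posts.
SAMPLE_POSTS = [
    Post("alt.example", 1994, "From phreaking to hacking", "Blue boxes gave way to modems."),
    Post("alt.example", 1996, "FREE MONEY for phreaking fans", "Click here http://example.invalid to buy now"),
    Post("misc.example", 1985, "Phone phreaking stories", "Does anyone remember the whistle trick?"),
    Post("books.example", 1991, "New novels", "Any recommendations?"),
]

for post in search(SAMPLE_POSTS, ["phreak"], 1981, 1999):
    print(post.year, post.group, post.subject)
```

Matching on the stem "phreak" rather than the full phrase is a small attempt at the missed-discussion problem, since it also catches "phreaking" and "phreaker", but it would still miss posts that describe the same thing in different words.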
I don’t know if what I’ve said makes sense to anyone but myself, but it is clear to me that non-STEM fields could use big data to their advantage if social scientists can find a way to retrieve this material from the dustbin of history efficiently. We have entire internet conversations online that we could very easily have lost. We should find a way to use them!