The wheel has turned.
Corpus linguistics, then
Twenty years ago, I put together a corpus of the English language with the help of the English Department at Birmingham University. Books were scanned by hand, and we culled the misreads by hand too, working through the night and wearing every item of clothing we possessed, to make our computer budgets stretch further. We used several mainframe computers, switching from one to another to complete different tasks.
Then we moved the whole shooting match back to Zimbabwe on computer tapes and carried on analysing the content using UNIX.
Munging, now
I had forgotten the word grep. Well, youngsters don’t grep anymore. They search for ‘regular expressions’. They’ve never heard of computational linguistics. They talk about the semantic web. They munge.
And they are doing fine work, using HTML markup and linguistic markers to search the web for information such as the schools attended by Conservative MPs or the names of officials who have signed off large grants to private companies.
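For anyone who last did this with grep, here is a rough sketch of what that kind of munging can look like in Python. It is only an illustration: the page fragment, the names, the schools and the function `schools_attended` are all invented for the example, and real pages are far messier. The HTML markup is used to isolate each list item, and a regular expression then acts as the linguistic marker ("was educated at").

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment; the names and schools are invented for illustration.
SAMPLE_HTML = """
<ul>
  <li class="mp"><span class="name">A. Example</span> was educated at Northfield School.</li>
  <li class="mp"><span class="name">B. Sample</span> was educated at Riverside College.</li>
  <li class="mp"><span class="name">C. Placeholder</span> has not declared a school.</li>
</ul>
"""


class ListItemText(HTMLParser):
    """Collect the plain text of each <li> element, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._buffer = None  # None means we are not inside an <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._buffer is not None:
            self.items.append("".join(self._buffer).strip())
            self._buffer = None


# The linguistic marker: "<name> was educated at <school>."
PATTERN = re.compile(r"^(?P<name>.+?) was educated at (?P<school>.+?)\.$")


def schools_attended(html):
    """Return a dict mapping each name to a school, for items that match the marker."""
    parser = ListItemText()
    parser.feed(html)
    parser.close()
    results = {}
    for text in parser.items:
        match = PATTERN.match(text)
        if match:
            results[match.group("name")] = match.group("school")
    return results


if __name__ == "__main__":
    for name, school in schools_attended(SAMPLE_HTML).items():
        print(f"{name}: {school}")
```

Items that do not fit the pattern are silently skipped, which is exactly the sort of gap a reader would want to know about before trusting the answer.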
When will hacking stop being a hobby?
Open data has surely begun, though it still seems to be at a hobbyist level. While academics are moving (wisely) from analysis to design (synthesis), hackers want the cut-and-thrust of a quick sortie – a raid on the establishment.
One of the growth areas over the next few years will be learning how to test the quality of answers provided by hackers.
Hack. Your business depends on it.
In the meantime, learn to hack. Because if you don’t, you’ll be hostage to the views of the world that the hackers put forward.