Philosophically, this NYTimes Op-Ed piece on the Homeland Security Act and privacy violations meshes with my own concerns, but its predictions of what will come to pass were so precise that I went and looked up the Information Awareness Office for myself. It’s an interesting read. My guess is that they do want to gather a wide and possibly irrelevant-seeming collection of personal data: their approach is heavily learning-based, and it seems they hope to tease out correlations that people wouldn’t normally spot, by learning which patterns actually work rather than building and testing the ones they hope would work. Problem is, where’s their positive data going to come from, and do they have any prayer of statistical significance when non-terrorists vastly outnumber terrorists in the bunch? This isn’t a domain where the technology can be applied blindly and be expected to work reliably. I don’t want to give up my privacy for a needle-in-the-haystack search relying on technology of questionable relevance.
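To put rough numbers on that worry, here is a back-of-the-envelope sketch in Python. Every figure in it is made up for illustration (the population size, the number of real targets, the detector's hit rate and false-alarm rate); none of it comes from the IAO material, and `flag_counts` is just a hypothetical helper name.

```python
def flag_counts(population, real_targets, hit_rate, false_alarm_rate):
    """Return (true_hits, false_alarms, precision) for a screening system.

    All inputs are illustrative assumptions, not published figures.
    """
    innocents = population - real_targets
    true_hits = real_targets * hit_rate            # real targets correctly flagged
    false_alarms = innocents * false_alarm_rate    # innocent people wrongly flagged
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


if __name__ == "__main__":
    # Hypothetical scenario: a very generous detector on a very rare class.
    hits, alarms, precision = flag_counts(
        population=300_000_000,   # assumed population being screened
        real_targets=3_000,       # assumed number of actual targets
        hit_rate=0.99,            # assumed: catches 99% of real targets
        false_alarm_rate=0.01,    # assumed: wrongly flags 1% of everyone else
    )
    print(f"true hits:    {hits:,.0f}")
    print(f"false alarms: {alarms:,.0f}")
    print(f"chance a flagged person is a real target: {precision:.4%}")
```

Even with a detector that generous, the false alarms swamp the real hits by roughly a thousand to one under these made-up numbers, which is the needle-in-the-haystack problem in a nutshell.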
Don’t even get me started on how pissed I am that the military applications of NLP are moving beyond translation tools and simple extraction/summarization into this kind of software. I was wondering who was funding all that research into automated storytelling….