My mission is to span the boundary between computational linguistics (aka natural language processing) and relational data analysis (aka network analysis). I started working on this idea in order to better understand the co-evolution and interplay of the semantics and mechanics of real-world networks. More precisely, I am concerned with the controlled and efficient extraction of relevant, user-defined instances of node and edge classes from unstructured, natural language text data. CMU has been a great environment for doing this.
After working towards my goal from a computational and empirical standpoint for a couple of years, I began to realize that there is another dimension to it. The challenge is not just the interdisciplinary research, publications, or projects that we can engage in; it is, yet again, about the people. We need to build bridges between the people who develop computational solutions and the consumers of those tools, methods, measures, etc. This involves computational thinking (an idea suggested by Jeannette Wing from CMU, now NSF) and computational communication for everyone.
Why? Because the modern methods and techniques that we develop and deploy are complex in their underlying theories, models, algorithms, and parameters. Solutions handed over from engineers and scientists to analysts and other end-users carry along decisions that have already been made, some of which the user should know about. To address this challenge, I currently work on the reverse engineering of robustness. More precisely, I investigate how sensitive supervised and semi-supervised sequential stochastic machine learning techniques are to computational decisions, and how those decisions affect the relational data distilled from texts. If you have thoughts or feedback on this, I would be happy to talk with you about it.
Networks and Organizations: I study dynamic, complex, and large-scale socio-technical systems that face change, emergencies, or crises. In my empirical work I have focused on business organizations (e.g. Enron), governmental organizations (e.g. FEMA), and a broad range of covert networks.
Relational Data and Computational Linguistics: My ongoing interest is the automated extraction of relational data from texts. This has led me to develop new algorithms and methods for semantic network analysis and ontological text coding, and to apply these techniques to a variety of problems and domains. The techniques are available in AutoMap, a toolkit that I built at CMU. AutoMap supports users in distilling relational data (more specifically, mental models of individuals and groups as well as the structure of social and organizational systems) from texts. The package provides a variety of natural language processing and information extraction techniques (e.g. Part-of-Speech Tagging, Stemming, Collocations, Named-Entity Detection, Anaphora Resolution, creation and application of thesauri and ontologies). As a by-product, AutoMap supports classical Content Analysis.
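To make the idea of distilling relational data from text concrete, here is a minimal sketch in Python. It is not AutoMap's implementation: it assumes a simple windowed co-occurrence rule, in which two terms are linked whenever they appear within a few tokens of each other. The function name `cooccurrence_edges`, the stop-word list, and the window size are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; real thesauri and delete lists are far larger).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "at", "on", "with"}

def cooccurrence_edges(text, window=3):
    """Return a Counter mapping undirected term pairs to co-occurrence counts.

    Two terms are linked when they fall inside the same sliding window of
    `window` consecutive tokens, counted after stop-word removal.
    Sentence boundaries are ignored in this sketch.
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    edges = Counter()
    for i, source in enumerate(tokens):
        # Pair the current token with the next (window - 1) tokens.
        for target in tokens[i + 1 : i + window]:
            if source != target:  # skip self-loops
                edges[tuple(sorted((source, target)))] += 1
    return edges

edges = cooccurrence_edges(
    "The manager emailed the analyst. The analyst called the manager."
)
for (node_a, node_b), weight in sorted(edges.items()):
    print(node_a, node_b, weight)
```

Even in this toy version, the window size and the stop-word list are decisions that change the resulting network, which is the kind of sensitivity that end-users of such tools should know about.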