I have focused my research on information
extraction, retrieval and organization from text documents, using several machine learning (probabilistic
and statistical) approaches.
Over these years, I have analyzed different
aspects of text/web mining, moving from classical text problems such as
supervised and unsupervised learning, or new representations for
text documents, to more computational-linguistic tasks such as language
evolution and statistical machine translation.
I have been part of the European project SMART,
where I have applied machine learning techniques to statistical machine translation (SMT)
problems, and I have also been involved in a media analysis project aimed at
modeling the mediasphere based on text mining and cross-language
analysis techniques.
My current research is centered on, but not limited to, SMT techniques applied to
the news domain, thesaurus indexing, machine
learning, multilingual pattern learning and
news analysis.
My most recent work involves learning-curve analysis of an SMT system, confidence
estimation in machine translation, self-learning for SMT systems and summarization evaluation.