Intelligent information extraction from scholarly document databases

Fernando Vegas Fernandez


Extracting knowledge from large document databases has long been a challenge.
Most researchers carry out a literature review and manage their document databases with tools
that provide little more than a bibliography; when it comes to retrieving information (a list of
concepts and ideas), functionality is severely lacking. Researchers need to extract specific
information from their scholarly document databases according to a predefined breakdown
structure. Those databases usually contain a few hundred documents, information requirements
differ in each research project, and algorithmic techniques are not always the answer. Because
most retrieval and information extraction algorithms require manual training, supervision, and
tuning, it can be quicker and more efficient to do the extraction by hand and to dedicate the
time and effort to defining an effective semantic search list, which is the key to obtaining the
desired results. A robust definition of a relative importance index is the final step to obtain
a list of concepts ranked by importance, which helps both to measure trends and to find a quick
path to the most appropriate paper in each case.
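
The idea of a search list feeding a ranked concept list can be illustrated with a minimal sketch. The abstract does not give the definition of the relative importance index, so the index below is an assumed stand-in: the fraction of documents that mention at least one search term of a concept. The function name `rank_concepts`, the data layout, and the sample terms are all hypothetical and only serve the illustration.

```python
import re
from collections import Counter


def rank_concepts(documents, search_lists):
    """Return (concept, index) pairs sorted by descending index.

    documents: dict mapping a document id to its plain text.
    search_lists: dict mapping a concept name to a list of search terms.
    ASSUMPTION: the index is the share of documents that mention at least
    one term of the concept; the source does not define its own index.
    """
    hits = Counter()
    for text in documents.values():
        lowered = text.lower()
        for concept, terms in search_lists.items():
            # Whole-word, case-insensitive match of any term in the list.
            if any(
                re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
                for term in terms
            ):
                hits[concept] += 1
    total = len(documents) or 1  # avoid division by zero on an empty database
    ranked = [(concept, hits[concept] / total) for concept in search_lists]
    # Highest index first; ties are broken alphabetically by concept name.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Hypothetical sample data, not taken from the paper.
sample_documents = {
    "doc1": "Cost overrun and schedule delay in large projects.",
    "doc2": "Schedule delays are common.",
    "doc3": "Safety culture on site.",
}
sample_search_lists = {
    "cost": ["cost overrun", "budget"],
    "schedule": ["schedule delay", "schedule delays"],
    "safety": ["safety"],
}

for concept, index in rank_concepts(sample_documents, sample_search_lists):
    print(f"{concept}: {index:.2f}")
```

In this sketch the quality of the result depends almost entirely on the search lists, which is consistent with the emphasis above on defining them carefully; a different index definition could be swapped in without changing the rest.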


Market Intelligence, Business Intelligence, Competitive Intelligence, Information Systems, Geo-Economics
