Workshop Proceedings

Invited talks

Josef Steinberger:


University of West Bohemia, Plzeň, Czech Republic, and
SentiSquare s.r.o.
"Challenges in launching an NLP start-up company: Research meets the Real World"

Abstract: NLP is commercially a very interesting research field. However, there is a long way from a clever algorithm to a successful business. In this talk, I will discuss my experience with launching an NLP start-up that is connected to university research. It all started with just an idea. We went through two accelerator programs, where we learnt a lot and shaped the idea into a product. Later, a technology-transfer agreement was reached with the university. As the company was founded by researchers, it became apparent very soon that the team had to be complemented by someone with business experience. Since that was done, the ball has kept rolling fast, and contact with end users has given our research a new dimension.

Bio: Josef obtained his PhD from the University of West Bohemia in 2007. He then became a postdoc at the Joint Research Centre (JRC) of the European Commission, where he was part of the team working on Europe Media Monitor (EMM). After the postdoc, he moved back in 2012 to continue his career as a lecturer at the University of West Bohemia. In 2013, he was awarded a Marie Curie grant. In 2014, he co-founded the company SentiSquare, which builds on the university's NLP research and provides clients with a service that discovers online the main issues their customers are facing.

Tanja Samardžić:

University of Zurich
"A computational cross-linguistic approach to Slavic verb aspect"

Abstract: Verb aspect is one of the most prominent categories that distinguish Balto-Slavic from other Indo-European languages. Linguistic discussion of the potential functions of verb aspect and of the aspectual classification of verbs, mainly in Slavic languages, is extensive and long-standing. In contrast, computational approaches to Slavic verb aspect are relatively rare and underdeveloped. In this talk, I will discuss challenges for the computational treatment of Slavic verb aspect and its relevance to natural language understanding. In particular, I will present an approach to the automatic extraction of relatively fine-grained aspectual verb classes using parallel corpora. I will then show how these classes can be used in the temporal classification of events across languages.

Bio: I studied Serbian language, literature, and general linguistics at the University of Belgrade, where I obtained my first MA degree in Linguistics in 2003. Having discovered computational linguistics in one of the undergraduate courses, I continued my education in this direction through postgraduate studies at the University of Geneva from 2006 to 2013. During this time, I was a member of the research group "Computational Learning - Computational Linguistics" headed by Paola Merlo and James Henderson. I completed my PhD thesis in 2013, under the supervision of Paola Merlo. The same year, I started my current position as the director of the CorpusLab, within the University Research Priority Programme "Language and Space" at the University of Zurich.



Workshop Schedule

DAY 1: 10 September 2015
09:00 - 09:10 Welcome remarks
09:10 - 10:00 A Computational Cross-Linguistic Approach to Slavic Verb Aspect
Invited talk by Tanja Samardžić
Session I: Syntax
10:00 - 10:35 Universal Dependencies for Croatian (that work for Serbian, too)
Željko Agić and Nikola Ljubešić
10:35 - 11:00 Analytic Morphology – Merging the Paradigmatic and Syntagmatic Perspective in a Treebank
Vladimír Petkevič, Alexandr Rosen, Hana Skoumalová and Přemysl Vítovec
11:00 - 11:30 Coffee break
Session II: Information Extraction
11:30 - 11:50 Resolving Entity Coreference in Croatian with a Constrained Mention-Pair Model
Goran Glavaš and Jan Šnajder
11:50 - 12:15 Evaluation of Coreference Resolution Tools for Polish from the Information Extraction Perspective
Adam Kaczmarek and Michał Marcinczuk
12:15 - 12:35 Open Relation Extraction for Polish: Preliminary Experiments
Jakub Piskorski
12:35 - 14:00 Lunch
14:00 - 14:50 Challenges in Launching an NLP Start-up Company: Research Meets the Real World
Invited talk by Josef Steinberger
Interactive Session
15:00 - 15:10 Regional Linguistic Data Initiative (ReLDI)
Tanja Samardžić, Nikola Ljubešić and Maja Miličević
15:10 - 15:20 Online Extraction of Russian Multiword Expressions
Mikhail Kopotev, Llorenç Escoter, Matthew Pierce, Lidia Pivovarova and Roman Yangarber
15:20 - 15:30 E-law Module Supporting Lawyers in the Process of Knowledge Discovery from Legal Documents
Marek Kozlowski, Maciej Kowalski and Maciej Kazula
15:30 - 16:00 Coffee break (including continuation of the Interactive Session)
16:30 - 17:30 Discussion on BSNLP/SIGSLAV Activities: Shared NLP Task
DAY 2: 11 September 2015
Session III: Semantics
09:00 - 09:25 Experiments on Active Learning for Croatian Word Sense Disambiguation
Domagoj Alagić and Jan Šnajder
09:25 - 09:45 Automatic Classification of WordNet Morphosemantic Relations
Svetlozara Leseva, Ivelina Stoyanova, Maria Todorova, Tsvetana Dimitrova, Borislav Rizov and Svetla Koeva
Session IV: Corpus Analysis and Resources
09:55 - 10:20 Applying Multi-Dimensional Analysis to a Russian Webcorpus: Searching for Evidence of Genres
Anisya Katinskaya and Serge Sharoff
10:20 - 10:40 Distinctive Similarity of Clausal Coordinate Ellipsis in Russian Compared to Dutch, Estonian, German, and Hungarian
Karin Harbusch and Denis Krusko
10:40 - 11:00 Universalizing BulTreeBank: a Linguistic Tale about Glocalization
Petya Osenova and Kiril Simov
11:00 - 11:30 Coffee break
Session V: Sentiment Analysis and Text Classification
11:30 - 11:50 Types of Aspect Terms in Aspect-Oriented Sentiment Labeling
Natalia Loukachevitch, Evgeniy Kotelnikov and Pavel Blinov
11:50 - 12:15 Authorship Attribution and Author Profiling of Lithuanian Literary Texts
Jurgita Kapociute-Dzikiene, Andrius Utka and Ligita Sarkute
12:15 - 12:35 Classification of Short Legal Lithuanian Texts
Vytautas Mickevičius, Tomas Krilavičius and Vaidas Morkevičius
12:35 - 12:40 Closing remarks
END OF WORKSHOP