Multi-Lingual Natural Language Processing for AI systems

Abend Omri, HUJI, School of Computer Science and Engineering, Computer Science



Computer Science and Engineering


Multilingual AI, Natural Language Processing, Semantic representation, Text-to-text generation and its evaluation

Current development stage

Know-how transfer; looking for industrial partners and data to reach proof of concept in specific verticals



Most AI developed today is English-centric. While most NLP research over the years has targeted English and a handful of other languages for which extensive resources are available (e.g., plain text, annotated data, dictionaries, translations), recent research increasingly emphasizes other, "low-resource" languages.


Our Innovation

The researchers are developing a set of algorithms for multilingual semantic representations that abstract away from syntactic detail and uncover a shared, more abstract level of representation.

The group has recently developed a fully workable scheme, Universal Conceptual Cognitive Annotation (UCCA). UCCA is a novel multi-layered framework for semantic representation that aims to accommodate the semantic distinctions expressed through linguistic utterances.

The researchers demonstrated UCCA's portability across domains and languages, and its relative insensitivity to meaning-preserving syntactic variation. The UCCA scheme can be effectively and quickly learned by annotators with no linguistic background, and be used to compile an annotated corpus.
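
To illustrate what insensitivity to meaning-preserving syntactic variation can mean in practice, the following is a minimal sketch, not the group's implementation. The `Unit` class, the `signature` function, and the hand-built graphs are hypothetical; the category names (Scene, Participant, Process) are loosely borrowed from UCCA's published scheme, and function words such as "was" and "by" are simply omitted by hand rather than analyzed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A node in a simplified, hand-built semantic graph (hypothetical structure)."""
    category: str          # role of this unit within its parent, e.g. "Process"
    lemma: str = ""        # content word for leaf units; empty for inner units
    children: tuple = ()   # sub-units, in surface order


def signature(unit):
    """Summarize a unit independently of the surface order of its children."""
    child_sigs = tuple(sorted(signature(child) for child in unit.children))
    return (unit.category, unit.lemma, child_sigs)


# "The dog chased the cat."
active = Unit("Scene", children=(
    Unit("Participant", "dog"),
    Unit("Process", "chase"),
    Unit("Participant", "cat"),
))

# "The cat was chased by the dog." (same content units, different surface order)
passive = Unit("Scene", children=(
    Unit("Participant", "cat"),
    Unit("Process", "chase"),
    Unit("Participant", "dog"),
))

# "The dog bit the cat." (a different main relation)
other = Unit("Scene", children=(
    Unit("Participant", "dog"),
    Unit("Process", "bite"),
    Unit("Participant", "cat"),
))

same_meaning = signature(active) == signature(passive)
different_meaning = signature(active) != signature(other)
```

In this toy setting the active and passive sentences receive the same signature, while the sentence with a different main relation does not. A real annotation scheme and parser must of course derive such structures from text rather than have them written by hand.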

The group has developed cross-linguistically applicable semantic parsers, and current work is focused on improving the performance of machine translation systems by leveraging these advances.


Applied AI is transforming almost every industry: health, finance, property, transport and travel, manufacturing, marketing and sales, agriculture, energy, business services, decision making, and even politics. Multilingual natural language processing techniques will reduce the costly development effort needed to extend AI systems to additional languages, and will support:

  • Multilingual chat bots for e-commerce, query processing and interaction
  • Multilingual social media listening and sentiment analysis
  • Global information extraction, categorization and summarization



Contact for more information:

Anna Pellivert