The challenge

TaMTAS addresses two connected barriers: the language in which science circulates and the complexity of scientific writing itself.

The deeper view

The project proposes terminology-aware, document-level machine translation and augmentation for the life sciences. Large Reasoning Models (LRMs) treat translation as a reasoning task, while quality estimation, automatic post-editing, and audience adaptation improve reliability and accessibility.

A linguistic barrier

The dominance of English in science disadvantages researchers who are not native English speakers and leaves less-represented languages with fewer scientific resources and terms.

A comprehension barrier

Even after translation, complex structures and specialist terminology can keep scientific knowledge inaccessible to students and the public.

Eight objectives, one integrated system.

  1. SO1: Compile and enrich multilingual scientific corpora
  2. SO2: Advance terminology extraction and integration
  3. SO3: Train document-level translation with LRMs
  4. SO4: Detect terminology errors with quality estimation
  5. SO5: Join quality estimation and automatic post-editing
  6. SO6: Adapt translated text to different audiences
  7. SO7: Validate with stakeholders in real settings
  8. SO8: Publish reusable outputs openly