Shared Task at SIGTURK 2026 Workshop

Terminology-Aware Machine Translation for English–Turkish Scientific Texts

Today, English serves as the default language of science, which has produced a vast body of scientific texts with specialized technical terminology in English. Domain experts play a crucial role in determining how scientific terms are translated into their native languages, and ideally they should be the primary decision-makers. In this work, we explore whether models can be developed to follow experts’ translation choices and automatically correct or post-edit translations accordingly.

Subtask 1 – Term Detection

Before correcting the translation of technical terms, we must first be able to detect them. The first task is therefore to identify the boundaries of these terms accurately. Given a source sentence and its target translation, the model should fill the term_pairs field with the detected terms and their start and end offsets (en_start, en_end, tr_start, tr_end). To support contextual understanding, we also provide the corresponding source and target paragraphs.

Data Format

Example JSON instance:

{
    "paragraph_id": 1,
    "sentence_id": 1,
    "source_paragraph": "Randomness is one of the most important parts of cryptography because key generation and the key itself depend on random values. In literature, there exist statistical randomness tests and test suites to evaluate the randomness of the cryptographic algorithm. Although there exist randomness tests, there is no mathematical evidence to prove that a sequence or a number is random. Therefore, it is vital to choose tests in the test suites due to independency and coverage of the tests used in the suites. Sensitivity of these tests to nonnandom data is also important. The tests should be classified to determine that tests are independent and wide.",
    "target_paragraph": "Rasgelelik kriptografinin en önemli kısımlarından biridir çünkü, anahtar üretimi ve anahtarın kendisi rastgele değerlere bağlıdır. Literatürde birçok istatistiksel rastgelelik testi ve bu testleri içeren test paketleri yer almaktadır. Buna rağmen bir dizinin veya bir sayının rastgele olduğunu gösterecek hiçbir matematiksel kanıt yoktur. Bundan dolayı bir istatistiksel test paketi oluştururken bu testlerin seçimi hayati bir önem taşımaktadır. Ayrıca bu testlerin rastgele olmayan verilere karşı duyarlılığı da çok önemlidir. İstatistiksel testlerin birbirinden bağımsız olduğunu ve kapsamının geniş olduğunu belirlemek için sınıflandırılması gerekmektedir.",
    "source_sentence": "Randomness is one of the most important parts of cryptography because key generation and the key itself depend on random values.",
    "target_sentence": "Rastgelelik kriptografinin en önemli kısımlarından biridir çünkü, anahtar üretimi ve anahtarın kendisi rastgele değerlere bağlıdır.",
    "term_pairs": [
        {
            "en": "Randomness",
            "en_start": 0,
            "en_end": 10,
            "tr": "Rastgelelik",
            "tr_start": 0,
            "tr_end": 11
        },
        {
            "en": "cryptography",
            "en_start": 49,
            "en_end": 61,
            "tr": "kriptografinin",
            "tr_start": 12,
            "tr_end": 26
        },
        {
            "en": "key generation",
            "en_start": 70,
            "en_end": 84,
            "tr": "anahtar üretimi",
            "tr_start": 66,
            "tr_end": 81
        },
        {
            "en": "key",
            "en_start": 70,
            "en_end": 73,
            "tr": "anahtar",
            "tr_start": 66,
            "tr_end": 73
        },
        {
            "en": "random values",
            "en_start": 114,
            "en_end": 127,
            "tr": "rastgele değerlere",
            "tr_start": 103,
            "tr_end": 121
        }
    ]
}
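In the example above, each offset pair behaves like a half-open character span: slicing the sentence with [start:end] reproduces the term text. The following is a minimal sketch of a sanity check built on that observation. The function name check_term_pairs is hypothetical, and the half-open convention is inferred from this single example rather than stated by the organizers.

```python
def check_term_pairs(instance):
    """Return (term, side) for every term whose offsets do not match the text.

    Assumption: offsets are 0-based character indices with an exclusive end,
    as inferred from the example instance (not an official specification).
    """
    mismatches = []
    sides = (("en", "source_sentence"), ("tr", "target_sentence"))
    for pair in instance["term_pairs"]:
        for side, sentence_key in sides:
            start = pair[f"{side}_start"]
            end = pair[f"{side}_end"]
            if instance[sentence_key][start:end] != pair[side]:
                mismatches.append((pair[side], side))
    return mismatches


# A shortened copy of the example instance (three of the five term pairs).
example = {
    "source_sentence": "Randomness is one of the most important parts of "
                       "cryptography because key generation and the key "
                       "itself depend on random values.",
    "target_sentence": "Rastgelelik kriptografinin en önemli kısımlarından "
                       "biridir çünkü, anahtar üretimi ve anahtarın kendisi "
                       "rastgele değerlere bağlıdır.",
    "term_pairs": [
        {"en": "Randomness", "en_start": 0, "en_end": 10,
         "tr": "Rastgelelik", "tr_start": 0, "tr_end": 11},
        {"en": "key generation", "en_start": 70, "en_end": 84,
         "tr": "anahtar üretimi", "tr_start": 66, "tr_end": 81},
        {"en": "random values", "en_start": 114, "en_end": 127,
         "tr": "rastgele değerlere", "tr_start": 103, "tr_end": 121},
    ],
}

print(check_term_pairs(example))
```

Running a check like this on system output before submission can catch off-by-one boundaries early.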

Evaluation

We use Precision, Recall, and micro/macro F1 to evaluate each direction separately. Evaluation is token-based: partial matches (e.g., detecting “key” instead of “key generation”) receive proportional credit. See the provided evaluation script for exact scoring details.
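To illustrate what proportional credit for partial matches could look like, here is a minimal sketch of token-level precision, recall, and F1 for one sentence and one language side. It assumes whitespace tokenization and bag-of-tokens overlap. The function name token_prf is hypothetical, and the official evaluation script may tokenize, align, or aggregate differently.

```python
from collections import Counter


def token_prf(gold_terms, predicted_terms):
    """Token-level precision, recall, and F1 between two lists of terms.

    Assumption: terms are split on whitespace and compared as multisets of
    tokens. This is an illustration, not the official scorer.
    """
    gold = Counter(tok for term in gold_terms for tok in term.split())
    pred = Counter(tok for term in predicted_terms for tok in term.split())
    overlap = sum((gold & pred).values())  # tokens present on both sides
    precision = overlap / sum(pred.values()) if pred else 0.0
    recall = overlap / sum(gold.values()) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Detecting "key" when the gold term is "key generation":
# the single predicted token is correct, but half of the gold tokens are missed.
print(token_prf(["key generation"], ["key"]))
```

Under these assumptions, the partial match earns full precision and half recall rather than being counted as a complete miss.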

Subtask 2 – Term Correction with Expert Input

In this subtask, the goal is to post-edit the translation of technical terms using expert-provided hints. Given the source and target paragraphs, corresponding sentences, detected term boundaries, and expert input (if available), the model must correct the translation of each term to conform to the expert’s preferred terminology.

The system should fill the "correction" field for each term:

The field must contain only the corrected Turkish term (with appropriate suffixes if required).

This task focuses on evaluating the instruction-following capability of pretrained language models. Note that:

The provided hints may already represent the correct translation.
In some cases, the hint may appear without Turkish suffixes or as a partial translation.
Models should use the hint contextually and output a fluent, morphologically correct correction.

Data Format

An example JSON entry with required corrections is shown below:

{
    "paragraph_id": 1,
    "sentence_id": 3,
    "source_paragraph": "The simplest port scanners use the operating system's network functions and are generally the next option to go to when syn is not a feasible option (described next). Nmap calls this mode connect scan, named after the unix connect() system call. If a port is open, the operating system completes the tcp three-way handshake, and the port scanner immediately closes the connection to avoid performing a denial-of-service attack. Otherwise an error code is returned. This scan mode has the advantage that the user does not require special privileges. However, using the os network functions prevents low-level control, so this scan type is less common.",
    "target_paragraph": "En basit port tarayıcıları işletim sisteminin ağ işlevlerini kullanır ve ve genellikle uygulanabilir bir seçenek olmadığında syn gidilebilecek sonraki seçenektir (sonraki bölümde açıklanmaktadır). Nmap , unix connect () sistem çağrısından sonra adlandırılan bu mod bağlantı taramasını çağırır. Eğer bağlantı açıksa işletim sistemi tcp 3 yollu el sıkışmasını tamamlar ve bağlantı noktası dos saldırısı yapılmasını önlemek amacıyla bağlantıyı hemen kapatır. Aksi halde bir hata kodu döndürülür. Bu tarama modu sayesinde, kullanıcının özel ayrıcalıklara sahip olmasına gerek yoktur. Buna rağmen, işletim sistemi ağ fonksiyonlarını kullanmak düşük seviye kontrolünü önler, bu nedenle bu tarama türü daha az yaygındır.",
    "source_sentence": "If a port is open, the operating system completes the tcp three-way handshake, and the port scanner immediately closes the connection to avoid performing a denial-of-service attack.",
    "target_sentence": "Eğer bağlantı açıksa işletim sistemi tcp 3 yollu el sıkışmasını tamamlar ve bağlantı noktası dos saldırısı yapılmasını önlemek amacıyla bağlantıyı hemen kapatır.",
    "term_pairs": [
        {
            "en": "port",
            "en_start": 5,
            "en_end": 9,
            "tr": "bağlantı",
            "tr_start": 5,
            "tr_end": 13,
            "hint": "bağlantı noktası",
            "correction": "bağlantı noktası"
        },
        {
            "en": "tcp three-way handshake",
            "en_start": 54,
            "en_end": 77,
            "tr": "tcp 3 yollu el sıkışmasını",
            "tr_start": 37,
            "tr_end": 63,
            "hint": "üç yönlü tokalaşma",
            "correction": "tcp 3 yönlü tokalaşmasını"
        },
        {
            "en": "port scanner",
            "en_start": 87,
            "en_end": 99,
            "tr": "bağlantı noktası",
            "tr_start": 76,
            "tr_end": 92,
            "hint": "bağlantı noktaları tarayıcısı",
            "correction": "bağlantı noktaları tarayıcısı"
        }
    ]
}

Evaluation

Accuracy is measured on the correction field using Exact Match.
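As a rough illustration, exact-match accuracy over the correction field can be sketched as follows. The function name exact_match_accuracy is hypothetical, and the sketch assumes plain string equality with no normalization (case, whitespace, or Unicode); the official script may behave differently.

```python
def exact_match_accuracy(gold_corrections, predicted_corrections):
    """Fraction of predictions that are string-identical to the gold correction.

    Assumption: no normalization is applied before comparison.
    """
    if len(gold_corrections) != len(predicted_corrections):
        raise ValueError("gold and predicted lists must have the same length")
    if not gold_corrections:
        return 0.0
    hits = sum(g == p for g, p in zip(gold_corrections, predicted_corrections))
    return hits / len(gold_corrections)


gold = [
    "bağlantı noktası",
    "tcp 3 yönlü tokalaşmasını",
    "bağlantı noktaları tarayıcısı",
]
# Copying the hint verbatim misses the second term, which needs the
# surrounding words and the case suffix.
predicted = [
    "bağlantı noktası",
    "üç yönlü tokalaşma",
    "bağlantı noktaları tarayıcısı",
]
print(exact_match_accuracy(gold, predicted))
```

The second prediction shows why hints cannot always be copied as-is: a correct lemma without the required suffix still counts as a miss under exact match.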

Subtask 3 – End-to-End Post-Edit

In this subtask, we examine how models perform end-to-end when given access to terimler.org. Models need not query terimler.org live; an offline glossary will be provided in dictionary format. We do not provide term boundaries or hints, only the translation itself. Given the source and target paragraphs and sentences, the task is to post-edit the target sentence: the model should fill the edited_target_sentence field.

Example

{
    "paragraph_id": 3,
    "sentence_id": 2,
    "source_paragraph": "In this thesis we study static and time dependent solutions of supergravity theories. We discuss p-branes, plane waves, Kaluza-Klein monopoles, and time-dependent S-brane solutions. We then proceed to describe the Kaluza-Klein dimensional reduction procedure and discuss how theories in lower dimensions can be obtained from theories in higher dimensions. As the main result of this thesis, we present new solutions of supergravity theories involving intersections of S-branes with plane waves and Kaluza-Klein monopoles. We find that configurations involving intersections of S-branes with waves are restricted in that the wave can be placed only on the transverse space of the S-brane and the transverse space must be flat. We also find that a larger number of configurations involving intersections of S-branes with Kaluza-Klein monopoles exist.",
    "target_paragraph": "Bu tezde süperçekim kuramlarının zamandan bağımsız ve zamana bağlı çözümlerini inceleyeceğiz. Dalgaları, p-branları, Kaluza-Klein monopollerini ve zamana bağlı S-brane çözümlerini tartışacağız. Bunun ardından Kaluza-Klein boyutsal indirgeme kuramını çalışacağız ve düşük boyutlardaki kuramların bu yolla yüksek boyutlardaki kuramlardan nasıl elde edildiğini göreceğiz. Temel sonuç olarak süperçekim kuramlarının S-branlerinin dalgalarla ve Kaluza-Klein monopolleriyle kesişimlerini içeren yeni çözümlerini sunacağız. Bulgularımız S-branlerinin dalgalarla kesişimlerinde dalganın S-branının sadece dış uzayına yerleştirilebileceğini ve dış uzayın düz seçilmesi gerektiğini gösteriyor. S-branlerinin Kaluza-Klein monopolleriyle kesişimlerinde ise daha fazla sayıda seçeneğin bulunduğunu göstereceğiz.",
    "source_sentence": "We discuss p-branes, plane waves, Kaluza-Klein monopoles, and time-dependent S-brane solutions.",
    "target_sentence": "Dalgaları, p-branları, Kaluza-Klein monopollerini ve zamana bağlı S-brane çözümlerini tartışacağız.",
    "edited_target_sentence": "Düzlem dalgaları, p-zarları, Kaluza-Klein monopollerini ve zamana bağlı S-brane çözümlerini tartışacağız."
}
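One possible first step for an end-to-end system is to look up glossary terms in the source sentence. The sketch below assumes the offline glossary can be loaded as a Python dict mapping English terms to Turkish terms; the actual glossary format may differ. The function name find_glossary_terms and the three glossary entries are hypothetical, made up for illustration from the example above, and are not taken from terimler.org.

```python
import re


def find_glossary_terms(sentence, glossary):
    """Find non-overlapping glossary terms in a sentence, longest term first.

    Returns a sorted list of (start, matched_text, turkish_term) tuples.
    Trailing word characters are allowed so that simple plurals still match.
    """
    matches = []
    taken = []  # spans already claimed by longer terms
    for term in sorted(glossary, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(term) + r"\w*"
        for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            overlaps = any(m.start() < e and s < m.end() for s, e in taken)
            if overlaps:
                continue
            taken.append(m.span())
            matches.append((m.start(), m.group(), glossary[term]))
    return sorted(matches)


# Hypothetical glossary entries, for illustration only.
glossary = {"plane wave": "düzlem dalga", "p-brane": "p-zar", "wave": "dalga"}
source = ("We discuss p-branes, plane waves, Kaluza-Klein monopoles, "
          "and time-dependent S-brane solutions.")
for start, text, turkish in find_glossary_terms(source, glossary):
    print(start, text, "->", turkish)
```

Matching longer terms first keeps "plane waves" from also being reported as the shorter entry "wave". Producing the inflected Turkish form in context is left to the post-editing model.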

Evaluation

We evaluate post-edit outputs using chrF and COMET scores (subject to change).
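For intuition about what a character n-gram metric rewards, here is a simplified character n-gram F-score in the spirit of chrF. It removes whitespace, averages n-gram precision and recall over orders 1 to 6, and combines them with beta = 2. These choices are assumptions of this sketch, and the function name simple_chrf is hypothetical; official scores should come from the organizers' evaluation script, whose results this sketch is not guaranteed to reproduce.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after removing all whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_order=6, beta=2.0):
    """Simplified character n-gram F-score in [0, 1] (illustrative only)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short for this n-gram order
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


unedited = ("Dalgaları, p-branları, Kaluza-Klein monopollerini ve zamana "
            "bağlı S-brane çözümlerini tartışacağız.")
reference = ("Düzlem dalgaları, p-zarları, Kaluza-Klein monopollerini ve "
             "zamana bağlı S-brane çözümlerini tartışacağız.")
print(simple_chrf(unedited, reference))
```

Because most of the sentence is unchanged by post-editing, an unedited output already shares many character n-grams with the reference, so differences between systems on this kind of metric may be small.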

Evaluation Script and Development Data

We provide an evaluation script along with development data for all subtasks on our GitHub repository: aligebesce/sigturk2026_sharedtask

Submission of Model Predictions

Before making any submission to Codabench, each team must first register via the SIGTURK 2026 Shared Task registration form (a Google Form).

Only registered teams will be considered in the official rankings and in the shared task overview paper.

Test files for all subtasks are available in the test_data directory of our GitHub repository.

Participants must:

run their systems on the released test files,
collect all predictions into a single JSON file named predictions.json,
compress this file into a ZIP archive named predictions.json.zip, and
submit this single predictions.json.zip file to the Codabench competition page.

Important format requirements:

You must submit exactly one file named predictions.json.zip to Codabench (not a raw predictions.json file).
The archive predictions.json.zip must contain exactly one file named predictions.json at the top level (no nested folders, no extra files).
The structure of predictions.json must match the examples in the test_data directory.
Each JSON entry must correspond to one test instance.
Each entry must include a task_type field (for example: "task_type": "task3"), and its value must correctly indicate the relevant subtask ("task1", "task2", or "task3"), as shown in the provided test data.

Due to Codabench sandboxing, the evaluation code may not have permission to access additional files beyond the uploaded predictions.json.zip. Extra files or a different directory layout may cause your submission to fail or receive no score.
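The packaging requirements above can be sketched with the Python standard library. The sketch assumes that predictions.json is a top-level JSON list with one entry per test instance; the files in the test_data directory remain the authoritative reference for the structure. The function name build_submission_zip is hypothetical.

```python
import io
import json
import zipfile

ALLOWED_TASK_TYPES = {"task1", "task2", "task3"}


def build_submission_zip(entries):
    """Return the bytes of a ZIP archive holding only predictions.json.

    Assumption: predictions.json is a JSON list of entries, each carrying a
    task_type field; check the released test data for the exact structure.
    """
    for index, entry in enumerate(entries):
        if entry.get("task_type") not in ALLOWED_TASK_TYPES:
            raise ValueError(f"entry {index} has a missing or invalid task_type")
    # ensure_ascii=False keeps Turkish characters readable in the file.
    payload = json.dumps(entries, ensure_ascii=False, indent=2)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # A bare file name places the file at the top level of the archive.
        archive.writestr("predictions.json", payload)
    return buffer.getvalue()
```

Writing the returned bytes to a file named predictions.json.zip, for example with pathlib.Path("predictions.json.zip").write_bytes(...), yields an archive with no nested folders and no extra files.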

Submission of Papers

TBA

Important Note on Models

Participants may use only pretrained models and resources whose weights are openly available for download at evaluation time. Use of models with closed or restricted weights (e.g., API‐only, gated by manual approval, paywalled checkpoints, or proprietary services) is not permitted. All model architectures are allowed.

Participants must clearly document all models and resources used in their system description papers, including:

model name and version/commit,
where the weights can be obtained (URL) and the license,
any additional resources, training or fine‐tuning data, and prompts.

Important Dates

Event	Date (AoE)
Task details and dev data release	October 27, 2025
Test data release / submissions open	November 8, 2025
Submission deadline	December 31, 2025
Evaluation completed	January 10, 2026
System paper deadline	January 10, 2026
Notification of acceptance	January 23, 2026
Camera-ready papers due	February 3, 2026

Organizers

Asst. Prof. Gözde Gül Şahin, Koç University
Ali Gebeşce, Koç University
Ege Uğur Amasya, Koç University

Contact

For any questions regarding the shared task, please contact: sigturk2026.sharedtask@gmail.com

Acknowledgements

This research is supported by the Wikimedia Foundation Research Fund (Grant No. G‐RS‐2402‐15231). We thank Zafer Batık and Başak Tosun of the Wikimedia Community User Group Turkey for introductions to the Turkish Wikipedia community and assistance with our inquiries regarding the Wikimedia Foundation and community; Kızıl of the Wikipedia Turkey Translators Group for connecting us with translators and demonstrating the translation workflow within Turkish Wikipedia; Prof. Bülent Sankur of terimler.org for insights on technical translations and for facilitating connections with academics who contributed to terminology decisions; and Gizem Ekiz for invaluable help organizing project events and coordinating communication among academics and Wikipedians.