
SignRec Contest 2025

21st International Conference on Computer Analysis of Images and Patterns - CAIP 2025

22-25 September 2025, Las Palmas de Gran Canaria, Spain


Contest

Sign language recognition represents a formidable challenge in computer vision due to the complex spatiotemporal dynamics of gestures and the need for precise segmentation in continuous video streams. Many current methods struggle to reliably partition continuous signing into individual glosses or to map these segments to text. In this contest, participants will leverage a novel isolated Italian Sign Language dataset to train robust models. In isolated sign language recognition, each sign is presented individually, without context, similar to flashcards used for learning new words in a spoken language. In contrast, continuous sign language recognition mirrors real-life communication, where signs transition seamlessly from one to the next, forming structured expressions much like spoken sentences. The SignRec Contest is a competition among methods for sign language recognition using knowledge transfer. For the contest, we propose the use of a novel training set fully annotated with the corresponding transcriptions and glosses. The performance of the competing methods will be evaluated in terms of accuracy on a private test set composed of videos that are different from those available in the training set; the objective, in fact, is to transfer learning from isolated training data to continuous test data.

Test Dataset Example

Dataset

Isolated Dataset

The training set is a comprehensive collection of 15,058 videos featuring Italian Sign Language (LIS) signers. Each video is annotated with the following labels:

  • Text: The corresponding word in Italian Sign Language (LIS).
  • Category: The semantic category of the word (e.g., sports, language, baby signs, etc.).
  • Gloss: The gloss of the word in Italian, generated using a semi-supervised method.
  • Frame Count: The total number of frames in the video.
  • Video Path: The file path to the video.
  • Keypoints Path: The path to the .pkl file of keypoints extracted with state-of-the-art human pose estimation models.

Participants are provided with two folders containing isolated and continuous videos, along with a JSON file for each set that includes the labels for the samples. The dataset aims to support research in sign language recognition and encourages participants to enhance their models by incorporating additional publicly available samples or annotations.
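
For illustration only, a single entry of the isolated-set annotation JSON might look like the Python dictionary below; the field names are assumptions based on the labels listed above, not the official schema released with the dataset.

    import json

    # Hypothetical structure of one isolated-video annotation entry;
    # field names are illustrative and may differ in the released JSON file.
    example_entry = {
        "text": "casa",                              # Italian word for the sign
        "category": "baby signs",                    # semantic category
        "gloss": "CASA",                             # gloss generated semi-automatically
        "frame_count": 87,                           # total number of frames
        "video_path": "isolated/videos/00042.mp4",   # path to the video
        "keypoints_path": "isolated/keypoints/00042.pkl",  # path to the extracted keypoints
    }

    # The released annotations would then be loaded with a plain JSON read:
    # with open("isolated_annotations.json") as f:
    #     annotations = json.load(f)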

Continuous Dataset

The Continuous LIS Dataset is a comprehensive collection of videos featuring Italian Sign Language (LIS) signers. Each video is annotated with the following labels:

  • CSV Transcription Path: The file path to the CSV transcription of the video.
  • Video Path: The file path to the video.


Training Dataset Examples

Evaluation Protocol

To ensure a fair and rigorous assessment of the submitted models, the SignRec Contest employs multiple evaluation metrics, primarily focusing on Word Error Rate (WER), a standard measure for sequence-based recognition tasks. Sign Accuracy (SA) and Boundary Segmentation Score (BSS) are also introduced to evaluate different aspects of model performance.

Word Error Rate (WER)

Word Error Rate is the primary evaluation metric, quantifying the accuracy of the recognized sign sequences compared to the ground truth annotations. It is computed as:

$$\text{WER} = \frac{S + D + I}{N} = \frac{S + D + I}{S + D + C}$$

Where:

  • $S$: Number of substitutions (incorrectly recognized signs).
  • $D$: Number of deletions (missed signs).
  • $I$: Number of insertions (extra signs detected).
  • $C$: Number of correctly recognized signs.
  • $N$: Total number of signs in the ground truth sentence ($N = S + D + C$).

Lower WER values indicate better recognition performance, as fewer errors are introduced in the predicted sign sequences.
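
As a minimal sketch (not the official evaluation script), the error counts and WER for a pair of gloss sequences can be obtained with a standard Levenshtein alignment; the glosses used in the example are purely illustrative.

    # Minimal WER sketch: align a reference and a hypothesis gloss sequence
    # with dynamic programming and count substitutions (S), deletions (D),
    # insertions (I), and correct signs (C).

    def wer_counts(reference, hypothesis):
        """Return (S, D, I, C) for two sequences of glosses (strings or IDs)."""
        n, m = len(reference), len(hypothesis)
        # dp[i][j] = (cost, S, D, I) for aligning reference[:i] with hypothesis[:j]
        dp = [[None] * (m + 1) for _ in range(n + 1)]
        dp[0][0] = (0, 0, 0, 0)
        for i in range(1, n + 1):
            dp[i][0] = (i, 0, i, 0)      # only deletions
        for j in range(1, m + 1):
            dp[0][j] = (j, 0, 0, j)      # only insertions
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if reference[i - 1] == hypothesis[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]                      # correct sign
                else:
                    sub = dp[i - 1][j - 1]
                    dele = dp[i - 1][j]
                    ins = dp[i][j - 1]
                    dp[i][j] = min(
                        (sub[0] + 1, sub[1] + 1, sub[2], sub[3]),      # substitution
                        (dele[0] + 1, dele[1], dele[2] + 1, dele[3]),  # deletion
                        (ins[0] + 1, ins[1], ins[2], ins[3] + 1),      # insertion
                    )
        cost, S, D, I = dp[n][m]
        C = n - S - D
        return S, D, I, C

    # Illustrative example (hypothetical glosses)
    S, D, I, C = wer_counts(["IO", "ANDARE", "CASA"], ["IO", "CASA", "DOMANI"])
    wer = (S + D + I) / (S + D + C)
    print(f"WER = {wer:.2f}")  # 0.67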

Sign Accuracy (SA)

Sign Accuracy measures the percentage of correctly predicted signs within a sequence, providing a complementary metric to WER. It is defined as:

$$\text{SA} = \frac{C}{N} \times 100\%$$

Here, $C$ represents the number of correctly predicted signs. Unlike WER, SA does not penalize insertions, making it helpful in evaluating raw recognition accuracy.
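
Continuing the hypothetical wer_counts() sketch above, SA follows directly from the same counts:

    # Sign Accuracy reuses the counts from the WER alignment; insertions do not appear.
    S, D, I, C = wer_counts(["IO", "ANDARE", "CASA"], ["IO", "CASA", "DOMANI"])
    sa = C / (S + D + C) * 100
    print(f"SA = {sa:.1f}%")  # 66.7%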

Boundary Segmentation Score (BSS)

To assess the ability of models to segment continuous signing into discrete glosses, we introduce the Boundary Segmentation Score (BSS), which measures the alignment of predicted sign boundaries with ground truth annotations. It is calculated based on precision and recall, evaluating how well the system detects the start and end points of individual signs in continuous sequences:

$$\text{BSS} = \frac{2 \times P \times R}{P + R}$$

Where:

  • $P$ (Precision): Measures how many predicted boundaries are correct.
  • $R$ (Recall): Measures how many actual boundaries were correctly detected.

A higher BSS indicates better temporal segmentation performance, critical for accurate continuous sign recognition.
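
The precise boundary-matching criterion is not specified here; the sketch below assumes a frame tolerance window when matching predicted boundaries to ground-truth ones, purely as an illustration.

    # Minimal BSS sketch: F1 score between predicted and ground-truth boundary
    # frame indices; the tolerance (in frames) is an assumption of this example.

    def boundary_f1(pred_boundaries, gt_boundaries, tolerance=2):
        matched_gt = set()
        tp = 0
        for p in pred_boundaries:
            # greedily match each prediction to the closest unmatched
            # ground-truth boundary within the tolerance window
            candidates = [(abs(p - g), i) for i, g in enumerate(gt_boundaries)
                          if i not in matched_gt and abs(p - g) <= tolerance]
            if candidates:
                _, best = min(candidates)
                matched_gt.add(best)
                tp += 1
        precision = tp / len(pred_boundaries) if pred_boundaries else 0.0
        recall = tp / len(gt_boundaries) if gt_boundaries else 0.0
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # 2 of 3 predictions fall within 2 frames of a ground-truth boundary
    print(boundary_f1([10, 31, 58], [12, 30, 55, 80]))  # P = 0.67, R = 0.50, BSS ≈ 0.57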

The method with the lowest Word Error Rate (WER) will be declared the winner of the SignRec 2025 Contest.

Rules

  1. The deadline for the submission of the methods is 15th June, 2025. The submission must be made by email, in which the participants share (directly or via external links) the trained model, the code, and a report. Please follow the detailed instructions reported here.
  2. The participants can receive the training set and the annotations by sending an email to the organizers, in which they also communicate the name of the team.
  3. The participants can use these training samples and annotations, but they can also use additional samples and/or add the missing labels, under the constraint that they make the additional samples and annotations publicly available.
  4. The participants are free to design novel architectures or to define novel training procedures and loss functions for classifiers or regressors.
  5. Participants must produce a brief PDF report of the proposed method, by following a template that can be downloaded here.
  6. The participants are strongly encouraged to submit a contest paper to CAIP 2025, whose deadline is 10th July, 2025. The contest paper must also be sent by email to the organizers. Otherwise, the participants must produce a brief PDF report of the proposed method, following the template that can be downloaded here. If you submit a paper, you can cite the paper describing the contest by downloading the bibtex file or as follows:
      Colonna E., Vessio G., Castellano G., "SignRec Contest 2025: Knowledge Transfer from Isolated to Continuous Sign Language", 21st International Conference on Computer Analysis of Images and Patterns, CAIP 2025

Instructions

The methods proposed will be executed on a private test set not made available to the participants. In this way, the evaluation is entirely fair and we ensure that there is no overlap between the training and the test samples. To leave the participants totally free to use all the software libraries they prefer and to correctly reproduce their processing pipeline, the evaluation will be done on Google Colab (follow this tutorial) by running the code submitted by the participants on the samples of our private test set.

Therefore, participants must submit an archive including the following elements:

      A Python script, named test.py, that processes the annotations, stored in CSV format. The script takes as input the annotations folder (--data) and the folder containing the test videos (--video), runs the recognition pipeline, and generates a CSV file containing the predicted signs for each video (--results). The script should therefore be executable with the following command:

      python test.py --data foo_test/ --video foo_test/ --results foo_results.csv

      A Google Colab notebook, test.ipynb, which includes the commands for installing all software requirements and executing the script test.py.

      All the files necessary for running the test, namely the trained model, additional scripts, and so on.

The test.py script should read the annotations JSON file and create a file containing all the obtained results. An example of a dictionary from the JSON file might be: {"video_id": "date.mp4", "predicted_signs": [1, 4, 6, 19, 5]}. The results file will be formatted in the same manner as the original annotation file.
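
As a rough starting point, and not an official template, a test.py skeleton compatible with the command above might look as follows; predict_signs() is a placeholder for the participant's own model, and the output layout should be adapted to match the organizers' annotation file.

    import argparse
    import csv
    import json
    import os

    def predict_signs(video_path):
        """Placeholder: run the trained model on one video and return the
        list of predicted sign IDs."""
        raise NotImplementedError

    def main():
        parser = argparse.ArgumentParser(description="SignRec 2025 test script")
        parser.add_argument("--data", required=True, help="folder with the annotations")
        parser.add_argument("--video", required=True, help="folder with the test videos")
        parser.add_argument("--results", required=True, help="output CSV file")
        args = parser.parse_args()

        rows = []
        for name in sorted(os.listdir(args.video)):
            if not name.endswith(".mp4"):
                continue
            preds = predict_signs(os.path.join(args.video, name))
            rows.append({"video_id": name, "predicted_signs": preds})

        # Write one row per video; the exact layout should mirror the
        # original annotation file, as required by the instructions.
        with open(args.results, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["video_id", "predicted_signs"])
            for row in rows:
                writer.writerow([row["video_id"], json.dumps(row["predicted_signs"])])

    if __name__ == "__main__":
        main()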

The submission must be done by email. The archive file can be attached to the e-mail or shared with external links.

The participants are strongly encouraged to submit a contest paper to CAIP 2025, whose deadline is 10th July, 2025. The contest paper must also be sent by email to the organizers. If you submit a paper, you can cite the paper describing the contest by downloading the bibtex file or by using the reference reported in the Rules section above.

Organizers

Organizer 1

Emanuele Colonna


PhD Student

Department of Computer Science

University of Bari Aldo Moro, Italy

LinkedIn Google Scholar
Organizer 2

Gennaro Vessio


Assistant Professor

Department of Computer Science

University of Bari Aldo Moro, Italy

LinkedIn Google Scholar
Organizer 3

Giovanna Castellano


Full Professor

Department of Computer Science

University of Bari Aldo Moro, Italy

LinkedIn Google Scholar

Contact