22-25 September 2025, Las Palmas de Gran Canaria, Spain
Sign language recognition represents a formidable challenge in computer vision due to the complex spatiotemporal dynamics of gestures and the need for precise segmentation in continuous video streams. Many current methods struggle to reliably partition continuous signing into individual glosses or to map these segments into text. In this contest, participants will leverage a novel isolated Italian Sign Language dataset to train robust models. In isolated sign language recognition, each sign is presented individually, without context, similar to flashcards used for learning new words in a spoken language. In contrast, continuous sign language recognition mirrors real-life communication, where signs transition seamlessly from one to the next, forming structured expressions much like spoken sentences. The SignRec Contest is a competition among methods for sign language recognition using knowledge transfer. For the contest, we propose the use of a novel training set completely annotated with the corresponding transcriptions and glosses. The performance of the competing methods will be evaluated in terms of accuracy on a private test set composed of samples that are different from the ones available in the training set; in fact, the objective is to transfer learning from isolated training data to continuous test data.
The training set is a comprehensive collection of 15,058 videos featuring Italian Sign Language (LIS) signers. Each video is annotated with the following labels:
The Continuous LIS Dataset is a comprehensive collection of videos featuring Italian Sign Language (LIS) signers. Each video is annotated with the following labels:
Participants are provided with two folders containing isolated and continuous videos, along with a JSON file for each set that includes the labels for the samples. The dataset aims to support research in sign language recognition and encourages participants to enhance their models by incorporating additional publicly available samples or annotations.
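The exact schema of the annotation files is the one defined in the JSON distributed with the dataset; purely as an illustrative sketch, the snippet below shows how such labels could be loaded. The file name annotations.json, the folder layout, and the field names video_id and signs are assumptions, not the official schema.

import json
from pathlib import Path

# Hypothetical layout: one JSON file per set, mapping each video to its labels.
# The real file name and field names are those distributed with the dataset.
annotations_path = Path("isolated/annotations.json")  # assumed file name

with annotations_path.open(encoding="utf-8") as f:
    annotations = json.load(f)

# Assuming a list of per-video dictionaries, e.g. {"video_id": "...", "signs": [...]}
for entry in annotations:
    video_file = Path("isolated/videos") / entry["video_id"]  # assumed folder layout
    print(video_file, entry.get("signs"))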
To ensure a fair and rigorous assessment of the submitted models, the SignRec Contest employs multiple evaluation metrics, primarily focusing on Word Error Rate (WER), a standard measure for sequence-based recognition tasks. Sign Accuracy (SA) and Boundary Segmentation Score (BSS) are also introduced to evaluate different aspects of model performance.
Word Error Rate is the primary evaluation metric, quantifying the accuracy of the recognized sign sequences compared to the ground truth annotations. It is computed as:
$$\mathrm{WER} = \frac{S + D + I}{N}$$

Where:
$S$ is the number of substituted signs, $D$ is the number of deleted signs, $I$ is the number of inserted signs, and $N$ is the total number of signs in the ground-truth sequence.
Lower WER values indicate better recognition performance, as fewer errors are introduced in the predicted sign sequences.
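As an illustration of how WER can be computed for a single pair of sequences, the sketch below uses a standard dynamic-programming edit distance between the ground-truth and predicted gloss sequences; the function name and example values are illustrative, not part of the official evaluation code.

def wer(reference, hypothesis):
    """Word Error Rate between a ground-truth and a predicted gloss sequence."""
    n, m = len(reference), len(hypothesis)
    # d[i][j] = minimum number of substitutions, deletions and insertions needed
    # to turn the first i reference signs into the first j predicted signs.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # only deletions
    for j in range(m + 1):
        d[0][j] = j  # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[n][m] / max(n, 1)

# One substitution and one insertion over a 5-sign reference -> WER = 0.4
print(wer([1, 4, 6, 19, 5], [1, 4, 7, 19, 5, 3]))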
Sign Accuracy measures the percentage of correctly predicted signs within a sequence, providing a complementary metric to WER. It is defined as:
$$\mathrm{SA} = \frac{C}{N} \times 100\%$$

Here, $C$ represents the number of correctly predicted signs and $N$ is the total number of signs in the ground-truth sequence. Unlike WER, SA does not penalize insertions, making it helpful in evaluating raw recognition accuracy.
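As a sketch only, the snippet below counts $C$ as the number of matched signs in an alignment between the two sequences; how $C$ is obtained in the official scoring is defined by the organizers, so this counting strategy is an assumption.

from difflib import SequenceMatcher

def sign_accuracy(reference, hypothesis):
    """Percentage of ground-truth signs that are correctly recovered in the prediction."""
    # C is taken as the total size of the matching blocks of an alignment between
    # the two sequences; this is an assumption made only for illustration.
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * correct / max(len(reference), 1)

# The inserted sign 3 is ignored, so SA stays at 100.0 while WER would increase.
print(sign_accuracy([1, 4, 6, 19, 5], [1, 4, 6, 19, 5, 3]))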
To assess the ability of models to segment continuous signing into discrete glosses, we introduce the Boundary Segmentation Score (BSS), which measures the alignment of predicted sign boundaries with ground truth annotations. It is calculated based on precision and recall, evaluating how well the system detects the start and end points of individual signs in continuous sequences:
Where:
A higher BSS indicates better temporal segmentation performance, critical for accurate continuous sign recognition.
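The precise definition of BSS is given by the organizers; purely to illustrate the idea of boundary-level precision and recall, the sketch below matches predicted boundary frames to ground-truth boundaries within a fixed tolerance window and combines the two rates in an F1-style score. Both the tolerance value and the F1-style combination are assumptions made for this example.

def boundary_segmentation_score(pred_boundaries, true_boundaries, tolerance=5):
    """Illustrative boundary score based on precision and recall of boundary frames."""
    # A predicted boundary counts as correct if it lies within `tolerance` frames
    # of a still-unmatched ground-truth boundary (tolerance is an assumed parameter).
    matched = set()
    true_positives = 0
    for p in pred_boundaries:
        for i, t in enumerate(true_boundaries):
            if i not in matched and abs(p - t) <= tolerance:
                matched.add(i)
                true_positives += 1
                break
    precision = true_positives / max(len(pred_boundaries), 1)
    recall = true_positives / max(len(true_boundaries), 1)
    if precision + recall == 0:
        return 0.0
    # F1-style combination of precision and recall (an assumption for this sketch).
    return 2 * precision * recall / (precision + recall)

# Predicted vs annotated boundary frame indices of a continuous sequence.
print(boundary_segmentation_score([0, 31, 62, 95], [0, 30, 60, 90, 120]))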
The methods proposed will be executed on a private test set that is not made available to the participants. In this way, the evaluation is entirely fair and we ensure that there is no overlap between the training and the test samples. To leave the participants completely free to use the software libraries they prefer, and to correctly reproduce their processing pipeline, the evaluation will be done on Google Colab (follow this tutorial) by running the code submitted by the participants on the samples of our private test set.
Therefore, participants must submit an archive including the following elements:
test.py, which is used to process the annotations. The script takes as input the folder with the annotations, denoted as --data, and the folder containing the test videos, referred to as --video; it runs the recognition pipeline and generates a CSV file that contains the predicted signs for each video. Therefore, the script should be executed with the following command:
python test.py --data foo_test/ --video foo_test/ --results foo_results.csv
test.ipynb, which includes the commands for installing all software requirements and executes the script test.py.
All the files necessary for running the test, namely the trained model, additional scripts, and so on.
The test.py script should be implemented so that it reads the annotations JSON file and creates a file containing all the obtained results. An example of a dictionary from the JSON file might be: {"video_id": "date.mp4", "predicted_signs": [1, 4, 6, 19, 5]}. The results file must be formatted in the same manner as the original annotation file.
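As a purely illustrative aid, the following is a minimal sketch of what test.py could look like; the annotation layout, the field names, and the recognize_video placeholder are assumptions standing in for the official annotation schema and for the participant's own model.

# Minimal sketch of test.py; field names, annotation layout and recognize_video()
# are assumptions, not part of the official contest code.
import argparse
import csv
import json
from pathlib import Path

def recognize_video(video_path):
    """Placeholder for the participant's model: return the list of predicted sign IDs."""
    raise NotImplementedError("Load your trained model and run inference here.")

def main():
    parser = argparse.ArgumentParser(description="SignRec Contest test script (sketch)")
    parser.add_argument("--data", required=True, help="folder containing the annotation file")
    parser.add_argument("--video", required=True, help="folder containing the test videos")
    parser.add_argument("--results", required=True, help="path of the output CSV file")
    args = parser.parse_args()

    # Assumed layout: a single JSON file in --data listing one dictionary per video.
    annotation_file = next(Path(args.data).glob("*.json"))
    with annotation_file.open(encoding="utf-8") as f:
        annotations = json.load(f)

    with open(args.results, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["video_id", "predicted_signs"])
        for entry in annotations:
            video_path = Path(args.video) / entry["video_id"]
            predicted = recognize_video(video_path)
            # e.g. "date.mp4" -> [1, 4, 6, 19, 5]
            writer.writerow([entry["video_id"], " ".join(map(str, predicted))])

if __name__ == "__main__":
    main()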
The submission must be sent by email. The archive file can be attached to the email or shared via an external link.
The participants are strongly encouraged to submit a contest paper to CAIP 2025, whose deadline is 10 July 2025. The contest paper must also be sent by email to the organizers. If you submit a paper, you can cite the paper describing the contest by downloading the BibTeX file or as follows:
Assistant Professor
Department of Computer Science
University of Bari Aldo Moro, Italy
Full Professor
Department of Computer Science
University of Bari Aldo Moro, Italy