Transfer Learning Approach for Railway Technical Map (RTM) Component Identification

Read original: arXiv:2405.13229 - Published 5/24/2024 by Obadage Rochana Rumalshan, Pramuka Weerasinghe, Mohamed Shaheer, Prabhath Gunathilake, Erunika Dayaratna

Overview

  • Discusses a system to digitize and extract data from railway technical maps (RTMs) using deep learning and optical character recognition (OCR) techniques.
  • Compares performance of different object detection models, with Faster R-CNN yielding the highest mean Average Precision (mAP) and F1 score.
  • Demonstrates that pre-processing can improve OCR results for text-containing images.

Plain English Explanation

Railway transportation is widely used around the world, so efficient railway management systems matter. Many computer-aided-design RTMs currently exist only as PDF files. This research proposes a system that automatically extracts the relevant component data from an RTM image and writes it to one formatted text file per image.

The researchers tested several object detection models, including YOLOv3, SSD, and Faster R-CNN. Faster R-CNN performed the best, with the highest mean Average Precision (mAP) and F1 score. This means it was most accurate at detecting and identifying the relevant components in the RTM images.

Additionally, the researchers found that applying sophisticated pre-processing to the images before running OCR could further improve the text extraction results. This helps overcome issues like distortions in the original RTM images.

Technical Explanation

The authors propose a system that uses deep learning and OCR techniques to digitize and extract data from railway technical maps (RTMs). They compared the performance of three popular object detection models: YOLOv3, SSD, and Faster R-CNN.

Out of these, Faster R-CNN yielded the highest mean Average Precision (mAP) of 0.68 and the highest F1 score of 0.76. This indicates that Faster R-CNN was the most accurate at detecting and classifying the relevant components within the RTM images.
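
To make the F1 figure concrete, here is a minimal sketch of how precision, recall, and F1 can be computed for one image by matching predicted boxes to ground-truth boxes. The greedy matching rule, the 0.5 IoU threshold, and the sample boxes are assumptions for illustration; the summary does not state how the authors matched detections.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(preds: List[Box], truths: List[Box],
                        iou_thresh: float = 0.5) -> Tuple[float, float, float]:
    """Greedily match each prediction to the best unused ground-truth box."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_thresh:
            tp += 1
            unmatched.remove(best)  # each ground-truth box is matched once
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(unmatched)    # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical boxes: two real components, three detections (one spurious).
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60), (20, 20, 30, 31)]
p, r, f1 = precision_recall_f1(preds, truths)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.67 1.0 0.8
```

F1 is the harmonic mean of precision and recall, so one spurious detection lowers the score here even though every real component was found.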

Furthermore, the researchers demonstrated that applying a sophisticated pre-processing pipeline to the input images could improve the OCR results. This helped overcome issues like distortions in the original RTM images, leading to more accurate text extraction.
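
The summary does not list the steps in the authors' pre-processing pipeline. As one illustration of the kind of step such a pipeline often includes, the sketch below binarizes a grayscale text crop with Otsu's method before it would be handed to an OCR engine. The choice of Otsu thresholding, the function names, and the sample pixel values are assumptions, not the paper's method.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels in the range 0-255


def otsu_threshold(img: Image) -> int:
    """Return the threshold that maximises between-class variance."""
    hist = [0] * 256
    for row in img:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]        # number of pixels with value <= t
        sum_bg += t * hist[t]  # sum of those pixel values
        if w_bg == 0:
            continue           # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break              # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(img: Image) -> Image:
    """Map pixels above the Otsu threshold to 255 and the rest to 0."""
    t = otsu_threshold(img)
    return [[255 if px > t else 0 for px in row] for row in img]


# Hypothetical 2x3 crop: dark strokes (30-40) on a light background (200-220).
crop = [[30, 40, 200],
        [35, 210, 220]]
print(otsu_threshold(crop))  # 40
print(binarize(crop))        # [[0, 0, 255], [0, 255, 255]]
```

A real pipeline would likely add further steps, such as denoising or deskewing, before OCR; which steps help depends on the distortions present in the scanned maps.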

Critical Analysis

The paper presents a promising approach for digitizing railway technical maps, but does not address some potential limitations. For example, the performance of the system may degrade on RTMs with more complex layouts or specialized components not included in the training data.

Additionally, the paper does not explore the robustness of the system to variations in image quality, resolution, or other real-world factors that could affect its practical deployment. Further research would be needed to assess the system's generalizability and reliability in diverse, real-world scenarios.

Conclusion

This research demonstrates the feasibility of using deep learning and OCR techniques to automate the digitization of railway technical maps. By combining established object detection models with image pre-processing before OCR, the proposed system extracts relevant data from RTM images and converts it into a structured, machine-readable format. The best detector, Faster R-CNN, reached an mAP of 0.68 and an F1 score of 0.76.

This advancement could streamline railway management and planning workflows, as well as enable more efficient data analysis and decision-making processes. Further refinements and real-world evaluations would help solidify the practical applications of this technology in the railway industry.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

Transfer Learning Approach for Railway Technical Map (RTM) Component Identification

Obadage Rochana Rumalshan, Pramuka Weerasinghe, Mohamed Shaheer, Prabhath Gunathilake, Erunika Dayaratna

The popularity of railway transportation over the years makes it necessary to maintain efficient railway management systems around the globe. At present there exists a large collection of computer-aided-design Railway Technical Maps (RTMs), but they are available only in the portable document format (PDF). Using Deep Learning and Optical Character Recognition techniques, this research work proposes a generic system to digitize the relevant map component data from a given input image and create a formatted text file per image. Of the YOLOv3, SSD and Faster-RCNN object detection models used, Faster-RCNN yields the highest mean Average Precision (mAP) and the highest F1 score, 0.68 and 0.76 respectively. The results further show that OCR results improve when the text-containing image is passed through a sophisticated pre-processing pipeline to remove distortions.

5/24/2024

Ultra-Fast Adaptive Track Detection Network

Hai Ni, Rui Wang, Scarlett Liu

Railway detection is critical for the automation of railway systems. Existing models often prioritize either speed or accuracy, but achieving both remains a challenge. To address the limitations of presetting anchor groups that struggle with varying track proportions from different camera angles, an ultra-fast adaptive track detection network is proposed in this paper. This network comprises a backbone network and two specialized branches (Horizontal Coordinate Locator and Perspective Identifier). The Perspective Identifier selects the suitable anchor group from preset anchor groups, thereby determining the row coordinates of the railway track. Subsequently, the Horizontal Coordinate Locator provides row classification results based on multiple preset anchor groups. Then, utilizing the results from the Perspective Identifier, it generates the column coordinates of the railway track. This network is evaluated on multiple datasets, with the lightweight version achieving an F1 score of 98.68% on the SRail dataset and a detection rate of up to 473 FPS. Compared to the SOTA, the proposed model is competitive in both speed and accuracy. The dataset and code are available at https://github.com/idnihai/UFATD

5/24/2024

Machine Learning Models for Improved Tracking from Range-Doppler Map Images

Elizabeth Hou, Ross Greenwood, Piyush Kumar

Statistical tracking filters depend on accurate target measurements and uncertainty estimates for good tracking performance. In this work, we propose novel machine learning models for target detection and uncertainty estimation in range-Doppler map (RDM) images for Ground Moving Target Indicator (GMTI) radars. We show that by using the outputs of these models, we can significantly improve the performance of a multiple hypothesis tracker for complex multi-target air-to-ground tracking scenarios.

7/4/2024

Real-Time Detection and Analysis of Vehicles and Pedestrians using Deep Learning

Md Nahid Sadik, Tahmim Hossain, Faisal Sayeed

Computer vision, particularly vehicle and pedestrian identification, is critical to the evolution of autonomous driving, artificial intelligence, and video surveillance. Current traffic monitoring systems face major difficulty in recognizing small objects and pedestrians effectively in real time, posing a serious risk to public safety and contributing to traffic inefficiency. Recognizing these difficulties, our project focuses on the creation and validation of an advanced deep-learning framework capable of processing complex visual input for precise, real-time recognition of cars and people in a variety of environmental situations. On a dataset representing complicated urban settings, we trained and evaluated different versions of the YOLOv8 and RT-DETR models. The YOLOv8 Large version proved to be the most effective, especially in pedestrian recognition, with great precision and robustness. The results, which include Mean Average Precision and recall rates, demonstrate the model's ability to dramatically improve traffic monitoring and safety. This study makes an important addition to real-time, reliable detection in computer vision, establishing new benchmarks for traffic management systems.

4/15/2024