Modeling Large-Scale Walking and Cycling Networks: A Machine Learning Approach Using Mobile Phone and Crowdsourced Data

2404.00162

Published 4/4/2024 by Meead Saberi, Tanapon Lilasathapornkit

Abstract

Walking and cycling are known to bring substantial health, environmental, and economic advantages. However, the development of evidence-based active transportation planning and policies has been impeded by significant data limitations, such as biases in crowdsourced data and representativeness issues of mobile phone data. In this study, we develop and apply a machine learning based modeling approach for estimating daily walking and cycling volumes across a large-scale regional network in New South Wales, Australia that includes 188,999 walking links and 114,885 cycling links. The modeling methodology leverages crowdsourced and mobile phone data as well as a range of other datasets on population, land use, topography, climate, etc. The study discusses the unique challenges and limitations related to all three aspects of model training, testing, and inference given the large geographical extent of the modeled networks and relative scarcity of observed walking and cycling count data. The study also proposes a new technique to identify model estimate outliers and to mitigate their impact. Overall, the study provides a valuable resource for transportation modelers, policymakers and urban planners seeking to enhance active transportation infrastructure planning and policies with advanced emerging data-driven modeling methodologies.

Overview

• Walking and cycling have many benefits, but developing effective transportation policies has been challenging due to limitations in data.

• This study developed a machine learning approach to estimate daily walking and cycling volumes across a large transportation network in New South Wales, Australia.

• The approach leveraged various data sources, including crowdsourced and mobile phone data, to model a network of about 189,000 walking links and about 115,000 cycling links.

• The study discusses the challenges and limitations of model training, testing, and inference at this scale, and proposes a technique to identify outlier estimates and mitigate their impact.

Plain English Explanation

Getting people to walk and cycle more can provide significant benefits for public health, the environment, and the economy. However, transportation planners and policymakers have faced difficulties in developing effective strategies to promote active transportation. This is partly due to limitations in the data available to them.

For example, crowdsourced data from apps and websites may not represent the full population, and mobile phone data can have its own biases. To address these challenges, the researchers in this study used a machine learning approach to estimate how much walking and cycling is happening across a large transportation network in the Australian state of New South Wales.

The researchers combined information from various data sources, including crowdsourced and mobile phone data, as well as data on factors like population, land use, terrain, and weather. Using this combined data, they built a model that estimates the daily volumes of walking and cycling across about 189,000 walking links and about 115,000 cycling links in the network.

Developing such a large-scale model came with its own difficulties, which the researchers discuss in detail. They also propose a new technique to identify outliers in the model's estimates and mitigate their impact.

Overall, this study provides transportation planners and policymakers with a valuable new tool to help them better understand and plan for walking and cycling in their communities. By using advanced data-driven modeling, they can make more informed decisions about how to invest in and support active transportation infrastructure.

Technical Explanation

The researchers developed a machine learning-based approach to estimate daily walking and cycling volumes across a large-scale transportation network in New South Wales, Australia. The network comprised 188,999 walking links and 114,885 cycling links.

The modeling methodology leveraged a variety of data sources, including crowdsourced data and mobile phone data, as well as information on population, land use, topography, climate, and other relevant factors. The researchers faced unique challenges in model training, testing, and inference due to the large geographical extent of the network and the relative scarcity of observed walking and cycling count data.

To address these challenges, the researchers proposed a new technique to identify and mitigate the impact of model estimate outliers. This involved using a combination of statistical measures and expert domain knowledge to flag potentially problematic predictions and adjust them accordingly.
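This summary does not specify which statistical measures were used, so the following is only a minimal sketch of one common way to flag and cap outlying volume estimates. It assumes a robust z-score (median and median absolute deviation) on log-transformed estimates; the function name `flag_and_cap_outliers` and the 3.5 threshold are hypothetical choices for illustration, not the study's technique.

```python
import math
import statistics


def flag_and_cap_outliers(estimates, z_threshold=3.5):
    """Flag estimates whose log-volume is far from the median, then cap them.

    Hypothetical helper: the median/MAD rule and the default threshold are
    assumptions for illustration, not the method used in the study.
    """
    # Work on log(1 + volume) because link volumes are typically right-skewed.
    logs = [math.log1p(v) for v in estimates]
    median = statistics.median(logs)

    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(x - median) for x in logs])
    if mad == 0:
        # No measurable spread, so nothing can be flagged.
        return [False] * len(estimates), list(estimates)

    # 1.4826 * MAD approximates the standard deviation for normal data.
    scale = 1.4826 * mad
    lower = median - z_threshold * scale
    upper = median + z_threshold * scale

    flags = [x < lower or x > upper for x in logs]
    # Flagged values are clipped to the nearest bound; others are left untouched.
    capped = [
        math.expm1(min(max(x, lower), upper)) if flagged else v
        for v, x, flagged in zip(estimates, logs, flags)
    ]
    return flags, capped


if __name__ == "__main__":
    daily_volumes = [120, 95, 110, 130, 105, 98, 50000]
    flags, capped = flag_and_cap_outliers(daily_volumes)
    for original, flagged, adjusted in zip(daily_volumes, flags, capped):
        print(original, flagged, round(adjusted, 1))
```

In practice, flagged links would likely be reviewed with domain knowledge before any adjustment, since a busy link can legitimately carry far more traffic than its neighbours.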

The study provides transportation modelers, policymakers, and urban planners with a valuable resource for enhancing active transportation infrastructure planning and policies. By using advanced data-driven modeling, they can gain deeper insights into walking and cycling patterns across large-scale transportation networks, which can inform more effective strategies to promote active and sustainable mobility.

Critical Analysis

The researchers acknowledge several limitations and areas for further research in this study. For example, they note that the crowdsourced and mobile phone data used to train the model may still have inherent biases and representational issues, despite the efforts to combine multiple data sources.

Additionally, the researchers highlight the challenges of validating the model's predictions, as observed walking and cycling count data was relatively scarce across the large geographical extent of the network. While the proposed outlier detection technique helps to mitigate this, there may be room for improvement in the model's accuracy and reliability.
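To make the validation constraint concrete, the sketch below compares estimates with observed counts on the small subset of links that have a counter, and reports how little of the network that subset covers. The function name, the dictionary layout, and the choice of metrics (mean absolute error and median absolute percentage error) are assumptions for illustration; the summary does not state which metrics the study used.

```python
import statistics


def validate_against_counts(estimated, observed):
    """Compare estimates with observed counts on links that have a counter.

    estimated: dict mapping link id -> estimated daily volume (all links)
    observed:  dict mapping link id -> observed daily count (few links)
    Hypothetical helper; the metrics are illustrative choices.
    """
    shared = [link for link in observed if link in estimated]
    if not shared:
        raise ValueError("no link has both an estimate and an observed count")

    abs_errors = [abs(estimated[link] - observed[link]) for link in shared]
    # Percentage error is undefined where the observed count is zero.
    pct_errors = [
        abs(estimated[link] - observed[link]) / observed[link]
        for link in shared
        if observed[link] > 0
    ]
    return {
        "n_links": len(shared),
        # Share of modeled links that can be checked against a count.
        "coverage": len(shared) / len(estimated),
        "mae": statistics.mean(abs_errors),
        "median_ape": statistics.median(pct_errors) if pct_errors else None,
    }


if __name__ == "__main__":
    estimated = {"a": 100, "b": 250, "c": 40, "d": 900}
    observed = {"a": 80, "d": 1000}
    print(validate_against_counts(estimated, observed))
```

A low `coverage` value signals that the error metrics describe only the counted links, which is the limitation the researchers highlight.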

Further research could explore the integration of additional data sources, such as sensor-based counts or video analytics, to enhance the model's training and validation. Exploring the transferability of the modeling approach to other regions or contexts would also be a valuable area of investigation.

It is important to note that while this study provides a valuable framework for data-driven active transportation planning, the effectiveness of any resulting policies or infrastructure investments will ultimately depend on their acceptance and uptake by the local community. Engaging with stakeholders and incorporating their feedback will be crucial for ensuring the relevance and impact of such initiatives.

Conclusion

This study presents an innovative machine learning-based approach to estimate walking and cycling volumes across a large-scale transportation network. By leveraging a variety of data sources and addressing the unique challenges of large-scale modeling, the researchers have developed a valuable tool for transportation planners and policymakers.

The insights gained from this approach can help inform more effective strategies to promote active and sustainable mobility, with the potential to deliver significant health, environmental, and economic benefits. As transportation agencies continue to explore data-driven solutions, this study offers a compelling example of how advanced modeling techniques can be applied to enhance active transportation planning and decision-making.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Exploring Urban Mobility Trends using Cellular Network Data

Oluwaleke Yusuf, Adil Rasheed, Frank Lindseth

The growth of urban areas intensifies the need for sustainable, efficient transportation infrastructure and mobility systems, driving initiatives to enhance infrastructure and public transport while reducing congestion and emissions. By utilizing real-world mobility data, a data-driven approach can provide crucial insights for planning and decision-making. This study explores the efficacy of leveraging telecoms data from cellular network signals for studying crowd movement patterns, focusing on Trondheim, Norway. It examines routing reports to understand the spatiotemporal dynamics of various transportation routes and modes. A data preprocessing and feature engineering framework was developed to process raw routing reports for historical analysis. This enabled the examination of geospatial trends and temporal patterns, including a comparative analysis of various transportation modes, along with public transit usage. Specific routes and areas were analyzed in-depth to compare their mobility patterns with the broader city context. The study highlights the potential of cellular network data as a resource for shaping urban transportation and mobility systems. By identifying deficiencies and potential improvements, city planners and stakeholders can foster more sustainable and effective transportation solutions.

4/4/2024

From Counting Stations to City-Wide Estimates: Data-Driven Bicycle Volume Extrapolation

Silke K. Kaiser, Nadja Klein, Lynn H. Kaack

Shifting to cycling in urban areas reduces greenhouse gas emissions and improves public health. Street-level bicycle volume information would aid cities in planning targeted infrastructure improvements to encourage cycling and provide civil society with evidence to advocate for cyclists' needs. Yet, the data currently available to cities and citizens often only comes from sparsely located counting stations. This paper extrapolates bicycle volume beyond these few locations to estimate bicycle volume for the entire city of Berlin. We predict daily and average annual daily street-level bicycle volumes using machine-learning techniques and various public data sources. These include app-based crowdsourced data, infrastructure, bike-sharing, motorized traffic, socioeconomic indicators, weather, and holiday data. Our analysis reveals that the best-performing model is XGBoost, and crowdsourced cycling and infrastructure data are most important for the prediction. We further simulate how collecting short-term counts at predicted locations improves performance. By providing ten days of such sample counts for each predicted location to the model, we are able to halve the error and greatly reduce the variability in performance among predicted locations.

6/27/2024

Large Language Models for Mobility in Transportation Systems: A Survey on Forecasting Tasks

Zijian Zhang, Yujie Sun, Zepu Wang, Yuqi Nie, Xiaobo Ma, Peng Sun, Ruolin Li

Mobility analysis is a crucial element in the research area of transportation systems. Forecasting traffic information offers a viable solution to address the conflict between increasing transportation demands and the limitations of transportation infrastructure. Predicting human travel is significant in aiding various transportation and urban management tasks, such as taxi dispatch and urban planning. Machine learning and deep learning methods are favored for their flexibility and accuracy. Nowadays, with the advent of large language models (LLMs), many researchers have combined these models with previous techniques or applied LLMs to directly predict future traffic information and human travel behaviors. However, there is a lack of comprehensive studies on how LLMs can contribute to this field. This survey explores existing approaches using LLMs for mobility forecasting problems. We provide a literature review concerning the forecasting applications within transportation systems, elucidating how researchers utilize LLMs, showcasing recent state-of-the-art advancements, and identifying the challenges that must be overcome to fully leverage LLMs in this domain.

5/7/2024

Deep Activity Model: A Generative Approach for Human Mobility Pattern Synthesis

Xishun Liao, Brian Yueshuai He, Qinhua Jiang, Chenchen Kuai, Jiaqi Ma

Human mobility significantly impacts various aspects of society, including transportation, urban planning, and public health. The increasing availability of diverse mobility data and advancements in deep learning have revolutionized mobility modeling. Existing deep learning models, however, mainly study spatio-temporal patterns using trajectories and often fall short in capturing the underlying semantic interdependency among activities. Moreover, they are also constrained by the data source. These two factors thereby limit their realism and adaptability, respectively. Meanwhile, traditional activity-based models (ABMs) in transportation modeling rely on rigid assumptions and are costly and time-consuming to calibrate, making them difficult to adapt and scale to new regions, especially those regions with limited amount of required conventional travel data. To address these limitations, we develop a novel generative deep learning approach for human mobility modeling and synthesis, using ubiquitous and open-source data. Additionally, the model can be fine-tuned with local data, enabling adaptable and accurate representations of mobility patterns across different regions. The model is evaluated on a nationwide dataset of the United States, where it demonstrates superior performance in generating activity chains that closely follow ground truth distributions. Further tests using state- or city-specific datasets from California, Washington, and Mexico City confirm its transferability. This innovative approach offers substantial potential to advance mobility modeling research, especially in generating human activity chains as input for downstream activity-based mobility simulation models and providing enhanced tools for urban planners and policymakers.

5/29/2024