1. Introduction
In an urban transportation system, a traveler setting out from an origin to a destination usually has several routes to choose from. For urban public transport, it is therefore very important to predict bus arrival times reasonably and accurately. The travel time of city buses varies markedly with the time of day: travel times during the morning and evening rush hours differ from those in off-peak periods. Problems such as uneven bus arrival times, reduced reliability, long waiting times, and large differences in passenger load reduce passengers' willingness to use public transportation. Several typical methods for predicting travel time exist, including time series prediction, prediction based on queuing theory, Kalman filter prediction, non-parametric regression, neural network prediction, and combined prediction methods.
Peng Xinjian and Weng Xiaoxiong used a firefly optimization algorithm combined with a BP neural network to predict bus travel time. Yang Zhaosheng proposed a real-time road travel time estimation model based on a fuzzy neural network (FNN). Mei Chen used AVL data to establish a Kalman filter prediction model for bus arrival time. Peng proposed a relevance vector machine prediction algorithm based on Bayesian probability, which yields both a predicted arrival time and its error variance; using measured traffic data, the model can predict in real time the travel time of public transport vehicles on upcoming road sections. In current research, support vector machine models are mostly used in studies of driving decisions and driving behavior, and applications to travel time prediction are few. This paper therefore uses support vector machine theory to study the prediction of bus travel time between stops.
2. Travel Time Prediction Methods
Bus operation is disturbed by many random factors, such as weather, traffic congestion and changes in passenger flow, so it is very difficult to predict vehicle arrival times accurately. Domestic and foreign scholars have done a great deal of research on this problem, and the main methods are time series models, Kalman filtering and artificial neural networks. (1) The time series model relies mainly on the similarity between future information and historical information. When the average behavior of the historical data changes, the prediction deviates noticeably, and the model also shows an obvious lag in real-time prediction. (2) Kalman filtering arose from the introduction of the state-space concept in modern control theory. It has been applied to predict short-term traffic demand and travel time on expressways. (3) A neural network is a model that simulates the function of the human brain's nervous system by modeling and connecting neurons, the basic units of the brain, producing an artificial system with intelligent information processing functions such as learning, association, memory and pattern recognition. In recent years, a new machine learning method, the support vector machine (SVM), has emerged. It studies how to extract, from limited observation data (samples), regularities that cannot be obtained by analysis from first principles, and then uses these regularities to analyze the objects under study and to predict unknown data or new phenomena that cannot be observed directly. It has a strong learning ability, its generalization ability is generally regarded as superior to that of neural networks, and it makes it easy to balance goodness of fit against generalization.
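As a rough illustration of method (2), the following minimal sketch applies a scalar Kalman filter to a sequence of observed travel times. It assumes a simple random-walk state model; the function name, the variance settings and the sample values are all hypothetical and are not taken from the studies cited here.

```python
def kalman_filter_1d(measurements, process_var, measurement_var,
                     initial_estimate, initial_var):
    """Scalar Kalman filter under an assumed random-walk state model."""
    estimate, var = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Predict step: the state is assumed unchanged, uncertainty grows.
        var = var + process_var
        # Update step: blend the prediction with the new measurement.
        gain = var / (var + measurement_var)
        estimate = estimate + gain * (z - estimate)
        var = (1.0 - gain) * var
        estimates.append(estimate)
    return estimates


# Made-up section travel times in seconds, for illustration only.
observed = [180.0, 195.0, 210.0, 205.0, 190.0]
smoothed = kalman_filter_1d(observed, process_var=25.0, measurement_var=100.0,
                            initial_estimate=180.0, initial_var=100.0)
```

Each filtered value is a weighted compromise between the previous estimate and the newest measurement, with the gain determined by the two variances.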
This paper applies the support vector machine (SVM) to predict the travel time of the road section between stations 11 and 12 of No. 6 Road in the Qingdao Economic and Technological Development Zone. For the prediction, the travel times between stations 11 and 12 are screened and computed from the given 7 days of data and organized by time period. Taking into account the section travel time in each period on the first four working days, and its influence on the section travel time on Friday, a support vector machine model is established to predict the section travel time on the fifth working day. The measured and predicted values for the fifth working day are then compared to verify the model.
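One possible way to arrange such data into samples is sketched below. The text does not spell out the exact feature layout, so this arrangement (period index plus the four earlier days' travel times as inputs, Friday's travel time as the target) is an assumption, and every number and name in it is made up for illustration.

```python
# Hypothetical travel times (seconds) between stations 11 and 12:
# one list per working day, one entry per time period. Not real data.
travel_times = {
    "Mon": [150, 210, 170, 160],
    "Tue": [155, 220, 165, 158],
    "Wed": [148, 215, 172, 162],
    "Thu": [152, 225, 168, 159],
    "Fri": [160, 230, 175, 165],
}
history_days = ["Mon", "Tue", "Wed", "Thu"]
target_day = "Fri"


def build_samples(times, history, target):
    """Build one sample per time period (assumed layout, see lead-in)."""
    features, targets = [], []
    for period in range(len(times[target])):
        # Inputs: period index followed by that period's time on each earlier day.
        row = [period] + [times[day][period] for day in history]
        features.append(row)
        # Output: the measured travel time for the same period on the target day.
        targets.append(times[target][period])
    return features, targets


X, y = build_samples(travel_times, history_days, target_day)
```

The resulting `X` and `y` could then be passed to any regression model; the measured Friday values in `y` are what the predictions would be compared against.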
3. Support Vector Machine Theory
The support vector machine (SVM), which is based on statistical learning theory, was first proposed by Vapnik in the 1990s. In recent years, breakthroughs have been made in both its theory and its algorithmic implementation, and it has become a powerful means of overcoming traditional difficulties such as nonlinearity, the curse of dimensionality, over-learning and local minima. Its goal is to minimize the structural risk. Whereas minimizing the empirical risk is justified only when the sample is infinitely large, the structural risk strikes a compromise between the empirical risk and the confidence range, which is more suitable for the case of limited samples. Having solved the linearly separable problem, SVM handles nonlinearly separable problems by first transforming the input space into a high-dimensional space through a nonlinear transformation defined by an inner product (kernel) function. The nonlinearly separable problem thereby becomes a linearly separable problem in this high-dimensional space, where the optimal classification surface or the generalized optimal classification surface is sought. With the generalized maximum-margin method or the generalized nearest-point bisection method, the problem can then be treated as a separable one.
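The idea that an inner product function defines the nonlinear transformation can be checked numerically. The sketch below uses a degree-2 polynomial kernel on 2-D inputs, chosen only because its explicit feature map is short; the text does not say which kernel is used, and the function names are hypothetical.

```python
import math


def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2


def feature_map(x):
    """Explicit map for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


x = (1.0, 2.0)
z = (3.0, 0.5)
# Kernel evaluated in the original 2-D input space.
kernel_value = poly_kernel(x, z)
# Inner product evaluated in the 3-D feature space after the explicit map.
explicit_value = dot(feature_map(x), feature_map(z))
```

Both values agree (up to rounding), which is why the high-dimensional mapping never has to be computed explicitly: evaluating the kernel in the input space is enough.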
In general, the relationship between vehicle running time and factors such as traffic conditions and weather is very complex and difficult to describe with a specific model. This paper therefore uses SVM to map the inputs (time period, weather) to the output (running time). Because SVM maps input data into a high-dimensional feature space by means of a kernel function, it is very effective for complex or nonlinear problems. In addition, training an SVM amounts to solving a convex quadratic optimization problem, which ensures that the extremum found is the global optimum, whereas some other nonlinear optimization methods easily fall into local minima.
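To make the convexity claim concrete, the standard textbook form of ε-insensitive support vector regression is written out below. The text does not give its own formulation, so the notation here is an assumption: \(x_i\) are the inputs, \(y_i\) the observed running times, \(\phi\) the kernel-induced feature map, \(C\) the penalty parameter, \(\varepsilon\) the tolerance, and \(\xi_i, \xi_i^*\) the slack variables.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
\text{s.t.}\quad & y_i - w^\top \phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^\top \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i \ge 0,\ \xi_i^* \ge 0, \quad i = 1, \dots, n.
\end{aligned}
```

The objective is a convex quadratic function and all constraints are linear, so every local minimum of this problem is also a global minimum.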