Applying the Kalman filter and Map-Matching algorithms to time-series GPS data, for preprocessing the Walkwise model training data.

hyuncat/pre-walkwise

Pre-Walkwise 🚶‍♂️

Scripts and web app to implement and visualize Kalman filtering and Map-Matching algorithms for GPS time-series data. For use in preprocessing the training data for the Walkwise model, for predicting pedestrian intent at intersections.

Installation notes

All non-map-matching features can be run just by installing package dependencies:

pip install -r requirements.txt

To set up the Docker image for Valhalla map matching, download the beijing-latest.osm.pbf file and run the server on port 8002.

For instance, I created a 'valhalla' directory in the project's root directory and ran:

docker run -dt --name valhalla_gis-ops -p 8002:8002 \
    -v $PWD/valhalla:/custom_files \
    -e serve_tiles=True \
    ghcr.io/gis-ops/docker-valhalla/valhalla:latest

The online documentation for setting up the Valhalla routing service is comprehensive, but contact me if you need help configuring it for this project. The current scripts send their requests to http://localhost:8002/trace_route, so if you want to run the server on a different port, change the port number in the MapMatch.py scripts.

Notebook scripts

The current Jupyter notebooks are in /notebooks. All code can be tested with the data in /notebooks/data; to run it on the full training data, download the all_plt_data file from the project Google Drive.

The notebooks cover the following workflows:

1. Kalman filtering GPS data

Our initial goal was to apply the Kalman filter, using the pykalman library, to smooth out some of the Gaussian noise in GPS data.

We found that it works well at reducing the random noise in the GPS data.

About the Kalman Filter

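The Kalman filter assumes the following linear-Gaussian model, where $x_t$ is the hidden state (the true position) and $z_t$ is the noisy GPS observation at time $t$: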

$$ \begin{align*} x_{t+1} &= A_{t} x_{t} + b_{t} + \text{Normal}(0, Q_{t}) \\ z_{t} &= C_{t} x_{t} + d_{t} + \text{Normal}(0, R_{t}) \end{align*} $$
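To make the two equations concrete, here is a minimal sketch that samples from a scalar version of this model using only the standard library. The function name `simulate`, its defaults, and the use of scalars instead of matrices are assumptions made for illustration; they are not taken from pykalman or from this repository.

```python
import random


def simulate(n_steps, a=1.0, b=0.0, c=1.0, d=0.0, q=0.0, r=0.0,
             mu0=0.0, sigma0=0.0, seed=0):
    """Sample states x_t and observations z_t from a scalar linear-Gaussian model.

    q, r, and sigma0 are variances, so the standard deviation passed to
    gauss() is their square root.
    """
    rng = random.Random(seed)
    x = rng.gauss(mu0, sigma0 ** 0.5)          # x_0 ~ Normal(mu0, sigma0)
    states, observations = [], []
    for _ in range(n_steps):
        states.append(x)
        # z_t = C x_t + d + Normal(0, R)
        observations.append(c * x + d + rng.gauss(0.0, r ** 0.5))
        # x_{t+1} = A x_t + b + Normal(0, Q)
        x = a * x + b + rng.gauss(0.0, q ** 0.5)
    return states, observations
```

With `q` and `r` left at zero the model is deterministic, which is a quick way to check the roles of `a`, `b`, `c`, and `d` before adding noise.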

Within pykalman, we rely on two methods:

The Kalman smoother (kf.smooth) estimates the distribution of each hidden state $x_t=(\text{lat}, \text{long})$ given all the observations from $0$ to $T-1$, not just those up to time $t$.

  • $P(x_t | z_{0:T-1})$
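As a rough illustration of what a smoother computes, the sketch below runs a forward Kalman filter pass followed by a backward Rauch-Tung-Striebel pass on a scalar version of the model, which is a standard way to obtain $P(x_t | z_{0:T-1})$. The function name `kalman_smooth` and its default parameter values are assumptions chosen for this example; in the notebooks the matrix-valued case is handled by pykalman.

```python
def kalman_smooth(zs, a=1.0, b=0.0, c=1.0, d=0.0, q=1e-3, r=1.0,
                  mu0=0.0, p0=1.0):
    """Return smoothed means and variances for a scalar linear-Gaussian model."""
    pred_m, pred_p, filt_m, filt_p = [], [], [], []
    m, p = mu0, p0                        # prior on x_0
    for t, z in enumerate(zs):
        if t > 0:
            # Predict: push the previous filtered estimate through the dynamics.
            m, p = a * m + b, a * p * a + q
        pred_m.append(m)
        pred_p.append(p)
        # Update: correct the prediction with the observation z_t.
        gain = p * c / (c * p * c + r)
        m = m + gain * (z - (c * m + d))
        p = (1.0 - gain * c) * p
        filt_m.append(m)
        filt_p.append(p)

    # Backward pass: refine each estimate using information from later steps.
    sm_m, sm_p = filt_m[:], filt_p[:]
    for t in range(len(zs) - 2, -1, -1):
        g = filt_p[t] * a / pred_p[t + 1]
        sm_m[t] = filt_m[t] + g * (sm_m[t + 1] - pred_m[t + 1])
        sm_p[t] = filt_p[t] + g * (sm_p[t + 1] - pred_p[t + 1]) * g
    return sm_m, sm_p
```

For GPS data the state is two-dimensional, so the scalars become matrices, but the predict, update, and backward steps have the same shape.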

The Expectation-Maximization (EM) algorithm (kf.em) estimates the Kalman filter parameters from the data: it searches for the parameters under which the observed sequence, across all timesteps, has the greatest probability of occurring.

  • Given: $\theta = (A, b, C, d, Q, R, \mu_0, \Sigma_0)$
  • Want to find: $\max_{\theta} P(z_{0:T-1}; \theta)$

2. Time segmentation

When the GPS data contains large jumps, Kalman filtering tends to generate spurious points ("hallucinations") that try to fill in the gaps even where no movement was recorded. See the small red dots between groups below:

To avoid this, we filter the GPS data separately for each person and date, and only filter datapoints together if they are within 60 seconds (tunable) of each other. This seems to get rid of virtually all of the obvious hallucinations.
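The segmentation rule can be sketched as follows. This is a minimal standard-library version, not the notebook code: the function name `segment_track` and the record layout (dicts with hypothetical `person` and `time` keys) are assumptions made for the example.

```python
from datetime import datetime, timedelta
from itertools import groupby


def segment_track(points, max_gap=timedelta(seconds=60)):
    """Split GPS records into segments that should be filtered separately.

    A new segment starts whenever the person or calendar date changes, or
    when the gap to the previous point exceeds max_gap.
    """
    # Sorting by (person, time) makes each person/date group contiguous.
    ordered = sorted(points, key=lambda p: (p["person"], p["time"]))
    segments = []
    for _, group in groupby(ordered, key=lambda p: (p["person"], p["time"].date())):
        current = []
        for point in group:
            if current and point["time"] - current[-1]["time"] > max_gap:
                segments.append(current)
                current = []
            current.append(point)
        segments.append(current)
    return segments
```

Each returned segment can then be passed to the Kalman filter on its own, so the filter never has to bridge a long gap.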

3. Map matching

We currently host Valhalla's Meili map-matching service to 'snap' our GPS traces to the street grid, with pretty good results. Parameters of interest include:

  • Search radius - Maximum radius to search for roads around each supplied coordinate
  • GPS accuracy - The GPS accuracy (measurement error) assumed for the input points
  • Breakage distance - Maximum distance between two consecutive points before the trace is broken there and the pieces are matched separately
  • Interpolation distance - Max distance, beyond which trace points are merged together

[Image: map_matching]
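A request to the local service that sets these parameters might look like the sketch below. It is not the contents of MapMatch.py: the helper names `build_trace_request` and `match_trace`, the `pedestrian` costing, the `map_snap` shape match, and the numeric values are all assumptions chosen for illustration, so check them against the Valhalla documentation and your own tuning.

```python
import json
from urllib import request

# URL the README assumes; change the port here if the server runs elsewhere.
VALHALLA_URL = "http://localhost:8002/trace_route"


def build_trace_request(points, search_radius=50, gps_accuracy=5.0,
                        breakage_distance=2000, interpolation_distance=10):
    """Build a trace_route request body from (lat, lon) pairs.

    All distances are in meters; the default values are illustrative only.
    """
    return {
        "shape": [{"lat": lat, "lon": lon} for lat, lon in points],
        "costing": "pedestrian",      # assumption: walking traces
        "shape_match": "map_snap",    # assumption: always run map matching
        "trace_options": {
            "search_radius": search_radius,
            "gps_accuracy": gps_accuracy,
            "breakage_distance": breakage_distance,
            "interpolation_distance": interpolation_distance,
        },
    }


def match_trace(points, url=VALHALLA_URL, **options):
    """POST the trace to a running Valhalla server and return the parsed JSON."""
    body = json.dumps(build_trace_request(points, **options)).encode("utf-8")
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `match_trace([(39.90, 116.40), (39.901, 116.401)], search_radius=25)` requires the Docker container from the installation notes to be running.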

4. Map visualization

There is also a fourth notebook type, map_visualization, which collects the workflows I tried while building an intuitive map visualization class. The current best version of that class is in notebooks/scripts/PlotMap.py, and sample usage can be found in the last few cells of map_visualization.ipynb and road_snap2.ipynb.

Explore the data in the flask app

You can visualize the data (either full or demo) in a simple Flask app to explore any given person's movement on every date for which they have data. To test this locally, run:

cd flask-app
python app.py

The app looks something like this:

[Image: prewalk_flask]

Please note that if you want to view the results of map matching, you will need to host the Meili match service on your own machine.
