Create a new branch for the GIS based mode detection #712

Merged

Conversation

shankari
Contributor

@shankari shankari commented Aug 5, 2019

I had actually created this earlier but deleted it by mistake (!!??!!)
Fortunately, I had a copy that I can restore it from

shankari added 30 commits March 19, 2018 09:21
https://stackoverflow.com/questions/46364143/pass-arguments-to-python-from-bash-script

> @ardit, needs to be "$@", not bare $@, or you're splitting on spaces,
> expanding literal globs in names, and otherwise not passing the input exactly
> as it was received. – Charles Duffy Sep 22 '17 at 13:28
This fixes
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-376351382

After the fix,

```
2018-03-26 18:23:41,071:DEBUG:140735691387712:filter distance ends at index = 495 when len = 746, using index 1522113815.586377 ...
2018-03-26 18:23:41,071:DEBUG:140735691387712:for filter distance, startTs = 1519666023 and endTs = 1520449374
...
2018-03-26 18:23:47,680:DEBUG:140735691387712:transition_df =                            fmt_time  transition
0  2018-02-26T09:30:54.269049-08:00          14
1  2018-02-26T12:04:21.242699-08:00          13
2  2018-02-26T12:04:21.315481-08:00           8
3  2018-02-26T12:04:22.338651-08:00          10
4  2018-02-26T12:04:22.434913-08:00           2
....
2018-03-26 18:34:28,451:DEBUG:140735691387712:filter time ends at index = 745 when len = 746, using initEndTs 1522114461.070792 ...
2018-03-26 18:34:28,451:DEBUG:140735691387712:for filter time, startTs = 1520449374 and endTs = 1522114461
...
2018-03-26 18:34:28,585:DEBUG:140735691387712:transition_df =
  fmt_time  transition
0   2018-03-07T11:02:57.306000-08:00           1
1   2018-03-07T11:13:14.241000-08:00           2
2   2018-03-07T12:03:49.872000-08:00           1
3   2018-03-07T12:13:41.976000-08:00           2
4   2018-03-07T13:05:20.865000-08:00           1
5   2018-03-07T13:11:17.332000-08:00           2
6   2018-03-07T13:54:25.418000-08:00           1
7   2018-03-07T14:10:20.638000-08:00           2
8   2018-03-07T15:05:40.949000-08:00           1
9   2018-03-07T15:15:25.195000-08:00           2
10  2018-03-07T15:38:00.078000-08:00           1
11  2018-03-07T15:50:09.988000-08:00           2
12  2018-03-07T18:21:03.416000-08:00           1
13  2018-03-07T18:38:38.569000-08:00           2
14  2018-03-07T18:54:19.956000-08:00           1
15  2018-03-07T19:00:27.614000-08:00           2
...
```
We have always deleted out of order entries in the trip segmentation stage.
ea80978

We also recently started deleting spurious entries on iOS
(ef8e025)

This allows us to handle these bad points in one early stage instead of having
them cause problems for each step in the inference pipeline.

But now that we are moving into the murky area of inference, instead of the
clear one of out-of-order entries, we may not want to delete points any more.
Instead, we may want to simply mark them as invalid, so that we can mark them
as valid again if we want to go back to the original state.

This change marks them as invalid, filters out invalid points during searches,
and changes the deletion code to mark points invalid instead of deleting them.
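
A minimal sketch of the mark-instead-of-delete idea, using plain dicts in place of the real timeseries entries. The `invalid` field name and the three helper names are assumptions made for illustration; they are not taken from the server's schema or API.

```python
def mark_invalid(entries, bad_ids):
    """Flag the given entries as invalid in place instead of removing them."""
    for entry in entries:
        if entry["_id"] in bad_ids:
            entry["invalid"] = True


def find_valid(entries):
    """Search helper that skips any entry flagged as invalid."""
    return [e for e in entries if not e.get("invalid", False)]


def restore(entries, ids):
    """Undo the marking, which is not possible once a point is deleted."""
    for entry in entries:
        if entry["_id"] in ids:
            entry.pop("invalid", None)


# Hypothetical points; the second one is out of order.
points = [{"_id": 1, "ts": 10.0}, {"_id": 2, "ts": 5.0}, {"_id": 3, "ts": 20.0}]
mark_invalid(points, {2})
valid_ids = [e["_id"] for e in find_valid(points)]
```

The flagged point stays in `points`, so a later `restore(points, {2})` brings back the original state.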
So that we can distinguish them from real phone points
Now we only need to deal with two main issues:
- spurious one-flip walk sections (e.g.
2018-02-26T16:34:13.848574-08:00 2018-02-26T16:36:33.451954-08:00 MotionTypes.WALKING)
- getting the section end point right when there are no matching filtered location points
    (the resampled data can be off by a lot because of interpolation errors in
    the absence of nearby points)
Remaining fix is to correctly extrapolate locations for section start/end
so that we can:
- reuse it for the other segmentation
- make it more complex, as necessary
Stop using unfiltered locations right now
Filter out sections with zero location points even if they are not technically
a flip because they are too long.
Experiment with location reconstruction, but turn it off
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-377742975
non-motorized -> motorized, segment at the beginning
motorized -> non-motorized, segment at the end
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-377766351
- Move out the sanity checking assumptions about domain
- Add some more sanity checks, the ones around minimum duration seem
  particularly powerful
- Add speed based checks as well, using median speeds to avoid zig-zagging errors
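
To illustrate why the median is the more robust statistic here, the sketch below checks a section's speeds against a per-mode range. The speed ranges, the function names, and the sample data are all assumptions for illustration, not values from the commit.

```python
import statistics

# Hypothetical (min, max) speed ranges in m/s; not taken from the source.
SPEED_RANGE = {"WALKING": (0.0, 3.0), "BICYCLING": (1.0, 10.0)}


def median_speed(speeds):
    """Median of a non-empty list of point-to-point speeds."""
    return statistics.median(speeds)


def is_valid_speed(mode, speeds):
    """True if the median speed falls inside the range assumed for the mode."""
    low, high = SPEED_RANGE[mode]
    return low <= median_speed(speeds) <= high


# One zig-zag jump (25 m/s) in an otherwise ordinary walk.
speeds = [1.2, 1.3, 1.1, 25.0, 1.4]
walk_ok = is_valid_speed("WALKING", speeds)  # median is 1.3, so this passes
```

The mean of the same list is 6.0 m/s, which would fail the walking range because of a single bad point.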

https://github.com/e-mission/e-mission-server/issues/577#issuecomment-377866347
- support WALKING and ON_FOOT
- remove some hardcoded WALKING checks
- convert a bunch of `location_points` -> `self.location_points`
short drives: https://github.com/e-mission/e-mission-server/issues/577#issuecomment-378125143
fast bikes: https://github.com/e-mission/e-mission-server/issues/577#issuecomment-378129015

This is more complex than one would think because it requires us to retain the
mode of a merged fragment, so we had to change the merging code a bit
so that it can be reused by multiple mode inference modules
Instead of the motion activity. This makes it easier to reuse, for example, in
the mode inference code
And parse/interpret the results
The behavior of the integration can be controlled through the related
configuration files
Unlike the decision tree algorithm, this does not use machine learning.
Instead it uses a simple rule engine that uses the OSM integration from
733fc9a to distinguish between motorized modes

Minor changes:
- change clean_and_resample to squish the stop if it is greater than the stop
  radius that we plan to query for
GIS based mode inference was checked in to
733fc9a

This works much better than the decision tree, so switch to it for now.
Later, we could compare the results of the two algorithms to decide when to ask
the user for clarification, for example.
Use the same logic for merging from and to walking when we are `BICYCLING` as
for `IN_VEHICLE`.

Without this fix, the entire trip except the last couple of minutes was WALKING
Details at:
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-379496118

After this fix, the trips are

```
**********1 : 2018-04-06T12:00:19.510373-07:00 -> 2018-04-06T12:18:52.967424-07:00**********
2018-04-06T12:00:19.510373-07:00 2018-04-06T12:06:48.002566-07:00 MotionTypes.WALKING 0.999999886670494
2018-04-06T12:00:19.510373-07:00 2018-04-06T12:06:48.002566-07:00 PredictedModeTypes.WALKING 0.999999886670494

2018-04-06T12:07:21.999432-07:00 2018-04-06T12:07:43.998962-07:00 MotionTypes.RUNNING 1.1558113323084085
2018-04-06T12:07:21.999432-07:00 2018-04-06T12:07:43.998962-07:00 PredictedModeTypes.WALKING 1.1558113323084085

2018-04-06T12:08:46.998215-07:00 2018-04-06T12:18:52.967424-07:00 MotionTypes.WALKING 0.08121076478030438
2018-04-06T12:08:46.998215-07:00 2018-04-06T12:18:52.967424-07:00 PredictedModeTypes.WALKING 0.08121076478030438
```
See
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-379496118

After fixing this,

the sections are

```
**********1 : 2018-04-06T12:01:07.222872-07:00 -> 2018-04-06T12:18:52.967424-07:00**********
2018-04-06T12:01:07.222872-07:00 2018-04-06T12:18:52.967424-07:00 MotionTypes.WALKING 1.0122036638335292
2018-04-06T12:01:07.222872-07:00 2018-04-06T12:18:52.967424-07:00 PredictedModeTypes.WALKING 1.0122036638335292
```

And because the section was longer, we technically didn't even need
(0da819a) since the speed was already OK.

```
2018-04-07 14:03:19,703:DEBUG:140735495942976:Adding distance 365.9192626980049 to original 545.1524297095458 to extend section start from [-122.26297302190915, 37.87055511505241] to [-122.25908662369412, 37.87174562558791]
2018-04-07 14:03:19,705:DEBUG:140735495942976:After subtracting time 318.20676419911274 from original 747.537787914 to cover additional distance 365.9192626980049 at speed 1.1499418110076278, new_start_ts = 1523041267.22
```
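
The arithmetic in the second log line can be reproduced as follows. This is a sketch that assumes the time subtracted is simply the added distance divided by the speed; the function name is hypothetical.

```python
def extra_time_for(added_distance, speed):
    """Seconds needed to cover added_distance (m) at speed (m/s)."""
    return added_distance / speed


def extend_start(start_ts, added_distance, speed):
    """Move a section start earlier by the time needed to cover the gap."""
    return start_ts - extra_time_for(added_distance, speed)


# Values copied from the log lines above.
added_distance = 365.9192626980049
speed = 1.1499418110076278
extra_time = extra_time_for(added_distance, speed)  # about 318.2 seconds
```

This agrees with the logged `time 318.20676419911274`, which supports the assumption about how the extension is computed.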
shankari added 23 commits April 7, 2018 17:58
See 0da819a.
This covers a couple of use cases not covered in testing (e.g.
    mode is bicycling and even 0.9 * computed_median is not
    within the speed range for the mode)
For example, the OAK airport people mover is tagged MONORAIL
http://www.openstreetmap.org/relation/4740795#map=14/37.7328/-122.2042

that is not a pre-defined tag, but user-defined tags are supported
so life is complex
It doesn't hurt anything and it provides extra protection in case of platform changes.
Otherwise, I run into

```
Traceback (most recent call last):
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation.py", line 52, in segment_current_sections
    segment_trip_into_sections(user_id, trip_entry, trip_entry.data.source)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation.py", line 83, in segment_trip_into_sections
    segmentation_points = shcmsm.segment_into_sections(ts, distance_from_place, time_query)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/smoothed_high_confidence_motion.py", line 132, in segment_into_sections
    motion_changes = self.segment_into_motion_changes(timeseries, time_query)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/smoothed_high_confidence_motion.py", line 118, in segment_into_motion_changes
    smoothed_motion_list = ffd.FlipFlopDetection(motion_change_list, self).merge_flip_flop_sections()
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/flip_flop_detection.py", line 511, in merge_flip_flop_sections
    sm = self.should_merge(ss, se)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/flip_flop_detection.py", line 108, in should_merge
    cvft = self.check_valid_for_type(streak_start, streak_end)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/flip_flop_detection.py", line 201, in check_valid_for_type
    valid_for_type = valid_for_type and self.is_valid_for_type(mc)
  File "/code/e-mission-server/emission/analysis/intake/segmentation/section_segmentation_methods/flip_flop_detection.py", line 317, in is_valid_for_type
    ret_val = validity_check_map[mcs.type](mcs, mce)
KeyError: <MotionTypes.NONE: 9>
```
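
The traceback shows `validity_check_map[mcs.type]` raising `KeyError` for a motion type with no registered check. One defensive pattern is to fall back to a permissive default, sketched below. Whether the commit fixes it this way is not shown here; the enum is a stand-in (only `NONE = 9` comes from the traceback), and the check functions and the 60-second threshold are hypothetical.

```python
from enum import Enum


class MotionTypes(Enum):
    # Stand-in enum: only NONE = 9 is taken from the traceback above.
    WALKING = 7
    NONE = 9


def always_valid(start_ts, end_ts):
    """Default check for types that have no specific rule."""
    return True


def walking_valid(start_ts, end_ts):
    """Hypothetical rule: a walking streak must last at least 60 seconds."""
    return (end_ts - start_ts) >= 60


validity_check_map = {MotionTypes.WALKING: walking_valid}


def is_valid_for_type(motion_type, start_ts, end_ts):
    # dict.get with a default avoids the KeyError for unmapped types.
    check = validity_check_map.get(motion_type, always_valid)
    return check(start_ts, end_ts)
```

With this lookup, `is_valid_for_type(MotionTypes.NONE, 0, 10)` returns `True` instead of raising.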
Since we now merge bike trips backward at the beginning.
With this fix, the trip to Berkeley on iOS is again correct

```
**********0 : 2018-02-26T09:27:03-08:00 -> 2018-02-26T12:02:46.000041-08:00**********
2018-02-26T09:27:03-08:00 2018-02-26T09:32:32.000052-08:00 MotionTypes.WALKING 2.493409531761379
2018-02-26T09:27:03-08:00 2018-02-26T09:32:32.000052-08:00 PredictedModeTypes.BICYCLING 2.493409531761379

2018-02-26T09:36:59.697741-08:00 2018-02-26T10:22:04.000002-08:00 MotionTypes.IN_VEHICLE 6.302443355537658
2018-02-26T09:36:59.697741-08:00 2018-02-26T10:22:04.000002-08:00 PredictedModeTypes.TRAIN 6.302443355537658

2018-02-26T10:25:03.052444-08:00 2018-02-26T11:30:42.848401-08:00 MotionTypes.IN_VEHICLE 13.092253711686567
2018-02-26T10:25:03.052444-08:00 2018-02-26T11:30:42.848401-08:00 PredictedModeTypes.TRAIN 13.092253711686567

2018-02-26T11:30:42.848401-08:00 2018-02-26T12:02:46.000041-08:00 MotionTypes.WALKING 0.31497006975417674
2018-02-26T11:30:42.848401-08:00 2018-02-26T12:02:46.000041-08:00 PredictedModeTypes.WALKING 0.31497006975417674
```

Also handle the case in which a section has no points and is also at the start
or end of a trip. Unsure how we get into that situation, but we ran into this
while running the pipeline for the `test_kyle` user and are trying to fix it now.
This is a continuation of c227c24,
but across all modes once you join the project.
if the segmentation point is much closer to one section end than the other
verified all dataset tests, everything works except trip to consulate as CAR
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-379582029
That don't have actual routes but instead, `route_refs` in the tags.
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-435186134

We create routes from the route_refs, if present, and use our prior route
matching algorithm
Not all UNKNOWN sections have no points.
If they do have points, use the speeds as a first check before the overall speed check
https://github.com/e-mission/e-mission-server/issues/577#issuecomment-435243205
To make it easier to test mode inference without redoing segmentation
Instead of 100 kmph

This fixes the AIR_OR_HSR issue from
e-mission/e-mission-docs#322
We had some checks in place to handle false positives, but they were
insufficient for the French data. And there is a better check we can put in
place, so let's do it.

This fixes the BUS issue from e-mission/e-mission-docs#322
BUT with the fix, the trip is classified as UNKNOWN instead of CAR due to
sensing issues detailed in
e-mission/e-mission-docs#322 (comment)
Merge the one-line scalability fix
Merging some minor fixes that the tripaware project encountered
- added `aerialway` to the list of train modes
- fixed assert display message
- Backoff and retry if we get throttled at the overpass side
- Improved logging for matching routes
- add new import statements to the reset code
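
The backoff-and-retry item above can be sketched generically as below. This is not the code from the commit: the `Throttled` exception, the function name, and the delay parameters are assumptions, and the request is passed in as a callable so the example runs without network access.

```python
import time


class Throttled(Exception):
    """Hypothetical signal that the remote service asked us to slow down."""


def with_backoff(request_fn, max_tries=4, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, doubling the wait after each throttled attempt."""
    for attempt in range(max_tries):
        try:
            return request_fn()
        except Throttled:
            if attempt == max_tries - 1:
                raise  # give up after the last allowed attempt
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.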

In addition,
- changes to: `bin/analysis/remove_inferred_modes.py`
- the check for `eaid.is_bicycling_speed`
had already been made

Also ignored the changes to the pipeline to comment out the habitica integration
…into ground_truth_matching

Resolved conflict by retaining both types of objects
@shankari shankari changed the base branch from master to gis-based-mode-detection August 5, 2019 23:43
@shankari shankari merged commit 0441401 into e-mission:gis-based-mode-detection Aug 5, 2019
@shankari shankari deleted the gis-based-mode-detection branch October 26, 2020 18:03
@shankari shankari restored the gis-based-mode-detection branch October 26, 2020 18:03
jf87 pushed a commit to jf87/e-mission-server that referenced this pull request Jun 21, 2021
Create a new branch for the GIS based mode detection