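PostgreSQL in Practice: Real-Time Location Tracking and Trajectory Analysis - Hundreds of Billions of Trajectory Points per Day on a Single Machine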

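Background

With the spread of mobile devices, more and more businesses have a spatiotemporal dimension. Express delivery, for example, tracks parcels and couriers in real time. Physical entities also carry spatial attributes.

Food delivery tracks the positions of its delivery staff, vehicles report their real-time positions, and so on.

Two major requirements stand out:

1. Real-time tracking of object positions, for example querying in real time the delivery staff near a given point or inside a given polygon.

2. Recording and analyzing object trajectories: analyzing trajectories against a map and, combined with routing algorithms, predicting or generating optimal paths.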

DEMO

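Take express delivery as an example. GPS devices report each courier's trajectory in real time. The data is written to the location tracking system, and the trajectory records are also stored permanently in the trajectory analysis system.

A courier may stay in one place for a long time during delivery (for example, while delivering inside a residential compound), so consecutive reported positions may barely differ. Considering the cost of database updates and how fresh the position needs to be, some of these updates can be skipped (for example, do not update when the previous and current positions are less than 50 meters apart).

Updating conditionally in this way reduces the number of database updates and raises overall throughput.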

Design

Real-time location updates

1. Create the table

create table t_pos (  
  uid int primary key,   -- object ID (sensor, courier, vehicle, ...)
  pos point,             -- position
  mod_time timestamp     -- last modified time
);  
  
create index idx_t_pos_1 on t_pos using gist (pos);
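
As a rough illustration of the first requirement (finding objects near a point or inside a polygon), the built-in point type could be queried as sketched below. The queries assume the t_pos table and GiST index defined above; the coordinates, radius, and limit are made-up values.

```sql
-- the 10 objects nearest to (5000, 5000);
-- ordering by <-> can be served by the GiST index on pos
select uid, pos, pos <-> point(5000, 5000) as dist
from t_pos
order by pos <-> point(5000, 5000)
limit 10;

-- objects within distance 50 of (5000, 5000)
select uid, pos
from t_pos
where pos <@ circle '((5000,5000),50)';

-- objects inside a polygon (illustrative square)
select uid, pos
from t_pos
where pos <@ polygon '((0,0),(0,100),(100,100),(100,0))';
```

Distances here are plain planar distances in whatever unit the coordinates use, which is why this form is only suitable for testing.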

In a real environment, we can use the PostGIS spatial database extension and store longitude/latitude points in the geometry data type.

create extension postgis;  
  
create table t_pos (  
  uid int primary key,   -- object ID (sensor, courier, vehicle, ...)
  pos geometry,          -- position
  mod_time timestamp     -- last modified time
);  
  
create index idx_t_pos_1 on t_pos using gist (pos);

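2. Report positions, updating the stored position automatically depending on how far the object has moved.

For example, do not update if the object has moved less than 50 meters.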

insert into t_pos values (?, st_setsrid(st_makepoint($lon, $lat), 4326), now())  -- st_makepoint takes (x, y), i.e. (longitude, latitude)
on conflict (uid)
do update set pos=excluded.pos, mod_time=excluded.mod_time
where st_distancespheroid(t_pos.pos, excluded.pos, 'SPHEROID["WGS84",6378137,298.257223563]') > ?;  -- update only when the object has moved more than this many meters
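
With positions stored as geometry, the real-time lookups described in the background could look roughly like the following. This is a minimal sketch assuming the PostGIS version of t_pos and its GiST index; the coordinates and the polygon are invented for illustration.

```sql
-- the 10 objects nearest to longitude 120.0, latitude 30.0
-- note: on geometry in SRID 4326, <-> orders by distance in degrees, not meters
select uid, st_astext(pos)
from t_pos
order by pos <-> st_setsrid(st_makepoint(120.0, 30.0), 4326)
limit 10;

-- objects inside a polygon (the ring must be closed: first point = last point)
select uid, st_astext(pos)
from t_pos
where st_contains(
  st_geomfromtext('POLYGON((120.0 30.0, 120.1 30.0, 120.1 30.1, 120.0 30.1, 120.0 30.0))', 4326),
  pos
);
```

If exact distances in meters matter, the nearest candidates returned by the first query can be re-ranked with st_distancespheroid, the same function used in the upsert above.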

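Storing historical trajectories

Terminals usually report data in batches, for example sending every 10 seconds the points collected during those 10 seconds. A single report may therefore contain several points, which PostgreSQL can store as arrays.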

create table t_pos_hist (  
  uid int,                  -- object ID (sensor, courier, vehicle, ...)
  pos point[],              -- positions reported in this batch
  crt_time timestamp[]      -- timestamps of the points in this batch
);

create index idx_t_pos_hist_uid on t_pos_hist (uid);                 -- object ID
create index idx_t_pos_hist_crt_time on t_pos_hist ((crt_time[1]));    -- index on the start time of each batch

If necessary, an extra time column can be stored and used for partitioning.
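
One way this could be laid out, assuming PostgreSQL 10 or later with declarative partitioning, is sketched below. The table name t_pos_hist_part, the column batch_time, and the daily partition boundaries are hypothetical and not part of the original design.

```sql
create table t_pos_hist_part (
  uid int,                        -- object ID
  pos point[],                    -- positions reported in this batch
  crt_time timestamp[],           -- timestamps of the points in this batch
  batch_time timestamp not null   -- hypothetical extra column: batch start time, used as the partition key
) partition by range (batch_time);

-- one partition per day (illustrative boundaries)
create table t_pos_hist_part_20180101 partition of t_pos_hist_part
  for values from ('2018-01-01') to ('2018-01-02');

-- indexes are created per partition here, which also works on PostgreSQL 10
create index on t_pos_hist_part_20180101 (uid);
```

Old trajectory data can then be dropped or archived one partition at a time instead of deleting rows.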

Historical trajectory analysis
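
A possible starting point for analysis is to expand the stored arrays back into one row per point. The sketch below assumes the t_pos_hist table defined earlier; the uid and the time range are example values.

```sql
-- expand each batch into (position, timestamp) rows for one object and one day;
-- unnest with two arrays pairs the elements position by position
select h.uid, p.pos, p.ts
from t_pos_hist h,
     unnest(h.pos, h.crt_time) as p(pos, ts)
where h.uid = 1
  and h.crt_time[1] >= '2018-01-01 00:00:00'
  and h.crt_time[1] <  '2018-01-02 00:00:00'
order by p.ts;
```

The filter on crt_time[1] matches the expression index on the batch start time, so only the relevant batches need to be expanded.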

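Benchmarking dynamic location updates

Write with merge (upsert): update the row only when the distance moved is greater than 50; otherwise leave it unchanged.

(Testing) With the point type, use the following SQL: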

insert into t_pos values (1, point(1,1), now())  
on conflict (uid)  
do update set pos=excluded.pos, mod_time=excluded.mod_time  
where t_pos.pos <-> excluded.pos > 50;

(Production) With the PostGIS geometry type, use the following SQL:

insert into t_pos values (1, st_setsrid(st_makepoint(120, 71), 4326), now())  
on conflict (uid)  
do update set pos=excluded.pos, mod_time=excluded.mod_time  
where st_distancespheroid(t_pos.pos, excluded.pos, 'SPHEROID["WGS84",6378137,298.257223563]') > 50;

Benchmark

First, generate 100 million random spatial objects.

postgres=# insert into t_pos select generate_series(1,100000000), point(random()*10000, random()*10000), now();  
INSERT 0 100000000  
Time: 250039.193 ms (04:10.039)

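The benchmark script is shown below. With 100 million spatial objects, it measures dynamic update performance (no update when the object has moved 50 or less).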

vi test.sql    
  
\set uid random(1,100000000)    
insert into t_pos    
select uid, point(pos[0]+random()*100-50, pos[1]+random()*100-50), now() from t_pos where uid=:uid   
on conflict (uid)   
do update set pos=excluded.pos, mod_time=excluded.mod_time   
where t_pos.pos <-> excluded.pos > 50;

Result: about 216,000 dynamic updates per second, or about 18.7 billion points per day.

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120   
  
number of transactions actually processed: 26014936
latency average = 0.295 ms
latency stddev = 0.163 ms
tps = 216767.645838 (including connections establishing)
tps = 216786.403543 (excluding connections establishing)
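
The per-day figure follows from the measured tps, assuming the rate is sustained for a full day (86,400 seconds). A quick check, with the tps rounded down to an integer:

```sql
-- cast to bigint because the product exceeds the 32-bit integer range
select 216767 * 86400::bigint as points_per_day;
-- 18728668800, i.e. about 18.7 billion
```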

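Benchmarking trajectory writes

Each UID writes 50 points per batch: the write rate is about 4.675 million points per second, or about 403.9 billion points per day.

The benchmark writes to multiple tables and uses dynamic SQL.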

do language plpgsql $$  
declare  
begin  
  for i in 0..127 loop  
    execute 'create table t_pos_hist'||i||' (like t_pos_hist including all)';  
  end loop;  
end;  
$$;

create or replace function import_test(int) returns void as $$  
declare  
begin  
  execute format('insert into t_pos_hist%s values (%s, %L, %L)', mod($1, 128), $1,   
  array[point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1)] ,  
  array['2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10', '2018-01-01 10:10:10']);  
end;  
$$ language plpgsql strict;

vi test1.sql  
  
\set uid random(1,100000000)  
select import_test(:uid);

pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 56 -j 56 -T 120   
  
  
number of transactions actually processed: 11220725  
latency average = 0.599 ms  
latency stddev = 5.452 ms  
tps = 93504.532256 (including connections establishing)  
tps = 93512.274135 (excluding connections establishing)
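
Each transaction above writes one batch of 50 points, so the point rates quoted for this benchmark can be reproduced from the tps as follows, again assuming the rate holds for 86,400 seconds and rounding the tps down to an integer:

```sql
select 93504 * 50                 as points_per_second,  -- 4675200, about 4.675 million
       93504 * 50 * 86400::bigint as points_per_day;     -- 403937280000, about 403.9 billion
```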