A Solution to pbjs Failing to Encode bytes Fields

Source: https://www.cnblogs.com/goodcitizen/archive/2023/09/25/solution_about_pbjs_encode_bytes_data_failed_problem.html

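A protobuf binary message containing bytes fields is decoded by pbjs into a JSON file. Feeding that JSON back to pbjs for encoding produces binary data that differs wildly from the original. After some digging, this turned out to be a bug in pbjs. Read on to see whether you have hit the same pitfall.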


Background

I previously wrote a post, "Sending and Receiving protobuf Messages with Scripts", which showed that the pbjs command can convert protobuf binary data to JSON:

> pbjs msg.proto --decode ProbeIpv6Response < response.bin
{
  "selfAddr": {
    "addrV6": "2409:8900:7900:8f0d:ecd9:4aee:aa3:7ad",
    "portV6": 46066
  },
  "brosAddr": [
    {
      "addrV6": "2409:8a34:4405:6624:5250:9d04:cf77:d",
      "portV6": 18720
    },
    {
      "addrV6": "2409:8a34:401a:4151:59e6:69b4:37ad:dea2",
      "portV6": 18679
    },
    {
      "addrV6": "2409:8a20:2a02:20c0:7d11:9a6b:6b51:a9bb",
      "portV6": 18824
    },
    {
      "addrV6": "2409:8a20:e0d:7773:50d4:93b0:680a:b555",
      "portV6": 18968
    },
    {
      "addrV6": "2409:8a44:5b20:edf2:7c09:a5e1:cdbf:69c6",
      "portV6": 18008
    }
  ]
}

The reverse direction, encoding JSON back to binary, also works:

> pbjs msg.proto --encode ProbeIpv6Response < response.json > response2.bin
> xxd response2.bin
00000000: 122b 0a25 3234 3039 3a38 3930 303a 3739  .+.%2409:8900:79
00000010: 3030 3a38 6630 643a 6563 6439 3a34 6165  00:8f0d:ecd9:4ae
00000020: 653a 6161 333a 3761 6410 f2e7 021a 2a0a  e:aa3:7ad.....*.
00000030: 2432 3430 393a 3861 3334 3a34 3430 353a  $2409:8a34:4405:
00000040: 3636 3234 3a35 3235 303a 3964 3034 3a63  6624:5250:9d04:c
00000050: 6637 373a 6410 a092 011a 2d0a 2732 3430  f77:d.....-.'240
00000060: 393a 3861 3334 3a34 3031 613a 3431 3531  9:8a34:401a:4151
00000070: 3a35 3965 363a 3639 6234 3a33 3761 643a  :59e6:69b4:37ad:
00000080: 6465 6132 10f7 9101 1a2d 0a27 3234 3039  dea2.....-.'2409
00000090: 3a38 6132 303a 3261 3032 3a32 3063 303a  :8a20:2a02:20c0:
000000a0: 3764 3131 3a39 6136 623a 3662 3531 3a61  7d11:9a6b:6b51:a
000000b0: 3962 6210 8893 011a 2c0a 2632 3430 393a  9bb.....,.&2409:
000000c0: 3861 3230 3a65 3064 3a37 3737 333a 3530  8a20:e0d:7773:50
000000d0: 6434 3a39 3362 303a 3638 3061 3a62 3535  d4:93b0:680a:b55
000000e0: 3510 9894 011a 2d0a 2732 3430 393a 3861  5.....-.'2409:8a
000000f0: 3434 3a35 6232 303a 6564 6632 3a37 6330  44:5b20:edf2:7c0
00000100: 393a 6135 6531 3a63 6462 663a 3639 6336  9:a5e1:cdbf:69c6
00000110: 10d8 8c01

The re-encoded response2.bin is identical to the original response.bin.

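Later, however, when I encoded a different message type, the regenerated bin file differed substantially from the original, so pbjs could not be used to turn that JSON back into binary data.

Symptoms

To make the problem clear, here is the message definition first: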

message common
{
    required uint32 mem1 = 1;
    required uint32 mem2 = 2;
    required bytes  mem3 = 3;
    required uint32 mem4 = 4;
    required uint64 mem5 = 5;
    optional uint32 mem6 = 6;
    optional bytes  mem7 = 7;
    optional uint32 mem8 = 8;
    optional uint64 mem9 = 9;
}

message query_md5
{
    required common mema = 1;
    required uint32 memb = 2;
    required bytes  memc = 3;
    required uint32 memd = 4; 
    required uint64 meme = 5; 
    repeated bytes  memf = 6; 
}

For protocol confidentiality, all field names are replaced with memxx placeholders. Here is the raw data for this message:

> xxd tmp/resp.bin
0000000: 0a37 0802 10c3 8040 1a10 ba38 ba93 af7a  .7.....@...8...z
0000010: dae8 1967 2b89 ddd2 6b5c 200b 28b4 baba  ...g+...k\ .(...
0000020: a8b6 0130 003a 0a32 2e32 2e31 3031 2e32  ...0.:.2.2.101.2
0000030: 3740 0348 f0db 8883 0910 001a 1067 c607  7@.H.........g..
0000040: 215e 47ae 8925 272d 6da0 f602 2d20 0028  !^G..%'-m...- .(
0000050: a0cd c90a 3210 d15b f326 4708 bfc7 01e0  ....2..[.&G.....
0000060: 4b3d c624 38a3 3210 3195 44f3 2f32 1b96  K=.$8.2.1.D./2..
0000070: 7865 6b82 fdb8 9560 3210 9a75 1735 fcca  xek....`2..u.5..
0000080: e66f 7486 e9fa dc6a 9fab 3210 284c ebbf  .ot....j..2.(L..
0000090: 36e0 1d57 5ca6 93de 391b 7a7d 3210 3e0b  6..W\...9.z}2.>.
00000a0: 439c 62a5 a401 c3ff cf00 3299 bc7e 3210  C.b.......2..~2.
00000b0: f6b9 9746 9ce6 9555 52d3 f50b 6ca3 8eb1  ...F...UR...l...
00000c0: 3210 9852 e7f1 2530 cb6b 7aa0 5569 fbcd  2..R..%0.kz.Ui..
00000d0: 0a5c 3210 d333 33b1 d516 d868 3938 f307  .\2..33....h98..
00000e0: bffe d4c0 3210 a646 0cdf 2874 486a 0bc0  ....2..F..(tHj..
00000f0: edf1 6f51 b59e 3210 1eee e679 5bf1 0832  ..oQ..2....y[..2
0000100: d5a7 fc4f 60cf 48ab 3210 c446 9663 f6a4  ...O`.H.2..F.c..
0000110: 87cd fc3f d560 285c 0ea4                 ...?.`(\..

Decoding it with pbjs gives the following JSON:

> pbjs query_md5.proto --decode query_md5 < tmp/resp.bin > resp.json
> jq -c '.' resp.json
{"mema":{"mem1":2,"mem2":1048643,"mem3":{"type":"Buffer","data":[186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92]},"mem4":11,"mem5":{"low":1695456564,"high":11,"unsigned":true},"mem6":0,"mem7":{"type":"Buffer","data":[50,46,50,46,49,48,49,46,50,55]},"mem8":3,"mem9":{"low":-1872613904,"high":0,"unsigned":true}},"memb":0,"memc":{"type":"Buffer","data":[103,198,7,33,94,71,174,137,37,39,45,109,160,246,2,45]},"memd":0,"meme":{"low":22177440,"high":0,"unsigned":true},"memf":[{"type":"Buffer","data":[209,91,243,38,71,8,191,199,1,224,75,61,198,36,56,163]},{"type":"Buffer","data":[49,149,68,243,47,50,27,150,120,101,107,130,253,184,149,96]},{"type":"Buffer","data":[154,117,23,53,252,202,230,111,116,134,233,250,220,106,159,171]},{"type":"Buffer","data":[40,76,235,191,54,224,29,87,92,166,147,222,57,27,122,125]},{"type":"Buffer","data":[62,11,67,156,98,165,164,1,195,255,207,0,50,153,188,126]},{"type":"Buffer","data":[246,185,151,70,156,230,149,85,82,211,245,11,108,163,142,177]},{"type":"Buffer","data":[152,82,231,241,37,48,203,107,122,160,85,105,251,205,10,92]},{"type":"Buffer","data":[211,51,51,177,213,22,216,104,57,56,243,7,191,254,212,192]},{"type":"Buffer","data":[166,70,12,223,40,116,72,106,11,192,237,241,111,81,181,158]},{"type":"Buffer","data":[30,238,230,121,91,241,8,50,213,167,252,79,96,207,72,171]},{"type":"Buffer","data":[196,70,150,99,246,164,135,205,252,63,213,96,40,92,14,164]}]}

The output is long, so jq -c is used to print it on one line. Encoding the JSON again yields a bin file with the following content:

> pbjs query_md5.proto --encode query_md5 < resp.json > resp.bin
> xxd resp.bin
0000000: 0a08 0802 10c3 8040 1a00 1000 1a00       .......@......

The length alone (14 bytes against the original 282) shows that it is clearly different from the original.
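
The {"type":"Buffer","data":[...]} objects in resp.json are the shape that Node's Buffer produces when it is serialized with JSON.stringify, and a plain JSON.parse does not turn them back into Buffers. Whether this is what trips up the pbjs encoder is only an assumption at this point in the investigation. The sketch below merely demonstrates the round-trip behaviour, plus a hypothetical reviver (reviveBuffers is a name made up for illustration) that restores the bytes:

```javascript
// A bytes value as pbjs printed it for memc (first four bytes only).
const original = Buffer.from([103, 198, 7, 33]);

// JSON.stringify calls Buffer.prototype.toJSON, giving {type:"Buffer",data:[...]}.
const text = JSON.stringify({ memc: original });

// A plain JSON.parse yields an ordinary object, not a Buffer.
const plain = JSON.parse(text);

// Hypothetical reviver: rebuild a Buffer from the {type, data} shape.
function reviveBuffers(key, value) {
  if (value && value.type === "Buffer" && Array.isArray(value.data)) {
    return Buffer.from(value.data);
  }
  return value;
}
const revived = JSON.parse(text, reviveBuffers);

console.log(text);
console.log(Buffer.isBuffer(plain.memc), Buffer.isBuffer(revived.memc));
```

Any encoder that expects a byte array for a bytes field has to be given something like the revived value; the plain object carries the same numbers but is not a byte array.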

Initial Analysis

Since pbjs restored the binary data correctly before, the tool itself is probably mostly sound. Recall the format of the first message:

> cat msg.proto
message ProbeIpv6Request {
    string xxxxx     = 1;
    string xxxx      = 2;
    string xxxxxxxx  = 3;
    string xxxxxxx   = 4;
}

message V6AddrType {
    string addrV6 = 1;
    uint32 portV6 = 2;
}

message ProbeIpv6Response {
    string              xxxxx    = 1;
    V6AddrType          selfAddr = 2;
    repeated V6AddrType brosAddr = 3;
}

The main difference from the failing message is that the former uses string while the latter uses bytes.
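
On the wire, string and bytes fields are encoded identically: both use wire type 2 (length-delimited), so only the message definition decides how the payload is interpreted. The following minimal sketch walks the first three fields of the mema submessage from the hex dump of tmp/resp.bin shown earlier. readVarint and walk are illustrative helpers written for this example, and only wire types 0 and 2 are handled:

```javascript
// Read a base-128 varint starting at pos; returns [value, nextPos].
// Uses multiplication instead of << so values above 2^31 do not wrap.
function readVarint(buf, pos) {
  let value = 0;
  let scale = 1;
  for (;;) {
    const byte = buf[pos++];
    value += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) return [value, pos];
    scale *= 128;
  }
}

// List the fields in a message body as {field, wireType, value}.
function walk(buf) {
  const fields = [];
  let pos = 0;
  while (pos < buf.length) {
    let key;
    [key, pos] = readVarint(buf, pos);
    const field = Math.floor(key / 8); // upper bits: field number
    const wireType = key % 8;          // lower three bits: wire type
    if (wireType === 0) {
      let value;
      [value, pos] = readVarint(buf, pos);
      fields.push({ field, wireType, value });
    } else if (wireType === 2) {
      let length;
      [length, pos] = readVarint(buf, pos);
      fields.push({ field, wireType, value: buf.subarray(pos, pos + length) });
      pos += length;
    } else {
      throw new Error("wire type " + wireType + " is not handled in this sketch");
    }
  }
  return fields;
}

// Body of mema up to and including mem3, copied from the hex dump.
const body = Buffer.from("080210c380401a10ba38ba93af7adae819672b89ddd26b5c", "hex");
const fields = walk(body);
for (const f of fields) console.log(f);
```

Field 3 (mem3) comes out as wire type 2 with a 16-byte payload; nothing in the bytes themselves says whether that payload is text or raw binary.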

bytes vs string

Could the bytes type be the problem? Let's try replacing bytes with string in the second message:

message common
{
    required uint32 mem1 = 1;
    required uint32 mem2 = 2;
    required string mem3 = 3;
    required uint32 mem4 = 4;
    required uint64 mem5 = 5;
    optional uint32 mem6 = 6;
    optional string mem7 = 7;
    optional uint32 mem8 = 8;
    optional uint64 mem9 = 9;
}

message query_md5
{
    required common mema = 1;
    required uint32 memb = 2;
    required string memc = 3;
    required uint32 memd = 4; 
    required uint64 meme = 5; 
    repeated string memf = 6; 
}

Hoping that pbjs treats the two types compatibly, decode the binary data directly as string:

> pbjs query_md5.proto --decode query_md5 < tmp/resp.bin > resp.json
> cat resp.json
{
  "mema": {
    "mem1": 2,
    "mem2": 1048643,
    "mem3": "�8���z��\u0019g+���k\\",
    "mem4": 11,
    "mem5": {
      "low": 1695456564,
      "high": 11,
      "unsigned": true
    },
    "mem6": 0,
    "mem7": "2.2.101.27",
    "mem8": 3,
    "mem9": {
      "low": -1872613904,
      "high": 0,
      "unsigned": true
    }
  },
  "memb": 0,
  "memc": "g�\u0007!^G��%'-m��\u0002-",
  "memd": 0,
  "meme": {
    "low": 22177440,
    "high": 0,
    "unsigned": true
  },
  "memf": [
    "�[�&G\b��\u0001�K=�$8�",
    "1�D�/2\u001b�xek����`",
    "�u\u00175���ot����j��",
    "(L��6�\u001dW\\���9\u001bz}",
    ">\u000bC�b��\u0001���\u00002��~",
    "���F���UR��\u000bl���",
    "�R��%0�kz�Ui��\n\\",
    "�33��\u0016�h98�\u0007����",
    "�F\f�(tHj\u000b���oQ��",
    "\u001e��y[�\b2է�O`�H�",
    "�F�c�����?�`(\\\u000e�"
  ]
}

Ha, it actually decoded, although the former bytes fields came out garbled. Encoding it back unchanged should work, right?

> pbjs query_md5.proto --encode query_md5 < resp.json > resp.bin
> xxd resp.bin
0000000: 0a49 0802 10c3 8040 1a22 efbf bd38 efbf  .I.....@."...8..
0000010: bdef bfbd efbf bd7a efbf bdef bfbd 1967  .......z.......g
0000020: 2bef bfbd efbf bdef bfbd 6b5c 200b 28b4  +.........k\ .(.
0000030: baba a8b6 0130 003a 0a32 2e32 2e31 3031  .....0.:.2.2.101
0000040: 2e32 3740 0348 f0db 8883 0910 001a 1a67  .27@.H.........g
0000050: efbf bd07 215e 47ef bfbd efbf bd25 272d  ....!^G......%'-
0000060: 6def bfbd efbf bd02 2d20 0028 a0cd c90a  m.......- .(....
0000070: 321e efbf bd5b efbf bd26 4708 efbf bdef  2....[...&G.....
0000080: bfbd 01ef bfbd 4b3d efbf bd24 38ef bfbd  ......K=...$8...
0000090: 321e 31ef bfbd 44ef bfbd 2f32 1bef bfbd  2.1...D.../2....
00000a0: 7865 6bef bfbd efbf bdef bfbd efbf bd60  xek............`
00000b0: 3224 efbf bd75 1735 efbf bdef bfbd efbf  2$...u.5........
00000c0: bd6f 74ef bfbd efbf bdef bfbd efbf bd6a  .ot............j
00000d0: efbf bdef bfbd 321c 284c efbf bdef bfbd  ......2.(L......
00000e0: 36ef bfbd 1d57 5cef bfbd efbf bdef bfbd  6....W\.........
00000f0: 391b 7a7d 3220 3e0b 43ef bfbd 62ef bfbd  9.z}2 >.C...b...
0000100: efbf bd01 efbf bdef bfbd efbf bd00 32ef  ..............2.
0000110: bfbd efbf bd7e 3226 efbf bdef bfbd efbf  .....~2&........
0000120: bd46 efbf bdef bfbd efbf bd55 52ef bfbd  .F.........UR...
0000130: efbf bd0b 6cef bfbd efbf bdef bfbd 321e  ....l.........2.
0000140: efbf bd52 efbf bdef bfbd 2530 efbf bd6b  ...R......%0...k
0000150: 7aef bfbd 5569 efbf bdef bfbd 0a5c 3222  z...Ui.......\2"
0000160: efbf bd33 33ef bfbd efbf bd16 efbf bd68  ...33..........h
0000170: 3938 efbf bd07 efbf bdef bfbd efbf bdef  98..............
0000180: bfbd 321e efbf bd46 0cef bfbd 2874 486a  ..2....F....(tHj
0000190: 0bef bfbd efbf bdef bfbd 6f51 efbf bdef  ..........oQ....
00001a0: bfbd 321c 1eef bfbd efbf bd79 5bef bfbd  ..2........y[...
00001b0: 0832 d5a7 efbf bd4f 60ef bfbd 48ef bfbd  .2.....O`...H...
00001c0: 3222 efbf bd46 efbf bd63 efbf bdef bfbd  2"...F...c......
00001d0: efbf bdef bfbd efbf bd3f efbf bd60 285c  .........?...`(\
00001e0: 0eef bfbd                                ....

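It encodes, but the result still differs greatly from the original data: 484 bytes instead of 282.

This time there is a lot of extra content, which poured cold water on my enthusiasm. On the off chance that it would work anyway, I sent this binary data to the server, and sure enough it returned an error: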

{"error_code":196608,"error_msg":"fgid not find","request_id":3933672364}

It looks like the server failed while parsing the bytes fields.
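
The runs of ef bf bd in the dump above are the UTF-8 encoding of U+FFFD, the replacement character. Decoding arbitrary bytes as UTF-8 text substitutes U+FFFD for each invalid byte, so the original values cannot be recovered when the text is encoded again. A minimal sketch using Node's Buffer with the mem3 bytes from the original message (pbjs's own decoder is not shown here and may differ in detail):

```javascript
// mem3 exactly as it appears in the original message (16 bytes).
const mem3 = Buffer.from("ba38ba93af7adae819672b89ddd26b5c", "hex");

// Interpret the bytes as UTF-8 text, as a string field would be.
const asText = mem3.toString("utf8");

// Encode that text back to bytes, as re-encoding a string field would.
const reencoded = Buffer.from(asText, "utf8");

console.log(mem3.length, reencoded.length);
console.log(reencoded.toString("hex"));
```

Nine of the sixteen bytes are not valid UTF-8 on their own, and each becomes a three-byte replacement character, which is consistent with the length byte 22 (34) that precedes mem3 in the dump.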

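In my scenario, pbjs is mainly used to generate protobuf request data from JSON and send it to the server, receive the protobuf response, decode that response to JSON with pbjs, and finally feed it to jq to extract whatever information is needed.

If this step fails, everything after it is blocked, even though the data can be converted back and forth locally using the string type.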

json unicode

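At first I suspected that some characters in the string fields were not being converted back to their binary form. Take the memc field from the example above: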

"memc":{"type":"Buffer","data":[103,198,7,33,94,71,174,137,37,39,45,109,160,246,2,45]}

After the conversion it becomes:

"memc": "g�\u0007!^G��%'-m��\u0002-",

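The garbled characters look suspicious. How can the binary form of a character be expressed in JSON? A search turned up JSON's Unicode escape \u, which must be followed by exactly four hex digits, so I converted the value as follows: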

"memc": "\u0067\u00c6\u0007\u0021\u005e\u0047\u00ae\u0089\u0025\u0027\u002d\u006d\u00a0\u00f6\u0002\u002d",

The other string fields get the same treatment:

{
  "mema": {
    "mem1": 2,
    "mem2": 1048643,
    "mem3": "\u00ba\u0038\u00ba\u0093\u00af\u007a\u00da\u00e8\u0019\u0067\u002b\u0089\u00dd\u00d2\u006b\u005c",
    "mem4": 11,
    "mem5": {
      "low": 1695456564,
      "high": 11,
      "unsigned": true
    },
    "mem6": 0,
    "mem7": "2.2.101.27",
    "mem8": 3,
    "mem9": {
      "low": -1872613904,
      "high": 0,
      "unsigned": true
    }
  },
  "memb": 0,
  "memc": "\u0067\u00c6\u0007\u0021\u005e\u0047\u00ae\u0089\u0025\u0027\u002d\u006d\u00a0\u00f6\u0002\u002d",
  "memd": 0,
  "meme": {
    "low": 22177440,
    "high": 0,
    "unsigned": true
  },
  "memf": [
    "\u00d1\u005b\u00f3\u0026\u0047\u0008\u00bf\u00c7\u0001\u00e0\u004b\u003d\u00c6\u0024\u0038\u00a3",
    "\u0031\u0095\u0044\u00f3\u002f\u0032\u001b\u0096\u0078\u0065\u006b\u0082\u00fd\u00b8\u0095\u0060",
    "\u009a\u0075\u0017\u0035\u00fc\u00ca\u00e6\u006f\u0074\u0086\u00e9\u00fa\u00dc\u006a\u009f\u00ab",
    "\u0028\u004c\u00eb\u00bf\u0036\u00e0\u001d\u0057\u005c\u00a6\u0093\u00de\u0039\u001b\u007a\u007d",
    "\u003e\u000b\u0043\u009c\u0062\u00a5\u00a4\u0001\u00c3\u00ff\u00cf\u0000\u0032\u0099\u00bc\u007e",
    "\u00f6\u00b9\u0097\u0046\u009c\u00e6\u0095\u0055\u0052\u00d3\u00f5\u000b\u006c\u00a3\u008e\u00b1",
    "\u0098\u0052\u00e7\u00f1\u0025\u0030\u00cb\u006b\u007a\u00a0\u0055\u0069\u00fb\u00cd\u000a\u005c",
    "\u00d3\u0033\u0033\u00b1\u00d5\u0016\u00d8\u0068\u0039\u0038\u00f3\u0007\u00bf\u00fe\u00d4\u00c0",
    "\u00a6\u0046\u000c\u00df\u0028\u0074\u0048\u006a\u000b\u00c0\u00ed\u00f1\u006f\u0051\u00b5\u009e",
    "\u001e\u00ee\u00e6\u0079\u005b\u00f1\u0008\u0032\u00d5\u00a7\u00fc\u004f\u0060\u00cf\u0048\u00ab",
    "\u00c4\u0046\u0096\u0063\u00f6\u00a4\u0087\u00cd\u00fc\u003f\u00d5\u0060\u0028\u005c\u000e\u00a4"
  ]
}

Try encoding the new JSON file with pbjs:

> pbjs query_md5.proto --encode query_md5 < resp.uni.json > resp.uni.bin
> xxd resp.uni.bin
0000000: 0a40 0802 10c3 8040 1a19 c2ba 38c2 bac2  .@.....@....8...
0000010: 93c2 af7a c39a c3a8 1967 2bc2 89c3 9dc3  ...z.....g+.....
0000020: 926b 5c20 0b28 b4ba baa8 b601 3000 3a0a  .k\ .(......0.:.
0000030: 322e 322e 3130 312e 3237 4003 48f0 db88  2.2.101.27@.H...
0000040: 8309 1000 1a15 67c3 8607 215e 47c2 aec2  ......g...!^G...
0000050: 8925 272d 6dc2 a0c3 b602 2d20 0028 a0cd  .%'-m.....- .(..
0000060: c90a 3217 c391 5bc3 b326 4708 c2bf c387  ..2...[..&G.....
0000070: 01c3 a04b 3dc3 8624 38c2 a332 1731 c295  ...K=..$8..2.1..
0000080: 44c3 b32f 321b c296 7865 6bc2 82c3 bdc2  D../2...xek.....
0000090: b8c2 9560 321a c29a 7517 35c3 bcc3 8ac3  ...`2...u.5.....
00000a0: a66f 74c2 86c3 a9c3 bac3 9c6a c29f c2ab  .ot........j....
00000b0: 3216 284c c3ab c2bf 36c3 a01d 575c c2a6  2.(L....6...W\..
00000c0: c293 c39e 391b 7a7d 3218 3e0b 43c2 9c62  ....9.z}2.>.C..b
00000d0: c2a5 c2a4 01c3 83c3 bfc3 8f00 32c2 99c2  ............2...
00000e0: bc7e 321b c3b6 c2b9 c297 46c2 9cc3 a6c2  .~2.......F.....
00000f0: 9555 52c3 93c3 b50b 6cc2 a3c2 8ec2 b132  .UR.....l......2
0000100: 17c2 9852 c3a7 c3b1 2530 c38b 6b7a c2a0  ...R....%0..kz..
0000110: 5569 c3bb c38d 0a5c 3219 c393 3333 c2b1  Ui.....\2...33..
0000120: c395 16c3 9868 3938 c3b3 07c2 bfc3 bec3  .....h98........
0000130: 94c3 8032 17c2 a646 0cc3 9f28 7448 6a0b  ...2...F...(tHj.
0000140: c380 c3ad c3b1 6f51 c2b5 c29e 3218 1ec3  ......oQ....2...
0000150: aec3 a679 5bc3 b108 32c3 95c2 a7c3 bc4f  ...y[...2......O
0000160: 60c3 8f48 c2ab 3219 c384 46c2 9663 c3b6  `..H..2...F..c..
0000170: c2a4 c287 c38d c3bc 3fc3 9560 285c 0ec2  ........?..`(\..
0000180: a4                                       .

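The new version has changed somewhat compared with the previous attempt: it is shorter (385 bytes), but the server still reports the same error.

This shows the approach is not viable; replacing bytes with string is a dead end.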
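
The reason the \u escapes do not help: \u00ba names the code point U+00BA, not the byte 0xba, and UTF-8 encodes every code point from U+0080 to U+07FF as two bytes. A small sketch of the difference follows. The latin1 line is included only for contrast, since a protobuf string field is always serialized as UTF-8:

```javascript
// The first three characters of the escaped mem3 value from resp.uni.json.
const escaped = "\u00ba\u0038\u00ba";

// UTF-8 turns U+00BA into the two bytes c2 ba.
const utf8 = Buffer.from(escaped, "utf8");

// latin1 maps U+0000..U+00FF straight to single bytes, which is what was wanted.
const latin1 = Buffer.from(escaped, "latin1");

console.log(utf8.toString("hex"));   // c2ba38c2ba
console.log(latin1.toString("hex")); // ba38ba
```

This matches the dump above, where mem3 starts with c2ba 38c2 ba and its length byte has grown from 10 (16) to 19 (25).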

Solutions

Since bytes must be used and pbjs has a problem with it, are there other conversion tools?

protobufjs

The usual pbjs help output looks like this:

> pbjs
Usage: pbjs [options] <schema_path>

Options:
  -V, --version        output the version number
  --es5 <js_path>      Generate ES5 JavaScript code
  --es6 <js_path>      Generate ES6 JavaScript code
  --ts <ts_path>       Generate TypeScript code
  --decode <msg_type>  Decode standard input to JSON
  --encode <msg_type>  Encode standard input to JSON
  -h, --help           output usage information

By chance, my pbjs printed the following instead:

> pbjs
protobuf.js v1.1.2 CLI for JavaScript

Translates between file formats and generates static code.

  -t, --target     Specifies the target format. Also accepts a path to require a custom target.

                   json          JSON representation
                   json-module   JSON representation as a module
                   proto2        Protocol Buffers, Version 2
                   proto3        Protocol Buffers, Version 3
                   static        Static code without reflection (non-functional on its own)
                   static-module Static code without reflection as a module

  -p, --path       Adds a directory to the include path.

  --filter         Set up a filter to configure only those messages you need and their dependencies to compile, this will effectively reduce the final file size
                   Set A json file path, Example of file content: {"messageNames":["mypackage.messageName1", "messageName2"] }

  -o, --out        Saves to a file instead of writing to stdout.

  --sparse         Exports only those types referenced from a main file (experimental).

  Module targets only:

  -w, --wrap       Specifies the wrapper to use. Also accepts a path to require a custom wrapper.

                   default   Default wrapper supporting both CommonJS and AMD
                   commonjs  CommonJS wrapper
                   amd       AMD wrapper
                   es6       ES6 wrapper (implies --es6)
                   closure   A closure adding to protobuf.roots where protobuf is a global

  --dependency     Specifies which version of protobuf to require. Accepts any valid module id

  -r, --root       Specifies an alternative protobuf.roots name.

  -l, --lint       Linter configuration. Defaults to protobuf.js-compatible rules:

                   eslint-disable block-scoped-var, id-length, no-control-regex, no-magic-numbers, no-prototype-builtins, no-redeclare, no-shadow, no-var, sort-vars

  --es6            Enables ES6 syntax (const/let instead of var)

  Proto sources only:

  --keep-case      Keeps field casing instead of converting to camel case.
  --alt-comment    Turns on an alternate comment parsing mode that preserves more comments.

  Static targets only:

  --no-create      Does not generate create functions used for reflection compatibility.
  --no-encode      Does not generate encode functions.
  --no-decode      Does not generate decode functions.
  --no-verify      Does not generate verify functions.
  --no-convert     Does not generate convert functions like from/toObject
  --no-delimited   Does not generate delimited encode/decode functions.
  --no-typeurl     Does not generate getTypeUrl function.
  --no-beautify    Does not beautify generated code.
  --no-comments    Does not output any JSDoc comments.
  --no-service     Does not output service classes.

  --force-long     Enforces the use of 'Long' for s-/u-/int64 and s-/fixed64 fields.
  --force-number   Enforces the use of 'number' for s-/u-/int64 and s-/fixed64 fields.
  --force-message  Enforces the use of message instances instead of plain objects.

  --null-defaults  Default value for optional fields is null instead of zero value.

usage: pbjs [options] file1.proto file2.json ...  (or pipe)  other | pbjs [options] -

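It turns out there are two tools named pbjs: one installed by npm install pbjs, the other by npm install protobufjs[-cli]. The latter is used to generate JavaScript code for handling protobuf data.

If one is already installed, installing the other fails: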

$ sudo npm install protobufjs -g
npm ERR! code EEXIST
npm ERR! path /usr/local/bin/pbjs
npm ERR! EEXIST: file already exists
npm ERR! File exists: /usr/local/bin/pbjs
npm ERR! Remove the existing file and try again, or run npm
npm ERR! with --force to overwrite files recklessly.

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2023-09-24T03_19_13_647Z-debug-0.log

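The previously installed one has to be uninstalled first. Searching the web for "pbjs" turns up articles about the first tool and articles about the second; which one applies depends on the package that was installed, so be careful not to confuse the two.

One way to keep both is to install the other one locally: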

> npm install protobufjs-cli

added 84 packages in 2m
> ls node_modules/
acorn           brace-expansion  entities              esutils           inherits      lodash              minimatch         protobufjs      strip-json-comments  underscore
acorn-jsx       catharsis        escape-string-regexp  fast-levenshtein  js2xmlparser  long                minimist          @protobufjs     supports-color       word-wrap
ansi-styles     chalk            escodegen             fs.realpath       jsdoc         lru-cache           mkdirp            protobufjs-cli  tmp                  wrappy
argparse        color-convert    eslint-visitor-keys   glob              @jsdoc        markdown-it         once              requizzle       type-check           xmlcreate
@babel          color-name       espree                graceful-fs       klaw          markdown-it-anchor  optionator        rimraf          @types               yallist
balanced-match  concat-map       esprima               has-flag          levn          marked              path-is-absolute  semver          uc.micro
bluebird        deep-is          estraverse            inflight          linkify-it    mdurl               prelude-ls        source-map      uglify-js
> find . -type f -name "pbjs"
./node_modules/protobufjs-cli/bin/pbjs
> ./node_modules/protobufjs-cli/bin/pbjs
protobuf.js v1.1.2 CLI for JavaScript

Translates between file formats and generates static code.
......
usage: pbjs [options] file1.proto file2.json ...  (or pipe)  other | pbjs [options] -

The drawback is that it can only be invoked like this:

> ./node_modules/protobufjs-cli/bin/pbjs

For protobufjs, the feature of interest here is converting the proto message definitions to a JSON descriptor that JS code can use directly:

> ./node_modules/protobufjs-cli/bin/pbjs -t json query_md5.proto > query_md5.json
> cat query_md5.json
{
  "nested": {
    "common": {
      "fields": {
        "mem1": {
          "rule": "required",
          "type": "uint32",
          "id": 1
        },
        "mem2": {
          "rule": "required",
          "type": "uint32",
          "id": 2
        },
        "mem3": {
          "rule": "required",
          "type": "bytes",
          "id": 3
        },
        "mem4": {
          "rule": "required",
          "type": "uint32",
          "id": 4
        },
        "mem5": {
          "rule": "required",
          "type": "uint64",
          "id": 5
        },
        "mem6": {
          "type": "uint32",
          "id": 6
        },
        "mem7": {
          "type": "bytes",
          "id": 7
        },
        "mem8": {
          "type": "uint32",
          "id": 8
        },
        "mem9": {
          "type": "uint64",
          "id": 9
        }
      }
    },
    "query_md5": {
      "fields": {
        "mema": {
          "rule": "required",
          "type": "common",
          "id": 1
        },
        "memb": {
          "rule": "required",
          "type": "uint32",
          "id": 2
        },
        "memc": {
          "rule": "required",
          "type": "bytes",
          "id": 3
        },
        "memd": {
          "rule": "required",
          "type": "uint32",
          "id": 4
        },
        "meme": {
          "rule": "required",
          "type": "uint64",
          "id": 5
        },
        "memf": {
          "rule": "repeated",
          "type": "bytes",
          "id": 6
        }
      }
    }
  }
}

This will be used shortly.

javascript

Both protobufjs and pbjs can generate JavaScript code from a proto file. Recall the pbjs help output:

> pbjs
Usage: pbjs [options] <schema_path>

Options:
  -V, --version        output the version number
  --es5 <js_path>      Generate ES5 JavaScript code
  --es6 <js_path>      Generate ES6 JavaScript code
  --ts <ts_path>       Generate TypeScript code
  --decode <msg_type>  Decode standard input to JSON
  --encode <msg_type>  Encode standard input to JSON
  -h, --help           output usage information

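This is done mainly through the --es5/--es6 options; protobufjs has similar options. For convenience, the description here uses pbjs throughout.

Running JS code to convert binary data to JSON is another possible solution. Based on posts found online, I arrived at the following JS code: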

let pbroot = require("protobufjs").Root;
let json = require("./query_md5.json");
let root = pbroot.fromJSON(json);
// console.log (root);

var fs = require('fs');
fs.readFile('./tmp/resp.bin', function (err, data) {
    if (err) {
        console.log(err);
    } else {
        console.log(data);
        console.log(data.length + ' bytes');

        let Message = root.lookupType("query_md5");
        try{
            let message = Message.decode(data);
            console.log(message);
        }catch(e){
            console.log(e);
        }
    }
});

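Note that the query_md5.json file on line 2 is the one generated with protobufjs in the previous section. A brief explanation of the code above:

  • Load the proto types (query_md5) defined in query_md5.json
  • Read the binary data (tmp/resp.bin) and decode it
  • Print the decoded result

Running the JS code produces the following output: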

> node index.js
<Buffer 0a 37 08 02 10 c3 80 40 1a 10 ba 38 ba 93 af 7a da e8 19 67 2b 89 dd d2 6b 5c 20 0b 28 b4 ba ba a8 b6 01 30 00 3a 0a 32 2e 32 2e 31 30 31 2e 32 37 40 ... 232 more bytes>
282 bytes
query_md5 {
  memf: [
    <Buffer d1 5b f3 26 47 08 bf c7 01 e0 4b 3d c6 24 38 a3>,
    <Buffer 31 95 44 f3 2f 32 1b 96 78 65 6b 82 fd b8 95 60>,
    <Buffer 9a 75 17 35 fc ca e6 6f 74 86 e9 fa dc 6a 9f ab>,
    <Buffer 28 4c eb bf 36 e0 1d 57 5c a6 93 de 39 1b 7a 7d>,
    <Buffer 3e 0b 43 9c 62 a5 a4 01 c3 ff cf 00 32 99 bc 7e>,
    <Buffer f6 b9 97 46 9c e6 95 55 52 d3 f5 0b 6c a3 8e b1>,
    <Buffer 98 52 e7 f1 25 30 cb 6b 7a a0 55 69 fb cd 0a 5c>,
    <Buffer d3 33 33 b1 d5 16 d8 68 39 38 f3 07 bf fe d4 c0>,
    <Buffer a6 46 0c df 28 74 48 6a 0b c0 ed f1 6f 51 b5 9e>,
    <Buffer 1e ee e6 79 5b f1 08 32 d5 a7 fc 4f 60 cf 48 ab>,
    <Buffer c4 46 96 63 f6 a4 87 cd fc 3f d5 60 28 5c 0e a4>
  ],
  mema: common {
    mem1: 2,
    mem2: 1048643,
    mem3: <Buffer ba 38 ba 93 af 7a da e8 19 67 2b 89 dd d2 6b 5c>,
    mem4: 11,
    mem5: Long { low: 1695456564, high: 11, unsigned: true },
    mem6: 0,
    mem7: <Buffer 32 2e 32 2e 31 30 31 2e 32 37>,
    mem8: 3,
    mem9: Long { low: -1872613904, high: 0, unsigned: true }
  },
  memb: 0,
  memc: <Buffer 67 c6 07 21 5e 47 ae 89 25 27 2d 6d a0 f6 02 2d>,
  memd: 0,
  meme: Long { low: 22177440, high: 0, unsigned: true }
}
<Buffer 0a 37 08 02 10 c3 80 40 1a 10 ba 38 ba 93 af 7a da e8 19 67 2b 89 dd d2 6b 5c 20 0b 28 b4 ba ba a8 b6 01 30 00 3a 0a 32 2e 32 2e 31 30 31 2e 32 37 40 ... 232 more bytes>

It parses the binary data correctly. Now modify the code slightly:

...
            let buffer= Message.encode(Message.create(message)).finish();
            console.log (buffer);
            fs.writeFile('./resp.bin', buffer, function (err) {
                if (err) {
                    console.log(err);
                } else {
                    console.log('success');
                }
            });
...

This encodes the parsed data (message) back to binary (buffer) and writes it to a file (resp.bin):

...
<Buffer 0a 37 08 02 10 c3 80 40 1a 10 ba 38 ba 93 af 7a da e8 19 67 2b 89 dd d2 6b 5c 20 0b 28 b4 ba ba a8 b6 01 30 00 3a 0a 32 2e 32 2e 31 30 31 2e 32 37 40 ... 52 more bytes>
success
> xxd resp.bin
0000000: 0a37 0802 10c3 8040 1a10 ba38 ba93 af7a  .7.....@...8...z
0000010: dae8 1967 2b89 ddd2 6b5c 200b 28b4 baba  ...g+...k\ .(...
0000020: a8b6 0130 003a 0a32 2e32 2e31 3031 2e32  ...0.:.2.2.101.2
0000030: 3740 0348 f0db 8883 0910 001a 1067 c607  7@.H.........g..
0000040: 215e 47ae 8925 272d 6da0 f602 2d20 0028  !^G..%'-m...- .(
0000050: a0cd c90a 3210 d15b f326 4708 bfc7 01e0  ....2..[.&G.....
0000060: 4b3d c624 38a3 3210 3195 44f3 2f32 1b96  K=.$8.2.1.D./2..
0000070: 7865 6b82 fdb8 9560 3210 9a75 1735 fcca  xek....`2..u.5..
0000080: e66f 7486 e9fa dc6a 9fab 3210 284c ebbf  .ot....j..2.(L..
0000090: 36e0 1d57 5ca6 93de 391b 7a7d 3210 3e0b  6..W\...9.z}2.>.
00000a0: 439c 62a5 a401 c3ff cf00 3299 bc7e 3210  C.b.......2..~2.
00000b0: f6b9 9746 9ce6 9555 52d3 f50b 6ca3 8eb1  ...F...UR...l...
00000c0: 3210 9852 e7f1 2530 cb6b 7aa0 5569 fbcd  2..R..%0.kz.Ui..
00000d0: 0a5c 3210 d333 33b1 d516 d868 3938 f307  .\2..33....h98..
00000e0: bffe d4c0 3210 a646 0cdf 2874 486a 0bc0  ....2..F..(tHj..
00000f0: edf1 6f51 b59e 3210 1eee e679 5bf1 0832  ..oQ..2....y[..2
0000100: d5a7 fc4f 60cf 48ab 3210 c446 9663 f6a4  ...O`.H.2..F.c..
0000110: 87cd fc3f d560 285c 0ea4                 ...?.`(\..

Compared with the original data, it is identical! This approach works, though it is a bit cumbersome.
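
A byte-for-byte check like this can also be scripted. A minimal sketch, where firstDifference is a helper name made up for illustration and the sample values are only the opening bytes of the dumps shown earlier:

```javascript
// Return the index of the first differing byte, or -1 if the buffers are identical.
function firstDifference(a, b) {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  // One buffer is a strict prefix of the other: they differ where the shorter ends.
  return a.length === b.length ? -1 : shared;
}

// Opening bytes of the original tmp/resp.bin.
const original = Buffer.from("0a37080210c380401a10", "hex");
// Opening bytes of the file written by the protobufjs script.
const viaProtobufjs = Buffer.from("0a37080210c380401a10", "hex");
// Opening bytes of the file written earlier by pbjs --encode.
const viaPbjs = Buffer.from("0a08080210c380401a00", "hex");

console.log(firstDifference(original, viaProtobufjs)); // -1
console.log(firstDifference(original, viaPbjs));       // 1
```

For the pbjs output the very first difference is the length byte of mema (08 instead of 37), which already shows that most of the submessage was dropped.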

protoc

When it comes to encoding and decoding binary data from a proto file, shouldn't the best tool for the job be protoc, which ships with protobuf itself?

$ ./protoc --help
Usage: ./protoc [OPTION] PROTO_FILES
Parse PROTO_FILES and generate output based on the options given:
  -IPATH, --proto_path=PATH   Specify the directory in which to search for
                              imports.  May be specified multiple times;
                              directories will be searched in order.  If not
                              given, the current working directory is used.
  --version                   Show version info and exit.
  -h, --help                  Show this text and exit.
  --encode=MESSAGE_TYPE       Read a text-format message of the given type
                              from standard input and write it in binary
                              to standard output.  The message type must
                              be defined in PROTO_FILES or their imports.
  --decode=MESSAGE_TYPE       Read a binary message of the given type from
                              standard input and write it in text format
                              to standard output.  The message type must
                              be defined in PROTO_FILES or their imports.
  --decode_raw                Read an arbitrary protocol message from
                              standard input and write the raw tag/value
                              pairs in text format to standard output.  No
                              PROTO_FILES should be given when using this
                              flag.
  -oFILE,                     Writes a FileDescriptorSet (a protocol buffer,
    --descriptor_set_out=FILE defined in descriptor.proto) containing all of
                              the input files to FILE.
  --include_imports           When using --descriptor_set_out, also include
                              all dependencies of the input files in the
                              set, so that the set is self-contained.
  --include_source_info       When using --descriptor_set_out, do not strip
                              SourceCodeInfo from the FileDescriptorProto.
                              This results in vastly larger descriptors that
                              include information about the original
                              location of each decl in the source file as
                              well as surrounding comments.
  --dependency_out=FILE       Write a dependency output file in the format
                              expected by make. This writes the transitive
                              set of input file paths to FILE
  --error_format=FORMAT       Set the format in which to print errors.
                              FORMAT may be 'gcc' (the default) or 'msvs'
                              (Microsoft Visual Studio format).
  --print_free_field_numbers  Print the free field numbers of the messages
                              defined in the given proto files. Groups share
                              the same field number space with the parent
                              message. Extension ranges are counted as
                              occupied fields numbers.

  --plugin=EXECUTABLE         Specifies a plugin executable to use.
                              Normally, protoc searches the PATH for
                              plugins, but you may specify additional
                              executables not in the path using this flag.
                              Additionally, EXECUTABLE may be of the form
                              NAME=PATH, in which case the given plugin name
                              is mapped to the given executable even if
                              the executable's own name differs.
  --cpp_out=OUT_DIR           Generate C++ header and source.
  --csharp_out=OUT_DIR        Generate C# source file.
  --java_out=OUT_DIR          Generate Java source file.
  --javanano_out=OUT_DIR      Generate Java Nano source file.
  --js_out=OUT_DIR            Generate JavaScript source.
  --objc_out=OUT_DIR          Generate Objective C header and source.
  --php_out=OUT_DIR           Generate PHP source file.
  --python_out=OUT_DIR        Generate Python source file.
  --ruby_out=OUT_DIR          Generate Ruby source file.

Let's try it:

> ./protoc --decode=query_md5 query_md5.proto < tmp/resp.bin > resp.pb
[libprotobuf WARNING ../../src/google/protobuf/compiler/parser.cc:546] No syntax specified for the proto file: query_md5.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)
> cat resp.pb
mema {
  mem1: 2
  mem2: 1048643
  mem3: "\2728\272\223\257z\332\350\031g+\211\335\322k\\"
  mem4: 11
  mem5: 48940096820
  mem6: 0
  mem7: "2.2.101.27"
  mem8: 3
  mem9: 2422353392
}
memb: 0
memc: "g\306\007!^G\256\211%\'-m\240\366\002-"
memd: 0
meme: 22177440
memf: "\321[\363&G\010\277\307\001\340K=\306$8\243"
memf: "1\225D\363/2\033\226xek\202\375\270\225`"
memf: "\232u\0275\374\312\346ot\206\351\372\334j\237\253"
memf: "(L\353\2776\340\035W\\\246\223\3369\033z}"
memf: ">\013C\234b\245\244\001\303\377\317\0002\231\274~"
memf: "\366\271\227F\234\346\225UR\323\365\013l\243\216\261"
memf: "\230R\347\361%0\313kz\240Ui\373\315\n\\"
memf: "\32333\261\325\026\330h98\363\007\277\376\324\300"
memf: "\246F\014\337(tHj\013\300\355\361oQ\265\236"
memf: "\036\356\346y[\361\0102\325\247\374O`\317H\253"
memf: "\304F\226c\366\244\207\315\374?\325`(\\\016\244"

The generated file is not JSON but a generic text format defined by protobuf. Encode it back unchanged:

> ./protoc --encode=query_md5 query_md5.proto < resp.pb > resp.bin
[libprotobuf WARNING ../../src/google/protobuf/compiler/parser.cc:546] No syntax specified for the proto file: query_md5.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)
> xxd resp.bin
0000000: 0a37 0802 10c3 8040 1a10 ba38 ba93 af7a  .7.....@...8...z
0000010: dae8 1967 2b89 ddd2 6b5c 200b 28b4 baba  ...g+...k\ .(...
0000020: a8b6 0130 003a 0a32 2e32 2e31 3031 2e32  ...0.:.2.2.101.2
0000030: 3740 0348 f0db 8883 0910 001a 1067 c607  7@.H.........g..
0000040: 215e 47ae 8925 272d 6da0 f602 2d20 0028  !^G..%'-m...- .(
0000050: a0cd c90a 3210 d15b f326 4708 bfc7 01e0  ....2..[.&G.....
0000060: 4b3d c624 38a3 3210 3195 44f3 2f32 1b96  K=.$8.2.1.D./2..
0000070: 7865 6b82 fdb8 9560 3210 9a75 1735 fcca  xek....`2..u.5..
0000080: e66f 7486 e9fa dc6a 9fab 3210 284c ebbf  .ot....j..2.(L..
0000090: 36e0 1d57 5ca6 93de 391b 7a7d 3210 3e0b  6..W\...9.z}2.>.
00000a0: 439c 62a5 a401 c3ff cf00 3299 bc7e 3210  C.b.......2..~2.
00000b0: f6b9 9746 9ce6 9555 52d3 f50b 6ca3 8eb1  ...F...UR...l...
00000c0: 3210 9852 e7f1 2530 cb6b 7aa0 5569 fbcd  2..R..%0.kz.Ui..
00000d0: 0a5c 3210 d333 33b1 d516 d868 3938 f307  .\2..33....h98..
00000e0: bffe d4c0 3210 a646 0cdf 2874 486a 0bc0  ....2..F..(tHj..
00000f0: edf1 6f51 b59e 3210 1eee e679 5bf1 0832  ..oQ..2....y[..2
0000100: d5a7 fc4f 60cf 48ab 3210 c446 9663 f6a4  ...O`.H.2..F.c..
0000110: 87cd fc3f d560 285c 0ea4                 ...?.`(\..

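Compare it with the original data:

It matches here too! The drawback of this approach is that the pb text file cannot be handed to jq for processing, which adds a fair amount of work during later integration.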
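The octal escapes in the text-format dump above are just another spelling of the raw bytes. The following is a minimal sketch of how such a string literal maps back to byte values. The function name `decodeTextFormatBytes` is hypothetical and not part of protoc or pbjs, and it assumes unescaped characters are plain ASCII.

```javascript
// Decode the body of a protobuf text-format string literal (without the
// surrounding quotes) into byte values. Handles octal (\ooo), hex (\xHH)
// and the common C-style escapes; this is an illustrative helper only.
function decodeTextFormatBytes(body) {
  const simple = { n: 10, r: 13, t: 9, a: 7, b: 8, f: 12, v: 11, '\\': 92, "'": 39, '"': 34, '?': 63 };
  const bytes = [];
  let i = 0;
  while (i < body.length) {
    const ch = body[i];
    if (ch !== '\\') {
      bytes.push(ch.charCodeAt(0)); // literal character: one byte
      i += 1;
      continue;
    }
    const rest = body.slice(i + 1);
    const octal = /^[0-7]{1,3}/.exec(rest);
    const hex = /^x[0-9a-fA-F]{1,2}/.exec(rest);
    if (octal) {
      bytes.push(parseInt(octal[0], 8) & 0xff);
      i += 1 + octal[0].length;
    } else if (hex) {
      bytes.push(parseInt(hex[0].slice(1), 16));
      i += 1 + hex[0].length;
    } else if (rest[0] in simple) {
      bytes.push(simple[rest[0]]);
      i += 2;
    } else {
      throw new Error(`unsupported escape at offset ${i}`);
    }
  }
  return bytes;
}

// The mem3 value from resp.pb; String.raw keeps the backslashes as written.
const mem3 = decodeTextFormatBytes(String.raw`\2728\272\223\257z\332\350\031g+\211\335\322k\\`);
console.log(JSON.stringify(mem3));
// [186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92]
```

The result is the same 16 bytes that appear after the `1a10` prefix in the xxd dump.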

Root cause

The standard pbjs command is actually a symlink:

> which pbjs
/usr/local/bin/pbjs
> ls -lh /usr/local/bin/pbjs
lrwxrwxrwx 1 root root 31 Sep 24 11:33 /usr/local/bin/pbjs -> ../lib/node_modules/pbjs/cli.js
> ls /usr/local/lib/node_modules/pbjs/
cli.js                        index.d.ts                    node_modules/                 test.js                       test.proto.js
cli.ts                        index.js                      package.json                  test.proto                    test.proto.ts
generate.js                   index.ts                      protocol-buffers-schema.d.ts  test.proto.es5.js             test.ts
generate.ts                   LICENSE.md                    README.md                     test.proto.es6.js             tsconfig.json

It points to cli.js. Out of curiosity I looked at how it handles encoding of the bytes type, which lives mainly in generate.js:

function encodeValue(name, buffer, value, nested = 'nested') {
    let type;
    let write;
    switch (name) {
        case 'bool':
            type = TYPE_VAR_INT;
            write = [`writeByte(${buffer}, ${value} ? 1 : 0)`];
            break;
        case 'bytes':
            type = TYPE_SIZE_N;
            write = [`writeVarint32(${buffer}, ${value}.length), writeBytes(${buffer}, ${value})`];
            break;
        case 'int32':
            type = TYPE_VAR_INT;
            write = [`writeVarint64(${buffer}, intToLong(${value}))`];
            break;
        case 'int64':
            type = TYPE_VAR_INT;
            write = [`writeVarint64(${buffer}, ${value})`];
            break;
        case 'string':
            type = TYPE_SIZE_N;
            write = [`writeString(${buffer}, ${value})`];
            break;
        ....
    }
    return { type, write };
}

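The code is abridged to highlight the key points. Compared with the other types, bytes first encodes the length of the array and only then the array contents.

The array contents are written by a routine called writeBytes: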

lines.push(`function writeBytes(bb${ts('ByteBuffer')}, buffer${ts('Uint8Array')})${ts('void')} {`);
lines.push(`  ${varOrLet} offset = grow(bb, buffer.length);`);
lines.push(`  bb.bytes.set(buffer, offset);`);
lines.push(`}`);

Its implementation first grows the underlying buffer so that the array fits, then writes the whole array in one go.
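To make the length-prefixed layout concrete, here is a small standalone sketch. It is not pbjs's generated code: `encodeVarint32` and `encodeBytesField` are hypothetical helpers, and field number 3 for mem3 is inferred from the `1a` tag byte in the xxd dump rather than from the proto file.

```javascript
// Encode an unsigned 32-bit integer as a protobuf varint
// (7 bits per byte, least significant group first).
function encodeVarint32(value) {
  const out = [];
  value = value >>> 0;
  while (value >= 0x80) {
    out.push((value & 0x7f) | 0x80);
    value >>>= 7;
  }
  out.push(value);
  return out;
}

// Encode one length-delimited field: tag, payload length, payload bytes.
function encodeBytesField(fieldNumber, bytes) {
  const tag = (fieldNumber << 3) | 2; // wire type 2 = length-delimited
  return Uint8Array.from([...encodeVarint32(tag), ...encodeVarint32(bytes.length), ...bytes]);
}

const mem3 = [186, 56, 186, 147, 175, 122, 218, 232, 25, 103, 43, 137, 221, 210, 107, 92];
const encoded = encodeBytesField(3, mem3);
const hex = Buffer.from(encoded).toString('hex');
console.log(hex);
// 1a10ba38ba93af7adae819672b89ddd26b5c
```

The output lines up with the `1a10 ba38 ba93 af7a dae8 1967 2b89 ddd2 6b5c` run in the dump: one tag byte, one length byte (0x10 = 16), then the 16 payload bytes.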

Remember what the binary data looked like after pbjs decoded it? As a reminder:

"mem3":{"type":"Buffer","data":[186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92]},

The data is wrapped inside an object, whereas the encoder expects a plain array. Could this mismatch be where things go wrong?
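The wrapper has the same shape that Node.js produces when a Buffer passes through JSON.stringify. Assuming that is how the decoded message gets printed (not verified against the pbjs source here), the sketch below shows why such an object cannot stand in for a byte array: it has no `length`, and copying it into a Uint8Array with `set` copies nothing.

```javascript
// A Buffer serializes to a {type, data} wrapper, not to a plain array.
const raw = Buffer.from([186, 56, 186]);
const json = JSON.stringify({ mem3: raw });
console.log(json);
// {"mem3":{"type":"Buffer","data":[186,56,186]}}

// After a JSON round trip the value is an ordinary object without a length.
const parsed = JSON.parse(json);
console.log(parsed.mem3.length); // undefined

// TypedArray.prototype.set treats it as an array-like of length 0,
// so no bytes are copied and no error is raised.
const target = new Uint8Array(4);
target.set(parsed.mem3, 0);
console.log(Array.from(target)); // [ 0, 0, 0, 0 ]
```

A silent no-op like this would be consistent with bytes fields quietly dropping out of the encoded result.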

Take the JSON that pbjs decoded from the binary data and modify it slightly, removing the object wrapped around each bytes value:

> jq -c '.' resp.json
{"mema":{"mem1":2,"mem2":1048643,"mem3":[186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92],"mem4":11,"mem5":{"low":1695456564,"high":11,"unsigned":true},"mem6":0,"mem7":[50,46,50,46,49,48,49,46,50,55],"mem8":3,"mem9":{"low":-1872613904,"high":0,"unsigned":true}},"memb":0,"memc":[103,198,7,33,94,71,174,137,37,39,45,109,160,246,2,45],"memd":0,"meme":{"low":22177440,"high":0,"unsigned":true},"memf":[[209,91,243,38,71,8,191,199,1,224,75,61,198,36,56,163],[49,149,68,243,47,50,27,150,120,101,107,130,253,184,149,96],[154,117,23,53,252,202,230,111,116,134,233,250,220,106,159,171],[40,76,235,191,54,224,29,87,92,166,147,222,57,27,122,125],[62,11,67,156,98,165,164,1,195,255,207,0,50,153,188,126],[246,185,151,70,156,230,149,85,82,211,245,11,108,163,142,177],[152,82,231,241,37,48,203,107,122,160,85,105,251,205,10,92],[211,51,51,177,213,22,216,104,57,56,243,7,191,254,212,192],[166,70,12,223,40,116,72,106,11,192,237,241,111,81,181,158],[30,238,230,121,91,241,8,50,213,167,252,79,96,207,72,171],[196,70,150,99,246,164,135,205,252,63,213,96,40,92,14,164]]}
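A side note on the 64-bit fields in this JSON: they are split into 32-bit `low` and `high` halves, and `low` may show up as a negative number. The sketch below recombines the halves for the unsigned case; `longToBigInt` is a hypothetical helper, not something pbjs provides.

```javascript
// Combine the {low, high} halves of an unsigned 64-bit integer into a BigInt.
// `>>> 0` reinterprets a signed 32-bit half as unsigned before widening.
function longToBigInt({ low, high }) {
  return (BigInt(high >>> 0) << 32n) | BigInt(low >>> 0);
}

const mem5 = longToBigInt({ low: 1695456564, high: 11, unsigned: true });
const mem9 = longToBigInt({ low: -1872613904, high: 0, unsigned: true });
console.log(mem5.toString()); // 48940096820
console.log(mem9.toString()); // 2422353392
```

Both results agree with the mem5 and mem9 values in the protoc text output.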

Then encode this JSON:

> pbjs query_md5.proto --encode query_md5 < resp.json > resp.bin
> xxd resp.bin
0000000: 0a37 0802 10c3 8040 1a10 ba38 ba93 af7a  .7.....@...8...z
0000010: dae8 1967 2b89 ddd2 6b5c 200b 28b4 baba  ...g+...k\ .(...
0000020: a8b6 0130 003a 0a32 2e32 2e31 3031 2e32  ...0.:.2.2.101.2
0000030: 3740 0348 f0db 8883 0910 001a 1067 c607  7@.H.........g..
0000040: 215e 47ae 8925 272d 6da0 f602 2d20 0028  !^G..%'-m...- .(
0000050: a0cd c90a 3210 d15b f326 4708 bfc7 01e0  ....2..[.&G.....
0000060: 4b3d c624 38a3 3210 3195 44f3 2f32 1b96  K=.$8.2.1.D./2..
0000070: 7865 6b82 fdb8 9560 3210 9a75 1735 fcca  xek....`2..u.5..
0000080: e66f 7486 e9fa dc6a 9fab 3210 284c ebbf  .ot....j..2.(L..
0000090: 36e0 1d57 5ca6 93de 391b 7a7d 3210 3e0b  6..W\...9.z}2.>.
00000a0: 439c 62a5 a401 c3ff cf00 3299 bc7e 3210  C.b.......2..~2.
00000b0: f6b9 9746 9ce6 9555 52d3 f50b 6ca3 8eb1  ...F...UR...l...
00000c0: 3210 9852 e7f1 2530 cb6b 7aa0 5569 fbcd  2..R..%0.kz.Ui..
00000d0: 0a5c 3210 d333 33b1 d516 d868 3938 f307  .\2..33....h98..
00000e0: bffe d4c0 3210 a646 0cdf 2874 486a 0bc0  ....2..F..(tHj..
00000f0: edf1 6f51 b59e 3210 1eee e679 5bf1 0832  ..oQ..2....y[..2
0000100: d5a7 fc4f 60cf 48ab 3210 c446 9663 f6a4  ...O`.H.2..F.c..
0000110: 87cd fc3f d560 285c 0ea4                 ...?.`(\..

Looks promising! Compare it with the original data:

Identical!
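The manual edit can also be automated. One possible sketch uses a JSON.parse reviver to replace every `{"type":"Buffer","data":[...]}` wrapper with its inner array. The helper names are hypothetical, and the check would misfire on a real message that happens to have a `type` field equal to "Buffer" next to an array field named `data`.

```javascript
// True for objects shaped like {"type":"Buffer","data":[...]}.
function isBufferWrapper(value) {
  return value !== null && typeof value === 'object' &&
    value.type === 'Buffer' && Array.isArray(value.data);
}

// Parse JSON text, replacing every Buffer wrapper with its plain byte array.
// The reviver runs bottom-up, so nested and repeated bytes fields are covered.
function parseAndUnwrap(text) {
  return JSON.parse(text, (key, value) => (isBufferWrapper(value) ? value.data : value));
}

const decoded = '{"mema":{"mem1":2,"mem3":{"type":"Buffer","data":[186,56,186]}},"memf":[{"type":"Buffer","data":[209,91]}]}';
const unwrapped = JSON.stringify(parseAndUnwrap(decoded));
console.log(unwrapped);
// {"mema":{"mem1":2,"mem3":[186,56,186]},"memf":[[209,91]]}
```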

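Conclusion

This article described some problems that pbjs, the JS tool for protobuf, has when encoding and decoding bytes fields. After several attempts, three solutions emerged:

  • Use pbjs & protobufjs to generate JS code that encodes the JSON into binary data
  • Use protoc to encode pb text into binary data
  • Modify the decoded JSON to remove the object layer wrapped around each bytes array, then use pbjs to encode the modified JSON into binary data

Solution I is somewhat more complex. The pb text of Solution II is not portable; in particular, it cannot be passed to jq for preprocessing. Solution III balances convenience and compatibility and is the best choice.

In particular, modifying the JSON to remove the object wrapper is trivial for jq: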

local req=$(jq -c ".mema.mem3=${mem3}|.mema.mem4=${mem4}|.mema.mem5.low=${mem5_lo}|.mema.mem5.high=${mem5_hi}|.mema.mem7=${mem7}|.mema.mem8=${mem8}|.mema.mem9.low=${mem9_lo}|.mema.mem9.high=${mem9_hi}|.memc=${memc}" query_md5.json)

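jq first reads the template JSON (query_md5.json in the command above) and then assigns each field through a chain of pipes (the JSON is only a template and does not carry the data the request needs). During assignment, a bytes field is set directly to a value of the following form: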

[186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92]

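which replaces the default object with a byte array. Values can also be substituted through jq variables, but changing a field's type that way ran into trouble, as in the following: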

local req=$(jq --arg mm3 "[${mem3}]" --arg mm4 "${mem4}" --arg mm5h "${mem5_hi}" --arg mm5o "${mem5_lo}" --arg mm7 "[${mem7}]" --arg mm8 "${mem8}" --arg mm9 "${mem9}" --arg mmc "[${memc}]"  -c '{ mema: { mem1 : .mema.mem1, mem2: .mema.mem2, mem3: $mm3, mem4: $mm4, mem5: { low: $mm5o, high: $mm5h, unsigned: true }, mem6: .mema.mem6, mem7: $mm7, mem8: $mm8, mem9: { low: $mm9, high: 0, unsigned: true } }, memb: .memb, memc: $mmc, memd: .memd, meme: .meme, memf: .memf }' query_md5.json)

The updated JSON turns out like this:

req={"mema":{"mem1":2,"mem2":1048642,"mem3":"[186,56,186,147,175,122,218,232,25,103,43,137,221,210,107,92]","mem4":"11","mem5":{"low":"1695625406","high":"11","unsigned":true},"mem6":0,"mem7":"2.2.101.27","mem8":"3","mem9":{"low":"2422353392","high":0,"unsigned":true}},"memb":0,"memc":"[103,198,7,33,94,71,174,137,37,39,45,109,160,246,2,45]","memd":"0",...}

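Every byte array ends up wrapped in double quotes and becomes a string! On top of that, this approach is rather verbose, so it is not recommended.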
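The quoting follows from how jq treats `--arg`: the value is always bound as a string. jq also has `--argjson`, which parses the value as JSON first and should avoid the problem, though that was not tried in this article. The same distinction can be sketched in JavaScript as an analogy, not as jq's implementation:

```javascript
// A value arriving from a shell variable is always text.
const mem3Text = '[186,56,186]';

// Embedding the text as-is (comparable to jq --arg) yields a JSON string.
const asString = JSON.stringify({ mem3: mem3Text });
console.log(asString); // {"mem3":"[186,56,186]"}

// Parsing it first (comparable to jq --argjson) yields a real array.
const asJson = JSON.stringify({ mem3: JSON.parse(mem3Text) });
console.log(asJson); // {"mem3":[186,56,186]}
```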

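Postscript

The root-cause analysis was somewhat sloppy. I remember spotting something suspicious at the time, but when I reviewed it later I could not recall what had raised the suspicion, so please bear with it, haha.

Looking back, this is probably a bug in pbjs: when decoding a Uint8Array it writes the value out through a wrapper class, which adds an object layer, whereas the encoder accepts only a plain bytes array. The data therefore fails to match and is left out of the binary output.

If you only use the js/ts code generated by pbjs, you should not be affected, and generating pb text files directly with protoc also works. The problem appears only when pbjs converts between binary data and JSON. Hopefully the pbjs author will fix it soon.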

References

[1]. Escaping and Unicode encoding in JSON serialization

[2]. protobufjs

[3]. Reading local files in node.js

[4]. When Creator meets protobufjs: getting started

This article comes from cnblogs (博客園), author: goodcitizen. When reposting, please credit the original link: https://www.cnblogs.com/goodcitizen/p/solution_about_pbjs_encode_bytes_data_failed_problem.html

