前文回顧 實現一個簡單的Database1(譯文) 實現一個簡單的Database2(譯文) 實現一個簡單的Database3(譯文) 譯註:cstsck在github維護了一個簡單的、類似SQLite的資料庫實現,通過這個簡單的項目,可以很好的理解資料庫是如何運行的。本文是第四篇,主要是使用rsp ...
前文回顧
譯註:cstsck在github維護了一個簡單的、類似SQLite的資料庫實現,通過這個簡單的項目,可以很好的理解資料庫是如何運行的。本文是第四篇,主要是使用rspec對目前實現的功能進行測試並解決測試出現BUG
譯註:cstsck在github維護了一個簡單的、類似sqlite的資料庫實現,通過這個簡單的項目,可以很好的理解資料庫是如何運行的。本文是第四篇,主要是使用rspec對目前實現的功能進行測試並解決測試出現BUG
Part 4 我們的第一個測試(和BUG)
我們已經獲得插入數據到資料庫並列印所有數據的能力。現在來測試一下目前microseconds已有的功能。
我要使用rspec來寫我的測試,因為我對rspec很熟悉,它的語法也相當易讀。
譯註:rsepec 是一個基於Ruby的測試框架,語法非常簡單,可以很方便的測試各種可執行程式,判斷輸出
我定義一個短小的help來發送一個幫助命令列表到資料庫,然後對輸出進行斷言。
describe 'database' do
def run_script(commands)
raw_output = nil
IO.popen("./db", "r+") do |pipe|
commands.each do |command|
pipe.puts command
end
pipe.close_write
# Read entire output
raw_output = pipe.gets(nil)
end
raw_output.split("\n")
end
it 'inserts and retrieves a row' do
result = run_script([
"insert 1 user1 [email protected]",
"select",
".exit",
])
expect(result).to match_array([
"db > Executed.",
"db > (1, user1, [email protected])",
"Executed.",
"db > ",
])
end
end
這個簡單的測試是確認我們的輸入能夠獲取返回結果。並確保能通過測試:
bundle exec rspec
.
Finished in 0.00871 seconds (files took 0.09506 seconds to load)
1 example, 0 failures
現在測試插入更多行數據到資料庫是可行的:
it 'prints error message when table is full' do
script = (1..1401).map do |i|
"insert #{i} user#{i} person#{i}@example.com"
end
script << ".exit"
result = run_script(script)
expect(result[-2]).to eq('db > Error: Table full.')
end
再次運行測試:
bundle exec rspec
..
Finished in 0.01553 seconds (files took 0.08156 seconds to load)
2 examples, 0 failures
妙啊,測試通過了!我們的資料庫現在能夠hold住1400行數據,這是因為我們設置pages最大數量是100頁,每頁可以存放14行數據。
查看我們目前寫的的代碼,我意識到我們可能沒有正確處理存儲文本欄位。很容易用下麵的例子測試出來(插入邊界長度的字元串):
it 'allows inserting strings that are the maximum length' do
long_username = "a"*32
long_email = "a"*255
script = [
"insert 1 #{long_username} #{long_email}",
"select",
".exit",
]
result = run_script(script)
expect(result).to match_array([
"db > Executed.",
"db > (1, #{long_username}, #{long_email})",
"Executed.",
"db > ",
])
end
測試失敗了:
Failures:
1) database allows inserting strings that are the maximum length
Failure/Error: raw_output.split("\n")
ArgumentError:
invalid byte sequence in UTF-8
# ./spec/main_spec.rb:14:in `split'
# ./spec/main_spec.rb:14:in `run_script'
# ./spec/main_spec.rb:48:in `block (2 levels) in <top (required)>'
如果是我們自己人工來測試,當我們列印行數據時,會看到有一些奇怪的字元(例子中,我把很長的字元串進行了縮寫):
db > insert 1 aaaaa... aaaaa...
Executed.
db > select
(1, aaaaa...aaa\�, aaaaa...aaa\�)
Executed.
db >
發生了什麼?如果看一下代碼中我們定義的Row結構,我們確實為username欄位分配了32個位元組長度,為email欄位分配255個位元組長度。但是C語言的strings是以一個null字元來作為結尾的,這個字元我們沒有為它分配空間。解決方法就是多分配一個額外的位元組(來存放這個null字元):
const uint32_t COLUMN_EMAIL_SIZE = 255;
typedef struct {
uint32_t id;
- char username[COLUMN_USERNAME_SIZE];
- char email[COLUMN_EMAIL_SIZE];
+ char username[COLUMN_USERNAME_SIZE + 1];
+ char email[COLUMN_EMAIL_SIZE + 1];
} Row;
這樣確實解決了上面的問題(重新運行上面插入邊界長度字元串的測試):
bundle exec rspec
...
Finished in 0.0188 seconds (files took 0.08516 seconds to load)
3 examples, 0 failures
我們不允許插入的username或者email的長度超過固定的列的長度。這樣的超出長度要求的spec測試看起來就像下麵這樣:
it 'prints error message if strings are too long' do
long_username = "a"*33
long_email = "a"*256
script = [
"insert 1 #{long_username} #{long_email}",
"select",
".exit",
]
result = run_script(script)
expect(result).to match_array([
"db > String is too long.",
"db > Executed.",
"db > ",
])
end
我了能夠支持上面這種效果,我們需要升級我們的解析器(parser)。提醒一下,我們現在使用的是scanf():
if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
statement->type = STATEMENT_INSERT;
int args_assigned = sscanf(
input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
statement->row_to_insert.username, statement->row_to_insert.email);
if (args_assigned < 3) {
return PREPARE_SYNTAX_ERROR;
}
return PREPARE_SUCCESS;
}
但是scanf()有一些缺點。如果讀取的string大於正在讀取它的緩存(buffer),就會引起緩存溢出(buffer overflow)並寫入到意想不到的地方。所以我們需要在拷貝string到Row結構前檢查每一個string的長度。為了檢查string長度,我們需要用空格分割輸入。
我使用 strtok() 來做這些。如果你看到過程就會覺得它很容易理解:
譯註: strtok: 字元串處理函數,char * strtok ( char * str, const char * delimiters ); 分解字元串為一組字元串。str為要分解的字元,delimiters為分隔符字元(如果傳入字元串,則傳入的字元串中每個字元均為分割符)
+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
+ statement->type = STATEMENT_INSERT;
+
+ char* keyword = strtok(input_buffer->buffer, " ");
+ char* id_string = strtok(NULL, " ");
+ char* username = strtok(NULL, " ");
+ char* email = strtok(NULL, " ");
+
+ if (id_string == NULL || username == NULL || email == NULL) {
+ return PREPARE_SYNTAX_ERROR;
+ }
+
+ int id = atoi(id_string);
+ if (strlen(username) > COLUMN_USERNAME_SIZE) {
+ return PREPARE_STRING_TOO_LONG;
+ }
+ if (strlen(email) > COLUMN_EMAIL_SIZE) {
+ return PREPARE_STRING_TOO_LONG;
+ }
+
+ statement->row_to_insert.id = id;
+ strcpy(statement->row_to_insert.username, username);
+ strcpy(statement->row_to_insert.email, email);
+
+ return PREPARE_SUCCESS;
+}
+
PrepareResult prepare_statement(InputBuffer* input_buffer,
Statement* statement) {
if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+ return prepare_insert(input_buffer, statement);
- statement->type = STATEMENT_INSERT;
- int args_assigned = sscanf(
- input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
- statement->row_to_insert.username, statement->row_to_insert.email);
- if (args_assigned < 3) {
- return PREPARE_SYNTAX_ERROR;
- }
- return PREPARE_SUCCESS;
}
每當輸入到一個分隔符時(在我們的例子中就是空格),就在輸入緩衝(input buffer)上連續調用strtok(),把它分解成子字元串。它返回一個指向子字元串開始位置的指針。
我們可以在每個文本值上調用strlen(),看看它是否太長(strlen()函數,獲取字元串的長度)。
我們可以像處理其他錯誤碼一樣處理錯誤:
enum PrepareResult_t {
PREPARE_SUCCESS,
+ PREPARE_STRING_TOO_LONG,
PREPARE_SYNTAX_ERROR,
PREPARE_UNRECOGNIZED_STATEMENT
};
switch (prepare_statement(input_buffer, &statement)) {
case (PREPARE_SUCCESS):
break;
+ case (PREPARE_STRING_TOO_LONG):
+ printf("String is too long.\n");
+ continue;
case (PREPARE_SYNTAX_ERROR):
printf("Syntax error. Could not parse statement.\n");
continue;
這樣就能通過測試了。
bundle exec rspec
....
Finished in 0.02284 seconds (files took 0.116 seconds to load)
4 examples, 0 failures
到了這裡,我們不妨再多處理一個錯誤情況(id值插入一個負值):
it 'prints an error message if id is negative' do
script = [
"insert -1 cstack [email protected]",
"select",
".exit",
]
result = run_script(script)
expect(result).to match_array([
"db > ID must be positive.",
"db > Executed.",
"db > ",
])
end
enum PrepareResult_t {
PREPARE_SUCCESS,
+ PREPARE_NEGATIVE_ID,
PREPARE_STRING_TOO_LONG,
PREPARE_SYNTAX_ERROR,
PREPARE_UNRECOGNIZED_STATEMENT
@@ -148,9 +147,6 @@ PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
}
int id = atoi(id_string);
+ if (id < 0) {
+ return PREPARE_NEGATIVE_ID;
+ }
if (strlen(username) > COLUMN_USERNAME_SIZE) {
return PREPARE_STRING_TOO_LONG;
}
@@ -230,9 +226,6 @@ int main(int argc, char* argv[]) {
switch (prepare_statement(input_buffer, &statement)) {
case (PREPARE_SUCCESS):
break;
+ case (PREPARE_NEGATIVE_ID):
+ printf("ID must be positive.\n");
+ continue;
case (PREPARE_STRING_TOO_LONG):
printf("String is too long.\n");
continue;
好了,測試做的差不多了。接下來是非常重要的功能:持久化!我們要實現保存我們的資料庫到一個文件,再把它從文件中讀取出來。(目前它還在記憶體當中)
現在它越來越牛了。
下麵是和上一部分代碼,修改位置的對比:
@@ -22,6 +22,8 @@
enum PrepareResult_t {
PREPARE_SUCCESS,
+ PREPARE_NEGATIVE_ID,
+ PREPARE_STRING_TOO_LONG,
PREPARE_SYNTAX_ERROR,
PREPARE_UNRECOGNIZED_STATEMENT
};
@@ -34,8 +36,8 @@
#define COLUMN_EMAIL_SIZE 255
typedef struct {
uint32_t id;
- char username[COLUMN_USERNAME_SIZE];
- char email[COLUMN_EMAIL_SIZE];
+ char username[COLUMN_USERNAME_SIZE + 1];
+ char email[COLUMN_EMAIL_SIZE + 1];
} Row;
@@ -150,18 +152,40 @@ MetaCommandResult do_meta_command(InputBuffer* input_buffer, Table *table) {
}
}
-PrepareResult prepare_statement(InputBuffer* input_buffer,
- Statement* statement) {
- if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
statement->type = STATEMENT_INSERT;
- int args_assigned = sscanf(
- input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
- statement->row_to_insert.username, statement->row_to_insert.email
- );
- if (args_assigned < 3) {
+
+ char* keyword = strtok(input_buffer->buffer, " ");
+ char* id_string = strtok(NULL, " ");
+ char* username = strtok(NULL, " ");
+ char* email = strtok(NULL, " ");
+
+ if (id_string == NULL || username == NULL || email == NULL) {
return PREPARE_SYNTAX_ERROR;
}
+
+ int id = atoi(id_string);
+ if (id < 0) {
+ return PREPARE_NEGATIVE_ID;
+ }
+ if (strlen(username) > COLUMN_USERNAME_SIZE) {
+ return PREPARE_STRING_TOO_LONG;
+ }
+ if (strlen(email) > COLUMN_EMAIL_SIZE) {
+ return PREPARE_STRING_TOO_LONG;
+ }
+
+ statement->row_to_insert.id = id;
+ strcpy(statement->row_to_insert.username, username);
+ strcpy(statement->row_to_insert.email, email);
+
return PREPARE_SUCCESS;
+
+}
+PrepareResult prepare_statement(InputBuffer* input_buffer,
+ Statement* statement) {
+ if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+ return prepare_insert(input_buffer, statement);
}
if (strcmp(input_buffer->buffer, "select") == 0) {
statement->type = STATEMENT_SELECT;
@@ -223,6 +247,12 @@ int main(int argc, char* argv[]) {
switch (prepare_statement(input_buffer, &statement)) {
case (PREPARE_SUCCESS):
break;
+ case (PREPARE_NEGATIVE_ID):
+ printf("ID must be positive.\n");
+ continue;
+ case (PREPARE_STRING_TOO_LONG):
+ printf("String is too long.\n");
+ continue;
case (PREPARE_SYNTAX_ERROR):
printf("Syntax error. Could not parse statement.\n");
continue;
並且我們添加了一些測試:
+describe 'database' do
+ def run_script(commands)
+ raw_output = nil
+ IO.popen("./db", "r+") do |pipe|
+ commands.each do |command|
+ pipe.puts command
+ end
+
+ pipe.close_write
+
+ # Read entire output
+ raw_output = pipe.gets(nil)
+ end
+ raw_output.split("\n")
+ end
+
+ it 'inserts and retrieves a row' do
+ result = run_script([
+ "insert 1 user1 [email protected]",
+ "select",
+ ".exit",
+ ])
+ expect(result).to match_array([
+ "db > Executed.",
+ "db > (1, user1, [email protected])",
+ "Executed.",
+ "db > ",
+ ])
+ end
+
+ it 'prints error message when table is full' do
+ script = (1..1401).map do |i|
+ "insert #{i} user#{i} person#{i}@example.com"
+ end
+ script << ".exit"
+ result = run_script(script)
+ expect(result[-2]).to eq('db > Error: Table full.')
+ end
+
+ it 'allows inserting strings that are the maximum length' do
+ long_username = "a"*32
+ long_email = "a"*255
+ script = [
+ "insert 1 #{long_username} #{long_email}",
+ "select",
+ ".exit",
+ ]
+ result = run_script(script)
+ expect(result).to match_array([
+ "db > Executed.",
+ "db > (1, #{long_username}, #{long_email})",
+ "Executed.",
+ "db > ",
+ ])
+ end
+
+ it 'prints error message if strings are too long' do
+ long_username = "a"*33
+ long_email = "a"*256
+ script = [
+ "insert 1 #{long_username} #{long_email}",
+ "select",
+ ".exit",
+ ]
+ result = run_script(script)
+ expect(result).to match_array([
+ "db > String is too long.",
+ "db > Executed.",
+ "db > ",
+ ])
+ end
+
+ it 'prints an error message if id is negative' do
+ script = [
+ "insert -1 cstack [email protected]",
+ "select",
+ ".exit",
+ ]
+ result = run_script(script)
+ expect(result).to match_array([
+ "db > ID must be positive.",
+ "db > Executed.",
+ "db > ",
+ ])
+ end
+end
Enjoy GreatSQL