[PATCH 1/1] csv: detect end of csv file correctly

Tue Aug 31 22:43:10 BST 2021

Am 31. August 2021 22:18:34 MESZ schrieb Peter Jones <pjones at redhat.com>:
>On Sun, Aug 29, 2021 at 09:12:47PM +0200, Heinrich Schuchardt wrote:
>> Shim produces random failures like
>> 
>>     sbat.c:46:parse_sbat_section() row->n_columns:0 SBAT_SECTION_COLUMNS:6
>>     Could not parse .sbat section data: Invalid Parameter
>> 
>> Adding debug statements shows output like
>> 
>>     csv.c:60:parse_csv_data() sbat[0x0]:
>>     sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
>>     shim,1,UEFI shim,shim,1,https://github.com/rhboot/shim
>> 
>>     csv.c:16:parse_csv_line() line:
>>     sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
>>     csv.c:35:parse_csv_line() n_columns 6
>>     csv.c:16:parse_csv_line() line:
>>     shim,1,UEFI shim,shim,1,https://github.com/rhboot/shim
>>     csv.c:35:parse_csv_line() n_columns 6
>>     csv.c:16:parse_csv_line() line:
>> 
>>     csv.c:35:parse_csv_line() n_columns 0
>>     csv.c:16:parse_csv_line() line:
>> 
>>     csv.c:35:parse_csv_line() n_columns 0
>>     csv.c:16:parse_csv_line() line:
>
>I agree with your patch and with the summary below, but the above could
>use some improvement.  Under what conditions does it fail?  What makes
>you categorize the failures as "random"? 

When rebooting my board it sometimes failed and sometimes not while using the same images.

 What do we need to have in the
>.sbat section to make it fail this way? 

It does not depend on the content of .sbat which was always the same but on the undefined content of the memory following the loaded image.

As only 0x0d and 0x0a are recognized as line end and 0x00 is not, a "line" starting with 0x00  may end in nirvana after .sbat wherever 0x0d or 0x0a is hit.

Adding 0x00 to the list of line endings could improve the patch.

 What could we add to test-csv.c
>or test-sbat.c to make it reproduce the and thus verify that it is in
>fact fixed and stays that way?

We need to define an appropriate array of bytes after the CSV file.

Thanks for reviewing.

I will start working on the test cases.

Best regards

Heinrich

>
>> In parse_csv_data() variable 'line' is a pointer with a value between
>> the values of 'data' and 'data_end'.
>> 
>> There is no reason to check that that it is non-zero after assigning it
>> from 'data' as we already check 'data'.
>> 
>> Instead after each line we must check that we have not reached the end of
>> the file marked by a '\0' character.
>> 
>> Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt at canonical.com>
>> ---
>>  csv.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/csv.c b/csv.c
>> index f6b37f1..b716632 100644
>> --- a/csv.c
>> +++ b/csv.c
>> @@ -63,10 +63,10 @@ parse_csv_data(char *data, char *data_end, size_t n_columns, list_t *list)
>>  
>>  	max = (uintptr_t)end - (uintptr_t)line + (end > line ? 1 : 0);
>>  
>> -	if (line && is_utf8_bom(line, max))
>> +	if (is_utf8_bom(line, max))
>>  		line += UTF8_BOM_SIZE;
>>  
>> -	while (line && line <= data_end) {
>> +	while (line <= data_end && *line) {
>>  		size_t entrysz = sizeof(char *) * n_columns + sizeof(struct csv_row);
>>  		struct csv_row *entry;
>>  		size_t m_columns = n_columns;
>> -- 
>> 2.30.2
>> 
>