[PATCH 1/1] csv: detect end of csv file correctly

Wed Sep 1 17:24:26 BST 2021

On 9/1/21 2:29 PM, Heinrich Schuchardt wrote:
> On 8/31/21 11:43 PM, Heinrich Schuchardt wrote:
>> Am 31. August 2021 22:18:34 MESZ schrieb Peter Jones <pjones at redhat.com>:
>>> On Sun, Aug 29, 2021 at 09:12:47PM +0200, Heinrich Schuchardt wrote:
>>>> Shim produces random failures like
>>>>
>>>>      sbat.c:46:parse_sbat_section() row->n_columns:0 
>>>> SBAT_SECTION_COLUMNS:6
>>>>      Could not parse .sbat section data: Invalid Parameter
>>>>
>>>> Adding debug statements shows output like
>>>>
>>>>      csv.c:60:parse_csv_data() sbat[0x0]:
>>>>      sbat,1,SBAT 
>>>> Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
>>>>      shim,1,UEFI shim,shim,1,https://github.com/rhboot/shim
>>>>
>>>>      csv.c:16:parse_csv_line() line:
>>>>      sbat,1,SBAT 
>>>> Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
>>>>      csv.c:35:parse_csv_line() n_columns 6
>>>>      csv.c:16:parse_csv_line() line:
>>>>      shim,1,UEFI shim,shim,1,https://github.com/rhboot/shim
>>>>      csv.c:35:parse_csv_line() n_columns 6
>>>>      csv.c:16:parse_csv_line() line:
>>>>
>>>>      csv.c:35:parse_csv_line() n_columns 0
>>>>      csv.c:16:parse_csv_line() line:
>>>>
>>>>      csv.c:35:parse_csv_line() n_columns 0
>>>>      csv.c:16:parse_csv_line() line:
>>>
>>> I agree with your patch and with the summary below, but the above could
>>> use some improvement.  Under what conditions does it fail?  What makes
>>> you categorize the failures as "random"?
>>
>> When rebooting my board it sometimes failed and sometimes not while 
>> using the same images.
>>
>>   What do we need to have in the
>>> .sbat section to make it fail this way?
> 
> objcopy sets the field VirtualSize of the .sbat section header to 0x83 
> for our sbat.csv.
> 
> When U-Boot or EDK II load an EFI binary they copy section by section to 
> memory according to the following rules:
> 
> Copy the minimum of VirtualSize and RawSize bytes to memory.
> Zero out remaining size up to VirtualsSize.
> 
> In handle_sbat() we call parse_sbat_section() with SizeOfRawData as section 
> size. So we have SizeOfRawData - VirtualSize random bytes passed to 
> parse_csv_data(). If one of these is 0x0a or 0x0d, we interpret this as 
> a valid CSV line ending.

Using the information from the LOADED_IMAGE_PROTOCOL we have to read the 
shim binary and parse the section header of the .sbat section to get the 
VirtualSize. This is the relevant size that should be used to call 
parse_csv_data().

We already have similar coding in handle_image().

Best regards

Heinrich

> 
>>
>> It does not depend on the content of .sbat which was always the same 
>> but on the undefined content of the memory following the loaded image.
>>
>> As only 0x0d and 0x0a are recognized as line end and 0x00 is not, a 
>> "line" starting with 0x00  may end in nirvana after .sbat wherever 
>> 0x0d or 0x0a is hit.
>>
>> Adding 0x00 to the list of line endings could improve the patch.
>>
>>   What could we add to test-csv.c
>>> or test-sbat.c to make it reproduce the and thus verify that it is in
>>> fact fixed and stays that way?
>>
>> We need to define an appropriate array of bytes after the CSV file.
>>
>> Thanks for reviewing.
>>
>> I will start working on the test cases.
>>
>> Best regards
>>
>> Heinrich
>>
>>>
>>>> In parse_csv_data() variable 'line' is a pointer with a value between
>>>> the values of 'data' and 'data_end'.
>>>>
>>>> There is no reason to check that that it is non-zero after assigning it
>>>> from 'data' as we already check 'data'.
>>>>
>>>> Instead after each line we must check that we have not reached the 
>>>> end of
>>>> the file marked by a '\0' character.
>>>>
>>>> Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt at canonical.com>
>>>> ---
>>>>   csv.c | 4 ++--
>>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/csv.c b/csv.c
>>>> index f6b37f1..b716632 100644
>>>> --- a/csv.c
>>>> +++ b/csv.c
>>>> @@ -63,10 +63,10 @@ parse_csv_data(char *data, char *data_end, 
>>>> size_t n_columns, list_t *list)
>>>>       max = (uintptr_t)end - (uintptr_t)line + (end > line ? 1 : 0);
>>>> -    if (line && is_utf8_bom(line, max))
>>>> +    if (is_utf8_bom(line, max))
>>>>           line += UTF8_BOM_SIZE;
>>>> -    while (line && line <= data_end) {
>>>> +    while (line <= data_end && *line) {
>>>>           size_t entrysz = sizeof(char *) * n_columns + 
>>>> sizeof(struct csv_row);
>>>>           struct csv_row *entry;
>>>>           size_t m_columns = n_columns;
>>>> -- 
>>>> 2.30.2
>>>>
>>>
>>
>