Affects Version/s: 11.4, 12.5, 19.0, 20.0
Component/s: Site Info
Depending on the OS and text editor used, a CSV file may be saved with or without a BOM (byte order mark): data at the very start of the file that precedes the user-entered text and that the user cannot see. If a file containing a BOM is used in the bulk upload for creating groups, the results can be unexpected and wrong.
Take, for example, the two CSV files attached to this ticket (TestGroup.csv and TestGroupWithoutBom.csv). If you open these files in a text editor, they appear identical (both create one group, G1, with several users). However, if you inspect the files on the command line, you'll notice a subtle difference:
You can clearly see that the TestGroup.csv file has some strange data preceding the first occurrence of "G1". Looking at the file in a text editor or word processor, however, you would be oblivious to this difference.
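The same check can be done programmatically. The following is a minimal sketch, assuming the BOM in question is the UTF-8 one (bytes EF BB BF) and using a made-up row `G1,user1` in place of the attached files' real contents; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BomCheck {
    // UTF-8 encoding of U+FEFF, the byte order mark (assumed encoding).
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the data begins with the three UTF-8 BOM bytes.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && Arrays.equals(Arrays.copyOfRange(data, 0, 3), UTF8_BOM);
    }

    // Hex dump of the first `count` bytes, similar to what a command-line hex tool shows.
    static String hexPrefix(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; the real attachments are not reproduced here.
        byte[] withBom = "\uFEFFG1,user1".getBytes(StandardCharsets.UTF_8);
        byte[] withoutBom = "G1,user1".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexPrefix(withBom, 5) + "  bom=" + startsWithUtf8Bom(withBom));
        System.out.println(hexPrefix(withoutBom, 5) + "  bom=" + startsWithUtf8Bom(withoutBom));
    }
}
```

In this sketch the first sample starts with `ef bb bf` before the bytes for "G1" (`47 31`), while the second starts directly with `47 31`.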
The result of uploading these two seemingly identical files is that TestGroup.csv produces two groups with the identical title "G1": the first group contains the first user in the file, and the second contains the remainder. A user in this situation would be very confused by the results:
Uploading TestGroupWithoutBom.csv works as expected, because it does not contain the BOM at the beginning of the file.
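One plausible explanation for the split is that the BOM becomes part of the first title, so the first row's title no longer equals "G1" even though it renders the same. The sketch below illustrates this under assumptions: the row layout (`title,user`), the user names, and the grouping logic are all hypothetical and are not taken from the actual import code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByTitle {
    // Groups users by the title in the first column (hypothetical importer logic).
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : lines) {
            String[] cols = line.split(",", 2);
            groups.computeIfAbsent(cols[0], k -> new ArrayList<>()).add(cols[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // The first line carries a leading U+FEFF, as a decoded BOM would.
        List<String> withBom = Arrays.asList("\uFEFFG1,user1", "G1,user2", "G1,user3");
        List<String> withoutBom = Arrays.asList("G1,user1", "G1,user2", "G1,user3");
        System.out.println("with BOM: " + group(withBom).size() + " groups");
        System.out.println("without BOM: " + group(withoutBom).size() + " groups");
    }
}
```

With the BOM present, the map holds two keys that look alike on screen: one containing only the first user and one containing the rest, which matches the behaviour described above.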
To resolve the issue for files with a BOM while maintaining compatibility with files without one, we simply need to use Apache Commons IO's BOMInputStream when reading in the CSV file.
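For illustration, here is what BOM skipping amounts to, written with only the JDK so it can run without extra dependencies. This is a sketch of the idea, not the Commons IO implementation and not the actual patch: it assumes a UTF-8 BOM, and the class and method names are hypothetical. In the real fix, wrapping the upload's input stream in BOMInputStream would take the place of `skipUtf8Bom`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomSkipper {
    // Returns a stream positioned after a leading UTF-8 BOM, if one is present.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break; // stream is shorter than three bytes
            }
            n += r;
        }
        boolean isBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!isBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pushback;
    }

    // Convenience helper: decodes the bytes as UTF-8 with any leading BOM removed.
    static String decodeWithoutBom(byte[] data) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int r;
            while ((r = in.read(buf)) != -1) {
                out.write(buf, 0, r);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = "\uFEFFG1,user1".getBytes(StandardCharsets.UTF_8);
        byte[] withoutBom = "G1,user1".getBytes(StandardCharsets.UTF_8);
        // Both inputs decode to the same text once the BOM is skipped.
        System.out.println(decodeWithoutBom(withBom).equals(decodeWithoutBom(withoutBom)));
    }
}
```

Because non-BOM bytes are pushed back unchanged, files without a BOM are read exactly as before, which is the compatibility property the fix needs.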