This tutorial will show you how to load and save CSV files with multi-line fields. It is divided in three sections:
- Reading and parsing a CSV file with multi-line fields (this post)
- Control fields order with the function ObjCSV_CSV2Collection
- Converting to a single-line CSV file
In most comma-separated-values (CSV) files, each record stands on its own line, with each field separated by a comma. For example:
Title,Album,Track She Loves You [Mono],"Past Masters, Vol. 1",4 Eight Days A Week,Beatles For Sale,8 Ticket To Ride,Help!,7 Strawberry Fields Forever,Magical Mystery Tour,8 Get Back,Let It Be,12
But if we add a fourth field with the lyrics of these songs, then we have a multi-line field:
Title,Album,Track,Lyrics She Loves You [Mono],"Past Masters, Vol. 1",4,"She loves you, yeah, yeah, yeah She loves you, yeah, yeah, yeah She loves you, yeah, yeah, yeah, yeah" Eight Days A Week,Beatles For Sale,8,"Eight days a week I love you. Eight days a week Is not enough to show I care." Ticket To Ride,Help!,7,"She's got a ticket to ri-hide, She's got a ticket to ri-hi-hide, She's got a ticket to ride, But she don't care." Strawberry Fields Forever,Magical Mystery Tour,8,"Let me take you down, 'cos I'm going to Strawberry Fields. Nothing is real and nothing to get hungabout. Strawberry Fields forever." Get Back,Let It Be,12,"Get back, get back. Get back to where you once belonged Get back, get back. Get back to where you once belonged."
This is a valid CSV file: fields including line breaks are enclosed in double-quotes. And this is not unusual. For example, you will find multi-line fields in exports from Outlook or Google contacts where the "Notes" field includes line breaks.
But if we try to import these files in MS Excel, we have a problem because each end-of-line is interpreted at the end of a record. In that case, the line "She loves you, yeah, yeah, yeah" in the middle of the lyrics is interpreted as a record with four fields separated by the commas.
How can AutoHotKey help us? Let’s try with the "Loop, Parse, CSV" command. We would have something like this:
#NoEnv #SingleInstance force Loop, Read, %A_ScriptDir%\TheBeatles-Lyrics.txt if (A_Index = 1) MsgBox, Header: %A_LoopReadLine% else Loop, Parse, A_LoopReadLine, CSV MsgBox, Field #%A_Index%:`n`n%A_LoopField% return
Unfortunately, this will fall into the same trap as Excel. Line breaks between double-quotes are not considered as part of the fields as they should be. They are again considered as end of records.
There is a solution: replace all line breaks found between double-quotes with a temporary character and replace back this character with line breaks after a record has been parsed. This is easily said but this not a trivial task.
This is one of the things the command "ObjCSV_CSV2Collection" from the ObjCSV library will do for you! Let’s try this code:
NOTE: Before running this script...
- Make sure the ObjCSV library is copied in one of these directories:
%A_ScriptDir%\Lib\ (folder Lib under the current script’s folder)
%A_MyDocuments%\AutoHotkey\Lib\ (global AHK’s Lib folder)
[path to the currently running AutoHotkey_L.exe]\Lib\ - and copy the file TheBeatles-Lyrics.txt in the same directory as the script.
#NoEnv #SingleInstance force objCollection := ObjCSV_CSV2Collection(A_ScriptDir . "\TheBeatles-Lyrics.txt", strFileHeader, 1, 1) loop, % objCollection.MaxIndex() for strFieldName, strFieldValue in objCollection[A_Index] MsgBox, %strFieldName%:`n`n%strFieldValue% return