CSVtoDict v2

Post your working scripts, libraries and tools.
arcylix
Posts: 54
Joined: 27 Sep 2022, 14:43

CSVtoDict v2

Post by arcylix » 10 Oct 2023, 17:44

Updated code thanks to @flyingDman

I went searching for a way to convert CSV to an Associative array, and I found old code in a couple places, but they were for AHK v1. So, because I needed this for work, I went about trying to recreate it for v2. The original code can be found here:

read CSV to Associative Array

What this does is allow you to turn a CSV file into an associative array in AHKv2 Enjoy!

Code: Select all

CSVprs(str)								;creates an array of the elements of a CSV string
	{
	arr := []
	Loop Parse, str, "CSV"
	        arr.Push(A_LoopField)
	return arr
	}

CSVtoDict(file)
{
    array   := Map()
    data 	:= StrSplit(FileRead(file), "`n", "`r")
    hdr 	:= CSVprs(data.RemoveAt(1))				;reads the 1st line into an array and deletes it from the data array. Remove this line if your data does not have Headers.
    
    for x,y in data
		{
		    array[x] := Map()
		    for k,v in CSVprs(y)
			    array[x][hdr[k]] := v				;change [hdr[k]] to just [k] if no headers
		}
    Return array
}
For an example:

Code: Select all

/*
EXAMPLE DATA(SAVE THIS AS A CSV):
   accountId, dealerName, dealerAddress, POCName, POCTN, POCEmail, DPMName, DPMEmail, DPMTN, accountType, DualPartner, engagement, advertising, 3pt, SEO, CustomHUD, MPRPrefType, Statisfaction, Group, Lat, Long, TimeZone, OEM, Contacts, Notes
"coolestdealerever", "Coolest Car Dealer Ever", "2300 marsh ln,dallas NM 75085", "1234567890", "Nathan Northcut", "1234567890", "noreply@noreply.com", "Courtney Douglas", "noreply@noreply.com", "1234567890", "Lincoln", , "Engaged", "Mix", "No", "No", "none", "Either", "Satisfied", "Retail", "29", "33", "America/Chicago", "Lincoln", "Lance Willis|noreply@noreply.com|,secondContact|noreply@noreply.com|1234567891", "this is just a set of notes here!"
*/
   csvFilePath := FileSelect(1, , "Select the CSV file to import your accounts from", "*.csv")
   
if (csvFilePath != "") {
   accountArray := CSVtoDict(csvFilePath)
   MsgBox accountArray[1]["accountId"] . " | " . accountArray[1]["dealerName"]		;If no headers, then return the indices instead (e.g. accountArray[1][1], accountArray[1][2])
}
Last edited by arcylix on 11 Oct 2023, 22:04, edited 2 times in total.

CJeffrey
Posts: 1
Joined: 11 Oct 2023, 14:53

Re: CSVtoDict v2

Post by CJeffrey » 11 Oct 2023, 15:16

Exactly what I needed to find today! I've been trying to wrap my head around some of the v2 ways of doing things and this helps. Thanks for sharing!

The post that I had referenced to do this kinda thing in v1 is in Create an array from a CSV-file, which had the added functionality of using a key (i.e. the value of the first column) as a reference instead of just an index. I had trouble converting it to v2, though, and what you've shared works perfectly for my purposes.

User avatar
10basetom
Posts: 9
Joined: 10 Aug 2022, 08:06

Re: CSVtoDict v2

Post by 10basetom » 19 May 2024, 02:17

Hi, how would you handle cells that contain multi-line values? Here's a sample CSV I saved from Excel:

```
Test 1,"1
1
1"
Test 2,"2
2
2"
Test 3,"3
3
3"
```

User avatar
boiler
Posts: 17404
Joined: 21 Dec 2014, 02:44

Re: CSVtoDict v2

Post by boiler » 19 May 2024, 02:38

That is more complicated because you can't simply split strings on line breaks. You would have to write a function that doesn't split on line breaks if they are between a pair of quotation marks.

WKen
Posts: 204
Joined: 21 Feb 2023, 00:01

Re: CSVtoDict v2

Post by WKen » 19 May 2024, 03:13

I tried this script a few days ago and I ended up using Python's csv library. It would be great if someone could write a csv library similar to Python.

User avatar
boiler
Posts: 17404
Joined: 21 Dec 2014, 02:44

Re: CSVtoDict v2

Post by boiler » 19 May 2024, 03:50

This is a first cut at something that splits the lines appropriately and produces the desired output when quoted fields contain line breaks. Caveats are that it doesn't handle cases where the CSV file contains escaped quote characters (although that would be an easy modification), and it wouldn't be very fast on larger files.

Code: Select all

CSVprs(str)								;creates an array of the elements of a CSV string
	{
	arr := []
	Loop Parse, str, "CSV"
	        arr.Push(A_LoopField)
	return arr
	}

CSVtoDict(file)
{
    array   := Map()
    data 	:= CsvSplit(FileRead(file))					;splits into an array using a function that doesn't split when a line break is inside a quoted field
    hdr 	:= CSVprs(data.RemoveAt(1))				;reads the 1st line into an array and deletes it from the data array. Remove this line if your data does not have Headers.
    
    for x,y in data
		{
		    array[x] := Map()
		    for k,v in CSVprs(y)
			    array[x][hdr[k]] := v				;change [hdr[k]] to just [k] if no headers
		}
    Return array
}

CsvSplit(str) {
	inQuote := 0
	out := ''
	str := StrReplace(str, '`r')
	loop parse, str {
		if inQuote {
			if (A_LoopField = '"')
				inQuote := 0
			out .= A_LoopField 
		} else {
			if (A_LoopField = '"')
				inQuote := 1
			out .= (A_LoopField = '`n' ? '^^^br^^^' : A_LoopField)
		}
	}
	return StrSplit(out, '^^^br^^^')
}

just me
Posts: 9576
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: CSVtoDict v2

Post by just me » 19 May 2024, 06:35

@10basetom,

try to change the original CSVtoDict() function to:

Code: Select all

CSVtoDict(file)
{
    array   := Map()
    data 	:= StrSplit(FileRead(file), "`r`n") ; <<<<<<<<<<
    ...

User avatar
10basetom
Posts: 9
Joined: 10 Aug 2022, 08:06

Re: CSVtoDict v2

Post by 10basetom » 19 May 2024, 19:43

just me wrote:
19 May 2024, 06:35
@10basetom,

try to change the original CSVtoDict() function to:

Code: Select all

CSVtoDict(file)
{
    array   := Map()
    data 	:= StrSplit(FileRead(file), "`r`n") ; <<<<<<<<<<
    ...
Thanks for this. However, none of the solutions worked in all cases for me because the tool that I'm working on has some special requirements for CSV parsing:

  1. Users could save data from Excel with multi-line quoted values, so the file can have \n and \r\n line endings.
  2. Users could edit the CSV file from a text editor, so the file can have all \n (Linux, Mac) or \r\n (Windows) line endings.
I ended up writing my own function based on regex that I'm happy with. It converts the CSV data into an array instead of dictionary, but that was enough for me to do what I wanted (populate ListView components), so it's all good. It's surprisingly fast, too (4-5x faster than parsing to dictionary with 3k rows). If anyone is interested I'll post the test script here.

User avatar
boiler
Posts: 17404
Joined: 21 Dec 2014, 02:44

Re: CSVtoDict v2

Post by boiler » 19 May 2024, 23:00

10basetom wrote: …none of the solutions worked in all cases for me because the tool that I'm working on has some special requirements for CSV parsing:

  1. Users could save data from Excel with multi-line quoted values, so the file can have \n and \r\n line endings.
  2. Users could edit the CSV file from a text editor, so the file can have all \n (Linux, Mac) or \r\n (Windows) line endings.
Not that it’s important, but the approach I posted should have handled it no matter whether there was \n or \r\n or a mix of them in the file. The first thing it does is remove all the \r characters, so all that’s left are \n characters, and the rows are only split based on the ones not between a pair of quotes (i.e., within a cell).

just me
Posts: 9576
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: CSVtoDict v2

Post by just me » 20 May 2024, 02:43

10basetom wrote:
  1. Users could save data from Excel with multi-line quoted values, so the file can have \n and \r\n line endings.
  2. Users could edit the CSV file from a text editor, so the file can have all \n (Linux, Mac) or \r\n (Windows) line endings.
I ended up writing my own function based on regex that I'm happy with. It converts the CSV data into an array instead of dictionary, but that was enough for me to do what I wanted (populate ListView components), so it's all good. It's surprisingly fast, too (4-5x faster than parsing to dictionary with 3k rows). If anyone is interested I'll post the test script here.
I think that valid CSV files never use `r`n as line ending as well as line delimiter within multi-line fields. The same applies to `n. If so, it should be enough to check the file contents for `r*n and use it as line ending if it exists, otherwise `n.

Code: Select all

CSVtoDict(file)
{
    array   := Map()
    data 	:= FileRead(file)
    data 	:= StrSplit(data, InStr(data, "`r`n") ? "`r`n" : "`n") 
    ...
}

User avatar
10basetom
Posts: 9
Joined: 10 Aug 2022, 08:06

Re: CSVtoDict v2

Post by 10basetom » 20 May 2024, 11:06

just me wrote:
20 May 2024, 02:43
I think that valid CSV files never use `r`n as line ending as well as line delimiter within multi-line fields. The same applies to `n. If so, it should be enough to check the file contents for `r*n and use it as line ending if it exists, otherwise `n.

Code: Select all

CSVtoDict(file)
{
    array   := Map()
    data 	:= FileRead(file)
    data 	:= StrSplit(data, InStr(data, "`r`n") ? "`r`n" : "`n") 
    ...
}
I don't know about what's considered valid vs. invalid, I just simulated what the end user would do and it generated three different files, including one that you mentioned is invalid. When I get back to my PC I'll attach the test case where the parsing to dictionary fails one test (maybe it's a simple code tweak to fix it).

User avatar
10basetom
Posts: 9
Joined: 10 Aug 2022, 08:06

Re: CSVtoDict v2

Post by 10basetom » 20 May 2024, 15:52

Here are the three files that I saved without doing anything special:

- test_CRLF+LF.csv (saved in Excel)
- test_CRLF.csv (saved in Windows Notepad)
- test_LF.csv (saved in Windows VS Code set to LF line endings)

Since I can't control how my users are going to edit the CSV, the tool needs to work seamlessly with all three files or I'll get a flood of service tickets. 😅

The function to save to dictionary was able to properly parse all the files except for test_CRLF.csv.
Attachments
tests.zip
(683 Bytes) Downloaded 29 times

just me
Posts: 9576
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: CSVtoDict v2

Post by just me » 21 May 2024, 05:06

Ok, if you want/have to allow any user to edit CSV files using any editor on any system without defining strict rules, good luck! ;)

Post Reply

Return to “Scripts and Functions (v2)”