ftp://ftp.heise.de/pub/ct/listings/0717-182.zip
www.heise.de/ct soft-link 0717182
This zip contains a command-line tool, the C source, and the text file with the match data for about 40,000 names.
Overview of the program "gender" by Jörg MICHAEL
The program "gender.c" determines the gender of a given
first name.
List of files:
a) gen_ext.h (contains macros and prototypes; may be changed)
b) umlaut.h (contains lists of umlauts)
c) gender.c (this is the "workhorse" of the program)
d) nam_dict.txt (dictionary file containing first names)
The file "nam_dict.txt", which contains a list of first names, uses the
char set "iso8859-1".
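Because the dictionary is encoded in iso8859-1 rather than UTF-8, any other program that reads it has to decode it explicitly. The following is a minimal Python sketch of that step; it also drops a trailing '\r' so that DOS line endings do not matter. The function names are made up for illustration, and nothing here interprets the record layout of the file, only its raw lines.

```python
from typing import Iterator


def decode_dictionary_line(raw: bytes) -> str:
    """Decode one raw line of nam_dict.txt and drop its line ending."""
    # Strip '\n' and a possible DOS '\r' before decoding as iso8859-1.
    return raw.rstrip(b"\r\n").decode("iso-8859-1")


def read_dictionary_lines(path: str) -> Iterator[str]:
    """Yield the decoded lines of the dictionary file (hypothetical helper)."""
    with open(path, "rb") as handle:
        for raw in handle:
            yield decode_dictionary_line(raw)


if __name__ == "__main__":
    # b"\xf6" is 'ö' in iso8859-1.
    print(decode_dictionary_line(b"J\xf6rg\r\n"))
```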
If you want to use "gender.c" as a library, delete the line
"#define GENDER_EXECUTABLE" from the file "gen_ext.h".
========================================================================
The dictionary file "nam_dict.txt"
The program "gender.c" uses the dictionary file "nam_dict.txt" as a data
source. This file contains a list of more than 40,000 first names with
their gender, plus some 600 pairs of "equivalent" names.
This list should be able to cover the vast majority of first names
in all European countries and in some overseas countries (e.g. China,
India, Japan, U.S.A.) as well.
Also included in this file is information on the approximate frequency
of each name. The scale goes from 1 (=rare) to 13 (=extremely common).
The value 10 has been formatted to represent at least 2 percent of
the population. (The values 11 to 13 have been added last.)
The scale is logarithmic. For countries with very good statistics,
each step (down to frequency 2) represents a factor of 2.
For example, a frequency value of 7 means that the corresponding first
name has an absolute frequency in the range of 0.25 % to 0.5 %.
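The arithmetic behind the scale can be sketched in a few lines of Python. This is only an illustration derived from the two anchor points given above (10 means at least 2 %, and each step is a factor of 2); the function name is made up, and the upper bound returned for the value 10 is an extrapolation, not something the readme states. The values 1 and 11 to 13 are left out because the text does not define them precisely.

```python
def frequency_range_percent(freq: int) -> tuple[float, float]:
    """Return the (lower, upper) population share in percent for a frequency value.

    Assumes the factor-of-2 rule holds exactly, which the readme claims only
    for countries with very good statistics.
    """
    if not 2 <= freq <= 10:
        raise ValueError("only frequency values 2..10 are covered by this sketch")
    # Frequency 10 starts at 2 %; every step down halves the lower bound.
    lower = 2.0 * 2.0 ** (freq - 10)
    # The upper bound is one step up (an assumption for freq == 10).
    return lower, 2.0 * lower


if __name__ == "__main__":
    print(frequency_range_percent(7))  # (0.25, 0.5)
```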
The sorting order of the file "nam_dict.txt" is governed by the search
algorithm of the program "gender.c". Hence, names with "expandable"
umlauts can be found twice in this dictionary, first with sorting
according to "expanded" umlauts, and second with sorting according to
"compressed" umlauts (e.g. 'Ö' is sorted like "Oe" and 'O').
You don't have to reformat this file for use in a unix environment,
because the DOS linefeeds (trailing '\r') are ignored when the file
is read.
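To make the double listing of umlaut names more concrete, here is a small Python sketch that derives both sort keys for a name. The mapping tables are an assumption covering only the German umlauts; the actual lists live in "umlaut.h" and may contain more characters, and the function name is invented for this example.

```python
# Assumed mappings; the real tables are defined in umlaut.h.
EXPANDED = {"Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ä": "ae", "ö": "oe", "ü": "ue"}
COMPRESSED = {"Ä": "A", "Ö": "O", "Ü": "U", "ä": "a", "ö": "o", "ü": "u"}


def umlaut_sort_keys(name: str) -> tuple[str, str]:
    """Return the (expanded, compressed) spelling used for sorting a name."""
    expanded = "".join(EXPANDED.get(ch, ch) for ch in name)
    compressed = "".join(COMPRESSED.get(ch, ch) for ch in name)
    return expanded, compressed


if __name__ == "__main__":
    # A name with an umlaut yields two different keys, hence two dictionary entries.
    print(umlaut_sort_keys("Jörg"))  # ('Joerg', 'Jorg')
```

A name without umlauts produces two identical keys, which matches the statement that only names with "expandable" umlauts appear twice.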
========================================================================
A few words on quality of data
The dictionary of first names has been prepared with utmost care.
For example, the Turkish, Indian and Korean names in this dictionary
have all been independently classified by several native speakers.
I also took special care to list only those names which can currently
be found.
The lesson from this?
Any modifications should be done very cautiously (and they must also
adhere to the sorting required by the search algorithm).
For example, knowing that "Sascha" is a boy's name in Germany, the author
never assumed the English "Sasha" to be a girl's name.
Knowing that "Jan" is a boy's name in Germany, I never assumed it to be
also an English short form of "Janet". Another case in point is the name
"Esra". This is a boy's name in Germany, but a girl's name in Turkey.
Or consider the following first names:
  Ildikó       female Hungarian name
  Mitja        male Russian name
  Elizaveta    rare name; looks like misspelled "Elizabeta"
  Roelf        rare name; looks like German "Rolf" with an erroneous 'e'
  Borchert, Oltmann, Sievert, Hartmann
               look like common German surnames
The tool is released under the LGPL.
I have created a small GUI for the main functions of gender.exe.
The script should be stored in the same directory as gender.exe.

#NoTrayIcon
SetWorkingDir %A_ScriptDir%

;------------auto-execute----------------------------------------------------
; Both the executable and the dictionary must sit next to the script.
IfNotExist, gender.exe
{
    MsgBox, gender.exe not found!
    ExitApp
}
IfNotExist, nam_dict.txt
{
    MsgBox, nam_dict.txt not found!
    ExitApp
}

Gui, +Resize
Gui, Margin, 3, 3
Gui, Add, Tab, w284 h260 vMyTab, Get Gender|Check Nickname|List Names|Statistics

Gui, Tab, Get Gender
Gui, Font, S11
Gui, Add, Edit, x8 y30 R1 W270 vMyNameString
Gui, Font, S8
Gui, Add, Button, Default gCheckGender x8 y+5, &Check Gender
Gui, Add, Button, gCheckGenderTrace x+38, Check Gender (Display&Trace)
Gui, Font, S11
Gui, Add, Edit, R9 W270 x8 y+5 vMyResultField ReadOnly
Gui, Font, S8
Gui, Add, Checkbox, x8 y+5 vUseHotkey gActivateHotkey, Use Alt+G to check selected text for gender

Gui, Tab, Check Nickname
Gui, Font, S11
Gui, Add, Text, x8 y33, Name 1:
Gui, Add, Edit, x58 y30 R1 W220 vMyNickAString
Gui, Add, Text, x8 y63, Name 2:
Gui, Add, Edit, x58 y60 R1 W220 vMyNickBString
Gui, Font, S8
Gui, Add, Button, gCheckNick x59 y90, Check, if two first &Names are "equivalent"
Gui, Font, S11
Gui, Add, Edit, R8 W270 x8 y+5 vMyNickResultField ReadOnly

Gui, Tab, List Names
Gui, Add, Text, x8 y33, Country :
Gui, Add, Edit, x60 y30 R1 W218 vMyCountryString
Gui, Font, S8
Gui, Add, Button, gListNames x61 y60, &List all names of the given country.
Gui, Font, S11
Gui, Add, Edit, R10 W270 x8 y+5 vMyCountryResultField ReadOnly

Gui, Tab, Statistics
Gui, Font, S8
Gui, Add, Button, gShowStats x8 y33, &Show statistics
Gui, Font, S11
Gui, Add, Edit, R11 W270 x8 y+7 vMyStatResultField ReadOnly

Gui, Show, , Gender Verification
return
;--------------End-auto-execute----------------------------------------------

;--------------gender.exe related--------------------------------------------
; Each routine runs gender.exe hidden, redirects its output to RESULT.TXT,
; shows the file content in the matching result field and deletes the file.
CheckGender:
    Gui, Submit, Nohide
    Gui +Disabled
    Gui, Flash
    StringLeft, MyNameString, MyNameString, 100   ; limit input to 100 characters
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-get_gender" "%MyNameString% " >"RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyResultField, %MyResult%
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
return

CheckGenderTrace:
    Gui, Submit, Nohide
    Gui +Disabled
    Gui, Flash
    StringLeft, MyNameString, MyNameString, 100
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-get_gender" "%MyNameString% " "-trace" >"RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyResultField, %MyResult%
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
return

CheckSelectedforGender:
    ; Copy the current selection, keeping the previous clipboard content for later.
    ClipSaved := ClipboardAll
    Send ^c
    ClipWait, 4
    if ErrorLevel
    {
        GuiControl, , MyResultField, The attempt to copy text onto the clipboard failed.
        return
    }
    ; Only the first line of the selection is used.
    Loop, parse, Clipboard, `n, `r  ; Specifying `n prior to `r allows both Windows and Unix files to be parsed.
    {
        MyNameString := A_LoopField
        break
    }
    StringLeft, MyNameString, MyNameString, 100
    Gui +Disabled
    Gui, Flash
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-get_gender" "%MyNameString% " >"RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyNameString, %MyNameString%
        GuiControl, , MyResultField, %MyResult%
        GuiControl, , MyNickAString, %MyNameString%
        ToolTip, %MyResult%
        SetTimer, RemoveToolTip, 5000
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
    Clipboard := ClipSaved
    ClipSaved =
return

RemoveToolTip:
    SetTimer, RemoveToolTip, Off
    ToolTip
return

CheckNick:
    Gui +Disabled
    Gui, Flash
    Gui, Submit, Nohide
    ; Truncate both inputs in place (the original wrote both results to MyNameString,
    ; so the limit never reached the command line).
    StringLeft, MyNickAString, MyNickAString, 100
    StringLeft, MyNickBString, MyNickBString, 100
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-check_nickname" "%MyNickAString% " "%MyNickBString% " >"RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyNickResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyNickResultField, %MyResult%
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
return

ListNames:
    Gui, Submit, Nohide
    Gui +Disabled
    Gui, Flash
    ; Truncate in place (the original stored the result in the unused MyNameString).
    StringLeft, MyCountryString, MyCountryString, 100
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-print_names_of_country" "%MyCountryString%" "RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyCountryResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyCountryResultField, %MyResult%
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
return

ShowStats:
    Gui +Disabled
    Gui, Flash
    RunWait, %comspec% /c ""%A_WorkingDir%\gender.exe" "-statistics" >"RESULT.TXT"", , Hide UseErrorlevel
    if ErrorLevel = ERROR
        GuiControl, , MyStatResultField, Calling gender.exe produced an error!
    else
    {
        FileRead, MyResult, Result.txt
        GuiControl, , MyStatResultField, %MyResult%
    }
    Gui -Disabled
    Gui, Flash
    Gui, Flash, Off
    FileDelete, Result.txt
return
;---------------End-gender.exe related----------------------------------------

;---------------Hotkey related------------------------------------------------
ActivateHotkey:
    Gui, Submit, Nohide
    if (UseHotkey = 1)
    {
        Hotkey, !g, CheckSelectedforGender, ON
    }
    if (UseHotkey = 0)
    {
        Hotkey, !g, CheckSelectedforGender, OFF
    }
return
;---------------End-Hotkey related--------------------------------------------

;---------------Gui related---------------------------------------------------
GuiSize:
    ; Enforce a minimum window size unless the window was minimized.
    if (A_EventInfo != 1)
    {
        if (A_GuiWidth < 290)
            Gui, Show, w290
        if (A_GuiHeight < 260)
            Gui, Show, h260
    }
    GuiControl, Move, MyTab, % "w" A_GuiWidth-6 " h" A_GuiHeight-6
    GuiControl, Move, MyResultField, % "w" A_GuiWidth-15 " h" A_GuiHeight-115
    GuiControl, Move, UseHotkey, % "y" A_GuiHeight-25
    GuiControl, Move, MyNickResultField, % "w" A_GuiWidth-15 " h" A_GuiHeight-130
    GuiControl, Move, MyCountryResultField, % "w" A_GuiWidth-15 " h" A_GuiHeight-100
    GuiControl, Move, MyStatResultField, % "w" A_GuiWidth-15 " h" A_GuiHeight-75
return

GuiClose:
    ExitApp
;---------------End-Gui related-----------------------------------------------
The exe and the ahk file can be downloaded here: https://ahknet.autoh... ... ender2.zip
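For anyone who wants to call gender.exe from something other than AutoHotkey, the same command line can be built in Python. This is only a rough sketch: the options "-get_gender" and "-trace" are the ones used in the script above, while the function names, the default path "gender.exe" and the assumption that the output is iso8859-1 text are mine. It reads the program's standard output directly instead of going through RESULT.TXT.

```python
import subprocess


def build_gender_command(name: str, exe: str = "gender.exe", trace: bool = False) -> list[str]:
    """Build the argument list for a gender lookup (hypothetical helper)."""
    # Like the script above, limit the name to 100 characters.
    command = [exe, "-get_gender", name[:100]]
    if trace:
        command.append("-trace")
    return command


def get_gender(name: str, exe: str = "gender.exe", trace: bool = False) -> str:
    """Run gender.exe and return whatever it prints to standard output."""
    completed = subprocess.run(
        build_gender_command(name, exe, trace),
        capture_output=True,
        text=True,
        encoding="iso-8859-1",  # assumed output encoding
        check=False,
    )
    return completed.stdout


if __name__ == "__main__":
    # Only prints the command; running it requires gender.exe and nam_dict.txt.
    print(build_gender_command("Jörg", trace=True))
```

Passing the arguments as a list avoids the nested quoting that the %comspec% call needs.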