The 32-bit binary of Basis Technology's UNICONV utility, which converts between most East Asian code-sets and Unicode.
uniconv.inf (2633 bytes) Some brief explanation from the "help" printouts.
[uniconv.txt] (7037 bytes) The words from the "help" screen.
[uniconv.zip] (726772 bytes) The Windows NT/95/98 binary and DLL.
uniconv_old.exe (835072 bytes) The previous Windows version (no DLL)
The Sun (Solaris 2.5), HPUX and Macintosh binaries are available too.
Uniconv is a command line utility that uses the Basis Technology C++
Library for Unicode for converting text between encodings and optionally
applying transforms to it.
Uniconv converts a text file written in one of its accepted encodings (see
the quick reference below) to another of those encodings. It uses a
command line interface, with the following usage:
uniconv [options] input-encoding input-file output-encoding output-file [property | transform]*
uniconv
Name of the program to run.
input-encoding
The encoding of the input file. The encoding name must be
spelled as in the quick reference below.
input-file
The name of the file to be converted (if it is in the current
directory), or its path and file name (if it is not).
output-encoding
The desired encoding of the output file. The encoding name must
be spelled as in the quick reference below.
output-file
The name of the file to be created in the new encoding (if it
goes in the current directory), or its path and file name (if it
does not).
property
Returns a true or false value for each character. A property is
associated with the transform that follows it. Properties not
followed by a transform are ignored. Multiple property-transform
pairs are OK. Multiple properties per transform are also OK. See
Character Properties for more information about how to use
properties, and see below for a quick reference of the
properties available.
transform
Changes a property value for designated characters in a file.
Multiple transforms are OK. See Transforms for more information
about how to use transforms, and see below for a quick reference
of the transforms available.
Use these flags at the beginning of the command line, before you
specify the input and output encodings and filenames.
The -debug option prints messages generated by Auto-detect. For
example, if you are converting a Japanese file and the input
encoding is japaneseautodetect, uniconv will list the encodings
it is attempting (sjis, euc-j, etc.) and the results.
Displays the copyright information.
The -subst option allows you to change the default substitution
character. The substitution character is the character that is
used when there is no direct mapping for a character in a
conversion. The default substitution character is CTRL-Z.
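To make the idea of a substitution character concrete, here is a minimal Python sketch. It does not call uniconv and says nothing about how uniconv is implemented; it uses Python's own codec error-handler mechanism, and the handler name "ctrl_z" and the function names are made up for this example. Any character that has no mapping in the target encoding is replaced by CTRL-Z (byte 0x1A).

```python
import codecs

SUBSTITUTE = "\x1a"  # CTRL-Z, the default substitution character described above


def substitute_ctrl_z(error):
    """Replace every unencodable character with CTRL-Z, then resume encoding."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    # error.start..error.end covers the run of characters that failed to encode.
    count = error.end - error.start
    return SUBSTITUTE * count, error.end


# "ctrl_z" is a hypothetical handler name chosen for this illustration.
codecs.register_error("ctrl_z", substitute_ctrl_z)


def convert_with_substitution(text, target_encoding):
    """Encode text, substituting CTRL-Z where the target has no mapping."""
    return text.encode(target_encoding, errors="ctrl_z")
```

For instance, converting "caf\u00e9" to ASCII this way keeps "caf" and emits a single 0x1A byte in place of the accented letter.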
- All command line arguments are case insensitive.
- Separate properties and transforms with a space.
- If there are multiple properties or transforms, they will be
performed in the order listed.
- The options -debug, -help, -subst, if used, must directly
follow the program name.
- The * in the usage line means that more than one property or
transform may be given.
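The rules above can be sketched as a small Python helper that assembles a uniconv command line. This is only an illustration under stated assumptions: the argument order (program name, options, input encoding, input file, output encoding, output file, then properties and transforms) is inferred from the argument descriptions above, the function names are hypothetical, and the file names in the usage note are placeholders.

```python
import subprocess


def build_uniconv_command(in_enc, in_file, out_enc, out_file, steps=(), options=()):
    """Assemble an argument list in the assumed order.

    options: flags such as "-debug", placed directly after the program name.
    steps:   property and transform names, kept in the order given, because
             they are performed in the order listed.
    """
    return ["uniconv", *options, in_enc, in_file, out_enc, out_file, *steps]


def run_uniconv(command):
    """Run the assembled command; this requires a uniconv binary on PATH."""
    return subprocess.run(command, check=True)
```

A call such as `build_uniconv_command("JapaneseAutoDetect", "in.txt", "UCS2", "out.txt", steps=["Hiragana", "ToKatakana"], options=["-debug"])` yields a list that can be passed to `run_uniconv`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.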
Quick Reference: Accepted Encodings
Arabic, ASCII, Big5, BMP, ChineseAutoDetect, cp1251, cp1252, cp437, cp850,
EUC-J, EUC-KR, GB2312, Greek, Hebrew, HZ, ISO-2022-JP, ISO-2022-KR,
ISOLatinCyrillic, JapaneseAutoDetect, JIS_X0201, JIS_X_0208,
KoreanAutoDetect, Latin1, Latin2, Latin3, Latin4, Latin5, Latin6,
Shift-JIS, Thai, UCS2, Unicode11UCS2, Unicode11UTF7, Unicode11UTF8, UTF7,
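The AutoDetect entries in the list above try several candidate encodings for one language. The following Python sketch shows one simple way such a detector could work: it tries each candidate in turn and stops at the first that decodes without error, recording every attempt in the way a debug listing might. It is an assumption-laden illustration, not the algorithm used by uniconv; the function name, the candidate list, and the message format are all made up, and the codec names are Python's, not uniconv's.

```python
def autodetect(data, candidates=("euc_jp", "shift_jis"), debug=False):
    """Return (detected_encoding_or_None, attempts) for a bytes object.

    Each attempt is a (codec name, result text) pair. Detection here is
    "first candidate that decodes cleanly", which is a deliberately simple rule.
    """
    attempts = []
    detected = None
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError as exc:
            attempts.append((name, f"failed at byte offset {exc.start}"))
            continue
        attempts.append((name, "ok"))
        detected = name
        break
    if debug:
        # Print the attempts, similar in spirit to a debug listing.
        for name, result in attempts:
            print(f"trying {name}: {result}")
    return detected, attempts
```

Because short or ambiguous inputs can decode cleanly under more than one encoding, the order of the candidates matters in a first-success scheme like this one.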
Accepted Properties :
UppercaseLetter, LowercaseLetter, TitlecaseLetter, ModifierLetter,
OtherLetter, AnyLetter, NonSpacingMark, CombiningMark, DecimalNumber,
OtherNumber, DashPunctuation, OpenPunctuation, ClosePunctuation,
OtherPunctuation, MathSymbol, CurrencySymbol, OtherSymbol, SpaceSeparator,
LineSeparator, ParagraphSeparator, ControlCharacter, OtherCharacter,
UndefinedScript, GeneralScript, Latin, Greek, Cyrillic, Armenian, Hebrew,
Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu,
Kannada, Malayalam, Thai, Lao, Tibetan, Georgian, HangulJamo, Hiragana,
Katakana, Kana, Bopomofo, CJKUnifiedIdeographs, Hangul, UndefinedWidth,
Accepted Transforms :
ToLowercase, ToUppercase, ToFullwidth, ToHalfwidth, ToHiragana,
ToKatakana, Decompose, Compose, ToCombiningMark, ToSpacingMark, Select,
Filter, ToCRLF, ToCR, ToLF, ToParagraphSeparator, ToLineSeparator,
ToCanonical, ToTraditionalChinese, ToSimplifiedChinese, RomajiToHiragana,
RomajiToKatakana, KanaToRomaji, ToLatinNumber, SGMLEntity
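As a rough illustration of how a property can be paired with the transform that follows it, here is a short Python sketch. It assumes that a transform is applied only to the characters for which its property is true, which is an interpretation of the description above rather than documented uniconv behavior. The function names are hypothetical stand-ins for UppercaseLetter, Hiragana, ToLowercase and ToKatakana, and the Hiragana range used (U+3041 to U+3096) covers only the basic letters.

```python
import unicodedata


def is_uppercase_letter(ch):
    """Stand-in for the UppercaseLetter property (Unicode category Lu)."""
    return unicodedata.category(ch) == "Lu"


def is_hiragana(ch):
    """Stand-in for the Hiragana property, basic letters only."""
    return "\u3041" <= ch <= "\u3096"


def to_lowercase(ch):
    """Stand-in for the ToLowercase transform."""
    return ch.lower()


def to_katakana(ch):
    """Stand-in for ToKatakana: basic katakana sit 0x60 above hiragana."""
    return chr(ord(ch) + 0x60)


def apply_pairs(text, pairs):
    """Apply (property, transform) pairs in the order listed."""
    for matches, transform in pairs:
        text = "".join(transform(ch) if matches(ch) else ch for ch in text)
    return text
```

Calling `apply_pairs` with the pairs `(is_uppercase_letter, to_lowercase)` and `(is_hiragana, to_katakana)` lowercases the capital letters first and then converts the hiragana, leaving every other character unchanged.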
UniConv - Convert unicode [CMD]
#1 - Posted 29 June 2006 - 10:17 PM