BoBo² Guest
Posted: Wed Jul 02, 2008 12:39 pm
Post subject: CMsort - Sort DOS/WIN/UNIX/MAC text files [CMD]

HugoV mentioned something about it within another thread. Check it out.

Quote:
**************************************************************
CMsort version 1.6
Sort a DOS, WINDOWS, UNIX, or MAC text file
Copyright (c) 2002 by Christian Maas - All Rights Reserved
chmaas@handshake.de
www.chmaas.handshake.de
**************************************************************
CONTENTS
========
I. Overview
II. Example 1
III. Example 2
IV. Run-time Comparison
V. Requirements
VI. Installation
VII. License and Disclaimer agreement
I. OVERVIEW
===========
CMSORT.EXE is a command line tool to sort text files with
DOS, WINDOWS, UNIX, MAC (or even mixed!) end-of-line marks.
New: processing files with fixed-length records.
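As a rough illustration of what accepting mixed end-of-line marks involves, the following Python sketch splits text on CR LF, CR, or LF. It is not CMsort's own code; the function name `split_records` is made up for this example.

```python
import re

# Matches DOS/Windows (CR LF), old Mac (CR), and UNIX (LF) line endings.
# CR LF is listed first so it is consumed as a single terminator.
EOL = re.compile(r"\r\n|\r|\n")


def split_records(text):
    """Split text into records, accepting any mix of the three EOL styles."""
    records = EOL.split(text)
    # A trailing terminator leaves one empty string at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    return records


mixed = "delta\r\nalpha\nbravo\rcharlie\n"
print(sorted(split_records(mixed)))
```

Empty records in the middle of the input are kept, so a caller could still decide whether to skip them, as the /B option does.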
How does CMsort work? CMsort reads records from the input
file until the configured memory limit is reached. The records
are then sorted and written to a temporary file. This is
repeated until all records have been processed. Finally, all
temporary files are merged into the output file.
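The procedure described above is an external merge sort. The Python sketch below shows the general technique only and is not CMsort's implementation. It assumes a record-count limit (`max_records`, a hypothetical stand-in for the memory limit), sorts whole lines, and merges all temporary files in a single pass instead of in n-way steps.

```python
import heapq
import os
import tempfile
from contextlib import ExitStack


def external_sort(in_path, out_path, max_records=1000, tmp_dir=None):
    """Sort the lines of in_path into out_path via sorted runs on disk."""
    run_paths = []

    def spill(chunk):
        # Sort one in-memory chunk and write it to its own temporary file.
        chunk.sort()
        fd, path = tempfile.mkstemp(suffix=".run", dir=tmp_dir)
        run_paths.append(path)
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as run:
            run.writelines(record + "\n" for record in chunk)

    try:
        # Phase 1: read records until the limit is reached, then sort and spill.
        chunk = []
        with open(in_path, encoding="utf-8") as src:
            for line in src:
                chunk.append(line.rstrip("\n"))
                if len(chunk) >= max_records:
                    spill(chunk)
                    chunk = []
        if chunk:
            spill(chunk)

        # Phase 2: merge every sorted run into the output file in one pass.
        with ExitStack() as stack:
            runs = [stack.enter_context(open(p, encoding="utf-8"))
                    for p in run_paths]
            dst = stack.enter_context(
                open(out_path, "w", encoding="utf-8", newline="\n"))
            # Compare records without their terminator so the merge order
            # agrees with the order used when each chunk was sorted.
            dst.writelines(
                heapq.merge(*runs, key=lambda line: line.rstrip("\n")))
    finally:
        # Remove the temporary runs whether or not the merge succeeded.
        for path in run_paths:
            os.remove(path)
```

Only one chunk is held in memory at a time, which is why a small buffer is enough even for large inputs. The cost is the extra disk space for the temporary runs.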
The syntax is:
CMsort [sort field] ... [option] ... <input file> <output file>
Sort fields:
/S=F,L     string from position F with length L (case sensitive)
/C=F,L     string from position F with length L (case insensitive)
/N=F,L     numeric from position F with length L
Options:
/F=n       read/write files with fixed-length records (n bytes, excluding CR/LF)
/B         ignore blank or empty records
/D         ignore records with duplicate keys (according to sort field specs)
/D=<file>  ignore records with duplicate keys, write them to <file>
/Q         quiet mode (no progress output)
/H=n       don't sort n header lines (default: n=0)
/W=n       n-way-merge of temporary files (2<=n<=5, default: n=5)
/T=<path>  path for temporary files (/T=TMP for the Windows temporary file path)
/M=n[p]    use n KB [or n% of available physical] memory;
           default: use 10%, at least 100 KB, but max. 1024 KB
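The default for /M amounts to a clamp: take 10% of the available memory, but never less than 100 KB or more than 1024 KB. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and how CMsort measures available memory or rounds the result is not stated here.

```python
def default_memory_kb(available_kb):
    """Default buffer size: 10% of available memory, clamped to 100..1024 KB."""
    ten_percent = available_kb // 10  # integer KB; the rounding is an assumption
    return max(100, min(1024, ten_percent))


for available in (500, 4000, 32768):
    print(available, "KB available ->", default_memory_kb(available), "KB")
```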
Notes:
======
1. The complete line is used as sort field when running CMsort
without any sort field specification.
2. Use L=0 for the last non-numeric sort field specification
to define a (partial) key that extends to the end of the line.
3. Numeric fields must be floating-point numbers. They may contain
decimal and thousands separators as well as a minus or plus sign.
4. By default, temporary files are created in the current directory,
i.e. where CMSort is called. In the following example, temporary
files will be created in C:\temp:
C:\temp>cmsort /M=100 data.txt data.sor
Otherwise, use the /T option.
5. The input and/or output file may be located in a different directory:
C:\temp>cmsort C:\input\data.txt C:\output\data.sor
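To make the field specifications and notes 2 and 3 concrete, here is a hedged Python sketch of how a key could be cut out of a record. The helper names are invented for illustration. The choice of "," as thousands separator and "." as decimal separator is an assumption based on the sample data in Example 1, and `lower()` is only one possible way to compare without regard to case.

```python
def text_field(line, first, length, ignore_case=False):
    """Return the substring addressed by a 1-based /S=F,L or /C=F,L spec."""
    start = first - 1
    # L=0 means "from position F to the end of the line" (note 2).
    end = None if length == 0 else start + length
    value = line[start:end]
    return value.lower() if ignore_case else value


def numeric_field(line, first, length, thousands=",", decimal="."):
    """Parse a /N=F,L field that may contain separators and a sign (note 3)."""
    raw = line[first - 1:first - 1 + length].strip()
    # Drop the thousands separators, then normalise the decimal separator.
    raw = raw.replace(thousands, "").replace(decimal, ".")
    return float(raw)


record = "1004711 Miller & Co. 1999-12-06    1,207.23"
print(text_field(record, 22, 10))     # order date
print(numeric_field(record, 33, 11))  # return as a float
print(text_field(record, 9, 0))       # from position 9 to end of line
```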
II. Example 1
=============
Suppose you have a file CUSTOMER.TXT with customer orders
as follows (including three header lines, which CMSort leaves
unsorted because of the /H=3 option):
1234567890123456789012345678901234567890123
Cust.   Name         Order           Return
No.                  Date
1004711 Miller & Co. 1999-12-06    1,207.23
1004713 Topsoft      2000-01-04    2,521.95
1004747 MCP & Co.    2000-01-04    7,356.88
1004799 Eftpos       1999-12-06   23,122.56
To sort this file by order date (ascending) and return (descending), use
cmsort /H=3 /S=22,10 /N=33,11- CUSTOMER.TXT CUSTOMER.SOR
/H=3       don't sort three header lines
/S=22,10   first part of key is a string,
           beginning at position 22, length 10, sort ascending (default)
/N=33,11-  second part of key is numeric,
           beginning at position 33, length 11, sort descending (-)
The result is:
1234567890123456789012345678901234567890123
Cust.   Name         Order           Return
No.                  Date
1004799 Eftpos       1999-12-06   23,122.56
1004711 Miller & Co. 1999-12-06    1,207.23
1004747 MCP & Co.    2000-01-04    7,356.88
1004713 Topsoft      2000-01-04    2,521.95
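For readers who want to check the ordering, the same two-part key can be reproduced in a few lines of Python. This is only a sketch of the sort logic, not of CMsort itself. The column positions in the sample lines are an assumption, laid out to match the field specs /S=22,10 and /N=33,11.

```python
HEADER_LINES = 3  # corresponds to /H=3

lines = [
    "1234567890123456789012345678901234567890123",
    "Cust.   Name         Order           Return",
    "No.                  Date",
    "1004711 Miller & Co. 1999-12-06    1,207.23",
    "1004713 Topsoft      2000-01-04    2,521.95",
    "1004747 MCP & Co.    2000-01-04    7,356.88",
    "1004799 Eftpos       1999-12-06   23,122.56",
]


def sort_key(line):
    order_date = line[21:31]                      # /S=22,10, ascending
    amount = float(line[32:43].replace(",", ""))  # /N=33,11
    return (order_date, -amount)                  # negated for descending


# Header lines stay in place; only the remaining records are sorted.
header, body = lines[:HEADER_LINES], lines[HEADER_LINES:]
result = header + sorted(body, key=sort_key)
print("\n".join(result))
```

Comparing the ISO-style dates as plain strings is enough here, because year-month-day text sorts in chronological order.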
III. Example 2
==============
This example shows how to ignore duplicate records. Duplicate records
are recognized by the defined key, not by the whole line. If you want
to exclude duplicate lines, you must perform an additional sort
beforehand, using the whole line as the key.
The following log file contains user ID, user name, and last access time:
055 Maas 2001-02-05 07:31:55
087 Mechenbier 2001-02-05 08:01:23
024 Hesselbein 2001-02-05 08:15:16
055 Maas 2001-02-05 08:44:24
089 Kruft 2001-02-05 09:05:07
087 Mechenbier 2001-02-05 09:31:13
cmsort /S=1,3 /D LOG.TXT LOG.SOR
The result is:
024 Hesselbein 2001-02-05 08:15:16
055 Maas 2001-02-05 08:44:24
087 Mechenbier 2001-02-05 09:31:13
089 Kruft 2001-02-05 09:05:07
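The effect of /D with a three-character key can be imitated in Python as sketched below. One assumption is made loudly: in the sample output above, the later record of each duplicated user ID is the one that survives, so the sketch keeps the last record seen per key. The README does not state this as a rule, and the function name is invented for the example.

```python
log = [
    "055 Maas 2001-02-05 07:31:55",
    "087 Mechenbier 2001-02-05 08:01:23",
    "024 Hesselbein 2001-02-05 08:15:16",
    "055 Maas 2001-02-05 08:44:24",
    "089 Kruft 2001-02-05 09:05:07",
    "087 Mechenbier 2001-02-05 09:31:13",
]


def sort_unique(records, first=1, length=3):
    """Sort by the key at 1-based position `first`, one record per key."""
    latest = {}
    for record in records:
        key = record[first - 1:first - 1 + length]
        latest[key] = record  # assumption: a later record replaces an earlier one
    return [latest[key] for key in sorted(latest)]


for record in sort_unique(log):
    print(record)
```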
IV. Run-time Comparison
=======================
Using a file with 300,000 records of 9-digit integers as input
(total file size 3,300,000 bytes), on a Pentium 266 MHz
running under Windows 98 with 32 MB RAM the following times were
measured compared to CUSORT (a relatively fast DOS sorting tool):
Program | Memory usage | Elapsed Time | Notes
--------+--------------+--------------+---------------------------------------
CUSORT  |        24 KB |  547 seconds | 75 MB additional memory on HDD needed
CMSort  |        24 KB |   32 seconds | max. 6.5 MB on HDD needed
CMSort  |       100 KB |   31 seconds |
CMSort  |      1024 KB |   30 seconds |
V. REQUIREMENTS
===============
CMSORT.EXE is a 32-bit application and requires Windows 9x or NT.
VI. INSTALLATION
================
1. Extract CMSORT.ZIP into a directory on your hard disk.
Make sure that this directory is in your PATH
environment variable, so you can call CMSort.exe
from within any directory.
2. The archive contains the following files:
README.TXT (this file)
CMSORT.EXE (the executable)
3. Start CMSORT.EXE to display the command line options.