LeoCentaur Stand-Alone File Sort Utility (Qsortw v1.5a) 2005

SECTIONS:

1) INSTALLATION

2) HOW TO USE

3) DETAILED DESCRIPTION OF SORT PARAMETERS

4) SETTING PREFERENCES

5) VARIABLE LENGTH RECORDS (Delimited Text Files)

6) TECHNICAL NOTES

7) HISTORY

8) VERSION UPDATES (BUG FIXES)

9) KNOWN PROBLEMS

10) LICENSE AGREEMENT

Comments and feedback should be directed to Chris.


1) INSTALLATION

    Copy (extract) the files to your hard drive, preferably into their own directory or "folder".  The zip archive should contain two files (Qsortw.exe and QsDoc.htm).


2) HOW TO USE

In the Main Window, click on "Options".

    Select "1) Input Sort Parameters"; this takes you to the Input Sort Parameters dialog screen.

    Enter the required (*) fields: Input and Output File Names, Record Size and Key 1.

    Click on the Sort button.


    Basic sort parameter information is in SECTION 3, DETAILED DESCRIPTION OF SORT PARAMETERS.  For information on the "bells & whistles" options, read SECTION 4, SETTING PREFERENCES.


3) DETAILED DESCRIPTION OF SORT PARAMETERS

Input File Name:*

    Any valid file name, usually of the form ([d:][\path[\path]...]YourFileName.ext). Use "Browse" to find the file that you want to sort. The Input File Name and Output File Name must be different. The Input File is never altered; it is treated as a "read only" file.  (File names, including drive designation and paths, should not exceed 250 characters).

Output File Name:*

    A default file name is generated. The Output File Name Identifier (OFNI) "Srt" is inserted between "YourFileName" and ".ext" (i.e., [d:][\path[\path]...]YourFileNameSrt.ext). If you override the default file name, the paths MUST exist. If the file you specify exists, it will be OVERWRITTEN! The Input File Name and Output File Name may NOT be the same. Refer to SECTION 4, SETTING PREFERENCES, Output File Name Identifier (O F N I):, to change the default O.F.N.I. "Srt".   (File names, including drive designation and paths, should not exceed 250 characters).
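    The default naming rule above can be sketched as follows. This is only an illustration in Python, not the program's own code; the function name default_output_name is hypothetical, and Python's ntpath module is used solely because it applies Windows path rules on any platform.

```python
import ntpath  # Windows path handling, usable on any platform

def default_output_name(input_name: str, ofni: str = "Srt") -> str:
    """Insert the OFNI between the base file name and its extension."""
    stem, ext = ntpath.splitext(input_name)
    return stem + ofni + ext

print(default_output_name(r"c:\data\YourFileName.ext"))
# prints c:\data\YourFileNameSrt.ext
```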

Record Size:*

    MUST BE A NUMBER from 1 to 4096. If you are sorting delimited text files (with fixed length lines), the carriage return and/or line feed must be included in your Record Size calculations. See 'LENGTH:' in this section for more information on Record Size calculations. (Qsortw v1.5a limits the maximum Record Size to 4096 bytes; this is an arbitrary limitation, as Record and Key buffers are dynamically allocated). ENTER 'var' in the Record Size field if you want to sort a delimited text file with varying line lengths. Refer to SECTION 5, VARIABLE LENGTH RECORDS, for more information.
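    As a rough sketch of how a Record Size entry could be interpreted: the value is either a number from 1 to 4096 or one of the spellings of 'var' listed in SECTION 5. The function name parse_record_size is hypothetical, and the exact, lower-case match is an assumption; the program's real parsing rules may differ.

```python
# The six accepted spellings: "var", "vari", ... "variable" (see SECTION 5).
ACCEPTED_VAR = {"variable"[:n] for n in range(3, 9)}

def parse_record_size(text: str):
    """Return the string "var" or an integer from 1 to 4096."""
    text = text.strip()
    if text in ACCEPTED_VAR:      # assumption: exact, lower-case match
        return "var"
    size = int(text)              # raises ValueError for non-numeric input
    if not 1 <= size <= 4096:
        raise ValueError("Record Size must be a number from 1 to 4096")
    return size

print(parse_record_size("80"), parse_record_size("variab"))
# prints 80 var
```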

Key 1:* - Key 6:

    Key fields must be sequential. At least one Key field is required. You cannot enter "Key 1:", skip "Key 2:", then enter "Key 3:". Qsortw v1.5a allows a maximum of six Key Field declarations.

POSITION:

    Beginning Position of the Field to Sort. (Values 1 to the number in Record Size Field).

LENGTH:

Length, in bytes, of the sort field.

    Actual bytes occupied by the field (i.e., Characters are 1 for 1, Short Integers occupy 2 bytes, Long Integers occupy 4 bytes). Please note that Integers > 4 bytes may be sorted if they are in endian format (huge-endian); see SECTION 6, TECHNICAL NOTES.

    Floating Point types are either 4 (single precision) or 8 (double precision) bytes. Long Double Precision floats are NOT supported.

    Keys, Position and Length: Keys cannot overlap each other. Error messages are issued for the offending Key declarations. The total length of all the keys cannot exceed the Record Size. Also, the Position + Length cannot be > the Record Size.
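    The key rules above (one to six keys, no overlaps, total length within the Record Size, Position + Length not greater than the Record Size) can be sketched as a validation routine. This is an illustration only; check_keys and its message texts are hypothetical, and the Position + Length rule is applied literally as written, which is an assumption about how the program counts bytes.

```python
from typing import List, Tuple

def check_keys(record_size: int, keys: List[Tuple[int, int]]) -> List[str]:
    """keys holds (position, length) pairs; position is 1-based."""
    errors = []
    if not 1 <= len(keys) <= 6:
        errors.append("between one and six keys are required")
    for n, (pos, length) in enumerate(keys, start=1):
        if not 1 <= pos <= record_size:
            errors.append(f"Key {n}: position is outside the record")
        if length < 1:
            errors.append(f"Key {n}: length must be at least 1")
        elif pos + length > record_size:   # rule applied literally (assumption)
            errors.append(f"Key {n}: position + length exceeds Record Size")
    if sum(length for _, length in keys) > record_size:
        errors.append("total key length exceeds Record Size")
    # Overlap check: compare the byte ranges of every pair of keys.
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            (p1, l1), (p2, l2) = keys[i], keys[j]
            if p1 < p2 + l2 and p2 < p1 + l1:
                errors.append(f"Key {i + 1} overlaps Key {j + 1}")
    return errors

print(check_keys(80, [(1, 10), (5, 10)]))
# prints ['Key 1 overlaps Key 2']
```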

SORT TYPES:

c = Character, (Lower Case Characters will Sort out as Upper Case, usually desirable).

a = ASCII, (Lower Case Characters will Sort out in ASCII Sequence, usually UNDESIRABLE).

i = Integer Signed. (see LENGTH in this section, for more information).

u = Integer Unsigned.

f = Floating Point. (See Floating Point under LENGTH in this section, for more information).

Default is "c" (character), if the field is left blank.

SEQUENCE:

"a" = Ascending.

"d" = Descending.

Default is Ascending, if the field is left blank.
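    The sort types and sequences above can be sketched for fixed length records as follows. This is a minimal illustration, not the program's own routine: the names Key, key_value and sort_records are hypothetical, and little-endian byte order for integers and floats is an assumption.

```python
import struct
from typing import List, NamedTuple

class Key(NamedTuple):
    position: int        # 1-based start of the field
    length: int          # field length in bytes
    kind: str = "c"      # c, a, i, u or f (default "c")
    sequence: str = "a"  # a = ascending, d = descending (default "a")

def key_value(record: bytes, key: Key):
    start = key.position - 1
    field = record[start:start + key.length]
    if key.kind == "c":
        return field.upper()   # lower case sorts as upper case
    if key.kind == "a":
        return field           # plain ASCII (byte) sequence
    if key.kind in ("i", "u"):
        # Assumption: little-endian byte order.
        return int.from_bytes(field, "little", signed=(key.kind == "i"))
    if key.kind == "f":
        # 4 bytes = single precision, 8 bytes = double precision.
        return struct.unpack("<f" if key.length == 4 else "<d", field)[0]
    raise ValueError(f"unknown sort type {key.kind!r}")

def sort_records(records: List[bytes], keys: List[Key]) -> List[bytes]:
    result = list(records)
    # Python's sort is stable (also with reverse=True), so sorting by the
    # last key first and by Key 1 last gives a multi-key ordering.
    for key in reversed(keys):
        result.sort(key=lambda r: key_value(r, key),
                    reverse=(key.sequence == "d"))
    return result

records = [b"apple ", b"Banana", b"cherry"]
print(sort_records(records, [Key(1, 6, "c")]))
# prints [b'apple ', b'Banana', b'cherry']
print(sort_records(records, [Key(1, 6, "a")]))
# prints [b'Banana', b'apple ', b'cherry']
```

    The second call shows why type 'a' is usually undesirable for text: every upper case letter sorts ahead of every lower case letter.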

Enable Sort Statistics:

    When checked, a File called SortStat.txt is created with Statistics on Memory Usage, Sort/Merge Compares, Stack Recursion, Run Time, etc. If you use Browse to find the Input File, the file is created in that directory.

Return To Main Button:

    Temporarily saves any parameters that have been entered and returns to the Main Window. If you exit the program, you may lose the data that was entered. See 'Validate Button' in this section for details.

Preferences Button:

    Takes you to the 'Set Preferences' screen. (New in version 1.4).

Sort Button:

    After the required fields have been entered, click on this button to begin sorting. The field entries (sort parameters) are checked for validity. If there are errors, a message box informs you of the problem(s); a single invalid field entry can cause multiple error messages.

    After the sort parameters are validated, before the sort begins, a file called "spt.spt" is created. If "spt.spt" cannot be created, a warning is issued; this does not stop the sort. (See 'Validate Button' in this section for more information). Next, the sort routine is called; the Input File is opened, the Output file is created and memory is allocated; if any of these fail the sort is aborted, and error messages are displayed. After the sort is complete, a message "Sort Is Done!" is displayed.

Validate Button:

    Validates sort parameters without sorting the file and saves the information in the spt.spt file. Program start up looks for this file; if spt.spt is found, the information is displayed when you enter the Input Sort Parameters dialog. User .spt files for fixed length record sorts are also created at this time; refer to SECTION 4, SETTING PREFERENCES, Create User .spt Files, for more information. User .spt files for variable length records are only created after a successful sort; refer to SECTION 5, VARIABLE LENGTH RECORDS, for more information.

Help Button:

Brings up a Quick Help message box.

Saved Sort Parameters (List Box):

    This list box appears if the 'Create User .spt Files:' box is checked in 'Preferences'. Entries are added to the box whenever a successful 'Validate' or 'Sort' is completed. Identifiers are program generated (i.e., '_a', '_b' and so on) and are resequenced if you remove an entry. The number of Saved Sort Parameters allowable for each file is 26. The 'Remove Selected Entry' button removes the highlighted entry from the list.

    TO CHANGE AN ENTRY without creating a new sort parameter item: First, Remove the Selected Entry (the information will remain on the screen), make the wanted changes and press 'Validate' or 'Sort'. If 'var' is entered as the Record Size, you must press the 'Sort' button to create an entry in the Saved Sort Parameters box. See 'Validate Button' in this section and refer to SECTION 5, VARIABLE LENGTH RECORDS, USER .spt FILES and Delimited Text Files:, for more information.

Remove Selected Entry:

Removes the highlighted entry from the list in the Saved Sort Parameters box.


4) SETTING PREFERENCES

Startup Main Window Position:

    To position the Window, move the Window to the desired position, click on Preferences in the Options Menu, click on "Done"; the Main Window will be displayed at that position on startup. Qsortw v1.5a doesn't allow Windows to be resized.

Default Browse Directory:

    Set the starting path\directory for Browse (i.e., entering "c:\" brings up Browse in the root directory on drive c:; entering "c:\windows" brings up Browse in the windows directory on drive c:). (NOTE: This is a GIGO field; the program only requires that at least three characters be entered. If you put Garbage In, you will get Garbage Out).

Use Last Path for Browse:

    Check this box, if you want the program to keep track of the drive/path of the last file selected in Browse and use that drive/path for future Browsing.

Output File Name Identifier (O F N I):

    Append String to create Output File Name. (Refer to SECTION 3, DETAILED DESCRIPTION OF SORT PARAMETERS, Output File Name & Input File Name).

Go to Sort Dialog on Startup:

    Check this box, if you want the program to go directly to Sort Parameter Dialog screen on startup.

Enable Key Titling:

    Check this box if you want the program to allow you to change Key Titles from 'Key x:' to something more meaningful (i.e., Name, Artist, Song Title, City, etc.). To use Key Titling, click on the Key Title you wish to change (i.e., Key 1:). A small window will appear above the Key Titles, displaying the current Key Title. Delete the old title (using the delete key), enter a new Key Title and click on the Key Title a second time to display the new Key Title. (NOTE: you can wander away and do other stuff; the rename key title window remains active until a second click on the Key Title). If you Enable Key Titling, you should also check the box next to Create User .spt Files, or else Key Titles won't be saved (except for the last file that was sorted). Refer to SECTION 3, DETAILED DESCRIPTION OF SORT PARAMETERS, Validate Button:, for more information.

Create User .spt Files:  (spt = Sort Parameter Table)

    Check this box if you want the program to save your Sort Parameters.  When this box is checked, a list box will appear on the Sort Parameter Dialog Screen.  This allows you to save multiple Sort Parameters for each sort file, sets up recall options for future use and allows you to delete Sort Parameter entries that are no longer needed. (NOTE: 'spt' file names and identifiers are program generated, based on the name of the file to be sorted; refer to SECTION 3, DETAILED DESCRIPTION OF SORT PARAMETERS, Validate Button:, for more information).

Stay in Sort Dialog after Sort:

    Check this box, if you want the program to Stay in the Sort Parameter screen  after sorting a file. The default is to return to the Main Window after a successful sort.

Variable Record Delimiter:

    "bo" = 'both cr/lf', "cr" = 'cr only' and "lf" = 'lf only'. Refer to SECTION 5, VARIABLE LENGTH RECORDS, for more information.


5) VARIABLE LENGTH RECORDS (SORTING DELIMITED TEXT FILES)

SORTING DELIMITED TEXT FILES:

    "var" must be entered in the Record Size field; other acceptable entries are "vari", "varia", "variab", "variabl" and "variable".

    KEY TYPES for delimited text files are limited to 'c' and 'a', see SECTION 3, SORT TYPES:, for more information.

    USER .spt FILES and Delimited Text Files: '.spt' files for delimited text files are created only after a successful sort because information to check their validity is not available when the .spt files are normally created.

    Delimited Text Files and "TABS": Qsortw v1.5a does NOT expand tabs. Tabs are not differentiated; they are treated as a single byte or character. This may cause some confusion as to whether the file was sorted correctly.

    Delimited Text Files are usually delimited by a carriage return & line feed. There is an option in SECTION 4, SETTING PREFERENCES, to change the default; however, this changes it for all delimited text files that are sorted. Some delimited text files contain both carriage return/line feed pairs (hard returns) and bare line feeds (soft returns). Sorting a delimited text file with hard and soft returns, using the default setting, will remove all soft returns from the file (assuming a line length > 512 is not generated by the removal of the soft returns).
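    The Variable Record Delimiter preference ("bo", "cr" or "lf") can be pictured as choosing the byte sequence on which the file is split into records. The sketch below is illustrative only: DELIMITERS and split_records are hypothetical names, and it does not model the soft return handling described above.

```python
# Preference codes from SECTION 4 mapped to delimiter bytes.
DELIMITERS = {"bo": b"\r\n", "cr": b"\r", "lf": b"\n"}

def split_records(data: bytes, setting: str = "bo") -> list:
    """Split file contents into records on the chosen delimiter."""
    records = data.split(DELIMITERS[setting])
    if records and records[-1] == b"":
        records.pop()  # the file ended with a delimiter
    return records

print(split_records(b"pear\r\napple\r\n"))
# prints [b'pear', b'apple']
print(split_records(b"pear\napple\n", "lf"))
# prints [b'pear', b'apple']
```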

    Delimited Text Files are read twice. The first read determines the record count and the longest line length, so that Key Position and Key Length edits can be done and sufficient memory can be allocated for the sort phase.
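    The two reads described above can be sketched as follows. The names first_pass and second_pass are hypothetical, the key check (the key must fit within the longest line) is an assumption about the edits the program performs, and a single character key stands in for the full key list.

```python
from typing import List, Tuple

def first_pass(data: bytes, delimiter: bytes = b"\r\n") -> Tuple[int, int]:
    """Return (record count, longest line length in bytes)."""
    lines = data.split(delimiter)
    if lines and lines[-1] == b"":
        lines.pop()  # the file ended with a delimiter
    return len(lines), max((len(line) for line in lines), default=0)

def second_pass(data: bytes, position: int, length: int,
                delimiter: bytes = b"\r\n") -> List[bytes]:
    """Validate the key against the first-pass results, then sort."""
    count, longest = first_pass(data, delimiter)
    # Assumption: the key must fit inside the longest line.
    if position < 1 or position - 1 + length > longest:
        raise ValueError("key lies beyond the longest line")
    lines = data.split(delimiter)[:count]
    start = position - 1
    # Type 'c' key: lower case sorts as upper case.
    return sorted(lines, key=lambda line: line[start:start + length].upper())

data = b"pear\r\nApple\r\nfig\r\n"
print(first_pass(data))         # prints (3, 5)
print(second_pass(data, 1, 3))  # prints [b'Apple', b'fig', b'pear']
```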

HANDS ON EXAMPLE:

    For an example, you could sort the "QsDoc.txt" file. Although the resulting file won't make much sense, you will note that the file is in the specified sequence. (Note: Return to Main will allow you to minimize the Main Window between sorts, so you can look at files without exiting the program).

a) Try using 1 for Position & 10 for Length. You may, also, want to Enable Sort Statistics and look at the SortStat.txt file to see what types of information are logged.

b) Now, change the Type field to 'a' (ASCII), sort again and look at the results.

c) Try using 31 for Position & 12 for Length. You may try enabling Key Titling, refer to SECTION 4, SETTING PREFERENCES, Enable Key Titling:, for more information.

d) Try sorting in descending sequence.

e) Have Fun!!! (NOTE: Tabs are not expanded. When viewing the file, this may cause some confusion as to whether the file was sorted correctly).


6) TECHNICAL NOTES

    Qsortw is a MEMORY SORT. Record, Key and support data buffers are dynamically allocated. If you are sorting an extremely large file, you may not want to have other programs running. If Qsortw can't allocate the required memory, the program is terminated with the appropriate error message.

    SORT CAPACITIES: The theoretical record capacity of Qsortw is 32,000,000 records. However, you will probably run out of memory before this limit is reached.

    SORT DURATION: Sort run time varies according to the number of records, the total length of keys sorted and the frequency of duplicate values in major key fields.

    Test file #1, with 7,500 records and a 70 byte total key length, sorts in approximately 0.8 seconds on a 450 MHz machine.

    Test file #2, with 25,700 records and an 18 byte total key length, sorts in approximately 1.5 seconds on a 450 MHz machine.

    The run time for these files on a 233 MHz machine is a bit longer. On faster machines, the run time decreases with the speed of the machine.

    INTEGERS greater than 4 bytes: Qsortw will, theoretically, sort integers greater than 4 bytes if they are in endian format. As I have no software to generate integers greater than 4 bytes, sorting them remains unproven.    
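    To picture what sorting an integer key wider than 4 bytes involves, here is a minimal sketch using 8 byte signed integers. The helper name int_key is hypothetical and little-endian byte order is an assumption; the text above does not state which byte order the program expects.

```python
def int_key(field: bytes, signed: bool = True, byteorder: str = "little") -> int:
    """Interpret a field of any byte length as an integer."""
    return int.from_bytes(field, byteorder, signed=signed)

values = [2**40, -1, 7]
records = [v.to_bytes(8, "little", signed=True) for v in values]
records.sort(key=int_key)
print([int_key(r) for r in records])
# prints [-1, 7, 1099511627776]
```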

If you are interested in Stack Recursion statistics,

1) Sort a file and rename the "SortStat.txt" so that it won't be overwritten.

2) Sort the "sorted" file, but in the opposite sequence (i.e., if the first sort was ascending, the second would be descending and vice versa).

3) Look at the Stack Recursion statistics from the first sort and compare them to the second sort. Also, notice the differences in the number of sort and merge compares and run times.
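    The kind of difference you may see can be illustrated with a textbook quicksort that counts its compares and its deepest recursion level. This is NOT Qsortw's actual algorithm, which is not described here; the last-element pivot and the name quicksort_stats are assumptions chosen only to show how input that is already ordered can deepen recursion.

```python
import random

def quicksort_stats(items):
    """Sort a copy of items; return (sorted list, compares, max depth)."""
    data = list(items)
    stats = {"compares": 0, "depth": 0}

    def sort(lo, hi, depth):
        if lo >= hi:
            return
        stats["depth"] = max(stats["depth"], depth)
        pivot = data[hi]  # assumption: last element is the pivot
        i = lo
        for j in range(lo, hi):
            stats["compares"] += 1
            if data[j] < pivot:
                data[i], data[j] = data[j], data[i]
                i += 1
        data[i], data[hi] = data[hi], data[i]
        sort(lo, i - 1, depth + 1)
        sort(i + 1, hi, depth + 1)

    sort(0, len(data) - 1, 1)
    return data, stats["compares"], stats["depth"]

shuffled = random.Random(1).sample(range(200), 200)
_, compares, depth = quicksort_stats(shuffled)
print("random input:  ", compares, "compares, depth", depth)
_, compares, depth = quicksort_stats(range(199, -1, -1))
print("reversed input:", compares, "compares, depth", depth)
# the reversed case prints 19900 compares, depth 199
```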


7) HISTORY

    Before I begin, many thanks to my brother Jim, who assisted me with his expertise in 'C' and Windows API and to Loren Kirby who collaborated on the original incarnation of this work.

    The concept for this program originated over 25 years ago and was written in Z80 assembler to run under CP/M. The concept was reworked to drive this anachronistic software from a Windows Dialog Box. The sort program is written in the 'C' programming language. After I completed the sort program, the problem was how to interface user input to the sort program. A-ha!, write a Windows program for the user input.

    Step back a few months. I had my cable TV disconnected, because I couldn't afford it anymore. I needed something to do; so, I decided to write a program using Quick Basic to generate prime numbers. Nearing the end of this project, I discovered that Quick Basic was storing numbers that I had defined as integers in floating point format; I was NOT a happy camper. So, I relearned how to program in 'C', a programming language that I had dabbled with years earlier. After the 'C' rewrite was completed, I was looking for something to do with the 30 million prime numbers I had generated. I dumped them into a .wav file; other than making an interesting wave pattern, it seemed pointless. Next, I wrote a program to generate graphics in Windows with the prime number data; that, also, led nowhere. However, I became proficient in the 'C' programming language. And thusly, the sort program concept was reborn.

    In order to test the sort program, I had to write a program to generate test data. I created a file of random data consisting of two character fields, a short signed integer, a short unsigned integer, a long signed integer, a long unsigned integer, a single precision float and a double precision float. I had planned on supporting long double floats, but I could not generate reliable data (the software I'm using has some kind of bug and doesn't work with long double floats).

    The original program supported IBM's Binary Coded Decimal (BCD) and Zoned Signed Display numeric formats. As these have gone the way of the dinosaur, along with IBM's EBCDIC collating sequence, I saw no reason to support these formats.

    Qsortw allows integers greater than 4 bytes to be sorted. I follow the little-endian/big-endian format for storing integers; perhaps one might refer to it as huge-endian format. As I have no software to generate integers greater than 4 bytes, sorting them remains unproven.


8) VERSION UPDATES (BUG FIXES)

  

    Version 1.5a - Changed limit on line size to 512 bytes for delimited text files.

    Version 1.5 - Version 1.4 was compiled using "Dead Code Elimination Optimization"; this was causing fields following double floating fields to sort incorrectly.

    Version 1.4 - Added 'Preferences' button to Sort Parameter screen.

    Version 1.2a - Fixed bugs in v1.2 - "Use Last Path for Browse" was not working properly, thanks to Ray from San Francisco for catching that one.  Also, temporary save of sort parameters when 'returning to main' was not working.

    Version 1.2 - Run-time 'dll' incorporated into program.  The run-time dll "bcRT32.dll" is not needed to run this version. The file, "bcRT32.dll", may be deleted.

    Version 1.1 - Fixed bug in v1.0 - Double floating point field zero values did not sort correctly when sorted in descending sequence.


9) KNOWN PROBLEMS

Files in root directories:

    Qsortw can have problems accessing some files if they are located in a root directory (i.e., c:\filename.ext, d:\filename.ext, a:\filename.ext).   The program inserts an extra backslash when building the drive, path and file name (i.e., c:\\filename.ext).
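    One way such an extra backslash can arise is by always adding a separator between the directory and the file name, even when the directory already ends in one, as a root directory does. The sketch below only illustrates that pattern; it is not taken from the program's source, and naive_join and safe_join are hypothetical names.

```python
def naive_join(directory: str, name: str) -> str:
    """Always adds a separator, even after a root such as c:\\ ."""
    return directory + "\\" + name

def safe_join(directory: str, name: str) -> str:
    """Drops any trailing separator before adding exactly one."""
    return directory.rstrip("\\") + "\\" + name

print(naive_join("c:\\", "filename.ext"))  # prints c:\\filename.ext
print(safe_join("c:\\", "filename.ext"))   # prints c:\filename.ext
```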

Windows 98:

    If you use "Browse" to select a file and then press the Cancel button, hit the Escape key, or close the dialog box without selecting a file, then upon exiting the program you get a "This program has performed an illegal operation" (Invalid Page Fault) error. This is an intermittent error, occurring about 90% of the times tested. (Qsortw v1.2: incorporating the run time dll into the program seems to have solved this problem).


10) LICENSE AGREEMENT

This software:

a) is licensed for personal use only, subject to provisions herein,

b) may be distributed to anyone for their personal use, subject to provisions herein,

c) may NOT be sold or used for financial gain without express permission from the author.

Limited Warranty:

    This software will perform in accordance with the accompanying written materials. No warranty or guarantee regarding the use of this software or written materials, in terms of correctness, accuracy, reliability, currency or otherwise, is given. The entire risk, as to the results and performance of this software, is assumed by you. You assume the responsibility for the selection of the program to achieve your intended results, and for the installation, use and results obtained.

Ownership of Software:

    Title to this software will at all times remain with the author. You may not modify, adapt, translate, reverse engineer, decompile or disassemble this software.

Use Restrictions:

    The Licensee is obtaining limited rights to use this software. You may not adapt, translate or create derivative works based on the written materials without prior written consent of the author.

Copy Restrictions:

    This software and the accompanying written materials are copyrighted. You will be held legally responsible for any copyright infringement which is caused or encouraged by your failure to abide by the terms of this agreement.