Overview
A CD has been created to preserve the USMS intellectual property that exists
on the SwimGold website, so that it cannot be lost and so that we
can begin to create queries and procedures for analysis, reporting, and
USMS promotional purposes. The list below identifies the critical databases.
What is important here is that this information be put into a format that
many people can use. In doing so, we may also gain the ability to maintain
the USMS Digital Archives in the same form in which they have been
maintained on the SwimGold website.
This material has been exported from SwimGold into a format that Carl can
readily update from the SwimGold databases. The export procedure is simple,
so it can be revised without much effort if we decide to change it.
My best hope for the near term is that we can move
information back and forth between the SwimGold format and the dBase and/or
Access databases. At some point in the future, maintenance of the Archives
webpages might rely entirely on the dBase and Access databases, and the current
form of storage might no longer be needed.
Helps for the DB Person
Each database is on the CD in a directory called qksource or in a subdirectory under it.
All database files are in .txt format, and the data is in fixed-field flat ASCII format.
Each database has a file called "qkfields.txt", which identifies its fields and field widths.
d:\swimgold\tt\aah\qksource\qkfields.txt
d:\swimgold\tt\age\qksource\qkfields.txt
d:\swimgold\tt\age\qksource\id\qkfields.txt
d:\swimgold\tt\age\qksource\nats\qkfields.txt
d:\swimgold\tt\age\qksource\tt\qkfields.txt
d:\swimgold\tt\ash\qksource\qkfields.txt
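As a rough illustration of how fixed-field records can be read once the field names and widths are known, here is a minimal Python sketch. The field names, widths, and sample values below are hypothetical; the real layout comes from each database's qkfields.txt, whose exact syntax is not shown on this page.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, width) pairs. The real names and widths
# would be taken from the qkfields.txt file of the database being read.
FIELDS: List[Tuple[str, int]] = [
    ("lastname", 12),
    ("firstname", 10),
    ("swimmerid", 9),
]


def parse_record(line: str, fields: List[Tuple[str, int]]) -> Dict[str, str]:
    """Slice one fixed-field line into a dict, trimming the padding."""
    record = {}
    pos = 0
    for name, width in fields:
        record[name] = line[pos:pos + width].rstrip()
        pos += width
    return record


def format_record(record: Dict[str, str], fields: List[Tuple[str, int]]) -> str:
    """Write a dict back out as one fixed-field line, padded with spaces."""
    return "".join(
        record.get(name, "")[:width].ljust(width) for name, width in fields
    )


# Made-up sample values, used only to show the round trip.
sample = {"lastname": "Smith", "firstname": "John", "swimmerid": "ABC450312"}
line = format_record(sample, FIELDS)
print(repr(line))
print(parse_record(line, FIELDS))
```

Because reading and writing use the same field list, a record can be imported and exported again without changing the file layout.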
There may also be a file called "qkfiles.txt".
It lists the files related to the database and can include
files that are not in the fixed-field format. These are marked "(not DB)".
d:\swimgold\tt\aah\qksource\qkfiles.txt
d:\swimgold\tt\ash\qksource\qkfiles.txt
Three fields in combination (Alphacode, Birthyear, Birthday) make up the SwimmerID. The complete SwimmerID has 9 characters. A variation of the SwimmerID is used to name stories and photographs. In this variation the birth month and birth day are converted from 4 characters into 2 characters, so the shortened SwimmerID has 7 characters. A person's primary photo is named with these 7 characters, and subsequent photos are named by adding a character, usually a number, in the 8th position. This is documented at ../committee/swmmridu.htm.
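The shortening can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the three fields are concatenated in the order listed, so that the last 4 characters are the month and day as digits (MMDD), and it uses a made-up encoding of one character each for month and day. The actual encoding is the one documented at ../committee/swmmridu.htm and may differ.

```python
import string

# Hypothetical encoding alphabet: 0-9 followed by A-Z, one character per value.
DIGITS = string.digits + string.ascii_uppercase


def shorten_swimmer_id(full_id: str) -> str:
    """Turn a 9-character SwimmerID into a 7-character one (assumed scheme)."""
    if len(full_id) != 9:
        raise ValueError("expected a 9-character SwimmerID")
    prefix, mmdd = full_id[:5], full_id[5:]  # assumed: ID ends in MMDD
    month, day = int(mmdd[:2]), int(mmdd[2:])
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("month/day out of range")
    return prefix + DIGITS[month] + DIGITS[day]


def photo_name(full_id: str, extra: int = 0) -> str:
    """Primary photo uses the 7 characters; later photos add an 8th."""
    short = shorten_swimmer_id(full_id)
    if extra == 0:
        return short
    if not (1 <= extra < len(DIGITS)):
        raise ValueError("extra photo number out of range")
    return short + DIGITS[extra]


# Made-up SwimmerID, for illustration only.
print(shorten_swimmer_id("ABC450312"))
print(photo_name("ABC450312", 1))
```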
Hoped for Response from the DB Person
There are 392 .txt files (both SDIF and text vector files) that are the
source data for all of the USMS Digital Archives. Many of these will be
combined into one dBase or Access file. It is very easy on Carl's end
to export or update these files in this format. It is hoped that the
collaborating DB person will create the ability to import these into a
well-designed dBase or Access database and, whenever desired, to export them
back into this file structure. If that is done, then either Carl or the DB
person can work on the files, modify them, and exchange them with others in the
collaboration. Here's an example. The DB person might import all national
championship files from 1993 to 2000 into one master database. Seven of
those files came from Carl and nine came from other sources. Carl will continue
to assign SwimmerIDs as new matches are made with Registration data.
By exchanging data with all data from a single meet in a single file
(16 SDIF files for all championship meets from 1993 through 2000),
it will be easier to pass information back and forth without walking on
each other's work.
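The one-file-per-meet exchange described above might be sketched like this in Python. The record layout, the "meet" key, and the meet labels are hypothetical; this is not a description of the SDIF format, only of the split and merge steps.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, str]


def split_by_meet(master: List[Record]) -> Dict[str, List[Record]]:
    """Group master-database records so each meet can go to its own file."""
    by_meet = defaultdict(list)
    for rec in master:
        by_meet[rec["meet"]].append(rec)  # "meet" is a hypothetical field
    return dict(by_meet)


def merge_meets(per_meet: Dict[str, List[Record]]) -> List[Record]:
    """Combine per-meet record lists back into one master list."""
    return [rec for meet in sorted(per_meet) for rec in per_meet[meet]]


# Made-up records, for illustration only.
master = [
    {"meet": "1993-LC", "swimmerid": "ABC450312"},
    {"meet": "1993-SC", "swimmerid": "ABC450312"},
    {"meet": "1993-LC", "swimmerid": "XYZ501105"},
]
files = split_by_meet(master)
print(sorted(files))
print(len(merge_meets(files)))
```

Keeping each meet in its own file means a correction to one meet only touches that file, which is what makes the exchange safe for two people working at once.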
I (Carl) will be among colleagues from my own environment Nov. 12-15, many of whom are very skilled in using Access and dBase databases. If I have a database to work with by then, I expect I can get some help in learning how to work with it during those 4 days. I already have some ability to use dBase and Excel, but at this point I have no experience with Access databases and no tools for working with them.
Principles of Design for Our Archives
Our approach to archives is people oriented. There are 10,929 people on the
list of people about whom we have information (identified by the SwimmerID).
The SwimmerID is the key that provides the means for extracting or matching
information about any individual in any of the databases. If these databases
were a well-developed relational database, they would be organized very differently.
All information about each swimmer (hometown, LMSC, club) would be contained in
a "swimmer" file, and any report on any table would take those fields
from the "swimmer" file and not from the subject table. However, in our case
there were so many errors and inconsistencies in our data that it was not possible
to work that way. Much of the information, especially club, LMSC, the correct
spelling of names, and the correct last name, was tentative until it stood the
test of time, and, in fact, much of it is tentative even today.
Many more corrections to name, LMSC, club, and birthday will be made as
errors are discovered and researched. So it may still be premature to move to a
relational database format.
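For comparison, the relational layout described above might look roughly like the following sketch, written in Python with its built-in sqlite3 module purely for illustration (it is not dBase or Access). The table names, column names, and sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One row per person: name, club, and LMSC are stored only here.
    CREATE TABLE swimmer (
        swimmerid TEXT PRIMARY KEY,
        name      TEXT,
        club      TEXT,
        lmsc      TEXT
    );
    -- A subject table: it carries the SwimmerID, not the personal details.
    CREATE TABLE result (
        swimmerid TEXT REFERENCES swimmer(swimmerid),
        meet      TEXT,
        event     TEXT,
        swim_time TEXT
    );
    """
)

# Made-up rows, for illustration only.
conn.execute(
    "INSERT INTO swimmer VALUES (?, ?, ?, ?)",
    ("ABC450312", "John Smith", "Example Club", "Example LMSC"),
)
conn.execute(
    "INSERT INTO result VALUES (?, ?, ?, ?)",
    ("ABC450312", "1993-LC", "100 Free", "1:02.34"),
)

# A report pulls the personal fields from "swimmer" via the SwimmerID.
rows = conn.execute(
    """
    SELECT s.name, s.club, r.meet, r.event, r.swim_time
    FROM result AS r
    JOIN swimmer AS s ON s.swimmerid = r.swimmerid
    """
).fetchall()
print(rows)
```

In this layout a corrected club or name is changed in one place. The drawback noted above still applies: it only works once the SwimmerID matches and the personal details are reliable.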
We need to focus on both text material and competition material while getting started. One of our desires is to get all of the USMS intellectual property in our digital archives into a format in which many people can access it and learn from it. Some people might feel we should do everything the same way. For example, FoxPro was proposed instead of Access. I contacted my experts for opinions, and their main point was that once the information is in a modern format, it will be easy to move it from one format to another.
The most important issue, I believe, is to store information in "unadorned" forms. That means character vectors for text material, not Word or WordPerfect. I don't mind receiving information in Word (not preferred) or WordPerfect (preferred), but neither is a good format for an archive unless we want to set ourselves up for a very labor-intensive process. My desire is always to have a process that is as automated as possible. (By the way, HTML qualifies as an "unadorned" character vector and is a good format, but not as a web page with navigational code added. Archives should have only content. Separately, we develop ways to retrieve it, view it, do analysis with it, enhance it, and publish it.)