Building the Union Catalog for Pharos™: Directions for Extraction of GAP Bibliographic Records from CSU Library Systems
Initial Build
The first stage of the two-stage process of building the Pharos Union
Catalog has been completed. Using the 11+ million records extracted
from the 22 CSU libraries in late 1998, the initial build of the union
catalog was completed in July 1999. The Task Force for Database Standards
and Management and Ying Liu, the Pharos Database Manager, have been reviewing
the 3.5+ million records loaded into the Horizon system that is being
used to host the union catalog.
The Horizon database used for the union catalog was preloaded with the
Library of Congress Name and Subject Authority records. These records have
been updated to reflect the changes that have been made through July 2000
by LC and will continue to be updated on an ongoing basis with loads of
new and updated authority records from LC.
The English-language monograph and serial records in the union catalog
database with a publication date later than 1991 have been extracted
and sent to Blackwell's, where they were enriched with table of contents
and summary information. The enriched records for 1996 through 1998
have now been loaded back into the union catalog database. As new records
are added to the union catalog, they will be sent to Blackwell's for ToC
and summary enrichment processing.
Maintenance
Numerous test databases were created using a maintenance program that
implements the master record algorithm. This is the same algorithm used
by the program that merged the initial 11.5+ million bibliographic records
extracted in late 1998 into the single file of 3.5+ million records. The
maintenance program was also tested in late 1999 and early 2000 by loading
gap records extracted from San Francisco State's catalog into the live
union catalog database.
The maintenance program initially matches similar bibliographic records.
Once a match has been made, the maintenance program then determines if
the incoming record should become the master record, again using the rules
described in the algorithm. Another important step is the creation of a
new field in this record that is used by the Pharos software to create
a hypertext link to the local catalog. A user can click on this link to
obtain the local call number and circulation status of the holdings in
that library.
The maintenance program is designed to accommodate new records, updated
records, and deleted records. Depending on the status of the existing
record in the union catalog, a new record can be created, or an existing
record can be modified or deleted from the database. The maintenance program
has been created and extensively reviewed to ensure that the union catalog
will continue to keep pace with each CSU online catalog.
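The handling of new, updated, and deleted records described above can be illustrated with a minimal Python sketch. This is not the Pharos maintenance program. The `match_key` value stands in for the matching done by the master record algorithm, which is not reproduced here, and the `MasterRecord` class, the `apply_gap_record` function, and the rule that a master record is removed only when no library still holds the title are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """A union catalog entry and the libraries that hold the title (hypothetical model)."""
    match_key: str
    # Maps a library name to that library's internal control number.
    holdings: dict = field(default_factory=dict)


def apply_gap_record(catalog, library, action, match_key, control_number):
    """Apply one gap record to `catalog`, a dict of match_key -> MasterRecord.

    `action` is "new", "update", or "delete". Returns "created", "modified",
    "deleted", or "ignored" to describe what happened to the master record.
    """
    existing = catalog.get(match_key)

    if action in ("new", "update"):
        if existing is None:
            # No matching master record: create one.
            catalog[match_key] = MasterRecord(match_key, {library: control_number})
            return "created"
        # A match exists: record (or refresh) this library's link.
        existing.holdings[library] = control_number
        return "modified"

    if action == "delete":
        if existing is None or library not in existing.holdings:
            return "ignored"
        del existing.holdings[library]
        if not existing.holdings:
            # Assumption: drop the master record once no library holds the title.
            del catalog[match_key]
            return "deleted"
        return "modified"

    raise ValueError(f"unknown action: {action!r}")
```

In this sketch the outcome depends on the status of the existing record, as in the description above: the same "new" record creates a master record the first time and modifies it when a second library submits a match.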
We are now ready to receive a file of "gap" records extracted from
your library's local system.
Gap File: Defined
A gap file represents those records that update, add to, or delete records
from each library's local online catalog database. For the purposes of
the Pharos union catalog these gap file procedures pertain to bibliographic
records only. Updating of authority records will be done by the Pharos
project office. Updating of item records is not relevant to the updating
of the master records in the union catalog. For example, a record that
reflects only the addition of a second copy of the same edition of a title
should not be submitted, although no harm is done to the union catalog
database if such a record is submitted.
Real Time Updating
A real-time, seamless interface that could be used to update the union
catalog using records from your library's catalog is not yet available.
A plan to implement this type of interface is under development. Until
the real-time interface is available, the currency of the union catalog
will depend on the submission of gap files of bibliographic records. Our
goal is to keep the union catalog current with your local catalog, but
first we have over twelve months of cataloging activity to digest before
we can narrow the gap. We are unsure of the number of
records that we will obtain in this first of what will probably be many
gap files, but knowing that library budgets were uncommonly flush in the
last year, we expect the number to be significant.
A possible set of gap files would include:
-
Bibliographic records added to your system since
the date of the initial extraction of records for the Pharos union catalog.
Please check with Ying Liu if you are unsure of the date
of the initial extraction of records for the union catalog. We
expect that your library has added many new bibliographic records to your
catalog due to the additional funding that was available to CSU libraries
for purchasing new materials for library collections in FY 1999 & 2000.
The users of the union catalog will certainly want to find records for
these important new acquisitions.
-
Bibliographic records that have been modified
since the date of the initial extraction of records for the Pharos union
catalog. A record counts as modified if any maintenance has been performed
on the bibliographic record since the extraction date. Updates and new
records may be submitted as an integrated file or as separate files, based
on local preference or capability. Again, the extraction date for the update
and new record file(s) should be noted by the local library and
submitted to the Database Manager.
-
Bibliographic records that have been deleted
from your local catalog since the date of the initial extraction of records
for the Pharos union catalog. Since the initial extraction of bibliographic
records for the creation of the union catalog, campuses have either been
maintaining a paper copy of records deleted from the local catalog database
or marking/flagging the deleted bibliographic record in the local database.
Paper deletes
For campuses maintaining a paper file of deleted records: Please send
these print-outs to:
Ying Liu
Pharos Database Manager
Office of the Chancellor
California State University
401 Golden Shore, 2nd Floor
Long Beach, California 90802-4210
To avoid the record keeping associated with printing out deleted records,
the Pharos project office and the Task Force for Database Standards and
Management will be working with each library to devise a method of submitting
deleted records electronically.
Electronic deletes
Electronic deletes fall into two categories: (1) deletes from systems
in which flagging or marking of deletes is not possible, which requires
such records to be submitted as a separate electronic file identified as
deleted records; and (2) deletes from systems in which flagging or marking
of deletes is possible, which allows deletes to be submitted in an
integrated file together with updates and new records. For the second
category, libraries must identify the flag (code) used and the location
of the code in the record. (The flag used to mark deleted records cannot
have multiple uses.)
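For the second category, separating flagged deletes from an integrated file might be sketched as follows in Python. Records are modeled here as plain dictionaries keyed by field tag, and the function name, the field `998`, and the flag value `d` are hypothetical. The real flag and its location are whatever each library reports to the Database Manager.

```python
def split_by_delete_flag(records, flag_field, flag_value):
    """Split an integrated file into (deletes, others).

    A record is treated as a delete only when `flag_field` holds exactly
    `flag_value`, which is why the flag cannot have any other use.
    """
    deletes, others = [], []
    for record in records:
        if record.get(flag_field) == flag_value:
            deletes.append(record)
        else:
            others.append(record)
    return deletes, others


# Hypothetical integrated file: one flagged delete, two other records.
integrated = [
    {"001": "b1", "998": "d"},
    {"001": "b2", "998": "-"},
    {"001": "b3"},
]
deletes, others = split_by_delete_flag(integrated, "998", "d")
```

If the same code also marked, say, suppressed records, a test like this could not tell a delete from a suppression, which is the reason for the single-use requirement.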
When the records have been extracted they can be sent to the Pharos
Project Office via SFTP [sftp.calstate.edu Username: MyCampus Password: unique].
Contact Ying Liu at 562 951-4261 to get your actual username and password.
Ying Liu will begin processing your records as soon as they are received.
Please contact Ying Liu to discuss the gap record extraction timetable that
would best suit your library's situation. Because we will be processing these
files one at a time, we are not setting a deadline for completion. Rather,
we would like some libraries to volunteer to extract as soon as possible,
while other libraries can choose a later date. As libraries commit to a
time frame, this information will be posted on a UIAS web site.
The gap file extraction process will vary according to your library's
automation system software and equipment configuration. Common to all,
however, are the following issues that need to be considered when planning
and carrying out the bibliographic extraction:
-
No changes should be made to bibliographic records already in your database
while you are in the process of extracting the gap records from your database.
This is necessary in order to ensure the integrity of each library's records.
New records, however, can still be added to the bibliographic database
during the extraction process. We expect that your gap file record extraction
process will take far less time than the initial extraction and that the
period of time during which database edits should be restricted will also
be brief. We want this gap extraction process to cause little or no
disruption to the schedules of any of your library personnel. Please contact
Ying Liu at 562 951-4261 to review the requirements for this extraction
and to schedule a date to start your gap file extraction.
-
The library system's internal bibliographic control
number must be included in all extracted records. We would like
this number to be located in the same field in which the original extraction
placed the internal control number. If you suspect that a change to your
system has affected this, it is strongly recommended that you extract one
test record and send it to Ying Liu for analysis. A web
page has been established where you can check to see what was recorded
for the initial extraction. This number is crucial to the operation of
the union catalog because it is the "hook" back to the local system, which
will enable the union catalog to display call number(s) and circulation
status for each title. Please indicate which MARC field will be used to
store the internal control number and supply this information to the Pharos
Database Manager. Note: The program implementing the master record
algorithm to merge records will use the 035 field. Placing the internal
control number in the 035 field could cause this program to merge two
bibliographic records that should not be merged. For this reason it is recommended
that the 035 field NOT be used to store the internal control number.
(This "Hook-to-Holdings" capability can be viewed at http://Pharos.calstate.edu:5080/webpac/pharosstart.html
using the "Union Catalog" search option.)
-
Classes of records which are present in the local system but which are
not suitable for inclusion in the union catalog should be identified and,
preferably, removed or excluded from the file of extracted records from
your local library system. Such records could include: faculty-owned
copies of materials on Reserve, items such as room keys, titles withdrawn
from the library, payment records, records used for check-in purposes,
and other items that would not normally be considered part of a library's
collection. Records like the ones described above may already be "suppressed"
from public view in your OPAC. If this is the case, it is also likely that
your library automation vendor could assist you in mapping this characteristic
information to an otherwise unused MARC field. Having this information
in the records that you extract will enable the programs used to build
the union catalog to exclude these records during the load or suppress
them from public view.
-
The date that the gap file extraction is completed should be noted and
this information sent to the Pharos Database Manager. Keeping track
of the date and last record number extracted will be important when the
next file of gap records is requested later in the year. Please send a
copy of this information to yliu@calstate.edu.
Do not extract any additional records until notified
by the Pharos Database Manager.
-
Send any paper copies that your library has saved of bibliographic records
deleted from your local system since the initial extraction of records
to Ying Liu, Pharos Database Manager, Office of the Chancellor, California
State University, 401 Golden Shore, Long Beach, CA 90802-4210. These
printouts will be used by the Union Catalog Database Manager to edit the
union catalog to reflect changes at the campus level between the time the
initial extraction occurred and the present.
-
Send computer files containing only deleted records to the Pharos
project office via SFTP [sftp.calstate.edu Username: MyCampus Password: unique].
Contact Ying Liu at 562 951-4261 to obtain your username and password.
-
For the purposes of the Pharos Union Catalog the definition of a deleted
record is a bibliographic record that was in your system's database at
the time of the initial extraction and which has now been removed because
the library no longer owns a copy of the material represented by the bibliographic
record being removed. Thus, a library that deletes a record from
its local system because it is discovered to be a duplicate entry would
not need to make a printout of that deletion, since the title would still
remain in the database.
-
Please contact the Pharos Database Manager at 562 951-4261 to schedule
a date to extract your first Gap File. As we determine the number of
bibliographic records contained in all of the 22 CSU libraries' gap files
and the rate at which these records can be processed by the new maintenance
program, we will be able to anticipate when we will begin a regular schedule
of extracting records.
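Before a gap file is sent, the control number and suppression points above could be checked with a short Python sketch like the one below. It is only an illustration under stated assumptions: records are modeled as plain dictionaries keyed by MARC field tag, the function names are invented here, and the tags `907` and `999` and the value `n` in the usage lines are hypothetical stand-ins for whatever fields your library reports to the Pharos Database Manager.

```python
def check_extracted_record(record, control_field):
    """Return a list of problems with one extracted record (empty if none).

    `control_field` is the MARC tag the library has chosen to carry its
    internal bibliographic control number.
    """
    problems = []
    if control_field == "035":
        # The merge program uses the 035 field, so it should not be chosen.
        problems.append("the 035 field should not hold the internal control number")
    value = record.get(control_field, "")
    if not value.strip():
        problems.append(f"no internal control number in field {control_field}")
    return problems


def is_suppressed(record, suppress_field, suppress_value):
    """True if the record carries the library's 'suppressed from public view' code."""
    return record.get(suppress_field) == suppress_value


# Hypothetical usage: 907 carries the control number, 999 = "n" marks suppression.
ok_record = {"245": "A title", "907": "b1234567"}
missing = {"245": "Another title"}
hidden = {"245": "Room key", "907": "b7654321", "999": "n"}
report = [check_extracted_record(r, "907") for r in (ok_record, missing, hidden)]
```

Running a check like this on the single test record suggested above is a cheap way to confirm that the control number still lands in the same field as in the initial extraction.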
The gap record extraction process described here is similar to, but smaller
in size than, the extraction process that you followed in late 1998. We
expect that this extraction process will be less burdensome for your staff
because of the smaller size of the extraction. Nevertheless, we expect
that questions and problems will arise. Please call Ying Liu at
562 951-4261 to discuss ways to make this process as easy as possible
or to work out problems that you encounter. Alternatively, send your comments
to Ying Liu at yliu@calstate.edu.
Ongoing Submittal of gap files
The first gap file will be larger than any subsequent gap file because
of the time that was required to create the union catalog. Subsequent gap
files should be submitted on a regular basis. When the first round of gap
files has been processed we will have a better idea of when we will be
able to request that libraries begin to routinely submit gap files. At
that time libraries will be able to choose to submit gap files on either
a daily or a weekly basis. All update gap files must be submitted and
processed in the order in which they are created by the library. We will
be working with each library to ensure that gap file names include information
that identifies the source and date of extraction. An example of this would
be: CCH04232000NEW. Translated, this name would indicate that CSU
Chico extracted these new records on April 23, 2000.
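A file name of this kind can be decoded and sorted mechanically. The Python sketch below assumes the general layout suggested by the single example above: a three-letter source code, an eight-digit MMDDYYYY date, and a trailing record-type label. Only `CCH04232000NEW` comes from this document; the pattern, the function names, and the second file name used for the ordering example are assumptions.

```python
import re
from datetime import date, datetime

# Assumed layout: 3-letter source code, MMDDYYYY date, record-type label.
NAME_PATTERN = re.compile(r"^([A-Z]{3})(\d{8})([A-Z]+)$")


def parse_gap_file_name(name):
    """Return (source_code, extraction_date, record_type) for a gap file name."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognizable gap file name: {name!r}")
    source, stamp, kind = match.groups()
    # strptime also rejects impossible dates such as month 13.
    extracted = datetime.strptime(stamp, "%m%d%Y").date()
    return source, extracted, kind


def in_processing_order(names):
    """Sort gap file names by extraction date, oldest first."""
    return sorted(names, key=lambda n: parse_gap_file_name(n)[1])


parsed = parse_gap_file_name("CCH04232000NEW")
# parsed == ("CCH", date(2000, 4, 23), "NEW")

# Hypothetical second file from the previous December.
ordered = in_processing_order(["CCH04232000NEW", "CCH12151999NEW"])
# ordered == ["CCH12151999NEW", "CCH04232000NEW"]
```

Sorting on the parsed date rather than on the raw name matters because the month comes first in the name, so a plain alphabetical sort would place April 2000 ahead of December 1999.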
You can search the CSU union catalog at http://Pharos.calstate.edu:5080/webpac/pharosstart.html
using the "Union Catalog" link. Even though the indexing of these 5 million
records is now complete, fine-tuning of the searching capabilities offered
through the Pharos Web interface continues.
Directions for the SFTP of records to the Pharos Union Catalog:
SFTP: sftp.calstate.edu
Username: MyCampus
Password: unique
You can send records directly to your campus's root directory when using
your campus's special username; there is no need to change directories.
bakersfield
chico
channelislands
dominguezhills
fresno
fullerton
hayward
humboldt
longbeach
losangeles
maritimeacademy
montereybay
northridge
pomona
sacramento
sanbernardino
sandiego
sanfrancisco
sanjose
sanluisobispo
sanmarcos
sonoma
stanislaus
Directions for:
Innovative Systems
Geac Systems
Horizon Systems
DRA Systems
Endeavor Systems
UIAS Task Force for Database Standards & Management Documents
UIAS Task Force for Systems Documents