CFM Checklist For WebPAC Redirection
March 30th, 1998

There are several issues of concern to the CFM developer when fine-tuning CFM redirection.

1. What elements to redirect on
Unfortunately, this can be server-specific, even though there are standards in the library industry. A library analyst should be consulted to ensure that the default field in the fld_usmarc... cfm file follows accepted practice. For instance, suppose the standard indexing of an author name uses the subfields [a-name] and [d-date of birth]. The fld_usmarc_eng_t.cfm file should then contain the following entry (display portion omitted):

    [+ [$ [$ Display portion ....... ] [$ [* 1, %NL,] "{R}R_AUTHOR " [^ S100a, ] [^ " " S100d,] "{/R}" ] ] ] ]
only on the ‘a’ subfield. Then there must be a field in fld_usmarc_eng_t.cfm that handles redirection for vendor XYZ.

    [T_AUTHOR_XYZ,
placed in the fld_usmarc_eng_t_custom.cfm.
fields; required, optional, illegal. Unfortunately, what is sent in the MARC record often bears little resemblance to how the field is indexed. When designing the standard redirectable field, library standards should be adhered to. In general, all illegal and optional punctuation should be removed from the instance. This is done with a regular expression defined in the "RegularExpressions" section of the fld... cfm file and a ‘rex semantic’ that applies the regular expression to the instance of data. For instance, many MARC records contain an author name in the form "Twain, Mark.". However, these names are usually indexed in the form "Twain Mark". While some servers apply a process called "normalization", which eliminates commas and periods from the search string, others will find no hits when searching on "Twain, Mark.". To solve this, we define a regular expression in the fld...cfm file:

    [REMOVE_COMMA_AND_PERIOD, "[,.]" "" 0,] }
more details see the documentation on regular expressions). To invoke this regular expression on each instance of a given semantic, the semantic name is enclosed in a ‘rex semantic’ grouping.

    [$ [* 1, %NL,] "{R}R_AUTHOR " [^ [@ REMOVE_COMMA_AND_PERIOD, S100a, ]] [^ " " [@ REMOVE_DASHES, S100d,]] "{/R}" ] ...
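The effect of the ‘rex semantic’ grouping can be approximated outside of CFM to check what term a given author field would produce. The Python sketch below is illustrative only and is not part of WebPAC: the function name is hypothetical, the pattern used for REMOVE_DASHES is an assumption (its definition is not shown in this checklist), and the trailing 0 in the CFM regular-expression entry is not modeled.

```python
import re

# Mirrors the CFM entry [REMOVE_COMMA_AND_PERIOD, "[,.]" "" 0,]:
# every comma or period is replaced with the empty string.
REMOVE_COMMA_AND_PERIOD = re.compile(r"[,.]")

# Assumption: REMOVE_DASHES is not defined in this checklist; here it
# is taken to delete runs of two or more hyphens only.
REMOVE_DASHES = re.compile(r"-{2,}")


def author_term(subfield_a, subfield_d=None):
    """Build an author redirection term from 100$a and optional 100$d."""
    term = REMOVE_COMMA_AND_PERIOD.sub("", subfield_a).strip()
    if subfield_d:
        # Assumed to behave like [^ " " ...]: the separating space is
        # emitted only when the optional subfield is present.
        term += " " + REMOVE_DASHES.sub("", subfield_d).strip()
    return term


print(author_term("Twain, Mark."))               # Twain Mark
print(author_term("Twain, Mark,", "1835-1910"))  # Twain Mark 1835-1910
```

Running candidate field values through a helper like this before editing the fld...cfm file makes it easier to compare the resulting term with what an ordinary search on the server accepts.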
USMARC record, it is best to remove it from the instance and reinsert it when formatting the field. For instance, most vendors do not send the dashes in the x, y and z subfields of subject; however, these are required for proper searching of the index. Just in case they are sent, they should be removed to prevent them from being included twice. (Note: A library analyst should be consulted for the definitive word on whether dashes are ever sent in the MARC record.)

    [$ [* 1, %NL,] "{R}R_SUBJECT " [^ S600a, ] [^ " -- " [@ REMOVE_DASHES, S600x,]] [^ " -- " [@ REMOVE_DASHES, S600y,]] [^ " -- " [@ REMOVE_DASHES, S600z,]] "{/R}" ] ...
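The remove-then-reinsert approach for subject subdivisions can be sketched in Python as follows. This is a hedged illustration rather than WebPAC code: the helper names are hypothetical, and the behavior of REMOVE_DASHES (stripping runs of two or more hyphens while leaving single hyphens such as those in date ranges) is an assumption.

```python
import re


def remove_dashes(value):
    """Assumed REMOVE_DASHES behavior: drop runs of 2+ hyphens."""
    return re.sub(r"\s*-{2,}\s*", " ", value).strip()


def subject_term(subfield_a, subdivisions=()):
    """Join 600$a with its x/y/z subdivisions using ' -- '."""
    parts = [subfield_a.strip()]
    for sub in subdivisions:
        cleaned = remove_dashes(sub)
        if cleaned:
            # The separator is reinserted exactly once per subdivision,
            # whether or not the vendor already sent dashes.
            parts.append(cleaned)
    return " -- ".join(parts)


# The same term results with or without dashes in the incoming record.
without = subject_term("United States", ["History", "Civil War, 1861-1865"])
with_dashes = subject_term("United States", ["--History", "-- Civil War, 1861-1865"])
print(without)
print(without == with_dashes)
```

Because the dashes are stripped before the separator is added, a vendor that does send them cannot cause a doubled " -- -- " in the redirected term.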
special field should be added to the fld...cfm for major vendors or added to the custom CFM file for minor vendors.
qualifiers in the connection CFM file. The qualifier refers to a SearchAttribute, which contains a list of attributes to be sent to the server along with the redirected term. The ‘use’ attribute usually refers to the index that will be searched. Redirection terms are often sent to the server as phrases "[4, 1,]" and not truncated "[5, 100,]"; however, this can be adjusted in the connection CFM file if need be. There is no reason why the redirection searchAttributes need to be from the same set of attributes displayed to the user. By setting the category code in the searchAttribute to 0, the search attribute will not be displayed but can still be used for redirection.

4. Character Set Translation
problems are resolved in version 8.1. Currently, in version 8.0, all instances are translated to the display character set, which is usually HTML Latin1. The server, however, usually indexes on the ALA character set, which is a form of ISO 2022. This is no problem for the 7-bit ASCII characters (the first 128 characters); however, no hits will be found for characters in the upper 128 because different codes and sequences are used. A translator is currently being developed that can translate the display character sets back to the ALA character set. This translator will be called on redirection strings sent from 8.0 WebPAC. (Note: The translator will also be used in 8.1 WebPAC for keyboarded search strings, so the effort is not wasted.) In version 8.1 CFM files, translation is specified explicitly in the
CFM file. Since
In some rare cases, the server sends out the ALA character set, yet
indexes on 7
    [$ [* 1, %NL,] "{R}R_AUTHOR " %TRANSDEL, "0" %TRANSDEL, [^ [@ REMOVE_COMMA_AND_PERIOD, S100a, ]] [^ " " [@ REMOVE_DASHES, S100d,]] %TRANSDEL, "{/R}" ] ...
which indicates that the TO_SERVER_TRANSLATOR is to be used. The TO_SERVER_TRANSLATOR is defined in the ‘Maps’ section of the connection CFM.

    [TO_SERVER_TRANSLATOR, PATH, ASCII7, ] }
[TO_SERVER_TRANSLATOR, PATH, ALACHARSET, ] }
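The distinction this section draws between 7-bit ASCII characters and the upper 128 can be checked mechanically for any candidate redirection term. The following Python sketch is a minimal illustration and not part of WebPAC: the function name is hypothetical, and it only identifies which characters would need translation; it does not implement the ALA translation itself.

```python
def non_ascii_chars(term):
    """Return the characters of term that fall outside 7-bit ASCII."""
    return [ch for ch in term if ord(ch) > 0x7F]


plain = "Twain Mark"
accented = "Bront\u00eb Charlotte"  # contains e with diaeresis

print(non_ascii_chars(plain))       # []
print(non_ascii_chars(accented))

# In Latin-1 the accented letter is a single byte above 0x7F, so a
# server indexing on a different 8-bit scheme will not match it.
print(hex(accented.encode("latin-1")[5]))  # 0xeb
```

A term for which the list is empty is unaffected by the choice of translator; any other term depends on the translator named in the connection CFM.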
for keyboarded input. Since the internationalized databases will have their own fld...cfm files, this will not be an issue. In general, no action can be taken in 8.0 CFM files or should be taken
in 8.1
a redirection is nothing more than a search with a search term. If you enter the phrase you think you are redirecting on in an ordinary search (not a scan), you should get back the same number of hits. Thus you can use ordinary searches to experiment with punctuation, fields to redirect on, attributes to use, etc. Secondly, the actual term is sent to the server in a Z39.50 Search Request PDU.
Last Updated: March 31, 1998