
SOAR VOLUNTEER INSTRUCTION #1 (Reference only) SUPERSEDED BY INSTRUCTION #6
Horizontal image problem
All of the images you will receive in your packet will have the header typed across the top. That is the name of the deceased, the name of the publication the obit was clipped from, and the date of that publication. Below that will be the actual obit. If that obit was pasted horizontally, only the header will OCR and appear in the text block on the left side of your screen. Enter the keywords from that text and then type in the keywords from the obit. The obit in the image server can be rotated and zoomed in or out for readability. When hand typing, enter a comma between keywords to keep them separated. This is a bit more work than the vertically pasted obits but it does give you a chance to read the entire obit. Some of them are interesting. Here is a sample:

If the obit was glued vertically to the header, proceed as usual.
Ken Leffler
AHSGR SOAR coordinator
209-533-4056
SOAR VOLUNTEER INSTRUCTION #2 (Reference only) SUPERSEDED BY INSTRUCTION #11
Illegible or foreign language obits
Some of the obits you receive will be illegible or will be in German, such as those from the Sendbote or Kirchenbote. When you encounter these, go to the file navigation block in the CIC Keyword for Distribution Window and click on the right arrow. That will bypass that image and you can then process the next good one. That image will come back to the server and we will either rescan it if it is illegible or send it to one of our volunteers who is fluent in German.
Note: Once you bypass an obit, the Image Server will not automatically bring up the next one after the update. Use the file navigation forward/next arrow to advance. This must be done through the rest of that packet.
SOAR VOLUNTEER INSTRUCTION #3 (Reference only) SUPERSEDED BY INSTRUCTION #11
Keyword indexing packets are now on the way to volunteers who have requested
them. You all should have your CDs or they should be delivered to you by the
Postal Service. Ken, Marge and I have received a number of questions and we
thought a simple guide for "keyword indexing" was in order. Below is a
guide to keyword indexing with examples. Keywords followed by an "*"
are examples of correct indexing; keywords followed by a "#" are
examples of incorrect indexing. We hope the instructions prove useful for you as
a ready guide and reference.
* = Correct keywords
# = Incorrect keywords or interpretation of data presented
Names of Documents
Examples: "The Oregonian" "The Fresno Bee"
Index as:
Oregonian *
The Oregonian *
Do not index as:
The OR #
OR #
Index as:
Fresno *
Bee *
The Fresno Bee *
Do not index as:
Fresno Bee #
Names of Persons
Names are spelled as they appear in the document even if they appear misspelled.
If the name is spelled two or more ways in the document, keyword all spellings
found. Your reference to verify spelling should always be to the document and
not the OCRed interpretation of the document.
Names of Persons, examples: "Rev. Tim E. Abel" "Mrs. Johann
Schmidt"
Indexed as:
Tim *
Abel *
Tim E. *
Rev. Abel *
Do not index as:
E. #
Rev. Tim #
Timothy #
Index as:
Johann *
Schmidt *
Mrs. Johann Schmidt *
Do not index as:
Mrs. #
Mrs. Johann #
John #
Names of Places
Names of Places, Cities and States, examples: "Cheyenne, Wyo."
"Sterling, CO" "Spavluka, Russia"
Indexed as:
Cheyenne *
Wyo. *
Do not index as:
WY #
Cheyenne, Wyo. #
Cheyenne, WY #
Cheyenne, Wyoming #
Wyoming #
Index as:
Sterling *
CO *
Do not index as:
Colo. #
Sterling Colo. #
Sterling Colorado #
Index as:
Spavluka *
Russia *
Do not index as:
Spavluka, Russia #
Spavluka, RU #
RU #
Huck #
Names of Churches, Funeral Homes, and Cemeteries, examples: "St.
John Lutheran" "Saint Patrick Catholic Church"
Index as:
St. *
John *
Lutheran *
Do not index as:
Saint #
St. John Lutheran #
Index as:
Saint *
Patrick *
Catholic *
Church *
Do not index as:
St. #
Saint Patrick Catholic Church #
St. Pat's #
Other Dates
Dates other than the document date captured in the date box above the OCRed
text. Birth and death dates are provided for in the boxes below the keyword box.
The date must be given in the article as a calendar date. Enter these dates in
the date format required in the boxes. Other dates may appear such as marriage,
death of spouse, etc. Other than marriage dates, these other dates are not likely
to be useful for an inquiry by a researcher, although they may be very useful to
further that person's research. We urge you to refrain from calculating birth and
death dates from information given in the obituary such as "She passed away
last Tuesday..." or "...aged 67 years, 12 months, 23 days."
A marriage date may be keyword indexed using the date convention as it appears in
the document.
Finally, remember, the only stupid question is the question that is not asked.
We expect many questions and will answer all of them. Remember though, we are
not waiting at our machines for questions to arrive and we may need to research
the question more thoroughly, possibly delaying our response. Always remember,
our job is to index the document, not to annotate it.
Bob Benson
SOAR VOLUNTEER INSTRUCTION #4 (Reference only) SUPERSEDED BY INSTRUCTION #11
The intent of the obituary phase of this project is to accurately preserve the content of these obituary collections, as they were printed. The search engine will look for the original words entered, bring up the obit image, and researchers can then look at the originals and make their own changes if they so desire.
It is imperative that we enter the data exactly as it is on the original obit. Do NOT change Neb. to NE, Calif. to CA, Colo. to CO, etc. Please make sure that the data saved in the "Keywords to index by" block matches the data on the obit image in the Image Server window, prior to updating. Corrections can be made in that block at that time. See instruction #7.
I have found it best to enter the publication date in the upper date block and the dates of birth and death in the "Additional Dates" blocks prior to indexing keywords. That way, you will not hit the update button when you are through indexing and forget the dates. They are very important!
SOAR VOLUNTEER INSTRUCTION #5 (Reference only) SUPERSEDED BY INSTRUCTION #11
Two page obituaries
Some of you will receive two page obituaries. When we scanned them some had data on both sides of the card and it comes out as separate images. If you see an obit that appears to be incomplete, please just bypass that obit by going to the "file navigation" block and clicking on the forward button. That image will then be returned to us for further processing, i.e. make a single image when we find both parts. It would help us if you would note this on the e-mail message for the return packet.
SOAR VOLUNTEER INSTRUCTION #6
Indexing multiple column and horizontal obits
Many of the obituaries have multiple columns and, when OCRed, the columns sort
of merge together as the OCR tries to read straight across the page. Volunteer Marcia
Paxton came up with the following method to easily OCR one column at a
time. This will save you many hours of hand typing. We suggest you use this
method:
Some of the obits are pasted on the cards horizontally. Do the following:
Ken Leffler
AHSGR SOAR project coordinator
SOAR VOLUNTEER INSTRUCTION #7
Editing completed obits prior to submission
This question comes up again and again so I am posting it for the new volunteers and those of us with short memories.
Ken Leffler
AHSGR SOAR project coordinator
SOAR VOLUNTEER INSTRUCTION #8 (Reference only) SUPERSEDED BY INSTRUCTION #11
Indexing four digit numeric dates
We have a change of direction. In order to preserve the four digit numeric
dates such as 1913, we are now asking you to enter them and add a letter after
each one. The letters to use are b for birth, d for death, i for
immigration, l for location, and m for marriage. For example, a
death date of 1989 would be 1989d, a marriage date of 1937 will be 1937m, a
location date of 1967 will be 1967l, etc. You should continue to complete dates
such as April 13 1980 by highlighting that phrase in the Words Found block, but
be sure to delete any periods (dots) and commas from that phrase while it is in
the Keywords to Index by block. Everything else remains the same. All cities,
churches, etc., are to be individual words, such as San Diego is San and Diego.
Ken Leffler
AHSGR SOAR coordinator
SOAR VOLUNTEER INSTRUCTION #9
Enlarging the "Words Found" data block
Some people are having a problem on their computer where the "Words found" block is only one line tall. To correct this, do the following:
Ken Leffler
AHSGR SOAR project coordinator
SOAR VOLUNTEER INSTRUCTION #10 (Reference only) SUPERSEDED BY INSTRUCTION #11
You may have encountered delays in indexing when a document is skipped because of the following conditions:
Document is printed in German, Portuguese, Spanish, or other foreign language.
Document is illegible.
Document is a second or subsequent page of a multi-page document.
Document is a cross reference card (X-ref).
Document is blank.
Image has two overlapping document images. This condition requires us to retrieve and rescan the overlapping documents, if possible.
The Image Server is programmed to OCR text and sends the OCR output for display on the Keyword for Distribution window. This may delay your ability to navigate to the next image you need to keyword. When you attempt to skip to the next available image before the OCR is completed you may time-out, requiring closing and restarting the Keyword for Distribution Program. This may be especially frustrating to you volunteers who are "POWER USERS." We have developed a work around as follows:
If the document is German, Portuguese, Spanish or other foreign language, after the document has OCRed type "foreign" in the "Keywords to Index by" box, note the image number*, and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, *Image # and comment, e.g. 20-0007777 foreign.
If the document is illegible, after the document has OCRed type "illegible" in the "Keywords to Index by" box, note the image number*, and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, Image # and comment, e.g. 20-0007777 illegible.
If the document is a second or subsequent page of a multi-page document, after the document has OCRed type "multi-page" in the "Keywords to Index by" box, note the image number and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, Image # and comment, e.g. 20-0007777 multi-page.
If the document is a cross reference card, after the document has OCRed type "xref" in the "Keywords to Index by" box, note the image number, and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, Image # and comment, e.g. 20-0007777 xref.
If the document is blank, after the document has OCRed type "blank" in the "Keywords to Index by" box, note the image number, and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, Image # and comment, e.g. 20-0007777 blank.
If the image has two or more documents, usually one partially covering the other(s), after the document has OCRed type "rescan", note the image number and press process. Note this condition in your E-mail comments when you return the keyword packet as follows, Image # and comment, e.g. 20-0007777 rescan.
The image number is displayed at the top of the Keyword for Distribution window in a box titled "File to Index." Usually you cannot see the full number as it is hidden from view. Here is the quickest way to check for the image number. Place your cursor in the upper right of the Keyword for Distribution window at the middle box with the square icon. Press your left mouse button and the window will maximize, disclosing the full text in the "File to Index" box. That allows you to view and note the image number. After you have recorded the number you need to restore the window to view both the Keyword for Distribution and Image Server windows. Place your cursor on the middle box that now displays an icon with two smaller boxes. Press the left mouse button. The Keyword for Distribution window should return to its previous size and location.
The above instructions should cover most instances where you are unable to keyword the documents presented to you. We will take action in the QA/QC Edit phase to resolve the problem.
Remember, your good notes sent along with your E-mail are helpful in securing a quality index for our ancestral records.
Bob Benson
AHSGR SOAR project coordinator
SOAR VOLUNTEER INSTRUCTION #11
This Instruction supersedes and replaces SOAR VOLUNTEER Instructions #2, #3,
#4, #5, #8, & #10. If you have any questions regarding these changes please
address your questions to our SOAR VOLUNTEER FORUM at AHSGR-SOAR-L@rootsweb.com
or Bob Benson at rmbmlb@attbi.com or Ken Leffler at klef@sonnet.com.
We have come a long way and learned much as we have developed our skills and
techniques in the “ART” of keyword indexing. Our Instructions have attempted
to inform you of how and what to index. Those instructions were based upon our
understanding of how the software processes the keywords. The last element of
our keyword indexing system is now in place, the Quality Assurance (QA), Quality
Control (QC) and Edit capability. Our experience these past three months and
our ability to now observe the keyword selection as it passes through a series
of filters affords us the opportunity to define and restate our keyword instruction
incorporating the need to make some adjustments in our methodology.
Our focus continues to consider the indexer’s time and the researcher’s needs.
The obituary index could have been as simple as keywording the deceased’s surname.
However, from the beginning we attempted to provide future researchers every
reasonable resource to retrieve the records they need. We hope this will lead
to new discoveries about our ancestors and their associations and affiliations.
Special Characters:
The keyword index filter truncates any entry at the first space or special character.
Examples of special characters are the ampersand, i.e. “&”, the virgule, i.e. “/”,
and the space, i.e. “ ”. Generally speaking, special
characters are the symbols at the top of your keyboard above the numbers and
to the right of the letters p, l, and m on a Qwerty keyboard. The special character
that concerns us most is the space. In the past we have asked you to keyword
certain publication names, initials of individuals, individuals with titles,
etc. as follows:
Example 1. “The Oregonian”
oregonian
the oregonian; discontinue the index example on this line.
Example 2. George W. Bush
bush
george
george w; discontinue the index example on this line.
Example 3. Rev. Billy Graham
billy
graham
rev
rev graham; discontinue the index example on this line.
When the keyword index filter encounters character strings such as “the oregonian”
it reads the space and disregards all text that follows the space. It then compares
the remainder, the word “the” and discards the word “the.” When “george w” is
encountered, it reads the space, disregards the “w” and any other character
that follows. It then finds “george” is already entered and discards the duplicate
entry. In the case of Rev Graham the space is encountered and all characters
that follow are discarded, leaving “rev.” The example “rev graham” truncates
to rev, and rev is duplicated and discarded.
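The behavior described above can be pictured with a short sketch. It is only an illustration written in Python, not the actual index filter software: the function names `truncate` and `filter_keywords` and the small `STOP_WORDS` list are assumptions made for this example.

```python
# Hypothetical stop words: a few articles, conjunctions, prepositions and pronouns.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "our"}

def truncate(entry):
    """Keep only the text before the first space or special character."""
    for position, character in enumerate(entry):
        if not character.isalnum():
            return entry[:position]
    return entry

def filter_keywords(entries):
    """Imitate the truncation and de-duplication described in the text."""
    kept = []
    for entry in entries:
        word = truncate(entry.strip().lower())
        if not word or word in STOP_WORDS:
            continue  # nothing left, or an excluded part of speech
        if word in kept:
            continue  # duplicate of a keyword already entered
        kept.append(word)
    return kept

print(filter_keywords(["oregonian", "the oregonian"]))
print(filter_keywords(["billy", "graham", "rev", "rev graham"]))
```

Under these assumptions the entries “the oregonian” and “rev graham” contribute nothing new, which is why they are discontinued.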
Date Entry:
Date entry is the completion of the three date fields. This can be done by key
entry of the full date or key entry of the year and using the calendar pull
down menu navigating to the month and day and pressing your left mouse button
on the correct day of the selected month.
SOAR Instruction #3 informed you to key the date only when the full calendar
date is published. This instruction is reversed and withdrawn. If a day of the
week is mentioned in context with a full publication date, use the calendar
function to ascertain the death date, i.e. publication date is 07-29-1978 and
person died on Thursday. Ascertain the calendar date for Thursday as 07-27-1978.
Use this data entry carefully and determine that these dates are in context.
Example: Article reports the person died on Thursday and decedent was buried
on Saturday. The publication date is a Friday. This information from the publication
suggests the death date is another earlier date, 07-20-1978. Enter that date
accordingly. If you are uncertain, note the death date as an exception, “death
date unclear” in the notes you return in your e-mail with the completed packet.
There is another trap lying in wait regarding death dates. For example, the publication
is dated January 5, 1999 and the decedent died on Tuesday, or the publication is
dated January 5, 1999 and the decedent died on December 30. Be sure you catch
the fact that the preceding month is in 1998.
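The calendar reasoning above can be checked with a few lines of Python. This is a minimal sketch, not part of the indexing software; the helper names `weekday_before` and `month_day_before` are made up for the example, and it assumes that a weekday named in an article always falls strictly before the date it is counted back from.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def weekday_before(reference, weekday_name):
    """Latest date strictly before `reference` that falls on the named weekday."""
    target = WEEKDAYS.index(weekday_name.lower())
    back = (reference.weekday() - target) % 7
    if back == 0:
        back = 7  # same weekday as the reference date: assume the week before
    return reference - timedelta(days=back)

def month_day_before(publication, month, day):
    """Latest occurrence of month/day on or before the publication date.

    Note: February 29 raises ValueError if the candidate year is not a leap year.
    """
    candidate = date(publication.year, month, day)
    if candidate > publication:
        candidate = date(publication.year - 1, month, day)  # previous year
    return candidate

# Published 07-29-1978, died on Thursday.
print(weekday_before(date(1978, 7, 29), "Thursday").strftime("%m-%d-%Y"))
# Published January 5, 1999, died on December 30.
print(month_day_before(date(1999, 1, 5), 12, 30).strftime("%m-%d-%Y"))
```

For the burial example, counting back from the Friday publication date to Saturday and then from that Saturday to Thursday gives the earlier death date. Treat the result as a check on your own reading of the article; if the context is unclear, note the exception instead.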
Consistent with our frugal traditions, some images will include more than one
article published on different dates. When this occurs, enter the earlier date.
Keyword Entry:
Persons:
Keyword all persons, living or dead, mentioned in the publication that were
contemporaries of the record subject. This includes but is not limited to spouses,
parents, aunts and uncles, brothers and sisters, children, grandchildren, nephews
and nieces, in-laws, coworkers, and participants in the funeral service, e.g.
clergy, casket bearers, soloists, organists, ushers and attendees.
Initials:
If a single initial is given, do not keyword it. The index filter disregards
single alphabetical characters.
Example: George W. Bush. If you keyword George W the index filter will drop
the W.
If two or more initials are given, key them without a space or period between
or after the letters.
Example: G. W. Bush, key initials as GW or gw. Note: alphabetic characters are
not case sensitive in keyword and all keywords are displayed in lower case upon
pressing update.
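As an illustration of the initials rule, here is a small Python sketch. The function name `key_initials` is hypothetical, and it expects to be given only the initials, not the whole name.

```python
def key_initials(initials):
    """Join the letters of a run of initials, e.g. "G. W." becomes "gw"."""
    letters = [character.lower() for character in initials if character.isalpha()]
    if len(letters) < 2:
        return None  # a single initial is not keyworded
    return "".join(letters)

print(key_initials("G. W."))
print(key_initials("W."))
```

Lower case is used here only because keywords are displayed in lower case after update; GW and gw are equivalent entries.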
Places:
Keyword place names of townships, communities, villages, towns, cities, counties,
states, provinces and nations. If the name is identified as name + township,
name + city or name + county, include the word township, city, or county as
well; e.g. keyword;
Berlin Township as
berlin
township
Risley Township as
risley
township
Midway City as
midway
city
Nebraska City as
nebraska
city
Weld County as
weld
county and
Marion County as
marion
county.
State or Province Abbreviations:
Keyword abbreviations without a space or period between or after the letters.
The index filter disregards the following parts of speech: Articles, Conjunctions,
Prepositions and Pronouns.
The following state abbreviations must be amended.
Example 1: IN is the standard USPS abbreviation for the state of Indiana. This
combination of letters also spells the word “in” a preposition. Amend the abbreviation
IN for Indiana by adding the letter “d,” as IND or ind.
Example 2: OR is the standard USPS abbreviation for the state of Oregon. This
combination of letters also spells the word “or,” a conjunction. Amend the abbreviation
OR for Oregon by adding the letter “e” as in ORE or ore.
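The abbreviation rule can be sketched in Python as follows. This is an illustration only; the name `key_abbreviation` and the `AMENDED` table are assumptions, and the table holds just the two amendments named in this instruction.

```python
# Only the two amendments named above; any others would be guesses.
AMENDED = {"in": "ind", "or": "ore"}

def key_abbreviation(abbreviation):
    """Drop spaces and periods, lower-case, and amend IN and OR."""
    key = "".join(ch for ch in abbreviation if ch.isalnum()).lower()
    return AMENDED.get(key, key)

print(key_abbreviation("IN"))
print(key_abbreviation("N.D."))
```

Without the amendment, “in” and “or” would be discarded by the index filter as a preposition and a conjunction.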
Institutional Names:
Keyword the names of institutions associated with the record subject: 1. church where
affiliation was maintained, 2. hospital or nursing facility where death occurred, 3. mortuary, funeral home
and/or funeral service providers, 4. cemetery or mausoleum where laid to rest.
When keywording institutional names, particularly churches, Articles, Conjunctions,
Prepositions and Pronouns are often encountered. You do not need to keyword
these parts of speech as the software index filter will delete these words.
However if the words are entered individually no harm is done. We recommend
that when in doubt, keyword and let the software index filter figure out the
excluded parts of speech. Note: In most instances Articles, Conjunctions and
Prepositions are lower case while Pronouns used in a name are usually capitalized.
Example 1. Our Savior Lutheran Church; keyword as
church
lutheran
savior
Example 2. Church of the Covenant; keyword as
church
covenant
Publication Names:
When keywording publication names, past instruction was to keyword each name
separately and, if preceded by the word “The,” to enter a character string,
i.e. characters separated by a space. That instruction is revoked. See “Special
Characters:” above for more information. Keyword the names separately,
excluding Articles, Conjunctions, Prepositions and Pronouns.
Example: The Denver Post; keyword as
denver
post
the denver post; discontinue the index example on this line.
This same rule applies to many other newspaper names such as The Oregonian and
The Fresno Bee. Character strings separated by special characters are not accepted
by the software index filter.
Miscellaneous Dates:
Keyword the year of birth, death, immigration, relocation and marriage of the record subject
(keyword the birth and/or death year of the record subject only when you are unable
to record that date in the DOB or DOD fields) or of persons associated with the
record subject.
Misc. Date Entry:
This instruction applies only to dates entered in the “Keyword to Index by”
box. It does not apply to the Publication Date, Birth Date and Death Date fields.
Keyword the year only, followed by a letter, b = birth year, d = death year,
i = immigration year, l = relocation year and m = marriage year. Do not keyword
the month or day. When the full date “month day and year” (MDY) or “day month
and year” (DMY) are entered all information following the first space encountered
is truncated. If the convention is MDY the month only is left. Unfortunately
these months can become clutter frustrating a researcher. That is because the
following month names or abbreviations are also the given names of persons,
e.g. Jan, March, April, May, June, July and August. When entered as DMY all
information after the first space encountered is truncated leaving a number.
When the keyword index filter encounters a number with no succeeding letter(s),
it is discarded. Rather than formulate a set of complex rules for the entry
of miscellaneous dates our instruction is to enter the year followed by the
keycode letter as follows:
Example 1: Spouse born July 4 1905, keyword as:
1905b
Example 2: Spouse died January 7 1978, keyword as:
1978d
Example 3: Record subject immigrated to America in August 1911, keyword as:
1911i
Example 4: Record subject and spouse moved to Washington in winter of 1936,
keyword as:
1936l
Example 5: Record subject married on 19 June 1946, keyword as:
1946m
Remember, entry of full dates accomplishes no productive purpose and muddies
the data base with clutter for future researchers.
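The year-plus-keycode convention is simple enough to express as a short Python sketch. The function name `misc_date_keyword` and the event names used as dictionary keys are assumptions for the example; the letters themselves come from the instruction above.

```python
KEYCODES = {"birth": "b", "death": "d", "immigration": "i",
            "relocation": "l", "marriage": "m"}

def misc_date_keyword(year, event):
    """Build a keyword such as "1905b" from a four-digit year and an event type."""
    if not 1000 <= year <= 9999:
        raise ValueError("expected a four-digit year")
    return f"{year}{KEYCODES[event]}"

print(misc_date_keyword(1905, "birth"))
print(misc_date_keyword(1946, "marriage"))
```

Because the result contains no space and ends in a letter, it is neither truncated nor discarded as a bare number.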
Keyword Processing:
Precedence: The order of precedence when facts conflict is 1. the article,
2. facts available to you derived from the article, 3. the mounting card on
which the article is affixed and 4. hand written or typed extraneous
comments.
OCR: Process keywords using the OCRed text compared and corrected to the
text appearing in the article or key-entered directly from the article.
Cropping: Take advantage of the crop image tool when multiple columns are
encountered.
Rotate: Rotate the image when it is presented sideways.
De-Speckle: If an image does not keyword well, use the de-speckle tool.
Returning the Completed Packet:
Attach and Send the completed packet named Rtn###.dst (where # = number) via
e-mail to soarproject@ahsgr.org
General
Complete the subject line of the message with the packet name included i.e.
Rtn###.dst.
Include all exceptions encountered in the text of the returned message, e.g.
image number 20-0012345 –no publication date, or no death date or no birth date
and special cases such as:
Image File Format:
The images are in “tagged image file format” (tiff). These files are identified
using a file name such as 20-0012345.tif. The file images are examined in thumbnail
view to identify and remove blank and Xref cards before the packets are created. This
creates a “browse” file. The file extension for these browse files is .jbf.
If this file format is encountered it will not open; do not try to index it. Include
this exception in your comments with your return packet e-mail. Simply state
“jbf file encountered.”
Foreign:
If the document is German, Portuguese, Spanish or other foreign language after
the document has OCRed type "foreign" in the "Keywords to Index by" box.
Make a note of the image number.*
Press Update
Note this condition in your E-mail comments when you return the keyword packet
as follows:
Image # and comment, e.g. 20-0012345 foreign.
Illegible
If the document is too faint or unfocused, after the document has OCRed type
"illegible" in the "Keywords to Index by" box. Make an effort to keyword
the data as best you can, especially the header data, as this is probably the
best scan of the obituary that we can get.
Make a note of the image number*
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows:
Image # and comment, e.g. 20-0012345 illegible.
Multipage
If the document is a second or subsequent page of a multi-page document, including
a photo only, after the document has OCRed type "multipage" in the "Keywords
to Index by" box.
Make a note of the image number.
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows;
Image # and comment, e.g. 20-0012345 multipage.
Cross (Xref) Reference Card
If the document is a cross reference card, after the document has OCRed type
"xref" in the "Keywords to Index by" box.
Make a note of the image number.
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows, Image # and comment, e.g. 20-0012345 xref.
Blank
If the document is blank, after the document has OCRed type "blank" in the "Keywords
to Index by" box.
Make a note of the image number.
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows, Image # and comment, e.g. 20-0012345 blank.
Rescan
If the image has two or more documents, usually one partially covering the other(s),
after the document has OCRed type "rescan" in the "Keywords to Index by" box.
Make a note of the image number.
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows, Image # and comment, e.g. 20-0012345 rescan.
Multiple Death
If the image has two or more contemporaneous deaths (two or more deaths occurring
at or about the same time), after the document has OCRed type "multiple death" in the
"Keywords to Index by" box.
Make a note of the image number.
Press Update.
Note this condition in your E-mail comments when you return the keyword packet
as follows, Image # and comment, e.g. 20-0012345 Multiple Death.
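If you keep your exception notes as a list while you work, the return message can be assembled as in the Python sketch below. It is only an illustration of the "Image # and comment" layout; the function name `return_message`, the packet name Rtn123.dst and the second image number are made up for the example.

```python
def return_message(packet_name, exceptions):
    """Return (subject, body) for the e-mail that accompanies a completed packet."""
    # One line per exception: image number, a space, then the comment.
    lines = [f"{image_number} {comment}" for image_number, comment in exceptions]
    return packet_name, "\n".join(lines)

subject, body = return_message(
    "Rtn123.dst",  # hypothetical packet name
    [("20-0012345", "foreign"), ("20-0012346", "no death date")],
)
print(subject)
print(body)
```

The subject carries the packet name and each exception sits on its own line, so the notes are easy to match to images in the QA/QC Edit phase.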
Glossary/Definition of Terms:
Blank is a card with no meaningful text or photograph.
Cross Reference is a card that directs the researcher to another location in
the paper file.
DOB is date of birth.
DOD is date of death.
Double is an image(s) partially covered by another image.
Duplicate is an article published in the same publication, published on the
same day.*
Foreign is used to describe any language other than English.
Institutional name is a name of a business or organization.
Multiple death is where two or more contemporaneous deaths are recorded in an
article or image.
Multipage is where an article extends to a second or subsequent image.
Photo Only is when an image displays a photograph with no text.
Place name is the name of a township, community, village, town, city, county,
state, province or nation.
Record Subject is the person about whom the article is written.
Repeat is a second or subsequent article published in the same or another publication.
Xref is an abbreviation of Cross Reference.
*Index exact duplicates when encountered. Duplicates will be examined in a separate
process and the superior image will be selected for retention.
Table of Contents:
Superseded Instructions
Introduction
Special Characters
Date Entry
Keyword Entry
Persons
Initials
Places
State or Province Abbreviations
Institutional Names
Publication Names
Miscellaneous Dates
Misc. Date Entry
Keyword Processing
Returning the Completed Packet
General
Foreign
Illegible
Multipage
Cross (Xref) Reference Card
Blank
Rescan
Multiple Death
Glossary/Definition of Terms
Telephone: (402) 474-3363
Fax: (402) 474-7229
E-Mail: ahsgr@ahsgr.org