Genealogical Computing
4/1/2002 - Archive
Genealogy P2P
Remember when Napster was hot, before the music industry and the courts doused
its sizzle with charges of copyright infringement? Napster had flourished by attracting
large numbers of users who were interested in sharing their copies of music files
with one another. At the time, I wondered whether genealogists would pick up on
the idea and implement Napster's peer-to-peer file sharing format for family research.
With the natural propensity of genealogists to exchange data, it could be an alternative
vehicle to the proprietary databases to which many of us now deliver our family
data.
I've found two groups that are trying to put this peer-to-peer process to practical
use for genealogists on a global scale. Before discussing them, however, it's
useful to understand the technology behind it.
Networking 101
First some definitions. A network is simply two or more computers that are connected
to allow communication to occur between them. The Internet is, of course, a network
on a global scale, but there are other versions of networks that, for example,
connect computers within a multinational corporation, a college campus, or a small
business. A family with more than one computer can even set up a small home network
to share the resources on each computer.
Sharing resources is one of the main reasons to set up a network, and files are
one of the resources that may be shared. If a file is a shared resource, the file
only has to exist on one computer in the network. Someone working at a different
computer on the network can be allowed to access or use the file without having
to copy it to his or her own computer. In other words, there's no need to hike
down the hall or across town with a file copied on a disk for the other person
to use. If other people want a copy of the file, they can copy or download it
over the network, assuming they have permission from the file owner.
Networks are classified as either peer-to-peer or server-based. Most small to
large networks today are server-based, which means that one or more powerful computers
(or servers) are at the center of, and in control of, the network. Programs and
files that are shared resources are stored on these fast, powerful servers, and
individual computers communicate with the servers to use shared resources.
Peer-to-peer, or P2P, networks go back to the early days of networking. In a
pure P2P network, every computer is equal, and the owner of
each computer determines which files are shared with users at other
computers on the network. Each individual also determines when to share by controlling
whether or not his or her computer is turned on. If the computer is off, users
at other computers cannot access anything on that offline computer. Also, when
pure P2P networks get too large, they become slow and inefficient.
Family2Family Live
Napster resurrected P2P file sharing and made it a popular buzzword during the
past couple of years. It did this by using a modified peer-to-peer network that
had servers performing an indexing function, while the actual music files remained
on the users' computers. (There's a bit more to it than that, but for the purpose
of this column, it's not necessary to understand the technical details.)
Two groups have developed their own versions of peer-to-peer file sharing to facilitate
communication and exchange of data between family researchers. GedLink, founded
by Infoduc S.A. of Paris, France, was the first to go online, while GNTP, created
by professors at Brigham Young University, is still in beta testing.
Both of these genealogical systems use a modified peer-to-peer network model with
similarities to that used by Napster, but instead of MP3 music files, users seek
twigs and branches of family trees. Central servers store the index of names that
users search to determine whether other members have family files they would be
interested in. Unlike the music industry, which closed down Napster, the genealogical
community is eager to share its product.
Both the GedLink and GNTP networks are built on GEDCOM files, which can, of course,
be created by any genealogy software program. To participate fully in either network,
you would download the software that can link you to the network. The software
indexes your GEDCOM file and, while your GEDCOM file remains on your computer,
it sends the index info to a central server. With GedLink the indexed data include
surname, first name, and year and place of birth and death. The indexed data remain
in the GedLink directory as long as you choose to participate in sharing your
family tree. Even when you go offline, your indexed data remain in the directory.
(GNTP works somewhat differently than this.)
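For readers who like to see the mechanics, the indexing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual GedLink or GNTP software: the function name index_gedcom and the dictionary keys are hypothetical, the sketch reads only the standard GEDCOM INDI, NAME, BIRT, DEAT, DATE, and PLAC tags, and it treats the first four-digit number in a date as the year.

```python
import re


def _new_entry():
    # Hypothetical entry layout, modeled on the fields the column lists.
    return {"surname": "", "given": "",
            "birth_year": None, "birth_place": "",
            "death_year": None, "death_place": ""}


def index_gedcom(text):
    """Return one small index entry per individual (INDI record)."""
    entries = []
    current = None  # entry being filled in, or None outside an INDI record
    event = None    # "birth" or "death" while inside a BIRT or DEAT block
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a GEDCOM line of the form "level tag [value]"
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == 0:
            # Record headers look like "0 @I1@ INDI": the record type
            # lands in the value position after the split.
            event = None
            current = None
            if value == "INDI":
                current = _new_entry()
                entries.append(current)
        elif current is None:
            continue  # inside a record type we do not index (FAM, HEAD, ...)
        elif level == 1:
            event = {"BIRT": "birth", "DEAT": "death"}.get(tag)
            if tag == "NAME" and not (current["surname"] or current["given"]):
                # GEDCOM marks the surname with slashes: "John /Smith/".
                match = re.match(r"([^/]*)/([^/]*)/", value)
                if match:
                    current["given"] = match.group(1).strip()
                    current["surname"] = match.group(2).strip()
                else:
                    current["given"] = value.strip()
        elif level == 2 and event:
            if tag == "DATE" and current[event + "_year"] is None:
                year = re.search(r"\b(\d{4})\b", value)
                if year:
                    current[event + "_year"] = int(year.group(1))
            elif tag == "PLAC":
                current[event + "_place"] = value.strip()
    return entries
```

Only these small entries would travel to the central server. The GEDCOM file itself, with its notes, sources, and living relatives, stays on your own computer.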
Note, however, that with both systems you can search the global index even if
you do not share a GEDCOM file. In GedLink, if a search result looks promising,
that's when the direct peer-to-peer element kicks in, because the request for
more information is made directly to the person who owns the GEDCOM file you are
interested in. Obviously, there are several advantages here. You have control
over who you share your family data with, and, because there is contact between
genealogists in order to exchange data, it encourages communication. Ideally this
will facilitate corrections being made if someone's data contain errors. And the
GedLink system is set up so it is easy to update your indexed data on the GedLink
directory if your GEDCOM file changes.
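The two-step flow described above, a search of the central index followed by a direct request to the file's owner, might look something like the following Python sketch. Everything here is hypothetical (the class names CentralIndex and Peer, the approve callback, and the data layout) and is meant only to show the division of labor, not how GedLink actually implements it.

```python
from collections import defaultdict


class CentralIndex:
    """Holds index entries only; the full family files stay with their owners."""

    def __init__(self):
        self._by_surname = defaultdict(list)

    def withdraw(self, owner):
        # Remove every entry that belongs to this owner.
        for key in list(self._by_surname):
            kept = [(o, e) for o, e in self._by_surname[key] if o != owner]
            if kept:
                self._by_surname[key] = kept
            else:
                del self._by_surname[key]

    def publish(self, owner, entries):
        # Publishing again replaces the owner's old entries, which is how
        # an updated GEDCOM file would refresh the directory.
        self.withdraw(owner)
        for entry in entries:
            self._by_surname[entry["surname"].lower()].append((owner, entry))

    def search(self, surname):
        # Anyone may search; the result says who to ask, not the data itself.
        return list(self._by_surname.get(surname.lower(), []))


class Peer:
    """A researcher's own computer, holding the full records."""

    def __init__(self, name, records, approve):
        self.name = name
        self.records = records  # full data, never sent to the index
        self.approve = approve  # callable(requester) -> bool, owner's decision

    def handle_request(self, requester, surname):
        # The owner decides, request by request, whether to share.
        if not self.approve(requester):
            return None
        wanted = surname.lower()
        return [r for r in self.records if r["surname"].lower() == wanted]
```

The point of the design is that the server can tell a searcher that someone has a John Smith born in 1850, but only the owner can hand over the rest, and only after being asked.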
GedLink formally announced it was operational in early May 2001. The software
works with all versions of Windows from 95 to XP. Four languages are supported:
English, French, German, and Spanish. As of December 2001, the GedLink Web site
reported it had 20,000 users, the majority of whom were European, while a quarter
of the users were from North America. Although you can search GedLink for free,
it is produced by a private company, which encourages users to become members.
Apparently there are some advanced features reserved for members who pay an annual
fee of $19 U.S.
Not Quite Live
The BYU effort, GNTP, or Genealogical Network Transfer Protocol, is at an earlier
stage of development, and has some significant differences from GedLink, but it
is supposed to remain free. At this point the interface is a Java application
that took an hour to download, but it works with Windows, Mac, and Linux operating
systems. Currently, it only supports English, but a European company is supposed
to be developing a multilingual Windows interface.
GNTP has two major differences from GedLink. First, the GNTP system requires that
a GEDCOM file remain online to be included in the system, and if a user goes offline,
the data are no longer indexed in the directory. Because most genealogists connect
to the Internet through dial-up services, GNTP encourages those users to upload
their GEDCOM files to proxy servers. The advantage is that all data are immediately
available all the time. The disadvantage is you don't have control over who gets
your family data, and there is no obligation for a requester of data to contact
the person who owns the data.
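The difference in how the two directories treat a user who hangs up the modem can be made concrete with a small Python sketch. The Directory class and its keep_offline_entries switch are my own invention for illustration; neither project describes its directory in these terms.

```python
class Directory:
    """Toy directory that either keeps or drops entries when a user goes offline."""

    def __init__(self, keep_offline_entries):
        # True models the GedLink behavior described earlier in this column;
        # False models the GNTP behavior described above.
        self.keep_offline_entries = keep_offline_entries
        self._entries = {}  # owner -> list of indexed surnames

    def connect(self, owner, surnames):
        self._entries[owner] = list(surnames)

    def disconnect(self, owner):
        if not self.keep_offline_entries:
            self._entries.pop(owner, None)

    def owners_of(self, surname):
        return sorted(o for o, names in self._entries.items() if surname in names)
```

Under the second policy, a dial-up user vanishes from search results the moment the connection drops, which is why uploading to a proxy server that stays connected is encouraged.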
The other difference is that GNTP plans for organizations, such as the LDS Church,
to become server nodes. This will certainly provide a large influx of data. However,
I've been critical before in this column of the quality of data in Ancestral
File, since the vast majority of it is undocumented, and the contact information
for the individuals who contributed the data is often out-of-date.
Laudable Goals
Despite these differences, the objective of both efforts is to provide a new way
for genealogists to locate and connect with other researchers in a potentially
more timely and direct manner than the methods we have been using. Because you are in control
of your own GEDCOM file, if you add, delete, or make other changes or corrections
to your family database, you can easily update the GEDCOM files that are indexed
by either network. In addition, you retain ownership of your data, unlike some
proprietary databases, where once your file is incorporated, it becomes part of
a larger whole to which the private company may own the copyright. And participating
actively in a network by contacting other researchers about contradictory data
may improve the accuracy of online research and help reduce the problem of bad
data on the Web.
It remains to be seen whether either of these genealogical peer-to-peer systems
becomes as successful as Napster once was. I would encourage readers to check out
both networks. Obviously, the more family researchers who participate in sharing
data on a network, the more useful that network will be to genealogists seeking
to make connections.
Candace L. Doriott has served on the board of directors of the Detroit Society
for Genealogical Research. The International Society of Family History Writers
and Editors has recognized her for her excellence in writing. She can be contacted
at cdoriott@earthlink.net.