Network Working Group                                        D. B. McKay
Request for Comments: 316                                  A. P. Mullery
NIC: 9346                                                            IBM
                                                  February 23 & 24, 1972


               ARPA Network Data Management Working Group


   The meeting had two phases.  The first consisted of presentations on
   applications of networks and on development work in designs that
   allow data sharing in a computer network; the second was a working
   meeting to discuss what the data management working group should do.

Phase I

   JOHN SENIOR, Univ. of Penn. and National Board of Medical Examiners,
   Phila., PA., described the use of a network to provide access to
   models that simulate medical behavior of patients.  These models are
   used primarily for teaching and testing physicians.  The network
   provides an interface by which varieties of terminals can connect to
   and access these models.  Other data bases exist to which access
   through a network may be desirable; however, these data bases have a
   "polyglot" of organizations, making it presently impossible to use
   foreign data bases.

   HECTOR MAYNEZ, National Library of Medicine, described the MEDLINE
   system.  This has 1000 journals on-line to which access can be made
   via a network.  This network, like the one above, provides the
   interface for access by various terminals.  The network includes
   four or five computers with other applications such as CAI and
   clinical diagnosis.

   RAY BEVERIDGE, MITRE, presented the requirements for the WWMCCS
   (World Wide Military Command and Control System) Network.  This
   network will contain 25 nodes and have a data exchange rate on the
   order of 10,000,000 characters per day.  Three types of data were
   formulated: query data with response on the order of seconds, daily
   exchange for updates and reports, and other data for weekly,
   monthly, or as-required reports.
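
   To put the quoted exchange rate in perspective, the following is a
   rough calculation only.  It assumes the traffic is spread evenly
   over the day and over the 25 nodes, which was not stated at the
   meeting; real traffic would be much burstier.

```python
# Back-of-the-envelope load estimate. The even spread over time and
# over nodes is an assumption, not something stated in the presentation.
CHARS_PER_DAY = 10_000_000        # order of magnitude given in the talk
NODES = 25                        # node count given in the talk
SECONDS_PER_DAY = 24 * 60 * 60

avg_chars_per_second = CHARS_PER_DAY / SECONDS_PER_DAY
avg_chars_per_node_per_day = CHARS_PER_DAY / NODES

print(f"network-wide average: {avg_chars_per_second:.1f} chars/s")
print(f"per-node average: {avg_chars_per_node_per_day:,.0f} chars/day")
```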

   ERIKA PEREZ, MITRE, discussed data management for the WWMCCS Network.
   The two problems are determining the location of desired data, and
   providing the proper security and reliability for vital data.  The
   location of data bases will be indicated in directories which may
   automatically determine which segment is applicable to a query.  The
   directory will contain lists of data bases, files, users, and
   programs.



McKay & Mullery                                                 [Page 1]


RFC 316              Data Management Working Group         February 1972


   The directory can be centralized (all at one location), distributed
   (split into pieces, each of which resides at one location),
   partially replicated (split into pieces, some of which may be
   replicated at different locations), or completely replicated (the
   complete directory at all locations).
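
   The four arrangements can be told apart by looking at where the
   copies of each directory piece live.  The sketch below is only an
   illustration of that distinction; the function, the piece names and
   the node names "A", "B" and "C" are hypothetical and were not part
   of the presentation.

```python
from enum import Enum


class Placement(Enum):
    CENTRALIZED = "centralized"
    DISTRIBUTED = "distributed"
    PARTIALLY_REPLICATED = "partially replicated"
    COMPLETELY_REPLICATED = "completely replicated"


def classify(layout, all_nodes):
    """Classify a directory layout.

    layout maps each directory piece to the set of nodes holding a copy;
    all_nodes is the set of every node in the network.
    """
    if not layout:
        raise ValueError("layout must contain at least one piece")
    nodes = set(all_nodes)
    holders = [set(h) for h in layout.values()]
    # Every piece present at every node: the whole directory is everywhere.
    if len(nodes) > 1 and all(h == nodes for h in holders):
        return Placement.COMPLETELY_REPLICATED
    # Some piece has more than one copy, but not everything is everywhere.
    if any(len(h) > 1 for h in holders):
        return Placement.PARTIALLY_REPLICATED
    # From here on each piece has exactly one copy.
    used = set().union(*holders)
    if len(used) == 1:
        return Placement.CENTRALIZED
    return Placement.DISTRIBUTED


network = {"A", "B", "C"}
print(classify({"files": {"A"}, "users": {"A"}}, network).value)
print(classify({"files": {"A"}, "users": {"B"}}, network).value)
```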

   The data management system will have to deal with possibly different
   hardware systems and even different local data management systems.
   One solution is to have a standard data management and data
   description language for transmission of requests and data in the
   network.

   The system will have to provide capabilities for file transfer,
   queries, remote batch, and for user communication via a mail box.
   The security of the data is maintained by checking user id, terminal
   authorization, process authorization and data authorization.
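
   The four checks can be read as a chain in which a request is
   refused at the first check that fails.  The sketch below assumes
   that ordering and uses made-up authorization tables; neither the
   table layout nor the names come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user_id: str
    terminal: str
    process: str
    data_set: str


# Hypothetical authorization tables, for illustration only.
KNOWN_USERS = {"u1"}
TERMINALS_FOR_USER = {"u1": {"t1"}}
PROCESSES_FOR_USER = {"u1": {"query"}}
DATA_FOR_PROCESS = {"query": {"inventory"}}


def authorize(req):
    """Return the name of the first failed check, or None if all pass."""
    checks = [
        ("user id", req.user_id in KNOWN_USERS),
        ("terminal", req.terminal in TERMINALS_FOR_USER.get(req.user_id, set())),
        ("process", req.process in PROCESSES_FOR_USER.get(req.user_id, set())),
        ("data", req.data_set in DATA_FOR_PROCESS.get(req.process, set())),
    ]
    for name, passed in checks:
        if not passed:
            return name
    return None


print(authorize(Request("u1", "t1", "query", "inventory")))
print(authorize(Request("u1", "t9", "query", "inventory")))
```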

   BOB BROWN, General Motors Research Lab., described the network of
   computers at the General Motors Research Center.  This network at
   present consists of an IBM 360/67, a 360/65, a 370/165, three 1800's
   and a Sigma 5.  All of these are primarily for graphics use except
   the 67 and the 165.  An example of how data passes through the
   network was given.  The styling department develops a design on an
   1800.  Data on this design is sent to the 67 for stress and shape
   analysis and the results returned to the 1800.  After a design is
   developed, it is sent to the 65-1800 combination for detailed
   analysis for production.  Many of the computers are running GM's own
   operating systems, and the network control consists of macros added
   to these operating systems.  Interfacing is done by providing
   specific conversion modules to be called when the specific
   conversion is required.  The 67 will eventually be replaced by a
   hierarchical multiprocessor based on the CDC Star-100.
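
   One way to picture the conversion-module arrangement is a table of
   routines keyed by the conversion required, consulted only when a
   transfer needs it.  The sketch below is an assumption about the
   general shape, not GM's implementation.  Character-code conversion
   is used purely as a stand-in, since the talk did not say which
   conversions were needed, and the EBCDIC code page (cp037) is an
   arbitrary choice.

```python
# Registry of conversion modules, keyed by (source format, target format).
CONVERTERS = {}


def converter(source, target):
    """Decorator that registers a conversion module for one format pair."""
    def register(func):
        CONVERTERS[(source, target)] = func
        return func
    return register


@converter("ebcdic", "ascii")
def ebcdic_to_ascii(data: bytes) -> bytes:
    # cp037 is one common EBCDIC code page; choosing it is an assumption.
    return data.decode("cp037").encode("ascii")


@converter("ascii", "ebcdic")
def ascii_to_ebcdic(data: bytes) -> bytes:
    return data.decode("ascii").encode("cp037")


def convert(data, source, target):
    """Call the specific conversion module only when one is required."""
    if source == target:
        return data
    try:
        module = CONVERTERS[(source, target)]
    except KeyError:
        raise LookupError(f"no conversion module for {source} -> {target}")
    return module(data)


encoded = convert(b"DESIGN 42", "ascii", "ebcdic")
print(convert(encoded, "ebcdic", "ascii"))
```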

   PHIL MESSING, MITRE, is setting up an experiment to test the
   practicability of interfacing a network standard data management
   language with local data management systems.  In this experiment, a
   user will make a request in the network language; the request will
   be transmitted to a node and translated into the language of that
   local node.  At present, three local systems have been selected for
   use: MADAM at MIT, LISTAR at Lincoln Labs, and NASIS at NASA/Ames.

   It is not expected that the common data language will be able to
   handle all possible requests that may be made.  The language should
   be able to handle the most common requests.  For the rest, some
   means of interaction may be set up to allow the transmission of more
   information to the target system than the common language allows or,
   as a last resort, a user can use the local target language.
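
   The translation step, together with the fall-back to the local
   language, might look roughly like the sketch below.  The request
   format, the two local command syntaxes and the names "system-a" and
   "system-b" are invented for illustration.  They do not represent
   MADAM, LISTAR or NASIS, and the intermediate "interaction" option is
   not modelled.

```python
class Untranslatable(Exception):
    """The common-language request has no form in the local language."""


# Hypothetical translators, one per local data management system.
def to_system_a(req):
    if req["op"] == "retrieve":
        return f"GET {req['file']} WHERE {req['field']}={req['value']}"
    raise Untranslatable(req["op"])


def to_system_b(req):
    if req["op"] == "retrieve":
        return f"FIND {req['field']} EQ {req['value']} IN {req['file']}"
    raise Untranslatable(req["op"])


TRANSLATORS = {"system-a": to_system_a, "system-b": to_system_b}


def prepare(node, request=None, local_text=None):
    """Return the command text to be run at the target node.

    A common-language request is translated for the node. A request the
    common language cannot express may instead be supplied directly in
    the node's own language as local_text.
    """
    if local_text is not None:
        return local_text
    return TRANSLATORS[node](request)


query = {"op": "retrieve", "file": "personnel", "field": "dept", "value": "42"}
for node in TRANSLATORS:
    print(node, "->", prepare(node, request=query))
```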

   At a later stage in the experiment, a user will input a query and
   the local host will determine where it is to be sent.  The query
   will then be transmitted, accepted by the target node, translated to
   the target node's local language, and processed.

   ERNIE FORMAN, MITRE, is developing a special, simple data management
   system specifically for the purpose of measuring and testing
   organizational techniques for control, directories, and files.  The
   question to be answered is whether each of these three functions
   should be centralized or distributed, and how and where.  The
   initial experimental arrangement is to have the control and
   directory centralized at the Rand node, and the files distributed at
   UCSB, Rand, and BBN.  The files are each split vertically and
   distributed; this organization was chosen because it presents the
   more difficult case.
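
   The sketch below illustrates one reading of "split vertically",
   namely that the fields of each record are divided among the sites
   and a key field is kept in every fragment so that records can be
   put back together.  That reading, the field names and the sample
   records are assumptions; only the three site names come from the
   talk.

```python
def split_vertically(records, key, field_groups):
    """Split records by field; every fragment keeps the key field."""
    fragments = {}
    for site, fields in field_groups.items():
        fragments[site] = [
            {key: rec[key], **{f: rec[f] for f in fields}} for rec in records
        ]
    return fragments


def reassemble(fragments, key):
    """Rebuild whole records by joining the fragments on the key field."""
    merged = {}
    for rows in fragments.values():
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical sample file.
records = [
    {"id": 1, "name": "pump", "site": "east", "count": 4},
    {"id": 2, "name": "valve", "site": "west", "count": 9},
]
groups = {"UCSB": ["name"], "Rand": ["site"], "BBN": ["count"]}

fragments = split_vertically(records, "id", groups)
print(fragments["Rand"])
print(reassemble(fragments, "id") == records)
```

   Answering a query that touches fields held at different sites
   requires a join across nodes, which is presumably why this layout
   was regarded as the harder case.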

   DICK WATSON, SRI, described some extensions of NIC (Network
   Information Center) that he would like to see, and that would involve
   network data management facilities.  The first would be the ability
   to process text from one text processor by another.  Second, it would
   eventually be desirable to distribute the NIC journals.  A first
   stage of this would be to have several NLS (oN-Line System)
   installations around the network, each with its own journal.  The
   problems with this first stage would be in coordination of
   numbering and in organization of the directory.  A second stage
   would be one in which the journal might reside, in part, on other
   than NLS systems.

   A third extension is to enable NLS to use the results of other
   cataloging or citation and bibliographic referencing systems as
   input to the NLS catalogs.  The fourth extension would be to enable
   other data management systems to generate data of a more general
   type that is usable by NLS.

Phase II

   The second phase of the meeting was a working meeting to organize
   the committee and set up an active working interest group.

   The following people presently form the committee.  They have shown
   active interest and are engaged in related activities:

      Douglas B. McKay        IBM Research (Chairman)
      Abhay Bhushan           MIT
      Ernie Forman            MITRE
      Dorothy Hopkin          University of Illinois
      Phil Messing            MITRE
      A.P. Mullery            IBM Research
      Erika Perez             MITRE
      A. Shoshani             SDC
      S. Taylor               MITRE
      Bob Thomas              BBN
      Frank Ulmer             NBS
      Dick Watson             SRI
      Dick Winter             CCA

   It would be very useful in follow-on meetings to have a
   representative from the Form Machine group, who would be an asset to
   the Data Management Committee.  Discussions on various uses of the
   Form Machine by a Network Data Management facility are bound to come
   up in later meetings.

   Discussion of network data management covered many aspects of the
   problem, including a general discussion of just what people want to
   be able to do with a network data facility.

   The following list, gleaned from the discussion, represents the
   possible stages of development:

   1.  Transmission Facility - The Network Data Control Facility (DCF)
       is able to route requests for files to the proper node.  The
       location and name must be specified.

   2.  Location Catalog - The DCF now has available to it a catalog which
       contains the locations of the data sets to be used in the
       network.  Requests for files may be made by name only, the
       location being determined by the DCF.

   3.  Description Catalog - Descriptions, as well as data sets, can be
       transmitted in the network.  It is assumed these descriptions
       exist as files at local nodes.  A target node can make use of the
       description to properly convert the data set to its own format.

   4.  Data Conversion Modules - Data descriptions are received by this
       module of the DCF.  Based on the descriptions, conversion
       programs are called or generated which will transform a file to
       the form required by the target node.

   5.  File Access Command Interface - This module is able to convert a
       request for a file from a network data language to the local
       language of the node at which the file is located.

   6.  Data Access - This module, an extension of the network data
       language and the interface modules, allows access to pieces of
       data as specified in the data language, and generates the proper
       local access commands.

   7.  Data Management Interface - This is the final stage, at which
       general types of commands can be interfaced to local data
       management systems, providing general interaction among
       different data management systems at different nodes.
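
   The difference between stages 1 and 2 can be sketched as follows:
   at stage 1 the requester must supply both the name and the
   location, while at stage 2 the DCF looks the location up in its
   catalog.  The class, its method and the node and file names are
   hypothetical; the discussion did not specify any interface.

```python
class DataControlFacility:
    """Toy model of the DCF at stages 1 and 2 (illustrative only)."""

    def __init__(self, nodes, catalog=None):
        self.nodes = nodes            # node name -> {file name: contents}
        self.catalog = catalog or {}  # file name -> node name (stage 2)

    def fetch(self, name, location=None):
        if location is None:
            # Stage 2: the location is determined from the catalog.
            try:
                location = self.catalog[name]
            except KeyError:
                raise LookupError(f"{name!r} is not in the location catalog")
        # Stage 1: route the request to the named node.
        return self.nodes[location][name]


dcf = DataControlFacility(
    nodes={"node-1": {"parts": ["bolt"]}, "node-2": {"orders": [17]}},
    catalog={"parts": "node-1", "orders": "node-2"},
)
print(dcf.fetch("parts", location="node-1"))  # stage 1: name and location
print(dcf.fetch("orders"))                    # stage 2: name only
```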

   It was generally agreed that the ability to access all data and
   different data bases is a goal which is worth achieving.  There was
   discussion of the best way to achieve this goal and of the actual
   implementation techniques that could be used.  It was agreed that
   the data base interfacing problem should be studied in more detail,
   and several people were willing to write reports on a representative
   problem when they have more results from their work.

   There was also a discussion concerning the Datalanguage and whether
   it is suitable or not.  One fact should be made clear: the results of
   this committee should not succeed or fail on the outcome of the
   Datalanguage question.  The initial proposal recommends the
   Datalanguage as the de facto standard that will be adopted in the
   network because of its support and availability.  The group should
   be able to recommend changes when changes are shown to be necessary.

   The Datalanguage discussion did point out the need for having data
   set descriptions cataloged and referable by name.  D. Winter said
   that he would look into this problem.

   The proposal (RFC 304) for a network data facility should be read
   again and discussed in more detail at our next meeting.  The proposal
   says we can implement and achieve a stage 3 capability with what we
   know today.  It would be a useful stepping stone to a stage 5 and
   stage 6 capability.

   Related to the stages of development described above, the following
   studies are now in progress and will help us answer pertinent
   questions.

   A. Bhushan is studying a stage 1 type of network operation in which
   local catalogs are extended to contain entries for network data sets
   of interest locally, to enable automatic calls to foreign data sets.

   E. Perez will be studying the network catalog structure in more
   detail and will publish an RFC on her work.

   Many questions were raised about the use of the Datalanguage as a
   network standard.  Two people have volunteered to write up their
   investigations of this important question.

   Frank Ulmer will be looking at various data management systems to see
   if their data structures are describable in terms of the
   Datalanguage.  In addition, the NIC represents one important network
   data base that could be distributed through the network.  Dick Watson
   will try to describe the NLS Journal structure in terms of the
   Datalanguage.

   Anyone in the ARPA network, or outside it but within hearing
   distance of this memo, who knows of any real or potential
   applications of data sharing in a network is asked to describe them
   in an RFC or in a letter to someone associated with the Data
   Management Committee.

Appendix -- Meeting Attendees

   William Benedict     USAFETAC Bldg. 159 Navy Yard Annex Wash. D.C.

   Roy Beveridge        MITRE

   Abhay Bhushan        MIT, Project Mac, Cambridge, Mass.

   Bob Brown            General Motors Research Lab.

   Elizabeth Fong       National Bureau of Standards, Wash. D.C.

   Ernie Forman         MITRE

   Glen Grazier         USAFETAC Bldg. 159 Navy Yard Annex Wash. D.C.

   Dorothy Hopkin       U. of Ill., Adv. Comp. Bldg., Urbana, Ill.

   Hector S. Maynez     National Library of Medicine

   Doug B. McKay        IBM Research Center

   Phil Messing         MITRE

   Al Mullery           IBM Research Center

   Erika Perez          MITRE

   John Senior          Univ. of Penn. and National Board of Medical
                        Examiners, Phila. PA.

   Arie Shoshani        SDC, 2500 Colorado Ave., Santa Monica, Cal.

   Martin Snyderman     Smithsonian Science Info. Exch., Wash. D.C.

   Eric Swarthe         National Bureau of Standards, Wash. D.C.

   Suzanne Taylor       MITRE

   Bob Thomas           BBN

   Frank Ulmer          National Bureau of Standards, Wash. D.C.

   Dick Watson          SRI

   Richard Winter       Computer Corporation of America







        [This RFC was put into machine readable form for entry]
     [into the online RFC archives by Hélène Morin, Viagénie 10/99]