CAUSE/EFFECT

Copyright 1997 CAUSE. From CAUSE/EFFECT Volume 20, Number 2, Summer 1997, pp. 41-47. Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, the CAUSE copyright and its date appear, and notice is given that copying is by permission of CAUSE, the association for managing and using information resources in higher education. To disseminate otherwise, or to republish, requires written permission. For further information, contact Julia Rudy at CAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301 USA; 303-939-0308; e-mail: jrudy@cause.org

Taming the Internet for Electronic Data Interchange via a Secure Server

by David H. Stones

In the fall of 1995, the University of Texas at Austin placed in service a dedicated UNIX machine to provide educational institutions a simple, convenient, secure mechanism for exchanging formatted educational documents -- primarily transcripts -- via multiple Internet protocols at no cost to the users. This article is intended to showcase an interesting technological project, increase EDI awareness and use of the server described, dissuade others from building their own such servers, and reinforce the concept of Internet security for transcript exchange.

Student information systems staff at the University of Texas at Austin have invested heavily since the early 1980s in local and national efforts on standards for electronic data interchange (EDI).1 In September of 1994, discussions turned to some obstacles that had arisen with regard to using the Internet for electronic data interchange of student transcripts. A plan of attack was carefully developed, calling for providing a trusted third-party service to those wishing to use the Internet for EDI. The plan was endorsed by the Technology Committee of the Texas Association of Collegiate Registrars and Admissions Officers (TACRAO).

To carry out the plan, the UT Austin registrar's office purchased a computer, and a team of systems analysts began software development in an unfamiliar arena, utilizing tools that were also new to us. Considerable programming was mixed with integration of existing software and products, which we found to be largely free of charge for nonprofit public institutions. Substantial assistance came from the Internet community and other schools. The national educational EDI group was very interested and supportive of our project.

In less than a year, the UT Austin Internet EDI Server was tested and placed in service. Since then, 140+ institutions have registered with the server, more value has been added, and usage has increased steadily, with no major setbacks. Through cooperative efforts, we were able to integrate many separate pieces to deliver the planned service. Usage of the server is now in the plans or actions of many states, colleges, universities, and school districts. Higher education has pioneered safe and effective EDI of student transcripts on the Internet.

SPEEDE/ExPRESS origins

During the '80s, several successful projects for electronic transcript exchange were implemented. These were confined to states or communities, and they used different proprietary formats. Beginning in 1988, the American Association of Collegiate Registrars and Admissions Officers (AACRAO) and the National Center for Education Statistics (NCES) sponsored an effort to derive a single national format for electronic transcript exchange. The college format (SPEEDE, for Standardization of Postsecondary Education Electronic Data Exchange) and the pre-K-12 format (ExPRESS, for EXchanging Permanent Records Electronically for Students and Schools) share a common format that has been approved through the American National Standards Institute as an ANSI ASC X12 standard for EDI (namely Transaction Set 130, or TS130).

Developers were primarily from the U.S., but Canadian participation in the development group made the format more robust. TS130 provides codes and structures allowing schools to carry whatever personal and academic items they choose to include on their transcripts, or to send similar information as part of an electronic permanent record upon movement or transfer of a student to a different school or school district. Transaction sets have also been approved for other formats associated with the delivery of transcripts, as well as representing the information in the application for admission, course inventory, and verification of enrollment. Work continues on other educational data standards.

SPEEDE/ExPRESS advantages

Use of EDI for transcripts has important advantages. Recipient institutions have tremendous opportunities to automate or reengineer their processes and save considerable resources. Electronic transcripts may be instantly logged into the prospective student systems, allowing admissions staff to inform applicants that transcripts have been received, an event that can take weeks for paper documents arriving in the mail at the application deadline. GPA calculations and evaluation of transfer courses into the receiving institution's course numbering system can be accomplished programmatically with no required data entry, potentially eliminating hundreds of hours of repetitive manual work.

The sending institution can realize financial savings by eliminating the paper, stuffing, and postage costs associated with delivering paper mail. End-of-semester peak processing is also moved from a manual to an automated process, reducing the load during peak periods. The greatest benefit to the sender is in providing better service to the student, enabling information to be quickly delivered to and processed by the recipient.

With so much to be gained through the use of EDI, one would expect rapid acceptance and deployment of this new approach for exchanging transcripts. However, while the number of institutions exchanging EDI transcripts or working toward that result has been very gratifying, the movement has been neither pervasive nor fast.

Why is this the case? For schools with complete student records stored in computer databases, there are three steps for sending transcripts electronically: extraction of data from their files, translation into the ANSI X12 format, and delivery of the file. The wide range of options and difficulties associated with the actual delivery process has been a major impediment to the widespread use of EDI in higher education.

Delivery options: VANs vs. Internet

Early state-oriented electronic transcript exchanges employed value-added networks (VANs) -- although sometimes indirectly, as the backbones for state networks -- for delivery of transcripts. These worked reasonably well but with significant cost per transcript.

At the first annual SPEEDE conference in 1990, representatives of several institutions urged the harnessing of BITNET for delivery of transcripts, despite doubts by the standard developers and promoters. Several institutions later conducted limited Internet pilots and were quite encouraged by the results.

Wide-open Internet usage continued to be distrusted on the basis of operational, dependability, and security concerns. The existence of many different flavors and protocols on the Internet was also confusing. Still, the number of institutions already connected and using the Internet for other purposes was increasing steadily, and familiarity and financial objectives kept the desire for EDI on the Internet high. It was assumed that industry would soon present a satisfactory Internet solution.

In 1993, in an effort to better understand the issues, the SPEEDE Committee and ExPRESS Technical Advisory Group met with representatives of the Internet and the Internet Engineering Task Force. The IETF is the structure that approves proposals for Internet protocols and through which requests for comment are followed by formation of discussion groups, consensus agreement on standards, and testing of the standards. This encounter was very informative, and the IETF accepted the charge of addressing EDI over the Internet.

Basically, both of the primary Internet protocols, file transfer protocol (FTP) and electronic mail with MIME, are workable for EDI from an operations standpoint. Both, however, have scalability problems, and both have dedicated followers who are not about to discard them to switch to the other protocol.

The issues of Internet security and privacy are also very important. Encryption offered promise, but management of encryption keys for thousands of schools at each of thousands of locations remained a dilemma.2 These issues are discussed in greater detail below.

Disadvantages of VANs

While VANs have been effective for EDI, at least four problems made us look for a viable Internet solution.

First, delivery via VAN was costing around one dollar per transcript, based on Texas Electronic Transcript Network (TXETN) experience. While recipients considered it "worth it," there was interest in avoiding this cost. For high schools and school districts, solutions with incremental costs were unattractive.

Second, there are many VANs, and both exchange of documents and access to management information about them are far easier if all trading partners are on the same VAN. Which one should we select?

Third, scalability is a real problem. Establishing trading partner relationships with the VAN provider for each pair of the tens of thousands of schools was a frightening thought, with close to an n-squared number of relationships for n institutions. The problem increased in the case of different VANs, where interconnect agreements had to be set up on both VANs.

Finally, states and schools had already spent millions on Internet functionality and connectivity. Using it to save money would be highly desirable.

Disadvantages of Internet

Even with the Internet, we had a problem similar to that of multiple VANs; namely, sender and recipient could not make independent selections. A school electing to deliver via FTP might be unable to exchange with a school that had built systems based on MIME attachments to e-mail. Those supporting encryption could not deal effectively with those wishing no part of it. How could we support all options, yet allow schools to make selections based only on the best fit with their own computer environments?

Scalability remained an issue with Internet delivery, as well. The notion of hundreds or thousands of schools logging in at least daily to each of hundreds of other systems to drop off or obtain files was unacceptable, so a single delivery by each school would be an objective. Distribution of passwords in an n-squared pattern compromises the secrecy of the passwords -- how is scalability solved?

Finally, trading partner relationships needed improvement. The need for advance set-up for every possible combination was too expensive. We needed a system in which each party could register centrally, then rely on the central registry for information on potential partners.
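The scaling argument can be made concrete with a little arithmetic. The sketch below compares the number of pairwise relationships among n institutions with the number of registrations a central registry would need. The function names and sample counts are illustrative assumptions, not figures from the project.

```python
def pairwise_relationships(n: int) -> int:
    """Distinct sender/recipient pairs among n institutions: n(n-1)/2."""
    return n * (n - 1) // 2


def central_registrations(n: int) -> int:
    """With a central registry, each institution registers exactly once."""
    return n


if __name__ == "__main__":
    # Sample sizes chosen only for illustration.
    for n in (10, 140, 10_000):
        print(n, pairwise_relationships(n), central_registrations(n))
```

For 140 institutions, that is 9,730 possible pairs against 140 registrations.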

Needed: A simple Internet solution

With all the available options, nobody felt qualified to make the selections with any degree of confidence. Many registrars willing to participate were asking for ABC instructions on what they needed to do, a difficult task in the absence of a single acceptable delivery vehicle. The system had to be easy to use, without requiring training in new protocols. The delivery process needed to be at least as simple as with VANs. Simplicity engenders trust, while complexity makes defense of the security of our solution much more difficult for the lay person. A simple alternative to the delivery dilemma would allow institutions to concentrate their resources on the task of producing ANSI ASC X12-compliant files, rather than worrying about how they are going to deliver them.

With news of Internet break-ins commonplace, security had to be ironclad. Authenticity of documents had to be assured. Trust had to exist that documents had not been modified in transit. Students must be assured that their documents have had their confidentiality maintained, as required by both the Family Educational Rights and Privacy Act (FERPA) and general good information practice.

Description and function of the UT server

The server itself is a DEC UNIX machine, dedicated to use as the UT Internet EDI Server. Externally obtained software is used to handle Internet e-mail, FTP for file transmission, and PGP (Pretty Good Privacy) encryption. Internal operations are handled via software written in Perl.

The server has few components. A file contains identifiers and delivery parameters for those institutions registered with (and thus capable of doing business with) the server. Another part is the software itself, which for all practical purposes runs continuously. Four components can be thought of as I/O ports: for incoming e-mail, files coming in via FTP, outgoing e-mail, and files going out via FTP. Add the communication and notification function, and the picture is complete.

The only input accepted by the server is files, and these must come via either MIME attachments to Internet e-mail or as file drop-offs via Internet FTP into an area dedicated for use by that institution. These files must be ANSI ASC X12 compliant, which means they must conform to the data standards with a standardized envelope for delivery instructions.

The file delivered to the UT server may contain one or more envelopes, intended for multiple recipients and including different types of documents. The file is always delivered by the sender to the address of the server, which takes over from there. Regardless of whether e-mail attachment or FTP is used, the file may be encrypted using the PGP encryption algorithm. This is done using the public key of the server, and may also include the digital signature of the sender.

The first processing step is recognition of receipt of a file. Second, the file is decrypted (if encrypted) using the private key of the server, and the sender's digital signature, if present, is verified using the public key of the sending institution. Third, the file is parsed into envelopes intended for different destinations. Fourth, the registrant table is checked for presence of an entry for the codes identifying the sender and the recipient. Failure here generates messages to the server administrators and the sender, and halts subsequent processing. Fifth, the envelopes are encrypted and delivered according to the delivery parameters selected by the recipient. Finally, notification e-mail messages are sent to administrative officials of both the sending and the receiving institutions, notifying them of the nature of the file received, the success in delivering it, and/or the need to pick it up or process it. The delivery protocol and parameters may be changed by a school with no effect on trading partners.
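The parsing and registrant-check steps can be sketched as follows. This is a minimal illustration in Python, not the server's Perl code. It assumes "*" as the element separator and "~" as the segment terminator, reads the sender and receiver IDs from the ISA06 and ISA08 elements of each X12 interchange header, and uses hypothetical names (Envelope, split_envelopes, route). It also assumes that an unregistered party causes only that envelope to be rejected, which the article does not specify.

```python
from dataclasses import dataclass

ELEMENT_SEP = "*"   # assumed element separator
SEGMENT_END = "~"   # assumed segment terminator


@dataclass
class Envelope:
    sender: str
    recipient: str
    segments: list[str]


def split_envelopes(text: str) -> list[Envelope]:
    """Split a file into interchange envelopes (ISA ... IEA)."""
    envelopes = []
    current = None
    for raw in text.split(SEGMENT_END):
        segment = raw.strip()
        if not segment:
            continue
        fields = segment.split(ELEMENT_SEP)
        if fields[0] == "ISA":
            if len(fields) < 9:
                raise ValueError("ISA segment is too short")
            # ISA06 is the interchange sender ID, ISA08 the receiver ID.
            current = Envelope(fields[6].strip(), fields[8].strip(), [])
        if current is None:
            raise ValueError("segment found outside an ISA/IEA envelope")
        current.segments.append(segment)
        if fields[0] == "IEA":
            envelopes.append(current)
            current = None
    if current is not None:
        raise ValueError("envelope not closed by IEA")
    return envelopes


def route(text: str, registrants: dict[str, str]):
    """Group envelopes by recipient; set aside any with an unregistered party.

    `registrants` maps an institution code to its delivery medium
    (a hypothetical, simplified stand-in for the registrant table).
    """
    deliveries: dict[str, list[Envelope]] = {}
    rejected: list[Envelope] = []
    for env in split_envelopes(text):
        if env.sender in registrants and env.recipient in registrants:
            deliveries.setdefault(env.recipient, []).append(env)
        else:
            rejected.append(env)
    return deliveries, rejected
```

Because routing depends only on the recipient's entry in the table, a school can change its own delivery medium without its trading partners changing anything.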

The server runs with no conscience, and it will not generally attempt to impose morals on those who use it. Aside from asking for compliance with ANSI ASC X12 standards, it will not get involved in the question: What should actually be sent within the (fairly permissive and flexible) framework of the standard? We anticipate only limited syntax checking. We have a few constraints, such as MIME or FTP, but only to render the system secure and our task as simple as possible; simplifying assumptions are the only kind to have. Rules of usage are included in a Frequently Asked Questions (FAQ) document.

Security

From the beginning, the developers of the data standards for educational EDI knew they needed to plan for delivery standards more "open" than those used in the business community and on VANs. With tens of thousands of potential trading partners, we would have to use networks and protocols provided by their states or provinces. Because of this, the initial development of the TS130 format for the educational transcript was accompanied by the TS131 for acknowledgment of an individual transcript. The purpose was to provide both authentication of the sender (a guard against transcript mills, which are a problem with paper documents) and evidence that the document was not modified en route.

Here's how it works. For every transcript received, the recipient produces a TS131 acknowledgment and sends it back to the certified acknowledgment address of the institution identified as the sender within the transcript. The acknowledgment includes identification of the student, plus a few computed values (total classes and degrees) and an academic summary with academic grade points. The recipient of the acknowledgment reconciles it with its file of transcripts sent, and notifies the TS131 sender if the initial TS130 was never sent or if the totals differ from those in the TS131. This exchange has the added benefit of providing quality assurance unknown with paper transcripts, as one has a record of receipt and can follow up on transcripts that are not promptly acknowledged. This is part of the SPEEDE/ExPRESS protocol, and has nothing to do with the UT server, except that the TS131 is easy to return to the sender via the server, and the authentication and modification threats are already covered.
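The reconciliation step on the sender's side might look like the sketch below. The record layout and field names (TranscriptSummary, student_id, and so on) are hypothetical and do not correspond to actual TS131 data elements. The sketch only illustrates the comparison described above: an acknowledgment for a transcript never sent suggests forgery, and differing totals suggest modification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptSummary:
    student_id: str
    total_classes: int
    total_degrees: int
    grade_points: float  # compared exactly here; a real system might use fixed decimals


def reconcile(sent_log: dict[str, TranscriptSummary], ack: TranscriptSummary) -> str:
    """Compare one acknowledgment with the sender's own record of transcripts sent."""
    sent = sent_log.get(ack.student_id)
    if sent is None:
        return "not-sent"   # we never sent this transcript: notify the TS131 sender
    if sent != ack:
        return "mismatch"   # totals differ: the transcript may have been altered
    return "ok"
```

Transcripts that remain in the sent log without any acknowledgment after a reasonable interval are the ones to follow up on.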

The Internet break-ins were not much cause for alarm, unless a villain were to break into a site, assume an identity, learn all the right passwords and protocols, manufacture an evil student record, produce a flat file, translate it into TS130 format, and send it to a destination where it would benefit the student of record -- an unlikely scenario. The area not covered for the open Internet was that of privacy. What could protect the file against unauthorized viewing -- casual or malicious -- as it passed through the airwaves or waited at a gateway?

While public carriers assume responsibility for privacy within their own domains, there appears to be no such liability at points where regional Internets connect to one another. FERPA requires that registrars take reasonable measures to ensure the confidentiality of student academic records, so some action was needed.

Our solution is PGP encryption. It guarantees privacy from all except those with the necessary private keys, meeting both the spirit and the letter of the law. It is also available at no charge for public and nonprofit institutions. Versions are available for many platforms, including DOS, Windows, Macintosh, and UNIX, although not yet for MVS mainframes. An added benefit comes via the signature function of PGP, because it adds a separate authentication assurance, as only the institution with the proper private key could have signed the document. The signature also includes a derived number which signals modification.
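The "derived number" is a message digest: a short value computed from the document that changes if the document changes. The sketch below shows only that idea, using SHA-256 from the Python standard library as a stand-in. It is not PGP, which additionally signs the digest with the sender's private key, and the function names are illustrative.

```python
import hashlib


def digest(document: bytes) -> str:
    """Return a hex digest of the document (SHA-256 used as a stand-in)."""
    return hashlib.sha256(document).hexdigest()


def is_unmodified(document: bytes, expected_digest: str) -> bool:
    """True if the document still matches the digest computed by the sender."""
    return digest(document) == expected_digest
```

A digest alone detects accidental or careless modification. Signing it is what ties the value to the sending institution.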

Thus, we are protected by multiple layers of security in the area of authentication. Even if one layer fails (or is eventually compromised), we are still protected. By use of PGP, we also extend the privacy and modification protection beyond the actual Internet transmission phase, and cover from the time the file is encrypted at one end until it is processed at the other. This is important because exposure of such files is far greater at either end than it ever is during transmission.

The server has other security features worthy of note. Log-ons to the UNIX box are strictly limited, as it is not used for other activities. FTP log-ons are password protected, limited to specific directories, and denied all except "write" capability. Each institution has a pre-established relationship with the server and is a known entity, adding yet another layer to the authenticity assurance. Functional acknowledgment of the entire file, or notification similar to TS997, is possible. E-mail notices about files received or sent make it difficult for a single suspicious party to avoid detection at a participating institution. The system logs activities, and it notifies server-support analysts of invalid log-on attempts. Exposure is limited by the fact that the server is used for nothing else, so no other development activity is allowed.

While many were interested in using the server, as hoped, some were skeptical of committing to that course without contractual assurances that UT Austin would continue to operate the server indefinitely with no administrative charge. The capacity of the UNIX box has not been taxed to date, but we could understand the concern. Substantial relief has been provided by the University System of Georgia, which has signed a contract with UT Austin to establish and maintain a hot backup site. This will protect users in the event of a natural disaster in Texas and reassure against the unlikely event that UT Austin might decide to get out of the Internet EDI Server business.

Institutional usage

To be technically able to use the free services of the UT server, an educational institution needs just the following: (1) ability to produce EDI documents consistent with ANSI ASC X12 standards; (2) ability to send and receive files over the Internet via FTP or e-mail with MIME; (3) ability to execute PGP; (4) e-mail for inquiries and notification; and (5) agreement to rules of usage.3

An institution wishing to send or receive via the server must register with the server, providing receipt medium, delivery address and parameters, and notification address. It should persuade intended trading partners to register as well. If the institution uses encryption, it sends a copy of its PGP public key to the server via e-mail and receives the server's key the same way.
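A registration record of this kind could be modeled as shown below. This is a sketch under stated assumptions: the class names, field names, and the "ftp"/"email" and "test"/"production" values are hypothetical, chosen to mirror the items listed above and in the published Registrant Table, and do not reflect the server's actual file layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registrant:
    code: str                      # identifier used in the delivery envelope
    medium: str                    # receipt medium: "ftp" or "email"
    delivery_address: str
    notification_address: str
    pgp_public_key: Optional[str] = None
    status: str = "test"           # "test" or "production"

    def __post_init__(self) -> None:
        if self.medium not in ("ftp", "email"):
            raise ValueError(f"unsupported medium: {self.medium}")
        if self.status not in ("test", "production"):
            raise ValueError(f"unknown status: {self.status}")

    @property
    def uses_encryption(self) -> bool:
        return self.pgp_public_key is not None


class Registry:
    """Central table of registrants, keyed by institution code."""

    def __init__(self) -> None:
        self._by_code: dict[str, Registrant] = {}

    def register(self, registrant: Registrant) -> None:
        # Re-registering replaces the earlier entry, so a school can
        # change its own delivery parameters at any time.
        self._by_code[registrant.code] = registrant

    def lookup(self, code: str) -> Optional[Registrant]:
        return self._by_code.get(code)
```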

Once an institution is registered, the delivery process is fairly simple. The sender prepares ANSI ASC X12-compliant files in the appropriate delivery envelopes, encrypts and signs them, and sends them to the waiting ports on the server. The sender and recipient are notified by e-mail that the delivery has been made. The recipient then decrypts the file and passes it to acknowledgment, translation, and processing routines.

Approximately 140 entities from twenty-two states had registered with the server by March of 1997, including major EDI software providers wishing to test their products. Maryland, Florida, Texas, and the American Medical College Application Service (AMCAS) have been linked. Considerable volume has been seen, especially in Texas and Iowa, and in the feeds to AMCAS. Statewide movements in five states increased server registration and usage. Thousands of deliveries are made each month at no cost.

Especially noteworthy is the use by Austin Independent School District for delivery of high school transcripts to Alamo Community College District, Southern Methodist University, Southwest Texas State University, Texas A&M, UT Dallas, and UT Austin. Richardson and Plano Independent School Districts and the San Antonio Regional Service Center will be using it soon for high school transcript deliveries.

Early projects in Texas (and Florida) using proprietary formats helped define the national format. The Florida Department of Education will convert and deliver as needed, and Texas schools are now well on their way to switching to the SPEEDE format. The 1996 Texas Performance Review, initiated by the State Comptroller, recommends use of EDI and the UT server.

Consider the University of Miami, a private school excluded from Florida Information Resource Network (FIRN) usage in the earlier Florida project. We (and they) feel that UM can send a transcript to the server, which will send it back to the Miami-Dade Community College address at the Florida Department of Education, which will then send it on to Miami-Dade via FIRN. By taking a free ride of 2,000 miles, the transcript can now travel the two miles to Miami-Dade Community College.

Value-added services

In the initial design, the server was simply to provide a plug-compatible alternative to existing delivery mechanisms. Since it was an ad hoc solution, though, it made sense to tailor it more to the specific business function and usage by colleges. Improvements include:

Changes under consideration (which have not yet been scheduled on a firm timeline) include purchase of a separate development and testing machine, with only compiled production code running on the production server, and, in time, a possible rewrite in C++ or a similar language if more efficiency is needed. Routine transfer of the operational files needed to support the backup site offered by the University System of Georgia will soon take place on a more regular basis. Better management reports on usage of the server are evolving over time. We could use better automated processing of e-mail deliveries returned by mail systems, although most schools elect to use FTP. The system structure supports possible eventual transfer of server registration and testing to AACRAO or some other entity.

Final thoughts

The UT Austin Internet EDI Server experiment to provide a safe and simple alternative for doing EDI on the Internet has become a satisfactory production service. We are glad to have embarked on this project, and we see it persisting. We were most impressed by and are grateful for the cooperation of individuals associated with EDI, education, and the Internet.

We did not anticipate the need for value-added services, but they make good sense now. They underscore the fact that the business case is more important than the technical details of EDI in developing a successful exchange between institutions. If the right people become involved in planning, implementing, and testing, the technical aspects are manageable.

We learned that while outward deliveries via Internet e-mail are quickly executed, they lack the confirmation of success offered by FTP. FTP delivery attempts may be repeated at intervals, with certain status information returned to the sender, whereas e-mail failures become known over a longer period of time, and the notification varies with the mail service and postmaster at the institution.
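The retry behavior described for FTP can be sketched as a small loop that repeats a delivery attempt at intervals and records a status line for each try. The function name, parameters, and status wording are assumptions for illustration. The `attempt` callable stands in for whatever actually performs the transfer.

```python
import time
from typing import Callable


def deliver_with_retry(
    attempt: Callable[[], bool],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Call `attempt` until it succeeds or attempts run out; return status lines."""
    statuses = []
    for n in range(1, max_attempts + 1):
        try:
            ok = attempt()
        except OSError as exc:
            # Network-level failures are recorded and treated as a failed try.
            ok = False
            statuses.append(f"attempt {n}: error ({exc})")
        else:
            statuses.append(f"attempt {n}: {'delivered' if ok else 'failed'}")
        if ok:
            break
        if n < max_attempts:
            sleep(wait_seconds)
    return statuses
```

The returned status lines are the kind of information that can be reported back to the sender. E-mail delivery offers no equivalent, because a failure may surface only later as a bounced message.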

We were interested that several state organizations liked the server model so much that they wanted to build their own just like it. We hope that the backup site in Georgia will diminish this movement, as each additional site increases the complexity of the nationwide (and beyond) network. The hassle of maintaining codes and registrants in additional sites increases exponentially with the number of sites. In some cases, however, such as Florida and Maryland, state servers of a somewhat different nature can add different value for their users, serve as agents for institutions, and deal directly with the UT Austin server.

The distrust of a free service surprised us. Despite some generous offers to share the cost, we decided that the simplest solution would be for The University of Texas at Austin to bear all costs of the server. Savings on VAN charges for the Texas ETN (around $8,000 per year for UT Austin) will quickly offset out-of-pocket expenses. If maintenance of the registrant table were to become a burden, that would be an indication of an increased level of participation in SPEEDE/ExPRESS, which is good. We know that the SPEEDE/ExPRESS project and/or AACRAO will offer assistance if expansion to other transaction sets creates administrative burdens. We have received offers of support from several sources.

There is no charge for the service, partially because that eliminates the burden of financial contracts and paperwork, and partially because charging might restrict our freedom to add enhancements and make other changes deemed necessary. UT Austin wishes to simplify procedures and eliminate transmission costs in hopes of helping more institutions to join the SPEEDE/ExPRESS movement, thus increasing the number of documents it is able to receive electronically.

For further reading

Morley, Robert. "Killing the Electronic Messenger." CAUSE/EFFECT, Spring 1996.

Palmer, B.H., and P. B. Wei. "SPEEDE Made Easy." College & University, Fall 1993.

Stones, D.H. "On the Strategic Nature of SPEEDE/ExPRESS, Scalability, and Applicability of EDI in the Workplace." SACRAO Journal, Volume 8.

Author acknowledgments

This project would not have been successful without the willing participation and assistance of many individuals, and I must attempt to recognize some of them. In alphabetical order: AACRAO SPEEDE Committee; Bruce Alexander, University of Washington; Betsy Bainbridge, AACRAO; Bill Bard, UT Austin; Dave Crocker, IETF; LaNell Day, Alamo Community College District; Cindy Dayton, University of Iowa; Rich Everman, University of California Irvine; Barbara Hewitt, Southwest Texas State University; Rick Jennings, Systems & Computer Technology (SCT) Corporation; Jerry McGauhey, UT Health Science Center in Houston; Don Nash, UT Austin; Les Pennington, University of Washington; Ted Pfeifer, UT Austin; Mike Read, AISD and ExPRESS; Bill Ruiz, University of Maryland System; and Tom Scott, University of Wisconsin-Madison.

The lion's share of the credit should go to my own staff for their invaluable contributions to the project: Cecily Allmon, Lisa Barden, Kay Coonrod, Jean McArthur, Wally Reeves, Shelby Stanfield, and Tom Yu. The executive officers at UT Austin blessed our efforts and allowed us to place the server in production. And, of course, all of this would have been meaningless without the registrations and use by colleges, universities, and school districts.


Sidebars:

Capabilities of the UT Austin Server

Information Sources

A Frequently Asked Questions (FAQ) document explains much about the server, including registration instructions. It is available at the UT Austin Server Web site (http://www.utexas.edu/student/giac/speede/index.html). To receive the FAQ by e-mail, send a request to nrdhs@utxdp.dp.utexas.edu.

The UT Austin Internet EDI Server also maintains and publicizes a Registrant Table to show (1) registered schools capable of sending or receiving, (2) codes to identify recipients to the server, (3) an indicator of whether the school uses encryption, and (4) production or test status. This information is prominently displayed on the server Web pages and is updated weekly. A separate report showing just changes to registrant information within the last month is also posted weekly.

The best source for information about the SPEEDE format and related postsecondary educational EDI projects is the SPEEDE Office at AACRAO. It maintains a Web page (http://www.aacrao.com/technology/edi.html) with pointers to a number of other vital Web sites. The phone number of the AACRAO SPEEDE Office is 202-293-7383.


Endnotes:

1 Electronic data interchange enables computers of different types to send and receive information directly between organizations that have established a trading partner relationship.

2 Encryption is the process of changing a digital message so that it can be read only by intended parties. Encryption schemes include using a private key or public key for encrypting and/or decrypting messages.

3 Both MIME and PGP may generally be obtained at no cost by educational institutions. For details, see the UT Austin Server FAQ (http://www.utexas.edu/student/giac/speede/index.html).

David Stones (nrdhs@utxdp.dp.utexas.edu) is Database Coordinator, Division of Student Affairs, at the University of Texas at Austin, where he has managed the student information systems since 1979. He is a developer of the SPEEDE national standard formats for electronic educational documents and has received the distinguished service award from AACRAO, the American Association of Collegiate Registrars and Admissions Officers.

