English Abstract |
The purpose of this project is to build an “Environmental Information Web Archive System” that copies and archives old website data. The system preserves valuable reference material and information on major environmental events, and it expands the pool of high-quality environmental information that has been collected and integrated. It also avoids the information security problems that arise when websites can no longer be updated or are poorly maintained.
The scope of this project covers the analysis, design, development, and deployment of the “Environmental Information Web Archive System.” The main approach is to collect websites using a website copier’s web-crawling mode and then back up the corresponding old data. At the same time, metadata is created and full-text indices are built so that general users can browse and search previous versions of websites. The system was analyzed and designed on the basis of a requirements evaluation and ideas drawn from international and domestic web archive systems, and it was developed as a value-added extension of the core archive systems of the National Central Library and National Taiwan University. To meet the needs of this project, the system uses the Environmental Protection Administration’s IDCs in northern and central Taiwan, and thus collects web information in a decentralized manner.
To establish a basis for future web archiving systems and processes, we referred to the National Central Library’s archive system and designed a set of policies: the “Environmental Protection Administration Executive Yuan Environmental Information Web Archive Policies.” These policies specify the types of websites that the Environmental Information Web Archive System will archive, the collection methods and principles, and the relevant licensing terms. The policies also define the “Environmental Information Web Archive Process” and the “Web Archive Application Form” to facilitate future archival work.
The contract requires archiving at least 80 websites by the end of the current year. The Environmental Information Web Archive System has already archived 126 websites and collected approximately 1.08 million files, and this content has been used to build a website devoted to energy conservation and carbon reduction. However, because of the inherent limitations of web archiving technology, not all web page content can be saved completely and without errors. This project therefore analyzed and categorized these limitations to serve as a reference for future website selection and development.
With the rapid development of the Internet, web archiving has become an important part of digital archiving. This project established a mechanism for archiving old website data and developed a platform that preserves important records and provides searchable earlier versions of websites. The platform can also be used to build and integrate knowledge bases of environment-related information that help people observe changes in trends. The project’s results can be applied to other organizations that must take website information offline because of internal restructuring, expiration, information security concerns, and so on. Furthermore, strategic alliances can be formed with other web archive agencies to pool resources and increase the effectiveness of the project.
|