Wikipedia:Using WebCite

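This page provides information about using WebCite, a web archiving service located at http://www.webcitation.org/. By keeping an online copy of a source, WebCite lets Wikipedia editors reduce link rot when the original page is moved, changed, or deleted. However, not every web page can be archived.[nb 1]

WebCite can archive a range of content, including HTML web pages, PDF files, CSS style sheets, JavaScript, and digital images. Another web archiving service is the Wayback Machine. The two services work differently, and some pages can be archived by one but not by the other. The Wayback Machine uses a crawler to archive certain pages automatically at particular times and also accepts archiving requests from users; WebCite archives a link only when someone actively submits it.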

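How to archive

There are several ways to submit a web page to WebCite for archiving. If you are new to WebCite, the web form is recommended. The other methods are better suited to experienced WebCite users.

Web form

This method is easy to use but slower than the others, because you have to visit the WebCite website each time you want to archive a page.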

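  1. Go to http://www.webcitation.org/archive
  2. Enter the URL you want to archive in the "URL to Archive [url]" field.
  3. Enter your email address in the "Your (citing author) E-mail Address [email]" field.
  4. After entering the above, click the "Submit" button. You will be sent to a page containing a link to the archive URL of the web page you wished to archive.
  5. An email stating whether the archive process succeeded or failed will be sent to your email address. If it was successful, the archive URL will also be included in the email.
  6. It is recommended that you view the archived page to check whether the archive process was successful.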
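
The form submission above can also be written as a single request URL. The following is a minimal Python sketch, assuming that the archive page accepts the form's two fields as query parameters named `url` and `email`. The `url` parameter appears in the browser keyword methods on this page; the `email` parameter name is inferred from the form's field label and is an assumption. The helper name `build_archive_request` and the sample address are hypothetical. The sketch only builds the URL and does not contact WebCite.

```python
from urllib.parse import urlencode

# Address of the archive form given in step 1.
ARCHIVE_ENDPOINT = "http://www.webcitation.org/archive"


def build_archive_request(page_url: str, email: str) -> str:
    """Return a request URL for archiving page_url (hypothetical helper).

    The parameter names "url" and "email" are assumptions based on the
    form field labels.
    """
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    # urlencode percent-encodes both values so they survive as query parameters.
    query = urlencode({"url": page_url, "email": email})
    return f"{ARCHIVE_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_archive_request("http://www.example.com/", "editor@example.org"))
```

Whichever way the request is submitted, it is still worth opening the resulting archive page to confirm that the snapshot is usable.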

Bookmarklet

Put simply, a bookmarklet is a web browser bookmark that performs a function instead of going to a web page. When you click the WebCite bookmarklet, it takes the URL of the page you are currently viewing and submits it to WebCite for archiving. This method is easy to set up, easy to use, and fast. To get the most out of it, keep your Bookmarks/Favorites bar visible, or at least have your bookmarks accessible within a click or two. This method can only archive the page you are currently viewing; to archive a different web page, use another method.

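  1. Go to http://www.webcitation.org/bookmarklet to set up the bookmarklet.
  2. Enter your email address. An email stating whether the archive process succeeded or failed will be sent to this address. If it was successful, the archive URL will also be included in the email.
  3. Click the "Build my Bookmarklet" button. Some text will be displayed.
  4. At the end of point 1 of that text, there is a "WebCite® this page" link. This is your personal bookmarklet. Drag this link to your bookmarks bar.
  5. When you want to archive the page you are currently viewing, click the bookmarklet. You will be sent to a page containing the archive link.
  6. It is recommended that you view the archived page to check whether the archive process was successful.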

Firefox smart keyword

Firefox smart keywords are commonly used to perform searches through the Firefox address bar or to open a bookmark by typing a keyword into the Firefox address bar. Here we are going to use a smart keyword to submit a URL to WebCite for archiving. This method is moderately simple to set up, easy to use and is fast.

  1. To set up the smart keyword, press Ctrl+Shift+B to open your Bookmarks Library (or click the orange Firefox button at the top left of the window, then go to "Bookmarks", then "Show All Bookmarks").
  2. Browse to a location you would like to save the smart keyword bookmark in.
  3. In the menu at the top of the window, click "Organize", then "New Bookmark".
  4. Enter a name for the bookmark (e.g. WebCite).
  5. Enter http://www.webcitation.org/archive?url=%s&[email protected] into the Location field, replacing [email protected] with your email address. An email stating whether the archive process succeeded or failed will be sent to this address. If it was successful, the archive URL will also be included in the email.
  6. Enter a keyword for the bookmark (e.g. wc). Choose something short that is not already used for another bookmark.
  7. Click the "Add" button. Close the Bookmarks Library.
  8. To use the smart keyword, type the keyword you chose ("wc" in the above example) followed by a space in the Firefox address bar, in front of the URL of the web page you would like to archive. For example, if your keyword is "wc", the text in the address bar would be wc http://www.example.com/pageyouwantoarchive.html.
  9. Hit Enter. You will be sent to a page containing a link to the archive URL of the web page you wished to archive.
  10. It is recommended that you view the archived page to check if the archive process has been successful.
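
The keyword mechanism described above amounts to a string substitution: the text typed after the keyword replaces the `%s` placeholder in the bookmark's location. The Python sketch below illustrates the idea only and is not the browser's actual implementation. The function name `expand_keyword` is hypothetical, the template omits the email parameter for brevity, and percent-encoding the substituted text is an assumption about browser behavior that can be switched off with `encode=False`.

```python
from urllib.parse import quote

# Keyword -> location template; "%s" marks where the typed text goes.
# The email parameter of the real bookmark is left out of this sketch.
KEYWORDS = {"wc": "http://www.webcitation.org/archive?url=%s"}


def expand_keyword(typed: str, keywords: dict = KEYWORDS, encode: bool = True) -> str:
    """Expand 'keyword rest-of-input' into the bookmark URL (illustrative only)."""
    keyword, _, rest = typed.partition(" ")
    template = keywords.get(keyword)
    if template is None or not rest:
        # Not a keyword entry: return the input unchanged.
        return typed
    # Assumption: the typed text is percent-encoded before substitution.
    argument = quote(rest, safe="") if encode else rest
    return template.replace("%s", argument)


if __name__ == "__main__":
    print(expand_keyword("wc http://www.example.com/pageyouwantoarchive.html"))
```

With `encode=False` the typed URL is inserted verbatim, which matches the unencoded long URL examples used elsewhere on this page.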

Chrome search engine

Although this is created through Chrome's search engine feature, this functions just like a smart keyword in Firefox. This method is moderately simple to set up, easy to use and is fast.

  1. To set up the "search engine", right click the address bar and select "Edit search engines...". At the bottom of the list that comes up, you can add a "search engine".
  2. Enter a name for the "search engine" in the first field (e.g. WebCite).
  3. Enter a keyword for the "search engine" in the second field (e.g. wc). Choose something short that is not already in use.
  4. Enter http://www.webcitation.org/archive?url=%s&[email protected] into the third field, replacing [email protected] with your email address. An email stating whether the archive process succeeded or failed will be sent to this address. If it was successful, the archive URL will also be included in the email.
  5. Hit Enter to save the "search engine".
  6. To use the "search engine", type the keyword you chose ("wc" in the above example) followed by a space in the Chrome address bar, in front of the URL of the web page you would like to archive. For example, if your keyword is "wc", the text in the address bar would be wc http://www.example.com/pageyouwantoarchive.html.
  7. Hit Enter. You will be sent to a page containing a link to the archive URL of the web page you wished to archive.
  8. It is recommended that you view the archived page to check if the archive process has been successful.

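Limitations

WebCite honors the robots exclusion standard as well as no-cache and no-archive tags, and will not archive sites that disallow archiving.

For example, The New York Times has a robots.txt page at http://www.nytimes.com/robots.txt that contains: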

User-agent: *
Disallow: /aponline/
Disallow: /archives/
Disallow: /reuters/

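As a result, archive requests for URLs on The New York Times website that contain these folders, or any other folders excluded in the same way, are rejected.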
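
To see how such rules apply to a particular URL, the excerpt above can be fed to Python's standard `urllib.robotparser` module. This is a minimal sketch: it checks only the three quoted rules rather than the site's full robots.txt, the article paths are hypothetical, and the user agent that WebCite's archiver actually announces is not stated here, so the wildcard `*` is used.

```python
from urllib.robotparser import RobotFileParser

# The excerpt quoted above, not the complete robots.txt file.
RULES = """\
User-agent: *
Disallow: /aponline/
Disallow: /archives/
Disallow: /reuters/
"""


def may_archive(url: str, rules: str = RULES, agent: str = "*") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    print(may_archive("http://www.nytimes.com/archives/some-article.html"))
    print(may_archive("http://www.nytimes.com/pages/world/index.html"))
```

A check like this covers only the robots exclusion rules; no-cache and no-archive tags are set on the page itself and would need to be checked separately.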

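Use in Wikipedia

The English Wikipedia community holds that links archived with WebCite should be displayed as long URLs (see the RfC; the Chinese Wikipedia community uses this service less often and has not discussed the question).

Long URL example: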

http://www.webcitation.org/5eWaHRbn4?url=http://www.example.com/

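The 9-character "snapshot ID", which resembles the codes used by URL shortening services, contains a base 62 encoded timestamp that can be extracted by bots and other programs. It also serves as a unique page ID. It is followed by the original URL, which helps prevent malicious links, such as spam, from being hidden.

A second, optional long URL form:

http://www.webcitation.org/query?url=http://www.example.com&date=20091104 (the date uses the YYYYMMDD or YYYY-MM-DD format)

This form drops the "snapshot ID" and uses a date parameter instead. Both forms are suitable for use in Wikipedia.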
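
The two long URL forms can be assembled mechanically. The Python sketch below reproduces the two examples above; the function names `snapshot_url` and `dated_url` are hypothetical, and the original URL is appended without percent-encoding because that is how the examples on this page are written.

```python
from datetime import date

BASE = "http://www.webcitation.org"


def snapshot_url(snapshot_id: str, original_url: str) -> str:
    """Long URL built from a snapshot ID followed by the original URL."""
    return f"{BASE}/{snapshot_id}?url={original_url}"


def dated_url(original_url: str, when: date, dashed: bool = False) -> str:
    """Long URL built from the original URL and a date parameter.

    The date is written as YYYYMMDD, or as YYYY-MM-DD when dashed is True.
    """
    stamp = when.isoformat() if dashed else when.strftime("%Y%m%d")
    return f"{BASE}/query?url={original_url}&date={stamp}"


if __name__ == "__main__":
    print(snapshot_url("5eWaHRbn4", "http://www.example.com/"))
    print(dated_url("http://www.example.com", date(2009, 11, 4)))
    print(dated_url("http://www.example.com", date(2009, 11, 4), dashed=True))
```

An original URL that carries its own query string may need percent-encoding to stay unambiguous in the second form; the sketch does not attempt this.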

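The archive URL can be placed in the archiveurl= parameter of any citation template, and archivedate= and deadurl= should also be filled in. If the original URL can no longer be accessed, set deadurl to yes. Conversely, if the original URL is still accessible, set deadurl to no.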

<ref>{{cite web |last= |first= |title= |work= |publisher= |date= |url= |archiveurl= |archivedate= |deadurl= }}</ref>
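
As an illustration of filling in these parameters, the following Python sketch generates a citation in the shape of the template above. The helper name `cite_web` and the sample title are hypothetical, only a subset of the template's parameters is filled in, and the sketch does not check whether the original URL is actually reachable; the caller supplies that as `original_is_dead`.

```python
def cite_web(title: str, url: str, archiveurl: str, archivedate: str,
             original_is_dead: bool) -> str:
    """Return a {{cite web}} reference with the archive parameters filled in."""
    # deadurl is "yes" when the original URL can no longer be accessed.
    deadurl = "yes" if original_is_dead else "no"
    fields = [
        ("title", title),
        ("url", url),
        ("archiveurl", archiveurl),
        ("archivedate", archivedate),
        ("deadurl", deadurl),
    ]
    body = " ".join(f"|{name}={value}" for name, value in fields)
    return "<ref>{{cite web " + body + " }}</ref>"


if __name__ == "__main__":
    print(cite_web(
        title="Example page",
        url="http://www.example.com",
        archiveurl="http://www.webcitation.org/query?url=http://www.example.com&date=20091104",
        archivedate="2009-11-04",
        original_is_dead=False,
    ))
```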

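Searching for previously archived pages

Pages previously archived through WebCite can be accessed through a searchable database. Users can search by URL, date, or "snapshot ID".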

Notes

  1. ^ WebCite FAQ: A page may not be archived for a number of reasons. The page owner may specifically prohibit archiving of their content through no-cache / no-archive tags, or via a robot exclusion policy on their site. The content may be inaccessible from the WebCite® network (this is particularly likely if you are attempting to access subscription based content which your institution subscribes to on its users' behalf). Also, the content may be unreadable by the WebCite® archiver (complex JavaScript based pages, or ones involving browser checks sometimes cause our archive engine to fail).