A Large-Scale Study of the Evolution of Web Pages - Microsoft

Excerpt from the file (text format):

A Large-Scale Study of the Evolution of Web Pages
Dennis Fetterly
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
dennis.fetterly@hp.com

Mark Manasse
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
manasse@microsoft.com

Marc Najork
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
najork@microsoft.com

Janet Wiener
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
janet.wiener@hp.com

ABSTRACT

1. INTRODUCTION

How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated with any other property of the page? All of these questions are of interest to those who mine the web, including all the popular search engines, but few studies have been performed to date to answer them.
One notable exception is a study by Cho and Garcia-Molina, who crawled a set of 720,000 pages on a daily basis over four months, and counted pages as having changed if their MD5 checksum changed. They found that 40% of all web pages in their set changed within a week, and 23% of those pages that fell into the .com domain changed daily.
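The checksum-based notion of change described above can be sketched as follows. This is a minimal illustration, not code from either study: the function names and the in-memory snapshot dictionary are hypothetical, and only the use of MD5 as the checksum comes from the text.

```python
import hashlib


def md5_checksum(page_bytes: bytes) -> str:
    """Return the hex MD5 digest of a page's raw bytes."""
    return hashlib.md5(page_bytes).hexdigest()


def has_changed(previous: bytes, current: bytes) -> bool:
    """Count a page as changed if its checksum differs between two downloads."""
    return md5_checksum(previous) != md5_checksum(current)


def fraction_changed(snapshots: dict[str, tuple[bytes, bytes]]) -> float:
    """Fraction of URLs whose two snapshots have different checksums.

    `snapshots` maps a URL to (earlier download, later download); this
    layout is an assumption made for the example.
    """
    if not snapshots:
        return 0.0
    changed = sum(has_changed(old, new) for old, new in snapshots.values())
    return changed / len(snapshots)
```

Note that a checksum comparison is all-or-nothing: a single altered byte counts as a change, and the comparison says nothing about how much of the page changed.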
This paper expands on Cho and Garcia-Molina's study, both in terms of coverage and in terms of sensitivity to change. We crawled a set of 150,836,209 HTML pages once every week, over a span of 11 weeks. For each page, we recorded a checksum of the page and a feature vector of the words on the page, plus various other data such as the page length and the HTTP status code. Moreover, we pseudo-randomly selected 0.1% of all of our URLs, and saved the full text of each download of the corresponding pages.
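One way to picture the per-download bookkeeping and the 0.1% sample is the sketch below. The excerpt does not say how the pseudo-random selection was made or which checksum was used, so hashing the URL into 1,000 buckets, the choice of MD5, and the names `in_sample`, `DownloadRecord`, and `record_download` are all assumptions for illustration. The word feature vector is left out because the excerpt does not describe how it is built.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

BUCKETS = 1000          # 1 bucket out of 1000 is 0.1%
SAMPLED_BUCKETS = 1


def in_sample(url: str) -> bool:
    """Deterministically place a URL in the sample (assumed hash-based scheme)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return bucket < SAMPLED_BUCKETS


@dataclass
class DownloadRecord:
    url: str
    status: int                  # HTTP status code
    length: int                  # page length in bytes
    checksum: str                # checksum of the page body (MD5 assumed here)
    full_text: Optional[bytes]   # kept only for sampled URLs


def record_download(url: str, status: int, body: bytes) -> DownloadRecord:
    """Build the record for one download, keeping the body only if sampled."""
    return DownloadRecord(
        url=url,
        status=status,
        length=len(body),
        checksum=hashlib.md5(body).hexdigest(),
        full_text=body if in_sample(url) else None,
    )
```

Because the decision depends only on the URL, the same pages stay in the sample on every weekly crawl, which is what makes it possible to compare the saved full text of a page across downloads.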
After completion of the crawl, we analyzed the degree of change of each page, and investigated which factors are correlated with
