A Large-Scale Study of the Evolution of Web Pages - Microsoft

Excerpt from the file (text format):

A Large-Scale Study of the Evolution of Web Pages
Dennis Fetterly
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
dennis.fetterly@hp.com

Mark Manasse
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
manasse@microsoft.com

Marc Najork
Microsoft Research
1065 La Avenida
Mountain View, CA 94043
najork@microsoft.com

Janet Wiener
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94304
janet.wiener@hp.com

ABSTRACT

1. INTRODUCTION

How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated with any other property of the page? All of these questions are of interest to those who mine the web, including all the popular search engines, but few studies have been performed to date to answer them.
One notable exception is a study by Cho and Garcia-Molina, who crawled a set of 720,000 pages on a daily basis over four months, and counted pages as having changed if their MD5 checksum changed. They found that 40% of all web pages in their set changed within a week, and 23% of those pages that fell into the .com domain changed daily.
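The checksum-based change test described above can be illustrated with a minimal sketch. It is not the code used in either study; the function names and the dictionary-of-snapshots layout are assumptions made for illustration. A page counts as changed when the MD5 checksums of two downloads differ, so even a one-byte edit registers as a change.

```python
import hashlib


def md5_checksum(page_bytes: bytes) -> str:
    """Return the hex MD5 digest of a downloaded page body."""
    return hashlib.md5(page_bytes).hexdigest()


def fraction_changed(before: dict, after: dict) -> float:
    """Fraction of URLs present in both snapshots whose checksum differs.

    `before` and `after` map URL -> page bytes (hypothetical layout).
    """
    shared = before.keys() & after.keys()
    if not shared:
        return 0.0
    changed = sum(
        1 for url in shared
        if md5_checksum(before[url]) != md5_checksum(after[url])
    )
    return changed / len(shared)
```

Because any byte-level difference alters the checksum, this test says whether a page changed but not by how much.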
This paper expands on Cho and Garcia-Molina's study, both in terms of coverage and in terms of sensitivity to change. We crawled a set of 150,836,209 HTML pages once every week, over a span of 11 weeks. For each page, we recorded a checksum of the page and a feature vector of the words on the page, plus various other data such as the page length and the HTTP status code. Moreover, we pseudo-randomly selected 0.1% of all of our URLs and saved the full text of each download of the corresponding pages.
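As a rough sketch of the per-download bookkeeping described above, the following shows one way to record a checksum, length, and status code for each page while keeping the full text for about 0.1% of URLs. Everything here is an assumption for illustration: the excerpt does not say how the sample was drawn or which checksum was used, so the sketch picks URLs by hashing them with MD5 and keeping one residue class in 1,000, and it uses MD5 for the page checksum as well. The word feature vector is omitted because the excerpt does not specify how it is computed. `PageRecord`, `in_sample`, and `record_download` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

SAMPLE_MODULUS = 1000  # assumption: keep 1 URL in 1000, i.e. about 0.1%


def in_sample(url: str) -> bool:
    """Deterministic pseudo-random choice: a URL gets the same answer every week."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SAMPLE_MODULUS == 0


@dataclass
class PageRecord:
    url: str
    status_code: int
    length: int
    checksum: str
    full_text: Optional[bytes]  # kept only for sampled URLs


def record_download(url: str, status_code: int, body: bytes) -> PageRecord:
    """Summarize one download; store the body only if the URL is sampled."""
    return PageRecord(
        url=url,
        status_code=status_code,
        length=len(body),
        checksum=hashlib.md5(body).hexdigest(),
        full_text=body if in_sample(url) else None,
    )
```

Selecting by a hash of the URL, rather than by a random draw per download, means the same pages are retained in every weekly crawl, so successive full-text versions of a sampled page can be compared.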
After completion of the crawl, we analyzed the degree of change of each page, and investigated which factors are correlated with
