Recovering device drivers
Michael M. Swift, Muthukaruppan Annamalai, Brian N. Bershad, and Henry M. Levy
Department of Computer Science and Engineering
University of Washington, Seattle, WA 98195 USA
{mikesw,muthu,bershad,levy}@cs.washington.edu
Abstract
This paper presents a new mechanism that enables applications to run correctly when device drivers fail. Because device drivers are the principal failing component in most systems, reducing driver-induced failures greatly improves overall reliability. Earlier work has shown that an operating system can survive driver failures [33], but the applications that depend on them cannot. Thus, while operating system reliability was greatly improved, application reliability generally was not.
To remedy this situation, we introduce a new operating system mechanism called a shadow driver. A shadow driver monitors device drivers and transparently recovers from driver failures. Moreover, it assumes the role of the failed driver during recovery. In this way, applications using the failed driver, as well as the kernel itself, continue to function as expected.
We implemented shadow drivers for the Linux operating system and tested them on over a dozen device drivers. Our results show that applications and the OS can indeed survive the failure of a variety of device drivers. Moreover, shadow drivers impose minimal performance overhead. Lastly, they can be introduced with only modest changes to the OS kernel and with no changes at all to existing device drivers.
1 Introduction
Improving reliability is one of the greatest challenges for commodity operating systems. System failures are commonplace and costly across all domains: in the home, in the server room, and in embedded systems, where the existence of the OS itself is invisible. At the low end, failures lead to user frustration and lost sales. At the high end, an hour of downtime from a system failure can result in losses in the millions [16].
Most of these system failures are caused by the operating system’s device drivers. Failed drivers cause 85% of Windows XP crashes [30], while Linux drivers have seven times the bug rate of other kernel code [14]. A failed driver typically causes the application, the OS kernel, or both to crash or stop functioning as expected. Hence, preventing driver-induced failures improves overall system reliability.
Earlier failure-isolation systems within the kernel were designed to prevent driver failures from corrupting the kernel itself [33]. In these systems, the kernel unloads a failed driver and then restarts it from a safe initial state. While isolation techniques can reduce the frequency of system crashes, applications using the failed driver can still crash. These failures occur because the driver loses application state when it restarts, causing applications to receive erroneous results. Most applications are unprepared to cope with this. Rather, they reflect the conventional failure model: drivers and the operating system either fail together or not at all.
This paper presents a new mechanism, called a shadow driver, that improves overall system reliability by concealing a driver’s failure from its clients while recovering from the failure. During normal operation, the shadow tracks the state of the real driver by monitoring all communication between the kernel and the driver. When a failure occurs, the shadow inserts itself temporarily in place of the failed driver, servicing requests on its behalf. While shielding the kernel and applications from the failure, the shadow driver restores the failed driver to a state where it can resume processing requests.
Our design for shadow drivers reflects four principles:
1. Device driver failures should be concealed from the driver’s clients. If the operating system and applications using a driver cannot detect that it has failed, they are unlikely to fail themselves.
2. Recovery logic should be centralized in a single subsystem. We want to consolidate recovery knowledge in a small number of components to simplify the implementation.
3. Driver recovery logic should be generic. The increased reliability offered by driver recovery should not be offset by potentially destabilizing changes to the tens of thousands of existing drivers. Therefore, the architecture must enable a single shadow driver to handle recovery for a large number of device drivers.
4. Recovery services should have low overhead when not needed. The recovery system should impose relatively little overhead for the common case (that is, when drivers are operating normally).
Overall, these design principles are intended to minimize the cost required to make and use shadow drivers while maximizing their value in existing commodity operating systems.
We implemented the shadow driver architecture for sound, network, and IDE storage drivers on a version of the Linux operating system. Our results show that shadow drivers: (1) mask device driver failures from applications, allowing applications to run normally during and after a driver failure, (2) impose minimal performance overhead, (3) require no changes to existing applications and device drivers, and (4) integrate easily into an existing operating system.
This paper describes the design, implementation, and performance of shadow drivers. The following section reviews general approaches to protecting applications from system faults. Section 3 describes device drivers and the shadow driver design and components. Section 4 presents the structure of shadow drivers and the mechanisms required to implement them in Linux. Section 5 presents experiments that evaluate the performance, effectiveness, and complexity of shadow drivers. The final section summarizes our work.
2 Related Work
This section describes previous research on recovery strategies and mechanisms. The importance of recovery has long been known in the database community, where transactions [19] prevent data corruption and allow applications to manage failure. More recently, the need for failure recovery has moved from specialized applications and systems to the more general arena of commodity systems [28].
A general approach to recovery is to run application replicas on two machines, a primary and a backup. All inputs to the primary are mirrored to the backup. After a failure of the primary, the backup machine takes over to provide service. The replication can be performed by the hardware [21], at the hardware-software interface [8], at the system call interface [2, 5, 7], or at a message passing or application interface [4]. Shadow drivers similarly replicate all communication between the kernel and device driver (the primary), sending copies to the shadow driver (the backup). If the driver fails, the shadow takes over temporarily until the driver recovers. However, shadows differ from typical replication schemes in several ways. First, because our goal is to tolerate only driver failures, not hardware failures, both the shadow and the “real” driver run on the same machine. Second, and more importantly, the shadow is not a replica of the device driver: it implements only the services needed to manage recovery of the failed driver and to shield applications from the recovery. For this reason, the shadow is typically much simpler than the driver it shadows.
Another common recovery approach is to restart applications after a failure. Many systems periodically checkpoint application state [26, 27, 29], while others combine checkpoints with logs [2, 5, 31]. These systems transparently restart failed applications from their last checkpoint (possibly on another machine) and replay the log if one is present. Shadow drivers take a similar approach by replaying a log of requests made to drivers. Recent work has shown that this approach is limited when recovering from application faults: applications often become corrupted before they fail; hence, their logs or checkpoints may also be corrupted [10, 25]. Shadow drivers reduce this potential by logging only a small subset of requests. Furthermore, application bugs tend to be deterministic and recur after the application is restarted [11]. Driver faults, in contrast, often cause transient failures because of the complexities of the kernel execution environment [34].
Another approach is simply to reboot the failed component, for example, unloading and reloading failed kernel extensions, such as device drivers [33]. Rebooting has been proposed as a general strategy for building high-availability software [9]. However, rebooting forces applications to handle the failure, for example, reinitializing state that has been lost by the rebooted component. Few existing applications do this [9], and those that do not share the fate of the failed driver. Shadow drivers transparently restore driver state lost in the reboot, invisibly to applications.
Shadow drivers rely on device driver isolation to prevent failed drivers from corrupting the OS or applications. Isolation can be provided in various ways. Vino [32] encapsulates extensions using software fault isolation [35] and uses transactions to repair kernel state after a failure. Nooks [33] and Palladium [13] isolate extensions in protection domains enforced by virtual memory hardware. Microkernels [23, 38, 39] and their derivatives [15, 17, 20] force isolation by executing extensions in user mode.
Rather than concealing driver failures, these systems all reflect a revealing strategy, one in which the application or user is made aware of the failure. The OS typically returns an error code, telling the application that a system call failed, but little else (e.g., it does not indicate which component failed or how the failure occurred). The burden of recovery then rests on the application, which must decide what steps to take to continue executing. As previously mentioned, most applications cannot handle the failure of device drivers [37], since driver faults typically crash the system. When a driver failure occurs, these systems expose the failure to the application, which may then fail. By impersonating device drivers during recovery, shadow drivers conceal errors caused by driver failures and thereby protect applications.
Several systems have narrowed the scope of recovery to focus on a specific subsystem or component. For example, the Rio file cache [12] provides high performance by isolating a single system component, the file cache, from kernel failures. Phoenix [3] provides transparent recovery after the failure of a single problematic component type, database connections in multi-tier applications. Similarly, our shadow driver research focuses on recovery for a single OS component type, the device driver, which is the leading cause of OS failure. By abandoning general-purpose recovery, we transparently resolve a major cause of application and OS failure while maintaining a low runtime overhead.
3 Device Drivers and Shadow Driver Design
A device driver is a kernel-mode software component that provides an interface between the OS and a hardware device¹. The driver converts requests from the kernel into requests to the hardware. Drivers rely on two interfaces: the interface that drivers export to the kernel that provides access to the device, and the kernel interface that drivers import from the operating system. For example, Figure 1 shows the kernel calling into a sound driver to play a tone; in response, the sound driver converts the request into a sequence of I/O instructions that direct the sound card to emit sound.
In practice, most device drivers are members of a class, which is defined by its interface. For example, all network drivers obey the same kernel-driver interface, and all sound-card drivers obey the same kernel-driver interface. This class orientation simplifies the introduction of new drivers into the operating system, since no OS changes are required to accommodate them.
In addition to processing I/O requests, drivers also handle configuration requests. Applications may configure the device, for example, by setting the bandwidth of a network card or the volume for a sound card. Configuration requests may change both driver and device behavior for future I/O requests.
3.1 Driver Faults
Most drivers fail due to bugs that result from unexpected inputs or events [34]. For example, a driver may corrupt a data structure if an interrupt arrives during a sensitive portion of request processing. Device drivers may crash in response to (1) the stream of requests from the kernel, both configuration and I/O, (2) messages to and from the device, and (3) the kernel environment, which may raise or lower power states, swap pages of memory, and interrupt the driver at arbitrary times.
[Figure 1: A sample device driver. The device driver exports the services defined by the device’s class interface and imports services from the kernel’s interface. Diagram labels: OS Kernel, Kernel Interface, Sound Driver Class Interface, Sound Card Device Driver, Sound Card.]
¹ This paper uses the terms “device driver” and “driver” interchangeably; similarly, we use the terms “shadow driver” and “shadow” interchangeably.
A driver bug triggered solely by a sequence of configuration or I/O requests is called a deterministic failure. No generic recovery technique can transparently recover from this type of bug, because any attempt to complete an offending request may trigger the bug [11]. In contrast, transient failures are triggered by additional inputs from the device or the operating system and occur infrequently.
A driver failure that is detected and stopped by the system before any OS, device, or application state is affected is termed fail-stop. More insidious failures may corrupt the system or application and never be detected. The system’s response to failure determines whether a failure is fail-stop. For example, a system that detects and prevents accidental writes to kernel data structures exhibits fail-stop behavior for such a bug, whereas one that allows corruption does not.
Appropriate OS techniques can ensure that drivers execute in a fail-stop fashion [32, 33, 36]. For example, in earlier work we described Nooks [33], a kernel reliability subsystem that executes each driver within its own in-kernel protection domain. Nooks detects faults through memory protection violations, excessive CPU usage, and certain bad parameters passed to the kernel. When Nooks detects a failure, it stops execution within the driver’s protection domain and triggers a recovery process. We reported that Nooks was able to detect approximately 75% of failures in synthetic fault-injection tests [33].
Shadow drivers can recover only from failures that are both transient and fail-stop. Deterministic failures may recur when the driver recovers, again causing a failure. In contrast, transient failures are triggered by environmental factors that are unlikely to persist during recovery. In practice, many drivers experience transient failures, caused by the complexities of the kernel execution environment (e.g., asynchrony, interrupts, locking protocols, and virtual memory) [1], which are difficult to find and fix. Deterministic driver failures, in contrast, are more easily found and fixed in the testing phase of development because the failures are repeatable [18]. Recoverable failures must also be fail-stop, because shadow drivers conceal failures from the system and applications. Hence, shadow drivers require a reliability subsystem to detect and stop failures before they are visible to applications or the operating system. Although shadow drivers may use any mechanism that provides these services, our implementation uses Nooks.
3.2 Shadow Drivers
A shadow driver is a kernel agent that improves reliability for a single device driver. It compensates for and recovers from a driver that has failed. When a driver fails, its shadow restores the driver to a functioning state in which it can process I/O requests made before the failure. While the driver recovers, the shadow driver services its requests.
Shadow drivers execute in one of two modes: passive or active. In passive mode, used during normal (non-faulting) operation, the shadow driver monitors all communication between the kernel and the device driver it shadows. This monitoring is achieved via replicated procedure calls: a kernel call to a device driver function causes an automatic, identical call to a corresponding shadow driver function. Similarly, a driver call to a kernel function causes an automatic, identical call to a corresponding shadow driver function. These passive-mode calls are transparent to the device driver and the kernel. They are not intended to provide any service to either party and exist only to track the state of the driver as necessary for recovery.
In active mode, which occurs during recovery from a failure, the shadow driver performs two functions. First, it “impersonates” the failed driver, intercepting and responding to calls from the kernel. Therefore, the kernel and higher-level applications continue operating in as normal a fashion as possible. Second, the shadow driver impersonates the kernel to restart the failed driver, intercepting and responding to calls from the restarted driver to the kernel. In other words, in active mode the shadow driver looks like the kernel to the driver and like the driver to the kernel. Only the shadow driver is aware of the deception. This approach hides recovery details from the driver, which is unaware that it is being restarted by a shadow driver after a failure.
Once the driver has restarted, the active-mode shadow reintegrates the driver into the system. It re-establishes any application configuration state downloaded into the driver and then resumes pending requests.
A shadow driver is a “class driver,” aware of the interface to the drivers it shadows but not of their implementations. A single shadow driver implementation can recover from a failure of any driver in the class. The class orientation has three key implications. First, an operating system can leverage a few implementations of shadow drivers to recover from failures in a large number of device drivers. Second, implementing a shadow driver does not require a detailed understanding of the internals of the drivers it shadows. Rather, it requires only an understanding of those drivers’ interactions with the kernel. Finally, if a new driver is loaded into the kernel, no new shadow driver is required as long as a shadow for that class already exists. For example, if a new network interface card and driver are inserted into a PC, the existing network shadow driver can shadow the new driver without change. Similarly, drivers can be patched or updated without requiring changes to their shadows. Shadow updating is required only to respond to a change in the kernel-driver programming interface.
3.3 Taps
As we have seen, a shadow driver monitors communication between a functioning driver and the kernel and impersonates one component to the other during failure and recovery. These activities are made possible by a new mechanism, called a tap. Conceptually, a tap is a T-junction placed between the kernel and its drivers. It can be set to replicate calls during passive mode and redirect them during recovery.
A tap operates in passive or active mode, corresponding to the state of the shadow driver attached to it. During passive-mode operation, the tap: (1) invokes the original driver, then (2) invokes the shadow driver with the parameters and results of the call. This operation is shown in Figure 2.
On failure, the tap switches to active mode, shown in Figure 3. In this mode, it: (1) terminates all communication between the driver and kernel, and (2) redirects all invocations to their corresponding interface in the shadow. In active mode, both the kernel and the recovering device driver interact only with the shadow driver. Following recovery, the tap returns to its passive-mode state.
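The two tap modes can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the real taps live inside kernel proxy procedures, and the names `Tap`, `ToyDriver`, `ToyShadow`, `observe`, and `handle` are hypothetical and do not come from the implementation. The sketch covers only kernel-to-driver calls; driver-to-kernel calls would be tapped symmetrically.

```python
from enum import Enum


class Mode(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"


class Tap:
    """Hypothetical T-junction on kernel-to-driver calls."""

    def __init__(self, driver, shadow):
        self.driver = driver
        self.shadow = shadow
        self.mode = Mode.PASSIVE

    def call(self, name, *args):
        if self.mode is Mode.PASSIVE:
            # (1) invoke the original driver, then
            # (2) invoke the shadow with the parameters and the result.
            result = getattr(self.driver, name)(*args)
            self.shadow.observe(name, args, result)
            return result
        # Active mode: the driver is cut off and the shadow answers instead.
        return self.shadow.handle(name, args)


class ToyDriver:
    """Stand-in for a real driver (illustrative only)."""

    def set_volume(self, level):
        return 0  # pretend success


class ToyShadow:
    """Stand-in shadow that records what it sees in passive mode."""

    def __init__(self):
        self.seen = []

    def observe(self, name, args, result):
        self.seen.append((name, args, result))

    def handle(self, name, args):
        return "busy"  # placeholder active-mode response
```

In passive mode the caller sees exactly the driver's result, so the replication is invisible to both sides; in active mode the same call site is answered by the shadow without touching the driver.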
Taps depend on the ability to dynamically dispatch all communication between the driver and the OS. Consequently, all communication into and out of a driver being shadowed must be explicit, such as through a procedure call or a message. Most drivers operate this way, but some do not and cannot be shadowed. For example, kernel video drivers often communicate with user-mode applications through shared memory regions [22].
[Figure 2: A sample shadow driver operating in passive mode. Taps inserted between the kernel and sound driver ensure that all communication between the two is passively monitored by the shadow driver.]
[Figure 3: A sample shadow driver operating in active mode. The taps redirect communication between the kernel and the failed driver directly to the shadow driver.]
3.4 The Shadow Manager
Recovery is supervised by the shadow manager, which is a kernel agent that interfaces with and controls all shadow drivers. The shadow manager instantiates new shadow drivers and injects taps into the call interfaces between the device driver and kernel. It also receives notification from the fault-isolation subsystem that a driver has stopped due to a failure.
When a driver fails, the shadow manager transitions its taps and shadow driver to active mode. In this mode, requests for the driver’s services are redirected to an appropriately prepared shadow driver. The shadow manager then initiates the shadow driver’s recovery sequence to restore the driver. When recovery ends, the shadow manager returns the shadow driver and taps to passive-mode operation so the driver can resume service.
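The ordering of these transitions can be sketched as a toy supervisor. The class and method names (`ShadowManager`, `install`, `on_failure`) and the `trace` list are illustrative assumptions, not the kernel interfaces; the sketch also assumes recovery always succeeds, a case the text above does not qualify.

```python
class ShadowManager:
    """Toy supervisor that flips a driver's shadow and taps between modes."""

    def __init__(self):
        self.modes = {}       # driver name -> "passive" or "active"
        self.recoverers = {}  # driver name -> shadow's recovery callable
        self.trace = []       # (driver, mode) transitions, for inspection

    def install(self, driver, recover):
        # A new shadow starts out monitoring passively.
        self.modes[driver] = "passive"
        self.recoverers[driver] = recover

    def on_failure(self, driver):
        # 1. Switch taps and shadow to active mode so requests are redirected.
        self.modes[driver] = "active"
        self.trace.append((driver, "active"))
        # 2. Run the shadow's recovery sequence while it proxies requests.
        self.recoverers[driver]()
        # 3. Return to passive mode so the driver resumes service.
        self.modes[driver] = "passive"
        self.trace.append((driver, "passive"))
```

The point of the sketch is only the ordering: redirection is in place before recovery starts, and passive monitoring is re-established after it ends.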
3.5 Summary
Our design simplifies the development and integration of shadow drivers into existing systems. Each shadow driver is a single module written with knowledge of the behavior (interface) of a class of device drivers, allowing it to conceal a driver failure and restart the driver after a fault. A shadow driver, normally passive, monitors communication between the kernel and the driver. It becomes an active proxy when a driver fails and then manages its recovery.
4 Shadow Driver Implementation
This section describes the implementation of shadow drivers in the Linux operating system [6]. We have implemented shadow drivers for three classes of device drivers: sound card drivers, network interface drivers, and IDE storage drivers.
4.1 General Infrastructure
All shadow drivers rely on a generic service infrastructure that provides three functions. An isolation service prevents driver errors from corrupting the kernel by stopping a driver on detecting a failure. A transparent redirection mechanism implements the taps required for transparent shadowing and recovery. Lastly, an object tracking service tracks kernel resources created or held by the driver so as to facilitate recovery.
Our shadow driver implementation uses Nooks to provide these functions. Through its fault isolation subsystem, Nooks [33] isolates drivers within separate kernel protection domains. The domains use memory protection to trap driver faults and ensure the integrity of kernel memory. Nooks interposes proxy procedures on all communication between the device driver and kernel. We insert our tap code into these Nooks proxies to replicate and redirect communication. Finally, Nooks tracks kernel objects used by drivers to perform garbage collection of kernel resources during recovery.
Our implementation adds a shadow manager to the Linux operating system. In addition to receiving failure notifications from Nooks, the shadow manager handles the initial installation of shadow drivers. In coordination with the kernel’s module loader, which provides the driver’s class, the shadow manager creates a new shadow driver instance for a driver. Because a single shadow driver services a class of device drivers, there may be several instances of a shadow driver executing if there is more than one driver of a class present. The new instance shares the same code with all other instances of that shadow driver class.
[Figure 4: The Linux operating system with several device drivers and the driver recovery subsystem. New code components include the taps, the shadow manager and a set of shadow drivers, all built on top of the Nooks driver fault isolation subsystem.]
Figure 4 shows the driver recovery subsystem, which contains the Nooks fault isolation subsystem, the shadow manager, and a set of shadow drivers, each of which can monitor one or more device drivers.
4.2 Passive-Mode Monitoring
In passive mode, a shadow driver records several types of information. First, it tracks requests made to the driver, enabling pending requests to execute correctly after recovery. For connection-oriented drivers, the shadow driver records the state of each active connection, such as offset or positioning information. For request-oriented drivers, the shadow driver maintains a log of pending commands and arguments. An entry remains in the log until the corresponding request has been handled.
The shadow driver also records configuration and driver parameters that the kernel passes into the driver. During recovery, the shadow uses this information to act in the driver’s place, returning the same information that was passed in previously. This information also assists in reconfiguring the driver to its pre-failure state when it is restarted. For example, the shadow sound driver keeps a log of ioctl calls (command numbers and arguments) that configure the driver. This log makes it possible to: (1) act as the device driver by remembering the sound formats it supports, and (2) recover the driver by resetting properties, such as the volume and sound format in use.
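A minimal sketch of the two kinds of passive-mode records described above, a pending-request log and a configuration log, might look as follows. The class `PassiveLog`, its method names, and the command names used in the example are hypothetical. Keeping only the most recent argument per (connection, command) pair is a simplifying assumption of this sketch; the text says only that the shadow keeps a log of ioctl calls.

```python
class PassiveLog:
    """Illustrative passive-mode bookkeeping for one shadowed driver."""

    def __init__(self):
        self.pending = {}  # request id -> (command, args), until completion
        self.config = {}   # (connection, ioctl command) -> latest argument

    def request_submitted(self, req_id, command, args):
        # Entry stays in the log until the request has been handled.
        self.pending[req_id] = (command, args)

    def request_completed(self, req_id):
        self.pending.pop(req_id, None)

    def ioctl(self, conn, command, arg):
        # Assumption: a later setting of the same property supersedes
        # an earlier one, so only the latest value is kept.
        self.config[(conn, command)] = arg
```

For example, after logging a volume change twice on the same connection, only the final volume remains in `config`, which is what recovery would need to re-apply.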
The shadow driver maintains only the configuration of the driver in its log. For stateful devices, such as frame buffers or storage devices, it does not create a copy of the device state. Instead, a shadow driver depends on the fail-stop assumption to preserve persistent state (e.g., on disk) from corruption. It can restore transient state (state that is lost when the device resets) if it can force the device’s clients to recreate that state, for example, by redrawing the contents of a frame buffer.
Lastly, the shadow tracks all kernel objects that the driver allocated or received from the kernel. These objects would otherwise be lost when the driver fails, causing a memory leak. For example, the shadow must record all timer callbacks registered and all hardware resources owned, such as interrupt lines and I/O memory regions.
In many cases, passive-mode calls do no work and the shadow returns immediately to the caller. For example, the dominant calls to a sound-card driver are read and write, which record or play sound. In passive mode, the shadow driver implements these calls as no-ops, since there is no need to copy the real-time sound data flowing through the device driver. For an ioctl call, however, the sound-card shadow driver logs the command and data for the connection. Similarly, the shadow driver for an IDE disk does little or no work in passive mode, since the kernel and disk driver handle all I/O and request queuing. Finally, for the network shadow driver, much of the work is already performed by the Nooks object-tracking system, which keeps references to outstanding packets.
4.3 Active-Mode Recovery
A driver typically fails by generating an illegal memory reference or passing an invalid parameter across a kernel interface. The kernel-level failure detector notices the failure and invokes the shadow manager, which locates the appropriate shadow driver and directs it to recover the failed driver. The three steps of recovery are: (1) stopping the failed driver, (2) reinitializing the driver from a clean state, and (3) transferring relevant shadow driver state into the new driver.
4.3.1 Stopping the Failed Driver
The shadow manager begins recovery by informing the responsible shadow driver that a failure has occurred. It also switches the taps, isolating the kernel and driver from one another’s subsequent activity during recovery. After this point, the tap redirects all kernel requests to the shadow until recovery is complete.
Informed of the failure, the shadow driver first disables execution of the failed driver. It also disables the hardware device to prevent it from interfering with the OS while not under driver control. For example, the shadow disables the driver’s interrupt request line. Otherwise, the device may continuously interrupt the kernel and prevent recovery. On hardware platforms with I/O memory mapping, the shadow also removes the device’s I/O mappings to prevent DMAs into kernel memory.
To prepare for restarting the device driver, the shadow garbage collects resources held by the driver. It retains objects that the kernel uses to request driver services, to ensure that the kernel does not see the driver “disappear” as it is restarted. The shadow releases the remaining resources.
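The retain-or-release decision can be sketched as a simple partition over the tracked objects. The `KernelObject` type and its `kernel_visible` flag are assumptions made for illustration; in the implementation this information comes from the Nooks object tracker, whose data structures are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelObject:
    name: str
    kernel_visible: bool  # True if the kernel uses it to request driver services


def garbage_collect(tracked):
    """Split tracked objects into those kept across restart and those freed."""
    retained = [obj for obj in tracked if obj.kernel_visible]
    released = [obj for obj in tracked if not obj.kernel_visible]
    return retained, released
```

Under this model, an object such as the driver's registration entry would be retained so the kernel never observes the driver disappearing, while private objects such as timers would be released and re-created by the restarted driver.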
4.3.2 Reinitializing the Driver
The shadow driver next “reboots” the driver from a clean state. Normally, restarting a driver requires reloading the driver from disk. However, we cannot assume that the disk is functional during recovery. For this reason, when creating a new shadow driver instance, the shadow manager caches in the shadow instance a copy of the device driver’s initial, clean data section. These sections tend to be small. The driver’s code is kernel-read-only, so it is not cached and can be reused from memory.
The shadow restarts the driver by initializing the driver’s state and then repeating the kernel’s driver initialization sequence. For some driver classes, such as sound card drivers, this consists of a single call into the driver’s initialization routine. Other drivers, such as network interface drivers, require additional calls to connect the driver into the network stack.
As the driver restarts, the shadow reattaches the driver to its pre-failure kernel resources. During driver reboot, the driver makes a number of calls into the kernel to discover information about itself and to link itself into the kernel. For example, the driver calls the kernel to register itself as a driver and to request hardware and kernel resources. The taps redirect these calls to the shadow driver, which reconnects the driver to existing kernel data structures. Thus, when the driver attempts to register with the kernel, the shadow intercepts the call and reuses the existing driver registration, avoiding the allocation of a new one. For requests that generate callbacks, such as a request to register the driver with the PCI subsystem, the shadow emulates the kernel, making the same callbacks to the driver with the same parameters. The driver also acquires hardware resources. If these resources were previously disabled at the first step of recovery, the shadow re-enables them, e.g., enabling interrupt handling for the device’s interrupt line. In essence, the shadow driver initializes the recovering driver by calling and responding as the kernel would when the driver starts normally.
4.3.3 Transferring State to the New Driver
The final recovery step restores the driver state that existed at the time of the fault, permitting it to respond to requests as if it had never failed. Thus, any configuration that either the kernel or an application had downloaded to the driver must be restored. The details of this final state transfer depend on the device driver class. Some drivers are connection oriented. For these, the state consists of the state of the connections before the failure. The shadow re-opens the connections and restores the state of each active connection with configuration calls. Other drivers are request oriented. For these, the shadow restores the state of the driver and then resubmits to the driver any requests that were outstanding when the driver crashed.
As an example, for a failed sound card driver, the shadow driver resets the sound driver and all its open connections back to their pre-failure state. Specifically, the shadow scans its list of open connections and calls the open function in the driver to reopen each connection. The shadow then walks its log of configuration commands and replays any commands that set driver properties.
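The sound-card example can be modelled in a few lines: reopen each connection, then replay the configuration log in order. `RestartedSoundDriver` is a stand-in for the freshly reinitialized driver, and `transfer_state` and the command names are hypothetical; the sketch assumes the log already contains only property-setting commands.

```python
class RestartedSoundDriver:
    """Stand-in for a driver that has just been reinitialized."""

    def __init__(self):
        self.connections = {}  # connection id -> {property: value}

    def open(self, conn):
        self.connections[conn] = {}

    def ioctl(self, conn, command, arg):
        self.connections[conn][command] = arg


def transfer_state(driver, open_connections, config_log):
    """Reopen connections, then replay logged configuration commands."""
    # Step 1: scan the list of open connections and reopen each one.
    for conn in open_connections:
        driver.open(conn)
    # Step 2: walk the log in original order so later settings win.
    for conn, command, arg in config_log:
        driver.ioctl(conn, command, arg)
```

Replaying in log order means a property set twice before the failure ends up with its most recent value, matching what the application last requested.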
For some driver classes, the shadow cannot completely transfer its state into the driver. However, it may be possible to compensate in other, perhaps less elegant, ways. For example, a sound driver that is recording sound stores the number of bytes it has recorded since the last reset. After recovery, the sound driver initializes this counter to zero. Because no interface call is provided to change the counter value, the shadow driver must insert its “true” value into the return argument list whenever the application reads the counter to maintain the illusion that the driver has not crashed. The shadow can do this because it receives control (on its replicated call) before the kernel returns to user space.
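One way to picture this compensation is as an offset that the shadow adds to whatever the restarted driver reports. The class below is a sketch, not the implementation: the names are hypothetical, and it assumes the shadow's "true" value is the last value it saw returned to the application, so any bytes recorded between that read and the crash are not accounted for.

```python
class ByteCounterShadow:
    """Keeps the recorded-bytes counter continuous across driver restarts."""

    def __init__(self):
        self.last_reported = 0  # last value handed back to the application
        self.offset = 0         # total accumulated before the latest restart

    def observe_read_counter(self, driver_value):
        # Runs on the replicated call, after the driver answers and before
        # the kernel returns to user space, so the result can be rewritten.
        reported = driver_value + self.offset
        self.last_reported = reported
        return reported

    def driver_restarted(self):
        # The restarted driver counts from zero again; remember where we were.
        self.offset = self.last_reported
```

If the application last saw 4096 bytes and the driver then restarts, a driver-side value of 512 is reported as 4608, so the counter never appears to go backwards.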
After resetting driver and connection state, the shadow must handle requests that were either outstanding when the driver crashed or arrived while the driver was recovering. Unfortunately, shadow drivers cannot guarantee exactly-once behavior for driver requests and must rely on devices and higher levels of software to absorb duplicate requests. For example, if a driver crashes after submitting a request to a device but before notifying the kernel that the request has completed, the shadow cannot know whether the request was actually processed. During recovery, the shadow driver has two choices: restart in-progress requests and risk duplication, or cancel the request and risk lost data. For some device classes, such as disks or networks, duplication is acceptable. However, other classes, such as printers, may not tolerate duplicates. In these cases, the shadow driver cancels outstanding requests, which may limit its ability to mask failures.
After this final step, the driver has been reinitialized, linked into the kernel, reloaded with its pre-failure state, and is ready to process commands. At this point, the shadow driver notifies the shadow manager, which sets the taps to restore kernel-driver communication and reestablish passive-mode monitoring.
4.4 Active-Mode Proxying of Kernel Requests
While a shadow driver is restoring a failed driver, it is also acting in place of the driver to conceal the failure and recovery from applications and the kernel. The shadow driver’s response to a driver request depends on the driver class and request semantics. In general, the shadow will take one of five actions: (1) respond with information that it has recorded, (2) silently drop the request, (3) queue the request for later processing, (4) block the request until the driver recovers, or (5) report that the driver is busy and the kernel or application should try again later. The choice of strategy depends on the caller’s expectations of the driver.
Writing a shadow driver that proxies for a failed driver requires knowledge of the kernel-driver interface, interactions, and requirements. For example, the kernel may require that some driver functions never block, while others always block. Some kernel requests are idempotent (e.g., many ioctl commands), permitting duplicate requests to be dropped, while others return different results on every call (e.g., many read requests). The shadow for a driver class uses these requirements to select the response strategy.
Active proxying is simplified for driver interfaces that support a notion of “busy.” By reporting that the device is currently busy, shadow drivers instruct the kernel or application to block calls to a driver. For example, network drivers in Linux may reject requests and turn themselves off if their queues are full. The kernel then refrains from sending packets until the driver turns itself back on. Our shadow network driver exploits this behavior during recovery by returning a “busy” error on calls to send packets. IDE storage drivers support a similar notion when request queues fill up. Sound drivers can report that their buffers are temporarily full.
Our shadow sound-card driver uses a mix of all five strategies for emulating functions in its service interface. The shadow blocks kernel read and write requests, which record or play sound samples, until the failed driver recovers. It processes ioctl calls itself, either by responding with information it captured or by logging the request to be processed later. For ioctl commands that are idempotent, the shadow driver silently drops duplicate requests. Finally, when applications query for buffer space, the shadow responds that buffers are full. As a result, many applications block themselves rather than blocking in the shadow driver.
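The strategy selection for the sound-card shadow can be sketched as a small decision function. Everything named here is an assumption for illustration: the `Action` enum, the `SoundShadowProxy` class, the call name `query_buffer_space`, and the command names are hypothetical, and treating `SET_VOLUME` as idempotent is an example choice rather than something the text specifies.

```python
from enum import Enum, auto


class Action(Enum):
    RESPOND = auto()  # answer from recorded information
    DROP = auto()     # silently drop a duplicate idempotent request
    QUEUE = auto()    # log the request for processing after recovery
    BLOCK = auto()    # hold the caller until the driver recovers
    BUSY = auto()     # report "buffers full" so the caller retries later


class SoundShadowProxy:
    IDEMPOTENT = {"SET_VOLUME"}  # hypothetical idempotent ioctl commands

    def __init__(self, recorded):
        self.recorded = dict(recorded)  # answers captured in passive mode
        self.queued = []                # ioctls to replay after recovery

    def decide(self, call, command=None, arg=None):
        if call in ("read", "write"):
            return Action.BLOCK
        if call == "query_buffer_space":
            return Action.BUSY
        if call == "ioctl":
            if command in self.recorded:
                return Action.RESPOND
            if command in self.IDEMPOTENT and (command, arg) in self.queued:
                return Action.DROP
            self.queued.append((command, arg))
            return Action.QUEUE
        raise ValueError(f"unknown call: {call}")
```

The sketch shows why class knowledge matters: the same ioctl entry point maps to three different actions depending on what was recorded and whether the command is known to be idempotent.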
4.5 Limitations
As previously described, shadow drivers have limitations. First, shadow drivers rely on dynamic unloading and reloading of device drivers. If a driver cannot be reloaded dynamically, or will not reinitialize properly, then a shadow cannot recover the driver. Second, shadow drivers rely on explicit communication between the device driver and kernel. If driver-kernel communication takes place through an ad-hoc interface, such as shared memory, the shadow driver cannot monitor it. Third, shadow drivers assume that driver failure does not cause irreversible side effects. If a corrupted driver stores persistent state (e.g., printing a bad check or writing bad data on a disk), the shadow driver will not be able to correct that action.
Theeffectivenessofshadowdriversisalsolimitedbytheabilitiesoftheisolationandfailure-detectionsubsys-tem.Ifthislayercannotpreventkernelcorruption,thenshadowdriverscannotfacilitatesystemrecovery.Inad-dition,ifthefault-isolationsubsystemdoesnotdetectafailure,thenshadowdriverswillnotbeproperlyinvokedtoperformrecovery,andapplicationsmayfail.Detectingfailuresisdifficultbecausedriversarecomplexandmayrespondtoapplicationrequestsinmanyways.Itmaybeimpossibletodetectavalidbutincorrectreturnvalue;forexample,asounddrivermayreturnincorrectsounddatawhenrecording.Asaresult,nofailuredetectorcandetecteverydevicedriverfailure.However,wesupportclass-basedfailuredetectorsthatcandetectviolationsofadriver’sprogramminginterfaceandreducethenumberofundetectedfailures.
Finally,shadowdriversmaynotbesuitableforap-plicationswithreal-timedemands.Duringrecovery,adevicemaybeunavailableforseveralsecondswithoutnotifyingtheapplicationofafailure.Theseapplications,whichshouldbewrittentotoleratefailures,wouldbebet-terservedbyasolutionthatrestartsthedriverbutdoesnotperformactiveproxying.
4.6 Summary
This section presented the details of our Linux shadow driver implementation. The shadow driver concept is straightforward: passively monitor normal operations, proxy during failure, and reintegrate during recovery. Ultimately, the value of shadow drivers depends on the degree to which they can be implemented correctly, efficiently, and easily in an operating system. The following section evaluates some of these questions both qualitatively and quantitatively.
5 Evaluation
This section evaluates four key aspects of shadow drivers.
1. Performance. What is the performance overhead of shadow drivers during normal, passive-mode operation (i.e., in the absence of failure)? This is the dynamic cost of our mechanism.
2. Fault-Tolerance. Can applications that use a device driver continue to run even after the driver fails? We evaluate shadow driver recovery in the presence of simple failures to show the benefits of shadow drivers compared to a system that provides failure isolation alone.
3. Limitations. How reasonable is our assumption that driver failures are fail-stop? Using synthetic fault injection, we evaluate how likely it is that driver failures are fail-stop.
4. Code size. How much code is required for shadow drivers and their supporting infrastructure? We evaluate the size and complexity of the shadow driver implementation to highlight the engineering cost of integrating shadow drivers into an existing system.
Based on a set of controlled application and driver experiments, our results show that shadow drivers: (1) impose relatively little performance overhead, (2) keep applications running when a driver fails, (3) are limited by a system’s ability to detect that a driver has failed, and (4) can be implemented with a modest amount of code.
Class      Driver       Device
Network    e1000*       Intel Pro/1000 Gigabit Ethernet
           pcnet32      AMD PCnet32 10/100 Ethernet
           3c59x        3COM 3c509b 10/100 Ethernet
           e100         Intel Pro/100 Ethernet
           epic100      SMC EtherPower 10/100 Ethernet
Sound      audigy*      SoundBlaster Audigy sound card
           emu10k1      SoundBlaster Live! sound card
           sb           SoundBlaster 16 sound card
           es1371       Ensoniq sound card
           cs4232       Crystal sound card
           i810audio    Intel 810 sound card
Storage    ide-disk*    IDE disk
           ide-cd       IDE CD-ROM
Table 1: The three classes of shadow drivers and the Linux drivers tested. We present results for the drivers marked with an asterisk only, as the others behaved similarly.
The experiments were run on a 3 GHz Pentium 4 PC with 1 GB of RAM and an 80 GB, 7200 RPM IDE disk drive. We built and tested three Linux shadow drivers for three device-driver classes: network interface controller, sound card, and IDE storage device. To ensure that our generic shadow drivers worked consistently across device driver implementations, we tested them on thirteen different Linux drivers, shown in Table 1. Although we present detailed results for only one driver in each class (e1000, audigy, and ide-disk), behavior across all drivers was similar.
Device Driver        Application Activity
Sound                • mp3 player (zinf) playing 128 kb/s audio
(audigy driver)      • audio recorder (audacity) recording from microphone
                     • speech synthesizer (festival) reading a text file
                     • strategy game (Battle of Wesnoth)
Network              • network send (netperf) over TCP/IP
(e1000 driver)       • network receive (netperf) over TCP/IP
                     • network file transfer (scp) of a 1 GB file
                     • remote window manager (vnc)
                     • network analyzer (ethereal) sniffing packets
Storage              • compiler (make/gcc) compiling 788 C files
(ide-disk driver)    • encoder (LAME) converting 90 MB file .wav to .mp3
                     • database (mySQL) processing the Wisconsin Benchmark
Table 2: The applications used for evaluating shadow drivers.
5.1 Performance
To evaluate performance, we produced three OS configurations based on the Linux 2.4.18 kernel:
1. Linux-Native is the unmodified Linux kernel.
2. Linux-Nooks is a version of Linux-Native that includes the Nooks fault isolation subsystem but no shadow drivers. When a driver fails, this system restarts the driver but does not attempt to conceal its failure.
3. Linux-SD is a version of Linux-Nooks that includes our entire recovery subsystem, including the Nooks fault isolation subsystem, the shadow manager, and our three shadow drivers.
We selected a variety of common applications that depend on our three device driver classes and measured their performance. The application names and behaviors are shown in Table 2.
Different applications have different performance metrics of interest. For the disk and sound drivers, we ran the applications shown in Table 2 and measured elapsed time. For the network driver, throughput is a more useful metric; therefore, we ran the throughput-oriented network send and network receive benchmarks. For all drivers, we also measured CPU utilization while the programs ran. All measurements were repeated several times and showed a variation of less than one percent.
Figure 5 shows the performance of Linux-Nooks and Linux-SD relative to Linux-Native. Figure 6 compares CPU utilization for execution of the same applications on the three OS versions. Both figures make clear that shadow drivers impose only a small performance penalty compared to running with no isolation at all, and no additional penalty beyond that imposed by isolation alone. Across all nine applications, performance of the system with shadow drivers averaged 99% of the system without, and was never worse than 97%.
[Bar chart: Relative Performance (%) of Linux-Native, Linux-Nooks, and Linux-SD for each application, grouped into Sound, Network, and Storage.]
Figure 5: Comparative application performance, relative to Linux-Native, for three configurations. The X-axis crosses at 80%.
[Bar chart: CPU Utilization (%) of Linux-Native, Linux-Nooks, and Linux-SD for each application, grouped into Sound, Network, and Storage.]
Figure 6: Absolute CPU utilization by application for three configurations.
The low overhead of shadow drivers can be explained in terms of its two constituents: fault isolation and the shadowing itself. As mentioned previously, fault isolation runs each driver in its own domain, leading to overhead caused by domain crossings. Each domain crossing takes approximately 3000 cycles, mostly to change page tables and execution stacks. As a side effect of changing page tables, the Pentium 4 processor flushes the TLB, resulting in TLB misses that can noticeably slow down drivers [33].
For example, the kernel calls the driver approximately 1000 times per second when running audio recorder. Each invocation executes only a small amount of code. As a result, isolating the sound driver adds only negligibly to CPU utilization, because there are not many crossings and not much code to slow down. For the most disk-intensive of the IDE storage applications, the database benchmark, the kernel and driver interact only 290 times per second. However, each call into the ide-disk driver results in substantial work to process a queue of disk requests. The TLB-induced slowdown doubles the time database spends in the driver relative to Linux-Native and increases the application’s CPU utilization from 21% to 27%. On the other hand, the network send benchmark transmits 45,000 packets per second, causing 45,000 domain crossings per second. The driver does little work for each packet, but the overall impact is visible in Figure 6, where CPU utilization for this benchmark increases from 28% to 57% with driver fault isolation.
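The figures above allow a rough back-of-the-envelope estimate of how much CPU time the domain crossings themselves consume. The sketch below is only an approximation under stated assumptions: it charges one 3000-cycle crossing per kernel-driver interaction (a call and its return may in practice cost more than one crossing) and uses the 3 GHz clock rate of the test machine. The function and variable names are illustrative.

```python
CPU_HZ = 3_000_000_000       # 3 GHz Pentium 4 used in the experiments
CYCLES_PER_CROSSING = 3000   # approximate cost of one domain crossing


def direct_crossing_share(crossings_per_second):
    """Fraction of one CPU spent purely on domain crossings (assumed one per call)."""
    return crossings_per_second * CYCLES_PER_CROSSING / CPU_HZ


# Interaction rates quoted in the text, per second.
workloads = {"audio recorder": 1000, "database": 290, "network send": 45_000}

for name, rate in workloads.items():
    print(f"{name}: {direct_crossing_share(rate):.3%} of CPU time in crossings")
```

Under these assumptions the crossings alone account for about 0.1% of the CPU for audio recorder, about 0.03% for database, and about 4.5% for network send. The measured increases are larger (six percentage points for database, 29 for network send), which suggests that most of the cost comes from indirect effects such as the TLB misses described above rather than from the crossings themselves.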
In the case of the actual shadowing, we see from a comparison of the Linux-Nooks and Linux-SD bars in Figures 5 and 6 that the cost is small or negligible. As noted in Section 4.2, many passive-mode shadow-driver functions are no-ops. As a result, the incremental passive-mode performance cost over basic fault isolation is low or unmeasurable in many cases.
In summary, then, the overall performance penalty of shadow drivers during failure-free operation is low, suggesting that shadow drivers could be used across a wide range of applications and environments.
5.2 Fault-Tolerance
Regardless of performance, the crucial question for shadow drivers is whether an application can continue functioning following the failure of a device driver on which it relies. To answer this question, we tested 10 applications on the three configurations, Linux-Native, Linux-Nooks, and Linux-SD. For the disk and sound drivers, we again ran the applications shown in Table 2. Because we were interested in the applications’ response to failures, not their performance, we substituted network file copy, remote window manager, and network analyzer for the networking benchmarks.
We simulated common bugs by injecting a software fault into a device driver while an application using that driver was running. Because both Linux-Nooks and Linux-SD depend on the same isolation and failure-detection services, we differentiate their recovery capabilities by simulating failures that are easily isolated and detected. To generate realistic synthetic driver bugs, we analyzed patches posted to the Linux Kernel Mailing List [24]. We found 31 patches that contained the strings “patch,” “driver,” and “crash” or “oops” (the Linux term for a kernel fault) in their subject lines. Of the 31 patches, we identified 11 that fix transient bugs (i.e., bugs that occur occasionally or only after a long delay from the triggering test). The most common cause of failure (three instances) was a missing check for a null pointer, often with
                                                 Application Behavior
Device Driver        Application Activity        Linux-Native    Linux-Nooks    Linux-SD
Sound                mp3 player                  CRASH           MALFUNCTION    √
(audigy driver)      audio recorder              CRASH           MALFUNCTION    √
                     speech synthesizer          CRASH           √              √
                     strategy game               CRASH           MALFUNCTION    √
Network              network file transfer       CRASH           √              √
(e1000 driver)       remote window manager       CRASH           √              √
                     network analyzer            CRASH           MALFUNCTION    √
IDE                  compiler                    CRASH           CRASH          √
(ide-disk driver)    encoder                     CRASH           CRASH          √
                     database                    CRASH           CRASH          √
Table 3: The observed behavior of several applications following the failure of the device drivers on which they rely. There are three behaviors: a checkmark (√) indicates that the application continued to operate normally; CRASH indicates that the application failed completely (i.e., it terminated); MALFUNCTION indicates that the application continued to run, but with abnormal behavior.
a secondary cause of missing or broken synchronization. We also found missing pointer initialization code (two instances) and bad calculations (two instances) that led to endless loops and buffer overruns. Because these faults are detected by Nooks, they cause fail-stop failures on Linux-Nooks and Linux-SD.
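The subject-line search used to select patches from the mailing list can be approximated with a simple filter. This is a sketch of the stated criteria only: the case-insensitive matching is an assumption, and the sample subject lines are invented for illustration and are not taken from the mailing list.

```python
def matches_search(subject):
    """True if a subject line contains "patch", "driver", and "crash" or "oops"."""
    s = subject.lower()  # assumption: the search ignored letter case
    return "patch" in s and "driver" in s and ("crash" in s or "oops" in s)


# Hypothetical subject lines, invented for this example.
subjects = [
    "[PATCH] fix oops in sound driver on unload",
    "[PATCH] network driver crash under heavy load",
    "[PATCH] scheduler cleanup",
    "driver question: oops on boot",
]

selected = [s for s in subjects if matches_search(s)]
print(selected)
```

Only the first two sample lines satisfy all three conditions; the third mentions neither a driver nor a crash, and the fourth is not a patch. Identifying which of the selected patches fix transient bugs would still require reading each one by hand.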
We injected a null-pointer dereference bug derived from these patches into our three drivers. We ensured that the synthetic bug was transient by inserting the bug into uncommon execution paths, such as code that handles unusual hardware conditions. These paths are rarely executed, so we accelerated the occurrence of faults by also executing the bug at random intervals. The fault code remains active in the driver during and after recovery.
Table 3 shows the three application behaviors we observed. When a driver failed, each application either continued to run normally (√), failed completely (“CRASH”), or continued to run but behaved abnormally (“MALFUNCTION”). In the latter case, manual intervention was typically required to reset or terminate the program.
Table 3 demonstrates that shadow drivers (Linux-SD) enable applications to continue running normally even when device drivers fail. In contrast, all applications on Linux-Native failed when drivers failed. Most programs running on Linux-Nooks failed or behaved abnormally, illustrating that Nooks’ kernel-focused recovery system does not extend to applications. For example, Nooks isolates the kernel from driver faults and reboots (unloads, reloads, and restarts) the driver. However, it lacks two key features of shadow drivers: (1) it does not advance the driver to its pre-fail state, and (2) it has no component to “pinch hit” for the failed driver during recovery. As a result, Linux-Nooks handles driver failures by returning an error to the application, leaving it to recover by itself. Unfortunately, few applications can do this.
Some applications on Linux-Nooks survived the driver failure but in a degraded form. For example, mp3 player, audio recorder, and strategy game continued running, but they lost their ability to input or output sound until the user intervened. Similarly, network analyzer, which interfaces directly with the network device driver, lost its ability to receive packets once the driver was reloaded.
A few applications continued to function properly after driver failure on Linux-Nooks. One application, speech synthesizer, includes the code to reestablish its context within an unreliable sound card driver. Two of the network applications survived on Linux-Nooks because they access the network device driver through kernel services (TCP/IP and sockets) that are themselves resilient to driver failures.
Linux-SD recovers transparently from disk driver failures. Recovery is possible because the IDE storage shadow driver instance maintains the failing driver’s initial state. During recovery the shadow copies back the initial data and reuses the driver code, which is already stored read-only in the kernel. In contrast, Linux-Nooks illustrates the risk of circular dependencies from rebooting drivers. Following these failures, Nooks, which had unloaded the ide-disk driver, was then required to reload the driver off the IDE disk. The circularity could only be resolved by a system reboot. While a second (non-IDE) disk would mitigate this problem, few machines are configured this way.
In general, programs that directly depend on driver state but are unprepared to deal with its loss benefit the most from shadow drivers. In contrast, those that do not directly depend on driver state or are able to reconstruct it when necessary benefit the least. Our experience suggests that few applications are as fault-tolerant as speech synthesizer. Were future applications to be pushed in this direction, software manufacturers would either need to develop custom recovery solutions on a per-application basis or find a general solution that could protect any application from the failure of a kernel device driver. Cost is a barrier to the first approach. Shadow drivers are a path to the second.
Application Behavior During Driver Recovery
Although shadow drivers can prevent application failure, they are not “real” device drivers and do not provide complete device services. As a result, we often observed a slight timing disruption while the driver recovered. At best, output was queued in the shadow driver. At worst, input was lost by the device. The length of the delay was primarily determined by the recovering device driver itself, which, on initialization, must first discover, and then configure, the hardware.
Few device drivers implement fast reconfiguration, which can lead to brief recovery delays. For example, the temporary loss of the e1000 network device driver prevented applications from receiving packets for about five seconds.² Programs using files stored on the disk managed by the ide-disk driver stalled for about four seconds during recovery. In contrast, the normally smooth sounds produced by the audigy sound card driver were interrupted by a pause of about one-tenth of one second, which sounded like a slight click in the audio stream.
Of course, the significance of these delays depends on the application. Streaming applications may become unacceptably “jittery” during recovery. Those processing input data in real-time might become lossy. Others may simply run a few seconds longer in response to a disk that appears to be operating more sluggishly than usual. In any event, a short delay during recovery is best considered in light of the alternative: application and system failure.
5.3 Limits to Recovery
The previous section assumed that failures were fail-stop. However, driver failures experienced in deployed systems may exhibit a wider variety of behaviors. For example, a driver may corrupt state in the application, kernel, or device without being detected. In this situation, shadow drivers may not be able to recover or mask failures from applications. This section uses fault injection experiments in an attempt to generate faults that may not be fail-stop.
² This driver is particularly slow at recovery. The other network drivers we tested recovered in less than a second.
[Bar chart: Fault Injection Outcomes. Percent of Failures (0 to 100) that were Detected and Recovered for mp3 player, audio recorder, network file transfer, network analyzer, compiler, and database, grouped into Sound, Network, and Storage. Total failures per application, shown at the top of the chart: 78, 44, 96, 76, 38, and 58, respectively.]
Figure 7: Results of fault-injection experiments on Linux-SD. We show (1) the percentage of failures that are automatically detected by the fault isolation subsystem, and (2) the percentage of failures that shadow drivers successfully recovered. The total number of failures experienced by each application is shown at the top of the chart.
Non-fail-stop Failures
If driver failures are not fail-stop, then shadow drivers may not be useful. To evaluate whether device driver failures are indeed fail-stop, we performed large-scale fault-injection tests of our drivers and applications running on Linux-SD. For each driver and application combination, we ran 350 fault-injection trials.³ In total, we ran 2100 trials across the three drivers and six applications. Between trials, we reset the system and reloaded the driver. For each trial, we injected five random errors into the driver while the application was using it. We ensured the errors were transient by removing them during recovery. After injection, we visually observed the impact on the application and the system to determine whether a failure or recovery had occurred. For each driver, we tested two applications with significantly different usage scenarios. For example, we chose one sound-playing application (mp3 player) and one sound-recording application (audio recorder).
If we observed a failure, we then assessed the trial on two criteria: whether the fault was detected, and whether the shadow driver could mask the failure and subsequent recovery from the application. For undetected failures, we triggered recovery manually. Note that a user may observe a failure that an application does not, for example, by testing the application’s responsiveness.
Figure 7 shows the results of our experiments. For each application, we show the percentage of failures that the Nooks subsystem detected and the percentage of failures from which shadow drivers correctly recovered. Only 18% of the injected faults caused a visible failure.
³ For details on the fault injector see [33].
                Shadow Driver    Device Driver Shadowed    Class Size      Class Size
Driver Class    Lines of Code    Lines of Code             # of Drivers    Lines of Code
Sound           666              7,381 (audigy)            48              118,981
Network         198              13,577 (e1000)            190             264,500
Storage         321              5,358 (ide-disk)          8               29,000
Table 4: Size and quantity of shadows and the drivers they shadow.
In our tests, 390 failures occurred across all applications. The system automatically detected 65% of the failures. In every one of these cases, shadow drivers were able to mask the failure and facilitate driver recovery. The system failed to detect 35% of the failures. In these cases, we manually triggered recovery. Shadow drivers recovered from nearly all of these failures (127 out of 135). Recovery was unsuccessful in the remaining 8 cases because either the system had crashed (5 cases) or the driver had corrupted the application beyond the possibility of recovery (3 cases). It is possible that recovery would have succeeded had these failures been detected earlier with a better failure detector.
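The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this section; the per-application totals are those shown at the top of Figure 7, and the number of detected failures (255) is derived here rather than stated in the text.

```python
# Figures quoted in Section 5.3.
TRIALS = 350 * 6                      # 350 trials for each of six driver/application pairs
FAILURES = 390                        # visible failures across all applications
UNDETECTED = 135                      # failures the system did not detect
RECOVERED_AFTER_MANUAL_TRIGGER = 127  # undetected failures that still recovered

# Per-application failure totals as read from the top of Figure 7.
per_app = {
    "mp3 player": 78, "audio recorder": 44,
    "network file transfer": 96, "network analyzer": 76,
    "compiler": 38, "database": 58,
}

detected = FAILURES - UNDETECTED                       # derived, not stated in the text
recovered = detected + RECOVERED_AFTER_MANUAL_TRIGGER  # all detected failures recovered
unrecovered = FAILURES - recovered                     # 5 system crashes + 3 corrupted apps

print(f"trials with a visible failure: {FAILURES / TRIALS:.1%}")
print(f"failures detected automatically: {detected / FAILURES:.1%}")
print(f"failures recovered overall: {recovered / FAILURES:.1%}")
```

The per-application totals sum to the 390 failures reported, the detected share works out to roughly 65%, and 382 of the 390 failures (about 98%) ended in a successful recovery once manual triggering is included. Measured per trial, the failure rate is about 18.6%, in line with the figure of roughly 18% quoted earlier.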
Across all applications and drivers, we found three major causes of undetected failure. First, the system did not detect application hangs caused by I/O requests that never completed. Second, the system did not detect errors in the interactions between the device and the driver, e.g., incorrectly copying sound data to a sound card. Third, the system did not detect certain bad parameters, such as incorrect result codes or data values. Detecting these three error conditions would require that the system better understand the semantics of each driver class. For example, 68% of the sound driver failures with audio recorder went undetected. This application receives data from the driver in real time and is highly sensitive to driver output. A small error or delay in the results of a driver request may cause the application to stop recording or record the same sample repeatedly.
Our results demonstrate a need for class-based failure detectors that can detect violations of the driver interface to achieve high levels of reliability. However, driver failures need not be detected quickly to be fail-stop. There was a significant delay between the failure and the subsequent manual recovery in our tests, and yet the applications survived the vast majority of undetected failures. Thus, even a slow failure detector can be effective at improving application reliability.
Non-transient Failures
Shadow drivers can recover from transient failures only. In contrast, deterministic failures may recur during recovery when the shadow configures the driver. Although they cannot recover from such failures, shadow drivers are still useful in these cases. When a failure recurs during recovery, the sequence of shadow driver recovery events creates a detailed reproduction scenario that aids diagnosis. This record of recovery contains the driver’s calls into the kernel, requests to configure the driver, and I/O requests that were pending at the time of failure. This information enables a software engineer to find and fix the offending bug more efficiently.
5.4 Code Size
The preceding sections evaluated the efficiency and effectiveness of shadow drivers. This section examines the complexity of shadow drivers in terms of code size, which can serve as a proxy for complexity.
Table 4 shows, for each class, the size in lines of code of the shadow driver for the class. For comparison, we show the size of the driver from the class that we tested and the total number and cumulative size of existing Linux device drivers in that class in the 2.4.18 kernel. The total code size is an indication of the leverage gained through the shadow’s class-driver structure. Furthermore, the table shows that a shadow driver is significantly smaller than the device driver it shadows. For example, our sound-card shadow driver is only 9% of the size of the audigy device driver it shadows. The IDE storage shadow is only 6% of the size of the Linux ide-disk device driver.
The Nooks driver fault isolation subsystem we built upon contains about 23,000 lines of code. In total, we added about 3300 lines of new code to Nooks to support our three class drivers. Otherwise, we made no changes to the remainder of the Linux kernel. Shadow drivers required the addition of approximately 600 lines of code for the shadow manager, 800 lines of common code shared by all shadow drivers, and another 750 lines of code for general utilities. Of the 177 taps we inserted, only 31 required actual code; the remainder were no-ops.
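The size ratios quoted in this section follow directly from the entries in Table 4. The short calculation below reproduces them; the dictionary names are illustrative, and the breakdown of the roughly 3300 new lines into infrastructure plus the three shadow drivers is an inference from the numbers rather than something stated explicitly.

```python
# Lines of code from Table 4.
shadow_loc = {"sound": 666, "network": 198, "storage": 321}
driver_loc = {"sound": 7381, "network": 13577, "storage": 5358}     # audigy, e1000, ide-disk
class_loc = {"sound": 118_981, "network": 264_500, "storage": 29_000}

# Infrastructure quoted in the text: shadow manager, common code, general utilities.
infrastructure_loc = 600 + 800 + 750

for cls in shadow_loc:
    vs_driver = shadow_loc[cls] / driver_loc[cls]
    vs_class = shadow_loc[cls] / class_loc[cls]
    print(f"{cls}: {vs_driver:.1%} of the tested driver, {vs_class:.2%} of the class")

# Inferred total: infrastructure plus the three class shadows.
total_new_loc = infrastructure_loc + sum(shadow_loc.values())
print(f"total new code: {total_new_loc} lines")
```

The sound and storage shadows come out at about 9% and 6% of the drivers they shadow, matching the text, and the network shadow at about 1.5% of e1000. The inferred total of 3335 lines is consistent with the "about 3300 lines" reported.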
5.5 Summary
This section examined the performance, fault-tolerance, limits, and code size of shadow drivers. Our results demonstrate that: (1) the performance overhead of shadow drivers during normal operation is small, particularly when compared to a purely isolating system, (2) applications that failed in any form on Linux-Native or Linux-Nooks ran normally with shadow drivers, (3) the reliability provided by shadow drivers is limited by the system’s ability to detect failures, and (4) shadow drivers are small, even relative to a single device driver. Overall, these results indicate that shadow drivers have the potential to significantly improve the reliability of applications on modern operating systems with only modest cost.
6 Conclusions
Improving the reliability of modern systems demands that we increase their resilience. To this end, we designed and implemented shadow drivers, which mask device driver failures from both the operating system and applications. Our experience shows that shadow drivers improve application reliability by concealing a driver’s failure while facilitating recovery. A single shadow driver can enable recovery for an entire class of device drivers. Shadow drivers are also efficient, imposing little performance degradation. Finally, they are transparent, requiring no code changes to existing drivers.
Acknowledgments
This work was supported in part by the National Science Foundation under grants ITR-0085670 and CCR-0121341. We would also like to thank our shepherd, Peter Chen, who provided many valuable insights.
References
[1] S. Arthur. Fault resilient drivers for Longhorn server. Technical Report WinHec 2004 Presentation DW04012, Microsoft Corporation, May 2004.
[2] Ö. Babaoğlu. Fault-tolerant computing based on Mach. In Proceedings of the USENIX Mach Symposium, Oct. 1990.
[3] R. Barga, D. Lomet, and G. Weikum. Recovery guarantees for general multi-tier applications. In International Conference on Data Engineering, 2002. IEEE.
[4] J. F. Bartlett. A NonStop kernel. In Proceedings of the 8th ACM Symposium on Operating Systems Principles, Dec. 1981.
[5] A. Borg, W. Balu, W. Graetsch, F. Herrmann, and W. Oberle. Fault tolerance under UNIX. ACM Transactions on Computer Systems, 7(1):1–24, Feb. 1989.
[6] D. P. Bovet and M. Cesati. Inside the Linux Kernel. O’Reilly & Associates, 2002.
[7] T. C. Bressoud. TFT: A software system for application-transparent fault tolerance. In Proceedings of the 28th Symposium on Fault-Tolerant Computing, June 1998. IEEE.
[8] T. C. Bressoud and F. B. Schneider. Hypervisor-based fault tolerance. ACM Transactions on Computer Systems, 14(1):80–107, Feb. 1996.
[9] G. Candea and A. Fox. Recursive restartability: Turning the reboot sledgehammer into a scalpel. In Proceedings of the Eighth IEEE Workshop on Hot Topics in Operating Systems, May 2001.
[10] S. Chandra and P. M. Chen. How fail-stop are faulty programs? In Proceedings of the 28th Symposium on Fault-Tolerant Computing, June 1998. IEEE.
[11] S. Chandra and P. M. Chen. Whither generic recovery from application faults? A fault study using open-source software. In Proceedings of the 2000 IEEE International Conference on Dependable Systems and Networks, June 2000.
[12] P. M. Chen, W. T. Ng, S. Chandra, C. Aycock, G. Rajamani, and D. Lowell. The Rio file cache: Surviving operating system crashes. In Proceedings of the Seventh ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 1996.
[13] T. Chiueh, G. Venkitachalam, and P. Pradhan. Integrating segmentation and paging protection for safe, efficient and transparent software extensions. In Proceedings of the 17th ACM Symposium on Operating Systems Principles, Dec. 1999.
[14] A. Chou, J. Yang, B. Chelf, S. Hallem, and D. Engler. An empirical study of operating system errors. In Proceedings of the 18th ACM Symposium on Operating Systems Principles, Oct. 2001.
[15] D. R. Engler, M. F. Kaashoek, and J. O. Jr. Exokernel: an operating system architecture for application-level resource management. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, Dec. 1995.
[16] W. Feng. Making a case for efficient supercomputing. ACM Queue, 1(7), Oct. 2003.
[17] B. Ford, G. Back, G. Benson, J. Lepreau, A. Lin, and O. Shivers. The Flux OSKit: a substrate for OS language and research. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, Oct. 1997.
[18] J. Gray. Why do computers stop and what can be done about it? Technical Report 85-7, Tandem Computers, June 1985.
[19] J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993.
[20] S. M. Hand. Self-paging in the Nemesis operating system. In Proceedings of the 3rd USENIX Symposium on Operating Systems Design and Implementation, Feb. 1999.
[21] D. Jewett. Integrity S2: A fault-tolerant Unix platform. In Proceedings of the 21st Symposium on Fault-Tolerant Computing, June 1991. IEEE.
[22] M. J. Kilgard, D. Blythe, and D. Hohn. System support for OpenGL direct rendering. In Proceedings of Graphics Interface, May 1995. Canadian Human-Computer Communications Society.
[23] J. Liedtke. On µ-kernel construction. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, Dec. 1995.
[24] Linux Kernel Mailing List. Available at http://www.uwsg.indiana.edu/hypermail/linux/kernel.
[25] D. E. Lowell, S. Chandra, and P. M. Chen. Exploring failure transparency and the limits of generic recovery. In Proceedings of the 4th USENIX Symposium on Operating Systems Design and Implementation, Oct. 2000.
[26] D. E. Lowell and P. M. Chen. Discount checking: Transparent, low-overhead recovery for general applications. Technical Report CSE-TR-410-99, University of Michigan, Nov. 1998.
[27] G. Muller, M. Banâtre, N. Peyrouze, and B. Rochat. Lessons from FTM: An experiment in design and implementation of a low-cost fault-tolerant system. IEEE Transactions on Software Engineering, 45(2):332–339, June 1996.
[28] D. Patterson, A. Brown, P. Broadwell, G. Candea, M. Chen, J. Cutler, P. Enriquez, A. Fox, E. Kıcıman, M. Merzbacher, D. Oppenheimer, N. Sastry, W. Tetzlaff, J. Traupman, and N. Treuhaft. Recovery-Oriented Computing (ROC): Motivation, definition, techniques, and case studies. Technical Report CSD-02-1175, UC Berkeley Computer Science, Mar. 2002.
[29] J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent checkpointing under Unix. In Proceedings of the 1995 Winter USENIX Conference, Jan. 1995.
[30] R. Short, Vice President of Windows Core Technology, Microsoft Corp. Private communication, 2003.
[31] M. Russinovich, Z. Segall, and D. Siewiorek. Application transparent fault management in Fault Tolerant Mach. In Proceedings of the 23rd Symposium on Fault-Tolerant Computing, June 1993. IEEE.
[32] M. I. Seltzer, Y. Endo, C. Small, and K. A. Smith. Dealing with disaster: Surviving misbehaved kernel extensions. In Proceedings of the 2nd USENIX Symposium on Operating Systems Design and Implementation, Oct. 1996.
[33] M. M. Swift, B. N. Bershad, and H. M. Levy. Improving the reliability of commodity operating systems. ACM Transactions on Computer Systems, 22(4), Nov. 2004.
[34] V. Orgovan, Systems Crash Analyst, Windows Core OS Group, Microsoft Corp. Private communication, 2004.
[35] R. Wahbe, S. Lucco, T. E. Anderson, and S. L. Graham. Efficient software-based fault isolation. In Proceedings of the 14th ACM Symposium on Operating Systems Principles, Dec. 1993.
[36] R. S. Wahbe and S. E. Lucco. Methods for safe and efficient implementation of virtual machines, June 1998. US Patent 5,761,477.
[37] J. A. Whittaker. Software’s invisible users. IEEE Software, 18(3):84–88, May 2001.
[38] W. A. Wulf. Reliable hardware-software architecture. In Proceedings of the International Conference on Reliable Software, 1975.
[39] M. Young, M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, and A. Tevanian. Mach: A new kernel foundation for UNIX development. In Proceedings of the 1986 Summer USENIX Conference, June 1986.